utils: I/O utilities

I/O utility functions

NEXUS_HDF5_EXT = ['.h5', '.nx5', '.nxs', '.hdf', '.hdf5', '.cxi']

List of possible extensions for HDF5 file formats.

class H5Type(value)[source]

Identify a set of HDF5 concepts

supported_extensions(flat_formats=True)[source]

Returns the list of file extensions supported by silx.open.

The result filters out formats whose required module is not available.

Parameters

flat_formats (bool) – If True, also include flat formats like npy or edf (provided the required module is available)

Returns

A dictionary indexed by file description and containing a set of extensions (an extension is a string like “*.ext”).

Return type

Dict[str, Set[str]]
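
Example (a minimal sketch; the output depends on which optional modules, e.g. h5py or fabio, are available):

>>> from silx.io.utils import supported_extensions
>>> for description, extensions in supported_extensions().items():
...     print(description, sorted(extensions))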

save1D(fname, x, y, xlabel=None, ylabels=None, filetype=None, fmt='%.7g', csvdelim=';', newline='\n', header='', footer='', comments='#', autoheader=False)[source]

Saves any number of curves to various formats: Specfile, CSV, txt or npy. All curves must have the same number of points and share the same x values.

Parameters
  • fname – Output file path, or file handle open in write mode. If fname is a path, the file is opened in w mode. An existing file with the same name will be overwritten.

  • x – 1D-Array (or list) of abscissa values.

  • y – 2D-array (or list of lists) of ordinates values. First index is the curve index, second index is the sample index. The length of the second dimension (number of samples) must be equal to len(x). y can be a 1D-array in case there is only one curve to be saved.

  • filetype – Filetype: "spec", "csv", "txt", "ndarray". If None, filetype is detected from file name extension (.dat, .csv, .txt, .npy).

  • xlabel – Abscissa label

  • ylabels – List of y labels

  • fmt – Format string for data. You can specify a short format string that defines a single format for both x and y values, or a list of two different format strings (e.g. ["%d", "%.7g"]). Default is "%.7g". This parameter does not apply to the npy format.

  • csvdelim – String or character separating columns in txt and CSV formats. The user is responsible for ensuring that this delimiter is not used in data labels when writing a CSV file.

  • newline – String or character separating lines/records in txt format (default is line break character \n).

  • header – String that will be written at the beginning of the file in txt format.

  • footer – String that will be written at the end of the file in txt format.

  • comments – String that will be prepended to the header and footer strings, to mark them as comments. Default: #.

  • autoheader – In CSV or txt, True causes the first header line to be written as a standard CSV header line with column labels separated by the specified CSV delimiter.

When saving to Specfile format, each curve is saved as a separate scan with two data columns (x and y).

CSV and txt formats are similar, except that the txt format allows user defined header and footer text blocks, whereas the CSV format has only a single header line with columns labels separated by field delimiters and no footer. The txt format also allows defining a record separator different from a line break.

The npy format is written with numpy.save and can be read back with numpy.load. If xlabel and ylabels are undefined, data is saved as a regular 2D numpy.ndarray (concatenation of x and y). If both xlabel and ylabels are defined, the data is saved as a numpy.recarray after being transposed and having labels assigned to columns.
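
Example (a minimal sketch saving two curves that share the same x values to a CSV file; the file name "curves.csv" and the labels are arbitrary examples):

>>> import numpy
>>> from silx.io.utils import save1D
>>> x = numpy.linspace(0.0, 10.0, 50)
>>> y = [numpy.sin(x), numpy.cos(x)]  # two curves, same number of samples
>>> save1D("curves.csv", x, y, xlabel="x", ylabels=["sin(x)", "cos(x)"],
...        filetype="csv", autoheader=True)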

savetxt(fname, X, fmt='%.7g', delimiter=';', newline='\n', header='', footer='', comments='#')[source]

Backport of numpy.savetxt with the header and footer arguments introduced in numpy 1.7.0.

See numpy.savetxt help: http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.savetxt.html

savespec(specfile, x, y, xlabel='X', ylabel='Y', fmt='%.7g', scan_number=1, mode='w', write_file_header=True, close_file=False)[source]

Saves one curve to a SpecFile.

The curve is saved as a scan with two data columns. To save multiple curves to a single SpecFile, call this function for each curve by providing the same file handle each time.

Parameters
  • specfile – Output SpecFile name, or file handle open in write or append mode. If a file name is provided, a new file is opened in write mode (an existing file with the same name will be lost)

  • x – 1D-Array (or list) of abscissa values

  • y – 1D-array (or list) of ordinates values

  • xlabel – Abscissa label (default "X")

  • ylabel – Ordinate label

  • fmt – Format string for data. You can specify a short format string that defines a single format for both x and y values, or a list of two different format strings (e.g. ["%d", "%.7g"]). Default is "%.7g".

  • scan_number – Scan number (default 1).

  • mode – Mode for opening file: w (default), a, r+, w+, a+. This parameter is only relevant if specfile is a path.

  • write_file_header – If True, write a file header (#F and #D lines) before writing the scan.

  • close_file – If True, close the file after saving the curve.

Returns

None if close_file is True, otherwise the file handle.
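
Example (a minimal sketch writing two scans to a single SpecFile by reusing the returned file handle; the file name "curves.dat" is an arbitrary example):

>>> import numpy
>>> from silx.io.utils import savespec
>>> x = numpy.arange(10)
>>> specf = savespec("curves.dat", x, x ** 2, xlabel="x", ylabel="x^2",
...                  scan_number=1, close_file=False)
>>> savespec(specf, x, x ** 3, xlabel="x", ylabel="x^3",
...          scan_number=2, write_file_header=False, close_file=True)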

h5ls(h5group, lvl=0)[source]

Return a simple string representation of an HDF5 tree structure.

Parameters
  • h5group – Any h5py.Group or h5py.File instance, or an HDF5 file name

  • lvl – Number of tabulations added to the group. lvl is incremented as we recursively process sub-groups.

Returns

String representation of an HDF5 tree structure

Group names and dataset representation are printed preceded by a number of tabulations corresponding to their depth in the tree structure. Datasets are represented as h5py.Dataset objects.

Example:

>>> print(h5ls("Downloads/sample.h5"))
+fields
    +fieldB
        <HDF5 dataset "z": shape (256, 256), type "<f4">
    +fieldE
        <HDF5 dataset "x": shape (256, 256), type "<f4">
        <HDF5 dataset "y": shape (256, 256), type "<f4">

Note

This function requires h5py to be installed.

open(filename)[source]

Open a file as an h5py-like object.

Supported formats:

  • h5 files, if the h5py module is installed

  • SPEC files exposed as a NeXus layout

  • raster files exposed as a NeXus layout (if fabio is installed)

  • Numpy files (npy and npz files)

The filename can be followed by an HDF5 path using the :: separator. In this case the returned object is a proxy to the target node, implementing the close function and supporting the with context manager.

The file is opened in read-only mode.

Parameters

filename (str) – A filename, which can contain an HDF5 path using the :: separator.

Raises

IOError – If the file can't be loaded or the path can't be found

Return type

h5py-like node
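
Example (a minimal sketch; the file name "data.h5" and the path "/entry/measurement" are arbitrary examples):

>>> import silx.io
>>> with silx.io.open("data.h5") as h5file:
...     print(list(h5file.keys()))
...
>>> # Open a proxy to a node inside the file using the "::" separator
>>> with silx.io.open("data.h5::/entry/measurement") as group:
...     print(list(group.keys()))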

get_h5_class(obj=None, class_=None)[source]

Returns the HDF5 type relative to the object or to the class.

Parameters
  • obj – Instance of an object

  • class_ – A class

Return type

H5Type

h5type_to_h5py_class(type_)[source]

Returns an h5py class from an H5Type. None if nothing found.

Parameters

type_ (H5Type) –

Return type

h5py class

get_h5py_class(obj)[source]

Returns the h5py class from an object.

If it is an h5py object or an h5py-like object, an h5py class is returned. If the object is not an h5py-like object, None is returned.

Parameters

obj – An object

Returns

An h5py class, or None if the object is not an h5py-like object

is_file(obj)[source]

True if the object is an h5py.File-like object.

Parameters

obj – An object

is_group(obj)[source]

True if the object is an h5py.Group-like object. A file is a group.

Parameters

obj – An object

is_dataset(obj)[source]

True if the object is an h5py.Dataset-like object.

Parameters

obj – An object

is_softlink(obj)[source]

True if the object is an h5py.SoftLink-like object.

Parameters

obj – An object
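
Example (a minimal sketch using these type-check helpers to walk an h5py-like tree; the file name "data.h5" is an arbitrary example):

>>> import silx.io
>>> from silx.io.utils import is_group, is_dataset
>>> def walk(node, indent=0):
...     # Recursively print groups and datasets of an h5py-like node
...     for name, child in node.items():
...         if is_group(child):
...             print("  " * indent + name + "/")
...             walk(child, indent + 1)
...         elif is_dataset(child):
...             print("  " * indent + name, child.shape, child.dtype)
...
>>> with silx.io.open("data.h5") as h5file:
...     walk(h5file)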

get_data(url)[source]

Returns numpy data from a URL.

Examples:

>>> # 1st frame from an EDF using silx.io.open
>>> data = silx.io.get_data("silx:/users/foo/image.edf::/scan_0/instrument/detector_0/data[0]")
>>> # 1st frame from an EDF using fabio
>>> data = silx.io.get_data("fabio:/users/foo/image.edf::[0]")

Two schemes are currently supported by the function:

  • If the silx scheme is used, the file is opened using silx.io.open() and the data is usually reached using NeXus paths.

  • If the fabio scheme is used, the file is opened using fabio.open() from the FabIO library. No data path has to be specified, but each frame can be accessed using data slicing. This shortcut of silx.io.open() allows faster access to the data.

Parameters

url (Union[str, silx.io.url.DataUrl]) – A data URL

Return type

Union[numpy.ndarray, numpy.generic]

Raises
  • ImportError – If the mandatory library to read the file is not available.

  • ValueError – If the URL is not valid or does not match the data

  • IOError – If the file is not found, or in case of an internal error in fabio.open() or silx.io.open(). In the latter case, more information is displayed in debug mode.

rawfile_to_h5_external_dataset(bin_file, output_url, shape, dtype, overwrite=False)[source]

Create an HDF5 dataset at output_url pointing to the given bin_file as an external dataset.

The shape and dtype of the raw data must be provided.

Parameters
  • bin_file (str) – Path to the raw binary data file

  • output_url (DataUrl) – HDF5 URL where to save the external dataset

  • shape (tuple) – Shape of the volume

  • dtype (numpy.dtype) – Data type of the volume elements

  • overwrite (bool) – True to allow overwriting (default: False).
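
Example (a minimal sketch; the file names, data path, shape and dtype are arbitrary examples for a raw binary volume of known layout):

>>> import numpy
>>> from silx.io.url import DataUrl
>>> from silx.io.utils import rawfile_to_h5_external_dataset
>>> url = DataUrl(file_path="volume.h5", data_path="/volume/data", scheme="silx")
>>> rawfile_to_h5_external_dataset("volume.raw", url, shape=(100, 256, 256),
...                                dtype=numpy.float32, overwrite=True)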

vol_to_h5_external_dataset(vol_file, output_url, info_file=None, vol_dtype=<class 'numpy.float32'>, overwrite=False)[source]

Create an HDF5 dataset at output_url pointing to the given vol_file.

If the .vol.info file containing the shape is not in the same folder as the .vol file, its location must be specified via info_file.

Parameters
  • vol_file (str) – Path to the .vol file

  • output_url (DataUrl) – HDF5 URL where to save the external dataset

  • info_file (Union[str,None]) – .vol.info file name written by pyhst and containing the shape information

  • vol_dtype (numpy.dtype) – Data type of the volume elements (default: float32)

  • overwrite (bool) – True to allow overwriting (default: False).

Raises

ValueError – If the shape cannot be read from the .vol.info file
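
Example (a minimal sketch, assuming a pyhst .vol volume with its .vol.info file; all file names and the data path are arbitrary examples):

>>> from silx.io.url import DataUrl
>>> from silx.io.utils import vol_to_h5_external_dataset
>>> url = DataUrl(file_path="volume.h5", data_path="/volume/data", scheme="silx")
>>> vol_to_h5_external_dataset("reconstruction.vol", url,
...                            info_file="reconstruction.vol.info", overwrite=True)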