utils: I/O utilities¶
I/O utility functions.
- NEXUS_HDF5_EXT = ['.h5', '.nx5', '.nxs', '.hdf', '.hdf5', '.cxi']¶
List of possible extensions for HDF5 file formats.
- supported_extensions(flat_formats=True)[source]¶
Returns the list of file extensions supported by silx.open.
The result filters out formats whose required module is not available.
- Parameters
flat_formats (bool) – If true, also include flat formats like npy or edf (when the required module is available)
- Returns
A dictionary indexed by file description and containing a set of extensions (an extension is a string like “*.ext”).
- Return type
Dict[str, Set[str]]
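For illustration, the returned mapping has this shape and can be flattened into a single list of patterns. The descriptions and extension sets below are hypothetical examples, not the actual silx output:

```python
# Hypothetical example of the Dict[str, Set[str]] returned by
# supported_extensions(); the keys and extensions shown here are
# illustrative only.
extensions = {
    "HDF5 files": {"*.h5", "*.hdf5"},
    "NumPy binary files": {"*.npy"},
}

# Flatten the per-format sets into one sorted list of patterns.
all_patterns = sorted(p for patterns in extensions.values() for p in patterns)
print(all_patterns)
```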
- save1D(fname, x, y, xlabel=None, ylabels=None, filetype=None, fmt='%.7g', csvdelim=';', newline='\n', header='', footer='', comments='#', autoheader=False)[source]¶
Saves any number of curves to various formats: Specfile, CSV, txt or npy. All curves must have the same number of points and share the same x values.
- Parameters
fname – Output file path, or file handle open in write mode. If fname is a path, the file is opened in w mode. An existing file with the same name will be overwritten.
x – 1D-array (or list) of abscissa values.
y – 2D-array (or list of lists) of ordinate values. The first index is the curve index, the second index is the sample index. The length of the second dimension (number of samples) must be equal to len(x). y can be a 1D-array when there is only one curve to be saved.
filetype – Filetype: "spec", "csv", "txt", "ndarray". If None, the filetype is detected from the file name extension (.dat, .csv, .txt, .npy).
xlabel – Abscissa label.
ylabels – List of y labels.
fmt – Format string for data. You can specify a short format string that defines a single format for both x and y values, or a list of two different format strings (e.g. ["%d", "%.7g"]). Default is "%.7g". This parameter does not apply to the npy format.
csvdelim – String or character separating columns in txt and CSV formats. The user is responsible for ensuring that this delimiter is not used in data labels when writing a CSV file.
newline – String or character separating lines/records in txt format (default is the line break character \n).
header – String written at the beginning of the file in txt format.
footer – String written at the end of the file in txt format.
comments – String prepended to the header and footer strings to mark them as comments. Default: #.
autoheader – In CSV or txt format, True causes the first header line to be written as a standard CSV header line, with column labels separated by the specified CSV delimiter.
When saving to Specfile format, each curve is saved as a separate scan with two data columns (x and y).
CSV and txt formats are similar, except that the txt format allows user-defined header and footer text blocks, whereas the CSV format has only a single header line with column labels separated by field delimiters and no footer. The txt format also allows defining a record separator different from a line break.
The npy format is written with numpy.save and can be read back with numpy.load. If xlabel and ylabels are undefined, data is saved as a regular 2D numpy.ndarray (concatenation of x and y). If both xlabel and ylabels are defined, the data is saved as a numpy.recarray after being transposed and having labels assigned to columns.
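As a sketch of the column layout targeted by the "csv" filetype (assuming the abscissa is written as the first column and autoheader produces a delimiter-separated label line), an equivalent file can be built with numpy alone:

```python
import numpy as np

# Sketch of a save1D-style CSV: abscissa column followed by one
# column per curve, separated by csvdelim, with an autoheader-style
# label line. This approximates the layout; it is not silx's code.
x = np.array([0.0, 1.0, 2.0])
y = np.array([[0.0, 1.0, 4.0],    # first curve
              [0.0, 1.0, 8.0]])   # second curve

# Stack x and the curves as columns.
data = np.column_stack([x] + list(y))
np.savetxt("curves.csv", data, fmt="%.7g", delimiter=";",
           header="x;y0;y1", comments="")

with open("curves.csv") as f:
    print(f.readline().strip())
```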
- savetxt(fname, X, fmt='%.7g', delimiter=';', newline='\n', header='', footer='', comments='#')[source]¶
Backport of numpy.savetxt adding the header and footer arguments introduced in numpy 1.7.0.
See the numpy.savetxt help: http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.savetxt.html
- savespec(specfile, x, y, xlabel='X', ylabel='Y', fmt='%.7g', scan_number=1, mode='w', write_file_header=True, close_file=False)[source]¶
Saves one curve to a SpecFile.
The curve is saved as a scan with two data columns. To save multiple curves to a single SpecFile, call this function for each curve by providing the same file handle each time.
- Parameters
specfile – Output SpecFile name, or file handle open in write or append mode. If a file name is provided, a new file is opened in write mode (an existing file with the same name will be lost).
x – 1D-array (or list) of abscissa values.
y – 1D-array (or list) of ordinate values, or a list of them. All datasets must have the same length as x.
xlabel – Abscissa label (default "X").
ylabel – Ordinate label; may be a list of labels when multiple curves are to be saved together.
fmt – Format string for data. You can specify a short format string that defines a single format for both x and y values, or a list of two different format strings (e.g. ["%d", "%.7g"]). Default is "%.7g".
scan_number – Scan number (default 1).
mode – Mode for opening the file: w (default), a, r+, w+, a+. This parameter is only relevant if specfile is a path.
write_file_header – If True, write a file header before writing the scan (#F and #D lines).
close_file – If True, close the file after saving the curve.
- Returns
None if close_file is True, else the file handle.
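The text layout targeted by savespec can be sketched in plain Python. The #F/#S/#N/#L markers below are the usual SPEC-format conventions (file header, scan, column count, labels); the exact header lines silx writes, and the scan label "curve", are assumptions for illustration:

```python
# Plain-Python sketch of a one-scan SpecFile with two data columns,
# as savespec produces. Header content here is illustrative, not
# silx's exact output.
x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 4.0]

lines = ["#F curves.dat", ""]            # file header (write_file_header=True)
lines += ["#S 1 curve",                  # scan_number and a label
          "#N 2",                        # number of data columns
          "#L X  Y"]                     # column labels (xlabel, ylabel)
lines += ["%.7g %.7g" % (xi, yi) for xi, yi in zip(x, y)]
spec_text = "\n".join(lines)
print(spec_text)
```

To save several curves into one file, the real function is meant to be called once per curve with the same file handle, appending one scan each time.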
- h5ls(h5group, lvl=0)[source]¶
Return a simple string representation of an HDF5 tree structure.
- Parameters
h5group – Any h5py.Group or h5py.File instance, or an HDF5 file name.
lvl – Number of tabulations added to the group. lvl is incremented as sub-groups are processed recursively.
- Returns
String representation of the HDF5 tree structure.
Group names and dataset representations are printed, preceded by a number of tabulations corresponding to their depth in the tree structure. Datasets are represented as h5py.Dataset objects.
Example:
>>> print(h5ls("Downloads/sample.h5"))
+fields
	+fieldB
		<HDF5 dataset "z": shape (256, 256), type "<f4">
	+fieldE
		<HDF5 dataset "x": shape (256, 256), type "<f4">
		<HDF5 dataset "y": shape (256, 256), type "<f4">
Note
This function requires h5py to be installed.
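The indentation scheme can be sketched without h5py by walking a nested dict standing in for the group hierarchy (a hypothetical stand-in; the real function walks actual h5py objects):

```python
# Sketch of the h5ls indentation: groups get "+name" at their depth,
# datasets get one indented line each. A nested dict stands in for
# h5py groups here.
def tree_repr(group, lvl=0):
    out = []
    for name, node in group.items():
        if isinstance(node, dict):            # group-like: recurse deeper
            out.append("\t" * lvl + "+" + name)
            out.extend(tree_repr(node, lvl + 1))
        else:                                  # dataset-like: one line
            out.append("\t" * lvl + node)
    return out

tree = {"fields": {"fieldB": {"z": '<HDF5 dataset "z": shape (256, 256)>'},
                   "fieldE": {"x": '<HDF5 dataset "x": shape (256, 256)>'}}}
print("\n".join(tree_repr(tree)))
```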
- open(filename)[source]¶
Open a file as an h5py-like object.
Supported formats:
- h5 files, if the h5py module is installed
- SPEC files exposed as a NeXus layout
- raster files exposed as a NeXus layout (if fabio is installed)
- fio files exposed as a NeXus layout
- Numpy files ('npy' and 'npz' files)
The filename may be suffixed with an HDF5 path using the :: separator. In this case the returned object is a proxy to the target node; it implements the close function and supports the with context manager.
The file is opened in read-only mode.
- Parameters
filename (str) – A filename which can contain an HDF5 path by using the :: separator.
- Raises
IOError – If the file can't be loaded or the path can't be found.
- Return type
h5py-like node
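The :: convention can be illustrated by splitting such a filename before opening; a sketch of the split (silx.io.open handles this internally):

```python
# Sketch: split a "filename::h5path" string on the :: separator,
# as silx.io.open does for suffixed filenames.
url = "/data/sample.h5::/entry/instrument/detector/data"
filename, _, h5path = url.partition("::")
print(filename)   # the file to open
print(h5path)     # the HDF5 path of the target node
```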
- get_h5_class(obj=None, class_=None)[source]¶
Returns the HDF5 type relative to the object or to the class.
- Parameters
obj – Instance of an object
class_ – A class
- Return type
H5Type
- h5type_to_h5py_class(type_)[source]¶
Returns the h5py class corresponding to an H5Type, or None if nothing is found.
- Parameters
type_ (H5Type) –
- Return type
h5py class
- get_h5py_class(obj)[source]¶
Returns the h5py class from an object.
If it is an h5py object or an h5py-like object, an h5py class is returned. If the object is not an h5py-like object, None is returned.
- Parameters
obj – An object
- Returns
An h5py object
- is_group(obj)[source]¶
True if the object is an h5py.Group-like object. A file is a group.
- Parameters
obj – An object
- is_dataset(obj)[source]¶
True if the object is an h5py.Dataset-like object.
- Parameters
obj – An object
- is_softlink(obj)[source]¶
True if the object is an h5py.SoftLink-like object.
- Parameters
obj – An object
- is_externallink(obj)[source]¶
True if the object is an h5py.ExternalLink-like object.
- Parameters
obj – An object
- visitall(item)[source]¶
Visit an entity recursively, including links.
Links are yielded but not followed. This is a generator yielding (relative path, object) for each visited item.
- Parameters
item – The item to visit.
- match(group, path_pattern)[source]¶
Generator of paths inside the given h5py-like group matching path_pattern.
- Parameters
path_pattern (str) –
- Return type
Generator[str, None, None]
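The wildcard matching can be sketched with the standard fnmatch module over a plain list of paths standing in for a group's item names (hypothetical data; the real function walks an h5py group):

```python
import fnmatch

# Sketch: select paths matching a wildcard pattern, as match() does
# over an h5py-like group. The paths below are made-up examples.
paths = ["entry_0000/measurement/data",
         "entry_0001/measurement/data",
         "entry_0001/instrument/detector"]
pattern = "entry_*/measurement/data"
matched = [p for p in paths if fnmatch.fnmatch(p, pattern)]
print(matched)
```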
- get_data(url)[source]¶
Returns numpy data from a URL.
Examples:
>>> # 1st frame from an EDF using silx.io.open
>>> data = silx.io.get_data("silx:/users/foo/image.edf::/scan_0/instrument/detector_0/data[0]")
>>> # 1st frame from an EDF using fabio
>>> data = silx.io.get_data("fabio:/users/foo/image.edf::[0]")
Two schemes are supported by the function:
- If the silx scheme is used, the file is opened using silx.io.open() and the data is usually reached using NeXus paths.
- If the fabio scheme is used, the file is opened using fabio.open() from the FabIO library. No data path has to be specified, but each frame can be accessed using data slicing. Compared to silx.io.open(), this shortcut allows faster access to the data.
- Parameters
url (Union[str, silx.io.url.DataUrl]) – A data URL
- Return type
Union[numpy.ndarray, numpy.generic]
- Raises
ImportError – If the mandatory library to read the file is not available.
ValueError – If the URL is not valid or does not match the data.
IOError – If the file is not found, or in case of an internal error in fabio.open() or silx.io.open(). In the latter case, more information is displayed in debug mode.
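The parts of such a URL (scheme, file path, data path, optional frame slice) can be sketched with plain string handling; silx.io.url.DataUrl does the real parsing:

```python
import re

# Sketch: pick apart a "scheme:file_path::data_path[frame]" URL.
# This mimics the URL shape shown in the examples above; it is not
# silx's parser.
url = "silx:/users/foo/image.edf::/scan_0/instrument/detector_0/data[0]"
scheme, rest = url.split(":", 1)
file_path, _, data_slot = rest.partition("::")

# Separate an optional trailing "[N]" frame index from the data path.
m = re.match(r"(.*?)(?:\[(\d+)\])?$", data_slot)
data_path, frame = m.group(1), m.group(2)
print(scheme, file_path, data_path, frame)
```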
- rawfile_to_h5_external_dataset(bin_file, output_url, shape, dtype, overwrite=False)[source]¶
Create an HDF5 dataset at output_url pointing to the given bin_file as an external dataset.
The shape and dtype of the raw data must be provided.
- Parameters
bin_file (str) – Path to the .vol file
output_url (DataUrl) – HDF5 URL where to save the external dataset
shape (tuple) – Shape of the volume
dtype (numpy.dtype) – Data type of the volume elements
overwrite (bool) – True to allow overwriting (default: False).
- vol_to_h5_external_dataset(vol_file, output_url, info_file=None, vol_dtype=<class 'numpy.float32'>, overwrite=False)[source]¶
Create a HDF5 dataset at output_url pointing to the given vol_file.
If the vol_file.info file containing the shape is not in the same folder as the vol file, you should specify its location with info_file.
- Parameters
vol_file (str) – Path to the .vol file
output_url (DataUrl) – HDF5 URL where to save the external dataset
info_file (Union[str,None]) – .vol.info file name written by pyhst and containing the shape information
vol_dtype (numpy.dtype) – Data type of the volume elements (default: float32)
overwrite (bool) – True to allow overwriting (default: False).
- Raises
ValueError – If the shape cannot be read from the .vol.info file
- h5py_decode_value(value, encoding='utf-8', errors='surrogateescape')[source]¶
Keep bytes when the value cannot be decoded.
- Parameters
value – bytes or array of bytes
encoding (str) –
errors (str) –
- h5py_encode_value(value, encoding='utf-8', errors='surrogateescape')[source]¶
Keep the string when the value cannot be encoded.
- Parameters
value – string or array of strings
encoding (str) –
errors (str) –
- class H5pyDatasetReadWrapper(dset, decode_ascii=False)[source]¶
Wrapper to handle H5T_STRING decoding on-the-fly when reading a dataset. Uniform behaviour for h5py 2.x and h5py 3.x
h5py abuses H5T_STRING with the ASCII character set to store bytes: dset[()] = b"...". Therefore an H5T_STRING with ASCII encoding is not decoded by default.
- class H5pyAttributesReadWrapper(attrs, decode_ascii=False)[source]¶
Wrapper to handle H5T_STRING decoding on-the-fly when reading an attribute. Uniform behaviour for h5py 2.x and h5py 3.x
h5py abuses H5T_STRING with the ASCII character set to store bytes: dset[()] = b"...". Therefore an H5T_STRING with ASCII encoding is not decoded by default.
- h5py_read_dataset(dset, index=(), decode_ascii=False)[source]¶
Read data from a dataset object. UTF-8 strings are decoded, while ASCII strings are only decoded when decode_ascii=True.
- Parameters
dset (h5py.Dataset) –
index – slicing (all by default)
decode_ascii (bool) –
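The decoding rule can be sketched without h5py. The function name read_value and its arguments below are hypothetical stand-ins for the decision h5py_read_dataset makes per value:

```python
# Sketch of the decoding rule: UTF-8 typed strings are always
# decoded, ASCII-typed bytes only when decode_ascii=True; otherwise
# the raw bytes are kept.
def read_value(value, is_utf8, decode_ascii=False):
    if isinstance(value, bytes) and (is_utf8 or decode_ascii):
        return value.decode("utf-8", errors="surrogateescape")
    return value

print(read_value(b"hello", is_utf8=True))                       # decoded to str
print(read_value(b"hello", is_utf8=False))                      # bytes kept
print(read_value(b"hello", is_utf8=False, decode_ascii=True))   # decoded to str
```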