spech5: h5py-like API to SpecFile
This module provides an h5py-like API to access SpecFile data.
API description
SpecFile data structure exposed by this API:
/
    1.1/
        title = "…"
        start_time = "…"
        instrument/
            specfile/
                file_header = "…"
                scan_header = "…"
            positioners/
                motor_name = value
                …
            mca_0/
                data = …
                calibration = …
                channels = …
                preset_time = …
                elapsed_time = …
                live_time = …
            mca_1/
                …
            …
        measurement/
            colname0 = …
            colname1 = …
            …
            mca_0/
                 data -> /1.1/instrument/mca_0/data
                 info -> /1.1/instrument/mca_0/
            …
        sample/
            ub_matrix = …
            unit_cell = …
            unit_cell_abc = …
            unit_cell_alphabetagamma = …
    2.1/
        …
file_header and scan_header are the raw headers as they
appear in the original file, as a string of lines separated by newline (\n) characters.
The title is the content of the #S scan header line without the leading
#S (e.g. "1  ascan  ss1vo -4.55687 -0.556875  40 0.2").
The start time is converted to ISO8601 format ("2016-02-23T22:49:05Z"),
if the original date format is standard.
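The conversion can be sketched as follows. This is only an illustration, not the module's actual code: the helper name spec_date_to_iso8601 is hypothetical, and the "standard" input format is assumed to be the usual C-style date (for example "Tue Feb 23 22:49:05 2016"). Non-standard dates are returned unchanged.

```python
from datetime import datetime


def spec_date_to_iso8601(date_string):
    """Return an ISO 8601 string, or the input unchanged if it cannot be parsed.

    Hypothetical helper: the assumed input format is "%a %b %d %H:%M:%S %Y".
    """
    try:
        parsed = datetime.strptime(date_string.strip(), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        # Non-standard date: keep the original text rather than guess.
        return date_string
    # Assumption: "Z" is appended as in the example above,
    # without any timezone conversion.
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(spec_date_to_iso8601("Tue Feb 23 22:49:05 2016"))
```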
Numeric datasets are stored in float32 format, except for scalar integers which are stored as int64.
Motor positions (e.g. /1.1/instrument/positioners/motor_name) can be
1D numpy arrays if they are measured as scan data, or else scalars as defined
on #P scan header lines. A simple test is done to check if the motor name
is also a data column header defined in the #L scan header line.
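The test described above might look like the sketch below. The function name and the two dictionaries are hypothetical stand-ins for the parsed #P and #L information; the real implementation returns numpy arrays rather than lists.

```python
def motor_position(motor_name, header_positions, data_columns):
    """Return scan data if the motor is also a data column, else the header scalar.

    header_positions: dict of motor name -> scalar (from the #P lines)
    data_columns:     dict of column label -> sequence (labels from the #L line)
    Both arguments are illustrative assumptions, not part of the spech5 API.
    """
    if motor_name in data_columns:
        # The motor was recorded at every scan point: expose the whole column.
        return list(data_columns[motor_name])
    # Otherwise fall back to the single position stored in the scan header.
    return header_positions[motor_name]


header_positions = {"ss1vo": -4.5, "tx3": 12.0}
data_columns = {"ss1vo": [-4.5, -4.4, -4.3], "Epoch": [0.0, 1.0, 2.0]}

print(motor_position("ss1vo", header_positions, data_columns))
print(motor_position("tx3", header_positions, data_columns))
```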
Scan data  (e.g. /1.1/measurement/colname0) is accessed by column,
the dataset name colname0 being the column label as defined in the #L
scan header line.
If a / character is present in a column label or in a motor name in the
original SPEC file, it will be substituted with a % character in the
corresponding dataset name.
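A minimal sketch of this substitution, using a hypothetical helper name (to_dataset_name), shows why it is needed: a / inside a label would otherwise be read as a path separator.

```python
def to_dataset_name(label):
    """Replace '/' so that the label is usable as a single HDF5-style path component.

    Hypothetical helper illustrating the substitution described above.
    """
    return label.replace("/", "%")


print(to_dataset_name("I0/I1"))       # label containing a slash
print(to_dataset_name("Pslit HGap"))  # label left untouched
```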
MCA data is exposed as a 2D numpy array containing all spectra for a given analyser. The number of analysers is calculated as the number of MCA spectra per scan data line. Demultiplexing is then performed to assign the correct spectra to a given analyser.
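The demultiplexing step can be illustrated as below. This sketch assumes that the spectra are stored in file order and interleaved by analyser (analyser 0, analyser 1, …, then the next data line); that ordering and the function name are assumptions made for the example, and plain lists are used instead of numpy arrays.

```python
def demultiplex_mca(spectra, n_data_lines):
    """Group interleaved spectra by analyser.

    spectra:      list of spectra in file order (assumed interleaved by analyser)
    n_data_lines: number of scan data lines
    Returns one list of spectra per analyser.
    """
    if n_data_lines <= 0 or len(spectra) % n_data_lines != 0:
        raise ValueError("number of spectra is not a multiple of the data lines")
    # Number of analysers = number of MCA spectra per scan data line.
    n_analysers = len(spectra) // n_data_lines
    # Analyser i owns every n_analysers-th spectrum, starting at index i.
    return [spectra[i::n_analysers] for i in range(n_analysers)]


# Two data lines, two analysers: spectra alternate between analysers.
spectra = [[1, 1], [20, 20], [2, 2], [30, 30]]
per_analyser = demultiplex_mca(spectra, n_data_lines=2)
print(per_analyser[0])  # spectra of the first analyser
print(per_analyser[1])  # spectra of the second analyser
```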
MCA calibration is an array of 3 scalars, from the #@CALIB header line.
It is identical for all MCA analysers, as there can be only one
#@CALIB line per scan.
MCA channels is an array containing all channel numbers. This information is
computed from the #@CHANN scan header line (if present), or else from
the shape of the first spectrum in a scan ([0, …, len(first_spectrum) - 1]).
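The two ways of obtaining the channels can be sketched as follows. The field order assumed for the #@CHANN line (number of channels, first channel, last channel, increment) is an assumption of this example, as is the function name; check it against your own files.

```python
def mca_channels(first_spectrum, chann_line=None):
    """Return the list of channel numbers for an MCA.

    chann_line: optional "#@CHANN ..." header line. Assumed field order:
                number of channels, first channel, last channel, increment.
    """
    if chann_line is not None:
        fields = chann_line.split()
        _, first, last, step = (int(value) for value in fields[1:5])
        return list(range(first, last + 1, step))
    # No #@CHANN line: number the channels from the spectrum length.
    return list(range(len(first_spectrum)))


print(mca_channels([5, 7, 9, 11]))
print(mca_channels([5, 7, 9, 11], "#@CHANN 4 10 13 1"))
```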
Accessing data
Data and groups are accessed in h5py fashion:
from silx.io.spech5 import SpecH5
# Open a SpecFile
sfh5 = SpecH5("test.dat")
# using SpecH5 as a regular group to access scans
scan1group = sfh5["1.1"]
instrument_group = scan1group["instrument"]
# alternative: full path access
measurement_group = sfh5["/1.1/measurement"]
# accessing a scan data column by name as a 1D numpy array
data_array = measurement_group["Pslit HGap"]
# accessing all mca-spectra for one MCA device
mca_0_spectra = measurement_group["mca_0/data"]
SpecH5 files and groups provide a keys() method:
>>> sfh5.keys()
['96.1', '97.1', '98.1']
>>> sfh5['96.1'].keys()
['title', 'start_time', 'instrument', 'measurement']
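They can also be iterated over. As in h5py, iterating over a file or group yields member names, so use values() to loop over the nodes themselves:
from silx.io import is_dataset
sfh5 = SpecH5("test.dat")
for scan_group in sfh5.values():
    # keep only the datasets (data columns), skipping sub-groups such as mca_0
    dataset_names = [item.name for item in scan_group["measurement"].values()
                     if is_dataset(item)]
    print("Found data columns in scan " + scan_group.name)
    print(", ".join(dataset_names))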
You can test for existence of data or groups:
>>> "/1.1/measurement/Pslit HGap" in sfh5
True
>>> "positioners" in sfh5["/2.1/instrument"]
True
>>> "spam" in sfh5["1.1"]
False
Strings are stored encoded as numpy.string_, as recommended by
the h5py documentation.
This ensures maximum compatibility with third party software libraries,
when saving a SpecH5 to a HDF5 file using silx.io.spectoh5.
The type numpy.string_ is a byte-string format. The consequence of this
is that you should decode strings before using them in Python 3:
>>> from silx.io.spech5 import SpecH5
>>> sfh5 = SpecH5("31oct98.dat")
>>> sfh5["/68.1/title"]
b'68  ascan  tx3 -28.5 -24.5  20 0.5'
>>> sfh5["/68.1/title"].decode()
'68  ascan  tx3 -28.5 -24.5  20 0.5'
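When the same code has to handle both byte strings and regular strings, a small helper can hide the difference. The sketch below is an illustration with a hypothetical name (to_text); it relies only on the fact that byte strings are instances of bytes, and assumes UTF-8 content.

```python
def to_text(value, encoding="utf-8"):
    """Return a str, decoding byte strings and leaving text unchanged."""
    if isinstance(value, bytes):
        # Covers plain bytes and bytes subclasses alike.
        return value.decode(encoding)
    return str(value)


title = b"68  ascan  tx3 -28.5 -24.5  20 0.5"
print(to_text(title))
print(to_text("already text"))
```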
Classes
- 
class silx.io.spech5.SpecH5(filename)[source]¶
- Bases: silx.io.commonh5.File, silx.io.spech5.SpecH5Group - This class opens a SPEC file and exposes it as a h5py.File. - It inherits silx.io.commonh5.Group (via commonh5.File), which implements most of its API. - 
filename¶
 - 
attrs¶
- Returns HDF5 attributes of this node. - Return type: - dict 
 - 
__contains__(name)¶
- Returns true if name is an existing child of this group. - Return type: - bool 
 - 
__enter__()¶
 - 
__exit__(exc_type, exc_val, exc_tb)¶
 - 
__getitem__(name)¶
- Return a child from its name. - Parameters: - name (str) – name of a member, or a path through members using ‘/’ as a separator. A leading ‘/’ gives access to the root item of the tree. - Return type: - Node 
 - 
__iter__()¶
- Iterate over member names 
 - 
__len__()¶
- Returns the number of children contained in this group. - Return type: - int 
 - 
basename¶
- Returns the HDF5 basename of this node. 
 - 
create_dataset(name, shape=None, dtype=None, data=None, **kwds)¶
- Create and return a sub dataset. - Parameters: - name (str) – Name of the dataset.
- shape – Dataset shape. Use “()” for scalar datasets. Required if “data” isn’t provided.
- dtype – Numpy dtype or string. If omitted, dtype(‘f’) will be used. Required if “data” isn’t provided; otherwise, overrides data array’s dtype.
- data (numpy.ndarray) – Provide data to initialize the dataset. If used, you can omit shape and dtype arguments.
- kwds – Extra arguments. Nothing yet supported.
 
 - 
create_group(name)¶
- Create and return a new subgroup. - Name may be absolute or relative. Fails if the target name already exists. - Parameters: - name (str) – Name of the new group 
 - 
file¶
- Returns the file node of this node. - Return type: - Node 
 - 
get(name, default=None, getclass=False, getlink=False)¶
- Retrieve an item or other information. - If only getlink is true, the returned value is always h5py.HardLink, because this implementation does not use links. This mirrors the original h5py implementation. - Parameters: - name (str) – name of the item
- default (object) – default value returned if the name is not found
- getclass (bool) – if true, the returned object is the class of the object found
- getlink (bool) – if true, link objects are returned instead of the target
 - Returns: - An object, else None - Return type: - object 
 - 
h5py_class¶
- Returns the h5py.File class 
 - 
items()¶
- Returns a list of tuples containing (name, node) pairs. 
 - 
keys()¶
- Returns a list of the children’s names. 
 - 
mode¶
 - 
name¶
- Returns the HDF5 name of this node. 
 - 
parent¶
- Returns the parent of the node. - Return type: - Node 
 - 
values()¶
- Returns a list of the children nodes (groups and datasets). - New in version 0.6. 
 - 
visit(func, visit_links=False)¶
- Recursively visit all names in this group and subgroups. See the documentation for h5py.Group.visit for more help. - Parameters: - func (function) – Callable (function, method or callable object) 
 - 
visititems(func, visit_links=False)¶
- Recursively visit names and objects in this group. See the documentation for h5py.Group.visititems for more help. - Parameters: - func (function) – Callable (function, method or callable object)
- visit_links (bool) – If False, ignore links. If True, call func(name) for links and recurse into target groups.
 
 
- 
- 
class silx.io.spech5.SpecH5Group[source]¶
- Bases: object - This convenience class is to be inherited by all groups, for compatibility with code that tests for isinstance(obj, SpecH5Group). - This legacy behavior is deprecated. The correct way to test if an object is a group is to use silx.io.utils.is_group(). - Groups must also inherit silx.io.commonh5.Group, which actually implements all the methods and attributes.
- 
class silx.io.commonh5.Group(name, parent=None, attrs=None)[source]¶
- Bases: silx.io.commonh5.Node - This class mimics a h5py.Group. - 
h5py_class¶
- Returns the h5py class that is mimicked by this class. - It returns h5py.Group - Return type: - Class 
 - 
get(name, default=None, getclass=False, getlink=False)[source]¶
- Retrieve an item or other information. - If only getlink is true, the returned value is always h5py.HardLink, because this implementation does not use links. This mirrors the original h5py implementation. - Parameters: - name (str) – name of the item
- default (object) – default value returned if the name is not found
- getclass (bool) – if true, the returned object is the class of the object found
- getlink (bool) – if true, link objects are returned instead of the target
 - Returns: - An object, else None - Return type: - object 
 - 
visit(func, visit_links=False)[source]¶
- Recursively visit all names in this group and subgroups. See the documentation for h5py.Group.visit for more help. - Parameters: - func (function) – Callable (function, method or callable object) 
 - 
visititems(func, visit_links=False)[source]¶
- Recursively visit names and objects in this group. See the documentation for h5py.Group.visititems for more help. - Parameters: - func (function) – Callable (function, method or callable object)
- visit_links (bool) – If False, ignore links. If True, call func(name) for links and recurse into target groups.
 
 - 
attrs¶
- Returns HDF5 attributes of this node. - Return type: - dict 
 - 
basename¶
- Returns the HDF5 basename of this node. 
 - 
file¶
- Returns the file node of this node. - Return type: - Node 
 - 
name¶
- Returns the HDF5 name of this node. 
 - 
parent¶
- Returns the parent of the node. - Return type: - Node 
 
- 
- 
class silx.io.spech5.SpecH5Dataset[source]¶
- Bases: object - This convenience class is to be inherited by all datasets, for compatibility with code that tests for isinstance(obj, SpecH5Dataset). - This legacy behavior is deprecated. The correct way to test if an object is a dataset is to use silx.io.utils.is_dataset(). - Datasets must also inherit SpecH5NodeDataset or SpecH5LazyNodeDataset, which actually implement all the API.
- 
class silx.io.spech5.SpecH5NodeDataset(name, data, parent=None, attrs=None)[source]¶
- Bases: silx.io.commonh5.Dataset, silx.io.spech5.SpecH5Dataset - This class inherits commonh5.Dataset, to which it adds little extra functionality. The main addition is the proxy behavior that lets it mimic the numpy array stored in this class. - 
value¶
- Returns the data exposed by this dataset. - Deprecated by h5py. It is preferred to use indexing [()]. - Return type: - numpy.ndarray 
 - 
__getitem__(item)¶
- Returns the slice of the data exposed by this dataset. - Return type: - numpy.ndarray 
 - 
__iter__()¶
- Iterate over the first axis. Raises TypeError if the dataset is scalar. 
 - 
__len__()¶
- Returns the size of the data exposed by this dataset. - Return type: - int 
 - 
attrs¶
- Returns HDF5 attributes of this node. - Return type: - dict 
 - 
basename¶
- Returns the HDF5 basename of this node. 
 - 
chunks¶
- Returns chunks as provided by h5py.Dataset. - There are no chunks. 
 - 
compression¶
- Returns compression as provided by h5py.Dataset. - There is no compression. 
 - 
compression_opts¶
- Returns compression options as provided by h5py.Dataset. - There is no compression. 
 - 
dtype¶
- Returns the numpy datatype exposed by this dataset. - Return type: - numpy.dtype 
 - 
file¶
- Returns the file node of this node. - Return type: - Node 
 - 
h5py_class¶
- Returns the h5py class that is mimicked by this class. It can be one of h5py.File, h5py.Group or h5py.Dataset - Return type: - Class 
 - 
name¶
- Returns the HDF5 name of this node. 
 - 
parent¶
- Returns the parent of the node. - Return type: - Node 
 - 
shape¶
- Returns the shape of the data exposed by this dataset. - Return type: - tuple 
 - 
size¶
- Returns the size of the data exposed by this dataset. - Return type: - int 
 
- 
