spech5: h5py-like API to SpecFile

This module provides an h5py-like API to access SpecFile data.

API description

SpecFile data structure exposed by this API:

/
    1.1/
        title = "…"
        start_time = "…"
        instrument/
            specfile/
                file_header = "…"
                scan_header = "…"
            positioners/
                motor_name = value
                …
            mca_0/
                data = …
                calibration = …
                channels = …
                preset_time = …
                elapsed_time = …
                live_time = …

            mca_1/
                …
            …
        measurement/
            colname0 = …
            colname1 = …
            …
            mca_0/
                data -> /1.1/instrument/mca_0/data
                info -> /1.1/instrument/mca_0/
            …
    2.1/
        …

file_header and scan_header are the raw headers as they appear in the original file, as a string of lines separated by newline (\n) characters.

The title is the content of the #S scan header line without the leading #S (e.g. "1  ascan  ss1vo -4.55687 -0.556875  40 0.2").
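
As an illustration of the title rule above, the following minimal sketch extracts a title from a raw scan header string. The helper name title_from_scan_header and the two-line header are made up for this example; this is not the code used by silx.

```python
def title_from_scan_header(scan_header):
    """Return the text of the first #S line without the leading '#S'.

    Hypothetical helper for illustration only; not part of silx.
    """
    for line in scan_header.split("\n"):
        if line.startswith("#S"):
            # Drop the two-character marker and the surrounding whitespace
            return line[2:].strip()
    return None


# Made-up scan header: lines separated by newline characters
scan_header = ("#S 1  ascan  ss1vo -4.55687 -0.556875  40 0.2\n"
               "#D Thu Feb 11 09:54:35 2016")
title = title_from_scan_header(scan_header)
```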

The start time is converted to ISO8601 format ("2016-02-23T22:49:05Z"), if the original date format is standard.

Numeric datasets are stored in float32 format, except for scalar integers which are stored as int64.

Motor positions (e.g. /1.1/instrument/positioners/motor_name) can be 1D numpy arrays if they are measured as scan data, or else scalars as defined on #P scan header lines. A simple test is done to check if the motor name is also a data column header defined in the #L scan header line.
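
The test described above can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes that column labels on the #L line are separated by two spaces (the usual SPEC convention, which allows labels such as "Pslit HGap" to contain a single space).

```python
def motor_is_scan_column(motor_name, scan_header):
    """Return True if motor_name is also a column label on the #L line.

    Hypothetical helper; assumes labels are separated by two spaces.
    """
    for line in scan_header.split("\n"):
        if line.startswith("#L "):
            labels = [label.strip()
                      for label in line[3:].split("  ")
                      if label.strip()]
            return motor_name in labels
    # No #L line: the motor cannot be a data column
    return False


# Made-up scan header for the example
scan_header = ("#S 1  ascan  Pslit HGap 0 1  10 0.5\n"
               "#L Pslit HGap  Epoch  Seconds  Detector")
as_array = motor_is_scan_column("Pslit HGap", scan_header)
as_scalar = not motor_is_scan_column("tx3", scan_header)
```

A motor for which the test succeeds would be exposed as a 1D array taken from the scan data; any other motor keeps its scalar value from the #P lines.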

Scan data (e.g. /1.1/measurement/colname0) is accessed by column, the dataset name colname0 being the column label as defined in the #L scan header line.

MCA data is exposed as a 2D numpy array containing all spectra for a given analyser. The number of analysers is calculated as the number of MCA spectra per scan data line. Demultiplexing is then performed to assign the correct spectra to a given analyser.
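
A minimal sketch of this demultiplexing, using plain lists instead of numpy arrays. It assumes the spectra are stored in file order, with one spectrum per analyser following each scan data line; the function name is hypothetical and the actual implementation may differ.

```python
def demultiplex_mca(spectra, n_data_lines):
    """Group interleaved spectra by analyser.

    spectra: all MCA spectra of a scan, in file order (assumption).
    n_data_lines: number of scan data lines.
    Returns one list of spectra per analyser.
    """
    n_analysers, remainder = divmod(len(spectra), n_data_lines)
    if remainder:
        raise ValueError("number of spectra is not a multiple of "
                         "the number of data lines")
    # Analyser i owns every n_analysers-th spectrum, starting at index i
    return [spectra[i::n_analysers] for i in range(n_analysers)]


# Three data lines, two analysers: spectra alternate between analysers
spectra = [[1, 1], [10, 10], [2, 2], [20, 20], [3, 3], [30, 30]]
mca_0, mca_1 = demultiplex_mca(spectra, n_data_lines=3)
```

Each returned list plays the role of the 2D array exposed as mca_0/data, mca_1/data, and so on.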

MCA calibration is an array of 3 scalars, from the #@CALIB header line. It is identical for all MCA analysers, as there can be only one #@CALIB line per scan.

MCA channels is an array containing all channel numbers. This information is computed from the #@CHANN scan header line (if present), or else from the shape of the first spectrum in a scan (channels 0 to len(first_spectrum) - 1).
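
The fallback channel computation and a typical use of the calibration can be sketched as below. Both function names are hypothetical. Interpreting the three calibration scalars as the constant, linear and quadratic coefficients of a polynomial is an assumption (it is the common convention for #@CALIB lines) and is not stated on this page.

```python
def default_channels(first_spectrum):
    """Channel numbers 0 .. len(first_spectrum) - 1 (no #@CHANN line)."""
    return list(range(len(first_spectrum)))


def calibrate(channels, calibration):
    """Apply a polynomial calibration to channel numbers.

    Assumption: calibration = (a, b, c), value = a + b*ch + c*ch**2.
    """
    a, b, c = calibration
    return [a + b * ch + c * ch ** 2 for ch in channels]


first_spectrum = [12, 30, 7, 5]           # made-up counts
channels = default_channels(first_spectrum)
energies = calibrate(channels, (0.5, 2.0, 0.0))
```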

Accessing data

Data and groups are accessed in h5py fashion:

from silx.io.spech5 import SpecH5

# Open a SpecFile
sfh5 = SpecH5("test.dat")

# using SpecH5 as a regular group to access scans
scan1group = sfh5["1.1"]
instrument_group = scan1group["instrument"]

# alternative: full path access
measurement_group = sfh5["/1.1/measurement"]

# accessing a scan data column by name as a 1D numpy array
data_array = measurement_group["Pslit HGap"]

# accessing all mca-spectra for one MCA device
mca_0_spectra = measurement_group["mca_0/data"]

SpecH5 and SpecH5Group both provide a keys() method:

>>> sfh5.keys()
['96.1', '97.1', '98.1']
>>> sfh5['96.1'].keys()
['title', 'start_time', 'instrument', 'measurement']

They can also be treated as iterators:

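from silx.io.spech5 import SpecH5, SpecH5Dataset

for scan_group in SpecH5("test.dat"):
    # keep only the datasets (data columns), skipping subgroups such as mca_0
    dataset_names = [item.name for item in scan_group["measurement"]
                     if isinstance(item, SpecH5Dataset)]
    print("Found data columns in scan " + scan_group.name)
    print(", ".join(dataset_names))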

You can test for existence of data or groups:

>>> "/1.1/measurement/Pslit HGap" in sfh5
True
>>> "positioners" in sfh5["/2.1/instrument"]
True
>>> "spam" in sfh5["1.1"]
False

Strings are stored encoded as numpy.string_, as recommended by the h5py documentation. This ensures maximum compatibility with third-party software libraries when saving a SpecH5 to an HDF5 file using silx.io.spectoh5.

The type numpy.string_ is a byte-string format. The consequence of this is that you should decode strings before using them in Python 3:

>>> from silx.io.spech5 import SpecH5
>>> sfh5 = SpecH5("31oct98.dat")
>>> sfh5["/68.1/title"]
b'68  ascan  tx3 -28.5 -24.5  20 0.5'
>>> sfh5["/68.1/title"].decode()
'68  ascan  tx3 -28.5 -24.5  20 0.5'

Classes

silx.io.spech5.is_group(name)

Check if name matches a valid group name pattern in a SpecH5.

Parameters:name (str) – Full name of member

For example:

  • is_group("/123.456/instrument/") returns True.
  • is_group("spam") returns False because "spam" is not at all a valid group name.
  • is_group("/1.2/instrument/positioners/xyz") returns False because this key would point to a motor position, which is a dataset and not a group.
silx.io.spech5.is_dataset(name)

Check if name matches a valid dataset name pattern in a SpecH5.

Parameters:name (str) – Full name of member

For example:

  • is_dataset("/1.2/instrument/positioners/xyz") returns True because this name could be the key to the dataset recording motor positions for motor xyz in scan 1.2.
  • is_dataset("/123.456/instrument/") returns False because this name points to a group.
  • is_dataset("spam") returns False because "spam" is not at all a valid dataset name.
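
One way such a name check could work is with a regular expression built from the tree layout shown at the top of this page. The sketch below covers group names only, uses a deliberately simplified pattern, and is not the pattern used by silx; is_group_sketch is a hypothetical name.

```python
import re

# Scan key such as "1.1" or "/123.456"
SCAN = r"/?[0-9]+\.[0-9]+"
# Optional subgroup below the scan (simplified; links are not handled)
SUBGROUP = (r"(/(instrument(/(specfile|positioners|mca_[0-9]+))?"
            r"|measurement(/mca_[0-9]+)?))?")
GROUP_PATTERN = re.compile("^" + SCAN + SUBGROUP + "/?$")


def is_group_sketch(name):
    """Return True if name looks like a group path (simplified check)."""
    return GROUP_PATTERN.match(name) is not None


results = [is_group_sketch("/123.456/instrument/"),
           is_group_sketch("spam"),
           is_group_sketch("/1.2/instrument/positioners/xyz")]
```

A pattern-based test only checks the form of the name; it does not tell whether the scan or motor actually exists in a given file.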

Check if name is a valid link to a group in a SpecH5. Return True or False

Parameters:name (str) – Full name of member

Check if name is a valid link to a dataset in a SpecH5. Return True or False

Parameters:name (str) – Full name of member
silx.io.spech5.spec_date_to_iso8601(date, zone=None)

Convert a SpecFile date to ISO8601 format.

Parameters:
  • date (str) – Date (see supported formats below)
  • zone – Time zone as it appears in an ISO8601 date

Supported formats:

  • DDD MMM dd hh:mm:ss YYYY
  • DDD YYYY/MM/dd hh:mm:ss

where DDD is the abbreviated weekday, MMM is the abbreviated month name, MM is the month number (zero padded), dd is the day of the month (zero padded), YYYY is the year, hh the hour (zero padded), mm the minute (zero padded) and ss the second (zero padded). All names are expected to be in English.

Examples:

>>> spec_date_to_iso8601("Thu Feb 11 09:54:35 2016")
'2016-02-11T09:54:35'

>>> spec_date_to_iso8601("Sat 2015/03/14 03:53:50")
'2015-03-14T03:53:50'
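
For readers who want to see how the two supported formats map to standard parsing directives, here is a minimal sketch based on datetime.strptime. It is not the silx implementation: the function name is hypothetical, appending zone verbatim is an assumption, and so is raising ValueError on unsupported input. Weekday and month names are parsed according to the current locale, so an English or C locale is assumed.

```python
from datetime import datetime

# strptime equivalents of the two supported formats
SPEC_DATE_FORMATS = ("%a %b %d %H:%M:%S %Y",   # DDD MMM dd hh:mm:ss YYYY
                     "%a %Y/%m/%d %H:%M:%S")   # DDD YYYY/MM/dd hh:mm:ss


def to_iso8601_sketch(date, zone=None):
    """Convert a SpecFile date string to an ISO8601 string (sketch)."""
    for fmt in SPEC_DATE_FORMATS:
        try:
            parsed = datetime.strptime(date, fmt)
        except ValueError:
            continue
        # Assumption: the zone string is appended as given (e.g. "Z")
        return parsed.isoformat() + (zone or "")
    raise ValueError("Unsupported date format: %r" % date)


first = to_iso8601_sketch("Thu Feb 11 09:54:35 2016")
second = to_iso8601_sketch("Sat 2015/03/14 03:53:50", zone="Z")
```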
class silx.io.spech5.SpecH5Dataset(value, name, file_, parent)

Bases: object

Emulate h5py.Dataset for a SpecFile object.

A SpecH5Dataset instance is basically a proxy for the numpy array value attribute, with additional attributes for compatibility with h5py datasets.

Parameters:
  • value – Actual dataset value
  • name (str) – Dataset full name (posix path format, starting with /)
  • file_ – Parent SpecH5
  • parent – Parent SpecH5Group which contains this dataset
value = None

Actual dataset value: a numpy array, a numpy.string_, a numpy.int_ or a numpy.float32.

All operations applied to an instance of the class operate on this value.

shape = None

Dataset shape, as a tuple with the length of each dimension of the dataset.

dtype = None

Dataset dtype

size = None

Dataset size (number of elements)

name = None

Dataset name (posix path format, starting with /)

parent = None

Parent SpecH5Group object which contains this dataset

file = None

Parent SpecH5 object

attrs = None

Attributes dictionary

compression = None

Compression attribute as provided by h5py.Dataset

compression_opts = None

Compression options attribute as provided by h5py.Dataset

h5py_class

Return h5py class which is mimicked by this class: h5py.Dataset.

Accessing this attribute if h5py is not installed causes an ImportError to be raised.

__len__()
__getitem__(item)
class silx.io.spech5.SpecH5LinkToDataset(value, name, file_, parent)

Bases: silx.io.spech5.SpecH5Dataset

Special SpecH5Dataset representing a link to a dataset. It works exactly like a regular dataset, but SpecH5Group.visit() and SpecH5Group.visititems() methods will recognize that it is a link and will ignore it.

class silx.io.spech5.SpecH5Group(name, specfileh5)

Bases: object

Emulate h5py.Group for a SpecFile object

Parameters:
  • name (str) – Group full name (posix path format, starting with /)
  • specfileh5 – parent SpecH5 instance
name = None

Full name/path of group

file = None

Parent SpecH5 object

attrs = None

Attributes dictionary

h5py_class

Return h5py class which is mimicked by this class: h5py.Group.

Accessing this attribute if h5py is not installed causes an ImportError to be raised.

parent

Parent group (group that contains this group)

__contains__(key)
Parameters:key – Path to child element (e.g. "mca_0/info") or full name of group or dataset (e.g. "/2.1/instrument/positioners")
Returns:True if key refers to a valid member of this group, else False
get(name, default=None, getclass=False, getlink=False)

Retrieve an item by name, or a default value if name does not point to an existing item.

Parameters:
  • name (str) – Name of the item
  • default – Default value returned if the name is not found
  • getclass (bool) – if True, the returned object is the class of the item, instead of the item instance.
  • getlink (bool) – Not implemented. This method always returns an instance of the original class of the requested item (or just the class, if getclass is True)
Returns:

The requested item, or its class if getclass is True, or the specified default value if the group does not contain an item with the requested name.

__getitem__(key)

Return a SpecH5Group or a SpecH5Dataset if key is a valid name of a group or dataset.

key can be a member of self.keys(), i.e. an immediate child of the group, or a path reaching into subgroups (e.g. "instrument/positioners")

In the special case where this group is the root group, key can start with a / character.

Parameters:key (str) – Name of member
Raise:KeyError if key is not a known member of this group.
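
The lookup rules above (immediate child, path into subgroups, optional leading / at the root) can be illustrated with a small stand-in that uses nested dictionaries instead of SpecH5 objects. The resolve function and the sample tree are hypothetical and only mirror the documented behaviour.

```python
def resolve(tree, key):
    """Walk a nested dict following a posix-like path.

    Raises KeyError if any path component is missing.
    """
    node = tree
    for part in key.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(key)
        node = node[part]
    return node


# Made-up tree: dicts stand for groups, other values for datasets
tree = {"1.1": {"instrument": {"positioners": {"tx3": -28.5}},
                "measurement": {}}}

positioners = resolve(tree, "/1.1/instrument/positioners")
tx3 = resolve(tree["1.1"], "instrument/positioners/tx3")
```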
items()
__len__()

Return the number of members (subgroups and datasets) attached to this group.

keys()
Returns:List of all names of members attached to this group
visit(func)

Recursively visit all names in this group and subgroups.

Parameters:func (function) – Callable (function, method or callable object)

You supply a callable (function, method or callable object); it will be called exactly once for each link in this group and every group below it. Your callable must conform to the signature:

func(<member name>) => <None or return value>

Returning None continues iteration, returning anything else stops and immediately returns that value from the visit method. No particular order of iteration within groups is guaranteed.

Example:

from silx.io.spech5 import SpecH5

# Get a list of all contents (groups and datasets) in a SpecFile
mylist = []
f = SpecH5('foo.dat')
f.visit(mylist.append)
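
The callback contract described above (None continues, any other value stops the traversal and is returned) can be sketched on nested dictionaries. This stand-in is not the SpecH5Group code; the visit function below and the sample tree are made up for illustration.

```python
def visit(group, func, prefix=""):
    """Call func(name) for every member, depth first.

    Stops and returns the first value returned by func that is not None.
    """
    for name, member in group.items():
        path = prefix + name
        result = func(path)
        if result is not None:
            return result
        if isinstance(member, dict):          # dicts stand for groups
            result = visit(member, func, path + "/")
            if result is not None:
                return result
    return None


tree = {"title": "1  ascan",
        "instrument": {"positioners": {"tx3": 1.0}},
        "measurement": {"colname0": [1, 2]}}

names = []
visit(tree, names.append)                     # list.append returns None

# Returning a value stops the traversal early
found = visit(tree, lambda n: n if n.endswith("tx3") else None)
```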
visititems(func)

Recursively visit names and objects in this group.

Parameters:func (function) – Callable (function, method or callable object)

You supply a callable (function, method or callable object); it will be called exactly once for each link in this group and every group below it. Your callable must conform to the signature:

func(<member name>, <object>) => <None or return value>

Returning None continues iteration, returning anything else stops and immediately returns that value from the visit method. No particular order of iteration within groups is guaranteed.

Example:

from silx.io.spech5 import SpecH5, SpecH5Dataset

# Get a list of all datasets in a specific scan
mylist = []
def func(name, obj):
    if isinstance(obj, SpecH5Dataset):
        mylist.append(name)

f = SpecH5('foo.dat')
f["1.1"].visititems(func)
class silx.io.spech5.SpecH5LinkToGroup(name, specfileh5)

Bases: silx.io.spech5.SpecH5Group

Special SpecH5Group representing a link to a group.

It works exactly like a regular group but SpecH5Group.visit() and SpecH5Group.visititems() methods will recognize it as a link and will ignore it.
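
To show why ignoring links matters during a recursive visit, the sketch below walks a nested dictionary in which links are represented by a marker class. Link, visititems_skipping_links and the sample tree are all hypothetical stand-ins, not the SpecH5 classes; without the skip, the mca_0 content would be reported twice.

```python
class Link:
    """Marker for a link to another member (stand-in for link classes)."""
    def __init__(self, target_path):
        self.target_path = target_path


def visititems_skipping_links(group, func, prefix=""):
    """Call func(name, obj) for every member that is not a link."""
    for name, member in group.items():
        if isinstance(member, Link):
            continue                          # links are ignored
        path = prefix + name
        func(path, member)
        if isinstance(member, dict):          # dicts stand for groups
            visititems_skipping_links(member, func, path + "/")


tree = {
    "instrument": {"mca_0": {"data": [[1, 2], [3, 4]]}},
    "measurement": {"mca_0": {
        "data": Link("/1.1/instrument/mca_0/data"),
        "info": Link("/1.1/instrument/mca_0/"),
    }},
}

visited = []
visititems_skipping_links(tree, lambda name, obj: visited.append(name))
```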

keys()
Returns:List of all names of members attached to the target group
class silx.io.spech5.SpecH5(filename)

Bases: silx.io.spech5.SpecH5Group

Special SpecH5Group representing the root of a SpecFile.

Parameters:filename (str) – Path to SpecFile in filesystem

In addition to all generic SpecH5Group attributes, this class also keeps a reference to the original SpecFile object and has a filename attribute.

Its immediate children are scans, but it also gives access to any group or dataset in the entire SpecFile tree by specifying the full path.

keys()
Returns:List of all scan keys in this SpecFile (e.g. ["1.1", "2.1"…])
close()

Close the object, and free up associated resources.

After calling this method, attempts to use the object may fail.

h5py_class

h5py class which is mimicked by this class
