Converting various data files to HDF5

This document explains how to convert SPEC files, EDF files and various other data formats into HDF5 files.

An understanding of the way these data formats are exposed by the silx.io.open() function is a prerequisite for this tutorial. You can learn more about this subject by reading Getting started with silx.io.

Using the convert module

The silx module silx.io.convert can be used to convert various data files into an HDF5 file with the same structure as the one exposed by the spech5 or fabioh5 modules.

from silx.io.convert import convert

convert("myspecfile.dat", "myfile.h5")

You can then read the file with any HDF5 reader.

The function silx.io.convert.convert() is a simplified version of a more flexible function silx.io.convert.write_to_h5().

The latter allows you to write scans into a specific HDF5 group in the output file. You can also choose whether to overwrite an existing file or append data to it, and whether existing data with the same name as input data should be overwritten or ignored.

This allows you to repeatedly transfer new content of a SPEC file to an existing HDF5 file, in between two scans.
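The difference between ignoring and overwriting existing data can be illustrated with a toy model in which a plain dictionary stands in for the HDF5 file. This is only a sketch of the semantics described above, not the code used by silx; append_items and the example paths are hypothetical names chosen for illustration.

```python
def append_items(existing, incoming, overwrite_data=False):
    """Merge incoming items into existing, keyed by an HDF5-like path.

    Returns the list of names that were skipped because they already
    existed and overwrite_data was False.
    """
    skipped = []
    for name, value in incoming.items():
        if name in existing and not overwrite_data:
            # existing data is left untouched, the input item is ignored
            skipped.append(name)
            continue
        existing[name] = value
    return skipped


# a dict standing in for the content of the output file (hypothetical data)
h5_content = {"/1.1/title": "ascan 0 1 10"}
# a dict standing in for the content of the input file (hypothetical data)
spec_content = {"/1.1/title": "ascan 0 1 10 (edited)",
                "/2.1/title": "ascan 1 2 10"}

skipped = append_items(h5_content, spec_content)
print(sorted(h5_content), skipped)
```

With the default setting, only the new scan is added. Calling the function again with overwrite_data=True also replaces the entry that already existed.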

The following script is an example of a command line interface to write_to_h5().

import argparse
from silx.io.convert import write_to_h5

parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('input_path',
                    help='Path to input data file')
parser.add_argument('h5_path',
                    help='Path to output HDF5 file')
parser.add_argument('-t', '--target-path', default="/",
                    help='Name of the group in which to save the scans ' +
                         'in the output file')

mode_group = parser.add_mutually_exclusive_group()
mode_group.add_argument('-o', '--overwrite', action="store_true",
                        help='Overwrite output file if it exists, ' +
                             'else create new file.')
mode_group.add_argument('-a', '--append', action="store_true",
                        help='Append data to existing file if it exists, ' +
                             'else create new file.')

parser.add_argument('--overwrite-data', action="store_true",
                    help='In append mode, overwrite existing groups and ' +
                         'datasets in the output file, if they exist with ' +
                         'the same name as input data. By default, existing' +
                         ' data is not touched, corresponding input data is' +
                         ' ignored.')

args = parser.parse_args()

if args.overwrite_data and not args.append:
    print("Option --overwrite-data ignored " +
          "(only relevant combined with option -a)")

if args.overwrite:
    mode = "w"
elif args.append:
    mode = "a"
else:
    # by default, use "write" mode and fail if file already exists
    mode = "w-"

write_to_h5(args.input_path, args.h5_path,
            h5path=args.target_path,
            mode=mode,
            overwrite_data=args.overwrite_data)
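The option handling of the script above can be checked without touching any file by passing an explicit argument list to argparse. The following is a minimal sketch that rebuilds only the mode-related options; build_parser and select_mode are hypothetical helper names and are not part of silx.

```python
import argparse


def build_parser():
    """Rebuild the positional and mode-related options of the script above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('input_path')
    parser.add_argument('h5_path')
    mode_group = parser.add_mutually_exclusive_group()
    mode_group.add_argument('-o', '--overwrite', action="store_true")
    mode_group.add_argument('-a', '--append', action="store_true")
    parser.add_argument('--overwrite-data', action="store_true")
    return parser


def select_mode(args):
    """Map the parsed flags to an HDF5 file mode string."""
    if args.overwrite:
        return "w"
    if args.append:
        return "a"
    # default: create the file, fail if it already exists
    return "w-"


# parse an explicit list instead of sys.argv
args = build_parser().parse_args(
    ["myspecfile.dat", "myfile.h5", "-a", "--overwrite-data"])
print(select_mode(args), args.overwrite_data)
```

Because -o and -a belong to a mutually exclusive group, argparse rejects a command line that contains both.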

However, the functionality of the command line script above, and much more, is already available in the silx convert application.

Using the convert application

New in version 0.6.

silx also provides a silx convert command line application, which allows you to perform standard conversions without having to write your own program.

Type silx convert --help in a terminal to see all available options.

Note

The complete documentation for the silx convert command is available here: silx convert.

Converting single files

The simplest command to convert a single SPEC file to an HDF5 file would be:

silx convert myspecfile.dat

As no output name is supplied, the output file name will be a time-stamp with a .h5 suffix (e.g. 20180110-114930.h5).
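A name of this form can be produced with the standard time module. The sketch below assumes the format string from the example name shown above; default_output_name is a hypothetical helper and is not necessarily how silx builds the name.

```python
import time


def default_output_name(when=None):
    """Return a time-stamp file name such as 20180110-114930.h5.

    when is an optional time.struct_time; the local time is used
    if it is omitted.
    """
    if when is None:
        when = time.localtime()
    return time.strftime("%Y%m%d-%H%M%S", when) + ".h5"


# a fixed time makes the result reproducible
example_time = time.strptime("2018-01-10 11:49:30", "%Y-%m-%d %H:%M:%S")
print(default_output_name(example_time))
```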

The following example allows you to append the content of a SPEC file to an existing HDF5 file:

silx convert myspecfile.dat -m a -o myhdf5file.h5

The -m a argument stands for append mode. The -o myhdf5file.h5 argument is used to specify the output file name.

You can write the data into a specific group of the HDF5 file by providing the complete URL in the format file_path::group_path. For instance:

silx convert myspecfile.dat -m a -o archive.h5::/2017-09-20/SPEC
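The file_path::group_path format can be split into its two parts with plain string operations. This is only an illustration of the format; split_output_url is a hypothetical helper, silx has its own URL handling, and falling back to the root group when no group is given is an assumption.

```python
def split_output_url(url):
    """Split 'file_path::group_path' into (file_path, group_path).

    The root group "/" is returned when no group path is given
    (an assumption made for this sketch).
    """
    file_path, separator, group_path = url.partition("::")
    if not separator or not group_path:
        group_path = "/"
    return file_path, group_path


print(split_output_url("archive.h5::/2017-09-20/SPEC"))
print(split_output_url("myhdf5file.h5"))
```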

Merging a stack of images

silx convert can merge a stack of image files. It supports series of single-frame files and is based on fabio.file_series. All frames must have the same shape.

The following command merges all files matching a pattern:

silx convert --file-pattern ch09__mca_0005_0000_%d.edf -o ch09__mca_0005_0000_multiframe.h5

The data in the output file is presented as a 3D array.

It is possible to provide multiple indices in the file name pattern, and specify a range for each index:

silx convert --file-pattern ch09__mca_0005_%04d_%04d.edf --begin 0,1 --end 0,54
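To see which file names such a pattern and index ranges refer to, the pattern can be expanded with Python's % formatting. The sketch below is not the silx implementation: expand_pattern is a hypothetical helper, and it assumes that the --end indices are inclusive and that the last index varies fastest.

```python
import itertools


def expand_pattern(pattern, begin, end):
    """List the file names matched by a printf-style pattern.

    begin and end hold one first/last index per placeholder in the
    pattern; last indices are treated as inclusive (an assumption).
    """
    ranges = [range(first, last + 1) for first, last in zip(begin, end)]
    return [pattern % indices for indices in itertools.product(*ranges)]


names = expand_pattern("ch09__mca_0005_%04d_%04d.edf",
                       begin=(0, 1), end=(0, 54))
print(len(names), names[0], names[-1])
```

Under these assumptions, the command above would cover 54 files, from ch09__mca_0005_0000_0001.edf to ch09__mca_0005_0000_0054.edf.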