dictdump: Dumping and loading dictionaries
This module offers a set of functions to dump a Python dictionary indexed by text strings to the following file formats: HDF5, INI, JSON.
- dicttoh5(treedict, h5file, h5path='/', mode='w', overwrite_data=None, create_dataset_args=None, update_mode=None)
Write a nested dictionary to an HDF5 file, using keys as member names.
If a dictionary value is a sub-dictionary, a group is created. If it is any other data type, it is cast into a numpy array and written as a h5py dataset. Dictionary keys must be strings and cannot contain the "/" character.
If dictionary keys are tuples, they are interpreted as HDF5 attributes. The tuples should have the format (dataset_name, attr_name).
Existing HDF5 items can be deleted by providing the dictionary value None, provided that update_mode is "modify" or "replace".
Note
This function requires h5py to be installed.
- Parameters
treedict – Nested dictionary/tree structure with strings or tuples as keys and array-like objects as leaves. The "/" character can be used to define sub-trees. If tuples are used as keys, they should have the format (dataset_name, attr_name) and will add an HDF5 attribute with the corresponding value.
h5file – File name or h5py-like File, Group or Dataset.
h5path – Target path in the HDF5 file relative to h5file. Default is root ("/").
mode – Can be "r+" (read/write, file must exist), "w" (write, existing file is lost), "w-" (write, fail if exists) or "a" (read/write if exists, create otherwise). This parameter is ignored if h5file is a file handle.
overwrite_data – Deprecated. True is approximately equivalent to update_mode="modify" and False is equivalent to update_mode="add".
create_dataset_args – Dictionary of arguments to pass to h5f.create_dataset. This allows you to specify filters and compression parameters. Do not specify name and data.
update_mode – Can be add (default), modify or replace.
add: Extend the existing HDF5 tree when possible. Existing HDF5 items (groups, datasets and attributes) remain untouched.
modify: Extend the existing HDF5 tree when possible, modify existing attributes, modify same-sized dataset values and delete HDF5 items with a None value in the dict tree.
replace: Replace the existing HDF5 tree. Items from the root of the HDF5 tree that are not present in the root of the dict tree will remain untouched.
Example:
    from silx.io.dictdump import dicttoh5

    city_area = {
        "Europe": {
            "France": {
                "Isère": {
                    "Grenoble": 18.44,
                    ("Grenoble", "unit"): "km2",
                },
                "Nord": {
                    "Tourcoing": 15.19,
                    ("Tourcoing", "unit"): "km2",
                },
            },
        },
    }

    create_ds_args = {"compression": "gzip", "shuffle": True, "fletcher32": True}

    dicttoh5(city_area, "cities.h5", h5path="/area", create_dataset_args=create_ds_args)
- nexus_to_h5_dict(treedict, parents=(), add_nx_class=True, has_nx_class=False)
The following conversions are applied:
- key with “{name}@{attr_name}” notation: key converted to 2-tuple
- key with “>{url}” notation: strip “>” and convert value to h5py.SoftLink or h5py.ExternalLink
- Parameters
treedict – Nested dictionary/tree structure with strings as keys and array-like objects as leaves. The "/" character can be used to define sub-trees. The "@" character is used to write attributes. The ">" prefix is used to define links.
parents – Needed to resolve up-links (tuple of HDF5 group names)
add_nx_class – Add “NX_class” attribute when missing
has_nx_class – The “NX_class” attribute is defined in the parent
- Return type
dict
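The key conversion listed above can be illustrated with a small pure-Python sketch. The helper name split_nexus_key is hypothetical and not part of silx; it only mimics the documented “{name}@{attr_name}” to 2-tuple rule, and splitting on the last "@" is an assumption of this sketch. Link handling (the ">" prefix) is left out because it needs h5py objects.

```python
def split_nexus_key(key):
    """Hypothetical helper: turn "name@attr" into ("name", "attr").

    Keys without "@" are returned unchanged. Splitting on the last "@"
    is an assumption of this sketch, not documented silx behaviour.
    """
    if isinstance(key, str) and "@" in key:
        name, attr_name = key.rsplit("@", 1)
        return (name, attr_name)
    return key


# NeXus-style keys: "@" marks an attribute of the named item
nexus_style = {"y": [1, 2, 3], "y@units": "counts", "@NX_class": "NXdata"}

# Same content with attribute keys written as 2-tuples
h5_style = {split_nexus_key(k): v for k, v in nexus_style.items()}
```

In this sketch a key such as "@NX_class", which has no dataset name, maps to a tuple whose first element is the empty string.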
- h5_to_nexus_dict(treedict)
The following conversions are applied:
- 2-tuple key: converted to string (“@” notation)
- h5py.SoftLink value: converted to string (“>” key prefix)
- h5py.ExternalLink value: converted to string (“>” key prefix)
- Parameters
treedict – Nested dictionary/tree structure with strings as keys and array-like objects as leaves. The "/" character can be used to define sub-trees.
- Return type
dict
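The 2-tuple to “@” notation rule for h5_to_nexus_dict can be sketched in plain Python as follows. The helper join_h5_key is hypothetical and is not the silx implementation; it covers only the key conversion and ignores link values, which require h5py.

```python
def join_h5_key(key):
    """Hypothetical helper: turn ("name", "attr") into "name@attr".

    Any key that is not a 2-tuple is returned unchanged.
    """
    if isinstance(key, tuple) and len(key) == 2:
        name, attr_name = key
        return "{}@{}".format(name, attr_name)
    return key


# Attribute keys written as 2-tuples, as accepted by dicttoh5
h5_style = {"Grenoble": 18.44, ("Grenoble", "unit"): "km2"}

# Same content with string-only keys
nexus_style = {join_h5_key(k): v for k, v in h5_style.items()}
```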
- h5todict(h5file, path='/', exclude_names=None, asarray=True, dereference_links=True, include_attributes=False, errors='raise')
Read an HDF5 file and return a nested dictionary with the complete file structure and all data.
Example of usage:
    from silx.io.dictdump import h5todict

    # initialize dict with file header and scan header
    header94 = h5todict("oleg.dat", "/94.1/instrument/specfile")
    # add positioners subdict
    header94["positioners"] = h5todict("oleg.dat", "/94.1/instrument/positioners")
    # add scan data without mca data
    header94["detector data"] = h5todict("oleg.dat", "/94.1/measurement", exclude_names=["mca_"])
Note
This function requires h5py to be installed.
Note
If you write a dictionary to an HDF5 file with dicttoh5() and then read it back with h5todict(), data types are not preserved. All values are cast to numpy arrays before being written to file, and they are read back as numpy arrays (or scalars). In some cases, you may find that a list of heterogeneous data types is converted to a numpy array of strings.
- Parameters
h5file – File name or h5py-like File, Group or Dataset
path (str) – Target path in the HDF5 file relative to h5file
exclude_names (List[str]) – Groups and datasets whose name contains a string in this list will be ignored. Default is None (ignore nothing)
asarray (bool) – True (default) to read scalars as arrays, False to read them as scalars
dereference_links (bool) – True (default) to dereference links, False to preserve the link itself
include_attributes (bool) – True to include HDF5 attributes in the returned dictionary, False (default) to omit them
errors (str) – Handling of errors (HDF5 access issue, broken link, …):
- ‘raise’ (default): Raise an exception
- ‘log’: Log as errors
- ‘ignore’: Ignore errors
- Returns
Nested dictionary
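The note above about lost data types follows from how numpy builds arrays. The snippet below is a minimal sketch that uses numpy alone, without silx or an HDF5 file, to show what the cast does to a heterogeneous list and to a plain Python float. The variable names are illustrative.

```python
import numpy

# A list mixing an int, a str and a float has no common numeric type,
# so numpy falls back to a unicode string array.
mixed = [1, "a", 2.5]
as_array = numpy.array(mixed)
restored = as_array.tolist()  # every element is now a str

# A Python float becomes a zero-dimensional array, not a float
scalar = numpy.array(18.44)
```

If the original Python types matter, one option is to convert the values explicitly after reading them back, for example with float() or tolist().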
- dicttonx(treedict, h5file, h5path='/', add_nx_class=None, **kw)
Write a nested dictionary to an HDF5 file, using string keys as member names. The NeXus convention is used to identify attributes with the "@" character, therefore dataset names should not contain "@".
Similarly, links are identified by keys starting with the ">" character. The corresponding value can be a soft or external link.
- Parameters
treedict – Nested dictionary/tree structure with strings as keys and array-like objects as leaves. The "/" character can be used to define sub-trees. The "@" character is used to write attributes. The ">" prefix is used to define links.
add_nx_class – Add “NX_class” attribute when missing. By default it is True when update_mode is "add" or None.
The named parameters are passed to dicttoh5.
Example:
    import numpy
    from silx.io.dictdump import dicttonx

    gauss = {
        "entry": {
            "title": u"A plot of a gaussian",
            "instrument": {
                "@NX_class": u"NXinstrument",
                "positioners": {
                    "@NX_class": u"NXcollection",
                    "x": numpy.arange(0, 1.1, .1),
                },
            },
            "plot": {
                "y": numpy.array([0.08, 0.19, 0.39, 0.66, 0.9, 1., 0.9, 0.66, 0.39, 0.19, 0.08]),
                ">x": "../instrument/positioners/x",
                "@signal": "y",
                "@axes": "x",
                "@NX_class": u"NXdata",
                "title": u"Gauss Plot",
            },
            "@NX_class": u"NXentry",
            "default": "plot",
        },
        "@NX_class": u"NXroot",
        "@default": "entry",
    }

    dicttonx(gauss, "test.h5")
- nxtodict(h5file, include_attributes=True, **kw)
Read an HDF5 file and return a nested dictionary with the complete file structure and all data.
As opposed to h5todict, all keys will be strings and no h5py objects are present in the tree.
The named parameters are passed to h5todict.
- dicttojson(ddict, jsonfile, indent=None, mode='w')
Serialize ddict as a JSON formatted stream to jsonfile.
- Parameters
ddict – Dictionary (or any object compatible with json.dump).
jsonfile – JSON file name or file-like object. If a file name is provided, the function opens the file in the specified mode and closes it again.
indent – If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None (the default) selects the most compact representation.
mode – File opening mode (w, a, w+…)
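The effect of the indent values described above can be previewed with the standard library json module. This is a sketch that assumes dicttojson forwards indent to json.dump unchanged, which this section suggests but does not state. The sample dictionary is made up for illustration.

```python
import json

data = {"name": "scan", "points": [1, 2]}

# None: everything on a single line
compact = json.dumps(data, indent=None)

# 0: one item per line, with no leading spaces
newlines_only = json.dumps(data, indent=0)

# 2: one item per line, nested levels indented by two spaces
pretty = json.dumps(data, indent=2)

for label, text in [("None", compact), ("0", newlines_only), ("2", pretty)]:
    print("indent={}:".format(label))
    print(text)
```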
- dicttoini(ddict, inifile, mode='w')
Output dict as configuration file (similar to Microsoft Windows INI).
- Parameters
ddict – Dictionary of configuration parameters
inifile – INI file name or file-like object. If a file name is provided, the function opens the file in the specified mode and closes it again.
mode – File opening mode (w, a, w+…)
- dump(ddict, ffile, mode='w', fmat=None)
Dump dictionary to a file
- Parameters
ddict – Dictionary with string keys
ffile – File name or file-like object with a write method
fmat (str) – Output format: "json", "hdf5" or "ini". When None (the default), it uses the filename extension as the format. Dumping to an HDF5 file requires h5py to be installed.
mode (str) – File opening mode (w, a, w+…). Default is “w”, write mode, overwrite if exists.
- Raises
IOError – if file format is not supported
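The format selection described for dump can be sketched as follows. The helper guess_format and the extension table are hypothetical: this section does not list which file extensions are recognised, so the mapping below is an assumption made for illustration only.

```python
import os

# Assumed mapping from file extension to format name; the exact set of
# extensions accepted by dump() is not stated in this section.
EXTENSION_TO_FORMAT = {
    ".json": "json",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
    ".ini": "ini",
}


def guess_format(filename, fmat=None):
    """Hypothetical helper: choose a format as dump() is described to do."""
    if fmat is not None:
        # An explicit format takes precedence over the file name
        if fmat in ("json", "hdf5", "ini"):
            return fmat
        raise IOError("Unknown format " + repr(fmat))
    extension = os.path.splitext(filename)[1].lower()
    if extension not in EXTENSION_TO_FORMAT:
        raise IOError("Unknown format " + repr(extension))
    return EXTENSION_TO_FORMAT[extension]
```

For a file whose name carries no useful extension, passing fmat explicitly avoids the IOError.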
- load(ffile, fmat=None)
Load dictionary from a file
When loading from a JSON or INI file, an OrderedDict is returned to preserve the values’ insertion order.
- Parameters
ffile – File name or file-like object with a read method
fmat – Input format: json, hdf5 or ini. When None (the default), it uses the filename extension as the format. Loading from an HDF5 file requires h5py to be installed.
- Returns
Dictionary (ordered dictionary for JSON and INI)
- Raises
IOError – if file format is not supported
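The ordered-dictionary result mentioned for JSON input can be reproduced with the standard library alone. This is a sketch of the general technique (the object_pairs_hook argument of the json module), shown on an in-memory string. Whether load uses this exact mechanism is an assumption.

```python
import json
from collections import OrderedDict

text = '{"b": 1, "a": {"z": 2, "y": 3}}'

# object_pairs_hook is applied to every JSON object, including nested ones,
# so keys keep the order in which they appear in the file.
loaded = json.loads(text, object_pairs_hook=OrderedDict)

top_level_keys = list(loaded)
nested_keys = list(loaded["a"])
```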