API

Dataset

class darfix.core.dataset.AcquisitionDims[source]

Defines the view of the data that has to be created.

from_dict(_dict)[source]

Convert dictionary of dimensions.

get(axis)[source]

Get the Dimension at a given axis.

Parameters

axis (int) – axis of the dimension.

Returns

the requested dimension, if it exists.

get_names()[source]

Get list with all the names of the dimensions.

Returns

array_like of strings

set_size(axis, size)[source]

Recreates the dimension at the given axis with a new size, keeping the same name and kind.

Parameters
  • axis (int) – axis of the dimension

  • size (int) – new size for the dimension

property shape
Returns

shape of the currently defined dims

class darfix.core.dataset.Data[source]

Class to structure the data and link every image with its corresponding url and metadata.

Parameters
  • urls (array_like) – Array with the urls of the data

  • metadata (array_like) – Array with the metadata of the data

  • in_memory (bool, optional) – If True, the data is loaded into memory, default True

apply_funcs(funcs=[], indices=None, save=False, text='', operation=None)[source]

Method that applies a series of functions to the data. It can save the images to disk or return them.

Parameters
  • funcs (array_like, optional) – List of tuples. Every tuple contains the function to apply and its parameters, defaults to []

  • indices (Union[None, array_like], optional) – Indices of the data to apply the functions to, defaults to None

  • save (bool) – If True, saves the images to disk, defaults to False

  • text (str) – Text to show in the advancement display.

  • operation (int) – Flag of the operation being run, used to check for stop requests

Returns

Array with the new urls (if data was saved)

convert_to_hdf5(_dir)[source]

Converts the data into an HDF5 file, setting flattened images in the rows. TODO: pass the filename as a parameter?

Parameters

_dir (str) – Directory in which to save the HDF5 file.

Returns

HDF5 file

Return type

h5py.File

flatten()[source]
Returns

new data with flattened urls and metadata (but not images).

Return type

Data

property ndim

Number of array dimensions.

>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> x.ndim
1
>>> y = np.zeros((2, 3, 4))
>>> y.ndim
3
reshape(shape)[source]

Returns an array containing the same data with a new shape for urls and metadata. The shape must also contain the image shape in its last two positions, which cannot be reshaped.

Parameters

shape (int or tuple of ints.) – New shape, should be compatible with the original shape.

Returns

new Data object with urls and metadata reshaped to shape.
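
A hedged sketch (the frame count and image size are made up): a Data object holding 20 frames of 256x256 pixels acquired over a 4x5 grid could be reshaped so the first two axes follow the acquisition dimensions, while the image shape stays in the last two positions.

>>> # `data` is assumed to be an existing Data object with 20 frames of 256x256
>>> reshaped = data.reshape((4, 5, 256, 256))
>>> reshaped.shape
(4, 5, 256, 256)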

save(path, indices=None)[source]

Save the data into the path folder and replace the Data urls. TODO: check if urls already exist and, if so, modify only urls[indices].

Parameters
  • path (str) – Path to the folder

  • indices (Union[None,array_like], optional) – the indices of the values to save, defaults to None

property shape

Tuple of array dimensions.

stop_operation(operation)[source]

Method used when threads are created to apply functions to the data. If this method is called, the stop flag of the given operation is set to 0 so that, if that operation is running in another thread, it knows to stop.

Parameters

operation (int) – operation to stop

take(indices, axis=None)[source]

Take elements from urls and metadata from an array along an axis.

Parameters
  • indices (array_like) – the indices of the values to extract

  • axis (Union[None, int], optional) – the axis over which to select values, defaults to None

Returns

Flattened data.

Return type

Data

class darfix.core.dataset.Dataset(_dir, data=None, raw_folder=None, filenames=None, dims=None, treated='treated_data', in_memory=True)[source]

Class to define a dataset from a series of data files.

Parameters
  • _dir (str) – Global directory to use and save all the data in the different operations.

  • data (Data, optional) – If not None, sets the Data array with the data of the dataset to use, default None

  • raw_folder (Union[None,str], optional) – Path to the folder that contains the data, defaults to None

  • filenames (Union[Generator,Iterator,List], optional) – Ordered list of filenames, defaults to None.

  • treated (str, optional) – Name of the folder with the treated data, defaults to 'treated_data'

  • in_memory (bool, optional) – If True, data is loaded into memory; else, data is read in chunks depending on the algorithm to apply, defaults to True.
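
A minimal construction sketch; the folder paths below are hypothetical stand-ins for a real experiment layout.

>>> from darfix.core.dataset import Dataset
>>> dataset = Dataset(_dir="/tmp/darfix_results",
...                   raw_folder="/data/experiment/scan1",
...                   in_memory=True)
>>> dataset.nframes  # number of frames found in the raw folder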

add_dim(axis, dim)[source]

Adds a dimension to the dimension’s dictionary.

Parameters
  • axis (int) – axis of the dimension.

  • dim (Dimension) – dimension to be added.

apply_background_subtraction(background=None, method='median', indices=None, chunk_shape=[100, 100])[source]

Applies background subtraction to the data and saves the new data into disk.

Parameters
  • background (Union[None, array_like, Dataset]) – Data to be used as background. If None, the data at the given indices is used. If Dataset, the data of that dataset is used. If array, the data with indices in the array is used.

  • method (Method) – Method to use to compute the background.

  • indices (Union[None, array_like]) – Indices of the images to apply background subtraction. If None, the background subtraction is applied to all the data.

  • chunk_shape (array_like) – Parameter used only when flag in_memory is False and method is Method.median. It is the shape of the chunk image to use per iteration.

Returns

dataset with data of same size as self.data but with the modified images. The urls of the modified images are replaced with the new urls.

Return type

Dataset
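
A hedged usage sketch; `dark_dataset` stands for a separately loaded Dataset of dark frames and is an assumption, not part of this API.

>>> # median background computed from the dark frames, then subtracted
>>> clean = dataset.apply_background_subtraction(background=dark_dataset,
...                                              method='median')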

apply_hot_pixel_removal(kernel=3, indices=None)[source]

Applies hot pixel removal to the data and saves the new data to disk.

Parameters
  • kernel (int) – size of the kernel used to find the hot pixels.

  • indices (Union[None, array_like]) – Indices of the images to apply hot pixel removal to. If None, the hot pixel removal is applied to all the data.

Returns

dataset with data of same size as self.data but with the modified images. The urls of the modified images are replaced with the new urls.

Return type

Dataset

apply_roi(origin=None, size=None, center=None, indices=None)[source]

Applies a region of interest to the data.

Parameters
  • origin (Union[2d vector, None]) – Origin of the roi

  • size (Union[2d vector, None]) – [Height, Width] of the roi.

  • center (Union[2d vector, None]) – Center of the roi

  • indices (Union[None, array_like]) – Indices of the images to apply the roi to. If None, the roi is applied to all the data.

Returns

dataset with the roi applied to its data. Note: to preserve consistency of shape between images, if indices is not None, only the modified data is returned.

Return type

Dataset
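
A short sketch with made-up coordinates, keeping a 200x200 region whose origin is at (100, 100).

>>> roi_dataset = dataset.apply_roi(origin=[100, 100], size=[200, 200])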

apply_shift(shift, dimension=None, shift_approach='fft', indices=None, callback=None)[source]

Apply a shift to the data, or part of it, and save the new data to disk.

Parameters
  • shift (array_like) – Shift per frame.

  • dimension (Union[None, tuple, array_like]) – Parameters with the position of the data in the reshaped array.

  • shift_approach (Union['fft', 'linear']) – Method to use to apply the shift.

  • indices (Union[None, array_like]) – Boolean index list with True for the images to apply the shift to. If None, the shift is applied to all the data.

  • callback (Union[function, None]) – Callback

Returns

dataset with data of same size as self.data but with the modified images. The urls of the modified images are replaced with the new urls.

compute_frames_intensity(kernel=(3, 3), sigma=20)[source]

Returns for every image a number representing its intensity. This number is obtained by first blurring the image and then computing its variance.

find_and_apply_shift(dimension=None, h_max=0.5, h_step=0.01, shift_approach='fft', indices=None, callback=None)[source]

Find the shift of the data or part of it and apply it.

Parameters
  • dimension (Union[None, tuple, array_like]) – Parameters with the position of the data in the reshaped array.

  • h_max (float) – See core.imageRegistration.shift_detection

  • h_step (float) – See core.imageRegistration.shift_detection

  • shift_approach (Union['fft', 'linear']) – Method to use to apply the shift.

  • indices (Union[None, array_like]) – Indices of the images to find and apply the shift to. If None, the shift is found and applied to all the data.

  • callback (Union[function, None]) – Callback

Returns

Dataset with the new data.

find_dimensions(kind, tolerance=1e-09)[source]

Goes over all the headers of a given kind and finds the dimensions that move (have more than one value) along the data.

Note: previously, only the dimensions that could fit were shown; now all the dimensions are shown and the user chooses the valid ones.

Parameters
  • kind (int) – Type of metadata to find the dimensions.

  • tolerance (float) – Tolerance that will be used to compute the unique values.

find_shift(dimension=None, h_max=0.5, h_step=0.01, indices=None)[source]

Find shift of the data or part of it.

Parameters
  • dimension (Union[None, tuple, array_like]) – Parameters with the position of the data in the reshaped array.

  • h_max (float) – See core.imageRegistration.shift_detection

  • h_step (float) – See core.imageRegistration.shift_detection

  • indices (Union[None, array_like]) – Boolean index list with True for the images to find the shift for. If None, the shift is found for all the data.

Returns

Array with shift per frame.
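
find_shift and apply_shift are typically chained; a sketch with the default parameters.

>>> shift = dataset.find_shift(h_max=0.5, h_step=0.01)
>>> shifted = dataset.apply_shift(shift, shift_approach='fft')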

get_data(indices=None, dimension=None)[source]

Returns the data corresponding to certain indices and/or a given dimension. The data is always flattened to be a stack of images.

Parameters
  • indices (array_like) – If not None, data is filtered using this array.

  • dimension (array_like) – If not None, return only the data corresponding to the given dimension. Dimension is a 2d vector, where the first component is the axis and the second is the indices of the values to extract.

Returns

Array with the new data.

get_dimensions_values(indices=None)[source]

Returns all the metadata values of the dimensions. The values are assumed to be numbers.

Returns

array_like

property nframes

Returns the number of frames.

nica(num_components, chunksize=None, num_iter=500, error_step=None, indices=None)[source]

Compute Non-negative Independent Component Analysis on the data. The method first converts the data, if not already done, into an HDF5 file object with the images flattened in the rows.

Parameters
  • num_components (Union[None, int]) – Number of components to find

  • chunksize (Union[None, int], optional) – Number of chunks for which the whitening must be computed, incrementally, defaults to None

  • num_iter (int, optional) – Number of iterations, defaults to 500

  • error_step (Union[None, int], optional) – If not None, computes the error every error_step iterations and compares it to check for convergence. TODO: not feasible for huge datasets.

  • indices (Union[None, array_like], optional) – If not None, apply method only to indices of data, defaults to None

Returns

(H, W): The components matrix and the mixing matrix.
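
A hedged sketch extracting three components; the number of components is arbitrary.

>>> # H: components matrix, W: mixing matrix
>>> H, W = dataset.nica(num_components=3)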

nica_nmf(num_components, chunksize=None, num_iter=500, cascade=None, error_step=None, vstep=100, hstep=1000, indices=None)[source]

Applies both NICA and NMF to the data. The initial H and W for NMF are the result of NICA.

nmf(num_components, num_iter=100, error_step=None, cascade=None, H=None, W=None, vstep=100, hstep=1000, indices=None)[source]

Compute Non-negative Matrix Factorization on the data. The method first converts the data, if not already done, into an HDF5 file object with the images flattened in the rows.

Parameters
  • num_components (Union[None, int]) – Number of components to find

  • num_iter (int, optional) – Number of iterations, defaults to 100

  • error_step (Union[None, int], optional) – If not None, computes the error every error_step iterations and compares it to check for convergence, defaults to None. TODO: not feasible for huge datasets.

  • cascade (Union[None, array_like], optional) – If not None, NMF is computed using the cascade method. The parameter should be an array with the number of iterations per sub-computation, defaults to None

  • H (Union[None, array_like], optional) – Init matrix for H, defaults to None

  • W (Union[None, array_like], optional) – Init matrix for W, defaults to None

  • indices (Union[None, array_like], optional) – If not None, apply method only to indices of data, defaults to None

Returns

(H, W): The components matrix and the mixing matrix.

partition_by_intensity(bins=None, num_bins=1)[source]

Function that computes the data from the set of urls. If the filter_data flag is activated, it filters the data as follows:
  • First, it computes the intensity of each frame by calculating the variance after passing a Gaussian filter.

  • Second, it computes the histogram of the intensities.

  • Finally, it saves the data of the frames with an intensity bigger than a threshold. The threshold is set to the second bin of the histogram.

Parameters

num_bins (int) – Number of bins to use as threshold.

pca(num_components=None, chunk_size=None, indices=None, return_vals=False)[source]

Compute Principal Component Analysis on the data. The method first converts the data, if not already done, into an HDF5 file object with the images flattened in the rows.

Parameters
  • num_components (Union[None, int]) – Number of components to find. If None, it uses the minimum between the number of images and the number of pixels.

  • chunk_size – Number of chunks for which the whitening must be computed, incrementally, defaults to None

  • indices (Union[None, array_like], optional) – If not None, apply method only to indices of data, defaults to None

  • return_vals (bool, optional) – If True, returns only the singular values of PCA, else returns the components and the mixing matrix, defaults to False

Returns

(H, W): The components matrix and the mixing matrix.
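
A sketch retrieving only the singular values, e.g. to choose the number of components for a later decomposition.

>>> vals = dataset.pca(return_vals=True)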

remove_dim(axis)[source]

Removes a dimension from the dimension’s dictionary.

Parameters

axis (int) – axis of the dimension.

reshape_data()[source]

Function that reshapes the data to fit the dimensions.

stop_operation(operation)[source]

Method used when threads are created to apply functions to the dataset. If this method is called, the stop flag of the given operation is set to 0 so that, if that operation is running in another thread, it knows to stop.

Parameters

operation (int) – operation to stop

to_memory(indices)[source]

Method to load only part of the data into memory. Returns a new dataset with the data corresponding to the given indices loaded into memory. The indices array has to be given; if all the data has to be loaded into memory, set in_memory to True instead, so that no new dataset is created.

Parameters

indices (array_like) – Indices of the new dataset.

class darfix.core.dataset.Dimension(kind, name, size=None, tolerance=1e-09)[source]

Define a dimension used by the dataset

Parameters
  • kind (Union[int,str]) – metadata type in fabioh5 mapping

  • name (str) – name of the dimension (should fit the fabioh5 mapping for now)

  • size (Union[int,None]) – length of the dimension.

static from_dict(_dict)[source]
Parameters

_dict (dict) – dict defining the dimension. Should contain the following keys: name, kind, size. Unique values are not stored in it because they depend on the metadata and should be obtained from a fit / set_dims

Returns

Dimension corresponding to the dict given

Return type

Dimension

set_unique_values(values)[source]

Sets the unique values of the dimension. If the size of the dimension is fixed, it automatically takes the first size values; otherwise it finds the unique values.

Parameters

values (array_like) – list of values.

to_dict()[source]

Translate the current Dimension to a dictionary
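
A hedged round-trip sketch; the kind and name values below are invented and must match the fabioh5 mapping in practice.

>>> from darfix.core.dataset import Dimension
>>> dim = Dimension(kind=0, name='obpitch', size=11)  # hypothetical values
>>> dim2 = Dimension.from_dict(dim.to_dict())  # dict keys: name, kind, size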

class darfix.core.dataset.Operation[source]

Flags for different operations in Dataset

Image Operations

class darfix.core.imageOperations.Method[source]

Methods available to compute the background.

darfix.core.imageOperations.background_subtraction(data, bg_frames, method='median')[source]

Function that computes the median of a series of dark images from a dataset and subtracts it from each frame of the raw data to remove the noise.

Parameters
  • data (ndarray) – The raw data

  • bg_frames (array_like) – List of dark frames

  • method (Union['mean', 'median', None]) – Method used to determine the background image.

Returns

ndarray

Raises

ValueError
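
A self-contained sketch on random arrays, which stand in for real frames.

>>> import numpy as np
>>> from darfix.core.imageOperations import background_subtraction
>>> data = np.random.rand(10, 64, 64).astype(np.float32)   # raw stack
>>> bg = np.random.rand(5, 64, 64).astype(np.float32) * 0.1  # dark frames
>>> clean = background_subtraction(data, bg, method='median')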

darfix.core.imageOperations.background_subtraction_2D(img, bg)[source]

Compute background subtraction.

Parameters
  • img (array_like) – Raw image

  • bg (array_like) – Background image

Returns

Image with subtracted background

darfix.core.imageOperations.chunk_image(img, start, chunk_shape)[source]

Return a chunk of an image.

Parameters
  • img (array_like) – Raw image

  • start (tuple) – Start of the chunk in the image

  • chunk_shape (tuple) – Shape of the chunk

darfix.core.imageOperations.hot_pixel_removal_2D(image, ksize=3)[source]

Function to remove hot pixels from an image using a median filter.

Parameters
  • image (array_like) – Input image.

  • ksize (int) – Size of the mask to apply.

Returns

ndarray

darfix.core.imageOperations.hot_pixel_removal_3D(data, ksize=3)[source]

Function to remove hot pixels from the data using a median filter.

Parameters
  • data (array_like) – Input data.

  • ksize (int) – Size of the mask to apply.

Returns

ndarray

darfix.core.imageOperations.img2img_mean(img, mean=None, n=0)[source]

Update mean from stack of images, given a new image and its index in the stack.

Parameters
  • img (array_like) – img to add to the mean

  • mean (array_like) – mean img

  • n (int) – index of the last image in the stack

Returns

Image with new mean
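
A sketch of incrementally updating a mean over a stack; that mean=None is accepted for the first frame is an assumption based on the default value.

>>> import numpy as np
>>> from darfix.core.imageOperations import img2img_mean
>>> stack = np.random.rand(10, 64, 64)  # placeholder stack
>>> mean = None  # assumed valid starting value (matches the default)
>>> for n, img in enumerate(stack):
...     mean = img2img_mean(img, mean=mean, n=n)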

darfix.core.imageOperations.threshold_removal(data, bottom=None, top=None)[source]

Apply bottom and top thresholds to the images in the data.

Parameters
  • data (ndarray) – Input data

  • bottom (int) – Bottom threshold

  • top (int) – Top threshold

Returns

ndarray

Image Registration

class darfix.core.imageRegistration.ShiftApproach[source]

Different shift approaches that can be used for the shift detection and correction.

darfix.core.imageRegistration.apply_shift(img, shift, shift_approach='fft')[source]

Function to apply the shift to an image.

Parameters
  • img (2-dimensional array_like) – Input array, can be complex.

  • shift (2-dimensional array_like) – The shift to be applied to the image. shift[0] refers to the y-axis, shift[1] refers to the x-axis.

  • shift_approach (Union['linear', 'fft']) – The shift method to be used to apply the shift.

Returns

real ndarray
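
A sketch shifting a single bright pixel; the values are arbitrary.

>>> import numpy as np
>>> from darfix.core.imageRegistration import apply_shift
>>> img = np.zeros((64, 64))
>>> img[32, 32] = 1.0
>>> # shift 2.0 along y and -1.5 along x
>>> shifted = apply_shift(img, np.array([2.0, -1.5]), shift_approach='fft')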

darfix.core.imageRegistration.compute_com(data)[source]

Compute the center of mass of a stack of images. First it computes the intensity of every image (by summing its values), and then it uses scipy's center_of_mass function to find the centroid.

Parameters

data (numpy.ndarray) – stack of images.

Returns

the vector of intensities and the com.

darfix.core.imageRegistration.diff_com(img1, img2)[source]

Finds the difference between the centers of mass of two images. This can be used to find the shift between two images with distinct centers of mass.

darfix.core.imageRegistration.find_shift(img1, img2, upsampling_factor=1000)[source]

Uses the function register_translation from skimage to find the shift between two images.

Parameters
  • img1 (array_like) – first image.

  • img2 (array_like) – second image, must have the same dimensionality as img1.

  • upsampling_factor (int) – optional.

darfix.core.imageRegistration.improve_linear_shift(data, v, h_max, h_step, nimages=None, shift_approach='linear')[source]

Function to find the best shift between the images. It tries h_max / h_step values of h, applying a different shift each time, and keeps the one that gives the best result.

Parameters
  • data (array_like) – The stack of images.

  • v (2-dimensional array_like) – The vector with the direction of the shift.

  • h_max (number) – The maximum value that h can achieve, h being the shift between images divided by the vector v (i.e. the coordinates of the shift in base v).

  • h_step (number) – Spacing between the values of h tried. For any shift = h * v * idx, where idx is the index of the image to apply the shift to, this is the distance between two adjacent values of h.

  • nimages (int) – The number of images to be used to find the best shift. It has to be smaller than or equal to the length of the data. If it is smaller, the images used are chosen using numpy.random.choice, without replacement.

  • shift_approach (Union['linear', 'fft']) – The shift method to be used to apply the shift.

Returns

ndarray

darfix.core.imageRegistration.normalize(x)[source]

Normalizes a vector or matrix.

darfix.core.imageRegistration.random_search(data, optimal_shift, iterations, sigma=None, shift_approach='linear')[source]

Function that performs a random search over a set of images to find an improved vector of shifts (one for each image). For this, it adds to the optimal shift a series of random samples drawn from a multivariate normal distribution, and selects the one with the best score.

Parameters
  • data (array_like) – Set of images.

  • optimal_shift (ndarray) – Array with two vectors with the linear shift found for axis y and x.

  • iterations (int) – Number of times the random search should be done.

  • sigma (number) – Standard deviation of the distribution.

  • shift_approach (Union['linear', 'fft']) – Name of the shift approach to be used. Default: linear.

Returns

A vector of length the number of images with the best shift found for every image.

Return type

ndarray

darfix.core.imageRegistration.shift_correction(data, n_shift, shift_approach='fft', callback=None)[source]

Function to apply shift correction technique to stack of images.

Parameters
  • data (array_like) – The stack of images to apply the shift to.

  • n_shift (array_like) – Array with the shift to be applied to every image. The first row has the shifts in the y-axis and the second row the shifts in the x-axis. For image i the applied shift will be shift_y = n_shift[0][i], shift_x = n_shift[1][i].

  • shift_approach (Union['linear', 'fft']) – Name of the shift approach to be used. Default: fft.

  • callback (Union[None,Function]) – Callback function to update the progress.

Returns

The shifted images.

Return type

ndarray

darfix.core.imageRegistration.shift_detection(data, h_max=0.5, h_step=0.01)[source]

Finds the linear shift from a set of images.

Parameters

data (ndarray) – Array with the images.

Returns

A vector of length the number of images with the linear shift to apply to every image.

Return type

ndarray
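
Detection and correction are typically chained; a sketch on placeholder data, assuming the detected shift array is already in the two-row layout that shift_correction expects.

>>> import numpy as np
>>> from darfix.core.imageRegistration import shift_detection, shift_correction
>>> data = np.random.rand(10, 64, 64)  # stand-in for a real stack
>>> n_shift = shift_detection(data, h_max=0.5, h_step=0.01)
>>> corrected = shift_correction(data, n_shift, shift_approach='fft')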

Genetic Shift Detection

class darfix.core.geneticShiftDetection.GeneticShiftDetection(data, optimal_shift)[source]

Class performing feature selection with a genetic algorithm. Selects the best shift to apply to each image from a set of images. Given a linear (increasing through the stack) shift that produces optimal results, it tries to find the best 2d normal distribution that, added to the optimal shift, produces the best result.

Parameters
  • data (array_like) – Stack of images.

  • optimal_shift (array_like) – Array with 2 rows (y and x) and len(data) columns with an optimal linear shift.

crossover(parents)[source]

Given a set of parent individuals, it randomly mixes pairs of them to create a new generation of children. A fixed number of parents, elite_size, automatically become children. It assumes that a first portion of the parents, bigger than elite_size, is from the elite chosen in the select() method.

Parameters

parents (array_like) – Individuals previously chosen to become parents.

fit(mean, sigma, n_gens, size)[source]

Computes the genetic algorithm.

Parameters
  • mean (Union[list, tuple]) – 2d vector to be the mean of the starting population of normals.

  • sigma (number) – Standard deviation used to create the covariance matrix (set in the diagonal of 2x2 matrix).

  • n_gens (int) – number of generations to compute.

  • size (int) – number of individuals of the population.

Returns

The genetic algorithm

Return type

GA

fitness(population, shift_approach='linear')[source]

Finds the score of each of the individuals of a population, by means of the fitness function.

Parameters
  • population (array_like) – List of individuals to score.

  • shift_approach (str) – Name of the shift approach to be used.

Returns

ndarray, ndarray

generate(population)[source]

Creates a new generation of individuals.

Parameters

population (array_like) – Current population of individuals.

initialize(size)[source]

Initializes size normal distributions to be used as the initial population.

Parameters

size (int) – Size of the initial population.

Returns

ndarray

mutate(children)[source]

Given a set of children individuals, it randomly mutates some of their genes.

Parameters

children (array_like) – List of individuals.

select(population, scores)[source]

Selects the parents to breed the new generation. A fixed number, self.elite_size, of the best-scoring individuals automatically become parents; the others are added with a certain probability that increases with their fitness score.

Parameters
  • population (array_like) – Population ordered by score, from highest to lowest.

  • scores (array_like) – Score of each individual, ordered from highest to lowest.

property support_

Returns the best chromosome from the last iteration.

Region of Interest

darfix.core.roi.apply_2D_ROI(img, origin=None, size=None, center=None)[source]

Function that applies a ROI to an image.

Parameters
  • img (array_like) – Image

  • origin (Union[2d vector, None]) – Origin of the roi

  • size (2d-vector) – [Height, Width] of the roi.

  • center (Union[2d vector, None]) – Center of the roi

Returns

ndarray

Raises

AssertionError, ValueError

darfix.core.roi.apply_3D_ROI(data, origin=None, size=None, center=None)[source]

Function that applies the ROI to each image in a stack of images.

Parameters
  • data (array_like) – The stack of images

  • origin (Union[2d vector, None]) – Origin of the roi

  • size (2d-vector) – [Height, Width] of the roi.

  • center (Union[2d vector, None]) – Center of the roi

Returns

ndarray

Raises

AssertionError, ValueError
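
A sketch of both helpers on random arrays; the coordinates are arbitrary.

>>> import numpy as np
>>> from darfix.core.roi import apply_2D_ROI, apply_3D_ROI
>>> img = np.random.rand(256, 256)
>>> crop = apply_2D_ROI(img, origin=[50, 50], size=[100, 100])
>>> stack = np.random.rand(10, 256, 256)
>>> crops = apply_3D_ROI(stack, center=[128, 128], size=[64, 64])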

Components Matching

class darfix.core.componentsMatching.Component(image, kp, des)[source]

Class Component. Describes a component of a dataset (image) with its keypoints and descriptors.

class darfix.core.componentsMatching.ComponentsMatching(components)[source]

Class to compute component matching.

Parameters

components (array_like) – List of stack of images. Every element of the list contains a stack of components from a certain dataset.

draw_matches(final_matches, matches, id1=None, id2=None, displayMatches=False)[source]

Create stack of images with each pair of matches.

Parameters
  • final_matches (dict) – Dictionary with the best pairs of matches per items.

  • matches (dict) – Dictionary with keys the pairs of matches and with values the information of every pair of components.

  • id1 (Union[int,None]) – Id of the first dataset to compare.

  • id2 (Union[int,None]) – Id of the second dataset to compare.

  • displayMatches (bool) – If True, dictionary matches has to contain values of type cv2.DMatch.

Returns

stack with the pairs of images and, if displayMatches is True, info about the matching.

Return type

array_like

euclidean_distance(X, Y)[source]

Compute the Euclidean distance between two images.

match_components(id1=None, id2=None, method=<Method.orb_feature_matching: 'orb feature matching'>, tol=8)[source]

Match components. Given the components x1,…,xn of dataset 1 and the components y1,…,ym of dataset 2, this function computes the pairs (xi,yi) that match best, assuming that each component of dataset 1 corresponds to one and only one component of dataset 2.

Parameters
  • id1 (Union[int,None]) – Id of the first dataset to compare.

  • id2 (Union[int,None]) – Id of the second dataset to compare.

  • method (Method) – Method to use for the matching.

Returns

Dictionary with components ids of id1 per keys and their corresponding id component of id2 match per values, and dictionary with the matching info per pair of components.

Return type

(dict, dict)

class darfix.core.componentsMatching.Method[source]

Methods available to compute the matching.

Decomposition

class darfix.decomposition.base.Base(data, num_components=None, indices=None, epsilon=1e-07)[source]

Base class for decomposition package.

Parameters
  • data – Numpy array or HDF5 dataset with images in the rows and pixels in the columns.

  • num_components (Union[None,int], optional) – Number of components to keep, defaults to None

  • indices (Union[None,array_like], optional) – The indices of the values to use, defaults to None

  • epsilon (float) – Convergence tolerance, defaults to 1e-07

fit_transform(max_iter=100, error_step=None, compute_w=True, compute_h=True)[source]

Fit to data, then transform it

Parameters
  • max_iter (int, optional) – Maximum number of iterations, defaults to 100

  • error_step (Union[None,int], optional) – If None, the error is not computed; else, the error is computed every error_step images. Defaults to None.

  • compute_w (bool, optional) – When False, W is not computed, defaults to True

  • compute_h (bool, optional) – When False, H is not computed, defaults to True

frobenius_norm()[source]

Frobenius norm (||data - WH||) of a data matrix and a low rank approximation given by WH. Minimizing the Frobenius norm is the most common optimization criterion for matrix factorization methods.

Returns

frobenius norm: F = ||data - WH||

class darfix.decomposition.ipca.IPCA(data, chunksize, num_components=None, whiten=False, indices=None, rowvar=True)[source]

Compute PCA in chunks, using the Incremental principal component analysis implementation in scikit-learn. To compute W, partially fits the rows in chunks (reduced number of images). Then, to compute H, applies dimensionality reduction for every chunk, and horizontally stacks the projection into H.

Parameters
  • data (array_like) – array of shape (n_samples, n_features). See rowvar.

  • chunksize (int) – Size of every group of samples to apply PCA to. PCA will be fit with arrays of shape (chunksize, n_features), where nfeatures is the number of features per sample. Depending on rowvar, the chunks will be from the rows or from the columns.

  • num_components (Union[None,int], optional) – Number of components to keep, defaults to None.

  • whiten (bool, optional) – If True, whitening is applied to the components.

  • indices (Union[None,array_like], optional) – The indices of the samples to use, defaults to None. If rowvar is False, corresponds to the indices of the features to use.

  • rowvar (bool, optional) – If rowvar is True (default), then each row represents a sample, with features in the columns. Otherwise, the relationship is transposed: each column represents a sample, while the rows contain features.

fit_transform(max_iter=1, error_step=None, W=None, H=None)[source]

Fit to data, then transform it

Parameters
  • max_iter (int, optional) – Maximum number of iterations, defaults to 1

  • error_step (Union[None,int], optional) – If None, the error is not computed; else, the error is computed every error_step images. Defaults to None.

  • W (Union[None, array_like], optional) – If not None, used as initial guess for W, defaults to None

  • H (Union[None, array_like], optional) – If not None, used as initial guess for H, defaults to None

property singular_values

The singular values corresponding to each of the selected components.

Returns

array, shape (n_components,)
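
A sketch on a random matrix standing in for flattened images.

>>> import numpy as np
>>> from darfix.decomposition.ipca import IPCA
>>> data = np.random.rand(1000, 4096)  # 1000 images of 64x64, flattened
>>> ipca = IPCA(data, chunksize=200, num_components=5)
>>> ipca.fit_transform()
>>> ipca.singular_values  # shape (5,)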

class darfix.decomposition.nica.NICA(data, num_components, chunksize=None, lr=0.03, indices=None)[source]

Compute the non-negative independent components of the linear generative model x = A * s.

Here, x is a p-dimensional observable random vector and s is the latent random vector of length num_components, whose components are statistically independent and non-negative. The matrix X is assumed to hold n samples of x, stacked in rows (shape(X) = (n, p)) or columns (shape(X) = (p, n)), which can be specified by the rowvar parameter. In practice, if shape(X) = (p, n) (resp. shape(X) = (n, p)) this function solves X = A * S (resp. X = S.T * A.T) both for S and A, where A is the so-called mixing matrix, with shape (p, num_components), and S is a (num_components, n) matrix which contains n samples of the latent source vector, stacked in columns.

This function implements the method presented in: Blind Separation of Positive Sources by Globally Convergent Gradient Search (https://core.ac.uk/download/pdf/76988305.pdf)

Parameters
  • data – array of shape (nsamples, nfeatures).

  • num_components (Union[uint, None]) – Dimension of s. Number of latent random variables.

  • chunksize (int) – Size of every group of samples to apply PCA to. PCA will be fit with arrays of shape (chunksize, nfeatures), where nfeatures is the number of features per sample.

  • lr (float) – Learning rate of gradient descent.

  • max_iter (int) – Maximum number of iterations of gradient descent.

  • tol (float) – Tolerance on update at each iteration.

Returns

(S, A) if rowvar == True else (S.T, A)

fit_transform(max_iter=100, error_step=None)[source]

Fit to data, then transform it

Parameters
  • max_iter (int, optional) – Maximum number of iterations, defaults to 100

  • error_step (Union[None,int], optional) – If None, the error is not computed; else, the error is computed every error_step images. Defaults to None.

class darfix.decomposition.nmf.NMF(data, num_components=None, indices=None, epsilon=1e-07)[source]

Non-Negative Matrix Factorization.

Find two non-negative matrices whose product approximates the non-negative matrix data.

fit_transform(H=None, W=None, max_iter=200, compute_w=True, compute_h=True, vstep=100, hstep=1000, error_step=None)[source]

Find the two non-negative matrices (H, W). The images are loaded from disk in chunks.

Parameters
  • H (array_like, shape (n_components, n_features), optional) – If not None, used as initial guess for the solution.

  • W (array_like, shape (n_samples, n_components)) – If not None, used as initial guess for the solution.

  • max_iter (int, optional) – Maximum number of iterations before timing out, defaults to 200

  • compute_w (bool, optional) – If False, W is not computed.

  • compute_h (bool, optional) – If False, H is not computed.

  • vstep (int, optional) – vertical size of the chunks to take from data. When updating W, vstep images are retrieved from disk per iteration, defaults to 100.

  • hstep (int, optional) – horizontal size of the chunks to take from data. When updating H, hstep pixels are retrieved from disk per iteration, defaults to 1000.

  • error_step (Union[None,int], optional) – If None, error is not computed, else compute error for every error_step images.
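
A sketch on non-negative random data; where the factors end up after fitting (e.g. as W and H attributes on the instance) is an assumption here, not confirmed by this reference.

>>> import numpy as np
>>> from darfix.decomposition.nmf import NMF
>>> data = np.random.rand(500, 4096)  # non-negative input, images in rows
>>> nmf = NMF(data, num_components=3)
>>> nmf.fit_transform(max_iter=200, error_step=50)
>>> # factors assumed to be stored on the instance, e.g. nmf.W and nmf.H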

GUI

class darfix.gui.datasetSelectionWidget.DatasetSelectionWidget(parent=None)[source]

Widget that creates a dataset from a list of files or from a single filename. It lets the user add the first filename of a directory of files, or manually add each of the files to be read. If both options are filled in, only the files in the list of filenames are read.

loadDataset()[source]

Loads the dataset from the filenames.

class darfix.gui.datasetSelectionWidget.FilenameSelectionWidget(parent=None)[source]

Widget used to obtain a filename (manually or from a file)

class darfix.gui.datasetSelectionWidget.FilesSelectionWidget(parent=None)[source]

Widget used to get one or more files from the computer and add them to a list.

addFile(file)[source]

Adds a file to the table.

Parameters

file (str) – filepath to add to the table.

setFiles(files)[source]

Adds a list of files to the table.

Parameters

files (array_like) – List to add

class darfix.gui.metadataWidget.MetadataWidget(parent=None)[source]
class darfix.gui.dimensionsWidget.DimensionMapping(parent)[source]

Widget used to define the number of dimensions and the values they are mapped to

addDim(axis=None, dim=None)[source]
Parameters
  • axis – which axis is defining this dimension

  • dim (Dimension) – definition of the dimension to add

fitFailed

Signal emitted when the fit fails

fitSucceed

Signal emitted when the fit succeeds

removeDim(row)[source]

Remove dimension.

Parameters

row (Union[int, _DimensionItem]) – row or item to remove

class darfix.gui.dimensionsWidget.DimensionWidget(parent=None)[source]

Widget to define dimensions and try to fit them to the dataset

addDim(axis=None, dim=None)[source]
Parameters
  • axis – which axis is defining this dimension

  • dim (Dimension) – definition of the dimension to add

fit()[source]

Fit dimensions into the data.

Returns

return status of the fit and fail reason, if any

Return type

Union[bool,str,None]

setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]
Parameters

dataset (Dataset) – the dataset for which we want to define the dimensions.

setDims(dims)[source]
Parameters

dims (dict) – axis as key and Dimension as value.

class darfix.gui.dataPartitionWidget.DataPartitionWidget(parent=None)[source]
setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]

Dataset setter.

Parameters

dataset (Dataset) – dataset

class darfix.gui.shiftCorrectionWidget.ShiftCorrectionWidget(parent=None)[source]

A widget to apply shift correction to a stack of images

correct()[source]

Function that starts the thread to compute the shift given in the input widget

getStack()[source]

Stack getter

Returns

StackViewMainWindow

setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]

Dataset setter. Saves the dataset and updates the stack with the dataset data

Parameters

dataset (Dataset) – dataset

setStack(dataset=None)[source]

Sets new data to the stack. Maintains the current frame shown in the view.

Parameters

dataset (Dataset) – if not None, data set to the stack will be from the given dataset.

class darfix.gui.roiSelectionWidget.ROISelectionWidget(parent=None)[source]

Widget that allows the user to pick a ROI in any image of the dataset.

apply()[source]

Function that replaces the dataset data with the data shown in the stack of images. If the stack has a roi applied, it applies the same roi to the dark frames of the dataset. A signal is emitted with the roi parameters.

applyRoi()[source]

Function to apply the region of interest to the data of the dataset and show the new data in the stack. The dataset data is not yet replaced. A new roi is created in the middle of the new stack.

clearStack()[source]

Clears stack.

getRoi()[source]

Returns the roi selected in the stackview.

Return type

silx.gui.plot.items.roi.RectangleROI

resetROI()[source]

Sets the region of interest in the middle of the stack, with size 1/5 of the image.

resetStack()[source]

Restores stack with the dataset data.

setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]

Dataset setter. Saves the dataset and updates the stack with the dataset data

Parameters

dataset (Dataset) – dataset

setRoi(roi=None, origin=None, size=None, center=None)[source]

Sets a region of interest of the stack of images.

Parameters
  • roi (RectangleROI) – A region of interest.

  • origin (Tuple) – If a roi is not provided, used as an origin for the roi

  • size (Tuple) – If a roi is not provided, used as a size for the roi.

  • center (Tuple) – If a roi is not provided, used as a center for the roi.

setStack(dataset=None)[source]

Sets new data to the stack. Maintains the current frame shown in the view.

Parameters

dataset (Dataset) – if not None, data set to the stack will be from the given dataset.

class darfix.gui.noiseRemovalWidget.NoiseRemovalDialog(parent=None)[source]

Dialog with NoiseRemovalWidget as main window and standard buttons.

class darfix.gui.noiseRemovalWidget.NoiseRemovalWidget(parent=None)[source]

Widget to apply noise removal to a dataset. For now it can apply both background subtraction and hot pixel removal. For background subtraction the user can choose the background to use: dark frames, low intensity data or all the data. From these background frames, an image is computed using either the mean or the median.

setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]

Dataset setter. Saves the dataset and updates the stack with the dataset data

Parameters

dataset (Dataset) – dataset

setStack(dataset=None)[source]

Sets new data to the stack. Maintains the current frame shown in the view.

Parameters

dataset (Dataset) – if not None, data set to the stack will be from the given dataset.

class darfix.gui.blindSourceSeparationWidget.BSSWidget(parent=None)[source]

Widget to apply blind source separation.

setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]

Dataset setter.

Parameters

dataset (Dataset) – dataset

class darfix.gui.blindSourceSeparationWidget.Method[source]

Methods available to compute the blind source separation.

class darfix.gui.PCAWidget.PCAWidget(parent=None)[source]

Widget to apply PCA to a set of images and plot the eigenvalues found.

setDataset(dataset, indices=None, bg_indices=None, bg_dataset=None)[source]

Dataset setter.

Parameters

dataset (Dataset) – dataset

class darfix.gui.displayComponentsWidget.DisplayComponentsWidget(parent=None)[source]

Widget to display the components obtained from blind source separation.

setComponents(components, W, dimensions)[source]

Components setter. Updates the plots with the components and their corresponding rocking curves.

Parameters
  • components (array_like) – stack of images with the components

  • W (array_like) – array with the rocking curves intensity

  • dimensions (dict) – dictionary with the values of the dimensions

class darfix.gui.displayComponentsWidget.PlotRockingCurves(parent=None)[source]

Widget to plot the rocking curves of the components. It can be filtered to show only the rocking curves of a certain moving dimension.

class darfix.gui.linkComponentsWidget.LinkComponentsWidget(parent=None)[source]

Widget to compare two stacks of images. Each of these stacks represents the components of a dataset.