pyFAI.opencl package

pyFAI.opencl.azim_csr module

class pyFAI.opencl.azim_csr.OCL_CSR_Integrator(lut, image_size, checksum=None, empty=None, unit=None, bin_centers=None, azim_centers=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False, extra_buffers=None)

Bases: OpenclProcessing

Class in charge of doing a sparse-matrix multiplication in OpenCL using the CSR representation of the matrix.

It also performs the preprocessing using the preproc kernel

BLOCK_SIZE = 32
__init__(lut, image_size, checksum=None, empty=None, unit=None, bin_centers=None, azim_centers=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False, extra_buffers=None)
Parameters:
  • lut – 3-tuple of arrays (data, indices, indptr). data: coefficients of the matrix in a 1D vector of float32, of size nnz; indices: column index position for the data (same size as data); indptr: row pointer indicating the start of a given row, of length nbin+1

  • image_size – Expected image size: image.size

  • checksum – pre-calculated checksum of the LUT to prevent re-calculating it :)

  • empty – value to be assigned to bins without contribution from any pixel

  • unit – Storage for the unit related to the LUT

  • bin_centers – the radial position of the bin_center, place_holder

  • azim_centers – the azimuthal position of the bin_center, place_holder

  • mask_checksum – placeholder for the checksum of the mask

  • ctx – actual working context, left to None for automatic initialization from device type or platformid/deviceid

  • devicetype – type of device, can be “CPU”, “GPU”, “ACC” or “ALL”

  • platformid – integer with the platform_identifier, as given by clinfo

  • deviceid – Integer with the device identifier, as given by clinfo

  • block_size – preferred workgroup size, may vary depending on the outcome of the compilation

  • profile – switch on profiling to be able to profile at the kernel level, store profiling elements (makes code slightly slower)

  • extra_buffers – List of additional buffer description needed by derived classes
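To make the CSR 3-tuple described for lut concrete, here is a minimal NumPy sketch of the sparse-matrix multiplication it encodes. It is only an illustration running on the host, not the OpenCL kernel; the helper name csr_matvec and the tiny 2-bin LUT are hypothetical.

```python
import numpy as np


def csr_matvec(data, indices, indptr, image):
    """Multiply a CSR matrix (data, indices, indptr) by a flattened image."""
    nbin = len(indptr) - 1  # indptr has nbin + 1 entries
    out = np.zeros(nbin, dtype=np.float64)
    for i in range(nbin):
        # Coefficients of row i live in data[indptr[i]:indptr[i + 1]]
        start, stop = indptr[i], indptr[i + 1]
        out[i] = np.dot(data[start:stop], image[indices[start:stop]])
    return out


# Hypothetical LUT: 2 bins over a 4-pixel image, nnz = 4
data = np.array([1.0, 0.5, 0.5, 1.0], dtype=np.float32)   # coefficients
indices = np.array([0, 1, 1, 2], dtype=np.int32)          # pixel index per coefficient
indptr = np.array([0, 2, 4], dtype=np.int32)              # row starts, length nbin + 1
image = np.array([10.0, 20.0, 30.0, 40.0], dtype=np.float32)

sums = csr_matvec(data, indices, indptr, image)  # -> [20. 40.]
```

Pixel 3 does not appear in indices, so it contributes to no bin, which is how a masked pixel would look in such a table.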

buffers = [BufferDescription(name='output', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='output4', size=4, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='tmp', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='image_raw', size=1, dtype=<class 'numpy.int64'>, flags=1), BufferDescription(name='image', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='variance', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark_variance', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='flat', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='polarization', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='solidangle', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='absorption', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='mask', size=1, dtype=<class 'numpy.int8'>, flags=4)]
property check_mask
property checksum
compile_kernels(kernel_file=None)

Call the OpenCL compiler

Parameters:

kernel_file – path to the kernel (by default use the one in the resources directory)

guess_workgroup_size(block_size=None)

Determines the optimal workgroup size.

For azimuthal integration, especially the 2D variant, the smallest possible is the size of a warp/wavefront.

The method can be overridden by derived classes to select a larger workgroup

Parameters:

block_size – Input workgroup size (block is the CUDA name)

Returns:

the optimal workgroup size as integer

integrate(data, dummy=None, delta_dummy=None, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, preprocess_only=False, safe=True, normalization_factor=1.0, coef_power=1, out_merged=None, out_sum_data=None, out_sum_count=None)

Before performing azimuthal integration, the preprocessing is:

\[data = (data - dark) / (flat * solidangle * polarization)\]

Integration is performed using the CSR representation of the look-up table

Parameters:
  • dark – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • normalization_factor – divide raw signal by this value

  • coef_power – set to 2 for variance propagation, leave to 1 for mean calculation

  • out_merged – destination array or pyopencl array for averaged data

  • out_sum_data – destination array or pyopencl array for sum of all data

  • out_sum_count – destination array or pyopencl array for sum of the number of pixels

Returns:

averaged data, weighted histogram, unweighted histogram
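As a rough illustration of the preprocessing formula and the three returned arrays, the following NumPy sketch runs the same arithmetic on the host. The function name integrate_legacy_sketch is hypothetical, coef_power is ignored, and dividing by normalization_factor together with the other corrections is an assumption based on the parameter description.

```python
import numpy as np


def integrate_legacy_sketch(image, lut, dark=None, flat=None, solidangle=None,
                            polarization=None, normalization_factor=1.0,
                            empty=0.0):
    """Host-side stand-in for the documented preprocessing + CSR integration."""
    data, indices, indptr = lut
    nbin = len(indptr) - 1

    # Preprocessing: data = (data - dark) / (flat * solidangle * polarization)
    img = np.asarray(image, dtype=np.float64).ravel()
    if dark is not None:
        img = img - np.ravel(dark)
    denom = np.full(img.shape, float(normalization_factor))
    for corr in (flat, solidangle, polarization):
        if corr is not None:
            denom = denom * np.ravel(corr)
    img = img / denom

    # Row (bin) index of every stored coefficient, rebuilt from indptr
    rows = np.repeat(np.arange(nbin), np.diff(indptr))
    sum_data = np.bincount(rows, weights=data * img[indices], minlength=nbin)
    sum_count = np.bincount(rows, weights=data, minlength=nbin)

    # Bins without any contribution keep the "empty" value
    merged = np.full(nbin, empty, dtype=np.float64)
    np.divide(sum_data, sum_count, out=merged, where=sum_count > 0)
    return merged, sum_data, sum_count


# Hypothetical 3-bin LUT (the last bin is empty) over a 4-pixel image
lut = (np.array([1.0, 0.5, 0.5, 1.0], dtype=np.float32),
       np.array([0, 1, 1, 2], dtype=np.int32),
       np.array([0, 2, 4, 4], dtype=np.int32))
image = np.array([10, 20, 30, 40], dtype=np.uint16)
merged, sum_data, sum_count = integrate_legacy_sketch(
    image, lut, dark=np.ones(4), empty=-1.0)
```

Here sum_data plays the role of the weighted histogram and sum_count that of the unweighted one; merged is their ratio wherever sum_count is non-zero.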

integrate_legacy(data, dummy=None, delta_dummy=None, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, preprocess_only=False, safe=True, normalization_factor=1.0, coef_power=1, out_merged=None, out_sum_data=None, out_sum_count=None)

Before performing azimuthal integration, the preprocessing is:

\[data = (data - dark) / (flat * solidangle * polarization)\]

Integration is performed using the CSR representation of the look-up table

Parameters:
  • dark – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • normalization_factor – divide raw signal by this value

  • coef_power – set to 2 for variance propagation, leave to 1 for mean calculation

  • out_merged – destination array or pyopencl array for averaged data

  • out_sum_data – destination array or pyopencl array for sum of all data

  • out_sum_count – destination array or pyopencl array for sum of the number of pixels

Returns:

averaged data, weighted histogram, unweighted histogram

integrate_ng(data, dark=None, dummy=None, delta_dummy=None, error_model=ErrorModel.NO, variance=None, dark_variance=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, dark_variance_checksum=None, safe=True, workgroup_size=None, normalization_factor=1.0, out_avgint=None, out_sem=None, out_std=None, out_merged=None)

Before performing azimuthal integration with proper variance propagation, the preprocessing is:

\[\begin{aligned}
signal &= raw - dark \\
variance &= variance + dark\_variance \\
normalization &= normalization\_factor * (flat * solidangle * polarization * absorption) \\
count &= \text{number of pixels contributing}
\end{aligned}\]

Integration is performed using the CSR representation of the look-up table on all arrays: signal, variance, normalization and count

Parameters:
  • dark – array of same shape as data for pre-processing

  • dummy – value for invalid data

  • delta_dummy – precision for dummy assessment

  • error_model – enum ErrorModel

  • variance – array of same shape as data for pre-processing

  • dark_variance – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • workgroup_size – enforce this workgroup size

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • normalization_factor – divide raw signal by this value

  • out_avgint – destination array or pyopencl array for average intensity

  • out_sem – destination array or pyopencl array for standard deviation (of mean)

  • out_std – destination array or pyopencl array for standard deviation (of pixels)

  • out_merged – destination array or pyopencl array for averaged data (float8!)

Returns:

named-tuple
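The four accumulated quantities can be sketched in NumPy as below. This is a host-side illustration under stated assumptions, not the library's kernel: the names integrate_ng_sketch and NgResult are hypothetical, the argument normalization stands for the product flat * solidangle * polarization * absorption, and weighting the variance by the squared coefficient is an assumption about how uncertainties are propagated.

```python
from collections import namedtuple

import numpy as np

NgResult = namedtuple("NgResult",
                      "intensity sem signal variance normalization count")


def integrate_ng_sketch(raw, lut, dark=None, variance=None, dark_variance=None,
                        normalization=None, normalization_factor=1.0):
    """Accumulate signal, variance, normalization and count per bin."""
    data, indices, indptr = lut
    nbin = len(indptr) - 1
    raw = np.asarray(raw, dtype=np.float64).ravel()

    # Per-pixel preprocessing, following the documented formulas
    signal = raw if dark is None else raw - np.ravel(dark)
    var = (np.zeros_like(raw) if variance is None
           else np.asarray(variance, dtype=np.float64).ravel())
    if dark_variance is not None:
        var = var + np.ravel(dark_variance)
    norm = np.full(raw.shape, float(normalization_factor))
    if normalization is not None:
        norm = norm * np.ravel(normalization)

    # CSR accumulation of the four arrays
    rows = np.repeat(np.arange(nbin), np.diff(indptr))
    coef = np.asarray(data, dtype=np.float64)
    s = np.bincount(rows, weights=coef * signal[indices], minlength=nbin)
    v = np.bincount(rows, weights=coef ** 2 * var[indices], minlength=nbin)  # assumption
    n = np.bincount(rows, weights=coef * norm[indices], minlength=nbin)
    c = np.bincount(rows, weights=coef, minlength=nbin)

    intensity = np.zeros(nbin)
    sem = np.zeros(nbin)
    ok = n > 0
    intensity[ok] = s[ok] / n[ok]
    sem[ok] = np.sqrt(v[ok]) / n[ok]
    return NgResult(intensity, sem, s, v, n, c)


# Hypothetical example: 2 bins, unit coefficients, Poisson-like variance
lut = (np.ones(4, dtype=np.float32),
       np.array([0, 1, 2, 3], dtype=np.int32),
       np.array([0, 2, 4], dtype=np.int32))
raw = np.array([10.0, 20.0, 30.0, 40.0])
res = integrate_ng_sketch(raw, lut, variance=raw)
```

Keeping the sums separate until the very end is what lets the uncertainty be derived from the same pass as the intensity.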

kernel_files = ['silx:opencl/doubleword.cl', 'pyfai:openCL/preprocess.cl', 'pyfai:openCL/memset.cl', 'pyfai:openCL/ocl_azim_CSR.cl']
mapping = {<class 'numpy.int8'>: 's8_to_float', <class 'numpy.uint8'>: 'u8_to_float', <class 'numpy.int16'>: 's16_to_float', <class 'numpy.uint16'>: 'u16_to_float', <class 'numpy.uint32'>: 'u32_to_float', <class 'numpy.int32'>: 's32_to_float'}
send_buffer(data, dest, checksum=None, workgroup_size=None, convert=True)

Send a numpy array to the device, including the type conversion on the device if possible

Parameters:
  • data – numpy array with data

  • dest – name of the buffer as registered in the class

  • checksum – Checksum of the data to determine if the data needs to be transferred

  • workgroup_size – enforce kernel to run with given workgroup size

  • convert – if True (default) convert dtype on GPU, if False, leave it as it is.

Returns:

the actual buffer where the data were sent
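The mapping attribute suggests how the type conversion could be dispatched. The sketch below only illustrates that idea: the function choose_conversion is hypothetical, and the fallback to a host-side cast for dtypes absent from the mapping is an assumption, not documented behaviour.

```python
import numpy as np

# Copied from the documented "mapping" class attribute
MAPPING = {np.int8: "s8_to_float", np.uint8: "u8_to_float",
           np.int16: "s16_to_float", np.uint16: "u16_to_float",
           np.uint32: "u32_to_float", np.int32: "s32_to_float"}


def choose_conversion(data, convert=True):
    """Return (kernel_name_or_None, array_to_upload) for a numpy array."""
    if data.dtype == np.float32 or not convert:
        # Nothing to convert, or conversion explicitly disabled: upload as is
        return None, data
    kernel = MAPPING.get(data.dtype.type)
    if kernel is not None:
        # Upload the raw integers; the named kernel would cast on the device
        return kernel, data
    # Assumption: no device kernel for this dtype, so cast on the host
    return None, data.astype(np.float32)


kernel_u16, arr_u16 = choose_conversion(np.zeros(3, dtype=np.uint16))
kernel_f64, arr_f64 = choose_conversion(np.zeros(3, dtype=np.float64))
```

Uploading the narrow integer type and casting on the device moves fewer bytes over the bus than casting to float32 on the host first.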

set_kernel_arguments()

Tie arguments of OpenCL kernel-functions to the actual kernels

sigma_clip(data, dark=None, dummy=None, delta_dummy=None, variance=None, dark_variance=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, dark_variance_checksum=None, safe=True, error_model=ErrorModel.NO, normalization_factor=1.0, cutoff=4.0, cycle=5, out_avgint=None, out_sem=None, out_std=None, out_merged=None)

Perform an iterative sigma-clipping filter along each row. See the documentation of scipy.stats.sigmaclip for more details.

If the error model is “azimuthal”, the variance is the variance within a bin, which is refined at each iteration; this can be costly!

Else, the error is propagated according to:

\[\begin{aligned}
signal &= raw - dark \\
variance &= variance + dark\_variance \\
normalization &= normalization\_factor * (flat * solidangle * polarization * absorption) \\
count &= \text{number of pixels contributing}
\end{aligned}\]

Integration is performed using the CSR representation of the look-up table on all arrays: signal, variance, normalization and count

Parameters:
  • dark – array of same shape as data for pre-processing

  • dummy – value for invalid data

  • delta_dummy – precision for dummy assessment

  • variance – array of same shape as data for pre-processing

  • dark_variance – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • error_model – enum ErrorModel

  • normalization_factor – divide raw signal by this value

  • cutoff – discard all points with |value - avg| > cutoff * sigma. 3-4 is quite common

  • cycle – perform at maximum this number of cycles. 5 is common.

  • out_avgint – destination array or pyopencl array for average intensity

  • out_sem – destination array or pyopencl array for uncertainty on mean value

  • out_std – destination array or pyopencl array for uncertainty on pixel value

  • out_merged – destination array or pyopencl array for averaged data (float8!)

Returns:

namedtuple with “position intensity error signal variance normalization count”
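The roles of cutoff and cycle can be illustrated on a single bin with plain NumPy. This is a sketch of the general sigma-clipping idea with the variance taken within the bin (the “azimuthal” case), not the OpenCL implementation; the function name sigma_clip_bin is hypothetical.

```python
import numpy as np


def sigma_clip_bin(values, cutoff=4.0, cycle=5):
    """Iteratively discard points with |value - avg| > cutoff * sigma."""
    values = np.asarray(values, dtype=np.float64)
    keep = np.ones(values.shape, dtype=bool)
    for _ in range(cycle):
        avg = values[keep].mean()
        sigma = values[keep].std()
        new_keep = keep & (np.abs(values - avg) <= cutoff * sigma)
        # Stop when nothing was rejected, or when everything would be
        if not new_keep.any() or new_keep.sum() == keep.sum():
            break
        keep = new_keep
    return values[keep].mean(), values[keep].std(), int(keep.sum())


# Hypothetical bin: 20 regular pixels and one bright outlier
pixels = np.array([10.0] * 20 + [1000.0])
avg, sigma, count = sigma_clip_bin(pixels, cutoff=4.0, cycle=5)
```

In this example the outlier sits about 4.5 sigma from the first mean, so it is rejected in the first cycle and the loop stops in the second one because no further point is discarded.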

pyFAI.opencl.azim_hist module

Histogram (atomic-add) based integrator

class pyFAI.opencl.azim_hist.OCL_Histogram1d(radial, bins, radial_checksum=None, empty=None, unit=None, azimuthal=None, azimuthal_checksum=None, mask=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)

Bases: OpenclProcessing

Class in charge of performing histogram calculation in OpenCL using atomic_add

It also performs the preprocessing using the preproc kernel

BLOCK_SIZE = 32
__init__(radial, bins, radial_checksum=None, empty=None, unit=None, azimuthal=None, azimuthal_checksum=None, mask=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)
Parameters:
  • radial – array with the radial position of every single pixel. Same as image size

  • bins – number of bins on which to histogram

  • radial_checksum – pre-calculated checksum of the radial array to prevent re-calculating it :)

  • empty – value to be assigned to bins without contribution from any pixel

  • unit – just a place_holder for the units of radial array.

  • azimuthal – array with the azimuthal position, same size as radial

  • azimuthal_checksum – Checksum of the azimuthal array

  • mask – Array with the mask, 0 for valid values, anything for masked pixels, same size as radial

  • ctx – actual working context, left to None for automatic initialization from device type or platformid/deviceid

  • devicetype – type of device, can be “CPU”, “GPU”, “ACC” or “ALL”

  • platformid – integer with the platform_identifier, as given by clinfo

  • deviceid – Integer with the device identifier, as given by clinfo

  • block_size – preferred workgroup size, may vary depending on the outcome of the compilation

  • profile – switch on profiling to be able to profile at the kernel level, store profiling elements (makes code slightly slower)

buffers = [BufferDescription(name='output4', size=4, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='radial', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='azimuthal', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='tmp', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='image_raw', size=1, dtype=<class 'numpy.int64'>, flags=4), BufferDescription(name='image', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='variance', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark_variance', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='flat', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='polarization', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='solidangle', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='absorption', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='mask', size=1, dtype=<class 'numpy.int8'>, flags=4)]
compile_kernels(kernel_file=None)

Call the OpenCL compiler

Parameters:

kernel_file – path to the kernel (by default use the one in the resources directory)

integrate(data, dark=None, dummy=None, delta_dummy=None, error_model=ErrorModel.NO, variance=None, dark_variance=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, preprocess_only=False, safe=True, normalization_factor=1.0, radial_range=None, azimuth_range=None, histo_signal=None, histo_variance=None, histo_normalization=None, histo_normalization_sq=None, histo_count=None, intensity=None, std=None, sem=None)

Before performing azimuthal integration, the preprocessing is:

\[\begin{aligned}
Signal &= data - dark \\
Variance &= variance + dark\_variance \\
Normalization &= normalization\_factor * flat * solidangle * polarization * absorption \\
Count &= \text{1 per valid pixel}
\end{aligned}\]

Integration is performed using histograms (based on atomic adds)

Parameters:
  • dark – array of same shape as data for pre-processing

  • dummy – value for invalid data

  • delta_dummy – precision for dummy assessment

  • error_model – set to “poisson” assume variance is data (minimum 1)

  • variance – array of same shape as data for pre-processing

  • dark_variance – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • normalization_factor – divide raw signal by this value

  • radial_range – provide lower and upper bound for radial array

  • azimuth_range – provide lower and upper bound for azimuthal array

  • histo_signal – destination array or pyopencl array for sum of signals

  • histo_normalization – destination array or pyopencl array for sum of normalization

  • histo_normalization_sq – destination array or pyopencl array for sum of normalization squared

  • histo_count – destination array or pyopencl array for counting pixels

  • intensity – destination PyOpenCL array for integrated intensity

  • std – destination PyOpenCL array for standard deviation

  • sem – destination PyOpenCL array for standard error of the mean

Returns:

bin_positions, averaged data, histogram of signal, histogram of variance, histogram of normalization, count of pixels
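The histogram-based integration can be mimicked on the host with numpy.histogram, which plays the role of the atomic adds. The sketch below is an illustration only: the function name histogram1d_sketch is hypothetical, and the Poisson variance follows the error_model description (variance is the data, with a minimum of 1).

```python
import numpy as np


def histogram1d_sketch(radial, signal, normalization, bins, radial_range,
                       variance=None, mask=None):
    """Weighted histograms of signal, variance, normalization and count."""
    radial = np.ravel(radial)
    # Mask convention from the class documentation: 0 means valid
    valid = (np.ones(radial.shape, dtype=bool) if mask is None
             else np.ravel(mask) == 0)
    r = radial[valid]
    kw = dict(bins=bins, range=radial_range)

    histo_signal, edges = np.histogram(r, weights=np.ravel(signal)[valid], **kw)
    histo_normalization, _ = np.histogram(
        r, weights=np.ravel(normalization)[valid], **kw)
    histo_count, _ = np.histogram(r, **kw)  # 1 per valid pixel
    if variance is None:
        histo_variance = np.zeros(bins)
    else:
        histo_variance, _ = np.histogram(
            r, weights=np.ravel(variance)[valid], **kw)

    bin_positions = 0.5 * (edges[:-1] + edges[1:])
    intensity = np.zeros(bins, dtype=np.float64)
    np.divide(histo_signal, histo_normalization, out=intensity,
              where=histo_normalization > 0)
    return (bin_positions, intensity, histo_signal, histo_variance,
            histo_normalization, histo_count)


# Hypothetical 4-pixel example, last pixel masked, Poisson-like variance
radial = np.array([0.1, 0.2, 0.6, 0.9])
data = np.array([1.0, 3.0, 5.0, 7.0])
mask = np.array([0, 0, 0, 1], dtype=np.int8)
result = histogram1d_sketch(radial, data, np.ones(4), bins=2,
                            radial_range=(0.0, 1.0),
                            variance=np.maximum(data, 1.0), mask=mask)
bin_positions, intensity = result[0], result[1]
```

Unlike the CSR and LUT integrators, no look-up table is needed here: each pixel is assigned to exactly one bin from its radial position.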

kernel_files = ['silx:opencl/doubleword.cl', 'pyfai:openCL/preprocess.cl', 'pyfai:openCL/ocl_histo.cl']
mapping = {<class 'numpy.int8'>: 's8_to_float', <class 'numpy.uint8'>: 'u8_to_float', <class 'numpy.int16'>: 's16_to_float', <class 'numpy.uint16'>: 'u16_to_float', <class 'numpy.uint32'>: 'u32_to_float', <class 'numpy.int32'>: 's32_to_float'}
send_buffer(data, dest, checksum=None)

Send a numpy array to the device, including the cast on the device if possible

Parameters:
  • data – numpy array with data

  • dest – name of the buffer as registered in the class

set_kernel_arguments()

Tie arguments of OpenCL kernel-functions to the actual kernels

class pyFAI.opencl.azim_hist.OCL_Histogram2d(radial, azimuthal, bins_radial, bins_azimuthal, radial_checksum=None, azimuthal_checksum=None, empty=None, unit=None, mask=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)

Bases: OCL_Histogram1d

Class in charge of performing histogram calculation in OpenCL using atomic_add

It also performs the preprocessing using the preproc kernel

BLOCK_SIZE = 32
__init__(radial, azimuthal, bins_radial, bins_azimuthal, radial_checksum=None, azimuthal_checksum=None, empty=None, unit=None, mask=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)
Parameters:
  • radial – array with the radial position of every single pixel. Same as image size

  • azimuthal – array with the azimuthal position of every single pixel. Same as image size

  • bins_radial – number of bins on which the histogram is calculated in the radial direction

  • bins_azimuthal – number of bins on which the histogram is calculated in the azimuthal direction

  • radial_checksum – pre-calculated checksum of the position array to prevent re-calculating it :)

  • azimuthal_checksum – pre-calculated checksum of the position array to prevent re-calculating it :)

  • empty – value to be assigned to bins without contribution from any pixel

  • ctx – actual working context, left to None for automatic initialization from device type or platformid/deviceid

  • devicetype – type of device, can be “CPU”, “GPU”, “ACC” or “ALL”

  • platformid – integer with the platform_identifier, as given by clinfo

  • deviceid – Integer with the device identifier, as given by clinfo

  • block_size – preferred workgroup size, may vary depending on the outcome of the compilation

  • profile – switch on profiling to be able to profile at the kernel level, store profiling elements (makes code slightly slower)

buffers = [BufferDescription(name='output4', size=4, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='radial', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='azimuthal', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='image_raw', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='image', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='variance', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark_variance', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='flat', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='polarization', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='solidangle', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='absorption', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='mask', size=1, dtype=<class 'numpy.int8'>, flags=4)]
integrate(data, dark=None, dummy=None, delta_dummy=None, error_model=ErrorModel.NO, variance=None, dark_variance=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, preprocess_only=False, safe=True, normalization_factor=1.0, radial_range=None, azimuthal_range=None, histo_signal=None, histo_variance=None, histo_normalization=None, histo_count=None, histo_normalization_sq=None, intensity=None, std=None, sem=None)

Before performing azimuthal integration, the preprocessing is:

\[\begin{aligned}
Signal &= data - dark \\
Variance &= variance + dark\_variance \\
Normalization &= normalization\_factor * flat * solidangle * polarization * absorption \\
Count &= \text{1 per valid pixel}
\end{aligned}\]

Integration is performed using histograms (based on atomic adds)

Parameters:
  • dark – array of same shape as data for pre-processing

  • dummy – value for invalid data

  • delta_dummy – precision for dummy assessment

  • variance – array of same shape as data for pre-processing

  • dark_variance – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • normalization_factor – divide raw signal by this value

  • radial_range – provide lower and upper bound for radial array

  • azimuthal_range – provide lower and upper bound for azimuthal array

  • histo_signal – destination array or pyopencl array for sum of signals

  • histo_normalization – destination array or pyopencl array for sum of normalization

  • histo_normalization_sq – destination array or pyopencl array for sum of normalization squared

  • histo_count – destination array or pyopencl array for counting pixels

  • intensity – destination PyOpenCL array for integrated intensity

  • std – destination PyOpenCL array for standard deviation

  • sem – destination PyOpenCL array for standard error of the mean

Returns:

bin_positions, averaged data, histogram of signal, histogram of variance, histogram of normalization, count of pixels
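The two-dimensional variant can be sketched the same way with numpy.histogram2d. This is an illustration under assumptions: the function name histogram2d_sketch is hypothetical, and the (radial, azimuthal) axis order of the output arrays is a choice made for this sketch, not a statement about the layout returned by the class.

```python
import numpy as np


def histogram2d_sketch(radial, azimuthal, signal, normalization,
                       bins_radial, bins_azimuthal,
                       radial_range, azimuthal_range):
    """2D weighted histograms; axis 0 is radial, axis 1 is azimuthal."""
    kw = dict(bins=(bins_radial, bins_azimuthal),
              range=(radial_range, azimuthal_range))
    r, a = np.ravel(radial), np.ravel(azimuthal)

    histo_signal, r_edges, a_edges = np.histogram2d(
        r, a, weights=np.ravel(signal), **kw)
    histo_normalization, _, _ = np.histogram2d(
        r, a, weights=np.ravel(normalization), **kw)
    histo_count, _, _ = np.histogram2d(r, a, **kw)

    intensity = np.zeros_like(histo_signal)
    np.divide(histo_signal, histo_normalization, out=intensity,
              where=histo_normalization > 0)
    r_centers = 0.5 * (r_edges[:-1] + r_edges[1:])
    a_centers = 0.5 * (a_edges[:-1] + a_edges[1:])
    return r_centers, a_centers, intensity, histo_signal, histo_count


# Hypothetical 4-pixel example falling into a 2 x 2 grid
radial = np.array([0.1, 0.2, 0.6, 0.9])
azimuthal = np.array([-1.0, 1.0, -1.0, 1.0])
data = np.array([1.0, 3.0, 5.0, 7.0])
r_centers, a_centers, intensity2d, histo_signal2d, histo_count2d = \
    histogram2d_sketch(radial, azimuthal, data, np.ones(4), 2, 2,
                       (0.0, 1.0), (-2.0, 2.0))
```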

kernel_files = ['silx:opencl/doubleword.cl', 'pyfai:openCL/preprocess.cl', 'pyfai:openCL/ocl_histo.cl']
mapping = {<class 'numpy.int8'>: 's8_to_float', <class 'numpy.uint8'>: 'u8_to_float', <class 'numpy.int16'>: 's16_to_float', <class 'numpy.uint16'>: 'u16_to_float', <class 'numpy.uint32'>: 'u32_to_float', <class 'numpy.int32'>: 's32_to_float'}
set_kernel_arguments()

Tie arguments of OpenCL kernel-functions to the actual kernels

pyFAI.opencl.azim_lut module

class pyFAI.opencl.azim_lut.OCL_LUT_Integrator(lut, image_size, checksum=None, empty=None, unit=None, bin_centers=None, azim_centers=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)

Bases: OpenclProcessing

Class in charge of doing a sparse-matrix multiplication in OpenCL using the LUT representation of the matrix. It also performs the preprocessing using the preproc kernel

BLOCK_SIZE = 32
__init__(lut, image_size, checksum=None, empty=None, unit=None, bin_centers=None, azim_centers=None, mask_checksum=None, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)

Constructor of the OCL_LUT_Integrator class

Parameters:
  • lut – array of int32 - float32 with shape (nbins, lut_size) with indexes and coefficients

  • image_size – Expected image size: image.size

  • checksum – pre-calculated checksum of the LUT to prevent re-calculating it :)

  • empty – value to be assigned to bins without contribution from any pixel

  • unit – Storage for the unit related to the LUT

  • bin_centers – the radial position of the bin_center, place_holder

  • azim_centers – the azimuthal position of the bin_center, place_holder

  • mask_checksum – placeholder for the checksum of the mask

  • ctx – actual working context, left to None for automatic initialization from device type or platformid/deviceid

  • devicetype – type of device, can be “CPU”, “GPU”, “ACC” or “ALL”

  • platformid – integer with the platform_identifier, as given by clinfo

  • deviceid – Integer with the device identifier, as given by clinfo

  • block_size – preferred workgroup size, may vary depending on the outcome of the compilation

  • profile – switch on profiling to be able to profile at the kernel level, store profiling elements (makes code slightly slower)
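To illustrate the lut layout described above, a dense table of shape (nbins, lut_size) holding an int32 index and a float32 coefficient per entry, here is a small NumPy sketch. The field names idx and coef, and the use of zero coefficients as padding for short rows, are assumptions made for this example.

```python
import numpy as np

# Assumed record layout: one (pixel index, coefficient) pair per entry
lut_dtype = np.dtype([("idx", np.int32), ("coef", np.float32)])

# Hypothetical table: 2 bins, at most 3 contributing pixels per bin
lut = np.zeros((2, 3), dtype=lut_dtype)
lut["idx"][0, :2] = [0, 1]
lut["coef"][0, :2] = [1.0, 0.5]
lut["idx"][1, :2] = [1, 2]
lut["coef"][1, :2] = [0.5, 1.0]
# Unused slots keep idx = 0 and coef = 0, so they add nothing to the sums

image = np.array([10.0, 20.0, 30.0, 40.0], dtype=np.float32)

# Sparse-matrix multiplication: gather pixels, weight them, sum along each row
sum_data = (lut["coef"] * image[lut["idx"]]).sum(axis=-1)
sum_count = lut["coef"].sum(axis=-1)
merged = sum_data / sum_count
```

Compared with the CSR form, every row has the same length, which makes indexing trivial at the cost of storing padding for bins with few contributing pixels.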

buffers = [BufferDescription(name='output', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='output4', size=4, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='tmp', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='image_raw', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='image', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='variance', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark_variance', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='flat', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='polarization', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='solidangle', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='absorption', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='mask', size=1, dtype=<class 'numpy.int8'>, flags=4)]
property check_mask
property checksum
compile_kernels(kernel_file=None)

Call the OpenCL compiler

Parameters:

kernel_file – path to the kernel (by default use the one in the resources directory)

integrate(data, dummy=None, delta_dummy=None, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, preprocess_only=False, safe=True, normalization_factor=1.0, coef_power=1, out_merged=None, out_sum_data=None, out_sum_count=None)

Before performing azimuthal integration, the preprocessing is:

\[data = (data - dark) / (flat * solidangle * polarization * absorption)\]

Integration is performed using the LUT representation of the look-up table

Parameters:
  • dark – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compares arrays on GPU according to their checksum; otherwise, the buffer location is used

  • preprocess_only – if True, return the dark-subtracted, flat-field, solid-angle and polarization corrected image instead of integrating

  • normalization_factor – divide raw signal by this value

  • coef_power – set to 2 for variance propagation, leave to 1 for mean calculation

  • out_merged – destination array or pyopencl array for averaged data

  • out_sum_data – destination array or pyopencl array for sum of all data

  • out_sum_count – destination array or pyopencl array for sum of the number of pixels

Returns:

averaged data, weighted histogram, unweighted histogram

integrate_legacy(data, dummy=None, delta_dummy=None, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, preprocess_only=False, safe=True, normalization_factor=1.0, coef_power=1, out_merged=None, out_sum_data=None, out_sum_count=None)

Before performing azimuthal integration, the preprocessing is:

\[data = (data - dark) / (flat * solidangle * polarization * absorption)\]

Integration is performed using the LUT representation of the look-up table

Parameters:
  • dark – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compare arrays on the GPU according to their checksum; otherwise, rely on the buffer location

  • preprocess_only – if True, return the dark-subtracted image corrected for flat field, solid angle and polarization, without integrating

  • normalization_factor – divide raw signal by this value

  • coef_power – set to 2 for variance propagation, leave to 1 for mean calculation

  • out_merged – destination array or pyopencl array for averaged data

  • out_sum_data – destination array or pyopencl array for sum of all data

  • out_sum_count – destination array or pyopencl array for sum of the number of pixels

Returns:

averaged data, weighted histogram, unweighted histogram
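The preprocessing formula and the three returned arrays can be illustrated with a pure-Python reference. This is a sketch under assumptions, not the OpenCL kernel: the look-up table is modelled as one list of (pixel_index, coefficient) pairs per bin, the unweighted histogram is taken to be the sum of coefficients, and the function names are hypothetical.

```python
def preprocess(data, dark, flat, solidangle, polarization, absorption):
    """Apply (data - dark) / (flat * solidangle * polarization * absorption) per pixel."""
    return [(d - k) / (f * s * p * a)
            for d, k, f, s, p, a in zip(data, dark, flat, solidangle,
                                        polarization, absorption)]


def integrate_lut(corrected, lut, empty=0.0):
    """lut[bin] is a list of (pixel_index, coefficient) pairs (assumed layout)."""
    merged, sum_data, sum_count = [], [], []
    for row in lut:
        weighted = sum(coef * corrected[idx] for idx, coef in row)
        count = sum(coef for idx, coef in row)
        sum_data.append(weighted)   # weighted histogram
        sum_count.append(count)     # unweighted histogram
        # Bins without any contributing pixel receive the `empty` value
        merged.append(weighted / count if count > 0 else empty)
    return merged, sum_data, sum_count


data = [12.0, 22.0, 32.0, 42.0]
dark = [2.0] * 4
flat = [2.0] * 4
ones = [1.0] * 4
corrected = preprocess(data, dark, flat, ones, ones, ones)

# Pixel 1 is split evenly between bin 0 and bin 1; bin 2 receives nothing
lut = [[(0, 1.0), (1, 0.5)], [(1, 0.5), (2, 1.0), (3, 1.0)], []]
merged, sum_data, sum_count = integrate_lut(corrected, lut)
print(merged, sum_data, sum_count)
```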

integrate_ng(data, dark=None, dummy=None, delta_dummy=None, error_model=ErrorModel.NO, variance=None, dark_variance=None, flat=None, solidangle=None, polarization=None, absorption=None, dark_checksum=None, flat_checksum=None, solidangle_checksum=None, polarization_checksum=None, absorption_checksum=None, dark_variance_checksum=None, safe=True, normalization_factor=1.0, out_avgint=None, out_sem=None, out_std=None, out_merged=None)

Before performing azimuthal integration with proper variance propagation, the preprocessing is:

\[\begin{aligned}
signal &= raw - dark \\
variance &= variance + dark\_variance \\
normalization &= normalization\_factor * (flat * solidangle * polarization * absorption) \\
count &= \text{number of pixels contributing}
\end{aligned}\]

Integration is performed using the LUT representation of the look-up table on all arrays: signal, variance, normalization and count

Parameters:
  • dark – array of same shape as data for pre-processing

  • dummy – value for invalid data

  • delta_dummy – precision for dummy assessment

  • error_model – select the ErrorModel (defined in enum), use POISSON to enforce variance=signal

  • variance – array of same shape as data for pre-processing

  • dark_variance – array of same shape as data for pre-processing

  • flat – array of same shape as data for pre-processing

  • solidangle – array of same shape as data for pre-processing

  • polarization – array of same shape as data for pre-processing

  • dark_checksum – CRC32 checksum of the given array

  • flat_checksum – CRC32 checksum of the given array

  • solidangle_checksum – CRC32 checksum of the given array

  • polarization_checksum – CRC32 checksum of the given array

  • safe – if True (default), compare arrays on the GPU according to their checksum; otherwise, rely on the buffer location

  • preprocess_only – if True, return the dark-subtracted image corrected for flat field, solid angle and polarization, without integrating

  • normalization_factor – divide raw signal by this value

  • out_avgint – destination array or pyopencl array for average intensity

  • out_sem – destination array or pyopencl array for standard deviation (of mean)

  • out_std – destination array or pyopencl array for standard deviation (of pixels)

  • out_merged – destination array or pyopencl array for averaged data (float8!)

Returns:

large namedtuple with out_avgint, out_sem, out_merged …
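The four accumulated quantities (signal, variance, normalization, count) can be sketched per bin in pure Python. Several points here are assumptions rather than documented behaviour: the look-up table layout (a list of (pixel_index, coefficient) pairs per bin), the weighting of the variance by the squared coefficient, and the final formulas avgint = signal / normalization and sem = sqrt(variance) / normalization. The function name is hypothetical.

```python
import math


def integrate_ng_sketch(raw, dark, variance, dark_variance, norm, lut,
                        normalization_factor=1.0):
    """Accumulate signal, variance, normalization and count for every bin."""
    results = []
    for row in lut:
        sig = var = nrm = cnt = 0.0
        for idx, coef in row:
            sig += coef * (raw[idx] - dark[idx])
            # Squared weight: usual rule for a linear combination of
            # independent pixels (assumption about the kernel)
            var += coef * coef * (variance[idx] + dark_variance[idx])
            nrm += coef * normalization_factor * norm[idx]
            cnt += coef
        if nrm > 0:
            avgint = sig / nrm
            sem = math.sqrt(var) / nrm
        else:
            avgint = sem = 0.0  # bin without contribution
        results.append((avgint, sem, sig, var, nrm, cnt))
    return results


raw = [10.0, 20.0, 30.0]
zeros = [0.0, 0.0, 0.0]
ones = [1.0, 1.0, 1.0]
# Poisson-like case: the variance of each pixel equals its signal
lut = [[(0, 1.0), (1, 1.0)], [(2, 1.0)], []]
res = integrate_ng_sketch(raw, zeros, raw, zeros, ones, lut)
for avgint, sem, sig, var, nrm, cnt in res:
    print(avgint, sem, cnt)
```

Keeping the four sums separate until the very end is what allows the uncertainty to be propagated: dividing pixel by pixel before summing would lose the information needed to compute it.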

kernel_files = ['silx:opencl/doubleword.cl', 'pyfai:openCL/preprocess.cl', 'pyfai:openCL/memset.cl', 'pyfai:openCL/ocl_azim_LUT.cl']
mapping = {<class 'numpy.int8'>: 's8_to_float', <class 'numpy.uint8'>: 'u8_to_float', <class 'numpy.int16'>: 's16_to_float', <class 'numpy.uint16'>: 'u16_to_float', <class 'numpy.uint32'>: 'u32_to_float', <class 'numpy.int32'>: 's32_to_float'}
send_buffer(data, dest, checksum=None)

Send a numpy array to the device, including the cast on the device if possible

Parameters:
  • data – numpy array with data

  • dest – name of the buffer as registered in the class

  • convert – if True (default) convert dtype on GPU, if false, leave as it is.

Returns:

the actual buffer where the data were sent

set_kernel_arguments()

Tie arguments of OpenCL kernel-functions to the actual kernels

set_kernel_arguments() is a private method, called by configure(). It uses the dictionary _cl_kernel_args. Note that by default, since TthRange is disabled, the integration kernels have tth_min_max tied to the tthRange argument slot. When setRange is called, it replaces that argument with the tthRange lower and upper bounds. When unsetRange is called, the argument slot is reset to tth_min_max.
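The slot-swapping behaviour described above can be pictured with an ordered dictionary per kernel. The sketch below is illustrative only: KernelArgs, set_range, unset_range and positional are hypothetical names, and strings or tuples stand in for the OpenCL buffers.

```python
from collections import OrderedDict


class KernelArgs:
    """Hypothetical stand-in for the _cl_kernel_args bookkeeping."""

    def __init__(self, tth_min_max):
        self.tth_min_max = tth_min_max  # full range, used while no range is set
        self._cl_kernel_args = {
            "integrate": OrderedDict([("image", "image_buf"),
                                      ("tthRange", tth_min_max),
                                      ("output", "out_buf")]),
        }

    def set_range(self, lower, upper):
        # Replace the default with explicit lower and upper bounds
        self._cl_kernel_args["integrate"]["tthRange"] = (lower, upper)

    def unset_range(self):
        # Restore the default argument in the same slot
        self._cl_kernel_args["integrate"]["tthRange"] = self.tth_min_max

    def positional(self, kernel):
        # Kernels take positional arguments, so the insertion order matters
        return list(self._cl_kernel_args[kernel].values())


args = KernelArgs((0.0, 3.0))
default_args = args.positional("integrate")
args.set_range(1.0, 2.0)
ranged_args = args.positional("integrate")
args.unset_range()
restored_args = args.positional("integrate")
print(default_args, ranged_args, restored_args)
```

Assigning to an existing key keeps its position in the dictionary, so the argument order passed to the kernel never changes when the range is set or unset.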

pyFAI.opencl.preproc module

OpenCL implementation of the preproc module

class pyFAI.opencl.preproc.OCL_Preproc(image_size=None, image_dtype=None, image=None, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, mask=None, dummy=None, delta_dummy=None, empty=None, split_result=False, calc_variance=False, error_model=ErrorModel.NO, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=32, profile=False)

Bases: OpenclProcessing

OpenCL class for pre-processing … mainly for demonstration

__init__(image_size=None, image_dtype=None, image=None, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, mask=None, dummy=None, delta_dummy=None, empty=None, split_result=False, calc_variance=False, error_model=ErrorModel.NO, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=32, profile=False)
Parameters:
  • image_size – (int) number of elements of the input image

  • image_dtype – dtype of the input image

  • image – retrieve image_size and image_dtype from template

  • dark – dark current image as numpy array

  • flat – flat field image as numpy array

  • solidangle – solid angle image as numpy array

  • absorption – absorption image as numpy array

  • mask – array of int8 with 0 where the data are valid

  • dummy – value of impossible values: dynamic mask

  • delta_dummy – precision for dummy values

  • empty – value to be assigned to pixels without contribution (i.e. masked)

  • split_result – return the result as a tuple: data, [variance], normalization, so the last dimension becomes 2 or 3; it can also be 4 for full error propagation

  • calc_variance – report the result as data, variance, normalization

  • error_model – default error-model to be used

  • ctx – actual working context, left to None for automatic initialization from device type or platformid/deviceid

  • devicetype – type of device, can be “CPU”, “GPU”, “ACC” or “ALL”

  • platformid – integer with the platform_identifier, as given by clinfo

  • deviceid – Integer with the device identifier, as given by clinfo

  • block_size – preferred workgroup size, may vary depending on the outcome of the compilation

  • profile – switch on profiling to be able to profile at the kernel level, store profiling elements (makes code slower)

buffers = [BufferDescription(name='output', size=4, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='image', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='image_raw', size=8, dtype=<class 'numpy.uint8'>, flags=1), BufferDescription(name='temp', size=4, dtype=<class 'numpy.uint8'>, flags=1), BufferDescription(name='variance', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark', size=1, dtype=<class 'numpy.float32'>, flags=1), BufferDescription(name='dark_variance', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='flat', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='polarization', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='solidangle', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='absorption', size=1, dtype=<class 'numpy.float32'>, flags=4), BufferDescription(name='mask', size=1, dtype=<class 'numpy.int8'>, flags=4)]
compile_kernels(kernel_files=None, compile_options=None)

Call the OpenCL compiler

Parameters:

kernel_files – list of paths to the kernel files (by default, use the ones declared in the class)

property delta_dummy
property dummy
property empty
kernel_files = ['pyfai:openCL/preprocess.cl']
mapping = {<class 'numpy.int8'>: 's8_to_float', <class 'numpy.uint8'>: 'u8_to_float', <class 'numpy.int16'>: 's16_to_float', <class 'numpy.uint16'>: 'u16_to_float', <class 'numpy.uint32'>: 'u32_to_float', <class 'numpy.int32'>: 's32_to_float'}
process(image, dark=None, variance=None, dark_variance=None, normalization_factor=1.0, error_model=None, split_result=None, out=None)

Perform the pixel-wise operation of the array

Parameters:
  • image – numpy array with the input image

  • dark – numpy array with the dark-current image

  • variance – numpy array with the variance of input image

  • dark_variance – numpy array with the variance of dark-current image

  • normalization_factor – divide the result by this

  • error_model – set to “poisson” to set variance=signal (minimum 1). None uses the default from constructor

  • out – output buffer to save a malloc

Returns:

array with processed data, may be an array of (data,variance,normalization) depending on class initialization

send_buffer(data, dest, convert=True)

Send a numpy array to the device

Parameters:
  • data – numpy array with data

  • dest – name of the buffer as registered in the class.

  • convert – if True (default) convert dtype on GPU, if false, leave as it is in buffer named dest_raw

Returns:

the destination buffer/array
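The interplay between the mapping attribute, the convert flag and the dest_raw buffer can be summarised as a small decision function. This is a sketch built only from the descriptions above: plan_upload is a hypothetical helper, dtypes are identified by name to keep the example free of dependencies, and the handling of dtypes absent from the mapping (cast on the host) is an assumption.

```python
# Kernel names copied from the `mapping` attribute, keyed here by dtype name
CAST_KERNELS = {
    "int8": "s8_to_float", "uint8": "u8_to_float",
    "int16": "s16_to_float", "uint16": "u16_to_float",
    "uint32": "u32_to_float", "int32": "s32_to_float",
}


def plan_upload(dtype_name, dest, convert=True):
    """Return (buffer written by the host copy, cast kernel to run or None)."""
    if dtype_name == "float32":
        # Already in the working precision: copy straight into dest
        return dest, None
    if dtype_name in CAST_KERNELS:
        # Integer data go to the raw buffer; the cast into dest runs on the
        # device only when convert is True
        kernel = CAST_KERNELS[dtype_name] if convert else None
        return dest + "_raw", kernel
    # No device kernel for this dtype: assume a cast to float32 on the host
    return dest, None


print(plan_upload("uint16", "image"))
print(plan_upload("uint16", "image", convert=False))
print(plan_upload("float64", "image"))
```

Casting on the device means that only the compact integer data cross the bus, which is the usual motivation for such a two-buffer scheme.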

set_kernel_arguments()

Tie arguments of OpenCL kernel-functions to the actual kernels

pyFAI.opencl.preproc.preproc(raw, dark=None, flat=None, solidangle=None, polarization=None, absorption=None, mask=None, dummy=None, delta_dummy=None, normalization_factor=1.0, empty=None, split_result=False, variance=None, dark_variance=None, error_model=ErrorModel.NO, dtype=<class 'numpy.float32'>, out=None)

Common preprocessing step, implemented using OpenCL. May be inefficient

Parameters:
  • raw – raw value, as a numpy array, 1D or 2D

  • mask – array non null where data should be ignored

  • dummy – value of invalid data

  • delta_dummy – precision for invalid data

  • dark – array containing the value of the dark noise, to be subtracted

  • flat – Array containing the flatfield image. It is also checked for dummies if relevant.

  • solidangle – the value of the solid angle. This processing may be performed during the rebinning instead. Left for compatibility

  • polarization – Correction for polarization of the incident beam

  • absorption – Correction for absorption in the sensor volume

  • normalization_factor – final value is divided by this

  • empty – value to be given for empty pixels

  • split_result – set to true to separate numerator from denominator and return an array of float2 or float3 (with variance)

  • variance – provide an estimation of the variance; enforces split_result=True and returns a float3 array with the variance in second position.

  • error_model – set to POISSONIAN to assume variance=signal

  • dtype – dtype for all processing

  • out – output buffer to save a malloc

All calculations are performed in single precision floating point (32 bits).

NaNs are always considered invalid values.

If neither empty nor dummy is provided, empty pixels are 0. Empty pixels are always zero in “split_result” mode.

Split result:

  • When set to False, i.e. the default, the pixel-wise operation is: I = (raw - dark)/(flat * solidangle * polarization * absorption). Invalid pixels are set to the dummy or empty value.

  • When split_result is set to True, each result is a float2 or a float3 (with an additional value for the variance) as such:

    I = [(raw - dark), (variance), (flat * solidangle * polarization * absorption)]

    Empty pixels will have all their 2 or 3 values set to 0 (and not to the dummy or empty value).

  • If error_model is set to poissonian, the variance is evaluated as (raw + dark), with a minimum of 1.
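The rules above can be condensed into a per-pixel reference function, roughly what one work-item would compute. This is a sketch, not the OpenCL kernel: preproc_pixel is a hypothetical name, the precedence of empty over dummy for invalid pixels is an assumption, and so is treating a zero or NaN normalization as invalid.

```python
import math


def preproc_pixel(raw, dark=0.0, flat=1.0, solidangle=1.0, polarization=1.0,
                  absorption=1.0, masked=False, dummy=None, delta_dummy=0.0,
                  normalization_factor=1.0, empty=None,
                  split_result=False, poisson=False):
    """Per-pixel version of the preprocessing rules (illustrative only)."""
    invalid = masked or math.isnan(raw)  # NaN is always invalid
    if dummy is not None:
        invalid = invalid or abs(raw - dummy) <= delta_dummy
    norm = normalization_factor * flat * solidangle * polarization * absorption
    invalid = invalid or math.isnan(norm) or norm == 0.0  # assumption

    if split_result:
        if invalid:
            # Empty pixels are all zeros in split mode, never dummy or empty
            return (0.0, 0.0, 0.0) if poisson else (0.0, 0.0)
        signal = raw - dark
        if poisson:
            # Variance sits in second position, with a minimum of 1
            return (signal, max(raw + dark, 1.0), norm)
        return (signal, norm)

    if invalid:
        if empty is not None:
            return empty
        return dummy if dummy is not None else 0.0
    return (raw - dark) / norm


print(preproc_pixel(12.0, dark=2.0, flat=2.0))
print(preproc_pixel(12.0, dark=2.0, flat=2.0, split_result=True, poisson=True))
print(preproc_pixel(float("nan"), split_result=True))
```

Keeping numerator and denominator separate in split mode lets a later integration step sum them independently before dividing.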

pyFAI.opencl.sort module

Module for 2D sort based on OpenCL for median filtering and Bragg/amorphous separation on GPU.

class pyFAI.opencl.sort.Separator(npt_height=512, npt_width=1024, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)

Bases: OpenclProcessing

Implementation of sort, median filter and trimmed-mean in pyopencl

DUMMY = -3.4028235e+38
__init__(npt_height=512, npt_width=1024, ctx=None, devicetype='all', platformid=None, deviceid=None, block_size=None, profile=False)
Parameters:
  • ctx – context

  • block_size – 1 on macOSX on CPU

  • profile – turn on profiling

allocate_buffers(*arg, **kwarg)

Allocate OpenCL buffers required for a specific configuration

Note that an OpenCL context also requires some memory, as do Events and other OpenCL functionalities, which cannot be and are not taken into account here. The memory required by a context varies depending on the device: typically 65Mb for a GTX580 but ~15Mb for a 9300m. In addition, a GPU will always have at least 3-5Mb of memory in use. Unfortunately, OpenCL does NOT have a built-in way to check the actual free memory on a device, only the total memory.
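Because only the total device memory is known, a pre-allocation check can at best compare the requested buffer sizes with that total, keeping a margin for the context and for memory already in use. A minimal sketch of such a check follows. BufferSpec, required_bytes and fits_on_device are hypothetical names, and the 20% margin is an arbitrary illustration, not a value taken from pyFAI.

```python
from collections import namedtuple

# Simplified buffer description: number of elements and bytes per element
BufferSpec = namedtuple("BufferSpec", ["name", "size", "itemsize"])


def required_bytes(buffers):
    """Total number of bytes requested by a list of buffer descriptions."""
    return sum(b.size * b.itemsize for b in buffers)


def fits_on_device(buffers, global_mem_size, safety_margin=0.2):
    """Compare the request with the TOTAL memory, minus a safety margin."""
    return required_bytes(buffers) <= (1.0 - safety_margin) * global_mem_size


# Two float32 work buffers for the default 512 x 1024 geometry
npt_height, npt_width = 512, 1024
buffers = [BufferSpec("input_data", npt_height * npt_width, 4),
           BufferSpec("vector_vertical", npt_height * npt_width, 4)]

total = required_bytes(buffers)
print(total, fits_on_device(buffers, 1 << 30), fits_on_device(buffers, 5_000_000))
```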

filter_horizontal(data, dummy=None, quantile=0.5)

Sort the data along the horizontal axis (radial)

Parameters:
  • data – numpy or pyopencl array

  • dummy – dummy value

  • quantile

Returns:

pyopencl array

filter_vertical(data, dummy=None, quantile=0.5)

Sort the data along the vertical axis (azimuthal)

Parameters:
  • data – numpy or pyopencl array

  • dummy – dummy value

  • quantile

Returns:

pyopencl array
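A quantile filter of this kind amounts to sorting each column and picking the element at the requested fractional position, with quantile=0.5 giving a median. The pure-Python sketch below illustrates the idea on a list of rows. The function name is hypothetical, and both the rounding of the index and the result for a column made only of dummy values are assumptions.

```python
def filter_vertical_sketch(data, dummy=None, quantile=0.5):
    """For each column, sort the valid values and pick the one at `quantile`."""
    height, width = len(data), len(data[0])
    out = []
    for col in range(width):
        valid = sorted(data[row][col] for row in range(height)
                       if dummy is None or data[row][col] != dummy)
        if not valid:
            out.append(dummy)  # column made only of dummy values
            continue
        # Truncate to an index, clamped so that quantile=1.0 stays in range
        pos = min(int(quantile * len(valid)), len(valid) - 1)
        out.append(valid[pos])
    return out


image = [[1, 9, -1],
         [3, 7, -1],
         [2, 8, -1]]
median = filter_vertical_sketch(image, dummy=-1)
lowest = filter_vertical_sketch(image, dummy=-1, quantile=0.0)
highest = filter_vertical_sketch(image, dummy=-1, quantile=1.0)
print(median, lowest, highest)
```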

kernel_files = ['silx:opencl/doubleword.cl', 'pyfai:openCL/bitonic.cl', 'pyfai:openCL/separate.cl', 'pyfai:openCL/sigma_clip.cl']
mean_std_horizontal(data, dummy=None)

Calculate the mean and std along a row

mean_std_vertical(data, dummy=None)

Calculate the mean and std along a column. The column size has to be a multiple of 8 and <8192

set_kernel_arguments()

Tie arguments of OpenCL kernel-functions to the actual kernels

set_kernel_arguments() is a private method, called by configure(). It uses the dictionary _cl_kernel_args.

Note that by default, since TthRange is disabled, the integration kernels have tth_min_max tied to the tthRange argument slot.

When setRange is called, it replaces that argument with the tthRange lower and upper bounds. When unsetRange is called, the argument slot is reset to tth_min_max.

sigma_clip_horizontal(data, sigma_lo=3, sigma_hi=None, max_iter=5, dummy=None)

Calculate the iterative sigma-clipped mean and std per row. The column size has to be a multiple of 8 and <8192

sigma_clip_vertical(data, sigma_lo=3, sigma_hi=None, max_iter=5, dummy=None)

Calculate the iterative sigma-clipped mean and std per column. The column size has to be a multiple of 8 and <8192
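Iterative sigma-clipping repeatedly discards the values lying too far from the mean, then recomputes the mean and the standard deviation on what remains. The sketch below applies it to a single row or column using the standard library. Several details are assumptions: sigma_hi defaulting to sigma_lo, the use of the population standard deviation, and the stopping rules. The function name is hypothetical.

```python
import statistics


def sigma_clip(values, sigma_lo=3, sigma_hi=None, max_iter=5, dummy=None):
    """Return (mean, std) after iteratively rejecting outliers."""
    if sigma_hi is None:
        sigma_hi = sigma_lo  # symmetric clipping by default (assumption)
    kept = [v for v in values if dummy is None or v != dummy]
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        mean = statistics.fmean(kept)
        std = statistics.pstdev(kept)
        clipped = [v for v in kept
                   if mean - sigma_lo * std <= v <= mean + sigma_hi * std]
        if not clipped or len(clipped) == len(kept):
            break  # nothing left to reject, or nothing sensible to keep
        kept = clipped
    if not kept:
        return dummy, dummy
    return statistics.fmean(kept), statistics.pstdev(kept)


# Twenty well-behaved values, one strong outlier and one masked pixel
row = [9.0, 11.0] * 10 + [100.0, -1.0]
mean, std = sigma_clip(row, dummy=-1.0)
print(mean, std)
```

In the example, the first pass rejects the value 100.0 and the second pass finds nothing more to reject, so the loop stops well before max_iter.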

sort_horizontal(data, dummy=None)

Sort the data along the horizontal axis (radial)

Parameters:
  • data – numpy or pyopencl array

  • dummy – dummy value

Returns:

pyopencl array

sort_vertical(data, dummy=None)

Sort the data along the vertical axis (azimuthal)

Parameters:
  • data – numpy or pyopencl array

  • dummy – dummy value

Returns:

pyopencl array

trimmed_mean_horizontal(data, dummy=None, quantiles=(0.5, 0.5))

Perform a trimmed mean (mean without the extremes) after sorting the data along the horizontal axis (radial)

Parameters:
  • data – numpy or pyopencl array

  • dummy – dummy value

  • quantiles

Returns:

pyopencl array

trimmed_mean_vertical(data, dummy=None, quantiles=(0.5, 0.5))

Perform a trimmed mean (mean without the extremes) after sorting the data along the vertical axis (azimuthal)

Parameters:
  • data – numpy or pyopencl array

  • dummy – dummy value

  • quantiles

Returns:

pyopencl array
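A trimmed mean sorts the valid values, drops those below the lower quantile and above the upper one, and averages the rest. With quantiles=(0.5, 0.5) nothing but the central value survives, which gives a median-like result. The sketch below handles one row or column. The function name is hypothetical, and the index rounding and the rule of always keeping at least one value are assumptions.

```python
def trimmed_mean(values, dummy=None, quantiles=(0.5, 0.5)):
    """Mean of the sorted valid values lying between the two quantiles."""
    valid = sorted(v for v in values if dummy is None or v != dummy)
    if not valid:
        return dummy
    n = len(valid)
    lower = min(int(quantiles[0] * n), n - 1)
    # Always keep at least one value, and never run past the end
    upper = min(max(int(quantiles[1] * n), lower + 1), n)
    kept = valid[lower:upper]
    return sum(kept) / len(kept)


ring = [100.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, -50.0]
central = trimmed_mean(ring, quantiles=(0.25, 0.75))
median_like = trimmed_mean(ring)
print(central, median_like)
```

In the example, the two extreme values on each side are discarded by the (0.25, 0.75) window, so neither outlier pulls the average.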

Module contents

Contains all OpenCL implementations.

pyFAI.opencl.dtype_converter(dtype)

Convert a numpy dtype to an int8

pyFAI.opencl.get_x87_volatile_option(ctx)