
nabu.cuda.utils

source module nabu.cuda.utils

Functions

  • get_cuda_stream

  • detect_cuda_gpus Detect the available Nvidia CUDA GPUs on the current host.

  • collect_cuda_gpus Return a dictionary mapping GPU ids to a brief description (a few fields) of each CUDA-compatible GPU.

  • dtype_to_ctype

  • to_int2 Convert a 1D array of length 2 into an "int2" data type. Beware: the first coordinate is x (as opposed to the usual Python/NumPy/C convention, where the first index is the row, i.e. y).

  • create_texture Create a texture object.

  • cupy_array_from_ptr

  • copy_to_cuarray_with_offset Copy a cupy.ndarray into a CUDA Array (the matrix-like data structure that backs textures).

source get_cuda_stream(device_id=None, force_create=False)

Notes

In cupy, context/device/stream management is not very explicit: cupy appears to hide these details from the user and handle them automatically. This raises several issues:

  • cuda.get_current_stream() always returns device_id = -1 (even when called with device_id != -1)
  • A Device object does not have a list of attached streams
  • A Stream object does not record its device, so references to the various objects have to be kept explicitly

Solution so far

  • Call Device(id).use() at program startup
  • Use cuda.runtime.getDevice()
  • Use cuda.stream.Stream()
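
The bookkeeping described above can be sketched as a small registry that remembers which stream belongs to which device. This is only an illustration, not the actual implementation of get_cuda_stream: StreamRegistry and make_cupy_registry are hypothetical names, and the cupy wiring assumes cupy.cuda.Device, cupy.cuda.Stream and cupy.cuda.runtime.getDevice are available on the host.

```python
class StreamRegistry:
    """Keep one stream per device id (hypothetical helper).

    A stream does not expose the device it belongs to, so the mapping
    device id -> stream is stored here explicitly.
    """

    def __init__(self, stream_factory, current_device):
        self._factory = stream_factory          # callable: device_id -> stream
        self._current_device = current_device   # callable: () -> device_id
        self._streams = {}

    def get(self, device_id=None, force_create=False):
        # Fall back to the device currently in use when none is given
        if device_id is None:
            device_id = self._current_device()
        # Create a stream only once per device, unless explicitly forced
        if force_create or device_id not in self._streams:
            self._streams[device_id] = self._factory(device_id)
        return self._streams[device_id]


def make_cupy_registry():
    """Wire the registry to cupy (assumes cupy and a CUDA device are present)."""
    import cupy

    def make_stream(device_id):
        # The stream is created while the requested device is current
        with cupy.cuda.Device(device_id):
            return cupy.cuda.Stream()

    return StreamRegistry(make_stream, cupy.cuda.runtime.getDevice)
```

Passing the factory as an argument keeps the caching logic independent of cupy, so it can be exercised on a machine without a GPU.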

source detect_cuda_gpus()

Detect the available Nvidia CUDA GPUs on the current host.

Returns

  • gpus : dict Dictionary where the key is the GPU ID, and the value is a cupy.cuda.Device object.

  • error_msg : str If an error occurred, its message is returned in this item. Otherwise it is None.
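
A minimal sketch of a function with the same return shape is given below, to show how the (gpus, error_msg) pair might be consumed. It is not the nabu implementation: detect_gpus_sketch is a hypothetical name, and it assumes that cupy.cuda.runtime.getDeviceCount() and cupy.cuda.Device(i) are the relevant cupy calls.

```python
def detect_gpus_sketch():
    """Return (gpus, error_msg), mirroring the return values described above."""
    try:
        import cupy  # optional dependency: may be missing or unusable

        n_devices = cupy.cuda.runtime.getDeviceCount()
        gpus = {i: cupy.cuda.Device(i) for i in range(n_devices)}
    except Exception as exc:  # ImportError, CUDA runtime errors, ...
        # Report the problem through the second item instead of raising
        return {}, f"{type(exc).__name__}: {exc}"
    return gpus, None


gpus, error_msg = detect_gpus_sketch()
if error_msg is not None:
    print("No usable CUDA GPU:", error_msg)
else:
    print("Found GPU ids:", sorted(gpus))
```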

source collect_cuda_gpus(on_error='warn')

Return a dictionary mapping GPU ids to a brief description (a few fields) of each CUDA-compatible GPU.

Parameters

  • on_error : str, optional What to do on error. Possible values:
      - "warn": print a warning
      - "raise": throw an error

Raises

  • RuntimeError
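
The on_error policy can be illustrated with the following sketch. It is an assumption-laden stand-in, not the nabu code: collect_gpus_sketch is a hypothetical name, the description fields (name, memory_GB, compute_capability) are chosen for illustration, and returning an empty dictionary after a warning is also an assumption.

```python
import warnings


def collect_gpus_sketch(on_error="warn"):
    """Return {gpu_id: description}, applying the on_error policy on failure."""
    if on_error not in ("warn", "raise"):
        raise ValueError(f"Unknown on_error value: {on_error!r}")
    try:
        import cupy  # optional dependency: may be missing or unusable

        runtime = cupy.cuda.runtime
        descriptions = {}
        for device_id in range(runtime.getDeviceCount()):
            props = runtime.getDeviceProperties(device_id)
            name = props["name"]
            if isinstance(name, bytes):
                name = name.decode()
            # Illustrative fields only; the real function may report others
            descriptions[device_id] = {
                "name": name,
                "memory_GB": props["totalGlobalMem"] / 1e9,
                "compute_capability": (props["major"], props["minor"]),
            }
        return descriptions
    except Exception as exc:
        msg = f"Could not collect CUDA GPUs: {exc}"
        if on_error == "raise":
            raise RuntimeError(msg) from exc
        warnings.warn(msg)
        return {}  # assumption: an empty result after a warning
```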

source dtype_to_ctype(dtype)

source to_int2(arr)

Convert a 1D array of length 2 into an "int2" data type. Beware: the first coordinate is x (as opposed to the usual Python/NumPy/C convention, where the first index is the row, i.e. y).
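
The difference between the two conventions can be illustrated with a ctypes mirror of the CUDA int2 struct. This sketch only shows the x-first layout; Int2 and shape_to_int2 are hypothetical names, and whether to_int2 itself reorders its input or expects the caller to pass (x, y) is not stated here.

```python
import ctypes


class Int2(ctypes.Structure):
    """Mirror of the CUDA int2 vector type: two 32-bit integers, x first."""

    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


def shape_to_int2(shape):
    """Turn a NumPy-style (n_rows, n_cols) shape into an x-first Int2."""
    n_rows, n_cols = shape
    # x runs along columns (fast axis), y along rows (slow axis)
    return Int2(x=n_cols, y=n_rows)


size = shape_to_int2((3, 5))
print(size.x, size.y)  # 5 3
```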

source create_texture(shape, dtype, from_image=None, address_mode='border', filter_mode='linear', normalized_coords=False)

Create a texture object.

Parameters

  • shape : tuple of int Image shape

  • dtype : numpy.dtype Data type

  • from_image : numpy.ndarray, optional Image to be used as a texture. If provided, shape and dtype are taken from the image instead of the parameters above.

  • address_mode : str, optional Which address mode to use. Can be:
      - "border" (default): extension with zeros
      - "clamp": extension with edges
      - "mirror": extension with mirroring
      - "wrap": periodic extension

  • filter_mode : str, optional Which filter mode to use when accessing the texture at non-integer coordinates. Can be "linear" or "nearest". Default is (bi)linear filtering.

  • normalized_coords : bool, optional Whether to use normalized coordinates. Default is False.

Returns

  • tex_obj : cupy.cuda.texture.TextureObject Texture object that can be passed to kernels (where it is declared with the 'cudaTextureObject_t' type in the source code)

  • cu_arr : cupy.cuda.texture.CUDAarray CUDA array that backs the texture

Notes

  • Two-dimensional images default to single-channel textures.
  • Memory is allocated for the underlying "cuda array" object.

Raises

  • NotImplementedError
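
The address_mode and filter_mode options can be illustrated with a one-dimensional CPU sketch. It is only an approximation of what the texture hardware does: fetch and sample are hypothetical helpers, coordinates are unnormalized with texel centers at i + 0.5 (the usual CUDA convention), and the reduced precision of hardware interpolation weights is not reproduced.

```python
import math


def fetch(data, i, address_mode="border"):
    """Read texel i, resolving out-of-range indices per address mode."""
    n = len(data)
    if 0 <= i < n:
        return data[i]
    if address_mode == "border":
        return 0.0                          # extension with zeros
    if address_mode == "clamp":
        return data[min(max(i, 0), n - 1)]  # extension with edges
    if address_mode == "wrap":
        return data[i % n]                  # periodic extension
    if address_mode == "mirror":
        j = i % (2 * n)                     # reflect with period 2n
        return data[j if j < n else 2 * n - 1 - j]
    raise ValueError(f"Unknown address mode: {address_mode!r}")


def sample(data, x, address_mode="border", filter_mode="linear"):
    """Sample at coordinate x (texel i covers the interval [i, i + 1))."""
    if filter_mode == "nearest":
        return fetch(data, math.floor(x), address_mode)
    if filter_mode == "linear":
        xb = x - 0.5                        # shift to texel centers
        i = math.floor(xb)
        alpha = xb - i                      # interpolation weight in [0, 1)
        left = fetch(data, i, address_mode)
        right = fetch(data, i + 1, address_mode)
        return (1 - alpha) * left + alpha * right
    raise ValueError(f"Unknown filter mode: {filter_mode!r}")


row = [10.0, 20.0, 30.0]
print(sample(row, 1.0))                    # 15.0, halfway between texels 0 and 1
print(fetch(row, -1, "clamp"))             # 10.0, the edge value is repeated
```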

source cupy_array_from_ptr(ptr, shape, dtype, owner_reference)

source copy_to_cuarray_with_offset(dst_cuarray, src_array, offset_x=0, offset_y=0)

Copy a cupy.ndarray into a CUDA Array (the matrix-like data structure that backs textures).

Copies a matrix ('height' rows of 'width' bytes each) from the memory area pointed to by 'src' to the CUDA array 'dst' starting at the upper left corner (wOffset, hOffset) [...] 'spitch' is the width in memory in bytes of the 2D array pointed to by 'src', including any padding added to the end of each row.

  • 'wOffset' + 'width' must not exceed the width of the CUDA array 'dst'.
  • 'width' must not exceed 'spitch'.

NB: for copies without offset, we can just use dst_cuarray.copy_from(src_cupyarray)
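
The two constraints quoted above can be checked before launching a copy. The sketch below derives the byte quantities from array shapes and validates them; copy_2d_params is a hypothetical helper, and it assumes that the horizontal offset and the destination width are given in elements and that source and destination share the same item size.

```python
def copy_2d_params(src_shape, itemsize, dst_width, offset_x=0, src_row_bytes=None):
    """Compute and validate the byte parameters of a 2D copy with offset.

    src_shape     : (n_rows, n_cols) of the source array
    itemsize      : size of one element in bytes
    dst_width     : width of the destination CUDA array, in elements
    offset_x      : horizontal offset in the destination, in elements
    src_row_bytes : bytes per source row including padding (defaults to
                    a densely packed row)
    """
    height, n_cols = src_shape
    width = n_cols * itemsize                 # bytes copied per row
    spitch = width if src_row_bytes is None else src_row_bytes
    w_offset = offset_x * itemsize            # horizontal offset in bytes
    dst_width_bytes = dst_width * itemsize

    if width > spitch:
        raise ValueError("'width' must not exceed 'spitch'")
    if w_offset + width > dst_width_bytes:
        raise ValueError("'wOffset' + 'width' exceeds the width of 'dst'")
    return {"wOffset": w_offset, "width": width, "height": height, "spitch": spitch}


# A 4 x 6 float32 block placed 2 columns into a 10-column destination
print(copy_2d_params((4, 6), itemsize=4, dst_width=10, offset_x=2))
```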