scilpy.gpuparallel package

scilpy.gpuparallel.opencl_utils module

class scilpy.gpuparallel.opencl_utils.CLKernel(entrypoint, module, filename)

Bases: object

Wrapper for OpenCL kernel/program code.

Parameters:
  • entrypoint (string) – Name of the __kernel function in the .cl file.

  • module (string) – Scilpy module in which the kernel code is located.

  • filename (string) – Name of the file containing the kernel code.

property code_string
property entry_point
set_define(def_name, value)

Set the value for a compiler definition in the kernel code. This method will overwrite the previous value for this definition.

Parameters:
  • def_name (string) – Name of the definition. By convention, #define names are written in upper case, so this value is converted to upper case.

  • value (string) – The value for the define. It is substituted directly into the kernel code.

Note

Be careful! #define directives are not typed, so an ill-formed value is prone to causing compilation errors. They are, however, faster to access than const variables and take no additional space on the GPU.
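To make the "replaced directly in the kernel code" behaviour concrete, here is a minimal sketch of a textual #define substitution. It is not scilpy's implementation: the helper name set_kernel_define, the regular expression, and the sample kernel source are all assumptions made for illustration.

```python
import re


def set_kernel_define(code, def_name, value):
    """Hypothetical helper: overwrite the value of `#define NAME ...` in `code`.

    The name is upper-cased first, mirroring the convention described above.
    """
    name = def_name.upper()
    # Match a whole `#define NAME <anything>` line; \b keeps N_DIRS from
    # matching a longer name such as N_DIRS_MAX.
    pattern = re.compile(
        r'^([ \t]*#define[ \t]+' + re.escape(name) + r')\b.*$',
        re.MULTILINE)
    # A function is used as the replacement so that backslashes in `value`
    # are inserted literally.
    new_code, count = pattern.subn(lambda m: m.group(1) + ' ' + str(value), code)
    if count == 0:
        raise ValueError('No #define named {} in kernel code.'.format(name))
    return new_code


# Sample kernel source (made up for this example).
kernel_source = (
    '#define N_DIRS 0\n'
    '#define N_DIRS_MAX 10\n'
    '__kernel void main_kernel(__global float* out) {}\n'
)

patched_source = set_kernel_define(kernel_source, 'n_dirs', 724)
print(patched_source)
```

Because the substitution is purely textual, the value is not type-checked until the OpenCL compiler builds the program, which is why the note above advises care.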

class scilpy.gpuparallel.opencl_utils.CLManager(cl_kernel, device_type='gpu')

Bases: object

Class for managing an OpenCL program.

Wraps a subset of pyopencl functions to simplify integration with Python. The OpenCL program can be run on the CPU or on the GPU, provided that the appropriate drivers are installed.

When multiple CPUs or GPUs are available, the first one in the list of available devices is selected.

Parameters:
  • cl_kernel (CLKernel object) – The CLKernel containing the OpenCL program to manage.

  • device_type (string) – The device on which to run the program. One of ‘cpu’ or ‘gpu’.

class OutBuffer(buf, shape, dtype)

Bases: object

Structure containing output buffer information.

Parameters:
  • buf (cl.Buffer) – The cl.Buffer object containing the output.

  • shape (tuple) – Shape for the output array.

  • dtype (dtype) – Datatype for output.

add_input_buffer(key, arr=None, dtype=<class 'numpy.float32'>)

Add an input buffer to the kernel program. Input buffers must be added in the same order as they are declared inside the kernel code (.cl file).

Parameters:
  • key (string) – Name of the buffer in the input buffers list. Used for referencing when updating buffers.

  • arr (numpy ndarray) – Data array.

  • dtype (dtype, optional) – Optional type for array data. It is recommended to use float32 whenever possible to avoid unexpected behaviours.

Note

The array is reordered as a Fortran (column-major) array and then flattened. This is important to keep in mind when writing kernel code.

For example, for a 3-dimensional array of shape (X, Y, Z), the flat index for position (i, j, k) is idx = i + j * X + k * X * Y.
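The Fortran-order index arithmetic can be checked with a short pure-Python sketch. The function names are hypothetical and nested lists stand in for the numpy array. The flattening loop puts the first axis in the innermost position so that it varies fastest, which is the ordering numpy produces with order='F'.

```python
def fortran_flat_index(i, j, k, shape):
    """Flat index of element (i, j, k) in a Fortran-ordered array of `shape`."""
    X, Y, _ = shape
    return i + j * X + k * X * Y


def flatten_fortran(volume):
    """Flatten nested lists volume[i][j][k] with the first axis varying fastest."""
    X, Y, Z = len(volume), len(volume[0]), len(volume[0][0])
    return [volume[i][j][k]
            for k in range(Z)
            for j in range(Y)
            for i in range(X)]


shape = (2, 3, 4)
# Encode each position in its value so lookups are easy to verify by eye.
volume = [[[100 * i + 10 * j + k for k in range(shape[2])]
           for j in range(shape[1])]
          for i in range(shape[0])]

flat = flatten_fortran(volume)
idx = fortran_flat_index(1, 2, 3, shape)
print(idx, flat[idx])
```

Inside a kernel, the same expression recovers an element from the flattened buffer given the three coordinates and the array dimensions.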

add_output_buffer(key, shape=None, dtype=<class 'numpy.float32'>)

Add an output buffer to the kernel program. Output buffers must be added in the same order as they are declared inside the kernel code (.cl file).

Parameters:
  • key (string) – Name of the buffer in the output buffers list. Used for referencing when updating buffers.

  • shape (tuple) – Shape of the output array.

  • dtype (dtype, optional) – Optional type for array data. It is recommended to use float32 whenever possible to avoid unexpected behaviours.
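The float32 recommendation can be motivated with a small sketch. The float type in OpenCL C is 32 bits wide, so a kernel argument declared as a float pointer reads 4 bytes per element. If the buffer actually held 64-bit values, the same bytes would be read as different numbers. The snippet only uses Python's struct module to reinterpret bytes. It does not call scilpy or pyopencl, and it makes no claim about how CLManager converts arrays internally.

```python
import struct

# One 64-bit double, laid out as a little-endian float64 buffer would store it.
raw = struct.pack('<d', 1.0)

# A kernel expecting 32-bit floats would read these 8 bytes as two values.
as_float32 = struct.unpack('<2f', raw)

# When host and kernel agree on float32, the value round-trips unchanged.
round_trip = struct.unpack('<f', struct.pack('<f', 1.0))

print(len(raw), as_float32, round_trip)
```

A mismatch of this kind produces no error, only wrong numbers, which is the sort of unexpected behaviour a consistent dtype avoids.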

run(global_size, local_size=None)

Execute the kernel code on the selected device (CPU or GPU).

Parameters:
  • global_size (tuple) – Tuple of 1 to 3 entries giving the shape of the grid of work-items. OpenCL uses global_size to generate a unique id for each work-item (one execution of the kernel), which can be queried inside the kernel using get_global_id(axis), with axis between 0 and 2.

  • local_size (tuple, optional) – Dimensions of local workgroups. Must divide global_size exactly, element-wise. If None, an implementation-defined local workgroup size is used. Memory allocated in the __local address space is shared between work-items in the same workgroup.

Returns:

outputs – List of outputs produced by the program.

Return type:

list of ndarrays
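Since local_size must divide global_size exactly along every axis, a common OpenCL practice is to round the global size up to the next multiple of the local size. The sketch below shows that arithmetic with two hypothetical helpers that are not part of scilpy. It assumes the kernel itself guards against the extra work-items, for instance by comparing get_global_id(axis) with the true problem size.

```python
def padded_global_size(work_shape, local_size):
    """Round each entry of work_shape up to a multiple of local_size."""
    if len(work_shape) != len(local_size):
        raise ValueError('work_shape and local_size must have the same length.')
    # -(-n // l) is ceiling division for positive integers.
    return tuple(-(-n // l) * l for n, l in zip(work_shape, local_size))


def divides_exactly(global_size, local_size):
    """True if local_size divides global_size element-wise."""
    return all(g % l == 0 for g, l in zip(global_size, local_size))


work_shape = (100, 60)   # true problem size
local_size = (16, 16)    # desired workgroup size

global_size = padded_global_size(work_shape, local_size)
print(global_size, divides_exactly(global_size, local_size))
```

Passing local_size=None avoids the constraint altogether, at the cost of leaving the workgroup size to the OpenCL implementation.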

update_input_buffer(key, arr, dtype=<class 'numpy.float32'>)

Update an input buffer. Input buffers must first be added to the program using add_input_buffer.

Parameters:
  • key (string) – Name of the buffer in the input buffers list.

  • arr (numpy ndarray) – Data array.

  • dtype (dtype, optional) – Optional type for array data. It is recommended to use float32 whenever possible to avoid unexpected behaviours.
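The add-then-update contract, together with the requirement that buffers keep their declaration order, can be pictured with a small ordered registry. The class below is a hypothetical stand-in and not CLManager's internal structure. Raising an error for an unknown or duplicate key is also an assumption. It relies on Python dictionaries preserving insertion order, so updating an existing key leaves its position unchanged.

```python
class BufferRegistry:
    """Hypothetical ordered store of named buffers (illustration only)."""

    def __init__(self):
        self._buffers = {}

    def add(self, key, data):
        if key in self._buffers:
            raise ValueError('Buffer {} already added.'.format(key))
        # Insertion order stands in for the kernel's argument order.
        self._buffers[key] = data

    def update(self, key, data):
        if key not in self._buffers:
            raise KeyError('Buffer {} must be added before updating.'.format(key))
        # Reassigning an existing key keeps its original position.
        self._buffers[key] = data

    def as_kernel_args(self):
        return list(self._buffers.values())


registry = BufferRegistry()
registry.add('signal', [0.0, 1.0])
registry.add('mask', [1, 1])
registry.update('signal', [2.0, 3.0])
print(registry.as_kernel_args())
```

Referring to buffers by key lets new data be swapped in between runs without having to declare every buffer again in order.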

update_output_buffer(key, shape, dtype=<class 'numpy.float32'>)

Update an output buffer. Output buffers must first be added to the program using add_output_buffer.

Parameters:
  • key (string) – Name of the buffer in the output buffers list.

  • shape (tuple) – New shape of the output array.

  • dtype (dtype, optional) – Optional type for array data. It is recommended to use float32 whenever possible to avoid unexpected behaviours.

scilpy.gpuparallel.opencl_utils.cl_device_type(device_type_str)