File elemwise.h

Custom elementwise operations generator.

Defines

GE_SCALAR

Argument is a scalar passed from the CPU; requires nd == 0.

GE_READ

Array is read from in the expression.

GE_WRITE

Array is written to in the expression.

GE_NOADDR64

Don’t precompile kernels for 64-bit addressing.

GE_CONVERT_F16

Convert float16 inputs to float32 for computation.

GE_BROADCAST

Allow broadcasting of dimensions of size 1.

GE_NOCOLLAPSE

Disable dimension collapsing (not recommended).
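As an illustration of what GE_BROADCAST permits, the following CPU-only sketch adds two 2-d operands where any input dimension of size 1 is stretched to match the output. It does not call the GpuElemwise API; add_broadcast_2d is a hypothetical helper written for this example, and it assumes contiguous row-major float data.

```c
#include <stddef.h>

/* CPU-only illustration of size-1 broadcasting for 2-d operands.
 * add_broadcast_2d is a hypothetical helper, not part of elemwise.h.
 * All arrays are assumed to be contiguous and row-major.
 * Returns 0 on success, -1 if the shapes cannot be broadcast. */
static int add_broadcast_2d(const float *a, const size_t a_dims[2],
                            const float *b, const size_t b_dims[2],
                            float *out, const size_t out_dims[2])
{
    size_t i, j, d;

    /* Each input dimension must equal the output dimension or be 1. */
    for (d = 0; d < 2; d++) {
        if (a_dims[d] != out_dims[d] && a_dims[d] != 1) return -1;
        if (b_dims[d] != out_dims[d] && b_dims[d] != 1) return -1;
    }

    for (i = 0; i < out_dims[0]; i++) {
        for (j = 0; j < out_dims[1]; j++) {
            /* A dimension of size 1 always reads index 0. */
            size_t ai = (a_dims[0] == 1) ? 0 : i;
            size_t aj = (a_dims[1] == 1) ? 0 : j;
            size_t bi = (b_dims[0] == 1) ? 0 : i;
            size_t bj = (b_dims[1] == 1) ? 0 : j;
            out[i * out_dims[1] + j] =
                a[ai * a_dims[1] + aj] + b[bi * b_dims[1] + bj];
        }
    }
    return 0;
}
```

With a of shape (1, 3) and b of shape (2, 1), the output has shape (2, 3): a is repeated along rows and b along columns.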

Typedefs

typedef struct _GpuElemwise GpuElemwise

Elementwise generator structure.

The contents are private.

Functions

GpuElemwise* GpuElemwise_new(gpucontext * ctx, const char * preamble, const char * expr, unsigned int n, gpuelemwise_arg * args, unsigned int nd, int flags)

Create a new GpuElemwise.

This will allocate and initialize a new GpuElemwise object. This object can be used to run the specified operation on different sets of arrays.

The argument descriptors name the arguments and provide their data types and geometry (arrays or scalars). They also specify whether the arguments are used for reading or writing. An argument can be used for both.

The expression is a C-like string performing an operation with scalar values named according to the argument descriptors. All of the indexing and selection of the right values is handled by the GpuElemwise code.

Return
a new GpuElemwise object or NULL
Parameters
  • ctx: the context in which to run the operations
  • preamble: code to be inserted before the kernel code
  • expr: the expression to compute
  • n: the number of arguments
  • args: the argument descriptors
  • nd: the number of dimensions to precompile for
  • flags: see GpuElemwise flags
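
To make the role of the expression concrete, here is a CPU-only sketch of what evaluating an expression such as "c = a + b * s" element by element amounts to: each named array argument is reduced to one scalar value per element, and the indexing is done outside the expression. The function apply_expr_strided is hypothetical, the strides are counted in elements for simplicity (an assumption of this example), and nothing here reflects how the generated GPU kernel is actually written.

```c
#include <stddef.h>

/* CPU-only picture of running the expression "c = a + b * s" over
 * 1-d operands. apply_expr_strided is a hypothetical helper, not part
 * of elemwise.h. Strides are in elements, not bytes (an assumption). */
static void apply_expr_strided(const float *a, ptrdiff_t a_stride,
                               const float *b, ptrdiff_t b_stride,
                               float s,
                               float *c, ptrdiff_t c_stride,
                               size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Indexing: pick the right element of each array argument. */
        float a_i = a[(ptrdiff_t)i * a_stride];
        float b_i = b[(ptrdiff_t)i * b_stride];

        /* The expression itself only ever sees scalar values. */
        c[(ptrdiff_t)i * c_stride] = a_i + b_i * s;
    }
}
```

Here a and b play the role of read arguments, s is a scalar argument, and c is a write argument.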

void GpuElemwise_free(GpuElemwise * ge)

Free all storage associated with a GpuElemwise.

Parameters
  • ge: the GpuElemwise object to free.

int GpuElemwise_call(GpuElemwise * ge, void ** args, int flags)

Run a GpuElemwise on some inputs.

Parameters
  • ge: the GpuElemwise to run
  • args: pointers to the arguments (must match what was described by the argument descriptors)
  • flags: see GpuElemwise call flags

struct gpuelemwise_arg
#include <elemwise.h>

Argument information structure for GpuElemwise.

Public Members

const char* name

Name of this argument in the associated expression, mandatory.

int typecode

Type of the argument, mandatory. This is the dtype of the content, not GA_BUFFER.

int flags

Argument flags, mandatory (see GpuElemwise argument flags).
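
The sketch below shows how a table of argument descriptors might line up with an expression such as "c = a + b * s". So that it compiles without the library, it uses stand-in definitions: demo_arg mirrors the three public members listed above, and the DEMO_* constants are placeholders whose values are invented for this example and are not the real flag or type code values. The check in demo_args_valid only encodes the "mandatory" notes from this page and is an assumption about what a caller might verify, not library behaviour.

```c
#include <stddef.h>

/* Stand-ins so the example builds without the library headers.
 * The numeric values are placeholders, NOT the real definitions. */
enum { DEMO_SCALAR = 1, DEMO_READ = 2, DEMO_WRITE = 4 };
enum { DEMO_TYPE_FLOAT = 100 };

/* Mirrors the documented public members of gpuelemwise_arg. */
typedef struct {
    const char *name; /* name used in the expression */
    int typecode;     /* dtype of the content */
    int flags;        /* argument flags */
} demo_arg;

/* Descriptors for the expression "c = a + b * s". */
static const demo_arg demo_args[] = {
    {"a", DEMO_TYPE_FLOAT, DEMO_READ},
    {"b", DEMO_TYPE_FLOAT, DEMO_READ},
    {"s", DEMO_TYPE_FLOAT, DEMO_SCALAR},
    {"c", DEMO_TYPE_FLOAT, DEMO_WRITE},
};

/* Returns 1 if every descriptor has a non-empty name and non-zero
 * flags, 0 otherwise. Hypothetical helper for this example only. */
static int demo_args_valid(const demo_arg *args, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (args[i].name == NULL || args[i].name[0] == '\0') return 0;
        if (args[i].flags == 0) return 0;
    }
    return 1;
}
```

An argument that is both read and written, for example in "a = a + b", would combine the read and write flags in a single descriptor.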