Drivers
- Find the files in lib/CL/devices/formosa.
pocl-formosa.h and pocl-formosa.c define and implement the user-space device driver for FORMOSA. pocl-formosa-util.h and pocl-formosa-util.cc contain the support functions for the driver.
pocl-formosa.c
void pocl_formosa_init_device_ops(struct pocl_device_ops *ops)
ops is the struct that stores the driver functions for the device. For example, ops->alloc_mem_obj = pocl_formosa_alloc_mem_obj sets the memory allocation handler to our implementation, pocl_formosa_alloc_mem_obj.
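The registration pattern can be sketched in isolation. The struct below is a reduced, hypothetical stand-in for struct pocl_device_ops (the real struct has many more members with different signatures), and every demo_ name is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for pocl's struct pocl_device_ops. */
struct demo_device_ops {
    const char *device_name;
    void *(*alloc_mem_obj)(size_t size);
    void (*free)(void *ptr);
};

/* Hypothetical driver implementations; here they only wrap the host heap. */
static void *demo_formosa_alloc_mem_obj(size_t size) { return malloc(size); }
static void demo_formosa_free(void *ptr) { free(ptr); }

/* Plays the role of pocl_formosa_init_device_ops: fill in the handler table. */
void demo_formosa_init_device_ops(struct demo_device_ops *ops)
{
    assert(ops != NULL);
    ops->device_name = "formosa";
    ops->alloc_mem_obj = demo_formosa_alloc_mem_obj;
    ops->free = demo_formosa_free;
}
```

After the table is filled in, the runtime never calls the driver functions by name. It only goes through the pointers, for example ops.alloc_mem_obj(16), which is what lets several device drivers coexist behind one interface.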
pocl_formosa_probe
This function tests whether the device is available. If a FORMOSA device is probed and responds correctly, the function sets the global variable formosa_available to true.
pocl_formosa_init
This function defines the attributes of the device and performs initialization for the device driver. For example, the device name and the local/global memory sizes are set here, and the memory allocator is also initialized in this function. There is also a pocl_formosa_uninit function to free data structures safely.
pocl_formosa_read / pocl_formosa_write
These functions implement reading data from device memory into host memory and writing data from host memory to device memory. They are called when an OpenCL read/write buffer command is enqueued.
pocl_formosa_alloc_mem_obj / pocl_formosa_free
Allocate and free device memory. Note that only the CL_MEM_READ_WRITE, CL_MEM_READ_ONLY, and CL_MEM_WRITE_ONLY memory flags are currently supported. All global memory allocation is managed by the host driver: given a size to allocate, the allocator maintains a data structure on the host and returns an available address on the device.
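To make the idea of host-managed device memory concrete, here is a minimal sketch of a first-fit allocator that tracks device address ranges entirely in a host-side table. It is not the allocator used by the driver; the chunk table, the demo_ names, and the fixed capacity are assumptions for illustration, and alignment is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CHUNKS 64
#define DEMO_ALLOC_FAIL UINT64_MAX

/* One contiguous range of device addresses, tracked on the host only. */
struct demo_chunk { uint64_t addr; uint64_t size; int used; };

struct demo_allocator {
    struct demo_chunk chunks[DEMO_MAX_CHUNKS]; /* kept sorted by address */
    size_t count;
};

void demo_allocator_init(struct demo_allocator *a, uint64_t base, uint64_t size)
{
    assert(a != NULL && size > 0);
    a->chunks[0].addr = base;
    a->chunks[0].size = size;
    a->chunks[0].used = 0;
    a->count = 1;
}

/* First fit: return a device address, or DEMO_ALLOC_FAIL if nothing fits. */
uint64_t demo_alloc(struct demo_allocator *a, uint64_t size)
{
    assert(a != NULL);
    if (size == 0) return DEMO_ALLOC_FAIL;
    for (size_t i = 0; i < a->count; i++) {
        struct demo_chunk *c = &a->chunks[i];
        if (c->used || c->size < size) continue;
        if (c->size > size && a->count < DEMO_MAX_CHUNKS) {
            /* Split: shift later chunks up and insert the remainder. */
            for (size_t j = a->count; j > i + 1; j--)
                a->chunks[j] = a->chunks[j - 1];
            a->chunks[i + 1].addr = c->addr + size;
            a->chunks[i + 1].size = c->size - size;
            a->chunks[i + 1].used = 0;
            a->count++;
            c->size = size;
        }
        c->used = 1;
        return c->addr;
    }
    return DEMO_ALLOC_FAIL;
}

/* Merge chunk i with chunk i + 1 and close the gap in the table. */
static void demo_merge_next(struct demo_allocator *a, size_t i)
{
    a->chunks[i].size += a->chunks[i + 1].size;
    for (size_t j = i + 1; j + 1 < a->count; j++)
        a->chunks[j] = a->chunks[j + 1];
    a->count--;
}

/* Free by device address; returns 0 on success, -1 if addr is unknown. */
int demo_free(struct demo_allocator *a, uint64_t addr)
{
    assert(a != NULL);
    for (size_t i = 0; i < a->count; i++) {
        if (a->chunks[i].addr != addr || !a->chunks[i].used) continue;
        a->chunks[i].used = 0;
        if (i + 1 < a->count && !a->chunks[i + 1].used) demo_merge_next(a, i);
        if (i > 0 && !a->chunks[i - 1].used) demo_merge_next(a, i - 1);
        return 0;
    }
    return -1;
}
```

No device access is needed to allocate or free: the host only hands out addresses, and the actual data movement happens later through the read/write functions.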
pocl_formosa_post_build_program
We build the kernel program in this function with the following steps:
1. Run the pocl LLVM passes
   - Output the LLVM kernel modules (in LLVM IR format)
2. Compile the program (fsa_compile_program in pocl_formosa_utils)
   1. Generate trampoline functions for the modules
   2. Write the bitcode files
   3. Compile the bitcode and link it with 1) the kernel library and 2) start.S, using our linker script
pocl_formosa_run
This is the most important function: it implements how the kernel arguments are gathered and how the kernel is run. The function is used when enqueueNDRangeKernel is called. The execution steps are as follows:
- Iterate over all kernel arguments and calculate the space the device needs to store them
- Allocate the host argument buffer and the device argument buffer
- Place the arguments in the host buffer
- Upload the host argument buffer to the device argument buffer
- Set up the context data
  - Work dimension
  - Kernel ID
  - Local sizes for a workgroup
  - Number of workgroups in 3 dimensions
  - Set the entry PC and trampoline function PC
  - etc.
- Upload the kernel to the device
- Start
- Wait for the kernel to finish
- Release buffers
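The argument-gathering and context steps can be sketched as follows. This is an illustration under stated assumptions, not the driver's code: struct demo_arg, struct demo_context, the 8-byte argument alignment, and all demo_ functions are hypothetical, and the real context layout is defined by the driver and the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ARG_ALIGN 8 /* assumed alignment of each argument slot */

/* Hypothetical description of one kernel argument on the host. */
struct demo_arg { const void *value; size_t size; };

/* Hypothetical context block describing one kernel launch. */
struct demo_context {
    uint32_t work_dim;
    uint32_t kernel_id;
    uint32_t local_size[3];
    uint32_t num_groups[3];
};

/* Round offset up to a multiple of align (align must be a power of two). */
static size_t demo_align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Pass 1: total bytes needed to store all arguments. */
size_t demo_args_size(const struct demo_arg *args, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total = demo_align_up(total, DEMO_ARG_ALIGN) + args[i].size;
    return total;
}

/* Pass 2: allocate the host buffer and place each argument in it.
   The caller uploads the buffer to the device and then frees it. */
uint8_t *demo_pack_args(const struct demo_arg *args, size_t n, size_t *out_size)
{
    assert(out_size != NULL);
    size_t total = demo_args_size(args, n);
    uint8_t *buf = calloc(total ? total : 1, 1);
    if (buf == NULL) return NULL;
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        off = demo_align_up(off, DEMO_ARG_ALIGN);
        memcpy(buf + off, args[i].value, args[i].size);
        off += args[i].size;
    }
    *out_size = total;
    return buf;
}

/* Fill work dimension, local sizes, and workgroup counts.
   Unused dimensions are set to 1. Returns -1 on an invalid launch shape. */
int demo_fill_context(struct demo_context *ctx, uint32_t work_dim,
                      const uint32_t global[3], const uint32_t local[3])
{
    assert(ctx != NULL);
    if (work_dim < 1 || work_dim > 3) return -1;
    ctx->work_dim = work_dim;
    for (uint32_t d = 0; d < 3; d++) {
        uint32_t g = d < work_dim ? global[d] : 1;
        uint32_t l = d < work_dim ? local[d] : 1;
        if (l == 0 || g % l != 0) return -1;
        ctx->local_size[d] = l;
        ctx->num_groups[d] = g / l;
    }
    return 0;
}
```

Computing the size in a separate pass lets the host and device argument buffers be allocated once with the exact size before anything is copied. The sketch rejects global sizes that are not a multiple of the local size, which is a simplifying assumption.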