Hello,
I am writing this blog post from a wonderful chalet in the south of France, close to Mont Blanc. The landscape is beautiful, the days are hot and sunny, and everything is going really well.
I committed a few changes yesterday:
- The first and biggest is the implementation of clGetProgramInfo. This function allows the client application to retrieve a binary form of a program; Clover returns LLVM bitcode. This new function also lets me test the implementation of clCreateProgramWithBinary. It took a few hours to implement because I had to install LLVM, which is tricky on openSUSE: you need to install llvm, llvm-devel, llvm-clang and llvm-clang-devel from devel:tools, and only this repository works. If you install another version, you get a broken llvm-config, or some of the needed libraries are missing.
- Another one is just a cleanup: it removes all the trailing whitespace Kate left in the files (spaces at the end of lines, and lines made only of spaces). My Kate being broken, I now use Geany, which I recommend to everyone who can't use Kate. Geany removes trailing spaces, so I decided to do that for every file in Clover.
- Doing this put me in “cleanup mode”, so I decided to write my C++ code in ... C++. I replaced all the C functions like memcpy with their C++ equivalents, that is to say std::memcpy and friends. I also removed all the printf calls I had put here and there to debug tricky problems, so the code is clean and silent.
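To illustrate the last item, here is a minimal sketch of what switching from memcpy to std::memcpy looks like. The helper copy_bytes is hypothetical and not taken from Clover; the only real point is including <cstring> instead of <string.h> and qualifying the call with std::.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>   // C++ header: declares std::memcpy, std::memset, ...

// Hypothetical helper (not Clover code): copies `size` bytes from
// `source` to `destination` using the std::-qualified function.
void copy_bytes(void *destination, const void *source, std::size_t size)
{
    // Both pointers must be valid; a null pointer here is a caller bug.
    assert(destination != nullptr && source != nullptr);
    std::memcpy(destination, source, size);
}
```

The behaviour is the same as with the C function; only the header and the namespace change.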
During my holidays, I can work about 4 to 5 hours a day. I'm very happy, Clover will make good progress! For today I have planned the implementation of clGetProgramBuildInfo and of the OpenCL C compiler's options.
I could also begin some sort of standard library, but I have yet to decide how. I could implement it in LLVM bitcode (compiled from a C file by Clang), which is good for interprocedural optimizations and gives very fast LLVM IR. Or I could write only C headers and leave the function calls unresolved in the LLVM IR. The latter solution is the slowest (no inlining, no optimization), but the easiest to implement for hardware-accelerated devices (just find all the unresolved calls and replace them with an intrinsic of the device). For a CPU device, the LLVM JIT has a wonderful function: InstallLazyFunctionCreator. This function lets me register a callback that is given a function name (like get_global_id) and returns a function pointer. I will use that to implement functions like get_global_id, barrier and the fences.
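Here is a rough sketch of what such a callback could look like. It is not Clover's code: the two stub functions are hypothetical placeholders that return constants, and the block deliberately does not include any LLVM header so that it stands alone. It only shows the shape of the callback, a function taking the unresolved name and returning an address.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the real built-ins. A real implementation
// would read the state of the current work-item instead.
std::size_t stub_get_global_id(unsigned int dimension)
{
    (void)dimension;   // unused in this sketch
    return 0;
}

void stub_barrier(unsigned int flags)
{
    (void)flags;       // unused in this sketch
}

// Callback in the shape of a lazy function creator: it receives the name
// of a function the JIT could not resolve and returns its address, or a
// null pointer when the name is unknown.
void *resolve_builtin(const std::string &name)
{
    // An empty name would indicate a bug in the caller.
    assert(!name.empty());

    if (name == "get_global_id")
        return reinterpret_cast<void *>(&stub_get_global_id);
    if (name == "barrier")
        return reinterpret_cast<void *>(&stub_barrier);

    return nullptr;
}
```

Assuming an llvm::ExecutionEngine pointer named engine (a hypothetical variable), the callback would then be registered with something like engine->InstallLazyFunctionCreator(resolve_builtin). The exact parameter type depends on the LLVM version, so check the header of the version you build against.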
I think I will do both: write the headers, and let each device choose whether it wants the binary to be linked against a standard library IR or not.
I'm very happy and very motivated: the interesting part of my project is beginning, and it will be even more interesting than I expected. OpenCL is a good spec, but with small challenges that make the work even more enjoyable.