[MacRuby-devel] OpenCL experiment (request for comments)
lsansonetti at apple.com
Sat Jan 30 13:09:50 PST 2010
Sorry for the late response!
On Jan 26, 2010, at 4:26 AM, Ruben Fonseca wrote:
> Hi @all!
> I promised Laurent I would start a discussion on OpenCL and MacRuby
> on the ML, and here I am :)
> I wrote a small hack for MacRuby that adds basic (and hacky) support
> for OpenCL kernels running on your GPUs or CPUs. You can read about
> it on my blog post here http://blog.0x82.com/2010/1/23/opencl-in-macruby-hack-not-very-useful
> The branch with the code is located here on GitHub: http://github.com/rubenfonseca/macruby
> If you are interested in the actual hacky implementation, please
> look at the “opencl.c” file.
> Now there are a couple of things I need help with. It was my very first
> (mac)ruby C extension, and I’m not really sure about many details of
> the implementation (I basically copy-pasted code from other modules
> eheh). I’ll raise a couple of questions here, I’m sure you can
> answer some of them :)
> - How and where to store primitive values?
> For instance, OpenCL::Device has a “cl_device_id” pointer somewhere
> inside. However, in other classes (think OpenCL::Context#new) I’ll
> need to have a reference to that “cl_device_id” pointer somewhere.
> What’s the best way to store the pointer? Inside the Object
> struct? As an instance var? As an accessor? These latter options
> don’t make sense to me, 'cause I’m never interested in getting the
> “cl_device_id” pointer in an IRB shell for instance... Hope I’m
> making myself clear.
The best way is to use the RData structure, as you would do with the
upstream Ruby implementation. For this, you would use the usual RData
APIs (in upstream Ruby, the Data_Wrap_Struct and Data_Get_Struct macros).
A few notes:
1) The mark callback will not be honored (our GC does not use it).
2) The free callback will not be honored either (this is a current
limitation). In order to free resources upon GC cycles, one must
implement a -finalize method on the class. You can grep the MacRuby
source code for "rb_objc_install_method2" calls using the "finalize"
selector as an example.
3) If you decide to store inside the RData structure a C structure
allocated with Ruby-managed memory (xmalloc & friends), you must be
careful to use write barriers appropriately when setting Ruby objects
inside that structure. But I believe you should not need to do this.
However, in the next release (0.6) we intend to fully support the
upstream Ruby C interface. That work has already started.
> - How the memory should be managed?
> As I said, I never wrote a MacRuby extension before. When writing
> the extension, I needed to do a couple of memory allocations. I used
> “xmalloc” (discovered by looking at other MacRuby *.c files).
> However, when I called “free” once I no longer needed the memory,
> all sorts of warnings happened at runtime.
> After I deleted the “free” calls, it all worked, but I’m not sure
> if I’m leaking memory somehow. On the other hand, maybe the memory
> is automatically GC’ed :) Can you clear this up for me and show me the
> best practices?
When using xmalloc(), free() should not be used. xfree() is the
matching deallocation function, but you don't need to call it. The
collector will reclaim the memory anyway.
> - How to make the OpenCL API more “Rubyish”?
> I have no clue on this one. OpenCL seems like a huge API, and I
> don’t really have any background knowledge of GPU programming. There
> are all sorts of variations on each call, and a number of different
> entities (classes). Any suggestion on how to make this more pleasant
> to write in Ruby would be very very welcome.
I will have a better look when I have some time, but may I suggest the
following:
1) Try to wrap as much of the OpenCL API as possible in this C module.
2) Write higher-level APIs/paradigms in a pure Ruby file that would
ship with the standard library.
This is the approach we are taking for GCD.
Also, I heard there is an existing OpenCL wrapper for MRI, maybe it
would be interesting to have a look at it to see how they do things :)
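To make the two-layer split above a bit more concrete, here is a minimal sketch in plain Ruby. Everything in it is hypothetical: RawCL stands in for the low-level C module (it only simulates a kernel run by calling a Ruby block, it does not touch OpenCL), and EasyCL, default_device and map are names invented for this illustration, not part of MacRuby or of the opencl.c branch.

```ruby
# Stand-in for the low-level C module (hypothetical). In a real wrapper
# each method would map as directly as possible onto one OpenCL C call.
module RawCL
  Device = Struct.new(:name, :type)

  # Pretends to enumerate devices; a real module would query OpenCL.
  def self.device_ids
    [Device.new("Stub CPU", :cpu), Device.new("Stub GPU", :gpu)]
  end

  # Pretends to run a kernel: applies `op` to every element of `input`.
  def self.enqueue(device, op, input)
    input.map { |x| op.call(x) }
  end
end

# Higher-level pure-Ruby layer (hypothetical). It hides device selection
# and call ordering behind a single block-taking method.
module EasyCL
  # Picks the first device of the requested type, else any device.
  def self.default_device(type = :gpu)
    RawCL.device_ids.find { |d| d.type == type } || RawCL.device_ids.first
  end

  def self.map(input, type = :gpu, &op)
    raise ArgumentError, "block required" unless op
    RawCL.enqueue(default_device(type), op, input)
  end
end

squares = EasyCL.map([1, 2, 3]) { |x| x * x }
puts squares.inspect
```

The point of the split is that the C side stays a thin, mostly mechanical mirror of the OpenCL API, while naming, defaults and block-based conveniences can be iterated on in Ruby without recompiling anything.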
> I even saw Laurent talk about a “Ruby -> OpenCL direct compiler via
> LLVM bitcode”, but I’m definitely not qualified to even consider
> that hypothesis. I tried twice creating a very simple compiler with
> LLVM and failed completely :P
So LLVM should be able to generate code for OpenCL, I believe. It
would be awesome if in MacRuby you could just send a given block to
the GPU: internally it would compile the block to LLVM IR (probably
specialized for OpenCL), then JIT compile it and run it on the GPU.
That would be a higher-level abstraction and would probably put
OpenCL in the hands of the average Ruby programmer, who knows little
(if anything) about C-based code.