[MacRuby-devel] Thread safety in apply example?

Charles Oliver Nutter headius at headius.com
Tue Jan 25 00:11:04 PST 2011


On Tue, Jan 25, 2011 at 1:00 AM, Joshua Ballanco <jballanc at gmail.com> wrote:
> On Mon, Jan 24, 2011 at 8:20 PM, Charles Oliver Nutter <headius at headius.com>
> wrote:
>> I'm curious what you mean by this. Can you point out the code? Is it
>> actually attempting to rewrite local variable or instance variable or
>> array modifications in code?
>
> Sorry, I mis-remembered. Not a mix-in, but Dispatch::Proxy from the dispatch
> gem is what I was thinking of. It uses the delegator pattern to funnel all
> access through a serial queue. Admittedly, this technique is a bit heavy
> handed (and, I'm just now realizing, somewhat inefficiently implemented...).
> Still, it gets the job done...

OK, got it. Unsurprisingly, HawtDispatch also provides a built-in
queue-backed delegation proxy using Java reflection. Of course, we can
do a lot better than that in Ruby.
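
To make the queue-backed delegation idea concrete, here is a minimal
plain-Ruby sketch. It is not Dispatch::Proxy's or HawtDispatch's actual
implementation: SerialProxy is a hypothetical name, and a single worker
thread draining a Queue stands in for a GCD serial queue.

```ruby
# Hypothetical sketch: every call on the proxy is handed to one worker
# thread, so calls against the wrapped object run strictly one at a time.
class SerialProxy
  def initialize(target)
    @target = target
    @jobs = Queue.new
    @worker = Thread.new do
      loop do
        # Each job carries the call plus a private queue for the reply.
        name, args, block, reply = @jobs.pop
        begin
          reply << [:ok, @target.public_send(name, *args, &block)]
        rescue StandardError => e
          reply << [:error, e]
        end
      end
    end
  end

  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)
    reply = Queue.new
    @jobs << [name, args, block, reply]
    status, value = reply.pop # wait until the worker has run the call
    raise value if status == :error
    value
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

counts = SerialProxy.new([0])
threads = 8.times.map do
  # The whole read-modify-write travels as ONE proxied call (map!).
  Thread.new { 1000.times { counts.map! { |n| n + 1 } } }
end
threads.each(&:join)
total = counts.first
```

Note that the proxy only serializes individual calls. Writing
counts[0] += 1 against it would still be two separate calls ([] and
[]=), so a compound update has to be expressed as a single call, as
map! is here.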

>> I'm building a JRuby-based mimic API of the MacRuby GCD API:
>> https://github.com/headius/jcd
>
> That sounds like an excellent idea! One issue that we've come across
> repeatedly in MacRuby is how to teach Ruby programmers to write "grown-up"
> thread-safe code (and, from my half-hearted attempts to follow what's going
> on in JRuby, it seems you've been dealing with this as well). That is,
> something like "@my_array[i] += 1" is safe enough when you have a GIL, but
> is not when you have true low-level reentrancy. (In fact, even @my_array[i]
> = foo is not thread-safe!)

Yes, I think JRuby has helped force the issue for years, and now
MacRuby gets to help :)
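
For readers who have not hit this yet: "@my_array[i] += 1" is a read,
an add, and a write, and another thread can run between any two of
those steps. A minimal sketch of the conventional fix, assuming a plain
Mutex is acceptable (the Counters class is hypothetical, made up for
this illustration):

```ruby
# Hypothetical example class: guards a shared array with one Mutex.
class Counters
  def initialize(size)
    @my_array = Array.new(size, 0)
    @lock = Mutex.new
  end

  # The lock turns the read, the add, and the write into one unit.
  def increment(i)
    @lock.synchronize { @my_array[i] += 1 }
  end

  # Reads take the same lock so they never observe a half-done update.
  def [](i)
    @lock.synchronize { @my_array[i] }
  end
end

counters = Counters.new(4)
workers = 8.times.map do
  Thread.new { 10_000.times { |n| counters.increment(n % 4) } }
end
workers.each(&:join)
totals = 4.times.map { |i| counters[i] }
```

Without the synchronize calls, the totals can come up short on an
implementation with truly parallel threads.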

Things used to be a lot worse. I remember spending long hours writing
emails and IM responses to people that were almost angry that I
decided JRuby's Array, Hash, and String would not be thread-safe. They
really didn't get why we couldn't just "do what MRI does". It took a
lot of effort and a few favors from key greybeards in the Ruby
community for people to accept that JRuby was making the pragmatic
decision. And hopefully MacRuby has had an easier time selling such
decisions as a result.

The Ruby community has come a long way, but there's obviously more work to do.

> Instead of attempting to bring the world of spin-locks and mutexes (and when
> to use one instead of the other) into Ruby, I think it's probably more
> useful to introduce a transactional programming model. This LtU post comes
> to mind: http://lambda-the-ultimate.org/node/4070

Yes, you're right for most scenarios. There are of course many places
where queues or transactions are far too heavy. What we need to do is
ensure that as better concurrency models become available for
Rubyists, we do what we can to ensure they work across Ruby
implementations and share the load of educating users. That's what I
hope to do with JCD, and I hope someone will be able to do the same
for MRI in a way that works on all its supported platforms.

I have other concurrency-related projects you may be interested in:

The "ruby-atomic" gem, providing explicitly atomic references (and
CAS-like operations) on JRuby and MRI:
https://github.com/headius/ruby-atomic. I collaborated with MenTaLguY
on this one.
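
To illustrate what "CAS-like operations" means, here is a small sketch
of a compare-and-set reference with a retry loop. It is NOT the
ruby-atomic API or implementation: SketchAtomic and its method names
are hypothetical, and a Mutex emulates the hardware compare-and-swap
that a real atomic reference would rely on.

```ruby
# Hypothetical stand-in for an atomic reference (not the gem's API).
class SketchAtomic
  def initialize(value)
    @value = value
    @lock = Mutex.new
  end

  def get
    @lock.synchronize { @value }
  end

  # Store new_value only if the reference still holds the expected object.
  def compare_and_set(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end

  # Optimistic update: compute from a snapshot, retry if someone else won.
  def update
    loop do
      old = get
      new_value = yield(old)
      return new_value if compare_and_set(old, new_value)
    end
  end
end

hits = SketchAtomic.new(0)
threads = 4.times.map do
  Thread.new { 5_000.times { hits.update { |v| v + 1 } } }
end
threads.each(&:join)
final = hits.get
```

The block passed to update may run more than once under contention, so
it should be free of side effects.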

The "cloby" gem, which wraps Clojure's STM and Ref to allow creating
(J)Ruby objects with all transactional instance variables:
https://github.com/headius/cloby

I have also worked with MRI and Rubinius folks on various forms of
Multi-VM (MVM) API, providing a standard interface for launching
isolated Ruby environments in the same process. Of course JRuby has
been able to do this for years (since org.jruby.Ruby is just another
object), but MVM and similar APIs may be the only way MRI ever gets
access to true in-process concurrency.

> Anyway, I'd be glad to lend a hand (and whatever slice of my limited time I
> can spare) to the effort. In particular, test suites/specs are difficult in
> this space, as the MacRuby team has learned first-hand.

I don't have a lot to offer in this area. We benefited years ago from
the big thread-safety push in Rails 2.2, by providing the Rails core
team a 32-way box and letting them beat the hell out of Rails and JRuby
under heavy concurrent load. I know of no similar efforts since, for
any library.

>> What I have now is a mostly-complete Queue and Object, but I'm still
>> figuring out how to map the other libdispatch primitives to what
>> HawtDispatch provides. The library currently is complete enough to run
>> a fork of ControlTower I made called ChaosBazaar:
>> http://github.com/headius/ChaosBazaar
>
> So, now I'm curious: have you gotten to the dispatch_source_t part of
> libdispatch? I wonder, because this is where I would predict the most
> difficulty. In OS X, the dispatch sources are made possible through the use
> of a kqueue, which is kinda like a select but at the kernel level. Also,
> there might be some difficulty in mapping the semantics directly due to the
> high degree of asynchrony achieved by libdispatch. For example: the
> following is perfectly legal:
> myq = Dispatch::Queue.new
> myq.async do
>   myq.cancel!
>   # ... keep doing stuff on 'myq' ...
> end
> ...because cancelation is dispatched, asynchronously, to be executed by the
> same queue (or, well, with the same queueing semantics) as the queue you are
> canceling.

Currently, I have punted on the source logic, mostly because
HawtDispatch only includes out-of-the-box Source support for
selectable IO channels (in which case it would use kqueue on OS X,
epoll on Linux, etc.). For the cases I've been trying to support in my
early attempt, source support has not been necessary. I know I'll have
to deal with this soon, however, and that will probably mean working
with HawtDispatch folks to provide a more complete set of source
types.
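
As a rough picture of what a source over selectable IO does, here is a
minimal sketch in plain Ruby. ReadSource is a hypothetical class, not
part of HawtDispatch, libdispatch, or the MacRuby API, and IO.select
stands in for whatever readiness mechanism (kqueue, epoll) the platform
would really use.

```ruby
# Hypothetical mini "source": maps readable IO objects to handlers.
class ReadSource
  def initialize
    @handlers = {}
  end

  def on_readable(io, &handler)
    @handlers[io] = handler
  end

  # Wait up to timeout seconds, then run the handler of every ready IO.
  # Returns how many handlers fired.
  def poll(timeout = 1)
    ready, = IO.select(@handlers.keys, nil, nil, timeout)
    return 0 unless ready # IO.select returns nil on timeout
    ready.each { |io| @handlers[io].call(io) }
    ready.size
  end
end

reader, writer = IO.pipe
received = []
source = ReadSource.new
source.on_readable(reader) { |io| received << io.read_nonblock(1024) }

idle = source.poll(0)   # nothing written yet, so no handler runs
writer.write("ping")
fired = source.poll(1)  # the pipe is now readable
```

A real dispatch source would also hand the handler to a queue rather
than run it inline, and would need to cover non-IO event types, which
is the part that is missing out of the box.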

FWIW, the JVM does have fairly solid support for asynchronous queuing
systems, and the asynchronous IO support is at least passable (so long
as you're dealing with sockets).

> Also, I'm really excited by the clone of ControlTower. I should warn you
> that ControlTower only includes ~40% of the GCD-ishness that was originally
> planned for it. I originally intended that reading from/writing to the
> socket should occur using dispatch_sources, since that is more efficient
> than waiting on a blocking read. In fact, I have (somewhere) a branch with
> this implemented, but the concurrency bugs that developed were so cryptic
> and difficult to chase down, that I eventually gave up...resulting in what
> you have now. I think you'll find the vestiges of these attempts in the
> "gcd-ify" branch...but I can't vouch at all for their functional state.
> Anyway, I think this could be the start of a great thing for Ruby in
> general...

Yes, I was wondering about that. HawtDispatch supports sources against
NIO (Java's "New IO" API) channels, which provides at least one leg of
the libdispatch "source" support. As it stands now I'm really just
using Queue and largely ignoring the Group usage, which doesn't seem
to come into play unless you do a clean shutdown of the server. My impl
is still mostly at the proof-of-concept stage.

I did have to hack around the parser logic, since native extensions
largely mean death for concurrency on JRuby (and by native I mean C
extensions using MRI's API). Instead, I lifted code from Mongrel and
Rack to use Mongrel's parser to populate a Rack environment, and
managed to make the result function well enough to benchmark.

- Charlie

