Tuesday, January 29, 2008

Ruby Concurrency with Actors

Tony Arcieri (read an interview with Tony here) recently posted an announcement about his new Revactor library, which provides actor-style concurrency built atop Ruby 1.9’s new Fiber technology. Since Fibers and Actors haven’t been a part of the Ruby lexicon, I wanted to put some information together to help myself (and anyone else who wants to crib off my notes) get up to speed. Update! You might also want to check out my new Questions Five Ways on Concurrency post.

To start with, I figured it was best to begin with the basics. Quoting from Revactor’s Philosophy page:

The basic operation of an Actor is easy to understand: like a thread, it runs concurrently with other Actors. However, unlike threads it is not pre-emptable. Instead, each Actor has a mailbox and can call a routine named ‘receive’ to check its mailbox for new messages. The ‘receive’ routine takes a filter, and if no messages in an Actor’s mailbox matches the filter, the Actor sleeps until it receives new messages, at which time it’s rescheduled for execution.

Well, that’s a bit of a naive description. In reality the important part about Actors is that they cannot mutate shared state simultaneously. That means there are no race conditions or deadlocks because there are no mutexes, conditions, and semaphores, only messages and mailboxes.
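To make the mailbox-and-filter idea concrete, here is a minimal sketch using only Ruby's built-in Thread and Queue classes. It is not Revactor's API: the names TinyActor, send_message, and receive are made up for illustration, and because it runs each actor on a thread it is pre-emptable, unlike the Fiber-based actors described above. It only shows how a filtered receive can set aside non-matching messages and sleep until a matching one arrives.

```ruby
require "thread" # Queue lives here on Ruby 1.9; newer Rubies load it by default

# Hypothetical toy actor for illustration only (not Revactor's API).
class TinyActor
  def initialize(&body)
    @mailbox  = Queue.new # thread-safe FIFO; pop blocks while it is empty
    @deferred = []        # messages that did not match an earlier filter
    @thread   = Thread.new { instance_eval(&body) }
  end

  # Deliver a message to this actor's mailbox. Returns self so sends can chain.
  def send_message(msg)
    @mailbox << msg
    self
  end
  alias_method :<<, :send_message

  # Wait for the actor's body to finish and return its result.
  def value
    @thread.value
  end

  private

  # Return the first message for which the filter block is true.
  # Non-matching messages are kept for later receive calls.
  def receive
    idx = @deferred.index { |m| yield m }
    return @deferred.delete_at(idx) if idx

    loop do
      msg = @mailbox.pop # sleeps until a new message arrives
      return msg if yield(msg)
      @deferred << msg
    end
  end
end

collector = TinyActor.new do
  seen = []
  # Ask for two Integers first, even though a String was sent earlier.
  2.times { seen << receive { |m| m.is_a?(Integer) } }
  seen << receive { |m| m.is_a?(String) }
  seen
end

collector << "hello" << 1 << 2
result = collector.value
p result # => [1, 2, "hello"]
```

The only state the two threads touch in common is the mailbox itself; everything else belongs to the actor, which is the property the quote is getting at.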

Ok, that’s not so bad. How well can this work in Ruby? I thought I’d go to the horse’s mouth for this one. Joe Armstrong, creator of the Erlang language, which is built on the Actor model, had this to say:

Difficult to say – we always said that performance of a concurrent system depends upon three critical times

  1. process spawning
  2. context switching
  3. message passing times

If you can make these fast it might work.

The main problem with concurrency is isolating processes from each other: one process must not be able to corrupt another process.

Even though the message-passing process spawning etc. seems simple there is a lot of junk going on in the background that you are not aware of – implementing this is of the order of complexity of implementing an operating system – i.e., there is processes scheduling, memory management garbage collection etc.

It’s rather easy to add a layer to an existing system that can give you a few thousand of processes but as you get to the hundreds of thousands things get tricky – the overheads per process must be small and so on.

This is tricky stuff – I’m sure you can make it work – but making it fast needs a lot of thinking …
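For a rough feel for the three times Armstrong lists, here is a small sketch that measures them with plain Ruby 1.9 Fibers and a Queue standing in for a mailbox. It does not use Revactor, the iteration count N is an arbitrary choice, and the numbers it prints depend entirely on your machine and Ruby version, so treat it as a starting point rather than a benchmark of any actor library.

```ruby
require "benchmark"
require "thread" # for Queue on Ruby 1.9

N = 10_000 # arbitrary iteration count, chosen only to keep the run short

# 1. Process spawning: create a fiber and run it to completion.
spawn_time = Benchmark.realtime do
  N.times { Fiber.new { :done }.resume }
end

# 2. Context switching: each resume/yield pair is two switches.
pinger = Fiber.new { loop { Fiber.yield } }
switch_time = Benchmark.realtime do
  N.times { pinger.resume }
end

# 3. Message passing: push a message into a mailbox and take it back out.
mailbox = Queue.new
message_time = Benchmark.realtime do
  N.times do |i|
    mailbox << i
    mailbox.pop
  end
end

puts format("spawn:   %.2f microseconds each", spawn_time / N * 1_000_000)
puts format("switch:  %.2f microseconds per resume/yield pair", switch_time / N * 1_000_000)
puts format("message: %.2f microseconds per push/pop", message_time / N * 1_000_000)
```

Trying larger values of N gives a sense of what Armstrong means about per-process overhead mattering as the count climbs.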

MenTaLguY has spent a lot of time working with Ruby concurrency. He also weighed in on Actors’ place in Ruby:

I think in the long-term actors are likely to become the major means of distributed concurrent programming in Ruby. At the very least I think distributed actors are likely to displace DRb for the things it is used for today.

I don’t know that actors will necessarily dominate for more “local” problems, however. I think the actor model will face competition from e.g. the join calculus, and other approaches like concurrent logic programming which can offer more natural solutions to some problems. Transactional memory will also have a place, although after writing several STM implementations I am not the fan of it that I used to be.

Paul Brannon (a longtime Rubyist and all-around good guy) told me that he thinks Actor implementations are important for Ruby:

I think the industry is about to make a shift toward erlang (actor)-style concurrency, because it makes true concurrency transparent and easy for the user. Home computers are being shipped with more and more cores these days, and pretty soon, taking full advantage of the hardware available will necessarily imply concurrent programming.

I also asked around for recommendations about good resources for learning about Actor-based concurrency (and concurrency in general). People were unanimous in suggesting that interested programmers spend some time with Erlang to get a handle on it. The book Programming Erlang also got high marks. Ola Bini recommended Java Concurrency in Practice for those with a Java bent/background.

However you choose to do it, I recommend spending some time with Actors; they look like a good way to get more out of your programming. Let me just go back to Revactor’s Philosophy page for a second:

Actors rule. You really should use them in your programs. Especially if your programs do a lot of stuff at once. Seriously, whatever you’re doing besides Actors, it probably sucks. Actors are this awesome panacea that will make it all better, I swear. In conclusion: use them, do it!

There, are you convinced now?

4 comments:

Unknown said...

Thx for the writeup. I'm still getting my head to understand this and how/which of my work can be redone using these ideas. I would definitely buy a book that explains Revactor with lots of small to medium size examples.

AEM

BradfordW said...

Nice write up, I especially enjoy articles like these because they don't put any pressure on you to implement the newly hawked technology.

Actors will be our new gods!

But in all seriousness, I have a small listener which receives acknowledgments of faxed documents operating on the standard Ruby tcp library, it processes ~2000 responses every hour or so with a fair amount of overhead. If/when I have the time I'd love to see what I get on Revactor. Thanks again for the article.

Anonymous said...

Great article. Thanks for writing it. Revactor looks very promising. I hope to be able to play with it in the next few weeks.
It would be great, if something like this could make it into the Ruby standard library.

Anonymous said...

Hello, I am new to Ruby, and I am still confused about how two concurrent processes work in Ruby on Rails.

For example, suppose I have two concurrent processes performing the same operation and I start them together, at the same time or within the same second (through the console).

What makes Ruby on Rails decide which process runs first while the other has to wait? Is it random? Does it use some kind of priority? Is there any theory that explains it?

I opened Active_record/transaction.rb and saw that it requires 'thread'. Then I opened thread.rb, but I still cannot understand how Ruby creates threads for concurrent processes within the same second, or even the same microsecond.

I would appreciate your kind answer.

Thanks.
Reinhart