Showing posts with label Revactor. Show all posts

Tuesday, January 29, 2008

Ruby Concurrency with Actors

Tony Arcieri (read an interview with Tony here) recently posted an announcement about his new Revactor library, which provides actor-style concurrency built atop Ruby 1.9’s new Fiber technology. Since Fibers and Actors haven’t been part of the Ruby lexicon, I wanted to put some information together to help myself (and anyone else who wants to crib off my notes) get up to speed. Update! You might also want to check out my new Questions Five Ways on Concurrency post.

To start with, I figured it was best to begin with the basics. Quoting from Revactor’s Philosophy page:

The basic operation of an Actor is easy to understand: like a thread, it runs concurrently with other Actors. However, unlike threads it is not pre-emptable. Instead, each Actor has a mailbox and can call a routine named ‘receive’ to check its mailbox for new messages. The ‘receive’ routine takes a filter, and if no messages in an Actor’s mailbox matches the filter, the Actor sleeps until it receives new messages, at which time it’s rescheduled for execution.

Well, that’s a bit of a naive description. In reality the important part about Actors is that they cannot mutate shared state simultaneously. That means there are no race conditions or deadlocks because there are no mutexes, conditions, and semaphores, only messages and mailboxes.
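To make the mailbox-and-filter idea concrete, here is a minimal sketch in plain Ruby built on a Fiber. The names `ToyActor`, `deliver`, and `receive` are hypothetical and chosen for illustration; this is not Revactor's API or implementation, only a toy showing how a filtered `receive` can put an actor to sleep until a matching message arrives.

```ruby
# A toy actor: one Fiber plus a mailbox (an Array).
# ToyActor, deliver and receive are illustrative names, not Revactor's API.
class ToyActor
  def initialize(&body)
    @mailbox = []
    @finished = false
    @fiber = Fiber.new do
      body.call(self)
      @finished = true
    end
    @fiber.resume # run the body until it first waits in receive
  end

  # Put a message in the mailbox and reschedule the actor if it is still running.
  def deliver(message)
    @mailbox << message
    @fiber.resume unless @finished
    self
  end

  # Return the first message the filter accepts; otherwise sleep
  # (yield control) until another message is delivered.
  def receive(&filter)
    loop do
      index = @mailbox.index(&filter)
      return @mailbox.delete_at(index) if index
      Fiber.yield
    end
  end
end

log = []
actor = ToyActor.new do |me|
  number = me.receive { |m| m.is_a?(Integer) }
  log << "number: #{number}"
  text = me.receive { |m| m.is_a?(String) }
  log << "text: #{text}"
end

actor.deliver("hello") # does not match the Integer filter, so the actor keeps sleeping
actor.deliver(42)      # matches; the actor wakes and then finds "hello" already waiting
puts log
```

Because the actor only gives up control inside `receive`, nothing else touches its state while it is running, which is the non-preemptable behavior the quote describes. The sketch assumes messages are always delivered from outside the actor's own fiber.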

Ok, that’s not so bad. How well can this work in Ruby? I thought I’d go to the horse’s mouth for this one. Joe Armstrong, creator of the Erlang language, which is built on the Actor model, had this to say:

Difficult to say – we always said that performance of a concurrent system depends upon three critical times

  1. process spawning
  2. context switching
  3. message passing times

If you can make these fast it might work.

The main problem with concurrency is isolating processes from each other: one process must not be able to corrupt another process.

Even though the message-passing process spawning etc. seems simple there is a lot of junk going on in the background that you are not aware of – implementing this is of the order of complexity of implementing an operating system – i.e., there is processes scheduling, memory management garbage collection etc.

It’s rather easy to add a layer to an existing system that can give you a few thousand of processes but as you get to the hundreds of thousands things get tricky – the overheads per process must be small and so on.

This is tricky stuff – I’m sure you can make it work – but making it fast needs a lot of thinking …
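Armstrong's three timings can be estimated roughly for Ruby's Fibers, the primitive Revactor builds on. The following is a rough sketch, not a rigorous benchmark: the round count is arbitrary, and it times bare Fibers from the standard library rather than Revactor's actors, so it says nothing about scheduler or mailbox overhead.

```ruby
require 'benchmark'

rounds = 10_000 # arbitrary count, chosen only for illustration

# 1. Process spawning: create a fiber and run it to completion.
spawn_time = Benchmark.realtime do
  rounds.times { Fiber.new { :done }.resume }
end

# 2. Context switching: bounce control between the caller and one fiber.
switcher = Fiber.new { loop { Fiber.yield } }
switch_time = Benchmark.realtime do
  rounds.times { switcher.resume }
end

# 3. Message passing: hand a value to a fiber and get it echoed back.
echo = Fiber.new { |msg| loop { msg = Fiber.yield(msg) } }
message_time = Benchmark.realtime do
  rounds.times { |i| echo.resume(i) }
end

{ spawn: spawn_time, switch: switch_time, message: message_time }.each do |name, secs|
  puts format('%-8s %.2f microseconds per operation', name, secs / rounds * 1_000_000)
end
```

The absolute numbers will vary by machine and Ruby version; the point is only that all three costs can be measured separately.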

MenTaLguY has spent a lot of time working with Ruby concurrency. He also weighed in on Actors’ place in Ruby:

I think in the long-term actors are likely to become the major means of distributed concurrent programming in Ruby. At the very least I think distributed actors are likely to displace DRb for the things it is used for today.

I don’t know that actors will necessarily dominate for more “local” problems, however. I think the actor model will face competition from e.g. the join calculus, and other approaches like concurrent logic programming which can offer more natural solutions to some problems. Transactional memory will also have a place, although after writing several STM implementations I am not the fan of it that I used to be.

Paul Brannon (a longtime Rubyist and all-around good guy) told me that he thinks Actor implementations are important for Ruby:

I think the industry is about to make a shift toward erlang (actor)-style concurrency, because it makes true concurrency transparent and easy for the user. Home computers are being shipped with more and more cores these days, and pretty soon, taking full advantage of the hardware available will necessarily imply concurrent programming.

I also asked around for recommendations about good resources for learning about Actor based concurrency (and concurrency in general). People were unanimous in suggesting that interested programmers spend some time with Erlang to get a handle on it. The book Programming Erlang also got high marks. Ola Bini also recommended Java Concurrency in Practice for those with a Java bent/background.

However you choose to do it, I recommend spending some time with Actors; they look like a good way to get more out of your programming. Let me just go back to Revactor’s Philosophy page for a second:

Actors rule. You really should use them in your programs. Especially if your programs do a lot of stuff at once. Seriously, whatever you’re doing besides Actors, it probably sucks. Actors are this awesome panacea that will make it all better, I swear. In conclusion: use them, do it!

There, are you convinced now?

Interview With Revactor Developer Tony Arcieri

With the recent release of his Revactor library, I wanted to talk with Tony Arcieri about Ruby, Actors, and Revactor. He was kind enough to sit down for a short interview. Here’s what we talked about.


How did you get started with Ruby?

Tony: Ruby is a language some roommates of mine were using for years and kept raving to me about. Unfortunately, I was a performance-obsessed C programmer and couldn’t really get past the whole “Ruby is slow” stigma. Then in early 2005 Rails started generating a lot of buzz, and I got sucked into using Ruby for web development. A few years later I can look back wondering how I could stand programming in C for so long.

Revactor is an implementation of Actors for Ruby 1.9. Is there a reason you targeted 1.9 instead of Rubinius (with tasks) or another implementation?

Tony: Ruby 1.8 already supports Actors with the Omnibus Concurrency Library and Rubinius supports them in its standard library. I’m not aware of an Actor model implementation for JRuby but it’d be pretty easy to do with a Scala-like thread pool. I chose Ruby 1.9 because I felt that, for the time being, it’s the most practical and performant platform for writing network applications with the Actor model. Revactor is built on a number of Ruby 1.9-specific features, specifically Fibers which provide the underlying concurrency primitive. However, Revactor is also built on top of an event library called Rev whose feature set was tailored for implementing high performance networking within the Actor model (although it can be used as a general purpose event library if you so desire). Ruby 1.9 contains several features which made writing this event library quick and easy with minimal C code. These include things like support for blocking system calls and non-blocking I/O.

However, I definitely feel that down the road Rubinius will be much better suited. Rubinius already supports multiple shared-nothing virtual machines which each run in their own hardware thread and can communicate over an internal message bus. Using that in conjunction with Actors, you can do scatter/gather distributed programming (MapReduce is probably the most famous example of this) which can run a copy of a job on each VM (and thus on its own CPU core) then reduce the results to the final output. With this approach, your program runs N times faster on N CPUs.
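The scatter/gather pattern Tony describes can be sketched in plain Ruby. This is an illustration under stated assumptions, not Rubinius's multi-VM API: the `scatter_gather` helper is hypothetical, and ordinary threads stand in for the shared-nothing VMs. On interpreters with a global lock the threads will not run Ruby code in parallel, so the sketch shows the shape of the computation rather than the speedup.

```ruby
# Scatter: split the items into roughly one slice per worker and run the
# map step on each slice in its own thread.
# Gather: collect each thread's partial result, in slice order.
# scatter_gather is a hypothetical helper, not part of Rubinius or Revactor.
def scatter_gather(items, workers, &map_step)
  slice_size = [(items.size / workers.to_f).ceil, 1].max
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { map_step.call(slice) }
  end
  threads.map(&:value)
end

words = %w[actor fiber actor mailbox fiber actor]

# Map: each worker counts the words in its own slice.
partials = scatter_gather(words, 3) do |slice|
  slice.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
end

# Reduce: merge the partial counts into the final output.
totals = partials.reduce({}) do |acc, counts|
  acc.merge(counts) { |_word, a, b| a + b }
end

puts totals.inspect
```

Each worker touches only its own slice and returns a fresh hash, so no state is shared between workers until the reduce step combines the partial results.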

What do you think about some of the other approaches to concurrency? (See MenTaLguY’s page for example.)

Tony: Many of the techniques there can go hand in hand with Actors (futures, for example). As far as non-Actor approaches, my favorite is probably join calculus as seen in languages like JoCaml.

MenTaLguY has long been involved in concurrency in Ruby. I see that you’re using his Case gem in Revactor. What other influence has he had on Revactor?

Tony: MenTaLguY has been very helpful in smoothing out the API design and will hopefully be making Revactor thread safe in the near future. He’s pointed out solutions to problems which, in retrospect, were pretty obvious but I just didn’t see at the time. We’re trying to put together something of a standard Actor API and protocol such that a program written using Actors in Ruby isn’t tied to a particular implementation and can run on Omnibus, Rubinius, or Revactor. We’ll also hopefully be putting out a cross-compatible gem which bundles up a lot of the standard Actor functionality so there aren’t 3 different implementations of the same thing floating around.

You’ve got a great introduction to Actors up at your Philosophy page, but it’s a little light on code. Could you give us an example of Revactor at work?

Tony: There’s a number of code examples available on http://doc.revactor.org which go a bit more in depth as to how Actors send and receive messages, but here’s an example of an echo server:


# An example echo server, written using Revactor::TCP
# This implementation creates a new actor for each
# incoming connection.

require 'revactor'

HOST = 'localhost'
PORT = 4321

# Before we can begin using actors we have to call Actor.start
# Future versions of Revactor will hopefully eliminate this
Actor.start do

  # Create a new listener socket on the given host and port
  listener = Revactor::TCP.listen(HOST, PORT)
  puts "Listening on #{HOST}:#{PORT}"

  # Begin receiving connections
  loop do

    # Accept an incoming connection and start a new Actor
    # to handle it
    Actor.spawn(listener.accept) do |sock|
      puts "#{sock.remote_addr}:#{sock.remote_port} connected"

      # Begin echoing received data
      loop do
        begin
          # Write everything we read
          sock.write sock.read
        rescue EOFError
          puts "#{sock.remote_addr}:#{sock.remote_port} disconnected"

          # Break (and exit the current actor) if the connection
          # is closed, just like with a normal Ruby socket
          break
        end
      end
    end
  end
end

This doesn’t demonstrate inter-Actor messaging (although it’s doing it behind the scenes). However, what you do see is that there’s very little disconnect between using Revactor and writing a traditional threaded network server. If you’ve written programs in the past using Thread and Queue, then moving over to Revactor will be easy, and you’ll find Actor mailboxes to be a much more powerful way of processing messages.
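For readers who have not used the Thread and Queue style mentioned above, here is a minimal sketch of it using only Ruby's standard library. It is not Revactor code; the `:stop` sentinel and the variable names are assumptions made for the example. A Queue plays the role of a simple mailbox, with the difference that it hands messages over strictly in arrival order, whereas the actor `receive` described on the Philosophy page takes a filter.

```ruby
require 'thread' # needed on Ruby 1.9; harmless on later versions

mailbox = Queue.new

# The worker blocks on mailbox.pop until a message arrives,
# much like an actor sleeping until its mailbox has something for it.
worker = Thread.new do
  handled = []
  loop do
    message = mailbox.pop
    break if message == :stop # hypothetical sentinel that ends the worker
    handled << message.upcase
  end
  handled
end

mailbox << "ping"
mailbox << "pong"
mailbox << :stop

processed = worker.value # waits for the worker thread and returns its result
puts processed.inspect
```

The worker's only input is the queue, so the calling code never touches the worker's state directly, which is the habit that carries over to actors.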

Are there any books, blogs, or websites you’d recommend for learning more about concurrency in general or actors in particular?

Tony: Programming Erlang, by language creator Joe Armstrong, was immensely helpful in understanding Actor-based concurrency, and many of the ideas in Revactor are drawn directly from Erlang. Some of the Erlang portal sites such as planeterlang.org also cover concurrent programming in general, particularly with Actors.