Tuesday, January 29, 2008

Interview With Revactor Developer Tony Arcieri

With the recent release of his Revactor library, I wanted to talk with Tony Arcieri about Ruby, Actors, and Revactor. He was kind enough to sit down for a short interview. Here’s what we talked about.

How did you get started with Ruby?

Tony: Ruby is a language some roommates of mine were using for years and kept raving to me about. Unfortunately, I was a performance-obsessed C programmer and couldn’t really get past the whole “Ruby is slow” stigma. Then in early 2005 Rails started generating a lot of buzz, and I got sucked into using Ruby for web development. A few years later, I look back and wonder how I could stand programming in C for so long.

Revactor is an implementation of Actors for Ruby 1.9. Is there a reason you targeted 1.9 instead of Rubinius (with tasks) or another implementation?

Tony: Ruby 1.8 already supports Actors with the Omnibus Concurrency Library, and Rubinius supports them in its standard library. I’m not aware of an Actor model implementation for JRuby, but it’d be pretty easy to do with a Scala-like thread pool. I chose Ruby 1.9 because I felt that, for the time being, it’s the most practical and performant platform for writing network applications with the Actor model. Revactor is built on a number of Ruby 1.9-specific features, most notably Fibers, which provide the underlying concurrency primitive. However, Revactor is also built on top of an event library called Rev, whose feature set was tailored for implementing high-performance networking within the Actor model (although it can be used as a general-purpose event library if you so desire). Ruby 1.9 contains several features which made writing this event library quick and easy with minimal C code. These include things like support for blocking system calls and non-blocking I/O.

However, I definitely feel that down the road Rubinius will be much better suited. Rubinius already supports multiple shared-nothing virtual machines which each run in their own hardware thread and can communicate over an internal message bus. Using that in conjunction with Actors, you can do scatter/gather distributed programming (MapReduce is probably the most famous example of this) which can run a copy of a job on each VM (and thus on its own CPU core) then reduce the results to the final output. With this approach, your program runs N times faster on N CPUs.
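To make the scatter/gather idea concrete, here is a minimal sketch in plain Ruby. It does not use Rubinius' multi-VM API: ordinary threads stand in for the VMs, and `scatter_gather` is a hypothetical helper name chosen for this illustration. On MRI, threads do not run Ruby code in parallel, so this shows the shape of the technique rather than the speedup.

```ruby
# Scatter/gather sketch using only Ruby's standard Thread class.
# Assumption: threads stand in for Rubinius VMs; scatter_gather is a
# hypothetical helper, not part of Rubinius or Revactor.
def scatter_gather(items, workers, &job)
  # Aim for one slice of work per worker (at least one item per slice)
  slice_size = [(items.size / workers.to_f).ceil, 1].max

  # Scatter: start one thread per slice, each computing a partial result
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new(slice) { |chunk| job.call(chunk) }
  end

  # Gather: Thread#value waits for the thread and returns its result
  threads.map { |thread| thread.value }
end

numbers = (1..100).to_a

# Each worker sums its own slice of the input...
partial_sums = scatter_gather(numbers, 4) do |chunk|
  chunk.inject(0) { |sum, n| sum + n }
end

# ...and the partial results are reduced to the final output
total = partial_sums.inject(0) { |sum, n| sum + n }
puts total
```

The job passed to each worker touches only its own slice, which is what keeps the workers independent and makes the final reduce step the only place results meet.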

What do you think about some of the other approaches to concurrency? (See MenTaLguY’s page for example.)

Tony: Many of the techniques there can go hand in hand with Actors (futures, for example). As far as non-Actor approaches, my favorite is probably join calculus as seen in languages like JoCaml.

MenTaLguY has long been involved in concurrency in Ruby. I see that you’re using his Case gem in Revactor. What other influence has he had on Revactor?

Tony: MenTaLguY has been very helpful in smoothing out the API design and will hopefully be making Revactor thread safe in the near future. He’s pointed out solutions to problems which, in retrospect, were pretty obvious but I just didn’t see at the time. We’re trying to put together something of a standard Actor API and protocol such that a program written using Actors in Ruby isn’t tied to a particular implementation and can run on Omnibus, Rubinius, or Revactor. We’ll also hopefully be putting out a cross-compatible gem which bundles up a lot of the standard Actor functionality so there aren’t three different implementations of the same thing floating around.

You’ve got a great introduction to Actors up at your Philosophy page, but it’s a little light on code. Could you give us an example of Revactor at work?

Tony: There are a number of code examples available on http://doc.revactor.org which go a bit more in depth as to how Actors send and receive messages, but here’s an example of an echo server:

# An example echo server, written using Revactor::TCP
# This implementation creates a new actor for each
# incoming connection.

require 'revactor'

HOST = 'localhost'
PORT = 4321

# Before we can begin using actors we have to call Actor.start
# Future versions of Revactor will hopefully eliminate this
Actor.start do

  # Create a new listener socket on the given host and port
  listener = Revactor::TCP.listen(HOST, PORT)
  puts "Listening on #{HOST}:#{PORT}"

  # Begin receiving connections
  loop do

    # Accept an incoming connection and start a new Actor
    # to handle it
    Actor.spawn(listener.accept) do |sock|
      puts "#{sock.remote_addr}:#{sock.remote_port} connected"

      # Begin echoing received data
      loop do
        begin
          # Write everything we read
          sock.write sock.read
        rescue EOFError
          puts "#{sock.remote_addr}:#{sock.remote_port} disconnected"
          # Break (and exit the current actor) if the connection
          # is closed, just like with a normal Ruby socket
          break
        end
      end
    end
  end
end

This doesn’t demonstrate inter-Actor messaging (although it’s doing it behind the scenes). However, what you do see is that there’s very little disconnect between using Revactor and writing a traditional threaded network server. If you’ve written programs in the past using Thread and Queue, then moving over to Revactor will be easy, and you’ll find Actor mailboxes to be a much more powerful way of processing messages.
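For readers who haven’t used the Thread and Queue style mentioned above, here is a minimal sketch of it in plain Ruby. It is not Revactor code: the `mailbox` and `replies` names and the `:shutdown` message are assumptions made for this illustration. A Queue plays the role of a very simple mailbox that hands messages to the worker strictly in arrival order.

```ruby
require 'thread' # provides Queue on older Rubies; harmless on newer ones

# Assumption: these names and the :shutdown convention are illustrative
# only and are not part of Revactor's API.
mailbox = Queue.new # messages sent to the worker
replies = Queue.new # messages the worker sends back

worker = Thread.new do
  loop do
    message = mailbox.pop # blocks until a message arrives
    break if message == :shutdown

    # "Handle" the message by replying with an upcased copy
    replies << message.upcase
  end
end

mailbox << "hello"
mailbox << "world"
mailbox << :shutdown

# Wait for the worker to drain its mailbox and exit
worker.join

# Collect everything the worker sent back, in order
results = []
results << replies.pop until replies.empty?
p results
```

The worker here can only take whatever message is at the front of the queue, which is the limitation to keep in mind when comparing this style with Actor mailboxes.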

Are there any books, blogs, or websites you’d recommend for learning more about concurrency in general or actors in particular?

Tony: Programming Erlang by language creator Joe Armstrong was immensely helpful in understanding Actor-based concurrency, and many of the ideas in Revactor are drawn directly from Erlang. Some of the Erlang portal sites such as planeterlang.org also cover concurrent programming in general, particularly with Actors.
