On Ruby: Concurrency

Wednesday, May 13, 2009

Questions Five Ways - Concurrency

It's time for the first of my Questions Five Ways posts. This time I approached five programmers who do a lot of work with concurrency. Three of them responded: Tony Arcieri (@bascule), Venkat Subramaniam (@venkat_s), and MenTaLguY (@mentalguy). Here are the responses they came up with.

Please help continue this discussion by sharing your thoughts in the comments below.

Which 2-3 languages/approaches should a programmer be studying to move toward a more concurrent future?

Tony Well, besides Reia... :)

1) Erlang: Obviously I believe in the actor model quite a bit, which is the basis of Erlang's approach to concurrency. You can diagram a concurrent system on a whiteboard in terms of different components which talk to each other with messages, and pretty much translate that diagram directly into code. Concurrency is localized to actors, which makes it a lot easier to reason about. The Erlang VM has some issues with scalability on massively multicore systems now but they're being resolved, and the Erlang model makes optimizing concurrency in the VM incredibly easy. All that said, far and away the thing that makes Erlang so incredibly cool is the nearly seamless distribution across multiple systems, which is something Erlang can do far better than any other language in existence. Also, Erlang makes handling faults in concurrent systems comparatively easy, and when it comes to concurrent systems simplified fault handling is invaluable.

2) Haskell: In some regards Haskell is farther along than Erlang when it comes to concurrency, and provides multiple different models for different types of concurrency, whereas Erlang pretty much forces you to use one (the actor model/shared nothing concurrency). However, I prefer Erlang to Haskell as I'm not a fan of pure functional programming/monads and think Erlang's dirty imperative features like its approach to I/O actually make writing programs simpler and more practical. All that said, at Erlang Factory Ulf Wiger pointed out how horribly Erlang performs on shared state concurrency problems like chameneos-redux, which Haskell does great on. I appreciate Haskell providing multiple approaches to modeling concurrency in the same environment.

3) Clojure: uses Software Transactional Memory (STM) for modeling concurrency, which has been around in other places (like Haskell) before but Clojure does some neat stuff as far as simple Lispy syntax for marking sections of the code atomic and also permitting mutable state within the atomic sections. This is great for shared state concurrency problems, but most concurrent problems don't require shared state, and I think reasoning about STM systems is more difficult than actor-based systems because there's no logical mapping between what the system is doing concurrently and how the code is structured. It's sort of like throwing a bunch of queries at a database, and when things start going unexpectedly slow, or break, you're kind of left to wonder what's going on.

Venkat Studying a language is not about learning the syntax, but about learning the idioms and beginning to think in terms of designing applications with them. From a practical point of view, I don't think you, a busy everyday programmer, have the time and energy to study multiple languages at the same time. So, I don't recommend studying 2 to 3 languages.

At any given time, as a professional programmer you should be studying a language.

On one hand, the language you pick must be quite different from the one you are using extensively. In addition to exercising your mental muscle, it helps you not get boxed into the paradigms and idioms promoted by one particular language you're used to.

On the other hand, if you intend to put some of the studying to real use relatively soon, it will help if the language you learn integrates well with the language or the platform you're working with. However, if you learn continuously, you'll find it easier and quicker to pick a language that has the features you desire and integrates well with your current platform or language. So, focus on one language at a time.

From the point of view of concurrency, rather than learning a language specifically, I suggest learning the different approaches. Rather than assuming a particular approach is the right one (or the wrong one), learn the pros and cons of each. Understand what problem they solve, where they would be useful, and what their limitations are. Don't restrict your learning to the approaches supported by one language, your favorite language, or what's currently creating the buzz. Sometimes what's old becomes new again. Learn about shared state vs. message passing, Communicating Sequential Processes, the Actor based model, Nested Data Parallelism, Software Transactional Memory, and so on.

MenTaLguY Erlang and Haskell, but I'd like to offer a different sort of justification from those which are usually offered. The thing is, both of these languages sort of force you to take the functional programming beast by the horns. It isn't actually hard, but it requires learning new habits — and those habits actually happen to correspond nicely to the habits required for writing good concurrent programs (at a high level — low-level optimizations are another story).

It does also happen that they each more or less represent the two main modes of concurrent programming — message-passing (as represented by actors in Erlang) versus shared-memory (as represented by STM in Haskell), so you'll get a flavor for each that way. But do realize that there are lots of different ways of looking at message-passing besides just actors; joins as in JoCaml, for instance.
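To make the two modes concrete, here is a rough sketch in Ruby rather than Erlang or Haskell. It is not STM and not a real actor system; it only contrasts a lock around shared state with a single owner thread fed through a queue. The method names are made up for illustration.

```ruby
require "thread" # Queue and Mutex; needed on Ruby 1.9, harmless on later versions

# Shared-memory style: every worker touches the same counter,
# so each update has to be guarded by a lock.
def shared_memory_total(workers, increments)
  total = 0
  lock = Mutex.new
  threads = Array.new(workers) do
    Thread.new do
      increments.times { lock.synchronize { total += 1 } }
    end
  end
  threads.each(&:join)
  total
end

# Message-passing style: workers never touch the total. They send
# messages to one owner thread, the only code that mutates it.
def message_passing_total(workers, increments)
  mailbox = Queue.new
  owner = Thread.new do
    total = 0
    loop do
      message = mailbox.pop # blocks until a message arrives
      break if message == :done
      total += message
    end
    total
  end
  threads = Array.new(workers) do
    Thread.new { increments.times { mailbox << 1 } }
  end
  threads.each(&:join)
  mailbox << :done
  owner.value # joins the owner thread and returns its total
end

puts shared_memory_total(4, 1_000)
puts message_passing_total(4, 1_000)
```

Both calls should arrive at the same total. The difference is where the coordination lives: in the first version it is spread across every piece of code that touches the counter, in the second it sits in one place.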

Also, in general, I think the main goal for learning new programming languages should be to stretch your brain and find things to take back to your work in your usual languages. (No matter how different it is from what you're used to — and it should be different if you intend to stretch your brain — there will always be something.) You may find a language you end up falling in love with in the process (Ruby was one such language for me), but it's better not to approach new languages with such high expectations.

Friday, September 05, 2008

New Improved MySQL library

MySQLPlus is a new, non-blocking MySQL driver for Ruby 1.8 and 1.9 (anyone know if it will run on Rubinius?) from eSpace, the folks who created NeverBlock. (They also talk about NeverBlockPG, a PostgreSQL driver, but it seems to have been deleted.) To quote eSpace's announcement:

[MySQLPlus does] IO operations concurrently and in a transparent manner, thanks to NeverBlock. An interesting side effect emerged during the development of this driver. We were required to update the current MySQL driver to be able to do async operations. Once those were done, we discovered that the basic foundation for threaded support was there. Hence we went forward and implemented it (with help from Ruby gurus like Aman Gupta and Roger Pack). What we have now is a new general purpose MySQL driver that supports threaded access and async operations. This means that you can send queries to a MySQL server in a concurrent manner from Ruby applications. This is big news for those waiting for Rails thread safety. Finally there is a MySQL driver that can help them achieve that concurrency.
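To picture what sending queries "in a concurrent manner" buys you, here is a small sketch that does not use MySQLPlus at all. StubConnection is a hypothetical stand-in whose query method just sleeps to imitate waiting on the server; the point is only that threads blocked on I/O can overlap their waits.

```ruby
# Stand-in for a database connection (hypothetical; NOT the MySQLPlus API).
# `query` sleeps to imitate waiting on the server, then echoes the SQL back.
class StubConnection
  def initialize(latency)
    @latency = latency
  end

  def query(sql)
    sleep(@latency) # a sleeping thread lets other Ruby threads run
    "result of: #{sql}"
  end
end

# Run each query on its own thread with its own connection and
# return the results in the same order as the input.
def run_concurrently(queries, latency = 0.2)
  threads = queries.map do |sql|
    Thread.new { StubConnection.new(latency).query(sql) }
  end
  threads.map(&:value) # value joins the thread and returns the block's result
end

started = Time.now
results = run_concurrently(["SELECT 1", "SELECT 2", "SELECT 3", "SELECT 4"])
elapsed = Time.now - started
puts results
puts "elapsed: #{elapsed.round(2)}s (a serial run would take about 0.8s)"
```

With a driver that blocks the whole interpreter during a query, the four waits would add up instead of overlapping, which is the problem the announcement says the new driver addresses.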

I'd love to see what alternative ORMs like Sequel and DataMapper will do with this kind of library underneath them.

Tuesday, January 29, 2008

Ruby Concurrency with Actors

Tony Arcieri (read an interview with Tony here) recently posted an announcement about his new Revactor library, which provides actor style concurrency built atop Ruby 1.9’s new Fiber technology. Since Fibers and Actors haven’t been a part of the Ruby lexicon, I wanted to put some information together to help myself (and anyone else who wants to crib off my notes) get up to speed. Update! You might also want to check out my new Questions Five Ways on Concurrency post.

To start with, I figured it was best to begin with the basics. Quoting from Revactor’s Philosophy page:

The basic operation of an Actor is easy to understand: like a thread, it runs concurrently with other Actors. However, unlike threads it is not pre-emptable. Instead, each Actor has a mailbox and can call a routine named ‘receive’ to check its mailbox for new messages. The ‘receive’ routine takes a filter, and if no messages in an Actor’s mailbox matches the filter, the Actor sleeps until it receives new messages, at which time it’s rescheduled for execution.

Well, that’s a bit of a naive description. In reality the important part about Actors is that they cannot mutate shared state simultaneously. That means there are no race conditions or deadlocks because there are no mutexes, conditions, and semaphores, only messages and mailboxes.
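To help myself picture the mailbox and the filtered receive described above, I sketched a toy version on top of Ruby 1.9's Fiber. TinyActor and everything in it are names I made up for this sketch; this is not Revactor's API and it leaves out nearly everything a real library needs (timeouts, error handling, I/O).

```ruby
# A toy actor built on Fiber. TinyActor and its methods are hypothetical
# names for this sketch; they are not Revactor's API.
class TinyActor
  @ready = [] # actors waiting for their turn to run

  class << self
    attr_reader :ready

    # Run ready actors one at a time until nobody has anything left to do.
    def run
      ready.shift.resume until ready.empty?
    end
  end

  def initialize(&body)
    @mailbox = []
    @done = false
    @fiber = Fiber.new do
      body.call(self)
      @done = true
    end
    TinyActor.ready << self
  end

  # Deliver a message and reschedule the receiving actor.
  def <<(message)
    @mailbox << message
    TinyActor.ready << self unless TinyActor.ready.include?(self)
    self
  end

  # Return the first mailbox message the filter block accepts.
  # If nothing matches, sleep (yield the fiber) until rescheduled.
  def receive
    loop do
      index = @mailbox.index { |message| yield message }
      return @mailbox.delete_at(index) if index
      Fiber.yield
    end
  end

  def resume
    @fiber.resume unless @done
  end
end

# Two actors trading messages; returns a log of what happened, in order.
def ping_pong_log(rounds)
  log = []
  ponger = TinyActor.new do |me|
    rounds.times do
      sender, _tag = me.receive { |msg| msg.is_a?(Array) && msg.last == :ping }
      log << "ponger got ping"
      sender << :pong
    end
  end
  TinyActor.new do |me|
    rounds.times do
      ponger << [me, :ping]
      me.receive { |msg| msg == :pong }
      log << "pinger got pong"
    end
  end
  TinyActor.run
  log
end

puts ping_pong_log(2)
```

Because only one fiber runs at a time and control changes hands only inside receive, the two actors never touch the log at the same moment, which is the property the quote is pointing at.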

Ok, that’s not so bad. How well can this work in Ruby? I thought I’d go to the horse’s mouth for this one. Joe Armstrong, creator of the Erlang language, which is built on the Actor model, had this to say:

Difficult to say – we always said that performance of a concurrent system depends upon three critical times

process spawning

context switching

message passing times

If you can make these fast it might work.
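Out of curiosity, here is one rough way to put numbers on those three times for plain Ruby Fibers. It is only a sketch: the method name is made up, a fiber is standing in for a "process", and the figures will vary by machine and Ruby version.

```ruby
require "benchmark"

# Time the three operations for plain Ruby Fibers. Returns seconds per
# operation, keyed by the cost being measured.
def measure_fiber_costs(count)
  # 1. Process spawning: create a fiber and start it once.
  spawn_time = Benchmark.realtime do
    count.times { Fiber.new { :started }.resume }
  end

  # 2. Context switching: each resume is one switch in plus one switch out.
  switcher = Fiber.new { loop { Fiber.yield } }
  switch_time = Benchmark.realtime do
    count.times { switcher.resume }
  end

  # 3. Message passing: hand a value to a fiber and get it echoed back.
  echo = Fiber.new do |message|
    loop { message = Fiber.yield(message) }
  end
  message_time = Benchmark.realtime do
    count.times { |i| echo.resume(i) }
  end

  {
    spawn: spawn_time / count,
    switch: switch_time / count,
    message: message_time / count
  }
end

measure_fiber_costs(10_000).each do |name, seconds|
  puts format("%-8s %.2f microseconds per operation", name, seconds * 1_000_000)
end
```

This says nothing about isolation, scheduling, or garbage collection, which is where the rest of the quote says the real difficulty lies.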

The main problem with concurrency is isolating processes from each other – one process must not be able to corrupt another process.

Even though the message-passing process spawning etc. seems simple there is a lot of junk going on in the background that you are not aware of – implementing this is of the order of complexity of implementing an operating system – i.e., there is processes scheduling, memory management garbage collection etc.

It’s rather easy to add a layer to an existing system that can give you a few thousand of processes but as you get to the hundreds of thousands things get tricky – the overheads per process must be small and so on.

This is tricky stuff – I’m sure you can make it work – but making it fast needs a lot of thinking …

MenTaLguY has spent a lot of time working with Ruby concurrency. He also weighed in on the place of Actors in Ruby:

I think in the long-term actors are likely to become the major means of distributed concurrent programming in Ruby. At the very least I think distributed actors are likely to displace DRb for the things it is used for today.

I don’t know that actors will necessarily dominate for more “local” problems, however. I think the actor model will face competition from e.g. the join calculus, and other approaches like concurrent logic programming which can offer more natural solutions to some problems. Transactional memory will also have a place, although after writing several STM implementations I am not the fan of it that I used to be.

Paul Brannon (a long-time Rubyist and all-around good guy) told me that he thinks Actor implementations are important for Ruby:

I think the industry is about to make a shift toward erlang (actor)-style concurrency, because it makes true concurrency transparent and easy for the user. Home computers are being shipped with more and more cores these days, and pretty soon, taking full advantage of the hardware available will necessarily imply concurrent programming.

I also asked around for recommendations about good resources for learning about Actor based concurrency (and concurrency in general). People were unanimous in suggesting that interested programmers spend some time with Erlang to get a handle on it. The book Programming Erlang also got high marks. Ola Bini also recommended Java Concurrency in Practice for those with a Java bent/background.

However you choose to do it, I recommend spending some time with Actors; they look like a good way to get more out of your programming. Let me just go back to Revactor’s Philosophy page for a second:

Actors rule. You really should use them in your programs. Especially if your programs do a lot of stuff at once. Seriously, whatever you’re doing besides Actors, it probably sucks. Actors are this awesome panacea that will make it all better, I swear. In conclusion: use them, do it!

There, are you convinced now?
