On Ruby: January 2007

Wednesday, January 31, 2007

Blogging Contest: February Challenge

Update: It's time to get going on the March contest.

Well, as we close up the January contest, it's time to get a new one started. Same rules: you enter by writing a blog post responding to the question below and linking to it in the comments section. At the end of February, Peter Cooper (the author of Beginning Ruby: From Novice to Professional, head of Ruby Inside, and this month's guest judge), Jason Gilmore (the Apress Editor), and I will take a week to pick our favorite and announce it. The winner will win three Apress books of their choice.

With that in mind, let's move right on to this month's question:

Last month Jarkko Laine asked "How has Ruby on Rails made you a better programmer?" This month I want to ask, "How has Ruby blown or stretched your mind?" Initially you might think the same answer applies as to Jarkko's question, but what I am specifically referring to are the things in Ruby that have made your brain race and go "wow!", even if you've never bothered to go on to use that feature much (if at all!)
Most Ruby programmers have come to the language from another. I came from Perl, which had already blown my mind with its development style, having previously come from the monolithic, strict worlds of Pascal and C. At each language transition, I have initially resisted and refused to look at the new programming language with an open mind, instead comparing it to my current language. Transitions are hard, especially when you've invested many years in your current techniques, but even though it can take a real effort to start on something new, especially a programming language, I've found that once those "Wow!" moments start coming along, you rapidly become a convert.
What were your "Wow!" moments with Ruby? Do you still get them? Are you just intellectually impressed when they come along, or do you actually become truly excited (keep it clean please)? What were the moments and the things that you saw that made you want to give your first-born to Matz and live in Ruby land happily ever after?
To kick things off, I will shortly be blogging my own answer to this question (although I won't be eligible for the contest, naturally!) but I look forward to getting to see what it is that's made you excited about Ruby, especially if I get to experience some new "Wow!" moments I haven't had before!

And while you head off to write your masterpiece, I'll be off reading through a stack of great entries from last week's contest.

Tuesday, January 30, 2007

Cardinal: A Behind the Curtains Look at Parrot

This week (and next), I'm taking some time away from the rubinius serial interview to let those guys get back to work. To keep everyone on their toes, I've got a couple of other bits lined up. This week, I asked Dan Sugalski, the original architect of Parrot, about a design decision that he made. Parrot might seem a bit far afield for a Ruby blog but there is the Cardinal project which is trying to build a Ruby front-end for it, so I don't think we're too far off the path.

Next week? You'll have to come back then and see.

In recent discussions in the Ruby world, the Parrot VM has taken some heat because of its register based design. I know you guys went through the trade-offs, but those discussions are lost in the mists of time. Could you fill everyone in?

Dan: There were a couple of reasons.

Firstly, I just like 'em. Yeah, I know, "taste" is a dodgy reason to do things, but when you're designing stuff like this, you need to do stuff that you're kinda fond of, and I was always comfortable working with register machines. Wrote more assembly than I care to think about for 6502 and 68000 family microprocessors. I liked them, was comfortable writing code for them by hand, so it was a good reason to design a register machine.

Second, there is a lot of literature out there for writing optimizers for register machines. All modern CPUs are register machines, and a lot of people have spent a lot of time figuring out how best to generate code for them. I knew that Parrot wouldn't be able to use it right away, but I was trying to plan for things in 2005, 2010, and 2020 when I started. (This was back in 2000)

Thirdly, register machines are generally more efficient than stack machines. Yeah you might see the bytecode look a little denser (which is certainly important) but there's an amazing amount of bookkeeping with a stack machine. There's an extra store for every opcode, which adds up. For example, storing 1 in register X is:


machine reg = regbase + x
store indirect 1

while a stack machine is:


machine reg = stackbase + 1
store indirect 1
store stackbase

Also on the efficiency front, stack operations are destructive, so something like:


x = x + 1
y = y + 1

requires putting that 1 on the stack twice — on a register machine you can reuse it, saving a full op. (This requires some intelligence on the part of the compiler, but it's actually a pretty simple optimization and easy for anyone past first year compiler construction to do.)
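To make the bookkeeping argument concrete, here is a minimal sketch of two toy interpreters running the x = x + 1, y = y + 1 example and counting dispatched ops. Both instruction sets are invented for illustration and are not Parrot's; the register version also assumes x and y already live in registers, which flatters it. The point is only that the constant is loaded once and reused.

```ruby
# Toy stack VM (hypothetical opcodes): every operand passes through the stack.
def run_stack(program, vars)
  stack = []
  ops = 0
  program.each do |op, arg|
    ops += 1
    case op
    when :push_var   then stack.push(vars[arg])
    when :push_const then stack.push(arg)
    when :add
      b = stack.pop
      a = stack.pop
      stack.push(a + b)
    when :store_var  then vars[arg] = stack.pop
    end
  end
  ops
end

# Toy register VM (hypothetical opcodes): operands are named registers.
def run_register(program, regs)
  ops = 0
  program.each do |op, dst, a, b|
    ops += 1
    case op
    when :load_const then regs[dst] = a
    when :add        then regs[dst] = regs[a] + regs[b]
    end
  end
  ops
end

stack_program = [
  [:push_var, :x], [:push_const, 1], [:add], [:store_var, :x],
  [:push_var, :y], [:push_const, 1], [:add], [:store_var, :y] # the 1 is pushed again
]
register_program = [
  [:load_const, :one, 1],  # load the constant once...
  [:add, :x, :x, :one],
  [:add, :y, :y, :one]     # ...and reuse it here
]

stack_vars = { x: 1, y: 10 }
registers  = { x: 1, y: 10 }
stack_ops    = run_stack(stack_program, stack_vars)
register_ops = run_register(register_program, registers)
```

Without the reuse, the register program would need a second load_const, which is the one op Dan is talking about.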

Fourthly, for JITting, we know we're targeting a register architecture — the CPU we're running on, which has registers. It's easy enough to do a naive stack VM -> native machine translation, but it's a stack transform, treating your register hardware as a stack machine, and that's inefficient. Doing a good JIT from a stack to register system is much tougher.

A register VM -> register machine translation is a lot easier. You can, of course, grab a register and use it as your 'vm register' base pointer, which is as naive as the stack transform above, but it's much simpler to get relatively efficient register machine code out of a register VM system, assuming some thought went into designing the register VM. (Which it did.)

(The usual argument that "well, compilers build up a sort of stack-based intermediate representation before generating code" is interesting but bogus, since almost all the useful information in that intermediate representation is tossed away when the low-level stack ops are created.)

So that's basically it. One part taste, one part lots of free work to bootstrap on, and two parts efficiency.

Ruby Hacker Interview: Jens Kraemer

I recently completed an interview with Jens Kraemer, a German Ruby Hacker, and the author of several Ruby and Ruby on Rails tools. Read on to learn more about Jens and what he's doing.

To begin with, would you please introduce yourself to the readers?

Jens: I'm 28 years old and live in Dresden, Germany.

I work as a software developer at webit!, a mid-sized IT-consulting and software development company specializing in web-based applications. I also keep a blog.

I started studies in economics and engineering at the Technical University of Dresden in 1997. Though I always was fascinated by computers and especially by letting them do what I wanted, I chose not to study computer science because it seemed far too theoretical to me. Later in my studies I selected computer science courses wherever possible (picking the interesting topics, and leaving out the theoretical ones, of course ;-).

I started earning money with software development in 2000 when I joined webit! for what I thought would be a summer job developing an e-commerce portal. The project died the .com death, but I kept working at webit! while trying to finish my studies by the way. So it took some time, but in 2004 I finally finished my diploma thesis. I stayed at webit! without looking around too much for alternative jobs, just because it's a really great place to work, without too many rules and with lots of cool people around.

How did you discover Ruby?

Jens: I don't know for sure, but I think the first time I heard about Ruby was in a conversation with Steffen Gemkow, an IT consultant from Dresden. Must have been 2002 or 2003, I think.

I then had a look at it, but at first underestimated its power. I liked it more than Perl because of the cleaner syntax, and so I used Ruby for what I would have done with Perl otherwise - small scripting and screen scraping tasks, and some data import/export/conversion jobs.

I was a Java guy at this time and I didn't think too much about using Ruby (or any other scripting language) for 'serious' web apps.

Later I discovered an early version of Rails while looking for some ORM layer to simplify a database migration job I planned to do with Ruby.

I found quite interesting what I saw, tried it out in a small side project, and fell in love with it right away. I think the real secret to Rails' success is the Ruby language, at least that's why I prefer it to all the other newish Rails-like web frameworks written in other languages.

Are you using Ruby professionally?

Jens: Yes, we already did several Rails projects at webit!, some for customers, some for internal use. Right now I'm working on an ecommerce solution implemented in Rails for a publishing company here in Saxony. It allows customers to subscribe to online and printed versions of several official publications concerning Saxon law and administration. Ferret-powered full text search is available to registered users, too. The site is located at www.sachsen-gesetze.de.

As time permits, I also do some Ruby/Rails freelance work.

What other languages are you using?

Jens: Recently I was involved in C# and Perl projects. Before that I've built J2EE web apps for several years.

What is the Ruby community like in Germany?

Jens: As far as I can tell - quite small ;-) But it seems to be growing. I really can't say much about this, there's not much activity Ruby-wise in or around Dresden. In other areas (Berlin, Hamburg, Frankfurt) there seems to be more movement, though. There's a quite low traffic German Rails mailing list.

But Ruby seems to get momentum, recently a potential client explicitly asked for Ruby experience ;-)

What projects are you working on with Ruby?

Jens: Besides building web applications with Rails at webit!, I currently maintain two Ruby-related open source projects:

RDig is a full text indexer for web sites and file systems written in Ruby
acts_as_ferret is a Rails plugin for easy full text search across model data.

Both projects are built on top of David Balmain's great Ferret library.

In addition to that, I'm involved in the soon-to-go-live project led by Benjamin Krause, that David talked about in his interview ;-)

What can you say about this project?

Jens: OK, time for some planned leakage ;-)

The project is called Open Media Database or short, OMDB. As the name says, it's a database about media. Our goal is to provide objective, correct and structured information about media of any kind, be it books, movies or music. To reach this goal we combine full text information with structured, domain specific data. For now, the scope of the platform is limited to movies.

Basically all data will be open for editing by the public, just like Wikipedia. To ensure the correctness of the structured data, it is possible for a team of editors to freeze never-changing relationships like 'George Lucas has made Star Wars' to a non-editable state after they have been confirmed as correct.

All information will be published under a free license. We're currently looking into the CC and GNU license versions. Of course we plan to provide APIs for easy access to the database.

At the moment we're busy getting ready for a public beta. A team of film students and journalists is entering content into the site as we don't want to go live with an empty database.

You can have a look at our current development version here.

It's our testing server, so expect the occasional hiccup ;-). As we're mainly German natives, most of the content already entered is in German, too. But we plan to go live at least with English as an additional language. Most of the user interface already is translated.

We also have a blog.

Can you give some examples of how Ruby makes Rails better?

Jens: In general Ruby tends to not surprise its user - most of the time even a novice developer can guess how to do something. That makes it very easy to get started with Rails for people coming from other languages.

From a more technical point of view, I think one of the important Ruby features that make Rails what it is is objects and classes being open to extension from the outside after their declaration. That enables the little goodies like '3.weeks.ago' as well as custom methods in Active Record relationships and plugins that build complex stuff such as versioning right into the framework.
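As a minimal sketch of what open classes make possible, the snippet below reopens Integer in plain Ruby to get a '3.weeks.ago' style expression. This is not how Rails implements it; the method bodies, and the optional now parameter (added so the result is reproducible), are assumptions made for illustration.

```ruby
# Reopen the built-in Integer class after its declaration.
class Integer
  # Number of seconds in this many weeks.
  def weeks
    self * 7 * 24 * 60 * 60
  end

  # Treat the receiver as a number of seconds and step back from `now`.
  # The `now` parameter is a hypothetical addition for repeatable results.
  def ago(now = Time.now)
    now - self
  end
end

three_weeks_back = 3.weeks.ago
```

Every Integer in the program picks up the new methods, which is what lets a framework add this kind of vocabulary without the user doing anything.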

Another example of using Ruby's power in Rails is when you create your own DSL for use in your integration test sessions - telling user stories with statements like 'joe.buys_a_book' makes testing really fun.
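A rough sketch of such a test DSL, assuming a hypothetical Shopper class that stands in for an integration-test session: a real Rails session would issue requests against the application, while this one only records what happened.

```ruby
# Hypothetical stand-in for an integration-test session object.
class Shopper
  attr_reader :name, :cart, :log

  def initialize(name)
    @name = name
    @cart = []
    @log = []
  end

  # Each step returns self so steps can be chained into a story.
  def logs_in
    @log << "#{name} logs in"
    self
  end

  def buys_a_book(title = "Some Book")
    @cart << title
    @log << "#{name} buys #{title}"
    self
  end
end

joe = Shopper.new("joe")
joe.logs_in.buys_a_book("Programming Ruby")
```

The test body then reads as a user story, with the mechanics hidden behind ordinary method names.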

Another point is that in a standard Ruby installation, you have almost everything you need, in terms of functionality. So, any special needs aside, you have Ruby, Rails, and maybe the ruby-mysql bindings, but that's it. You can start hacking away on your project and chances are you don't need any other libraries. Depends on the project, of course, but it makes installation and maintenance in general much easier if you have fewer external libraries to watch.

Compare that to Perl, where even object orientation is implemented in external modules. A project like Catalyst has to depend on so many external modules (not talking about the project itself being split into another huge set of modules), that it really can be a hard job to get everything installed in the right versions. Imho, you have to be a hard core Perl hacker to get a successful start with Catalyst. I don't want to say the way Perl does handle this is plain wrong, but it makes it harder to get up to speed with a new project, especially for people who are new to the framework and the language.

What problems have you seen with Rails?

Jens: Deployment could be really hard in the beginning (say before Mongrel was there). Zed Shaw is really doing a great job with Mongrel, and it became my preferred deployment platform right from the start.

Otherwise, I'm really happy with Rails :-)

Have you looked at JRuby at all for bridging the Ruby/RoR and Java worlds?

Jens: I did a short look at it, but at the time (I guess somewhere in between 2004 and 2005) it was still quite inactive and I decided to not try using it in production code. Maybe I would have taken a second look after JRuby got up to speed recently, but I didn't do any Java projects since then.

That said, a working .NET integration would be what I need now ;-) I've been watching the Gardens Point Ruby.NET Compiler project for a while now, but I think it will take at least a year until we can talk about using Ruby e.g. in an ASP.net web app.

I'm not an RDig user, can you tell me a bit more about it? Do you know how many people/projects are using it?

Jens: RDig mainly does three things:

crawl for documents
extract content from those documents
index that content with Ferret

The crawling can take place in the file system, or on the web.

For content extraction there are pluggable content extractors for various formats (pdf using xpdf-utils, doc using the wv utility, and html - here you have the choice between a Rubyful Soup based extractor, and one based on the hpricot lib). The hpricot content extractor was the great news of the latest version of RDig, since it's way faster than Rubyful Soup.

In theory, the indexing backend could support other indexing libs, too, but I didn't feel the need to implement one yet.
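The three steps can be sketched in a few lines of plain Ruby. This is not RDig's code and does not use the Ferret API: the in-memory document set, the EXTRACTORS table, and the Hash-based inverted index are all stand-ins chosen so the sketch runs on its own.

```ruby
# Pluggable extractors keyed by file extension (both hypothetical).
EXTRACTORS = {
  ".html" => ->(raw) { raw.gsub(/<[^>]+>/, " ") }, # crude tag stripping
  ".txt"  => ->(raw) { raw }
}

# Step 1: "crawl" an in-memory document set, skipping unknown formats.
def crawl(documents)
  documents.select { |path, _raw| EXTRACTORS.key?(File.extname(path)) }
end

# Step 2: extract plain text with the extractor registered for the format.
def extract(path, raw)
  EXTRACTORS.fetch(File.extname(path)).call(raw)
end

# Step 3: build a word => [paths] inverted index (a stand-in for Ferret).
def build_index(documents)
  inverted = Hash.new { |hash, word| hash[word] = [] }
  crawl(documents).each do |path, raw|
    extract(path, raw).downcase.scan(/\w+/).uniq.each do |word|
      inverted[word] << path
    end
  end
  inverted
end

docs = {
  "a.html" => "<p>Ruby search</p>",
  "b.txt"  => "full text search",
  "c.bin"  => "not a supported format"
}
idx = build_index(docs)
```

Supporting a new format in this sketch means adding one entry to the extractor table, which mirrors the pluggable design Jens describes.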

There's also a CLI for querying the ferret index created by RDig, but that's more for testing purposes. You're supposed to write your own frontend for your index, think of the search in your intranet, or for a site search on a public web site. RDig however has some code to make accessing the index easy, there's no need to learn the Ferret API.

For the numbers, they're quite small. There have been 25/31 gem/tgz downloads of the last version in 2 months. However I have mails from several people successfully using it for tasks like intranet search. Some even send me patches - so I guess it's a useful tool for some people.

At webit! we use RDig for a client who has a web site that in large parts is built by a CMS that publishes static HTML pages. RDig crawls the site via HTTP every night and so rebuilds the index that is used for the site wide search (which is implemented in Rails).

Thursday, January 25, 2007

MountainWest RubyConf: Implementors Panel

One of the things I'm most interested in at the 2007 MountainWest RubyConf is the Implementors' Panel. We're going to have John Lam (RubyCLR, Microsoft), Charles Nutter and Thomas Enebo (JRuby, Sun), Evan Phoenix (rubinius), and Kevin Tew (Cardinal) taking part in the second Implementors' Summit alongside the conference and making a panel presentation during the conference. We're going to make sure there's plenty of time for Q & A during this panel session.

Ruby Implementers Summit Attendees

The first implementors' summit was held at RubyConf 2006, and paved the way for a lot of the cooperation that we're seeing between implementors today. Hopefully this summit and panel will help promote further cooperation between the implementors and with the community at large.

If this kind of presentation is interesting to you, you should head over and sign up now, before all the seats are gone. There are only 250 available, and you don't want to get left out in the cold.

If you're working on a Ruby implementation of your own, and you'd like to be involved in the summit/panel, please drop me a line and we'll see what we can do.

Wednesday, January 24, 2007

rubinius serial interview: episode VI

In this week's episode, Brian Ford (brixen) is joining us. He's been doing a lot of work on the testing and RSpec side of the rubinius house, so that's where we'll start today.

Evan, in our last interview, you talked about 4 tasks you'd like to tackle. At the time, you mentioned that you'd like to work on RNI first. Shortly afterwards, the same topic came up on #rubinius, and there was a pretty strong call for stabilizing the compiler and adding method lookup caching. How does this affect your plans? In general, how do you balance public requests with your own vision of rubinius?

Evan: I ask for public opinion on my task list to help me gauge what's important. There are tasks that I find quite interesting, but they're not always the ones that help the project the most.

I don't allow public opinion to totally drive the order of my list, if I did, I'd probably have to jump from task to task. Everyone on the project has a different list of what they see as important.

I think that giving the public input into the dev process, even if it's just their opinion, helps develop a feeling of community. That's important because I want to get more people to believe in rubinius and want it to succeed. As the number of people interested grows, the interest grows and the project has a better chance of success.

Back in the 2nd episode, Wilson mentioned a spec technique that would allow two implementations to be compared. How is that coming?

Wilson: A great deal of progress has been made. Here's an example spec. In this case, for Array#&:


specify "& should create an array with no duplicates" do
  example do
    p(([ 1, 1, 3, 5 ] & [ 1, 2, 3 ]).uniq!)
  end.should == nil
end

example takes a block, and then evaluates that block using a particular Ruby implementation; MRI, JRuby, or Rubinius. When you run the specs, you decide which runtime you would like to use. At the moment we just run these manually, but eventually we would like to have a continuous integration system for it. If you have 100 failures already, it can be hard to notice that you (for example) fixed one but created a new problem.

Brian: Actually, there are currently a couple of approaches. One is to write a single set of specs that describes Ruby both at the language level and also at the core and standard library level. When I first wanted to add specs to drive my work on rubinius, I knew that rubinius could not run RSpec. Looking at the tests, I saw that Evan was running rubinius in a subprocess from the unit tests. That was all the inspiration I needed. I initially wrote a method named 'rubinius' to wrap the subprocess machinery. That led to specs that looked like this:


specify "some behavior" do
  rubinius do
    p [1, 2, 3].first
  end.should == '1'
end

This really bugged me because I wanted the specs to look like this:


specify "some behavior" do
  [1, 2, 3].first.should == 1
end

But I realized that in MRI (Matz' Ruby Interpreter), I could define 'rubinius' like so:


def rubinius
  yield
end

"Sweet!" I thought. But then the rubinius name didn't make sense anymore. That's when I decided on a method named 'example'. And since I was writing all my spec bodies in irb and copying the result into the spec, it made a lot of sense to just make it run under MRI. So, I decided to make a host/target configuration where the 'host' would run RSpec and the 'target' would execute the body of the spec. Making that work with MRI/MRI meant that I could write specs and run them directly instead of doing everything in irb first. That was a big productivity gain for me.

Of course, I was asking a lot of questions and getting a lot of feedback on my initial work on the specs. A couple folks, lypanov and mae in particular really disliked the string comparisons I thought we were forced to do. Finally, one day I was chatting with headius and he asked why I didn't just put the 'p' call into the 'example' method and eval the result. That was a big jump because now specs looked like this:


specify "some behavior" do
  example do
    [1, 2, 3].first
  end.should == 1
end

Still not ideal, but much closer. And changing these to the ideal form would be really simple, we'd probably even be able to automate it. So anyway, showing off what I had accomplished with headius' prompting to nicksieger inspired him to whip the jruby_target.rb (the file that implements the machinery for running the specs with JRuby) into shape. So now we can target MRI, JRuby, or Rubinius.
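Here is a minimal sketch of the host/target idea, not the actual rubinius spec machinery. To stay self-contained it takes the spec body as a source string instead of a block, and the :subprocess target simply launches the same Ruby binary that is running the host; both simplifications are assumptions for illustration.

```ruby
require "rbconfig"

# Hypothetical host/target switch: :host evaluates in this process,
# :subprocess hands the source to a separate interpreter.
def example(source, target = :host)
  case target
  when :host
    eval(source)
  when :subprocess
    # Have the target print the inspected result, then eval that text back
    # into a host-side object. This only works for values whose inspect
    # output is valid Ruby source (numbers, strings, arrays, nil, ...).
    output = IO.popen([RbConfig.ruby, "-e", "p(#{source})"], &:read)
    eval(output)
  else
    raise ArgumentError, "unknown target: #{target}"
  end
end

host_result   = example("[1, 2, 3].first")
target_result = example("[1, 2, 3].first", :subprocess)
```

Because the 'p' call and the eval live inside the helper, the spec can compare against 1 instead of the string '1', which is the improvement headius suggested.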

Another approach to compatibility has been pursued by mae (Matthew Elder). His idea was that it would be great to just have files of many different examples of Ruby code and have an automated process to run those through RSpec. His work on this can be found in the directory spec/compatibility. I think it's still rough at the moment because I think mae is pretty busy with school work.

Last week, when we talked about STM, it was mentioned that rubinius core classes should use STM to make sure they're thread-safe. I think that using STM like this is easier said than done. Could you show us a simple example of using STM to implement something (say, Array#pop)?

Wilson: OK. Let's say that there is a core method, 'atomic', that takes a block, and yields to it. Behind the scenes, it wraps that code in a transaction, and either completes it or retries it, depending on what happens. For those familiar with database transactions, it's just like that, but in RAM.

I'm not sure that Array#pop is a method that would really be 'automatically' wrapped in a transaction. I think it is more likely that users would wrap a section of code that they needed to be re-entrant in an 'atomic' block, at a higher level. We don't want to slow down serial code unnecessarily. For now, though, let's pretend like we wanted a guaranteed-safe version of Array#pop:


class Array
  def pop
    return nil if self.empty? # Popping an empty array is fine.
    e = self.last # Fetch the last element
    @total -= 1 # 'resize' the array.
    return e
  end
end

Without some kind of safety net, two different tasks could end up popping the same element from the array. Thread #2 could end up hitting e = self.last before Thread #1 updates the array size.

One (not that great) way to implement this would be to allocate a Mutex when creating an array, and have every critical method synchronize on it.

That would probably work fine in ruby 1.8, because Array is implemented in C, and Ruby code cannot directly tamper with it. In Rubinius, Array is a .rb file, and user code can (potentially) re-open the class and muck about with the internals. User code could then bypass the Mutex, and directly access the '@total' instance variable. (Not really; we'll be protecting that before we are done, but you get the idea.)

With STM, you would simply wrap the body of the 'pop' method with 'atomic { }' and move on. The visible state of the '@total' variable will change 'all at once' when the transaction commits.
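To show the shape of that code, here is a sketch in which 'atomic' is faked with a single global lock. That is an assumption made so the example runs on a stock Ruby: it is the Mutex-style approach dressed up with the STM-style interface, with no optimistic execution and no retry. TinyArray is a hypothetical pure-Ruby array with the same '@total' bookkeeping as above.

```ruby
require "monitor"

# Stand-in for the hypothetical 'atomic' core method. One global,
# re-entrant lock; NOT real STM.
ATOMIC_LOCK = Monitor.new

def atomic
  ATOMIC_LOCK.synchronize { yield }
end

# Hypothetical pure-Ruby array that tracks its size in @total.
class TinyArray
  def initialize(items)
    @items = items.dup
    @total = @items.size
  end

  def pop
    atomic do
      next nil if @total.zero? # Popping an empty array is fine.
      e = @items[@total - 1]   # Fetch the last element.
      @total -= 1              # 'Resize' the array.
      e
    end
  end
end

stack = TinyArray.new([1, 2, 3])
first_popped = stack.pop
```

With the read of the last element and the update of '@total' inside one atomic block, no second thread can see the array between those two steps.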

Finally, I understand that rubinius can now load archives, and perform partial recompiles. What new opportunities will this open up for rubinius and its developers?

Brian: This is a feature that I really lobbied for when Evan was asking what he should tackle next. This gives us the ability to stabilize parts of the rubinius implementation and thereby involve a wider range of folks. The core libraries needed to run rubinius are now included, so to get going, people only need to compile the C source for the rubinius VM (shotgun). Some people will jump right in and build rubinius, chasing down compile errors and that sort of thing. Others will start reading source and asking questions. Others will be most excited to run their own carefully typed 'puts "hello, world!"'. We want as many people at every level to help us get rubinius into shape as quickly as possible since we're surely playing catchup with MRI, YARV, and JRuby.

Wilson: High performance virtual machines (such as Microsoft's CLR, Sun's JVM, and Jikes) have shown that there are many different execution scenarios for dynamic programs. The machine needs to be able to choose amongst several different optimization plans, and even switch back and forth as circumstances dictate.

Given code like this:


if almost_never_happens
  self.something_expensive
else
  nil
end

..a dynamic optimizer should probably treat that as:

nil

..with a handler that will go back to the original code if almost_never_happens ends up being true someday.

The evolving Rubinius features are, at least partly, the support infrastructure that will allow this kind of thing. A file containing puts 'Hello, world' should probably just be interpreted directly, to reduce overhead. A Rails app that runs for a year without a restart needs aggressive optimization and caching.

Evan: One scenario where I'm hoping to see archive loading used is deployment. The deployment strategy for rails apps and the like is hit or miss, relying on tools like capistrano and rsync to push new code out. My hope is that archives will be able to replace some of those mechanisms, allowing a rails app to be deployed by simply replacing the app's main archive on the production systems.

Deployment of real applications also becomes simpler. A developer building an OS X application in ruby can easily package up all the ruby code in an archive to make it easy to release new versions.

This week's episode is brought to you by Test Driven Development: A Practical Guide

Previous episodes of the Serial Rubinius Interview are available here:

Episode 1, in which we talk about the rubinius community
Episode 2, in which we talk about cuby and testing tools
Episode 3, in which we talk about rubinius documentation
Episode 4, in which we talk about cooperation with the JRuby team and YARV
A diversion, in which Ola Bini, MenTaLguY, Evan, and I talk about rubinius and lisp
Episode 5, in which we talk about STM and rubinius (with special guest MenTaLguY).

Episode 1, in which Charles, Thomas, and Ola talk about their plans for JRuby.
Episode 2, in which Charles, Thomas, and Ola talk about cooperation with the rubinius team and YARV.
Episode 3, in which Charles, Thomas, and Ola talk about cooperation with the rubinius team Rails and (more about) YARV.

MountainWest RubyConf: Gregory Brown

This is my second in a series of posts about the speakers at MountainWest RubyConf (you can read the first one here).

One of our speakers is Gregory Brown (the creator and lead developer of Ruport), who's going to talk about "Pragmatic Community Driven Development in Ruby". Given his work on Ruport during Google's Summer of Code, and his continuing community building (like this blog entry), I can't think of a better person to talk about it.

He promises to talk about:

Why write free software? (and why it can still be a commercial effort)
Choosing the right license
Attracting new developers
Keeping the community healthy
The benefits (and drawbacks) of building a community and keeping it running

Greg's talk alone is probably going to be worth the $50 registration, but there are some more great talks lined up. I'll be sharing a bit about them over the next couple of weeks, but you don't have to wait, you could head over and sign up now, before all the seats are gone.

Tuesday, January 23, 2007

MountainWest RubyConf: JRuby

I wanted to share a little more insight into the 2007 MountainWest RubyConf, and thought that introducing you to some of our presenters would be the best way to do that.

I'm really excited that we're having both Charles Nutter and Thomas Enebo coming out to talk about JRuby.

Besides the usual suspects in a talk about a project (where we've been, where we're going, and how soon we plan on getting there), Charles and Thomas have two very interesting topics in their talk:

how we can all share the load and help raise Ruby up in all quarters.
lots of demos to show that JRuby is a real, viable Ruby platform you should consider while developing Ruby applications in the future.

With talks like this, how can you not want to come out and play? The registration is only $50, and includes a conference T-Shirt. C'mon, head over and sign up. Seats are going fast.

Monday, January 22, 2007

Rubinius Library Writing Guide

Brian (Brixen on #rubinius) has put together a nice how-to on writing libraries for rubinius. He starts out with an overview of the library architecture, then moves on to writing library methods in Ruby and primitives in C. It's good stuff. Check it out. (Oh, and it's on a wiki, so you should also take a moment to improve it if you can.)

MountainWest RubyConf: Open For Registration

Wow! The MountainWest RubyConf is taking registrations now. It's been a long, hard road and we wouldn't have made it without the excellent work of several people. I'd like to say a special "Thanks!" to Mike Moore. If he hadn't stepped up and done a ton of leg work, we wouldn't be here today.

The speakers have been selected and notified, I'm just waiting on two of them to get back to me, and we'll release the schedule. I can't wait for what looks like an incredible regional RubyConf.

There are just 250 seats (at $50 each), and everything looks like great skiing before and after the conference, so head on over, sign up, and make your plans to be at the best Ruby Conference ever to hit Salt Lake City.

Author Interview: Brian Marick

Brian Marick, the author of the recently released Everyday Scripting with Ruby, was gracious enough to spend some time talking with me about books (including his), scripting, testing, and Ruby.

Would you please introduce yourself?

Brian: I'm a consultant. I work almost exclusively with companies using Agile methods (Scrum, XP, etc.). My emphasis is on testing in Agile projects, but one of the things about Agile projects is that roles don't work in isolation from each other. So, although I'm usually hired to work on testing, I end up sticking my nose into programming, planning, management, etc.

You've had a long history with Ruby, how did you first discover it?

Brian: In February 2001, I went to a workshop at a ski resort in Utah (the one that ended up writing the Manifesto for Agile Software Development). The shuttle ride from the airport was about an hour. Two of the other people in the shuttle (people I didn't know) were also going to the workshop, so we got to talking about programming. I mentioned that I'd been working with this language called Python and was liking it.

The two people were Andy Hunt and Dave Thomas. By the time we got to the resort, I'd promised to buy the Pickaxe book and try Ruby. As has happened with so many others, it just clicked. Things worked the way I guessed they would, so I learned fast.

What's kept you involved with Ruby?

Brian: Two things, really. One is that it's the best language I have for the kinds of projects I want to do. The other is that I used to be kind of a language geek in my Lisp programming days, and Ruby has that Lisp/Smalltalk air about it: the way there's something conceptually simple at the core that lets you do powerful things. In my Lisp days, I loved _The Little Lisper_, and Ruby inspired me to write a book in that style. I never finished, but the first few chapters are here

I loved that book when you first announced it, and have gone back to look at it a few times since. Any chance it will ever see the light of day?

Brian: There's always a chance... Dave Thomas said the PragPress might be interested in it, but that was long ago. They might be more pragmatic these days.

What warts do you wish the Ruby core developers would clean up?

Brian: I'm pretty happy with the core of the language. If I had one wish, it would be doc strings (originally from Lisp, now used in Python). Here's an example from my .emacs file:


(defun ruby-visit-source ()
  "If the current line contains text like '../src/program.rb:34', visit
that file in the other window and position point on that line."
  (interactive)
  (let* ((start-boundary (save-excursion (beginning-of-line) (point)))
  ...

That string is available at runtime. It allows a further step away from thinking of programming as editing text and then running it, toward programming as a more interactive dialogue with an interpreter — one that, oh yeah, produces a text file you can run again.
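
Ruby has no built-in doc strings, but the idea can be roughly approximated in plain Ruby. The sketch below is one hypothetical way to do it: the DocStrings module, its doc and doc_for methods, and the SourceVisitor class are all invented for illustration and are not part of Ruby or of anything Brian describes.

```ruby
# Hypothetical helper (not part of Ruby): remembers a description for the
# next method defined in a class, so it can be looked up at runtime.
module DocStrings
  # Store text to be attached to the next method definition.
  def doc(text)
    @pending_doc = text
  end

  # Ruby calls this hook on the class each time an instance method is defined.
  def method_added(name)
    super
    return unless @pending_doc
    (@docs ||= {})[name] = @pending_doc
    @pending_doc = nil
  end

  # Look up the stored description, or nil if none was given.
  def doc_for(name)
    (@docs ||= {})[name]
  end
end

class SourceVisitor
  extend DocStrings

  doc "If the line contains text like '../src/program.rb:34', " \
      "return the file name and the line number."
  def parse_location(line)
    match = line.match(/(\S+\.rb):(\d+)/)
    match && [match[1], match[2].to_i]
  end
end

# The description is ordinary data that a running program (or irb) can ask for.
puts SourceVisitor.doc_for(:parse_location)
```

This only imitates the feature; a real language-level doc string would not need the extend line or the separate doc call.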

I suppose I'd also like something closer to what Lisp gives you: a method's parse tree as a first-class object. Editing and eval'ing strings feels clunky compared to what you can do with Lisp macros. (See Paul Graham's downloadable On Lisp.)

The standard library could use better documentation, despite www.ruby-doc.org. Javadoc-style documentation is weak at helping you get started with a library. (Same goes for RAA packages.)

To my mind, what we need most is an IDE for Ruby as powerful as IDEA is for Java. A few years ago, Ward Cunningham remarked to me that IDEA made using a statically typed language almost as pleasant as using the Smalltalk browser. Coming from an old-time Smalltalker, that's a stunning statement. But it's kinda true. I've been doing a lot of Java programming in the past month, and the refactoring support of IDEA is ever so wonderful. It's not impossible to have that in a dynamically-typed language: the very first refactoring browser was written for Smalltalk.

But that's pretty far afield from the core.

A lot of people bring up Ruby's slowness when this question is asked. Is it not a big problem for you? Do you think there are other niches where Ruby is fast enough?

Brian: The premise of the book is that there are lots of tasks you do that could be done by a computer. Since the computer will always be faster than you at rote work, Ruby's speed is irrelevant. The thing that's relevant is how long it takes you to get the script working. There, it matters that you use a language that allows you to express your desires quickly, and that you program in a way that keeps you from getting bogged down in complexity. Ruby works for the first, and I prefer test-driven design for the second. Even for smallish scripts, I think I'm faster when I use TDD.

Speed is a complicated emotional issue among programmers. When I came into the field, the great debate was between assembly language programmers and young whippersnappers like me who preferred C. Some assembly programmers said that C compilers could never produce good enough code to beat a competent assembly programmer, so people should program in assembly. Others said you should write in C, just put the parts that really need to be fast in assembly (though more people said that than did it). Others said you could program in C, but you should know assembly. That way, you'd know what the machine is really doing, and you could avoid faux pas like passing records as function arguments instead of pointers to records.

That issue eventually went away because C compilers got better than even good assembly programmers.

But nowadays you can hear the same debate, just with the names changed. I recall reading a posting of Joel Spolsky's in which he decried all those young whippersnappers who graduate knowing only Java. They should learn C so they know what the machine is *really* doing...

A dispassionate person, I think, would see that discussions about speed and programming languages are often only masked discussions about who deserves to be "one of us" and who doesn't, about who's demonstrated enough dedication to be trusted making technical decisions, about what the right balance is between conservatism and technophilia. It's a proxy argument. An important conversation to have, really, but I prefer to ground it in other issues (like refactoring, design patterns, and attitudes toward duplication). That lets me err on the side of using Ruby and dealing with speed when I have to, not when I'm afraid I might have to.

(This does give me the opportunity to point to the story of Mel, which captures the aesthetic dimension of the speed issue.)

What do you like most about Ruby?

Brian: How blocks and modules work together. I *need* blocks for my personal happiness, so a favorite programming language has to have them. Modules and mixins are a nice compromise between single and multiple inheritance.
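
A small illustration of blocks and modules working together, using only core Ruby: a class defines a single each method that yields to a block, mixes in the Enumerable module, and gets the rest for free. The Playlist class and its contents are made up for this sketch.

```ruby
# Playlist is a hypothetical class invented for this example.
class Playlist
  include Enumerable   # mixin: adds select, map, sort, include?, and more

  def initialize(*songs)
    @songs = songs
  end

  # The only method Enumerable needs from us: hand each item to the block.
  def each
    return enum_for(:each) unless block_given?
    @songs.each { |song| yield song }
  end
end

list = Playlist.new("Blue", "Amber", "Crimson")

# Every call below is an Enumerable method driven by a block.
long_titles = list.select { |song| song.length > 4 }
shouted     = list.map { |song| song.upcase }

puts long_titles.inspect
puts shouted.inspect
puts list.sort.first
```

Playlist still has a single superclass (Object); the shared behavior arrives through the mixin, which is the compromise between single and multiple inheritance that Brian mentions.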

What made you want to write a book about Ruby? How does your book differ from the growing collection of books out there?

Brian: I didn't start out thinking of this book as being about Ruby. At the first RubyConf, there was talk about what would help Ruby break out into the mainstream. Did it need a killer app (like Zope was for Python)? Did it need a niche (like cgi scripts were for Perl)? The answer has turned out to be a killer app - Rails. But at the time, I suggested that one niche might be testers. Many of them could really save time (and, potentially, their jobs) if they could script their repetitive tasks. But the only languages with documentation or training that reached out to them were the languages embedded into GUI testing tools, which (1) don't have library or language support for non-GUI-testing tasks, and (2) are, by and large, pretty crummy programming languages.

That percolated for a while, until Bret Pettichord and I put together a tutorial called "Scripting for Testers", which walked testers through testing a web app via Watir. But we knew that only gave a taste of scripting. What testers really needed was a book, and I wanted Bret to write it, and Bret wanted me to write it, and eventually I did.

The book teaches scripting via a series of projects, each of which automates a particular manual task. I deliberately chose tasks that do not involve test execution. A lot of the time, testers only think about automating their manual testing, and they don't script other mundane tasks that would have more bang for the buck.

As a result, people started noticing that the book had a larger audience than the title suggested. There are lots of nonprogrammers on software projects who do rote tasks that ought to be automated.

People noticed another thing, too. Testers often write simple scripts that grow and grow and grow as new demands are put on them. At some point, the scripts get confusing enough that they can't grow any more. You hit the wall of complexity, and it's common for testers to hit that wall way too soon. Because of that, the book pushes important coping techniques like test-driven design, fear of IF statements, and a loathing for duplication. A couple of programmers who reviewed the draft thought that a lot of programmers would benefit from that material. So - *poof* - programmers were in the intended audience as well, and the book became "Everyday Scripting with Ruby." I'm really hoping that doesn't limit its appeal to testers.

So the short answer is: this isn't really a book about Ruby. Hope you publish this interview anyway.

What kinds of scripts do testers write as opposed to other kinds of programmers? Can you give us a short example?

Brian: I think you'll find proportionally more scripts that set up big piles of data. Some scripts drive programs, but in an interesting way: they move the program to a particular point where a manual tester takes over for some exploratory testing.

Chris McMahon does interesting things with Ruby support of testing. For example, suppose you have a system built in front of a database (perhaps replacing an old "green screen" application). The system ought to be able to handle all the existing records in the database. So Chris might write a script that semi-randomly finds "interesting" records in the database (via SQL queries), then feeds them into the new system (via SOAP) and checks what happens.

I'm guilty of several "simple scripts that grow and grow and grow". How did you approach refactoring? What place do you see for theory-laden topics like this in a book for 'Everyman'?

Brian: For refactoring, I lean heavily on three rules of thumb: get rid of duplication, fear IF statements, and change your names a lot. You get a lot of mileage out of just being prepared to do those things. (The word "refactoring" only appears twice: once in the glossary, and once in a parenthetical comment in the main text.)

My approach to teaching is to work through examples in print, talking about them as we go. The story told in the book (especially for the second part's example) is pretty close to what really happened, mistakes and all. I even include the bugs I made. When I noticed something ugly in the code in real life and changed it, the book describes what I noticed, why I thought it was ugly, and what I did about it.

The style I teach is very reactive, not proactive. In Part IV of the book, I tell the story of a class named Barker, an abstract superclass. It started out as just a part of the Watchdog class, but then I noticed that there were three methods there that had everything to do with each other and not much to do with any other methods. I pulled them out into a single place — a class — and called it Barker (because it was the part of the watchdog that barked). The Barker happened to bark via email. After that, I added code to bark to a Jabber server (renaming Barker to MailBarker and creating a JabberBarker). Their initialize() methods happened to be identical. Duplication is bad, so I pulled the initialize method into a superclass. As I went on, I found more bits to put up there.

In the 80's, this was called "inheritance of implementation, not specification", and it was considered a Bad Thing. But I think it's an OK thing - so long as you maintain standards of code cleanliness, are willing to change your mind, and have unit tests to help catch you when you mess up.
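
To make the Barker story concrete, here is a rough sketch of where such a refactoring might end up. Only the class names Barker, MailBarker, and JabberBarker come from the interview; the method names, the recipient argument, and the string results standing in for real mail or Jabber delivery are assumptions made for illustration, not code from the book.

```ruby
# Abstract superclass: holds what the two barkers turned out to share.
class Barker
  # Originally duplicated in both subclasses, then pulled up here.
  def initialize(recipient)
    @recipient = recipient
  end

  # Shared flow: format the message, then let the subclass deliver it.
  def bark(message)
    deliver(format_message(message))
  end

  private

  def format_message(message)
    "Watchdog alert: #{message}"
  end

  # Subclasses must supply the delivery mechanism.
  def deliver(_text)
    raise NotImplementedError, "#{self.class} must implement deliver"
  end
end

class MailBarker < Barker
  private

  # Stand-in for sending email; returns a description instead.
  def deliver(text)
    "mail to #{@recipient}: #{text}"
  end
end

class JabberBarker < Barker
  private

  # Stand-in for posting to a Jabber server; returns a description instead.
  def deliver(text)
    "jabber to #{@recipient}: #{text}"
  end
end

puts MailBarker.new("ops@example.com").bark("disk almost full")
puts JabberBarker.new("ops-room").bark("disk almost full")
```

The superclass exists purely to hold shared implementation, which is the "inheritance of implementation" Brian is defending.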

So: I avoid the problem of theory-laden topics by not talking about the theory much at all.

So it sounds like you're writing about scripting in general, aiming for testers, and using Ruby as the language. The first two are pretty clear, but why Ruby? What makes you think it's the right language for "nonprogrammers . . . who do rote tasks that ought to be automated"?

Brian: I'm favoring convenience over speed. That immediately rules out languages without garbage collection.

I think having an interpreter like irb is an enormous help in learning a language, especially the first language. That rules out Java and C#. (What interpreters they have are not, I think, mainstream enough and are annoyingly not the same as the base language.) They also make you tell the language all kinds of things it could figure out itself (type declarations, for example). That's for speed and to eliminate a certain class of programmer mistake. Speed isn't an issue. Decent unit testing catches most of (but not all of) the same mistakes type checking catches, and it's not a hugely important class of mistake anyway.

I think Perl makes it easier than Ruby or Python to write hard-to-maintain programs. In Perl 5, the support for object-oriented programming is pasted on. I believe/hope that object-oriented programming is easy to learn if you don't cloak it in a lot of forbidding mysticism and theory. (Kids do learn Squeak all the time, after all.)

Visual Basic is out because it's not free and because it doesn't run on my Macintosh.

Lisp and Smalltalk aren't oriented toward processing text from files, which is a lot of what the target audience will be doing.

Other languages, like Groovy, don't have as much traction as Python or Ruby.

There's no overwhelming reason to pick Ruby over Python. I do think Ruby is an incrementally better language for lots of small reasons, but I could be wrong. It really comes down to: I've never been motivated to learn Python all that well (despite having maintained a Zope site), so if I were going to write a book, it would be in Ruby.

Since you've said this isn't a Ruby book, which Ruby books do you think people should be reading?

What non-Ruby books should Rubyists be reading?

Brian: I think the cohort of Rubyists who like thinking about programming as an end in itself (not *just* a means to an end) will like Abelson and Sussman's Structure and Interpretation of Computer Programs (long) and The Little Schemer by Friedman and Felleisen (short). (I prefer the earlier versions of _The Little Schemer_ called _The Little Lisper_, but they're out of print.)

When it comes to non-programming books, I don't care what you read: just read *some* book that makes you a more knowledgeable voter. That means something other than a "comfort book": one that will make you feel more virtuous about the opinions you already hold. Two books up on my queue are Barzun's From Dawn to Decadence: 1500 to the Present: 500 Years of Cultural Life and Armstrong's Islam: A Short History.

Friday, January 19, 2007

FamilyLearn Interview

Duane Johnson and Neal Harmon are a programmer at and the president of FamilyLearn, a small company using Ruby, Rails, and Amazon's S3 and EC2. They've ported their applications from PHP to Ruby on Rails, and have recently started a Public Beta of their iMemorybook tools. I've invited them to share some of their experiences in this interview.

To start things off, would you two please tell us a bit about yourselves and about FamilyLearn?

Neal: My wife Trisha and I started this website as a little family project over 4 years ago. After reading a wonderful compilation of stories my grandpa wrote, we decided to collect stories about our family to share with our unborn son, Michael. The project evolved, went online, and was shared with other families. So, we became a company with the same vision for the rest of the world: build the world's most enjoyed family library. We're building a place where families can preserve and share the stories of their lives. iMemoryBook is powerful software for capturing stories and publishing them as a hardbound book. The best part is it's free for families to use.

Duane: I've been a web developer ever since graduating from High School. I didn't know the difference between VB Script and Javascript in those days, so when my first boss said I'd be programming in ASP, I thought, "Great!" and they gave me a company laptop to boot. Since then, I've followed most of the webdev crowd as we've moved to happier and more sensible development languages. PHP was a nice, clean language for a large set of problems. But I was lucky enough to be an early-adopter for Ruby on Rails, as I'd been secretly using Ruby for a number of years before Rails 0.5 came out.

My first job also taught me a lot about what I wanted to focus my life energy on. I had an opportunity for a short time to help engineer the operating system software for an electronic slot machine. I found out quickly that getting paid for something I couldn't give my heart to was bad for both me and my employer. A number of jobs later, I'm delighted to work for a company that believes in family and preserving relationships. FamilyLearn is a smart company with a great leader.

How did you decide on Ruby and Rails for your development platform?

Duane: From my angle, as an experienced web developer, I wanted to enjoy what I do. I got pretty tired of re-inventing the wheel in PHP, or, if you'll pardon the analogy, trying to fit a Honda engine into a Jetta. You can find anything for PHP, but getting the whole system to work right together was a real challenge for me, one that kept me up at night and eventually led to frustration and a search for something better. When I found out about Ruby on Rails, it just had to have "Ruby" in its name and I was already hooked.

Since Neal was already hearing good things about Rails, I think he just needed the right players with availability and he was willing to see what we could do.

Neal: I chose to give Ruby on Rails a try after reading "Agile Web Development with Rails". The framework conventions seemed to make programmers happier and more productive. I saw impressive demonstrations of the development speed of the technology. Also, I liked using nearly all the websites I encountered that had been built with Rails. They were, for the most part, simple and straightforward.

Our company faced some rapid growth on a hodge-podge of loosely tied together PHP applications, along with some Perl, LaTeX and TeX. The database reflected numerous shifts in the business' direction and changes in programmers. Bringing on new engineers to help the iMemoryBook service grow quickly revealed the weaknesses in our mess of code and we ultimately decided to green field our new iMemoryBook application completely in Ruby on Rails. I made the decision hoping for some miraculous development times before our company outgrew the old code (we were already having some serious growing pains).

What challenges has that created for you?

Neal: Initially, Rails didn't prove as fast in development times as I had hoped (part of it was that two of us on the project were getting into Rails for the first time). It was a slow start, but the benefits of the framework and our Ruby guru's (Duane) approach began to shine as the code base grew.

Duane: Choosing Ruby on Rails meant green-fielding the first version that Neal had put so much time and effort into. We couldn't re-use a scrap of code once we chose Ruby. But as a testament to his flexibility and personal humility, Neal was willing to take a leap of faith: we built the database structure from scratch so that it would fit cleanly with ActiveRecord's expectations of what a database should look like. When Neal would look at the new database, he'd laugh so hard and say, "You can tell I wasn't a programmer when I started this thing, can't you?" This has been a blessing in the long run, but rebuilding what was basically an already functioning system was a real frustration for everyone at first.

Another area where Rails has not been kind to us is in its speed. Our iMemoryBook system is a computationally intensive process — sometimes taking minutes to complete a task (such as converting a TeX book into PDF format), which means that a whole process can be tied up for that long. We're still trying to figure out how to get this to work on a large scale.

Neal: When I asked Paul (our other developer) about this, he said:

Ruby itself seems just as fast as the next scripting language. It's not C, but nothing but C is. ... There are a couple of problems with rails/mongrel in a production environment. Ruby/Rails encourages lazy programming. I think the philosophy is something like, "don't work for the computer, let it work for you". If you are not careful you will be doing things the ruby way and not realize that you are hitting your database multiple times on every iteration of a loop. This just won't work in real life. You still have to remember you are working with finite hardware.
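
The loop problem Paul describes can be shown without a real database. The sketch below uses a made-up FakeDatabase class that simply counts queries; the class, the bylines_slow and bylines_fast methods, and the sample data are all assumptions for illustration and are not FamilyLearn's code or the ActiveRecord API.

```ruby
# Hypothetical stand-in for a database: it only counts how often it is asked.
class FakeDatabase
  attr_reader :query_count

  def initialize(authors)
    @authors = authors      # author id => author name
    @query_count = 0
  end

  # One query per call, returning a single author name.
  def find_author(id)
    @query_count += 1
    @authors[id]
  end

  # One query per call, returning every requested author at once.
  def find_authors(ids)
    @query_count += 1
    @authors.select { |id, _name| ids.include?(id) }
  end
end

# Convenient to write, but issues one query per story.
def bylines_slow(db, stories)
  stories.map { |story| "#{story[:title]} by #{db.find_author(story[:author_id])}" }
end

# Fetches all the authors up front, then loops over data already in memory.
def bylines_fast(db, stories)
  authors = db.find_authors(stories.map { |story| story[:author_id] }.uniq)
  stories.map { |story| "#{story[:title]} by #{authors[story[:author_id]]}" }
end

authors = { 1 => "Neal", 2 => "Duane" }
stories = [
  { title: "Grandpa's Farm", author_id: 1 },
  { title: "First Job",      author_id: 2 },
  { title: "The Road Trip",  author_id: 1 }
]

slow_db = FakeDatabase.new(authors)
fast_db = FakeDatabase.new(authors)
bylines_slow(slow_db, stories)
bylines_fast(fast_db, stories)
puts "slow version: #{slow_db.query_count} queries"
puts "fast version: #{fast_db.query_count} query"
```

Both versions produce the same bylines; the difference is that the first one's query count grows with the number of stories while the second stays constant.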

Duane: For our system, Ruby hasn't been the bottleneck. By far, using LaTeX in the back-end has been our challenge. It's just so hard to get speed improvements out of that system when it isn't really meant to be creating PDFs on-the-fly for each change the user makes. Once we address that issue with some caching techniques, we'll be able to see if Ruby becomes a bottleneck.

In addition to using Ruby on Rails, you guys have been using the Amazon EC2 and S3 offerings. I'd like to spend a little bit of time talking about them too. Looking back on the past couple of months (the time you've been working with Amazon) what's stood out as the good and the bad?

Neal: Except for the struggles to successfully launch a SUSE virtual machine, everything has worked well for us so far. I have seen a few slowdowns in our website that are not due to our own traffic. I suspect that the grid got hit by too much traffic altogether and we weren't really getting our full virtual machine resources or we didn't have access to enough bandwidth. But, I don't know...we haven't done sufficient testing to determine if Amazon is the problem.

Duane, there's a lot of movement around S3 libraries for Ruby right now. Are the Amazon libraries good enough for you, or are you looking at any of the third-party offerings?

Duane: We've wanted some better tools. While the S3 sample ruby library that Amazon provided was a good starting point, it didn't seem to take advantage of some of Ruby's idioms. In other words, I guess it didn't feel like a Ruby library to me, and so it fell into that "itch that needs to be scratched" category. I've taken a quick look at some of the nicer offerings that are now available, but since our code seems to do well enough with the current implementation, we haven't had a need to go fix it. In the words of my coworker, Paul Jones, "on a scale from terrible to beautiful, this code is 'working'".

What tools do you find lacking?

Duane: We've wished that some of our FTP tools would support S3. I use Panic's "Transmit" application for the Mac, and I've heard rumors that they may be implementing a solution soon. It would also be nice to have a more centralized "control panel" or something for all of the common tasks on EC2. The command line is a bare-bones kind of tool that gets the job done--with a lot of reference help. I'd like to see some kind of web-based administration system for tasks like backing up / duplicating your EC2 image, starting new instances of common images, and browsing what other shared images people have created.

How is working with EC2 and S3 different than developing for a local system?

Duane: There are some pros and cons to using S3 for file storage. Some of the benefits are widely known, such as infinite, scalable storage space and very fast content delivery. We've found some disadvantages, however, that have made things a little painful, but still workable. For example, we do a lot of image resizing in our application, so we have to decide between caching on S3 for speed and getting scaled / rotated images directly from our server. We've kind of struck a balance so far where some of our "probably not going to change very often" images are stored on S3 while our "probably going to change" images are generated dynamically. Another area that has sometimes surprised us is that uploads to S3 will fail inexplicably and so our code has to take retries into account. Other than that, we've been quite happy with the fast transfer times between the two systems. Amazon has definitely thought out how the two systems orthogonally complement one another.
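
Taking retries into account might look something like the sketch below. It is not FamilyLearn's code and it does not call any real S3 library: the with_retries helper, the UploadFailed error, and the simulated upload in the block are all hypothetical, and the doubling delay between attempts is just one common choice.

```ruby
# Hypothetical error raised by whatever code performs the upload.
class UploadFailed < StandardError; end

# Runs the block, retrying on UploadFailed up to max_attempts times.
# The wait doubles after each failure; sleeper is injectable so that
# tests and demos do not have to actually wait.
def with_retries(max_attempts: 3, base_delay: 0.5, sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue UploadFailed
    raise if attempt >= max_attempts       # out of attempts: give up
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry                                  # run the begin block again
  end
end

# Simulated upload that fails twice before succeeding.
result = with_retries(base_delay: 0, sleeper: ->(_seconds) {}) do |attempt|
  raise UploadFailed, "simulated failure" if attempt < 3
  "stored on attempt #{attempt}"
end

puts result
```

In real use the block would contain the actual upload call, and the rescue clause would name whatever exception that library raises.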

Neal, what changes would you like to see in the EC2 or S3 offerings?

Neal: I would like to see pricing that scales with a business. The day will come when Amazon will be more expensive than building our own system. We will need to switch when that day arrives. I suspect, even hope, there will be a company who will solve that problem for us before we get there. It'd be great if it were Amazon.

What advice would you give to someone who's looking at the Amazon Suite?

Duane: I think it's a great solution for small developers who intend to grow quickly. While a stand-alone server will work for quite some time, it's always painful when you hit that "wall" and realize your configuration isn't going to hold up against all of the traffic. Being able to turn on another server and/or rely on the distributed storage of Amazon is a decent way to scale on a budget.

Neal: Use both. Bandwidth between S3 and EC2 is basically free. It saves us a lot in image processing.

What advice would you give to someone who's looking at migrating to Ruby on Rails?

Neal: Consider hiring one RoR developer to guide your project and hiring other developers to learn RoR while helping with the project. It's difficult to find good RoR people.

Duane: Make the switch, and don't look back! Seriously though, if you're someone who takes pride in what you do as a programmer, Ruby is a great language with a great supporting culture. There are a lot of computer-scientist types in the Ruby world that make for good examples wherever you have weaknesses. And of course, the push for "convention over configuration" on the Rails platform is generally a good thing when it comes to quickly learning how to get things done.

I've often been saved by doing something the "Rails way" and finding out later that a convention I'd followed earlier was smarter than it looked at face value. Web development isn't a new thing anymore, and we shouldn't have to be re-thinking the whole business each time a new application is built. That's why I said good-bye to PHP two and a half years ago.

Technorati tags: Ruby Rails interviews

Monday, January 15, 2007

Serial rubinius Interview: Episode V

This week's episode hits three points: Evan's next steps with rubinius, garnet (the new name for cuby), and Software Transactional Memory (STM). MenTaLguY joins us for the third discussion, and provides a long list of resources to help get up to speed. Happy reading, and happier hacking!

Evan, after hiding out in your secret lair to hack on the newly released GC, what's next on your coding schedule?

Evan: Well, I've just committed some error reporting code. The code is invoked when rubinius crashes and prints out a C backtrace, ruby backtrace, and part of the rubinius stack.

I've got a few tasks that I see as coming up next:

Finish up RNI (or whatever the name will be). It's the VM interface to external C functions. This includes a shim library which has the same API as the 1.8 C interface, so current C extensions will work under rubinius with little or no change.
Work to stabilize the compiler running under shotgun. Currently, lots of people are running into problems with it. The problems lie in the implementation of the core methods that the compiler uses, so fixing the problem would in fact be ironing out the core and kernel of the VM.
Add method lookup caching. This probably would come in a few different forms.
- A global cache, like the one in 1.8. This is easiest to implement.
- A per class cache. This cache is faster than a global cache, but requires more complicated logic to support flushing the cache (an important operation).
- Inline cache. The fastest, but again, the most difficult to implement. This is the caching that the fastest Smalltalk implementations use.
Tail-call recursion optimization. After a discussion on #rubinius, I realized how easy it will be to integrate tail-call optimization to the compiler and VM.

Most likely, I'll finish RNI first. I prefer to finish a major task before going on to another task.
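
For readers who haven't met method lookup caching before, here is a toy model of the simplest variant Evan lists, the global cache, written in ordinary Ruby. It is not how rubinius or Ruby 1.8 implement it: the GlobalMethodCache class and its methods are invented for illustration, and the "slow path" is approximated by walking a class's ancestors.

```ruby
# Toy model of a global method cache (hypothetical, for illustration only).
class GlobalMethodCache
  attr_reader :hits, :misses

  def initialize
    @entries = {}   # [class, method name] => module that defines the method
    @hits = 0
    @misses = 0
  end

  # Returns the class or module that defines `name` for instances of `klass`.
  def lookup(klass, name)
    key = [klass, name]
    if @entries.key?(key)
      @hits += 1
      return @entries[key]
    end

    @misses += 1
    # Slow path: walk the ancestor chain looking for a public or
    # protected instance method with this name.
    owner = klass.ancestors.find { |mod| mod.instance_methods(false).include?(name) }
    @entries[key] = owner
  end

  # Any method (re)definition could make entries stale, so the crude
  # global answer is to throw everything away.
  def flush
    @entries.clear
  end
end

cache = GlobalMethodCache.new
cache.lookup(Array, :each_slice)   # miss: found by walking the ancestors
cache.lookup(Array, :each_slice)   # hit: answered from the cache
puts "hits=#{cache.hits} misses=#{cache.misses}"
```

The flush method hints at why the per-class and inline variants are harder: they trade this blunt invalidation for finer-grained bookkeeping.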

Wilson, Evan tells me you're behind the cuby -> garnet name change. What's up with that? And, more importantly, are we going to see any more additions to garnet (say, a tutorial, or some docs) in the next couple of weeks?

Wilson: I have spoken to a (large) number of people about Rubinius since the interviews began. When Cuby comes up, almost everyone says "What? How do you pronounce that?" or similar. Garnet came to mind while trying to think of things 'lower level' than Ruby. Hopefully I can stop explaining how to pronounce it now.

A tutorial will definitely be forthcoming. I'm hoping to have one written up in the next week or so, partly to teach myself what it is good for. Heh.

Evan: Garnet docs will begin to appear soon I hope, as soon as I've gotten the GC stabilized and I begin to actually use Garnet.

Recently, MenTaLguY has been hanging out in #rubinius and talking a lot about STM. What is it, and how does it fit into your plans for Rubinius?

MenTaLguY: STM stands for "Software Transactional Memory". Operations on shared data are grouped into atomic transactions which either succeed or fail as a whole. While one thread is performing a transaction, none of its changes will be made visible to other threads until the transaction is complete. If there is an error during the transaction, or if the transaction ends up conflicting with others, its changes are discarded and an appropriate error reported -- at which point the offending thread is free to try again, with the other threads' updates taken into account.

A common misconception about STM is that it doesn't involve locks at all. In reality, as with database transactions, some implementations use locks directly to implement transactions, while others do not. The important thing is that you, the user of the STM API, don't need to deal with locks directly. It's a bit like what automatic garbage collection does for memory management, making it easier to write composable abstractions.
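
A deliberately tiny sketch may help make the idea concrete. This is not MenTaLguY's Ruby STM and not what rubinius will use: the names TVar, Transaction, Conflict, and atomically are assumptions chosen for this example. It is one of the lock-using designs mentioned above: a single internal mutex guards snapshots and commits, but the code using the API never touches a lock.

```ruby
# A transactional variable: a value plus a version that bumps on every commit.
class TVar
  attr_reader :value, :version

  def initialize(value)
    @value = value
    @version = 0
  end

  # Called only by Transaction#commit while the commit lock is held.
  def install(value)
    @value = value
    @version += 1
  end
end

class Conflict < StandardError; end

class Transaction
  COMMIT_LOCK = Mutex.new   # internal detail, hidden from API users

  def initialize
    @seen = {}      # TVar => value observed on first read
    @versions = {}  # TVar => version observed on first read
    @writes = {}    # TVar => buffered new value
  end

  def read(tvar)
    return @writes[tvar] if @writes.key?(tvar)
    unless @seen.key?(tvar)
      COMMIT_LOCK.synchronize do
        @seen[tvar] = tvar.value
        @versions[tvar] = tvar.version
      end
    end
    @seen[tvar]
  end

  # Writes are buffered; other threads see nothing until commit succeeds.
  def write(tvar, value)
    @writes[tvar] = value
  end

  # All or nothing: fail if anything we read has changed since we read it.
  def commit
    COMMIT_LOCK.synchronize do
      stale = @versions.any? { |tvar, version| tvar.version != version }
      raise Conflict if stale
      @writes.each { |tvar, value| tvar.install(value) }
    end
  end
end

# Runs the block as a transaction, retrying from scratch on conflict.
def atomically
  loop do
    tx = Transaction.new
    begin
      result = yield tx
      tx.commit
      return result
    rescue Conflict
      # Buffered writes are simply dropped; go around again.
    end
  end
end

checking = TVar.new(100)
savings  = TVar.new(0)

threads = 10.times.map do
  Thread.new do
    5.times do
      atomically do |tx|
        tx.write(checking, tx.read(checking) - 1)
        tx.write(savings,  tx.read(savings) + 1)
      end
    end
  end
end
threads.each(&:join)

puts "checking=#{checking.value} savings=#{savings.value}"
```

The sketch leaves out the retry and orElse primitives discussed next, and a variable that is written without first being read is not checked for conflicts, so treat it as a teaching aid rather than a usable library.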

It's worth noting that there are a number of other models competing with STM, including dataflow concurrency (lazy evaluation and futures), actors (a la Erlang), and various realizations of the join calculus (per funnel, Join, etc.), but I think STM and the join calculus are currently the most promising. STM is interesting in that, while it's been an active research topic for quite a while, it didn't really "break through" until 2005, when the paper "Composable Memory Transactions" (Harris et al.) introduced key primitives for blocking (retry) and choice (orElse). [Editor's note: See below for a great list of links about this stuff.]

My personal plans for Rubinius are really pretty modest; I heard Evan and Wilson were considering STM for use in Rubinius, and I volunteered my existing STM-in-Ruby implementation in the hope that it would save them some work. Time permitting, I'd also like to implement the other concurrency models, whether atop STM or in some other way.

Evan: STM is going to be used in rubinius to hopefully make it easier to write multi-threaded code. My hope is that the core libraries will use STM so that out of the box they work properly when used multithreaded. And btw, when I say multithreaded, I mean both green and native threads.

Wilson: MenTaLguY has covered the options in great detail, and I'm glad to have him on board. I intend to leech him of concurrency knowledge like some kind of psychic vampire.

I'm a big fan of STM for two main reasons:

It lets you write concurrent library code without worrying that user code will break it. This is a huge problem for fine-grained threading.
The major CPU developers are planning to implement it in hardware in the coming years. That should offer a nice speed boost, and offer yet another excuse to spend money on computer hardware.

MenTaLguY: Hopefully we'll see hardware support for sub-transactions, as that's what allows interesting stuff like 'retry' and 'orElse' to work, and hopefully the OS APIs won't bind transactions to hardware threads (maybe someone needs to get on them about that once such APIs begin to be developed).

MenTaLguY was also kind enough to provide a list of links if you want to read more about STM and its alternatives:

Blog posts:

Composability and Productivity — About the importance of composable abstractions.

STM Comes to Pugs — The post that inspired my original Ruby STM efforts

Concurrency Five Ways — My own overview of different concurrency options.

Wikipedia articles:

Software Transactional Memory

Join Calculus

Actor Model

Dataflow

Languages/Libraries:

Ruby STM — A draft of my own STM implementation for MRI.

lazy.rb — My implementation of promises and futures for MRI. Last released version is rather out-of-date, and predates fastthread.

LibLTX — Rob Ennals' papers and demo lockful STM implementation for C

Cω — C# extended for Join Calculus

Funnel — Based on Join Calculus

Futures in Alice — Example of futures for data-flow concurrency

Papers:

Composable Memory Transactions — The paper that started it all.

Software Transactional Memory should not be Obstruction-Free — STM with locks, and why.

The Joins Concurrency Library — Joins in regular C# 2.0, courtesy of generics

This episode is brought to you by: Garbage Collection: Algorithms for Automatic Dynamic Memory Management.

Previous episodes of the Serial Rubinius Interview are available here:

Episode 1, in which we talk about the rubinius community
Episode 2, in which we talk about cuby and testing tools
Episode 3, in which we talk about rubinius documentation
Episode 4, in which we talk about cooperation with the JRuby team and YARV

Episode 1, in which Charles, Thomas, and Ola talk about their plans for JRuby.
Episode 2, in which Charles, Thomas, and Ola talk about cooperation with the rubinius team and YARV.

Friday, January 12, 2007

Button, Button, Who's Got The Button?

Wow, the nice artists over at Apress have put together a button for this month's How Rails Made Me a Better Programmer blogging contest. If you've already submitted an entry, feel free to grab the image and use it with your essay. If you haven't entered yet, what's keeping you? Time's running out. Just go read the challenge and leave a link to your entry in the comments.

I'm hoping the Apress artists will come up with a special button for the winner too. Hey, three books and a button — what more could you ask for?

Update: I just realized that the button was an animated gif ... *sigh*. I've pulled it until we can come up with something better. Sorry.

Update again: Okay, new image, no animation, less ad-like.

Will rubinius Be An Acceptable Lisp

Yesterday (Wednesday, January 10th, 2007), there was a short discussion on the #rubinius irc channel which prompted a few questions which I thought would be best asked and answered here. Before I get to them though, I thought I'd share some context:

olabini: pate said something about Lisp on rubinius? wanna elaborate Evan? I'm drooling at the thought...

evan: Sure, since i started the project, i've always wanted to write a parser for a lisp-dialect that i could feed to the existing compiler/assembler

olabini: Yeah, pate said so. it sounds very interesting. My first thought is it sounds nice to be able to mix and match ruby and lisp within the same runtime. The basic Ruby operations should make it very simple to create a basic Lisp dialect.

So, what kinds of uses do you think Lisp on rubinius would really see?

MenTaLguY: Well, speculatively, it'd pretty much be a Lisp dialect with Ruby's smalltalk-esque object system bolted on, I guess. Maybe otherwise undefined functions would be tried as method calls on their first argument? Most likely people would be doing things like using lisp macros to glue together DSLs that would be grotty in straight Ruby. You'd probably also see Ruby methods implemented in Risp when there was a compelling reason to do so (e.g. because it was easier in Lisp or because the author had a Lisp fetish).

It's interesting to note that (unless I'm making this up) Ruby itself started out as "matzlisp", a Scheme dialect (as still evidenced by many things like Bignums, the numeric tower, callcc, immediate types packed into VALUE, symbols, and the general feel of the C API). So it'd be sort of a return to Ruby's roots.
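Editor's note: MenTaLguY's idea of trying otherwise undefined functions as method calls on their first argument can be sketched in plain Ruby. This is only a guess at how such a dispatch rule might look; risp_apply and the functions table are hypothetical names, not part of any Rubinius code.

```ruby
# Hypothetical dispatcher: +functions+ maps Symbol names to callables that
# were defined on the Lisp side.
def risp_apply(functions, name, *args)
  # A function defined in the Lisp dialect wins if there is one.
  return functions.fetch(name).call(*args) if functions.key?(name)

  # Otherwise treat (name receiver arg...) as receiver.name(arg...).
  receiver, *rest = args
  receiver.public_send(name, *rest)
end

functions = { double: ->(n) { n * 2 } }

doubled = risp_apply(functions, :double, 21)     # uses the defined function
shouted = risp_apply(functions, :upcase, "risp") # falls back to "risp".upcase
summed  = risp_apply(functions, :+, 1, 2)        # falls back to 1.+(2)
```

With a rule like this, (upcase "risp") in the Lisp dialect would reach String#upcase without anyone writing a wrapper for it.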

Ola: For me it's a matter of taste. I love those parentheses.

No, what's more interesting is combining Ruby with code that can do real macros. That's one thing I really miss in Ruby, and also something I have written about in my blog on occasion. The most prominent use case would be writing some of the code in pure, simple Lisp, having a few macros that transform it Common Lisp-style, and being able to require that file and call the methods it defines from regular Ruby code. Once again it's one of those best-of-both-worlds things. If it were possible for me to write Lisp and fall back on Ruby when I wanted to, all integrated with Ruby, it would be like heaven. Almost. =)

Evan: I'm motivated by the ability to have a simple language that I can use to write tests to stress different parts of the VM. It should be noted that I've only really done lisp once and it was for class. I'm mainly drawn to it because the parser is dead simple and the grammar could use all the functionality of the VM.

Is this really putting the cart before the horse? How do you know rubinius will even be stable/performant enough to handle this?

MenTaLguY: No. It's always time for Lisp.

Realistically, I'm sure Ruby-on-Rubinius is going to take priority anyway.

Ola: Regarding performance, it's very easy to get Lisp performing well enough. And since Rubinius is built on a Smalltalk VM architecture, its operations fit very well with Lisp. A Lisp on Rubinius would perform as well as Ruby on Rubinius, no doubt. (Remember that the first Smalltalks were implemented in Lisp, btw).

Regarding stable... Well, another language will use the machine in different ways, which would possibly help increase stability. And the more interesting use cases you can find for Rubinius, the more hype it will generate, and the more people will contribute. So, the question isn't really if it's stable enough. Stability can be a consequence of it instead.

MenTaLguY: If it's stable/performant enough for Ruby, it's stable/performant enough for Lisp. But let's say it's not -- then the demands of implementing Risp will make it so. Everyone wins.

Evan: Sure, it will run at the same speed as ruby; it's going to get compiled down and run on the VM. It should be noted that when I said lisp, I actually said a lisp-dialect. I don't intend for this to conform to ANY lisp standard out there. I only expect to have a lisp that lets me perform the same operations you can do in ruby.

Would a programmer really be able to mix and match as they went along?

Ola: They should be. The probable delimiter would be on a per-file basis, or possibly a RubyInline variation.

Evan: Since the lisp will compile to rubinius bytecode, sure!

MenTaLguY: Don't see why not. Mixing in the same file, though? I don't know. Maybe by playing Dylan-like syntax games. The other option would be string evals, but I have a rather low opinion of those in general.

What are some of the bigger technical issues you see involved in making this happen?

Evan: Well, writing a simple lisp parser first, and then continuing to stabilize the VM and compiler.

Ola: There are no real technical issues. Implementing a basic Lisp is very easy. Rather, the two big difficulties I see are not technical at all.

First of all, some of us need to have enough time to do it. Secondly, we should decide some ground rules for which Lisp constructs should map to which Ruby constructs. (For example, should all Lisp methods be added to Object, or can we add something like CL packages that map to modules or classes? Should it be possible to use DSTR syntax in the RL Strings? (DSTR is strings like "abc #{3+2}")) Lots of fun issues to think about.

Another thing that needs to be decided is how the macro support should look. Should it be hygienic or all-powerful? Or both?

MenTaLguY: Figuring out what to take from other Lisps. I don't expect Risp to be a Common Lisp -- the Common Lisp standard library is nice, and it's standard, but it's huge, maybe a little redundant and the naming conventions are so ... un-Ruby.

What follows are personal preferences; don't take this as any kind of statement about what Rubinius might actually do.

As far as syntax goes, I'd like to take some cues from MISP actually. (I've collected links to MISP postings here.)

In particular, I'd like to see this familiar-looking syntax from MISP:


{|x| ...}

being shorthand for:


(fn (x) (...))

[fn being MISP's name for lambda]

An important thing being that this would be shorthand (as in Arc), not new syntax (as in Dylan).

Other than MISP, probably stealing anything that isn't nailed down from Paul Graham's Arc writings. At least the bits that make sense.

One other big deal is going to be making arrays pleasant to use. Since people are going to want to use Risp with the Ruby libraries, arrays are going to have to be more natural to work with in Risp than most Lisp dialects make them.

I think that means at least that Arrays should be usable anywhere a list can (the cdr/tail/rest of an Array would be an external iterator into that array which duck-types as a Pair and #to_a's to an appropriate Array slice).
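Editor's note: here is one way to picture that last idea in Ruby, as a sketch under my own assumptions. ArrayCursor is a hypothetical class (nothing like it exists in Rubinius as far as I know), car and cdr are the Pair protocol I assume Risp would use, and nil stands in for the empty list.

```ruby
# Hypothetical: a view of +array+ starting at +offset+ that answers the Pair
# protocol (car/cdr) without copying the underlying Array.
class ArrayCursor
  def initialize(array, offset = 0)
    @array  = array
    @offset = offset
  end

  # First element of the view.
  def car
    @array[@offset]
  end

  # The rest of the view, or nil (assumed here to mean the empty list).
  def cdr
    rest = @offset + 1
    rest < @array.size ? ArrayCursor.new(@array, rest) : nil
  end

  # Materialize the view as an ordinary Array slice.
  def to_a
    @array[@offset..-1]
  end
end

list = ArrayCursor.new([1, 2, 3])
list.car         # 1
list.cdr.car     # 2
list.cdr.to_a    # [2, 3]
list.cdr.cdr.cdr # nil
```

Because cdr only bumps an offset, walking an Array list-style allocates a small cursor per step instead of a fresh Array slice.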

Ola, you've mentioned wanting to see a Java based rubinius. Does this mean we should be watching out for JLispR or something?

Ola: You can count on a JRuby-based Rubinius. It will happen. And making a Lispinius possible on JRuby would be a major reason for it.

MenTaLguY: JRisp?

And for the question that's on everyone's mind: might this be a way to get macros into Ruby?

MenTaLguY: Hygienic macros, please. The fact that they avoid name collisions is just icing — their real value is that they're easier to reason about and an IDE can do smarter things with them.

Ola: Probably not. Or not completely. It would allow macros to exist in the Ruby VM, but not actually expanding Ruby-code.

Evan: It depends. Lisp macros are easy because they take lisp code/data in and output lisp code/data. For ruby to have macros, it has to be able to take ruby code in and at least output something the compiler understands. A macro that accepts ruby code on its input would have to incorporate that code into the output, which would mean either outputting ruby code or converting the ruby code into another representation to be incorporated and output.

That means you probably couldn't have a macro that takes ruby code in and just outputs lisp because it would be lisp with ruby code as strings stuck in the middle.

Something that is possible is to leverage the fact that currently, the compiler takes sexp's as input and turns that into bytecode. So perhaps you could do something like...


(macro debug (code) (
(if $DEBUG (
   ('puts (to_sexp code))
))
))

so that code is ruby code that is converted to a sexp at compile time and then integrated into the output from the macro, which the compiler then processes.

But it might not work either, I've just come up with this off the top of my head. :)
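Editor's note: the core of Evan's idea, a macro as a function from one sexp to another, can be mocked up in plain Ruby with nested Arrays. The node shapes below are invented for illustration and are not the sexp format the rubinius compiler actually consumes.

```ruby
# Hypothetical node shapes: [:if, cond, then_branch, else_branch],
# [:gvar, name], [:call, receiver, method, *args], [:str, text].

# Wrap a piece of code (already a sexp) in "if $DEBUG then puts <its sexp>".
def expand_debug(code)
  [:if, [:gvar, :$DEBUG],
        [:call, nil, :puts, [:str, code.inspect]],
        nil]
end

expanded = expand_debug([:call, nil, :launch_rockets])
```

The macro never sees Ruby source text, only the tree, which is what would let a compiler splice the result straight back into the code it is compiling.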

Thursday, January 11, 2007

Refactoring Ruby

Over in my 2007 Ruby prediction post on Linux Journal, I wrote:

Refactoring tools: This is something I think there's just too much clamor for (and too much momentum toward) not to hit in 2007. The JRuby team is making steady progress in NetBeans and Eclipse while weird, wonderful things are being done with code rewriting on top of ParseTree and other tools. This year, we'll be able to stop saying "Yeah, there aren't any tools, but Ruby is still really easy to refactor."

At the time, I had absolutely no clue that Jay Fields, et al. were going to translate Refactoring into Ruby.

I'd done a translation of the code and the refactorings in the first chapter myself when I was first learning Ruby. It was a great way to figure some things out. I'm really excited to hear that they're working on the whole book though (not just translating, they're going to include some Ruby specific refactoring and other content). I think this is great news.

Here's the original news break and here's a look at the beginning of the book. I'd love to know more about the project; hopefully Jay will keep posting updates (as well as the promised translation).

Super special thanks go to Martin Fowler for giving his permission for this to happen.

Advance Directives For Hosted Projects

Well, it looks like I'm not the only one thinking about this problem. Dries Buytaert blogged about it (from a drupal perspective) a while ago. I'll be interested to see if this idea is useful to the Drupal community.

My original post also seems to be making the rounds, having shown up over on the squeak smalltalk list and on LWN (in a subscription only page, for now). I'm glad that people are talking about it, and I'm glad that I'm getting some feedback (mostly off line).

I'm supposed to go talk to a lawyer/law professor Friday night, so I hope to get some more feedback then. If you've got specific thoughts or issues you'd like me to include in the discussion, please drop a comment below.

Monday, January 08, 2007

rubinius Serial Interview: Episode IV

This time around, we're talking about cooperation between JRuby and rubinius developers. It's a timely topic, since Nick Sieger is starting a Serial JRuby Interview. Good times.

Evan wasn't available this time around ... I understand he's in seclusion working on rubinius' Garbage Collection. I'd rather have him get that right than respond to these — I think you'll agree.

I've been really excited to see all the discussion between the JRuby developers and rubinius developers -- in fact, Nick Sieger (who's been hanging around in #rubinius) is now a committer for the JRuby project. What benefits do you see coming from this kind of cooperation?

Wilson: It's a pretty good time, really. JRuby people share cool optimizations from Hotspot, and Rubinius people scour the net for research papers to steal from. I hope to see the two groups collaborate on RCRs over at http://rcrchive.net. I think everyone agrees, for example, that the class variable behavior in Ruby 1.8 is insane. As we get around to implementing 'rough edges' of Ruby, we are going to write up requests to smooth them out. Hopefully Nick will vote for mine, and I'll vote for his. Heh.
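Editor's note: Wilson doesn't say which class variable behavior he has in mind, so take this as my assumption. One surprise people often point to is that a class variable is shared by a class and all of its subclasses, so an assignment in a subclass quietly changes the parent. Parent and Child are made-up names.

```ruby
class Parent
  @@setting = :parent

  def self.setting
    @@setting
  end
end

class Child < Parent
  # This does not create a separate Child variable; it assigns Parent's.
  @@setting = :child
end

Parent.setting # :child
```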

Nick: Well, certainly we hope to all mutually benefit and better each others' efforts. I think it's significant to see such a deep level of cooperation and camaraderie from two teams with similar goals. We could compete against each other for the title of fastest or most fully-functional ruby implementation, but the friendliness of the Ruby community for one seems to have a big effect on our working together.

If you could get the core Ruby and alternative VM implementers together for a week, what issues would you like to see them focus on?

Nick: Being new to the implementation game, with no examples gained from first-hand pains re-implementing Ruby that need to be improved, I would say the first thing to do would be to just brainstorm and agree upon a top-N list of things that need to be fixed. Several areas of Ruby need whittling back and don't have to be so complicated. Re-examine the most common usage patterns and "say no" to those edge case features that have little value.

If I were to try to name specifics, some of the things going around IRC lately are threading/concurrency (needs to be easier to do right and do well) and language features that are too intertwined with the parser/syntax tree (e.g., "defined?") that make implementors' lives difficult.

Wilson: I think that week could be profitably spent writing test cases and working on the spec framework. A comprehensive set of executable specs will help everyone.

Alternately, we could all get drunk and read through the JVM Hotspot source code.

The biggest news in Ruby land right now is probably the announcement of a fully merged YARV. What effect does this have on rubinius?

Nick: Well, it's more visible fresh blood, that should be more accessible to Ruby hackers. Hopefully some of the cross-platform compilation and execution issues that exist will go away soon and make it easy for Rubinius and JRuby hackers to steal — umm, incorporate more ideas that might allow for more implementation sharing in the future. For example, a shared or compatible bytecode.

Wilson: I think it means the race is on. Heh.

One tricky aspect is that YARV is targeting 1.9. JRuby is aimed at 1.8, while Rubinius is forming up on something like "1.9 transitional". By that I mean that we sometimes take the 'low road' and implement things in a way that is compatible with existing code, but perhaps not the letter of the ruby1.8.5 law.

An example: the defined? keyword in Rubinius returns true or false. Ruby 1.8.5 returns a string or nil.

That sounds like an incompatibility, but it turns out that every scrap of public Ruby code uses defined? as the condition of an if/unless statement. True or false works fine, and is a lot easier to implement.
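Editor's note: for anyone who hasn't looked closely at defined?, here is what the string-or-nil behavior Wilson describes looks like in MRI, and why most code never notices the difference. The variable names are mine.

```ruby
name = "rubinius"

kinds = {
  local:    defined?(name),           # "local-variable"
  constant: defined?(String),         # "constant"
  method:   defined?(puts),           # "method"
  missing:  defined?(NoSuchConstant)  # nil
}

# The usual idiom only cares whether the result is truthy, so an
# implementation returning true/false would behave the same here.
status = defined?(NoSuchConstant) ? "present" : "absent"
```

Code that inspects the returned string, for example comparing it to "method", is where a true/false implementation would differ.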

I'm not sure if that makes Rubinius an 'Opinionated VM' or not. I know that I'd rather get a feature working and move to the next one than spend a week on all the implementation details from 1.8. We can get our heads together later and decide what the real Ruby Spec is.

This episode is brought to you by: Code Craft. It's a new book from No Starch Press about becoming a better programmer, good stuff.

Previous episodes of the Serial Rubinius Interview are available here:

Episode 1, in which we talk about the rubinius community
Episode 2, in which we talk about cuby and testing tools
Episode 3, in which we talk about rubinius documentation

Episode 1, in which Charles, Thomas, and Ola talk about their plans for JRuby.

The Five Things Meme

Now that Charles Nutter has tagged me with the Five Things meme, I guess I ought to give it a shot. Here are five things you probably don't know about me:

I never intended to get into computing as anything more than a hobby. I planned on being a chemical engineer instead.
I joined the army straight out of High School to earn money to go to college, and got involved in computers and networks. Then when I got out, Boeing offered me a job and I just never found my way back to school.
(From the above) I'm self-taught (well, I have an unorthodox education — I've learned a lot from a lot of people)... no degree, only a few college level classes taken here and there.
I used to manage the Koha project. It's a free software library management system. It looks like they're doing just fine without me though.
I'm a fan of the classical period, and humanities in general. I read the Iliad and the Odyssey when I was in fourth grade (my dad's method of getting me to read stuff was to leave copies sitting around) and haven't looked back. If I could figure out a way to get paid for sitting in a musty library reading old Latin/Greek/Syriac/Hebrew texts I'd drop this computing stuff in a heartbeat.
I'd really like to write a book of textual/literary analysis of the sermons/discourses of Alma the Younger in the Books of Alma and Mosiah in the Book of Mormon.

Now, I'd like to see what five things Kevin Tew, Doug Tolton, James Gray, Wilson Bilkovich, and Eric Hodel will reveal that I didn't know about them.
