Thursday, July 27, 2006

RubyInline: Going a bit Further

Last time around, I wrote about RubyInline and profiling, and raised a couple of questions. This time around I'm going to talk about RubyInline and benchmark in a (probably vain) attempt to answer the existing questions, while raising new ones. There were basically three questions that I'd like to handle:

  • What should we call this stuff?
  • How does it work?
  • How do I use Ruby objects in my C? (Yes, I'm conflating two questions here.)

The simple answer to the first is that we're really just writing Ruby methods (albeit in C). To understand why and how that is, we need to look at what RubyInline actually does.

RubyInline does a couple of things in the process of working (if I get any of this wrong, hopefully Eric Hodel or zenspider will jump in and correct me). The first time it's run, it copies the C source into a temporary directory (~/.ruby_inline) and tweaks it a bit to make it Ruby-aware. It then compiles the C to a shared library (a .so in my case) which is linked into the running Ruby script, and any methods are made available for calling.

If RubyInline is run again against the same script, it compares the shared library's timestamp with that of the original source file and recompiles only if the source is newer. This cuts out the compile and link latency that would otherwise be added to every run.
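The staleness check described above can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not RubyInline's actual code: the helper name needs_rebuild? and the file names are made up for the example.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of RubyInline): rebuild when the compiled
# library is missing or is older than the source it was built from.
def needs_rebuild?(source_path, library_path)
  return true unless File.exist?(library_path)
  File.mtime(source_path) > File.mtime(library_path)
end

Dir.mktmpdir do |dir|
  source  = File.join(dir, 'example.rb')   # made-up file names
  library = File.join(dir, 'example.so')

  File.write(source, "# pretend this holds inline C\n")
  puts needs_rebuild?(source, library)   # true: no library has been built yet

  FileUtils.touch(library, mtime: Time.now + 60)
  puts needs_rebuild?(source, library)   # false: library is newer than source

  FileUtils.touch(source, mtime: Time.now + 120)
  puts needs_rebuild?(source, library)   # true: source changed after the build
end
```

Comparing modification times is cheap, which is why a check like this adds almost nothing to the start-up cost of a script.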

The idea of Ruby methods written in C might seem foreign, but you actually use them all the time. Array#uniq is just a bunch of C, but you'd never know it without digging around in the Ruby source (not recommended for the faint of heart). RubyInline just makes the process of writing Ruby in C a bit easier (well, a lot easier).
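On Ruby versions that provide source_location for methods (1.9 and later, so newer than the Ruby this post was written against), there is a quick way to peek at this without opening the interpreter's source: a method with no Ruby file and line to report was not defined in Ruby code. The sketch below is an illustration only. The helper defined_in_ruby? and the Array#second method are made up for the contrast.

```ruby
# Hypothetical helper: true when the interpreter can point at a Ruby
# file and line for the method, false when it cannot (as with C methods).
def defined_in_ruby?(klass, method_name)
  !klass.instance_method(method_name).source_location.nil?
end

class Array
  # Made-up pure-Ruby method, added only for contrast.
  def second
    self[1]
  end
end

puts defined_in_ruby?(Array, :second)  # defined just above, in Ruby
puts defined_in_ruby?(Array, :uniq)    # part of the interpreter's core
```

On the standard C interpreter the second call should report false for Array#uniq, although which core methods are written in C can vary between Ruby versions and implementations.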

Let's take a look at this in practice. Since my last post raised some questions about using Ruby objects (Fixnums and Arrays) from C, I borrowed some code from example2.rb in the source and worked from there. Here's the code:

#!/usr/bin/ruby -w 

require 'rubygems'
require 'inline'
require 'benchmark'

class Array

  # build the Array#average method in C using Ruby Objects
  inline do |builder|
    builder.c_raw "
      static VALUE average(int argc, VALUE *argv, VALUE self) {
        double result = 0.0;  /* must start at zero before accumulating */
        long  i, len;
        VALUE *arr = RARRAY(self)->ptr;
        len = RARRAY(self)->len;

        for(i=0; i<len; i++) {
          result += NUM2DBL(arr[i]);
        }

        return rb_float_new(result/(double)len);
      }
    "
  end

  # build the Array#ravg method in Ruby
  def ravg
    Float(self.inject {|sum, elem| sum + elem }) / Float(self.length)
  end
end

# build a good sized loop over a big array
max_loop = (ARGV.shift || 20).to_i
max_size = (ARGV.shift || 1_000_000).to_i
a = (1..max_size).to_a

# benchmark the C versus the Ruby versions
Benchmark.bmbm(10) do |x|
  x.report("C")    { for i in 1..max_loop; a.average; end }
  x.report("Ruby") { for i in 1..max_loop; a.ravg;    end }
end

Other than showing the use of Ruby Array objects from C, the only interesting thing is the benchmarking code (the last block in the file). Even this is pretty simple stuff, though. It reads something like this:

Run the bmbm method (which runs a benchmarking rehearsal followed by the real benchmarking test), with the report labels left-justified in a 10-character column. The first block to benchmark is given the label 'C' and averages the big array once each time through its loop. The second block is labeled 'Ruby' and benchmarks the Ruby average method, ravg, in the same way.
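For anyone who wants to try bmbm without a C compiler handy, here is a small stand-alone sketch of the same pattern. The two averaging methods (inject_average and each_average) are made up for the example and are both plain Ruby, so the timings say nothing about C versus Ruby. The point is only the shape of the benchmarking block.

```ruby
require 'benchmark'

# Two made-up pure-Ruby averages, standing in for the C and Ruby versions.
def inject_average(arr)
  arr.inject(0) { |sum, n| sum + n } / arr.length.to_f
end

def each_average(arr)
  total = 0
  arr.each { |n| total += n }
  total / arr.length.to_f
end

# Small stand-in data so the sketch finishes quickly.
numbers = (1..10_000).to_a

# bmbm runs every report block twice: a rehearsal pass, then the real pass.
results = Benchmark.bmbm(10) do |x|
  x.report("inject") { 50.times { inject_average(numbers) } }
  x.report("each")   { 50.times { each_average(numbers) } }
end

# bmbm returns one Benchmark::Tms per report, in the order they were declared.
results.each { |tms| puts format("%-10s %.6f", tms.label, tms.real) }
```

The rehearsal pass exists to let one-time costs (memory allocation, garbage collection warm-up) settle before the numbers that count are taken.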

The output of running this script looks like this:

$ ruby inline_array_benchmark.rb
Rehearsal ---------------------------------------------
C          27.790000   0.110000  27.900000 ( 29.643141)
Ruby      115.280000   0.280000 115.560000 (119.626154)
---------------------------------- total: 143.460000sec

                user     system      total        real
C          27.680000   0.030000  27.710000 ( 28.382200)
Ruby      116.040000   0.340000 116.380000 (120.795489)

It's interesting to note that the rehearsal takes a bit longer for the C version. It's tempting to say that's because the rehearsal is where the C code gets compiled, but that's not really the cause: compilation takes so little time that it's lost in the noise.
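To put the numbers above in perspective, a little arithmetic on the user-time column helps. The figures below are copied from the output shown earlier. The method names speedup and percent_gap are just labels made up for this sketch.

```ruby
# How many times faster the quicker run was.
def speedup(slow_seconds, fast_seconds)
  slow_seconds / fast_seconds
end

# Difference between two timings as a percentage of the baseline.
def percent_gap(timing, baseline)
  100.0 * (timing - baseline) / baseline
end

# User CPU seconds copied from the benchmark output above.
c_rehearsal_user = 27.79
c_final_user     = 27.68
ruby_final_user  = 116.04

puts format("C is about %.1fx faster than Ruby",
            speedup(ruby_final_user, c_final_user))
puts format("C rehearsal was %.1f%% slower than the final C run",
            percent_gap(c_rehearsal_user, c_final_user))
```

This works out to roughly a fourfold speedup for the C version, while the gap between the C rehearsal and the final C run is well under one percent.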

Hopefully this helps answer some of the questions that have come up. If not, feel free to keep asking — I'll keep trying to answer them.
