Thursday, August 31, 2006

Author Interviews: Hal Fulton - The Ruby Way

Hal Fulton is a longtime Ruby hacker and the author of one of my favorite Ruby books, The Ruby Way. Recently, he's been hard at work on a second edition (due out in November). The second edition will come with a change in publishers: The Ruby Way will now be an Addison-Wesley book. When he's not working on his book, Hal is active on the ruby-talk mailing list and in the Ruby community at large.

Besides Ruby, his life consists of reading, writing, movies, plays, and concerts. He is a member of Austin Scriptworks and had his first play produced early in 2006. He's also a huge space advocate, and lives with a cat named Meg (short for Megabyte). He visits his parents whenever he can. They say they're not getting any younger, but sometimes he suspects they are.

Hal and I have known each other for a while; we even worked together on the t-shirt for RubyConf 2002 in Seattle (Ruby in the Emerald City).


How did you discover Ruby?

Hal: I was at IBM in '99 and was talking to Conrad Schneiker. He is the one who got comp.lang.ruby going (and if you're an AIX user, he's the inventor of the smit utility). I complained to him that I was never on the ground floor of any new technology. I was a late adopter of everything, and I wanted to change that. And he said, "Well, you should learn Ruby then." It was the first time I had heard of it. I started reading the mailing list; this was mostly Japanese people speaking English, although there were these two guys named Dave Thomas and Andy Hunt hanging out there. Later on, they came out with the first Ruby book in English.

What role does Ruby play in your day to day work?

Hal: My regular job is on the usage data warehouse team for a telecom company. I have sneaked Ruby into production wherever it's practical. It's used in our control and status web page, in numerous little convenience scripts, in test harnesses, and in a few Oracle apps.

As one of the really early birds, can you tell us a bit about the Ruby community 'back in the day'?

Hal: When I started learning Ruby (version 1.4), I could find *one* Ruby tutorial on the web in English. There were no books in English. I had a friend Miho who went back to Japan to visit family; at my request, she bought a Japanese Ruby book for me (one by Matz). I couldn't read the commentary, but I could read the code and learn from it.

The newsgroup hadn't been approved yet. The mailing list had perhaps 10-20 messages a day if I recall correctly.

Ruby didn't have class variables. The standard way to fake them was to use an array or similar container class.

Google searches for Ruby at that time returned numerous things unrelated to the language. You would have to go several pages into the results before you found relevant links. There was little on the web in English. There was no RubyForge, no RubyDoc, no RubyCentral. The first international Ruby Conference was nearly two years into the future.

There were fewer standard libs. There was no REXML, no YAML, no RMagick, very few web tools.

There was no "one-click installer" for Windows. The best way to run Ruby on Windows was to use Cygwin or the mingw version.

And our computers ran on kerosene, and we walked five miles in the snow to get printouts, and it was uphill both ways.

Do you have any hints about sneaking Ruby into the workplace?

Hal: I've always used the trickle-up method. Start using it for one-off scripts. Then use it for little convenience scripts and tools. Then use it for glue code. Then start using it in testing. Then web development. Then less-important production code. Finally one day you are using Ruby in mission-critical apps in production, and your bosses may not even know. It's easier to get forgiveness than permission.

How did you first come to write a book about Ruby?

Hal: I was one of the first N Americans to learn Ruby (for some small value of N). So when Sams went looking for someone to write a book, I was one of the ones who responded. At first, I had a coauthor; he wrote pieces of the first four chapters before leaving the project.

How did you swing the second edition?

Hal: I didn't really swing it. The publisher asked me if I was willing to update it. I had given it some thought, and had collected many notes. So I said yes; but I dreaded the actual work. In many ways, the second edition was more work than the first.

How is the second edition going to be different from the first?

Hal: First of all, Ruby has changed since version 1.6 that I was using then. It doesn't seem like much as you're watching it, but when you stop to add up all the changes in syntax, semantics, the core, and the standard libraries, it's quite a bit.

Second of all, I wanted to broaden the scope a little. There were all these cool things that were not covered in the first edition, like Rinda and Ring and RMagick and RSS and YAML and many others. I wanted to cover more of these, and I did. Though, believe me, there are many others I left out for lack of time and space.

Writing a book like that is all about compromise. If it were "complete" — whatever that means — it might be 2000 pages long. And by the time you write a 2000-page book, the subject has changed and it's no longer complete. So I wanted to produce a book that was of a reasonable size, written in a reasonable length of time. And I ended up dropping some topics that I really wanted to cover, but I just ran out of time.

And of course, I added quite a bit of material. To make room for it, I deleted the four appendices (which were dated anyhow) and some of the less important material in the chapters. At the beginning of the project, I estimated I would delete 100 pages and add another 250. I'm not certain how the numbers actually turned out, though.

What sets The Ruby Way apart? Why should people buy it?

Hal: It's like an inverted reference — it covers topics by category rather than by alphabet.

It's not a tutorial, but it does have a good overview of Ruby in the first chapter.

It was written with knowledge of the Pickaxe, and was specifically designed to be complementary to it — not to overlap too much.

It's a how-to book, but it also tries to cover the *why*. It's very important to understand the motivations behind the way Ruby is constructed.

It's perhaps more broad than deep. It "breaks ground" in many different areas, and tries to give hints, clues, and pointers when it can't actually cover detail.

Some have told me it's entertaining. I scattered quotations throughout the book, sometimes at section heads as well as chapter heads. Half these are profound, the other half are tongue in cheek. Once somebody suggested I delete them since space was tight. But they are far less than 1% of the text. For flavor, for entertainment, they are well worth the space they take up.

People have also told me they appreciated the humor (sometimes very dry) in the examples. And there's the geek factor: I put in countless references to Star Trek, Lord of the Rings, Alice in Wonderland, and so on.

It's supposed to teach Ruby syntax, semantics, and programming tricks and techniques. But I'd like to believe that it also makes people think and makes them laugh.

What was the most rewarding part of writing The Ruby Way?

Hal: I'd have to say the respect of my peers. It's incredibly energizing when people tell me they bought it, they read it, they loved it, they learned Ruby from it, and so on. That makes it all worthwhile.

What was the biggest challenge in writing The Ruby Way?

Hal: Because the coverage is fairly broad, it strains the bounds of my expertise. Many times I had to seek knowledge, advice, and code from others. Sometimes these were in conflict, and I had to try to do enough research to decide what was actually going into print. Because, after all, the credit goes to many people, but the blame for mistakes all rests with me.

What did you learn from writing The Ruby Way?

Hal: I learned that I didn't know Ruby as well as I thought. I was constantly turning up little-known features and edge cases and things I had never tried.

It's a fact of life. You think you know something. Then you try to teach it, and you end up learning more. Then you try to write a book, and you learn the most of all that way.

I also learned that books are never written by one person. Besides people assisting me with content, the number of people on the publisher's side that touch the book is truly astonishing.

I've learned that writing a second edition is NOT that much easier than writing the first one. I would almost say it's harder to rewrite a book than to write from scratch.

And if I may be a little cynical, I've learned that I can read a chapter five times, and have it read by eight other people, and then read it again in PDF, and it will still have errors in it that won't be found until it's on the shelf.

What are your favorite five libraries for Ruby?

Hal: Good question. RMagick is a lot of fun to play with. The feedtools library is pretty cool for RSS and Atom. The mathn lib does a decent job of "unifying" some of the mathematical pieces of Ruby. The open-uri lib is a great example of how to abstract away complexity and make an API simple and uniform. KirbyBase is excellent for a small, full-featured non-SQL database. And I'm partial to scanf, because it was written in my house, mostly by David Alan Black. (It's pure Ruby, not a C wrapper; and if you think it's trivial, take a look at the code.) I contributed numerous test cases to it, because it's fun to break someone else's code.

What do you think is next for Ruby?

Hal: Who knows? I think YARV is going to be important. I think interoperability with CLR will be important. I think the international support will improve. I also think that it's going to be used more and more in science and engineering.

I don't foresee "giant" changes in the syntax and semantics of the language itself, even in 2.0; Matz has always been conservative and deliberate in crafting Ruby.

What's next for you?

Hal: Besides Ruby and computers in general, I have a lot of interests. I'm getting more deeply into space advocacy now (see marssociety.org and marsdrive.com).

I keep thinking about the third edition of _The Ruby Way_. If it's going to be finished by 2012, I should probably get started on it right away.

Interview Excerpt: Gregory Brown (Changing the Way You Code)

Gregory Brown is the lead developer of Ruport, a recent participant in the Google Summer of Code, and an all-around nice guy. I'm wrapping up an interview with him (to be posted over at O'Reilly's Ruby Blog and the Apress Ablog shortly), and have a couple of questions and answers that just don't fit given the word count constraints. I didn't want to lose them, though; Gregory's answers are just too good. So I decided to post them here. Enjoy.


How has the SoC changed the way you'll write code in the future?

Gregory: I'm not sure if GSoC is going to change much about the way I write code. However, it has changed a lot about the way I manage my development. We made a ton of use of Trac this summer to record tickets for bug fixes and feature additions (among other things). I habitually create milestones now, and I also think a lot more about how the pieces come together. A few months ago, I'd replace an entire subsystem in Ruport and just release it the next day and say "Surprise!". This summer ended all of that, we have a lot more discourse before we do things like that, and we also tend to introduce new systems more slowly.

Maybe this is just a sign of Ruport becoming a bit more mature, but I have a feeling I will carry this increased emphasis on organization and cleaner transitions into my other projects. Evolution is key to any quality software, but when it happens too quickly without an obvious reason, I'm not sure it is A Good Thing.

I also find myself leaning towards more simple solutions. Since the Summer of Code was highly time constrained, it was important not to spend days going off on some tangent. Writing the most simple thing that could possibly work, and adding to it only when you need to seemed to be essential this summer.

Interview Excerpt: Gregory Brown (Test Coverage)

Gregory Brown is the lead developer of Ruport, a recent participant in the Google Summer of Code, and an all-around nice guy. I'm wrapping up an interview with him (to be posted over at O'Reilly's Ruby Blog and the Apress Ablog shortly), and have a couple of questions and answers that just don't fit given the word count constraints. I didn't want to lose them, though; Gregory's answers are just too good. So I decided to post them here. Enjoy.


You talked about your test coverage. How are you measuring it?

Gregory: We recently started using RCov to do code coverage reports. This gives us a great overview of what code isn't being touched by our tests, but does not give us much insight on whether the tests are of any real quality or completeness. Still, this tool is invaluable for identifying red zones, which are a great first place to look for problems.

One thing about Ruport that is somewhat of an advantage as far as test coverage goes is that almost all development is done test-first. Sometimes, we have errors in our tests, or forget certain edge cases, or run into scenarios where refactoring leaves our tests no longer accurately reflecting the actual function, but this is usually because of our own mistakes, and not as a general principle.

Some of our stuff is very hard to test, like database interaction or sending email. I'd like to set up some mock objects for this and maybe use some dependency injection, but for now, we've just left these as red zones. In the meantime, we'll probably come up with some decent functional tests.

Another thing that we do is write tests to reproduce any bug reports we have. This seems to be a solid way to improve test coverage, and avoid the problem of recurrent bugs.
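
As a rough illustration of the mock-object and dependency-injection idea Gregory mentions, here's a minimal sketch in plain Ruby. The class names (ReportMailer, FakeDelivery) and their methods are made up for the example and are not Ruport's API. The only point is that the object doing the sending is passed in, so a test can substitute a fake that records messages instead of sending them.

```ruby
# Hypothetical names throughout; none of this is Ruport code.
class ReportMailer
  # The delivery object is injected, so a test can pass in a fake.
  def initialize(delivery)
    @delivery = delivery
  end

  def send_report(to, body)
    @delivery.deliver(to, "Report", body)
  end
end

# A hand-rolled fake that records messages instead of sending email.
class FakeDelivery
  attr_reader :sent

  def initialize
    @sent = []
  end

  def deliver(to, subject, body)
    @sent << { to: to, subject: subject, body: body }
  end
end

fake = FakeDelivery.new
ReportMailer.new(fake).send_report("dev@example.com", "42 rows")

# A test can now check what would have been mailed.
raise "report was not delivered" unless fake.sent.size == 1
```

In production code the injected object would wrap the real mail library; the mailer class itself doesn't need to know the difference.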

Wednesday, August 30, 2006

Binding and Benchmarking (or, What I Did for Lunch)

Today, Kent Sibilev posted a neat little trick over on his blog (which I was reading while I ate my lunch). His code allows you to replace code like this:


class Simple
  def initialize(foo, bar, baz)
    @foo = foo
    @bar = bar
    @baz = baz
  end
end

with code like this:

class Bound
  def initialize(foo, bar, baz)
    binding.local_to_instance
  end
end

At first blush, this looks like a great time saver (though Kent does warn that it's not efficient). I tried it in a small script and it worked great. I can see myself doing this a lot in simple scripts, but will it work for a program that instantiates a lot of objects? Only you and your code know for sure, but I pulled out the Benchmark library to see what it thought.

My first step was to wrap up the two versions of the class above, with some instantiating code in a benchmarking script like this:


class Binding
  def local_to_instance
    eval("local_variables").each do |name|
      eval("self").instance_variable_set("@#{name}", eval(name))
    end
  end
  
  alias :kernel_eval :eval
  def eval(code)
    kernel_eval(code, self)
  end
end

class Bound
  attr_accessor :foo, :bar, :baz
  
  def initialize(foo, bar, baz)
    binding.local_to_instance
  end
end


class Simple
  attr_accessor :foo, :bar, :baz
  
  def initialize(foo, bar, baz)
    @foo = foo
    @bar = bar
    @baz = baz
  end
end

require 'benchmark'

Benchmark.bmbm(10) do |x|
  x.report('simple') do
    for i in 1..100_000 do
      a = Simple.new(1,2,3)
    end
  end

  x.report('bound') do
    for i in 1..100_000 do
      a = Bound.new(1,2,3)
    end
  end 
end

Note: Kent's Binding class (which does all the magic) is up there in the code.
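
Newer Rubies ship Binding#local_variables, Binding#local_variable_get, and Binding#receiver, so the same idea can be sketched without any eval. This is only a variation on the trick, assuming Ruby 2.2 or later. The method name locals_to_instance is hypothetical, and this is not Kent's code:

```ruby
# Sketch for newer Rubies (2.2 or later assumed); not Kent's original code.
class Binding
  # Copy every local variable visible in this binding into an instance
  # variable of the same name on the object the binding belongs to.
  # locals_to_instance is a hypothetical name chosen for this example.
  def locals_to_instance
    # Binding#local_variables lists the locals captured by this binding.
    local_variables.each do |name|
      receiver.instance_variable_set("@#{name}", local_variable_get(name))
    end
  end
end

class Bound
  attr_accessor :foo, :bar, :baz

  def initialize(foo, bar, baz)
    binding.locals_to_instance
  end
end

b = Bound.new(1, 2, 3)
puts b.foo, b.bar, b.baz
```

Like the original, this copies every local in scope, so any temporary variable created before the call becomes an instance variable too.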

If you're not used to the Benchmark library, all I'm doing is setting up a 'benchmark with rehearsal' test (the whole test will run twice to provide cleaner results), with a couple of labels for the report.

Running the code above produces a report like this:


Rehearsal ---------------------------------------------
simple      0.400000   0.000000   0.400000 (  0.398161)
bound       2.620000   0.010000   2.630000 (  2.662697)
------------------------------------ total: 3.030000sec

                user     system      total        real
simple      0.140000   0.000000   0.140000 (  0.137859)
bound       2.520000   0.010000   2.530000 (  2.525122)

So, what does this prove? Well, Kent is right: his binding implementation isn't very efficient. On the other hand, it ripped through 100,000 instantiations of Bound objects in just two and a half seconds. I don't think that's liable to raise too many performance concerns. Maybe if you're creating tens of millions of objects it will start to get to you. Or maybe not. Try it. If it's too slow, profile it. If it's a problem, you can always go back to doing things the old-fashioned way.
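
To put the report in per-object terms, here is a quick back-of-the-envelope calculation. The two totals are copied from the second (non-rehearsal) run above; everything else is plain arithmetic, and the numbers only describe the machine the benchmark ran on.

```ruby
# Figures copied from the "total" column of the second run in the report.
iterations   = 100_000
simple_total = 0.14   # seconds for 100,000 Simple objects
bound_total  = 2.53   # seconds for 100,000 Bound objects

simple_per_object = simple_total / iterations
bound_per_object  = bound_total / iterations

puts format("simple: %.1f microseconds per object", simple_per_object * 1_000_000)
puts format("bound:  %.1f microseconds per object", bound_per_object * 1_000_000)
puts format("bound is roughly %.0fx slower", bound_total / simple_total)
puts format("10 million Bound objects: about %.0f seconds", bound_per_object * 10_000_000)
```

By that arithmetic the Bound version costs roughly 25 microseconds per object against about 1.4 for the plain constructor, so ten million objects would take on the order of four minutes.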

Andreas Wolff: Ruby Hacker and Tour Guide

My family and I recently visited Berlin (among other places). While we were there, Andreas Wolff, a Ruby friend, was kind enough to play tour guide for us. Not only did he show us around town, he even played translator in a couple of stores while my wife tried to replace things that she'd lost when her handbag was stolen in Brussels.

Outside of showing us around Berlin, Andreas hacks Ruby, maintains Rubyblog.de (linked above and in my links to the right), and heads up the Rubylation project.

Aren't Ruby hackers great?

More to Learn

Well, Ruby memory work seems to be all the rage today. I've seen blog posts from Scott Laird, zenspider, and Eric Hodel (and here) recently, not to mention a number of emails on various lists.

It looks like everyone's working in Ruby space, which isn't a bad thing. I'd sure like to see some work going on with Valgrind and Ruby though. I picked it up a while ago, but without much success. Maybe it's time to try again.

I've also seen some buzz starting to grow about a DTrace provider for Ruby. And now that DTrace runs on OS X as well as OpenSolaris, there should be a lot of Ruby hackers starting to use it. I don't know enough about DTrace, but would it be good for isolating memory issues?

Tuesday, August 29, 2006

Agile Retrospectives (a recommendation)

I was lucky enough to see an early version of this book (while it was going through a technical review phase), and I really liked it. I found a lot of information in here that I'm trying to pull back into meetings and other activities here at work. (It's a long, slow road.)

The authors have a good writing voice, and an important message. It makes for a compelling book. One of the things I enjoyed was the continuing use of example stories, which really showed how the methods would be used in situ.

If your team needs some help changing the way you do business (or maybe the way you don't do business), this is the book for you. Go grab a copy while they're hot.

Regional Ruby Conferences

Okay, I'm starting to get rolling (along with a couple of other folks) on a regional conference here in the mountain west. One of the things we're looking at doing is having a hacking night the evening before. I'm at a bit of a loss for a hacking project though. Any ideas?

Monday, August 28, 2006

Ruby training from Pune Ruby

The Pune Ruby Users Group (the ones producing the run of short interviews with a lot of Ruby folks) are going to be hosting an on-line Introduction to Ruby class. I took a quick look at the syllabus, and it looks pretty comprehensive. It doesn't include any Rails content, but the instructor said he was planning on setting up a similar class for Rails.

I think this class is probably a good opportunity for some folks to learn more about Ruby, but it's a couple of other things too. It's a good chance to develop some free, reusable training materials for Ruby (and Rails if that second class works out). It will also be a good indicator of the demand for (and marketability of) Internet-based Ruby training (and, by extrapolation, classroom-based Ruby training).

Whether you're looking to learn more about Ruby, brush up on your Ruby skills, or get a feel for the Ruby training market, this is something worth looking into.

Saturday, August 26, 2006

RubyConf*MI Wrap Up

Well, I think my presentation went well. People seemed to enjoy it at least. That's a big load off of my shoulders. I covered nine libraries that every developer should know about (even if they don't use them). I've dropped a pdf of my slides here. I hope they're interesting/useful for you.

I really liked Patrick Hurley's presentation on 'Ruby, Performance, and C'. He showed some real presentation chops with images, quotes, and colors sprinkled liberally throughout his presentation. He talked about the fact that once you've done everything else, rewriting in C is all that's left. He did a nice job of walking us through an example.

I also thought Craig Damyanovich did a great job (especially as a first-time presenter). As a nod to his love of hockey, he covered BDD and RSpec in a 'three period' format (only in Michigan). Craig was really laid back and seemed completely calm all through his presentation. I know I didn't feel that way.

Watching the interplay of the audience and the speakers was interesting. I'm not sure if it was because of the smaller group (about 60 people), the intimacy of the setting (theatre seating for 150 or so), or the fact that this was a regional conference. Whatever caused it, it was fun to watch. At one point a question directed toward the speaker turned into a five minute discussion between the speaker and several people in the audience.

Perhaps the best thing, though, was hearing the confluence of ideas: four people (including me) mentioned RSpec, three of us talked about rcov, and two of us talked about Profile (and friends) and RubyInline.

I'm not sure who will end up running the next regional Ruby conference, or if I'll be able to attend, but if it's anything like this one I sure want to.

Friday, August 25, 2006

Big Day

Wow! What a day for news. So far, here are my three favorites:

  • Ruby 1.8.5 has been announced! Mauricio has posted a ChangeLog at his blog. Go grab a download and enjoy.
  • Kent Sibilev has announced ruby-debug 0.4. I wish I'd known about this package earlier. It looks like a great expansion on the stock debugger, and should prove really useful.
  • Satish Talim posted an interview he did with me over at the Pune Ruby Brigade's group blog. I'm really excited to see the way they're helping Ruby in India. I'm also really humbled to be lumped in with the kind of people they've interviewed so far.

Finally, while it's not really news, today's the day I fly out for RubyConf*MI. I'm looking forward to catching some great presentations. Hopefully mine goes well.

Thursday, August 24, 2006

Great Contest

Peter Cooper, over at Ruby Inside, put his money where his mouth was to generate some new resources for the Ruby world. He ran a contest with a $100 prize, asking people to post informative articles about Ruby or Rails to their blogs. Eighteen bloggers posted twenty-four new articles ranging from a review of the recent security problems in Rails to information about using Profiler__ and ruby-prof to static web generation.

I love this idea, and I think it generated some great information. I can't wait to see what happens the next time someone runs a contest like this.

Wednesday, August 23, 2006

UtahValley.rb August meeting

Last night (22 Aug), the UtahValley.rb met again. We had eight people turn out, four new folks and four veterans. Actually, since this was the first week of class at BYU (where we meet), this wasn't a bad turnout.

I presented my (nearly finished) talk about Libraries for Developers — I'm giving it at RubyConf*MI on Saturday. Doing a rehearsal/development talk in front of some friendly faces is a great way to improve the talk as a whole. I came away with several important changes I plan on making over the next couple of days. I'm also a lot more comfortable with the talk than I was when it was just a set of slides brewing in OOimpress.

We didn't cover too much else, although I was able to mention a Ruby consulting opportunity with the team I work for at the LDS church. If you're interested in working on Ruby (and maybe a bit of Perl/Shell) stuff for a year or so for us, feel free to drop me a line.

Our next meetings will be Sep 12th (BYU RUG) and Sep 26th (UtahValley.rb). No announcements yet as to what we'll talk about, though the meeting on the 12th is traditionally an intro to Ruby (where we try to recruit new CS students).

Tuesday, August 22, 2006

Software Creativity 2.0

A while ago, I wrote a review of Software Conflict 2.0 by Robert Glass, and for almost all that time I've been sitting on a secret. Now, at last, I can tell. Developer.* has just announced a companion book, Software Creativity 2.0, also by Robert Glass.

I loved Software Conflict 2.0, and now I'm anxiously looking forward to Software Creativity 2.0. Where the first was an update, keeping much of the old content, I understand that this book will have mostly new content.

As part of their announcement celebration, Developer.* is offering a 30% discount and free shipping to people who order direct.

Readers who pre-order directly from the publisher will receive a 30% discount off the $34.99 cover price and free shipping. You will not be charged for the book until it is ready to ship. International orders welcome. Send your name and mailing address and quantity to softwarecreativity~AT~developerdotstar.com.

Tuesday, August 15, 2006

Profile and ruby-prof: Getting Specific

I'll close up my profiling trilogy with a little bit about profiling parts of your Ruby code. While ruby -r profile my_program or ruby-prof my_program are great for 90% of what you want to do, there's always the odd time that you really only care about one specific portion of your code — for example, if you're doing IO or setup tasks that you don't want to muddle your profile.

The Profiler__ library is the stock Ruby way of doing this, so let's look at this option first. Profiler__ has three methods that will be of interest to us:

  • start_profile — which starts collecting data
  • stop_profile — which stops data collection
  • print_profile — which prints the collected data (it takes a parameter naming the filehandle to which it should print the data)

Say you had an rwb test script that looked like this (yes, I know I've deprecated rwb, but that doesn't mean I can't use it in a contrived little example):

require 'rwb'
urls = RWB::Builder.new()
urls.add_url(10, "http://localhost")
tests = RWB::Runner.new(urls, 1000, 200)
tests.run
tests.report_header
tests.report_overall

Profiling this would be a nightmare. It relies on net/http and makes several gazillion calls to the Ruby built-in and standard libraries. All these calls would dominate the run and the resulting profile, and the stuff you really care about (and can fix) would be buried. If I just wanted to profile the reporting functions, I could modify the test script to look like this:


require 'rwb'
require 'profiler'
urls = RWB::Builder.new()
urls.add_url(10, "http://localhost")
tests = RWB::Runner.new(urls, 1000, 200)
tests.run
Profiler__::start_profile
tests.report_header
tests.report_overall
Profiler__::stop_profile
Profiler__::print_profile($stderr)

This generates output like that below (edited to show the % time, cumulative seconds and name fields of the top twenty lines):


  %   cumulative
 time   seconds  name
 40.00     0.22  Array#each
 20.00     0.33  Array#sort
 18.18     0.43  Float#<=>
 10.91     0.49  Array#push
  3.64     0.51  Float#+
  3.64     0.53  Float#**
  1.82     0.54  Float#-
  1.82     0.55  RWB::Runner#report_heade
  0.00     0.55  Float#to_f
  0.00     0.55  RWB::Runner#results_mean
  0.00     0.55  Math.sqrt
  0.00     0.55  Class#new
  0.00     0.55  Fixnum#/
  0.00     0.55  Array#shift
  0.00     0.55  Array#initialize
  0.00     0.55  Array#[]
  0.00     0.55  Kernel.puts
  0.00     0.55  Float#to_s
  0.00     0.55  RWB::Runner#report_overa
  0.00     0.55  Fixnum#+

This differs considerably from a profile of the entire script created with the -r profile command line option. The top of that profile is shown here for comparison (edited as above):

  %   cumulative
 time   seconds  name
113.58    18.73  TCPSocket#initialize
 92.06    33.91  Fixnum#==
  7.52    35.15  Kernel.catch
  7.16    36.33  RWB::Runner#run_test
  7.03    37.49  RWB::Builder#get_url
  6.61    38.58  URI.split
  5.82    39.54  Net::HTTP#Proxy
  5.34    40.42  Hash#include?
  5.15    41.27  Kernel.block_given?
  4.79    42.06  URI::HTTP#initialize
  4.49    42.80  URI::Generic#find_proxy
  4.37    43.52  Net::HTTP#new
  4.12    44.20  Kernel.class
  4.06    44.87  Kernel.open
  3.88    45.51  URI::Generic#split_userinfo
  3.82    46.14  OpenURI.open_loop
  3.82    46.77  Net::HTTP#do_start
  3.76    47.39  RWB::Result#initialize
  3.40    47.95  Net::HTTP#initialize
  3.09    48.46  RWB::Runner#run_thread
  3.03    48.96  ENV.include?

(And yes, there is a rounding/floating point/addition error in the second example ... I know it's there, but haven't bothered to dig into the exact cause.)

One thing to watch out for is that Profiler__ will happily overwrite any existing profile data if Profiler__::start_profile gets called multiple times.

ruby-prof provides similar control over what gets profiled — actually, it provides more options. It uses the methods:

  • start — starts a new profiling run
  • stop — stops the current profiling run and returns the data
  • profile — allows you to profile a block passed in to it

Printing is handled a bit differently, though: printer objects are created from the profiling result, and then the report is printed through a print method. Each kind of report has its own class:

  • RubyProf::FlatPrinter — a traditional profiling report
  • RubyProf::GraphPrinter — a call graph profiling report
  • RubyProf::GraphHtmlPrinter — an html call graph profiling report

Each of these classes has a print method, which takes two (optional) parameters: the output file handle and the minimum %self that a method call must take up to be printed.

We can recreate the above example like this:


require 'rwb'
require 'ruby-prof'
urls = RWB::Builder.new()
urls.add_url(10, "http://localhost")
tests = RWB::Runner.new(urls, 1000, 200)
tests.run
RubyProf.start
tests.report_header
tests.report_overall
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, 0)

Changing it to print a call graph just means changing the second to last line to this:

printer = RubyProf::GraphPrinter.new(result)

While these may not be the kinds of profiling tasks you do every day, it's nice to know that they're there.

Happy Hacking!

If you found this post helpful, you might want to look at my ruby-prof post collection.

Monday, August 14, 2006

ruby-prof and call graphs

Sorry it's been so long since my last post. I've been on vacation for the last week, and Internet access wasn't a big part of it. I did get a bit of writing done for my book though (one of the perks of insomnia). I promised some more info about ruby-prof, so here it is.

The biggest difference between Profile and ruby-prof is the ability to generate call graphs. A call graph shows not only individual methods and their profiling data, but also the parent and child method calls associated with them. Call graphs can be generated in plain text or html. Unless you need the plain text, the html report is a better choice as it is easier to read and has some cross referencing features built into it.

ruby-prof has an optional -p flag that takes one of three options: flat, which generates a normal profiling report; graph, which generates a plain text call graph report; and graph_html, which generates an html call graph report. Since call graphs are a bit less common than normal profiling data, here's a sample that we can walk through to get a better feel for what they can do for you. I generated it like this: ruby-prof -p graph call_rotator test_file. (I'm also trimming the output to show just the top several methods; this report was 146 lines long.)


Thread ID: 537836452
  %total   %self total  self  children        calls Name
---------------------------------------------------------------
                  0.00  0.00    0.00            1/1   <Class::Time>#now
   0.00%   0.00%  0.00  0.00    0.00              1   Time#initialize
---------------------------------------------------------------
                  0.00  0.00    0.00    10011/10011   Comparable#>
   0.00%   0.00%  0.00  0.00    0.00          10011   Time#<=>
---------------------------------------------------------------
                  0.09  0.09    0.00    10011/10011   CallRot#read_line
  28.12%  28.12%  0.09  0.09    0.00          10011   String#split
---------------------------------------------------------------
                  0.00  0.00    0.00          12/12   Array#each
   0.00%   0.00%  0.00  0.00    0.00             12   String#gsub!
---------------------------------------------------------------
                  0.01  0.01    0.00    10011/10011   Class#new
   3.12%   3.12%  0.01  0.01    0.00          10011   OnCall#initialize
---------------------------------------------------------------
                  0.00  0.00    0.00            1/1   Class#new
   0.00%   0.00%  0.00  0.00    0.00              1   Object#initialize
---------------------------------------------------------------
                  0.00  0.00    0.00            3/6   Kernel#require__
                  0.00  0.00    0.00            3/6   Module#attr_reader
   0.00%   0.00%  0.00  0.00    0.00              6   Module#method_added
---------------------------------------------------------------
                  0.00  0.00    0.00            1/1   Kernel#require__
   0.00%   0.00%  0.00  0.00    0.00              1   Module#attr_reader
                  0.00  0.00    0.00            3/6   Module#method_added
---------------------------------------------------------------
                  0.00  0.00    0.00            1/1   Kernel#require
   0.00%   0.00%  0.00  0.00    0.00              1   Kernel#require__
                  0.00  0.00    0.00            3/6   Module#method_added
                  0.00  0.00    0.00            1/1   Module#attr_reader
                  0.00  0.00    0.00            3/3   Class#inherited
---------------------------------------------------------------
                  0.00  0.00    0.00            1/1   Kernel#load
   0.00%   0.00%  0.00  0.00    0.00              1   Kernel#require
                  0.00  0.00    0.00            1/1   Kernel#require__
---------------------------------------------------------------

The first interesting chunk of the report is the third block down. It has two lines: the top line is the calling method, and the second line is the method being called. In the call graph reports, the line showing %total and %self values is the method being reported; everything above it is a calling method and everything below it is a method being called. This report shows that String#split takes up a bit over 28% of the program's running time. It is called 10011 times, all of them by CallRot#read_line.

A bit further down is the chunk about Module#method_added. In this case there are two calling methods. Each calls Module#method_added three times, for a total of six calls. This is shown in the calls field, 3/6 in both cases.
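If the calls field is new to you, here's a tiny sketch of the arithmetic behind it. The values are copied by hand from the Module#method_added block of the report, and the constant names are just ones I picked for the example.

```ruby
# Calls fields for the two callers of Module#method_added,
# copied by hand from the report (not live ruby-prof output).
CALLERS = {
  'Kernel#require__'   => '3/6',
  'Module#attr_reader' => '3/6',
}

# Split each "made/total" field into a pair of integers.
COUNTS = CALLERS.transform_values { |field| field.split('/').map(&:to_i) }

COUNTS.each do |caller_name, (made, total)|
  puts "#{caller_name} made #{made} of #{total} calls"
end

# The per-caller counts should add up to the total on the reported line.
SUM_OF_CALLS = COUNTS.values.sum { |made, _total| made }
puts "sum of caller counts: #{SUM_OF_CALLS}"
```

The first number is how many calls came from that caller, and the second is the total across all callers.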

Two blocks further down, you can see the report for Kernel#require__. This has one calling and three called methods. Each of the called methods also shows the number of calls made by Kernel#require__ and the total number of calls made.
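To make the "callers above, callees below" rule concrete, here's a rough sketch that splits one block of the plain text report into its three parts. The block is pasted in by hand from the report, and split_block is a hypothetical helper of my own, not something ruby-prof provides. It assumes the reported method is the only line that starts with a percentage.

```ruby
# The Kernel#require__ block, copied by hand from the report.
BLOCK = <<~'TEXT'
                    0.00  0.00    0.00            1/1   Kernel#require
     0.00%   0.00%  0.00  0.00    0.00              1   Kernel#require__
                    0.00  0.00    0.00            3/6   Module#method_added
                    0.00  0.00    0.00            1/1   Module#attr_reader
                    0.00  0.00    0.00            3/3   Class#inherited
TEXT

# Hypothetical helper: the line starting with a percentage is the
# reported method; lines before it are callers, lines after are callees.
def split_block(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  index = lines.index { |line| line.match?(/\A\d+\.\d+%/) }
  name = ->(line) { line.split.last }
  {
    callers:  lines[0...index].map(&name),
    reported: name.call(lines[index]),
    callees:  lines[(index + 1)..-1].map(&name),
  }
end

PARTS = split_block(BLOCK)
p PARTS
```

For anything bigger than a quick look, the html report's cross referencing is probably a nicer way to navigate than parsing text.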

Every method (calling, called, or reported) is shown with some profiling data: total, self, and children. These fields show the amount of time spent there. Because my profiling run was fairly short, there's not a lot of data to work with. If this were a program that took hours to run, we'd be able to use this information (along with the number of calls made) to identify where we really need to focus our optimization efforts.

There's one more feature of ruby-prof (and Profile for that matter) that I'd still like to cover — controlling what parts of your program are profiled. I'll look at that in another post though. Hopefully this is enough to chew on for one day.


Friday, August 04, 2006

Profile and ruby-prof

I've been playing with profile (from the Standard Library) and ruby-prof (the profile replacement written by Shugo Maeda and Charlie Savage). I really like ruby-prof: it's faster (from 5x on some short runs to 100x on longer runs), it provides more detail, and it will generate call graphs. I've noticed some differences in the way they run. I'll show you what I mean.

ruby-prof doesn't sort its output in terms of total time used, so you'll need to sort it. After a bit of munging, the (cropped) output of running ruby-prof call_rotator test_file looks something like this:


%self  cumulative total  self  children  calls self/call total/call  name
 23.64     0.13   0.13   0.13  0.00     10002   0.00     0.00     String#split
 21.82     0.30   0.39   0.12  0.27     10002   0.00     0.00     CallRot#read_line
 18.18     0.55   0.54   0.10  0.44         1   0.10     0.54     #foreach
 12.73     0.42   0.07   0.07  0.00     10002   0.00     0.00     #local
  5.45     0.33   0.05   0.03  0.02     10002   0.00     0.00     CallRot#line_is_future?
  3.64     0.44   0.02   0.02  0.00     10003   0.00     0.00     #allocate
  3.64     0.35   0.02   0.02  0.00     20004   0.00     0.00     Array#pop
  3.64     0.17   0.02   0.02  0.00     10002   0.00     0.00     Comparable#>
  3.64     0.15   0.02   0.02  0.00     10002   0.00     0.00     OnCall#initialize
  1.82     0.45   0.01   0.01  0.00         1   0.01     0.01     #open

Since ruby-prof sends its output to stdout, it's easy to munge with standard tools (at least on Unix/Linux). It does mean that you'll need to be careful about commingling any output from your original program though. I used ruby-prof call_rotator test_file | tail +3 | sort -rn to print the methods in order of time spent in them. Since the header doesn't change, you can add that back in to make reading the table easier if you like.
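If you'd rather do the sorting in Ruby than in a shell pipeline, something like the following sketch should work. It keeps the header line in place and sorts the remaining rows by the leading %self value. The REPORT text is a few rows copied by hand from the table above (deliberately out of order), and sort_flat_report is a hypothetical helper name, not part of ruby-prof.

```ruby
# A few rows copied by hand from the table above, deliberately shuffled.
REPORT = <<~'TEXT'
  %self  cumulative total  self  children  calls self/call total/call  name
   21.82     0.30   0.39   0.12  0.27     10002   0.00     0.00     CallRot#read_line
    3.64     0.35   0.02   0.02  0.00     20004   0.00     0.00     Array#pop
   23.64     0.13   0.13   0.13  0.00     10002   0.00     0.00     String#split
   18.18     0.55   0.54   0.10  0.44         1   0.10     0.54     #foreach
TEXT

# Hypothetical helper: keep the header first, then sort the rows by
# their first column (%self), highest first.
def sort_flat_report(text)
  header, *rows = text.lines.map(&:chomp).reject(&:empty?)
  sorted = rows.sort_by { |row| -row.split.first.to_f }
  [header, *sorted]
end

SORTED_REPORT = sort_flat_report(REPORT)
puts SORTED_REPORT
```

This assumes the first line of the text is the header, so you'd still need to trim any leading lines from the real output first.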

Interestingly, ruby-prof runs so quickly that system activity can interfere quite a bit with the recorded timings (even to the point of changing the order of the sorted methods). Over the course of a longer running program, this should even out. I've not seen it change the order of methods actually defined in your program, but that doesn't mean it won't.

Here's the output of ruby -r profile call_rotator test_file:


  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 34.79     2.31      2.31    10002     0.23     0.42  CallRot#read_line
 13.40     3.20      0.89        1   890.00  6630.00  IO#foreach
 12.95     4.06      0.86    10002     0.09     0.15  CallRot#line_is_future?
 11.14     4.80      0.74    10003     0.07     0.11  Class#new
  7.98     5.33      0.53    10002     0.05     0.07  Comparable.>
  5.42     5.69      0.36    10002     0.04     0.04  OnCall#initialize
  4.67     6.00      0.31    20004     0.02     0.02  Array#pop
  3.61     6.24      0.24    10002     0.02     0.02  String#split
  3.46     6.47      0.23    10002     0.02     0.02  Time#local
  2.41     6.63      0.16    10002     0.02     0.02  Time#<=>

You can see that the ruby-prof output has two columns that are missing from the stock profile output. self and children break the total time into more discrete measures. self shows how much time is spent in the method itself, exclusive of calls out to child methods. children shows how much time is spent in calls to those child methods. (total is the same as self in the stock output.)
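As a quick sanity check on how those columns relate, here's a small sketch that tests whether total matches self plus children for a few rows copied by hand from the ruby-prof table above. The tolerance is my own assumption, there to absorb the two-decimal rounding in the report.

```ruby
# Rows copied by hand from the ruby-prof table: [name, total, self, children].
ROWS = [
  ['String#split',            0.13, 0.13, 0.00],
  ['CallRot#read_line',       0.39, 0.12, 0.27],
  ['#foreach',                0.54, 0.10, 0.44],
  ['CallRot#line_is_future?', 0.05, 0.03, 0.02],
]

# The report rounds to two decimals, so allow a little slack (assumed value).
TOLERANCE = 0.011

CHECKS = ROWS.map do |name, total, self_time, children|
  [name, (total - (self_time + children)).abs <= TOLERANCE]
end

CHECKS.each do |name, ok|
  puts format('%-24s total == self + children? %s', name, ok)
end
```

For CallRot#read_line, that works out to 0.12 in the method itself plus 0.27 in its children, for the 0.39 total.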

It's also interesting that the two methods of profiling don't pick up the same set of method calls in the program being profiled, nor is one a superset of the other. Here's a side by side comparison of the methods found by each in profiling runs of another program:


ruby-prof                  profile
Array#<<                   Array#<<
Array#[]                   Array#[]
Array#each                 Array#each
<Class::IO>#allocate
<Class::IO>#open
<Class::Time>#allocate
<Class::Time>#now
Enumerable#each_with_index Enumerable.each_with_index
File#initialize            File#initialize
Fixnum#*                   Fixnum#*
Fixnum#%                   Fixnum#%
Fixnum#+                   Fixnum#+
Fixnum#to_s                Fixnum#to_s
Integer#downto             Integer#downto
Integer#upto               Integer#upto
                           IO#open
IO#puts                    IO#puts
IO#write                   IO#write
Kernel#load
Time#-                     Time#-
Time#+                     Time#+
Time#day                   Time#day
Time#hour                  Time#hour
Time#initialize            Time#initialize
Time#min                   Time#min
Time#mon                   Time#mon
                           Time#now
Time#year                  Time#year
#toplevel                  #toplevel

The four methods not caught by profile take up significant amounts of time, but won't change your basic profiling approach. Likewise the one method missed by ruby-prof isn't a game breaker, but it's good to be aware of things like this.
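If you want to run this kind of comparison on your own reports, array difference gets you most of the way there. This is only a sketch: the two lists are a short excerpt of the comparison above, typed in by hand, and the names are compared as plain strings with no attempt to reconcile the two tools' different notations.

```ruby
# Short excerpts of the method names each profiler reported (hand-copied).
RUBY_PROF_METHODS = [
  'Array#each',
  '<Class::IO>#allocate',
  '<Class::Time>#allocate',
  'IO#puts',
  'IO#write',
]

PROFILE_METHODS = [
  'Array#each',
  'IO#open',
  'IO#puts',
  'IO#write',
]

# Array#- keeps the elements of the receiver that are not in the argument.
ONLY_RUBY_PROF = RUBY_PROF_METHODS - PROFILE_METHODS
ONLY_PROFILE   = PROFILE_METHODS - RUBY_PROF_METHODS

puts "only in ruby-prof: #{ONLY_RUBY_PROF.join(', ')}"
puts "only in profile:   #{ONLY_PROFILE.join(', ')}"
```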

In both sets of output, you can see that read_line is the most expensive of the methods, and the one to focus on if you need to speed things up. The ruby-prof output also shows you that more time is spent in its children than locally, so your gains from tuning read_line itself will be limited. You'll get more information from a call graph, but that's a whole 'nother blog entry. I'll get around to it next time.


Wednesday, August 02, 2006

European Rails Conference 2006

RubyCentral and Skills Matter are pleased to announce the first European Rails conference of 2006.

The Rails conference will be held at the TUC Congress Centre in London from September 14 to 15.

You can learn more at europe.railsconf.org

Confirmed speakers

  • Rails creator David Heinemeier Hansson
  • Pragmatic Programmer Dave Thomas
  • Best-selling author and passion expert Kathy Sierra
  • Rails core developers Jamis Buck, Marcel Molina, Jr., Thomas Fuchs
  • Rails authors and trainers David A. Black and Chad Fowler
  • Rake author Jim Weirich
  • and the very popular, one and only whytheluckystiff...
  • and many more to be announced!

Find out more at europe.railsconf.org


The Czech translation of this announcement was done by Ladislav Martinčík. Please repost it or email it as appropriate, but (please) don't spam.

Tuesday, August 01, 2006

RailsConf Europe 2006

Ruby Central and Skills Matter are pleased to announce the first RailsConf Europe 2006.

RailsConf will be held at the TUC Congress Centre in London on September 14 and 15.

Confirmed Speakers

  • David Heinemeier Hansson, the creator of Rails
  • Pragmatic Programmer Dave Thomas
  • Best-selling author and passion expert Kathy Sierra
  • Rails core developers Jamis Buck, Marcel Molina, Jr., Thomas Fuchs
  • Rails authors and trainers David A. Black and Chad Fowler
  • Rake author Jim Weirich
  • and the one and only whytheluckystiff....
  • and many more to be announced!

For more information, see europe.railsconf.org.


Pedro Visintin graciously translated this announcement into Spanish for me. Feel free to post it to any appropriate mailing lists (but, please, don't spam).

RailsConfEurope

This isn't the first time I've posted about this conference, but it's important, so I'm going to climb up on my soapbox again.

Putting on conferences is hard. If the first one doesn't sell well, it's even harder to get the next one going. Skills Matter and RubyCentral have worked together to set up a RailsConf at the TUC Congress Centre in London on September 14 & 15. The lineup of speakers looks great (hey, any lineup that includes DHH, Dave Thomas, Kathy Sierra, and Why the Lucky Stiff has to be good). It's a lot easier than heading to the States. Best of all, you won't have to deal with monolingual American hotel/convention staff.

Seriously, if you're a Rails hacker in Europe, go check out europe.railsconf.org, get registered, and go!

Grrr. Comment Spam

I just got hit with comment spam. That's nothing new, and I've already cleaned it up. The part that irks me is that the topic was something I'd have gladly posted about had the commenter just sent me an email. (In fact I was talking to other people involved with the topic at OSCon, and was starting to try to get the word out through other channels.)

I'd just like to make a simple request. If you want to make a commercial announcement here at on-ruby, please get in touch with me by email. If it's something I agree with, I'd be happy to help get the word out. If you take matters into your own hands and post spam into the comments, I'll delete them.

Speaking at RubyConf*MI

I've nearly got my slides done for my presentation on "Libraries for the Ruby Developer" for RubyConf*MI. Registration is open (and only $20), and things are moving along toward what looks like it will be an awesome conference.

Other than having to actually write the slides, the worst part of this is that, coming out of OSCon, I'm being painfully reminded of how good a presentation can be. The bar is set pretty high, and I'm crawling around on a bunch of websites trying to soak up as much presentation-fu as I can.

If anyone has a hint or two that they'd like to share, I'd love to read them. Thanks.