Tuesday, October 10, 2006

Improving Ruby Performance, One Library at a Time

I've been looking at performance lately, and several threads are starting to come together for me:

Zed: The whole process is really just the scientific method. Since I have limited information from Ruby about performance I have to just test, evaluate, adjust, and repeat until the measurements improve. What really helps is using statistical tests to confirm that each change made a difference, or at least didn't hurt things. Without these tests I could make changes that seemed to improve things but actually made no difference.
Zed Shaw

Dave: I spend a lot more time thinking about the algorithms than anything else. I use gprof to find the bottlenecks in my code and try to rework the algorithm so that that part of the code gets called less. Then I may try and optimize the code but only in extreme cases. The other tools I really like for C development are gdb and valgrind. For those who don't know, valgrind is a debugging and profiling tool which is particularly good for finding memory related errors in C programs. I usually use it for debugging rather than profiling and I don't know how I lived without it. Unfortunately it doesn't play nice with Ruby as Ruby's garbage collector throws up a lot of red flags so I've had to overcome this by building a pretty large suppression file to get valgrind to ignore all of the Ruby errors. I still worry that I'm also suppressing errors that could be raised by Ferret but it seems to be doing a good job. Another tool I'm really starting to like is gcov which is great for checking test coverage as well as profiling.
Dave Balmain

zenspider: people get so myopically focused on using C to make things faster that they don't bother looking at their algorithms or data-structures. It is sad. Ruby may be slow for method dispatch, but bad code can be slow in ANY language. . . . C doesn't make ruby fast. Avoiding method dispatch makes ruby fast. You can do that using pure ruby quite a bit of the time by applying your noodle.
zenspider

John: What this really tells me is simple ... algorithms matter...
John Duimovich

Ruby isn't the fastest language on the block, but it's fast enough for me. Does that mean it's fast enough? Probably not. There are three main places where Ruby could be improved: in my code, in the libraries that I use, and in the Ruby core. John, zenspider, Dave, and Zed all have good advice, but it all boils down to John's: algorithms matter, and where they're used matters too. My own code is the easiest for me to change, but the greatest effect comes from changes made as close to the core as we can manage.

Have you looked at the performance of the libraries you rely on? Maybe you should. If you find ways they could be improved, contribute a patch, or (at least) talk to the implementor. Consider it a call to action. If every Ruby user made just one small improvement, think of the effect it would have on the language as a whole. Sure, it takes a bit of effort, but it's worth it!
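If you want somewhere to start, here is a minimal sketch of measuring a call repeatedly, in the spirit of Zed's advice. It uses only the standard library's Benchmark module. The helper names (sample_times, mean, stddev) and the Array-versus-Set workload are my own stand-ins for whatever library call you care about, and comparing a mean and standard deviation is only a rough check, not a proper statistical test.

```ruby
require 'benchmark'
require 'set'

# Hypothetical helper: run the given block `runs` times and
# return the wall-clock time (in seconds) of each run.
def sample_times(runs)
  Array.new(runs) { Benchmark.realtime { yield } }
end

# Arithmetic mean of a list of numbers.
def mean(xs)
  xs.sum / xs.size.to_f
end

# Sample standard deviation (divides by n - 1).
def stddev(xs)
  m = mean(xs)
  Math.sqrt(xs.sum { |x| (x - m)**2 } / (xs.size - 1))
end

# Stand-in workload: the same membership test against two data structures.
haystack_array = (1..5_000).to_a
haystack_set   = haystack_array.to_set
needles        = (1..5_000).step(10).to_a

array_times = sample_times(10) { needles.each { |n| haystack_array.include?(n) } }
set_times   = sample_times(10) { needles.each { |n| haystack_set.include?(n) } }

printf("%-6s mean=%.6fs stddev=%.6fs\n", "Array", mean(array_times), stddev(array_times))
printf("%-6s mean=%.6fs stddev=%.6fs\n", "Set", mean(set_times), stddev(set_times))
```

If the difference between the two means is small next to the standard deviations, the change probably made no real difference, which is exactly the trap Zed describes.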

If you found this post helpful, you might want to look at my ruby-prof post collection.

2 comments:

Keith Fahlgren said...

So, here's a novice's question:

Can someone give a good example of refactoring code to reduce Ruby's method dispatching?

pate said...

Keith,
zenspider's blog post did a great job of this.

http://blog.zenspider.com/archives/2006/09/recursive_functions_in_rubyinline.html

I'll let you go look at it yourself, but here are the (trimmed) results:

% ./fib.rb 10000 15
# of iterations = 10000
                 total        real
fib-ruby     14.510000   14.628854
cfib          0.150000    0.147117
fib-cached    0.010000    0.012523

Not only does his caching fibonacci method beat the pants off of the straight Ruby implementation, it trounces the inlined C version as well.
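To give a feel for why caching wins, here is a rough sketch of the idea in plain Ruby. This is not zenspider's code; the names fib_naive, fib_cached, and FIB_CACHE are made up for illustration, and the iteration count is smaller than in his run. The naive version makes an exponentially growing number of method calls, while the cached version computes each value once and then answers from a Hash.

```ruby
require 'benchmark'

# Naive recursion: each call dispatches two more calls,
# so the total number of method calls grows exponentially with n.
def fib_naive(n)
  n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2)
end

# Hypothetical cache, seeded with the two base cases.
FIB_CACHE = { 0 => 0, 1 => 1 }

# Cached version: each value is computed once and stored;
# later calls are a single Hash lookup.
def fib_cached(n)
  FIB_CACHE[n] ||= fib_cached(n - 1) + fib_cached(n - 2)
end

iterations = 1_000
n = 15

Benchmark.bm(12) do |x|
  x.report("fib-naive")  { iterations.times { fib_naive(n) } }
  x.report("fib-cached") { iterations.times { fib_cached(n) } }
end
```

The exact timings will vary by machine and Ruby version, but the shape should match the results above: the cached version does far fewer method dispatches, so it has far less work to do.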

There's probably room for another post or two here though.