On Ruby: performance

Showing posts with label performance. Show all posts

Thursday, February 21, 2008

JRuby 1.1 RC2 Real World Performance

With the recent release of JRuby 1.1 RC2 I’ve rerun my ‘Real World Performance’ benchmarks. In addition to my normal run I’ve added a second longer run, and the results surprised me. Let me tell you what I did, and then we can take a look at the results.

In both cases I use an application that builds a catalog of good and bad patterns. Then it reads in syslog output and compares the log entries against the catalogs, sending different kinds of notifications depending on the configuration. While this sounds like it might be IO bound, profiling shows that much of the time is actually spent in Array operations and comparisons.

The original, shorter benchmark does ten iterations. Each iteration starts JRuby, creates the catalogs, then processes 73607 log entries.

The new, longer benchmark does three iterations (I didn’t want to take too much time). Each iteration starts JRuby, creates the catalogs, and processes 407749 log entries.

With that out of the way, here are the original style results:

Version	Average Run Time	stdev	time/syslog entry
JRuby 1.0.3	21.63	0.21	0.00029
JRuby 1.1 RC1	15.79	0.27	0.00021
JRuby 1.1 RC2	15.74	0.24	0.00021

For JRuby 1.0.3, I set the JAVA_OPTS like this:

JAVA_OPTS="-Djruby.objectspace.enabled=false \
  -Djruby.jit.enabled=true -server"

and ran each iteration like this:

jruby -C logwatcher.rb < 20_minute_log

JRuby 1.1 had the following JAVA_OPTS:

JAVA_OPTS="-Djruby.objectspace.enabled=false \
  -Djruby.compile.mode=JIT -server"

RC1 was executed like this:

jruby -C logwatcher.rb < 20_minute_log

and RC2 like this:

jruby -X-C logwatcher.rb < 20_minute_log

(Between RC1 and RC2, the JRuby team added the -X prefix to the compilation switch to prevent a conflict with MRI’s -C switch.) The -C switch turns off compilation, which is faster on these shorter runs. I also tried using the +C, but am ignoring the results for now.

The results of the new style benchmark looks like this:

Version	switches	Average Run Time	stdev	time/syslog entry
JRuby 1.0.3	-C	127.90	15.8	0.00174
JRuby 1.1 RC1	-C	231.86	30.2	0.00315
JRuby 1.1 RC2	-X-C	218.93	34.89	0.00297
JRuby 1.1 RC2	-X+C	193.15	8.56	0.00262

The first thing that jumps out at me is that the 1.1 release candidates are significantly worse that the 1.0.3 release. The other thing is that performance (measured per entry) degraded on the longer run. The former is something that the JRuby team might want to look at, while the latter means I probably need to look at my code.

I used the same JAVA_OPTS settings shown in the original style tests, but I did run the RC2 version with compilation turned on (and did see an improvement). I should probably set aside a couple of hours to let these tests run against an even larger data set at some point.

Please, don’t assume that your app will end up with the same performance characteristics that LogWatchR has. Run your own performance measurements … and think about posting them, it will help all the implementations get better.

Friday, January 11, 2008

Real World Rubinius Performance

Well, I have good and bad news from the Rubinius front. This morning, I built the latest Rubinius from the git repository, and gave LogWatchR another try … and it worked! This is a huge step forward from my perspective, since I’ve had all kinds of wierd failures in the past.

The bad news (well, bad is an overstatement, let’s say ‘not so good news’) is that the performance is pretty bad at this point. To be fair, the Rubinius team is still focused on completeness. I’d expect performance to improve once they turn their eye to it.

Here’s my 30 second recap:

My ‘real world performance’ timings run a small Ruby app called LogWatchR, which scans syslog data for cataloged good and bad patterns. I feed it about 73,000 syslog entries (about 20 minutes worth) that I’ve got archived. I measure the execution time against repeated runs, and find the average run time and standard deviation.

Rubinius’ first successful run clocked in at an average of 143.03 seconds, with a standard deviation of 0.56 seconds.

Given that things took so long, I thought I’d try a quick profiling run too (I only ran against 1000 log entries though, to save time). Here’s the abbreviated version of what the profiler told me:

../shotgun/rubinius -r profile logwatcher.rb < short_log
Total slices: 160, 2680000 clocks
 % time   slices   name
   8.75       14   String#[]
   8.12       13   Hash#[]
   6.88       11   Array#<<
   6.88       11   Object#kind_of?
   5.00        8   #.single_load
   4.38        7   String#split
   4.38        7   Hash#keys
   4.38        7   Regexp#match_from
   3.75        6   Hash#[]=
   2.50        4   String#substring
   2.50        4   Array#replace

Now for the amazing part, the sampling debugger took almost no extra time, clocking in at 3.009 seconds of real time while a regular run took 2.805 seconds. Wow! I’m looking forward to seeing what the Legion of Rubinius Heroes can do with performance over the next couple of months.

If you’re interested in seeing more performance information, you might want to look at:

JRuby 1.1 RC1 Real World Performance

Real World Performance on Boxing Day

I’ve also written about profiling if you’d like to read more about that.

Tuesday, January 08, 2008

JRuby 1.1RC1 Real World Performance Update

With the release of JRuby 1.1RC1 I’ve run a new set of LogWatchR performance tests. This time, I’ve only run a set of 1.0 and 1.1 versions of JRuby. If you’re really interested in seeing more about 1.8 and 1.9 performance, you can always go look at my older post on the topic.

As a bit of clarification up front, this test measures the execution time of a simple minded log analysis tool I wrote/use against a 20 minute sample of syslog output. Contrary to the common assumption, this is not an IO bound process (see here for further details). I’m running this test with the following flags through 1.1b1:

JAVA_OPTS="-Djruby.objectspace.enabled=false \
  -Djruby.jit.enabled=true -server" 
jruby -C

Version 1.1RC1 says that jruby.jit.enabled=true is no longer valid, so I’m usingthis instead:

JAVA_OPTS="-Djruby.objectspace.enabled=false \
  -Djruby.compile.mode=JIT -server" 
jruby -C

Build	Avg Execution Time (in secs)	Std Dev
Jruby 1.0.2	21.03	0.69
Jruby 1.0.3	21.22	0.39
Jruby 1.1b1	18.8	0.95
Jruby 1.1RC1	15.41	0.22

It looks like the JRuby team is continuing to do great things with performance. This build is a 27% improvement over the 1.0.2 release and an 18% improvement over the first 1.1 beta.

I’m not a JRuby or Java performance geek, so if anyone on the team would like to lend me some performance enhancing clues I’d be happy rerun the tests.

UPDATE: Based on Charles comments below, I've now tried this with:

+C, which compiles every method
-C, which turns off compilation
the default compilation (no C switch at all) which will compile any method called more than 20 times

Since I'm making fairly small runs, any compilation is a pretty big hit to my run times (e.g., +C takes 25 seconds to complete on average). Maybe this weekend will be a good time to try some much longer runs.

Wednesday, December 26, 2007

Real World Performance Profiling

It looks like my post on Ruby 1.9.0 performance is drawing some criticism over on on reddit.

I already updated the original post to deal with a comment by ‘gravity’. In a later comment, ‘Ganryu’ wrote “I don’t get it… isn’t this mainly IO dependent?” Since he doesn’t have access to the code for LogWatchR he can’t do the profiling to find this out, but it’s the same kind of assumption that a lot of people make when they decide to ‘optimize’ code.

LogWatchR runs fast enough for my purposes (it takes less than 15 seconds to analyze 20 minutes worth of log entries), so I’ve never bothered profiling it before. Since it’s come up, I decided to give it a whirl. I cranked up the latest version of ruby-prof and saw the following:

/usr/bin/ruby-prof logwatcher.rb < 20_minute_log
Thread ID: 1075677140
Total: 32.65
 
 %self     total     self     wait    child    calls  name
 39.75     24.43    12.98     0.00    11.45    73607  Array#include?
 35.07     11.45    11.45     0.00     0.00 25329928  String#==
 10.75      3.51     3.51     0.00     0.00    73607  Hash#keys
  2.91      0.95     0.95     0.00     0.00   211698  String#=~
 .
 .
 .

Given that, I feel confident in saying “No, Ganryu, this isn’t mostly IO dependent.” On the other hand, I’m a bit puzzled by the massive number of calls to String#==. There are only two places that I call == directly, they’re both in a single method and it’s only called once per log entry so I would think that I’d only see it about 146,000 times. I guess that it’s being called implicitly by something I’m doing, but I’m not sure what. Sounds like a good investigation for another day.

The moral of this little post? If your code’s not running fast enough for you, profile it before you decide where to ‘fix’ it. You’ll save yourself a lot of time and grief.

Update: Doh! I just realized where that huge number of == is coming from, it's in the Array#include? which is a part of the code I've never been convinced is needed. It looks like I could go a lot faster by getting rid of it, something to remember if I ever need to worry about speed.

If you found this post helpful, you might want to look at my ruby-prof post collection.

Real World Performance On Boxing Day

Well, Ruby 1.9.0 landed yesterday, as expected. I’d be remiss if I didn’t start out by thanking matz, ko1, and all the other hackers involved in getting this milestone release out the door. It’s a great step for Ruby, and one that we’ve been waiting a long time for.

The bad news is that 1.9.0 is just a development release branch leading up to 2.0, and it doesn’t yet run Rails or Mongrel. There have been some scattered results on the mailing list of other problems as well. I’m sure they’ll get ironed out over the next several releases (which I hope will be frequent during the next year). There’s even a 1.9 specific library already—rev was released earlier this morning.

Fortunately, 1.9.0 is able to run LogWatchR just fine, so I’ve rolled out a new set of results from my ‘Real World Performance Test’. I hadn’t yet run a test using the JRuby 1.0.3 release, so I’ve included that as well. This time, I decided to show you the raw data for each version of Ruby I tested as well as the final results. I’m still using Ruby 1.8.5-12 for my baseline, as that’s our standard version at work (we may convert to JRuby 1.1 once that’s available—we’ll see). In any case, here are the numbers:

Ruby 1.8.5-12	Ruby 1.9.0-14709	JRuby 1.0.3	JRuby 1.1b
11.40	10.53	28.07	20.97
10.56	10.70	26.67	18.99
10.39	10.36	22.04	18.76
10.47	10.41	21.95	20.43
10.49	10.41	21.79	18.76
10.41	10.53	21.97	18.86
10.81	10.42	22.45	19.35
10.43	10.84	22.23	18.97
10.97	10.34	21.81	19.03
10.41	10.39	21.78	21.43

And here are the results:

	Ruby 1.8.5-12	Ruby 1.9.0-14709	JRuby 1.0.3	JRuby 1.1b
Average	10.63	10.49	23.08	19.56
Std Dev	0.33	0.16	2.3	1.0
Perf	100.00%	101.34%	46.08%	54.38%

I was a bit discouraged to see that 1.9.0 isn’t that much faster for my needs. JRuby is still slow for me too, but the regexp work that Charlie and the boys have been working on holds a lot of promise, I think JRuby 1.1b2 will tell a different story. I’m anxious to see rubinus, but it’s not quite there yet.

Update: Someone on reddit asked why there's not more background on what these numbers mean. I've got more information in some of my previous posts on the topic:

Some Real World Performance Notes

More Real World Performance Notes

JRuby 1.0.1 Real World Performance Notes

JRuby 1.1 Performance Numbers

The short story is that the times above are the time required for LogWatchR to run through about 75,000 syslog entries (about 20 minutes worth) and report on known bad or unknown log entries.

Monday, August 27, 2007

JRuby 1.0.1 - Real World Performance

The other day, Thomas Enebo and the JRuby gang cut a 1.0.1 release of JRuby and I finally got around to benchmarking it against my LogWatchR app. I used the same data set and Ruby versions as previously (I need to upgrade my 1.9.0 install to see how that work's been going, but it won't happen today). (You can see the previous version of this benchmark here.)

This time around, JRuby showed significant performance improvements. (Oddly, the 1.8 and 1.9 benchmarks looked better too, I don't know if that's from a small code change or just that my system is running a little bit lighter these days.) My stock Ruby install ran the tests in an average of 9.294 seconds (+/- .41 seconds). My 1.9 install run them in 9.394 seconds (+/- .11). The JRuby results are a bit more involved:

no flags
14.754 sec (+/- .26)
jruby -J-Djruby.jit.threshold=0 -J-server -O logwatcher.rb
19.370 sec (+/- .31)
jruby -J-Djruby.jit.threshold=0 -O logwatcher.rb
13.226 sec (+/- .30)
jruby -J-Djruby.jit.threshold=1 -O logwatcher.rb
13.256 sec (+/- .25)

All this means that JRuby is now 142-159% as slow as the installed version of Ruby for LogWatchR. That's a pretty big improvement over the 223-278% it was at the last time I ran the tests. Not bad for a test that Charles Nutter told me didn't have many performance fixes in it. I can't wait to see what happens when the JRuby guys actually start to work on performance.

rubinius is still not up to running LogWatchR, but it looks like the 0.10 version (expected in October) will be the release I get to start using. In the meantime, rubinius performance is starting to look pretty good too.

If you enjoyed this article on profiling, you might also want to look at these:

Some Real World Performance Notes

More Real World Performance Notes

JRuby 1.1 Performance Numbers

Wednesday, April 25, 2007

More Real World Performance Data

With the new release of JRuby, I’decided to rerun my LogWatchR benchmark for ruby 1.8.5p12, ruby 1.9.0, and JRuby 0.9.9. Neither XRuby nor rubinius can run the YAML library from the Standard Library yet, so neither of them will be included in these benchmarks yet.

In running this set of tests, I ran into some anomolies with last weeks data. I’ve figured it out now, and need to make a quick apology. The 1.8.5 results last week were run with my laptop on battery power (and thus stepped down), while the 1.9 tests were run while it was connected to regular power. This was a bigger cause of the variation in times than improvements in 1.9. I’m reporting both sets of data this week, but will be excluding battery driven runs starting with the next report.

There was a note on programming.reddit.com asking me to publish LogWatchR so that other people could repeat my benchmarking. I will be releasing LogWatchR at some point (hopefully soon), but I’d rather see people run their own applications for each of the implementations. If it won’t work now, work with the implmentors—in every case I’ve seen, they’re happy to help and interested in your results. So, please, give your application a try and let us know how it looks!

Without further ado then, here are the results:

On Battery Power
Ruby Implementation	Time
ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-linux]	31.0 +/ 3.1
ruby 1.9.0 (2007-04-18 patchlevel 0) [i686-linux]	31.1 +/ 3.27
ruby 1.8.5 (2007-04-23 rev 3539) [i386-jruby0.9.9]	67.5 +/ 7.15
On Regular Power
ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-linux]	12.0 +/ 1.2
ruby 1.9.0 (2007-04-18 patchlevel 0) [i686-linux]	12.4 +/ 1.8
ruby 1.8.5 (2007-04-23 rev 3539) [i386-jruby0.9.9]	33.3 +/ 6.83

Update!

The JRuby guys pointed me in the right direction to squeeze some extra improvment out of their platform. Instead of running jruby logwatcher.rb < 20_minute_log, I can run JAVA_OPTS=”-Djruby.objectspace.enabled=false -Djruby.jit.enabled=true”; jruby -C logwatcher.rb < 20_minute_log and take advantage of their compiler and JIT work. Following their advice I got the following results (on regular power):

Ruby Implementation	Time
ruby 1.8.5 (2007-04-23 rev 3539) [i386-jruby0.9.9]	26.7 +/ 3.79

These results are about a 20% improvement. That’s a good argument for really knowing your platform. JRuby still isn’t competing with Ruby’s speed, but it handles 20 minutes worth of log entries in about 27 seconds so I’d have to say it’s fast enough for what I’m doing. (And, I have this sneaking suspicion that the JRuby team has some more optimization work in their plans.)

If you enjoyed this article on profiling, you might also want to look at these:

Some Real World Performance Notes

JRuby 1.0.1 Real World Performance Notes

JRuby 1.1 Performance Numbers

Thursday, April 19, 2007

Some Real World Performance notes on Ruby 1.8 and 1.9

Update:

The performance numbers for Ruby 1.8.5 (below) are incorrect due to a mistake on my part. For updated in formation, please see my newer performance post.

p>One of the knocks against Antonio's Ruby Performance Shootout is that it uses a synthetic set of tests (the one's Ko1 wrote to exercise the parts of YARV he'd already done some optimization on). In an effort to get a 'more real' set of numbers (at least for me), I decided to use my LogWatchR tool to do some benchmarking.

LogWatchR is a program to watch syslog (we aggregate several hundred boxes' syslog entries to a pair of servers at my $JOB). I wrote it to use catalogs of good and bad entries (with dependencies, number of occurence thresholds, etc.) written in YAML and stored in external files, so that even a Junior Sys Admin could add new catalog entries pretty easily. To keep myself honest, I have a 20 minute sample of log entries and a simple pair of catalogs I use:

73607 log entries in the sample
1 entry in the good patterns catalog
5 entries in the bad patterns catalog

I can use this to make sure LogWatchR is performant, and works as expected.

In order to time LogWatchR, I ran the 20 minute sample through the program 5 times on each implementation of Ruby. Here are the results I got:

version	running time (in seconds)
ruby 1.8.5 (2006-12-25 patchlevel 12)	34.829 +/- .287
ruby 1.9.0 (2007-04-18 patchlevel 0)	12.967 +/- .095

At this point, all I can say is I'm really impressed with the work that the 1.9 team has done to speed Ruby up. Anytime you can get a better than 2.5x performance boost, you've gotta be doing something right.

JRuby 0.9.8 and rubinius don't work yet. They both have YAML related failures. I've talked with Evan about rubinius, and he says he's hoping that YAML will be working soon. I've also sent in a bug report to Ola, and hope that we'll see a fix in 0.9.9.

I plan on rerunning this benchmark periodically, so hopefully we'll see JRuby and rubinius included in the results soon.

If you enjoyed this article on profiling, you might also want to look at these:

More Real World Performance Notes

JRuby 1.0.1 Real World Performance Notes

JRuby 1.1 Performance Numbers

On Ruby