Thursday, December 04, 2008

Benchmarking and Refactoring

This blog post is pulled from the Profiling and Optimizing Ruby tutorial I wrote for IBM DeveloperWorks a couple of years ago. If you'd like to reuse it, please note that the initial publication at DeveloperWorks must be attributed. I've reposted it here so that people can access it without signing up for a DeveloperWorks account.

Another section of this tutorial, Benchmarking Makes it Better was posted yesterday, it might be worth reading first if you don't have a good handle on benchmarking already.

If you're not comfortable with profiling Ruby Code, you might want to look at some of my other articles on Profiling:

The original tutorial used the following "shell_grabber.rb" script as example code:


file = "/etc/passwd"
File.open(file).each do |line|
  if line.match('/bin/bash') then
    print line.split(':')[0]
    puts " uses bash"
  end
end

Following the profiler

Now that you've seen profiling and benchmarking at work, it's time to put it all together. Back in listing 5 () you saw the profiling results of profiling our shell_grabber.rb script. The first five lines provide the biggest opportunities for optimization, so I've repeated them in


  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 42.55     0.20      0.20        1   200.00   470.00  IO#each
 21.28     0.30      0.10      690     0.14     0.20  Kernel.puts
 10.64     0.35      0.05     2070     0.02     0.02  IO#write
  8.51     0.39      0.04      690     0.06     0.07  Kernel.print
  6.38     0.42      0.03     1242     0.02     0.04  String#match

We've got two different places to try to improve our code; the use of a print and a puts in the same block (lines 4 and 5 in listing 12) which take up .69 seconds and 1380 total calls in our profiling run, and the construction of our match method which we call 1242 times (using .42 seconds). If we can clean either of these up, it will represent a win. If we can get both — so much the better.

Since these are two separate refactorings, we'll walk through them individually. Both follow the same pattern though:

  • Isolate the code you want to change
  • Benchmark your options
  • Make the change (if appropriate)
  • Test your new version
  • Check in your changes
In this tutorial, you actually get to cheat a little bit. Since this example only has one method, there's only a very small isolation step to worry about.

Cleaning up your match method

One way to explain the pain you're feeling here is that line.match('/bin/bash') is being repeated on each iteration. Ruby allows you to build a Regex object once then refer to it multiple times later. Listing 13 shows a benchmarking script which tests the impact of building our Regex prior to instead of during our loop. Here's what the benchmarking code would look like:


require 'benchmark'

n = 1000
file = "fixture"

Benchmark.bmbm(25) do |x|

x.report("build regex in block") do
    for i in 1..n do
      File.open(file).each do |line|
        if line.match('/bin/bash') then
          $stderr.print line.split(':')[0]
          $stderr.puts " uses /bin/bash"
        end
      end
    end
  end

  x.report("build regex prior") do
    for i in 1..n do
      re = /\/bin\/bash/
      File.open(file).each do |line|
        if line.match(re) then
          $stderr.print line.split(':')[0]
          $stderr.puts "uses /bin/bash"
        end
      end
    end
  end

end

Here are the results:


$ ./re_bench.rb 2> /dev/null
Rehearsal ------------------------------------------------------------
build regex in block      17.610000   0.820000  18.430000 ( 18.692166)
build regex prior          5.100000   0.340000   5.440000 (  5.465884)
-------------------------------------------------- total: 23.870000sec

                               user     system      total        real
build regex in block      17.550000   0.790000  18.340000 ( 18.421562)
build regex prior          5.170000   0.340000   5.510000 (  5.514125)

There's a pretty obvious win (over a 70% reduction in running time) by moving the algorithm out of the loop. This is a simple change to make:


file = "/etc/passwd"
File.open(file).each do |line|
  re = /\/bin\/bash/
  if line.match(re) then
    print line.split(':')[0]
    puts " uses bash"
  end
end

Having made the change, we need to test it to verify that everything works as planned. The simplest way to test a script this short is to verify the output. In this case we can run the output of the original script and of our new version through diff to ensure that they're the same. diff orig_output new_output exits with a return value of 0, meaning that there's no difference between the output of our two versions. For any real code, you really want to run a real test suite written with Test::Unit or Rspec.

While it's not strictly necessary, we can also look at the real time reductions in our new version. Using time with both versions of the script shows 0m0.064s of real time for the original version and 0m0.046s of real time for the new version — not quite the 70% reduction we saw in the benchmark, but a healthy change nonetheless. (Don't worry too much about not seeing the full improvement, time results can have hidden impacts from all manner of system activity — see also my blog post Benchmarking, Lies, and Statistics.)

Once you've run your tests and timed the results again, you should check it into your source control system so that you can revert if you later find a better way to do things.

Another Approach

It turns out that there's another way to improve the performance of your regex handling. Using the match method turns out to be slower than the using =~, but don't take my word for it. Let's benchmark it. A good benchmarking script is here:


require 'benchmark'

n = 1000
file = "fixture"


Benchmark.bmbm(25) do |x|

  x.report("build regex in block") do
    for i in 1..n do
      File.open(file).each do |line|
        if line.match('/bin/bash') then
          $stderr.print line.split(':')[0]
          $stderr.puts " uses /bin/bash"
        end
      end
    end
   end
  
   x.report("build regex prior") do
      for i in 1..n do
        re = %r{/bin/bash}
        File.open(file).each do |line|
          if line.match(re) then
            $stderr.print #{line.split(':')[0]
            $stderr.puts "uses /bin/bash"
          end
        end
      end
    end
  
    x.report("use =~ pattern") do
      for i in 1..n do
        File.open(file).each do |line|
          if line =~ /\/bin\/bash/ then
            $stderr.print #{line.split(':')[0]
            $stderr.puts "uses /bin/bash"
          end
      end
    end
  end
end

And the results are:


Rehearsal ------------------------------------------------------------
build regex in block      17.480000   0.470000  17.950000 ( 17.997961)
build regex prior          4.980000   0.330000   5.310000 (  5.322441)
use =~ pattern             4.230000   0.300000   4.530000 (  4.629864)
-------------------------------------------------- total: 27.790000sec

                               user     system      total        real
build regex in block      17.590000   0.480000  18.070000 ( 18.620850)
build regex prior          5.090000   0.310000   5.400000 (  5.498786)
use =~ pattern             4.300000   0.310000   4.610000 (  4.652404)

Our first change represents a 70% improvement over the original, but our new change is a 75% improvement according to the benchmark — a change worth making. Since you're keeping everything in some kind of source control, your first step will be to revert your last change. Then you can change the script to match this:


file = "fixture"
File.open(file).each do |line|
  if line =~ /\/bin\/bash/ then
    print line.split(':')[0]
    puts " uses bash"
  end
end

Once the changes have been made, you can go back and verify that the scripts generate the same output using diff as shown previously. Once they pass, you can check your new code in and move on.

Printing Once

We have two issues to consider with the way we're printing lines in our script. First, we we've got two distinct printing calls. Second we're using puts which adds the overhead of adding a \n to every line that doesn't already have one. Before we start making changes, let's do some benchmarking. Here's a banchmarking script:


require 'benchmark'

n = 50_000

Benchmark.bmbm(15) do |x|

  x.report("original version") do
    for i in 1..n do
      $stderr.print i, "and"
      $stderr.puts i
    end
  end

  x.report("single puts") do
    for i in 1..n do
     $stderr.puts "#{i} and #{i}"
    end
  end

  x.report("single print") do
    for i in 1..n do
     $stderr.print "#{i} and #{i}\n"
    end
  end

end

And here are the results:


$ ./print_benchmark.rb 2> /dev/null
Rehearsal ----------------------------------------------------
original version   0.370000   0.050000   0.420000 (  0.443343)
single puts        0.210000   0.020000   0.230000 (  0.240232)
single print       0.170000   0.020000   0.190000 (  0.181233)
------------------------------------------- total: 0.840000sec

                       user     system      total        real
original version   0.230000   0.030000   0.260000 (  0.272866)
single puts        0.220000   0.020000   0.240000 (  0.239319)
single print       0.170000   0.010000   0.180000 (  0.180624)
$

Both single printing call versions show an improvement over the original two printing calls version of the code, but the print version is significantly better than the puts version (a 33% improvement versus a 12% improvement). That makes your decision easy, the new version of the script (using a single print method) is shown below:


file = "fixture"
File.open(file).each do |line|
  if line =~ /\/bin\/bash/ then
    print "#{line.split(':')[0]} uses bash\n"
  end
end

Again, run your tests so that you know you've got a good version. time let's us see just how well our optimization worked. In this case, we got 0m0.046s for the previous version and 0m0.026s for our new version — a 40% speedup.

Refactoring Roundup

In the course of this post, we've made a reverted one change, and made two others. Our code is a bit more than 10% smaller, and runs much faster. Although you don't have unit tests, you've been able to verify your output using a functional test (and thus verified your code quality) at every step of the way, so you also know that you've not introduced any new errors.

While the script that you've been working on is small, you'll follow the same methodology on bigger programs.

First, make sure that your spending your time wisely. Going through a lot of work to speed up a script that takes under a second to run is probably not worth your time for a command line tool, but if you're building a program that will be part of a heavily used website even nano-seconds may count. (You'll need to make sure you've got enough iterations to be able to measure differences above the nano-second scale though, since any variation that small is liable to be caused by outside factors.) time and profile are the tools you'll rely on here.

Second, isolate the code you want to change. In our case, the isolation was pretty simple, but when you're working with an object with multiple public and private methods it will be a bit more difficult. There aren't really any tools to help with this, it all comes down to reading the profiler output and knowing your code.

Third, benchmark your options. Once you've identified and isolated the code your going to change, you can move it (perhaps with small changes) into a benchmarking script. Once you have benchmarking output you can make informed decisions about how to change your code.

Fourth, make the changes indicated by profiling and benchmarking. Now that you know what works the best, you can transplant that back into your actual code.

Fifth, test your new code. Always test your code. Test your code before every commit. Unit tests make this better and easier for almost all of your code, but be ready to roll your own tests in the odd cases that it doesn't.

Sixth, check in your changes. Revision control is incredibly important, it will save you at some point.

2 comments:

Derek said...

Beneath the "Printing Once" header, the third test is supposed to represent a single print statement, but you still used puts.

pate said...

Derek,
thanks for catching that. I've updated the script appropriately.