I already updated the original post to deal with a comment by ‘gravity’. In a later comment, ‘Ganryu’ wrote “I don’t get it… isn’t this mainly IO dependent?” Since he doesn’t have access to the code for LogWatchR he can’t do the profiling to find this out, but it’s the same kind of assumption that a lot of people make when they decide to ‘optimize’ code.
LogWatchR runs fast enough for my purposes (it takes less than 15 seconds to analyze 20 minutes worth of log entries), so I’ve never bothered profiling it before. Since it’s come up, I decided to give it a whirl. I cranked up the latest version of ruby-prof and saw the following:
/usr/bin/ruby-prof logwatcher.rb < 20_minute_log Thread ID: 1075677140 Total: 32.65 %self total self wait child calls name 39.75 24.43 12.98 0.00 11.45 73607 Array#include? 35.07 11.45 11.45 0.00 0.00 25329928 String#== 10.75 3.51 3.51 0.00 0.00 73607 Hash#keys 2.91 0.95 0.95 0.00 0.00 211698 String#=~ . . .
Given that, I feel confident in saying “No, Ganryu, this isn’t mostly IO dependent.” On the other hand, I’m a bit puzzled by the massive number of calls to String#==. There are only two places that I call == directly, they’re both in a single method and it’s only called once per log entry so I would think that I’d only see it about 146,000 times. I guess that it’s being called implicitly by something I’m doing, but I’m not sure what. Sounds like a good investigation for another day.
The moral of this little post? If your code’s not running fast enough for you, profile it before you decide where to ‘fix’ it. You’ll save yourself a lot of time and grief.
Update: Doh! I just realized where that huge number of == is coming from, it's in the Array#include? which is a part of the code I've never been convinced is needed. It looks like I could go a lot faster by getting rid of it, something to remember if I ever need to worry about speed.
If you found this post helpful, you might want to look at my ruby-prof post collection.