On Ruby: December 2007

Monday, December 31, 2007

Rubinius on JRuby

It looks like the JRuby guys are getting serious about running Rubinius on their platform:

"I added the minimum dependencies needed to get it rbx building and running bin/ci the same with JRuby" Charles Nutter

They're not all the way there yet; Charles reports that he's still hitting a failure at the end of the build. It's getting closer though, and with the history of cooperation the Rubinius and JRuby teams have built, I believe they'll have this licked soon enough.

Friday, December 28, 2007

Ruby.NET Interview With M. David Peterson

M. David Peterson came out to the UtahValley.rb’s ‘Ruby on .NET’ night to represent the Ruby.NET project a while ago. The IronRuby presentation ended up going longer than we’d scheduled, so I wanted to ask him some questions about his view on the two Ruby projects and Ruby.NET specifically.

How much are the IronRuby and Ruby.NET projects working together?

M. David As far as I know, there is no collaboration, something I would LOVE to see fixed, ESPECIALLY as it relates to the RubyLib’s themselves. For what should be obvious reasons, being able to use the various RubyLib’s via both Ruby.NET and IronRuby would benefit both projects as well as developers writing Ruby code targeted at the .NET platform.

Where/how could they better cooperate?

M. David The development of both the RubyLib’s as well as extensions to those libs would be the ideal collaboration point, in my opinion. One of the future pain points that is bound to come to fruition is the (ability|inability) to use code targeted for Ruby.NET w/ IronRuby and vice versa. For example, making an external call to the .NET Framework Class Library (FCL) should work exactly the same regardless of the runtime engine/compiler being targeted. If it doesn’t, both projects are going to find friction in the development communities as it relates to adoption of either platform.

To me, anyway, the two most important points for both the Ruby.NET and IronRuby projects are,

Can I quickly and easily run my existing Ruby code, w/o change, via both runtime implementations?

If I make a call to the .NET FCL in my code, will that code run the same via Ruby.NET and IronRuby?

What’s the easiest way to get started with Ruby.NET for Windows, Mac, and Linux users?

M. David Begin with a specific development task and then start writing the code to make that work. Attempting to get an existing Ruby library to run can be frustrating due to the fact that the Ruby.NET project as it relates to full compliance with the Ruby language is not complete. And the real key benefit to Ruby.NET is the interop layer with other .NET languages such as C#, VB.NET, IronPython, etc. So while there is still more work to do before full compliance with the Ruby language is achieved, there are still some amazing things you can do as it relates to using the Ruby language to build and extend from existing .NET libraries.

What are some ways the Ruby.NET team is working with the larger Ruby community?

M. David Let’s just say that’s a work in progress. ;^)

What more could they be doing?

M. David A lot! Joining in the efforts to ensure cross-engine interop is critical in my opinion, so taking part in the ongoing efforts to develop a test suite that ensures proper compliance is therefore critical.

In addition I think it would make a lot of sense to join in the efforts of defining the future directions of the language itself. .NET has a lot to offer when it comes to providing solid use cases for language features. I think the biggest mistake any language design team could make would be to ignore the fact that with .NET there is a HUGE base of languages and therefore language features to pick and choose from that could benefit them.

Why should we trust Microsoft?

M. David You shouldn’t. At least as it relates to whether or not MSFT will do the “right thing” when that right thing is a choice between what’s right for the development community and what’s right for their shareholders. MSFT is a profit-driven company and will make their decisions based on what will generate revenue and what will not.

But in the case of the MPL’d projects, you don’t have to trust them. The .NET open source communities are active and vibrant, the Mono Project representing the most active and vibrant of them all, filled with passionate open source .NET developers. If MSFT falls short on any of their MPL’d projects, the community will jump in and either fork the project and/or fill in the holes as needed. Seo’s IronPython Community Edition project is a good example.

Ruby is a community-based language, with several OS implementations already in existence. With the added value of being able to interop directly with a HUGE base of development languages, the benefits of continuing forward with the Ruby.NET project regardless of what MSFT might do are obvious. Of course Ruby.NET is completely independent of MSFT and as such, there’s really no reason to be concerned with what MSFT does or does not do.

Wednesday, December 26, 2007

Real World Performance Profiling

It looks like my post on Ruby 1.9.0 performance is drawing some criticism over on reddit.

I already updated the original post to deal with a comment by ‘gravity’. In a later comment, ‘Ganryu’ wrote “I don’t get it… isn’t this mainly IO dependent?” Since he doesn’t have access to the code for LogWatchR he can’t do the profiling to find this out, but it’s the same kind of assumption that a lot of people make when they decide to ‘optimize’ code.

LogWatchR runs fast enough for my purposes (it takes less than 15 seconds to analyze 20 minutes worth of log entries), so I’ve never bothered profiling it before. Since it’s come up, I decided to give it a whirl. I cranked up the latest version of ruby-prof and saw the following:

/usr/bin/ruby-prof logwatcher.rb < 20_minute_log
Thread ID: 1075677140
Total: 32.65
 
 %self     total     self     wait    child    calls  name
 39.75     24.43    12.98     0.00    11.45    73607  Array#include?
 35.07     11.45    11.45     0.00     0.00 25329928  String#==
 10.75      3.51     3.51     0.00     0.00    73607  Hash#keys
  2.91      0.95     0.95     0.00     0.00   211698  String#=~
 .
 .
 .

Given that, I feel confident in saying “No, Ganryu, this isn’t mostly IO dependent.” On the other hand, I’m a bit puzzled by the massive number of calls to String#==. There are only two places that I call == directly; they’re both in a single method that’s only called once per log entry, so I would think that I’d only see it about 146,000 times. I guess that it’s being called implicitly by something I’m doing, but I’m not sure what. Sounds like a good investigation for another day.

The moral of this little post? If your code’s not running fast enough for you, profile it before you decide where to ‘fix’ it. You’ll save yourself a lot of time and grief.

Update: Doh! I just realized where that huge number of == calls is coming from: it's in the Array#include? call, which is a part of the code I've never been convinced is needed. It looks like I could go a lot faster by getting rid of it, which is something to remember if I ever need to worry about speed.
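
To see why Array#include? generates so many comparisons, here is a minimal sketch. It is not LogWatchR's code: CountedKey is a hypothetical class that counts calls to ==, and the 344-element list is an assumption chosen to roughly match the ratio of String#== calls to Array#include? calls in the profile above. It compares a linear scan over an Array with a lookup in a Set from Ruby's standard library.

```ruby
require 'set'

# Hypothetical key class that counts how often == is invoked on it.
class CountedKey
  @comparisons = 0
  class << self
    attr_accessor :comparisons
  end

  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    self.class.comparisons += 1
    other.is_a?(CountedKey) && name == other.name
  end
  alias eql? ==

  # Set and Hash use #hash to pick a bucket before falling back to eql?.
  def hash
    name.hash
  end
end

keys    = (1..344).map { |i| CountedKey.new("key#{i}") }
missing = CountedKey.new("not-there")

# Array#include? walks the array and compares against each element in turn.
CountedKey.comparisons = 0
found_in_array    = keys.include?(missing)
array_comparisons = CountedKey.comparisons

# A Set hashes the key first, so a miss needs few or no comparisons.
key_set = Set.new(keys)
CountedKey.comparisons = 0
found_in_set    = key_set.include?(missing)
set_comparisons = CountedKey.comparisons

puts "Array#include?: found=#{found_in_array}, comparisons=#{array_comparisons}"
puts "Set#include?:   found=#{found_in_set}, comparisons=#{set_comparisons}"
```

If the list being searched rarely changes, building the Set (or a Hash) once and reusing it is usually the cheaper option, but profile before and after to be sure.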

If you found this post helpful, you might want to look at my ruby-prof post collection.

Real World Performance On Boxing Day

Well, Ruby 1.9.0 landed yesterday, as expected. I’d be remiss if I didn’t start out by thanking matz, ko1, and all the other hackers involved in getting this milestone release out the door. It’s a great step for Ruby, and one that we’ve been waiting a long time for.

The bad news is that 1.9.0 is just a development release branch leading up to 2.0, and it doesn’t yet run Rails or Mongrel. There have been some scattered results on the mailing list of other problems as well. I’m sure they’ll get ironed out over the next several releases (which I hope will be frequent during the next year). There’s even a 1.9 specific library already—rev was released earlier this morning.

Fortunately, 1.9.0 is able to run LogWatchR just fine, so I’ve rolled out a new set of results from my ‘Real World Performance Test’. I hadn’t yet run a test using the JRuby 1.0.3 release, so I’ve included that as well. This time, I decided to show you the raw data for each version of Ruby I tested as well as the final results. I’m still using Ruby 1.8.5-12 for my baseline, as that’s our standard version at work (we may convert to JRuby 1.1 once that’s available—we’ll see). In any case, here are the numbers:

Ruby 1.8.5-12	Ruby 1.9.0-14709	JRuby 1.0.3	JRuby 1.1b
11.40	10.53	28.07	20.97
10.56	10.70	26.67	18.99
10.39	10.36	22.04	18.76
10.47	10.41	21.95	20.43
10.49	10.41	21.79	18.76
10.41	10.53	21.97	18.86
10.81	10.42	22.45	19.35
10.43	10.84	22.23	18.97
10.97	10.34	21.81	19.03
10.41	10.39	21.78	21.43

And here are the results:

	Ruby 1.8.5-12	Ruby 1.9.0-14709	JRuby 1.0.3	JRuby 1.1b
Average	10.63	10.49	23.08	19.56
Std Dev	0.33	0.16	2.3	1.0
Perf	100.00%	101.34%	46.08%	54.38%
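
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the summary rows from the raw timings. Two assumptions are mine rather than stated in the post: the Std Dev row is taken to be the sample standard deviation (dividing by n - 1), and Perf is taken to be the baseline average divided by each version's average. The helper names are hypothetical.

```ruby
# Timings in seconds, copied from the raw-data table above.
TIMINGS = {
  "Ruby 1.8.5-12"    => [11.40, 10.56, 10.39, 10.47, 10.49, 10.41, 10.81, 10.43, 10.97, 10.41],
  "Ruby 1.9.0-14709" => [10.53, 10.70, 10.36, 10.41, 10.41, 10.53, 10.42, 10.84, 10.34, 10.39],
  "JRuby 1.0.3"      => [28.07, 26.67, 22.04, 21.95, 21.79, 21.97, 22.45, 22.23, 21.81, 21.78],
  "JRuby 1.1b"       => [20.97, 18.99, 18.76, 20.43, 18.76, 18.86, 19.35, 18.97, 19.03, 21.43]
}
BASELINE = "Ruby 1.8.5-12"

def mean(values)
  values.inject(0.0) { |sum, v| sum + v } / values.size
end

# Sample standard deviation (assumed; divides by n - 1).
def sample_std_dev(values)
  avg    = mean(values)
  sum_sq = values.inject(0.0) { |sum, v| sum + (v - avg)**2 }
  Math.sqrt(sum_sq / (values.size - 1))
end

# Relative performance (assumed): baseline average over this average, as a percentage.
def perf(values, baseline)
  mean(baseline) / mean(values) * 100.0
end

TIMINGS.each do |name, values|
  printf("%-18s avg %.2f  std dev %.2f  perf %.2f%%\n",
         name, mean(values), sample_std_dev(values), perf(values, TIMINGS[BASELINE]))
end
```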

I was a bit discouraged to see that 1.9.0 isn’t that much faster for my needs. JRuby is still slow for me too, but the regexp work that Charlie and the boys have been doing holds a lot of promise; I think JRuby 1.1b2 will tell a different story. I’m anxious to see Rubinius, but it’s not quite there yet.

Update: Someone on reddit asked why there's not more background on what these numbers mean. I've got more information in some of my previous posts on the topic:

Some Real World Performance Notes

More Real World Performance Notes

JRuby 1.0.1 Real World Performance Notes

JRuby 1.1 Performance Numbers

The short story is that the times above are the time required for LogWatchR to run through about 75,000 syslog entries (about 20 minutes worth) and report on known bad or unknown log entries.

Saturday, December 22, 2007

Design Patterns In Ruby, a review

Welcome InformIT readers! You might also like to see my review of Eloquent Ruby. Enjoy!

UPDATE: If you like this review and want to learn more, check out the interview with the author, Russ Olsen.

The Professional Ruby Series from Addison-Wesley is rapidly becoming a heavyweight in Ruby book circles. I just received a complimentary copy of Russ Olsen’s Design Patterns In Ruby from them, and it looks like a great addition to their already solid line-up of books.

Russ breaks the book into three parts: Patterns And Ruby, Patterns In Ruby, and Patterns For Ruby. I think this is a good division and provides good coverage. Part II makes up the bulk of the book, while Parts I and III combined take up about a third of the total.

I really like the first chapter of Part I, “Building Better Programs with Patterns”. I’m less sold on the second chapter, “Getting Started with Ruby”; if you’re putting together an advanced book on Ruby, including a tutorial chapter seems a bit silly.

Part III presents three patterns (one to a chapter) for Ruby Development: DSLs, Meta-programming, and ‘Convention over Configuration’. I think the third is more of a Railsism than a Rubyism, but it’s certainly worth considering. Also in Part III are a short concluding chapter and two appendices. Appendix A discusses installing Ruby (again, I’m not sure why this one was included) and Appendix B provides references to further information.

Since Part II is so large, I thought I’d hit it last. Its thirteen chapters cover fourteen of the patterns presented in the Gang of Four Book. In selecting these, Russ says he leaned toward those most useful in writing code and those that change most between the original and the Ruby implementation. The patterns covered in this book are: Template Method, Strategy, Observer, Composite, Iterator, Command, Adapter, Proxy, Decorator, Singleton, Factory Method, Abstract Factory, Builder, and Interpreter.

Each of the pattern chapters in Parts II and III features three sections that I really liked: ‘Using and Abusing Foo’, ‘Foo in the Wild’, and ‘Wrapping Up’. In addition to being a good unifying touch, these gave a great overview of each of the seventeen covered patterns. I especially enjoyed the ‘Foo in the Wild’ sections, which make use of existing Ruby and Rails code to show the pattern in use.

All in all, Design Patterns In Ruby is a really good book. I’m planning on spending a lot more time with it, and it looks to have a regular spot at my desk.

Tuesday, December 18, 2007

Interview with Topher Cyll, Author of Practical Ruby Projects

Topher Cyll, author of Practical Ruby Projects, and I have traded a couple of emails since his book was released by Apress. Here's what we talked about:

What makes a Ruby project ‘practical’?

Topher Good question! I think the first requirement for a “practical project” is that you actually want to do it! Whether you do it for work, enjoyment, or just to learn doesn’t really matter. All the projects in this book are meant to be fun and intellectually engaging (art, games, algorithms, theory), but I think they all touch on useful, real world skills.

My other big requirement for a “practical project” would be that you can finish it in a reasonable amount of time. This is especially important in a book. We read books because it’s fun or we want to learn. There’s nothing practical about a project that you’re trying to do in your spare time, but just can’t seem to finish! Each of the projects in this book is well defined and easy to complete. Not to mention all the source code from the book is freely available (and MIT licensed), so you can take as many short cuts as you like.

Who are the eclectic programmers you’re trying to reach? What makes them eclectic?

Topher Seriously, this book was inspired by all the great Rubyists I’ve met at conferences and user groups. In general Ruby folks just seem to be curious, reflective, light-hearted, and (almost uniformly) working on some project in their spare time. Everyone I’ve met loves to tinker and try out new things. That makes them eclectic, and it also makes them amazingly fun people to write a book for.

How did you settle on the projects in the book?

Topher Every single project was something I’d always wanted to do. A few predated the book, but many were written right along with the chapters. I’d start by asking myself, “Is this cool?”

Live coding music. Cool?

Animation with SVG. Cool?

Beautiful Mac applications. Cool?

Implementing Lisp. Cool?

And so on…

If the answer was yes, it went into the book! (Hint, all those made it in).

Now, I can’t promise my definition of “cool” is exactly the same as yours, but if you like to play around with interesting and creative projects, I bet we’re pretty close.

Can you tell us about any projects you were thinking about including but which didn’t make the cut?

Topher A couple did fall by the wayside. For example, I wanted to do a chapter on Mousehole (Why’s cool web rewriting toolkit), but the timing with the newer 2.0 codebase just wasn’t going to work out.

I also really wanted to do a chapter about BioRuby, but didn’t have the right expertise to present a convincing project (maybe someday, with the right collaborator and written as an article).

But the eight best made it in, and I think they present a pretty unusual array of subjects!

Any plans for a followup book/blog/website?

Topher Now that the book is finished, I’m going to resume blogging on my website. I came up with a lot of great ideas for posts while writing the book, and it’ll be nice to work on some smaller units of writing for a bit.

What’s the next step for a budding Ruby hacker who’s just finished your book?

Topher I say dive into coding something that you’ve been meaning to build! Thanks to all the great Ruby libraries out there, there aren’t many projects that can’t be written in Ruby. And, personally, I learn best when I’ve got a practical application in front of me.

I hope everyone that’s interested has a read through, and I also hope all of you eclectic programmers out there keep blogging, speaking, and sharing. You’re what makes it so much fun to be a Rubyist right now!

Satish Talim's New Ruby Class, and an Interview

Satish Talim has announced another round of his beginning Ruby class, which is great news for the Ruby community. I wanted to learn a little bit more about this opportunity, so I took some time to interview three former participants: Marcos Souza, Chris Porter, and Michael Uplawski. Here’s what they had to say:

How did you hear about Satish’s Ruby class?

Michael I have been a fan of Paul Lutus’ Web-Site for a long time, but until recently had ignored his Ruby-pages. As I was looking for a way to turn programming into a hobby once more and to gain back a bit of the fun, I finally read Mr. Lutus’ recommendations, then ploughed rather aimlessly through the web. At the time, I was experimenting with, then disappointed by the language ‘D’. The appearance of Satish’s site and the way, he introduced to the course made me rest longer than just a few minutes.

Eventually I thought I could give Ruby a try. I checked back on Paul Lutus’ page, but there is no mention of rubylearning.com. So it was rather accidental, that I was finally glued to Satish’s class.

Chris I saw it announced on a blog, Ruby inside.

Marcos I have a previous experience on a web class in the past, so when I came to Ruby, googled with “free ruby course” and find it on the first line (lucky me!)

What made you decide to sign up for it?

Marcos First, I try the web tutorial and like the objective and direct style, then the next interest become a PDF version of it. To get it we need to sign up …

I read the eBook version and do all the exercises just in one month.

Then I start a review process with the class.

Chris I had been meaning to learn Ruby after seeing some great screencasts on Ruby and also Rails. I thought about just reading Pickaxe but when I saw this I thought it would be interesting to try a different approach. I liked the idea of having a course structure and a teacher to ask questions to so I thought I’d give it a go.

Michael The idea of learning the basics without being pressed to arrive at some specific objective convinced me. I was about 70% certain, that I would break up after a few days (which I do now, but this is due to different conditions and I will be back in January, latest).

Tell me how the class works.

Chris Each week or so Satish posts a new lesson which comprises a couple of pages of his online tutorial, a short review of that material and a couple of problems. You work through the tutorial and post your solutions and any questions in the thread. Usually there is discussion amongst the group about certain issues or grey areas and people post their own problems or links to other pages on the web.

Marcos That’s the best part: each one propose different answers, we agree, disagree, discuss, and refactor some code… That is really cool. And we learn a lot this way.

Michael When I signed-in most of the lessons had already been published. I just read Satish’s tutorial, most of the time online on the web but sometimes also the PDF-version. There are assignments, a few scattered on the pages and the final assignment, you have to solve at the end of a lesson. The solutions are published on the rubylearning.com Forum, where Satish arranged for one thread per lesson. You just attach your new solutions to those, already published.

You get additional insight by reading the code of the other participants. Questions are asked right on the spot if they concern the current lesson or result from the code fragments published.

Whoever can, will contribute to clarifying the points at issue, but discussions may deviate. Satish himself gives hints to either solve a problem or to improve the code, that we handed in. Frequently references to other sources on the web are made, where an individual aspect of Ruby might be dealt in detail.

What was the most important thing you learned from it?

Michael Ruby is fun. It is a language, that you can use for fun. Why this is so, I can guess and maybe derive through my experience with using C/C++ and Java. Ruby, as any other “new” language, combines the comforts of more established languages and lacks some of their defects.

Marcos In fact I confirm that this is the best way to learn something. To get the best results you just need one action be: PARTICIPATIVE

Chris I learned that there is always a better way of doing things, especially in Ruby! People would post solutions to problems and others would add their own solutions or improvements. It’s tempting to write Ruby just as you would in your previous languages but you can often find a simpler or more elegant solution if you utilise some of the unique language features.

Are there any books or tools you learned about because of the class that you just can’t live without today?

Chris I love noobkit: http://www.noobkit.com/ and made good use of Aptana. I got some good recommendations for books but I haven’t read them yet.

Marcos I’m reading Ruby By Example (by Kevin Baird), a brief overview exercise.

Michael I have not bought any book on Ruby, yet. In contrast to what was said somewhere else, the bookshelves in our shops are not crammed with literature on Ruby. I have learned from bad experience, that I need to touch a book and gallop through some pages, before I buy it. Recommendations from the web are nice to pick a few from the bunch, if there were any…

It sounds like you all liked the interactive parts of the class. Were there parts you didn’t like?

Michael This is a rather inconvenient question. I never thought about it and rather took the course as something coming along unexpectedly, put another way: a completely positive experience by nature… Would I like parts of it improved or changed (if you don’t mind)? Part of the quality, which we find in the forums, we achieve by meeting on the same level of (un-)experience. Good ideas and sometimes downright /solutions/ are really developing in the community or in private but may always be inspired by the discussion, be they amateurish all the same.

It would be an improvement, if we could collect these solutions or situations, which occurred in the forum and have someone present us one immaculate piece of code, as used occasionally by the masters of Ruby, themselves.

Chris Well it’s hard to criticise a free course but .. ! I would have preferred it if more or less everyone was on the same lesson at once. That might have made some of the discussion livelier though I understand that people should go at their own pace. Satish usually covered everything in the lessons in a remarkably concise way, one or two maybe could have been a bit longer.

Marcos The common forum format admits participants create topics, lots on the same subject… Sometimes it makes difficult to locate information. I think the new format could fix this.

If Satish were to announce a new class for 2008, would you rather see him put together a more advanced Ruby class or an introductory class for some other language (which one)?

Chris I think an advanced class would be great and the same format would work well for Rails too.

Marcos I will focus on Rails next year… If Satish include a “basic Rails class” I’m IN for sure!

Michael Two questions, really, maybe three.

1. Languages: Python, D, Ada95 (what, if not my own favourites).

2. Advanced Ruby class: Of course, I would like to plunge deeper into the topics which have always been fascinating: Sockets, general hardware-control, implementing and combining a few Design-Patterns in Ruby...

3. Would I like to see: NO. Satish did a great job with the introductory and the beginning advanced lessons on the language-core. There is but one authority, who can decide about how this work could or should be extended: Satish Talim. I am a little afraid, that adding too much contents or attaching completely new topics to his teaching-project will have it deviate in the end. As you address me directly with this interview: I want to learn Ruby. Full stop. That is why I came to rubylearning.com in the first place. I do not want the ambience of the site altered. Everybody else will have a different opinion. Mine is this.

Why would you recommend someone else take the class?

Michael If you are looking for a way to just learn Ruby, I can tell you, that Satish Talim is a great teacher. Choosing his free (like in free) course to learn the basics, is a most intelligent move. See, I did it. ;-)

Chris Definitely, it’s good to be able to talk with other learners as you go and Satish’s tutorials always cover the topics in a concise and easy to understand way. I think I learned more quickly and understood it better than had I done it on my own.

Marcos Just that. I really think that is the best way to learn.

Thursday, December 13, 2007

Philippe Hanrigou Interview (and more troubleshooting hints)

I reviewed ‘Troubleshooting Ruby Processes’ last week. Since I really liked it, I decided to take a little bit of time to talk to Philippe about it. Fortunately, he was gracious enough to answer my questions. If you’d like to grab a copy of his shortcut for yourself, you can buy one here. Philippe passed along a bunch of great hints and links that aren’t in his book, but you can read them all below.

You focus on three great tools, what three tools would you have included if you had more time/space/expertise?

Philippe DTrace would be at the top of my list. It is the only tool that can give you a dynamic and integrated view of the entire software stack: from external devices activity, to operating system dynamics, to the behavior of your Ruby code. This alone is a killer feature that gives you tremendous insight into the holistic behavior of your software stack. In addition, DTrace is a blessing to use in a production environment: it can always stay on, has zero overhead when not in use, and can enable or disable debugging probes in real time on a running kernel. In short, DTrace should be your troubleshooting tool of choice if you deploy Ruby applications on Mac OS X, Solaris or BSD. Unfortunately, due to publishing timing constraints, I was unable to include a chapter on DTrace in my short cut. I plan to publish a detailed article on DTrace early next year though, so stay tuned. In the meantime, Joyent’s examples provide a good starting point for Ruby and Ruby on Rails developers. Big kudos to the Joyent guys for all their work on Ruby DTrace instrumentation.

Another technique that I would love to document is how you can instrument the Ruby interpreter yourself with some minimal ad-hoc tracing code to target a specific problem. Inject your code at a well-chosen spot, recompile Ruby, rerun your program with your freshly instrumented interpreter… and get your questions answered! Obviously, this approach does require some understanding of Ruby internals but this is not as scary as it might seem. Eigenclass’s Self-study guide to the sources, Patrick Farley’s blog and the Ruby Hacking Guide are good resources to demystify the (presumed) complexity of Ruby internals.

There are tons of other powerful tools that can also be used to troubleshoot Ruby applications’ traditional ecosystem, most notably the network and the database. These tools are already fairly well-documented though, so if I had time to document yet one more thing, I would concentrate on visual tools built on top of DTrace such as Chime and Instruments.

If you were going to write a JRuby version of this book, what tools would it include?

Philippe I must admit that while I have extensive Java experience, I have limited exposure to JRuby at this point. Ola Bini graciously lent me some of his expertise on this topic, which I combined with some helpful tidbits below:

In most cases, the tools that are useful for JRuby are the same ones that are useful for troubleshooting any Java application.

In particular, the JVM’s thread dump feature is a powerful introspection mechanism for investigating dead-locks and understanding your application dynamics. Since Ruby threads are also Java threads in JRuby, you can tell a lot from a JVM thread dump. When you start the JVM with one of the -XX flags you even get a Java histogram as part of the thread dump.

On the other end of the spectrum, the easiest way to investigate a problem with JRuby is often to directly change the JRuby code, either by adding logging or a Java exception at the right spot and then rerunning the faulty code. This is surprisingly easy and terribly efficient!

JRuby also includes some support for helping with dumping runtime information. Say that you want to keep track of how often a specific piece of code is reached during runtime, you can do something like this:
    // Look up the shared runtime-information map and bump a named counter.
    Map rt = getRuntime().getRuntimeInformation();
    Integer count = (Integer) rt.get("SomeInvocationCount");
    if (null == count) {
        // First time this spot is reached
        rt.put("SomeInvocationCount", new Integer(1));
    } else {
        rt.put("SomeInvocationCount", new Integer(count.intValue() + 1));
    }
Everything in RuntimeInformation will be automatically dumped on exit. By combining this technique with instrumenting JRuby to provide additional logging or to raise exceptions, most of the time you get a fairly good grasp on what’s going on.

JConsole is another standard Java tool that proves to be very useful in the context of JRuby development. You can use it to attach to a running process and then inspect memory usage, GC statistics, existing threads and so on. Besides, JConsole can also be used to trigger a lot of interesting JMX events.

Finally, when it comes to memory leaks, the usual Java tools also come to the rescue. A useful technique is to get a heap dump with jmap and then inspect it with jhat or any standard Java heap dump analyzer. Give the SAP Memory Analyzer a try—Ola had a great experience with it.

What kinds of troubleshooting/development tools do the Ruby community need to develop to ‘take the next step’?

Philippe I believe that the Ruby community would benefit a lot from having a robust thread dump tool for MRI. The thread-dump project got started on this, but the current implementation does not seem to work on any of the platforms that I have tried it on. At this point, I am determined to implement a robust thread dump tool myself, and I would gladly welcome help from the community. Contact me if you are interested in participating.
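
As a rough illustration of the kind of information a thread dump reports, here is a minimal pure-Ruby sketch. It is not the robust MRI tool Philippe describes: it assumes a Ruby version that provides Thread#backtrace (1.9 and later), it only sees Ruby-level frames, and thread_dump is a hypothetical helper name.

```ruby
# Hypothetical helper: snapshot every live thread with its status and backtrace.
def thread_dump
  Thread.list.map do |thread|
    {
      :id        => thread.object_id,
      :status    => thread.status,
      :backtrace => thread.backtrace || []  # nil if the thread died in the meantime
    }
  end
end

# A worker that simply sleeps, so there is something besides the main thread to report.
worker = Thread.new { sleep }
sleep 0.01 until worker.status == "sleep"

dump = thread_dump
dump.each do |entry|
  puts "Thread #{entry[:id]} (#{entry[:status]})"
  entry[:backtrace].each { |line| puts "    #{line}" }
end

worker.kill
worker.join
```

A dump like this has to be triggered from inside the running program (from a signal handler, for example), which is one reason an external tool that can attach to a stuck process would still be valuable.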

There is also a whole ecosystem of powerful tools that could be built on top of DTrace. Using DTrace probes as a foundation, you could build powerful visual tools each specializing in a particular task, e.g. profiling, performance analysis, memory leak detection, etc. These tools could potentially give you a holistic view of your system including your Ruby code but also your operating system, the network and the database. This is an area where the Ruby community can really shine as it can be done in pure Ruby, and no specific system or interpreter level knowledge is required.

For the most hardcore members of the community it might be worth investing some time and energy to help push SystemTap forward. Unfortunately, while many Ruby applications are deployed on Linux, DTrace is not available on this platform. Due to licensing and unresolved issues there is actually very little chance that DTrace will ever be ported to Linux in the foreseeable future. The closest alternative to DTrace on Linux is SystemTap, which shares the same objectives but is not quite as mature yet. In particular, SystemTap still does not seem to provide support for traces in user space programs. So SystemTap could use some love before bridging the gap between Ruby programs and the operating system.

Finally, building troubleshooting tools for Ruby would be a lot easier if any Ruby developer could easily access the Ruby interpreter to instrument and extend it. This is why I consider the success of a project such as Rubinius to be the best way to take the platform to a whole new level in the long term. Any time and effort that the community invests in Rubinius will be well spent.

If you want to read more about Rubinius, you might take a look at my collection of Rubinius posts.

Ruby Dev Tools Survey Results

From the results of my Ruby developer tools survey, it looks like there are three real tiers of tools that a lot of people are using:

The big guys: Test Unit (53%), RSpec (50%), and rcov (47%)
The middle tier: autotest (38%), and ruby-debug (32%)
Everyone else: ruby-prof (17%), heckle (6%), and dcov (2%)

The only surprise in the top tier is that RSpec has already captured as much mindshare as it has. Of the 50% who are using it, I wonder what has seduced them over to the RSpec side? In fact, that’s my next survey, so go ahead and fill it out (before the 17th of December). I do think it’s interesting to see that all three of ‘the big guys’ are testing tools. I think this speaks to the testing culture that runs through the Ruby community.

It’s good to see that autotest also did well (again, speaking to our testing culture). I hope more developers will pick up heckle and flog (I still can’t believe I didn’t include that one in my list) as obvious next steps to improve testing.

For all that the Ruby community says it doesn’t need a debugger, ruby-debug placed ahead of ruby-prof in the rankings. That surprised me a bit. Maybe I should have included the standard lib profiling and debugging tools too, just to get a cleaner picture of where our debugging and profiling usage lies.

I think my biggest disappointment was dcov though. I guess this, too, reflects our community. Poor or incomplete documentation is a frequent knock on Ruby. I would have hoped that a tool designed to help us overcome that would have been more broadly accepted.

Ok, so those are my thoughts. Now, it’s your turn. What tools did I miss? Why do you like the ones you’re using? Are you going to add a new tool to your arsenal, and if so, which one?

Friday, December 07, 2007

Breaking Rubinius News (And An Interview Too)!

Ok, I’ve been sitting on some news for a while and it’s finally out in the open, so I guess I can talk about it.

Engine Yard has just snapped up some seriously big guns in the Ruby world. From my conversations with them, it sounds like they’ll be putting in time on Rubinius as well as other Ruby-enhancing projects. Here’s what Ezra told me about it:

The big plan for rubinius is that we will be hiring a bunch more top guns to get rubinius production worthy. These folks include Wilson Bilkovich, Josh Susser, Ryan Davis and Eric Hodel and Yehuda Katz. I can’t think of any better people to get rubinius to 1.0 and beyond and running fast.

Update: Josh Susser isn't going to work for Engine Yard as noted below. I'll add something in a bit to clear things up. He told me "I should point out that I'm not going to work at EY. I'm at Pivotal now, and happy that things worked out that way, and no hard feelings anywhere either. I'll still be contributing to Rubinius, but not as an EY employee.". Sorry to pass along dated/incorrect info.

Update Two: Evan Phoenix says "In addition to Ryan and Eric, Wilson Bilkovich and Brian Ford will be starting with EY to be paid to work on Rubinius in the January. Again, I’m so amazed and thrilled that EY is providing Rubinius with the funds to let these guys work on a project they love fulltime." Maybe this'll teach me not to use a quote I've been sitting on.

Some of this seems to have been an open secret at RubyConf, but with Eric Hodel’s recent post on the current Rubinius sprint I think it’s time to celebrate.

Speaking of the sprint, I asked the hackers involved if they’d be willing to answer a couple of questions in between coding up new rubinius goodness. This is what they told me:

This sprint looks like a big deal. What do you hope will come from it?

Eric Hodel Pure awesome.

This one has been part organizational, setting up the details of EY employment, and half hacking on things such as the compiler and RubyGems.

Josh Susser I’ve been looking for a way to get more involved in Rubinius, so to me it’s been an opportunity to find an area where I can contribute that makes sense. I love doing VM work, but I’m not much of a C hacker, which is a little problem if you’re going to work on a VM that’s written in C. As it turns out, one of the goals for the project is to build a simulation of the VM in Ruby itself. This will be the seed of the eventual Squeakification¹ of Rubinius, but in the near term it will help with understanding the operation of the C VM and doing experiments and explorations. If you hear the name Popgun somewhere, that’s the VM-in-Ruby simulator.

Brian Ford Lots of face-to-face time. The opportunity to work in person with folks and talk over process issues, brainstorm, pair, and socialize. We started off making a big and unrealistic but fun list of both serious and fun goals. As with most things, process will likely end up being much more important than the actual checklist of results.

Wilson Bilkovich For me it’s an opportunity to take a vacation from my ‘day job’ and get some actual Rubinius work done. I’ve had far too little time for that recently. We have a long list of sprint goals. Hopefully at least a couple of the major ones will get checked off.

What’s the best part of getting together for a sprint like this?

Brian Personalities. Some bits of technology can be inspiring, but I find people far more inspiring than most things.

Wilson Being able to siphon knowledge out of peoples’ heads at a much higher rate.

Eric Pairing. Having instant feedback for problems is unbeatable.

Also, Wilson and I added unit_diff support to mspec to aid testing of parts of the compiler.

Josh Aside from just getting to hang out with a lot of awesome guys, it’s a really different experience getting to talk in person about stuff. The IRC channel is okay, but it is so high-bandwidth that I have trouble keeping up. I get up to grab a drink and when I come back there are 300 new lines of stuff to catch up on. Also, mango lassis are yummy.

Are there going to be more Rubinius sprints? When, where, and what do you hope to accomplish in them?

Brian Certainly there will be more. The hope is to accomplish a ton of Rubinius development in the shortest time possible. A related goal is to reach out to other Ruby developers in a meaningful way so that their pain points using Ruby permeate our consciousness while we work to make Ruby better.

emacs vs vi? dvorak vs. qwerty?

Brian Textmate, emacs, vi, but I spend most time coding in Textmate. Qwerty, but I paired a bit with Nathan Sobo and he set me up with the input switcher, so now I have no excuse not to learn dvorak.

The first Rubinius specific project (other than Rubinius itself) was recently started at RubyForge. How big a milestone is this? What should we be seeing next from the Rubinius community?

Wilson Surreal.

Brian It is interesting from the perspective that it is a project just for Rubinius. But one of the goals of Rubinius is to work wherever MRI works. So, that implies some sense of anonymity (i.e. you should not even notice your program is running on Rubinius vs MRI). However, we do intend to use Rubinius as a platform to extend the state of the art of Ruby (e.g. with great concurrency primitives), so I fully expect to see new RubyForge projects popping up that are Rubinius specific.

¹ Squeak

Wednesday, December 05, 2007

Troubleshooting Ruby Processes

Philippe Hanrigou has written an excellent ‘shortcut’ for Addison-Wesley’s Professional Ruby Series. This is a really different kind of Ruby book, and one that’s long overdue.

Troubleshooting Ruby Processes takes a quick look at some of the Rubyland tools that developers can use to find problems, then dives into three (unix) system tools that aren’t as well known (or used) as they should be—lsof, strace, and gdb. This coverage makes up the bulk of the book.

I’ve spent some significant time with lsof and strace (less with gdb), but there were still things I learned from Philippe’s coverage, like using +r on lsof to cause it to stop refreshing when it no longer finds a specific file. Wow! Who knew? Philippe does a good job of introducing each of these tools, and keeping his discussion in a context useful to Ruby and Rails developers.

One easy to overlook feature of this shortcut is the supporting web page Philippe is maintaining. It is only mentioned as an aside in the conclusion of the book. Hopefully, readers don’t overlook it, because this may prove to be the most useful part of the book as it matures and collects additional tidbits from Philippe and from readers.

You can buy the book direct from Addison-Wesley. I think you’ll find it’s a great addition to your virtual bookshelf.

NetBeans 6.0: Interview with Tor Norbye

I just saw the news that NetBeans 6.0 is out, featuring some great Ruby and Rails support. Tor Norbye and the rest of the NetBeans crew deserve some major kudos for all their work.

Tor was nice enough to take a couple of minutes to answer some questions about Ruby and NetBeans for me. So, here goes—

Why should Ruby/Rails folks who are busily hacking away in emacs/vi/textmate/whatever take the time to download and try out NetBeans?

Tor Well, there are a couple of features that they might find boost their productivity. You may not want to switch to NetBeans, but you can add it to your toolbox and pull it out when you need it.

(1) The first such feature is integrated debugging. After some pretty minimal setup (installing the fast ruby debugger gem) you can easily debug your Ruby and Rails applications, single stepping into and over statements, setting breakpoints, looking at the call stack, looking at local variables, drilling into data structures – even just hovering over variables in the editor to see their current values as tooltips. This doesn’t only work in Ruby files – it works equally well in ERB/RHTML files.

(2) The second feature I’d like to highlight is semantic editing, where by utilizing a parse tree, NetBeans can add some extra functionality. The most obvious is Quick Fixes. Briefly, quickfixes are little semantic checkers that look at the parse tree and look for potential problems. When these are found, it puts a little lightbulb in the margin to alert you to the problem, and the user can then look at the problem and decide what to do about it (most of these semantic checks also offer options to fix the problem automatically).

Thus, these quickfixes are like “lint for Ruby”, and even if you prefer to stay in your current editor, you might want to open your sources in NetBeans occasionally and let it check your code. Examples of bugs it looks for are:

Accidentally assigning to a local variable instead of the intended attribute of this class (a pitfall listed early on in the Pickaxe book—as I was learning Ruby I immediately dogeared this page because I decided I wanted NetBeans to detect this problem and I recently got to un-dogear it :).

Accidentally reusing a local variable as a block variable

Accidentally assigning to a variable in an if-block

Calling a deprecated Rails API (e.g. using @session instead of session, or link_to_image instead of link_to, etc.)

(Many of these hints are not in the base 6.0 download, but they are on the 6.0 Stable Plugin Center which you can access from within the tool. For a full list of the current hints, see the RubyHints wiki page)
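To make the first pitfall in that list concrete, here is a minimal sketch of the local-variable-versus-attribute mistake. The Account class and its method names are made up for illustration; they don't come from NetBeans or the Pickaxe.

```ruby
# Hypothetical class, used only to illustrate the pitfall.
class Account
  attr_accessor :balance

  def initialize
    @balance = 0
  end

  # Pitfall: without an explicit receiver, `balance = ...` creates a new
  # local variable. The balance= writer is never called.
  def deposit_buggy(amount)
    balance = @balance + amount
    balance
  end

  # Intended: `self.balance = ...` calls the attribute writer.
  def deposit(amount)
    self.balance = @balance + amount
  end
end

account = Account.new
account.deposit_buggy(10)
puts account.balance  # still 0
account.deposit(10)
puts account.balance  # now 10
```

The buggy version runs without any error, which is exactly why a lint-style check is handy here.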

(3) There’s some other editing functionality available that you might find useful:

Extract Method and Introduce Variable: This lets you very easily break down long complicated methods into smaller units; again, by using the parse tree NetBeans can ensure that the code fragment is correctly pulled out by passing in all required inputs and passing back out all required outputs.

Rename Refactoring: While not as accurate as in Java, this is still better than relying on search/replace, since NetBeans for example will not mistake a reference to the local variable “foo” as a reference to the unrelated method “foo”

Go To Declaration, which lets you quickly jump around in classes. Useful for exploring new classes.

(4) ERB/RHTML editing. While many editors support Ruby well, the ERB files aren’t equally well supported. Quickfixes, refactoring and reformatting for example, all work in RHTML, so if you’re editing ERB files a lot and want to reformat them etc. you might want to try NetBeans for that.

I realize I’ve gone on way too long, but it’s very hard to stop myself :)

What was the hardest thing to get right?

Tor Type inference, by far. It’s impossible to do it accurately. Luckily, Rails in particular follows a lot of patterns, so NetBeans can use heuristics that work out well in most cases. I tuned these up to the very end of the release, and in the process I’ve learned a lot. But the more I get done, the longer the TODO list grows with new ideas…

What still needs the most work?

Tor Well, I think Type Inference is certainly the area where there is the most potential work available—I could probably spend ten years improving it. However, I feel okay with the current level of inference; I think Ruby users “get” the amount of inference we’re doing and when they can and can’t rely on it. I have some targeted things I’d like to achieve in the next release but I don’t think they’ll be dramatic.

I think our weakest area right now is the terminal handling (for running IRB and Rails consoles etc). Many Rails tools like to emit terminal escape codes to colorize the output; RSpec for example will use colors to highlight its summaries, ActiveRecord colorizes SQL logging output, etc. Our output window doesn’t support terminal escapes, so we simply strip the codes out for now, and furthermore, we’ll need full proper terminal emulation as well as pseudo terminal support to properly handle “readline” interaction which is required for running IRB. Similarly, tying IRB sessions (or even running scripts like the Rails generator) into the IDE with consistent color highlighting, completion etc. would be highly desirable.
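For anyone wondering what "stripping the codes out" might look like, here is a rough Ruby sketch. It is not NetBeans code (NetBeans itself is written in Java); the constant and method names are my own, and the pattern only handles the color/style sequences, not cursor movement or other terminal escapes.

```ruby
# Matches ANSI color/style (SGR) sequences such as "\e[32m" and "\e[0m".
# Assumption: other escape sequences are out of scope for this sketch.
ANSI_COLOR = /\e\[[0-9;]*m/

# Returns a copy of +text+ with the color sequences removed.
def strip_ansi_colors(text)
  text.gsub(ANSI_COLOR, "")
end

puts strip_ansi_colors("\e[32m3 examples, 0 failures\e[0m")
```

Stripping is the easy part. Actually rendering the colors, and handling readline-style interaction, is where the real terminal emulation work Tor mentions comes in.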

If you could wave a magic wand and add one Ruby/Rails related feature to NetBeans, what would it be?

Tor It would probably be the above output window improvements. I’ve looked at terminal emulator code before, and I’m not looking forward to it…

What non-IDE tools do you think Rubyists should be using more?

Tor Well, one area I find a bit lacking in Ruby is the documentation area. While most classes are well documented at the top of the file, individual methods and attributes are often, let’s say “sparingly”, documented. This might be because users are intended to look at the source code for any libraries they are using and would naturally read the top of the file to learn about the library. This doesn’t work so well for people exploring the API via something like code completion (or even “ri”). It would also be nice if the libraries would be clearer about what is “exported API” and what are “implementation” methods not intended for public consumption. One of the things I added late in the 6.0 cycle is making code completion filter out any methods marked “# :nodoc:”, and that helped a bit, but I’ve seen a lot of code that I believe are implementation artifacts and shouldn’t really be exposed from the class, so would be prime candidates for #:nodoc. So, in short, I’d love to see the documentation improved a bit. This is not exactly a tool per se, or perhaps I’m really advocating the use of “rdoc” more to view the docs for the code you’re writing. (I’m sorry for bringing things back to an IDE at this point; I can’t help it since I’ve worked on IDEs since 1996. One feature I added to help with this in NetBeans is that while you’re writing a comment, you can just type Ctrl-Space, and the IDE will pop up an HTML view of the comment you’re writing (processing all the RDoc conventions).)
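As a small illustration of the kind of method-level documentation Tor is asking for, here is a sketch using RDoc's :nodoc: directive. The Tokenizer class is invented for this example.

```ruby
# Splits text into tokens. (Hypothetical class, for illustration only.)
class Tokenizer
  # Returns an Array of the whitespace-separated words in +text+.
  # This is the method callers are meant to use.
  def tokens(text)
    split_words(text)
  end

  # Implementation detail. The :nodoc: directive asks RDoc to leave this
  # method out of the generated documentation.
  def split_words(text) # :nodoc:
    text.split
  end
end

p Tokenizer.new.tokens("hello brave new world")
```

Note that :nodoc: only affects the generated docs. If the method really shouldn't be called from outside, marking it private as well is worth considering.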

Anyway, it’s been really fun to join the Ruby community; there are a lot of enthusiastic and talented programmers here!

flog: Profiling Complexity

One tool that I should have included in my survey, but forgot, is flog, yet another great tool from Ryan Davis and Eric Hodel. flog is like a profiler for your code’s complexity instead of its performance¹.

Why worry about complexity? Well, there are three good reasons I can think of:

If you’re dealing with legacy code, knowing where the real complexity is will help you prioritize your code reading as you try to figure out the code base
In my experience, the complex little knots of code are where bugs are most likely to lie, so flog can tell you where to focus your testing
Finally, those complex sections of code also become great candidates for refactoring—it’s always easier to debug, optimize, or add features to code that’s easier to understand.

flog is a gem so it’s easy to install, and once installed it’s easy to run. To run it against my LogWatchR tool, I just need to drop into the logwatchr/lib directory and do:

flog logwatchr.rb > flog.report

(Since this generates a pretty lengthy report, I redirected it out to a file.) Here’s the trimmed output from running this:


  Total score = 211.720690020501
   
  WatchR#analyze_entry: (34.2)
     9.8: assignment
     7.0: branch
     4.5: mark_host_last_seen
     3.2: pattern
     2.8: []
     2.8: is_event?
     2.0: alert_type
     2.0: alert_target
     1.8: alert_msg
     1.8: notify
     1.6: event_notify?
     1.3: notify_log
     1.3: join
     1.3: split
     1.3: each
     1.3: now
     1.3: each_value
     1.3: record_host_if_unknown
     0.4: lit_fixnum
  WatchR#event_threshold_reached?: (31.6)
    21.3: []
     2.6: branch
     1.8: tv_sec
     1.6: -
     1.5: length
     1.4: >
     1.4: assignment
     1.3: >=
     1.3: mark_alert_last_seen
     1.3: delete_if
.
.
.
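If you only want the per-method totals out of a report like this, a few lines of Ruby will do. This is a sketch of my own, not part of flog, and it assumes the report keeps the "Class#method: (score)" layout shown above.

```ruby
# Hypothetical helper, not part of flog. Returns [method_name, score] pairs
# for every "Class#method: (score)" line in +report+.
def method_scores(report)
  report.scan(/^\s*(\S+#\S+): \(([\d.]+)\)/).map do |name, score|
    [name, score.to_f]
  end
end

# Sample text copied from the trimmed report above.
sample = <<REPORT
  WatchR#analyze_entry: (34.2)
     9.8: assignment
     7.0: branch
  WatchR#event_threshold_reached?: (31.6)
    21.3: []
REPORT

method_scores(sample).each do |name, score|
  puts "#{name}: #{score}"
end
```

The detail lines (assignment, branch, [] and so on) are skipped because they have no "#" in the name.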

I’m skipping the report on WatchR#analyze_entry because, while its total score is higher than WatchR#event_threshold_reached?, it accumulates points a lot more evenly. The code for WatchR#event_threshold_reached? looks like this:


  def event_threshold_reached?(host, event_type, time)
    @hosts[host][event_type][:alert_last_seen].delete_if { |event_time|
      time.tv_sec - event_time > 
      @hosts[host][event_type][:alert_last_seen_secs]
    } 
    mark_alert_last_seen(event_type, host, time)

    if @hosts[host][event_type][:alert_last_seen].length >=
        @hosts[host][event_type][:alert_last_seen_num]
      true
    else
      false
    end
  end

The report shows a lot of complexity surrounding the hash key lookups. This corresponds to a change I keep meaning to make, but haven’t gotten around to. I think the whole nested hash structure is ugly and hard to maintain, so I’ve been planning on replacing it with a better object structure. It looks like flog agrees with me.

WatchR#event_dependencies_met? (not shown above) also reports a higher level of complexity based on hash traversal, so finally sitting down to make the change from a nested hash would be a win here too.
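Here is one rough idea of what a "better object structure" could look like. It is only a sketch: the EventWindow class and its names are hypothetical, and I'm assuming that mark_alert_last_seen simply appends the event's time in seconds to the list.

```ruby
# Hypothetical stand-in for one @hosts[host][event_type] hash entry.
class EventWindow
  def initialize(window_secs, threshold)
    @window_secs = window_secs  # was :alert_last_seen_secs
    @threshold   = threshold    # was :alert_last_seen_num
    @seen        = []           # was :alert_last_seen
  end

  # Drops events older than the window, records the event at +time+
  # (a Time), and reports whether enough events have been seen.
  def threshold_reached?(time)
    now = time.tv_sec
    @seen.delete_if { |event_time| now - event_time > @window_secs }
    @seen << now  # assumption: this is what mark_alert_last_seen does
    @seen.length >= @threshold
  end
end

window = EventWindow.new(60, 2)
puts window.threshold_reached?(Time.at(1000))  # first event in the window
puts window.threshold_reached?(Time.at(1010))  # second event within 60 seconds
```

With something like this, the repeated @hosts[host][event_type] lookups collapse into a single lookup that returns the object, which should also take a bite out of that [] score.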

¹ If you were looking for an article on profiling, you might also want to look at these:
