Friday, February 27, 2009

Tinyrb Interview

After reading about tinyrb I wanted to ask its developer, Marc-André Cournoyer (@macournoyer), a few questions about Ruby and what he's doing with it. Here's what we talked about.


Poking at your blog, it looks like you've done a lot of work building ruby implementations, or parts of them. Why? What's the value in this kind of hacking?

Marc-André VM implementation has been one of my interests for about a year, right after releasing Thin I think. I bumped into tinypy and thought it was the coolest idea ever. I tried porting it to Ruby with tinyrb and my first attempt (last summer) was a total disaster. But I learned a lot about Ruby's internals and YARV bytecode. It's very enlightening to understand how things work inside, that your if is compiled to a branchunless instruction. It's not just magic anymore.
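If you want to see that branchunless for yourself, here is a minimal sketch. It assumes you are running MRI with YARV (Ruby 1.9 or later), since RubyVM::InstructionSequence is specific to that implementation; the disassemble helper and the sample source are mine, not something from tinyrb. The exact listing varies between Ruby versions, so treat the output as something to read rather than something to rely on.

```ruby
# Assumes MRI/YARV; RubyVM is not available on every Ruby implementation.
# `disassemble` is a hypothetical helper name used only for this sketch.
def disassemble(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# The condition depends on a runtime value so the compiler cannot fold it away.
source = <<~RUBY
  x = rand(2)
  if x == 0
    :zero
  else
    :other
  end
RUBY

# Look for the branchunless line that jumps to the else branch.
puts disassemble(source)
```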

Speaking of magic, when implementing a language there's this kind of magical moment when you run your first chunk of code. You've created all the parts but seeing them work together is a nice feeling. You're hooked once you've been through that and know you'll be spending a lot of time on it. It's the perfect mix of art and science. You're creating a way to express yourself but at the same time reading all those crazy research papers.

I wish more people would stop redoing all those 20-line Rails plugins and find the courage to learn something new again. Software is an amazing domain to be in. There's no limit to what you can do and the only thing you have to invest to push your limits is time. 2 years ago I was doing ASP.NET and didn't know the difference between a GET and POST request. A year ago I didn't know very much about C. I'm sad when I hear someone say they have no side project or that they just do Rails stuff. That's like eating the same meal every day.

What have you learned about Ruby from working with/looking at implementations of other languages?

Marc-André I've learned that Ruby is not that dynamic. There are much more powerful languages out there. Io for example, allows full introspection of the message chain. Meaning you can, amongst other things, control the evaluation of method arguments. Also it's prototype based like JavaScript and unlike Ruby, which is class based like Java. It's another way to structure your code. Learning new languages and programming paradigms helped me think outside of the box when I go back to Ruby.

Ruby is probably the best combination of simplicity, speed and power. But, if you think Ruby is the most powerful and extensible language, like I did, it's time for you to look at other languages.

Other than Ruby and C, what languages are you using or investigating, and why?

Marc-André I fell in love with Io's simplicity and power. It's an amazing language to play with. I don't know if it's usable for larger projects, but we sure can learn a lot from it. All language constructs are implemented in Io itself. You can add operators at runtime. And there's no parser, just a lexer that creates a chain of messages. Also, I had trouble with using space instead of dot for message separator at first, but after spending more time with it, I find it makes the code lighter to the eye. But because of the way it lets you evaluate arguments lazily, it's impossible to compile it to bytecode, which makes it bloody slow.

Lua is becoming famous for its small and fast VM in the language community and it already is in the gaming industry. They did a couple things differently, like using a register based VM. The code is relatively simple and well structured. There are a couple great papers about Lua on the Internet which makes it a great starting point if you want to study a VM.

Potion was one of the main inspirations for tinyrb's internal design. Although I'm not sure about the language syntax, I like the way it is implemented. In fact, I started coding on my own programming language, called Min, about the same time _why started Potion and we shared some of the same concepts, like using an open extensible object model, from a paper (pdf) by Ian Piumarta. A couple parts of tinyrb are directly derived (stolen) from Potion. So I owe _why a big thanks for this.

How serious is your TinyRb project? How far do you plan on taking it?

Marc-André If by serious you mean stable, then yes I hope to bring it to a stable state someday and make it usable for some limited "real world" usage. But I had the feeling all the Ruby implementations right now are a bit too serious. Each is supported by a company. When there's money involved, there's less freedom. I have no problem saying that tinyrb is the least serious Ruby implementation of them all. I want it to be the Ruby VM people use to learn and play with so they can write better code and maybe contribute to other implementations.

As for my specific goals with tinyrb, I'd like to keep a low memory footprint and be fast and complete enough to run small web/desktop apps. You know, when you don't need the full thing. By being small, it would enable you to use Ruby for small daemons that need to use as little memory as possible on servers and this kind of stuff. Also, I'd like to add features such as sandboxing so you could run tinyrb inside another VM when you need to eval unsafe code.

How much do you interact with the other Ruby implementation projects?

Marc-André I've been a passive follower of Rubinius for a while. Evan and Brian noticed tinyrb and answered my questions about VM design. I hope someday to contribute something back as tinyrb is helping me understand more of Rubinius internals.

I have lots of admiration for all the people working full-time on a Ruby implementation. It requires constant learning about very complex things but at the same time answering questions from people with various knowledge levels. It's very demanding, I'm sure.

What kinds of contributions would be most welcome for tinyrb?

Marc-André Anyone that wants to help can take a look at the TODO file in the project. The main goal right now is to run RubySpecs. So anything that can push it in that direction would be awesome. Here's a short list of things I need help with: write a better grammar (which I totally suck at), implement more core libs (IO, Dir), help find out what is missing to run RubySpecs or simply compile and run tests on your machine and report any error or warning.

But if you want to hack on something else, that's cool too. As long as it doesn't involve a kilt and 2 potatoes I'm OK with it.

Since you've managed to get your hands pretty dirty dealing with Ruby, what do you wish the language did differently?

Marc-André I wish Ruby had a simpler parser and trimmed down the syntax to remove useless stuff. I'm not sure what makes it that complex but having implemented my own in tinyrb I'm sure there's a way to keep what's useful and simplify it.

I wish they'd remove everything that does not comply with the principle of least surprise. For example, the difference between proc and Proc.new, namespace lookup weirdness. Maybe if the MRI team would use RubySpec a bit more we'd see less of this. I'm just guessing, I'm not following 1.9 development that much. But it looks like they are breaking a couple RubySpecs on each release.
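For readers who haven't hit the proc surprise: as I understand it, in Ruby 1.8 proc behaved like lambda while Proc.new did not, and from 1.9 onward proc behaves like Proc.new. The underlying split is between lambda-style and plain procs, which still exists today. The sketch below shows that split on a current Ruby; the helper method names are made up for illustration.

```ruby
# Hypothetical helper: true if calling with the wrong number of
# arguments raises, which is how lambdas behave.
def arity_strict?(callable)
  callable.call(1, 2, 3) # every callable passed in here declares two parameters
  false
rescue ArgumentError
  true
end

# `return` inside a lambda only leaves the lambda itself.
def return_from_lambda
  lambda { return :from_lambda }.call
  :after_lambda
end

# `return` inside a plain proc leaves the enclosing method.
def return_from_proc
  Proc.new { return :from_proc }.call
  :after_proc
end

puts arity_strict?(lambda { |a, b| [a, b] })
puts arity_strict?(Proc.new { |a, b| [a, b] })
puts return_from_lambda.inspect
puts return_from_proc.inspect
```

Two objects of the same class that disagree about both argument checking and return are a fair example of what least surprise is up against.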

I wish MRI/YARV code had consistent indentation.

I wish there was a way to implement macros in Ruby, much like in Io with lazy argument evaluation. This way we could implement all language constructs in Ruby (if, while, def, etc.) but that would kill the speed we've all been waiting for.
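To make the lazy-argument idea concrete, here is a rough Ruby approximation. Ruby evaluates method arguments before the call, so the laziness has to be spelled out by the caller with lambdas, whereas in Io (as described above) the receiver decides when an argument is evaluated. The my_if and demo names are hypothetical, and this is only a sketch of the idea, not how any Ruby implementation handles if.

```ruby
# Hypothetical `if` built from an ordinary method call.
# Both branches arrive as lambdas, so only the chosen one ever runs.
def my_if(condition, then_branch, else_branch)
  condition ? then_branch.call : else_branch.call
end

# Records which branch actually ran, to show the other stayed unevaluated.
def demo(n)
  log = []
  result = my_if(n > 0,
                 -> { log << :then; :positive },
                 -> { log << :else; :non_positive })
  [result, log]
end

p demo(1)
p demo(-1)
```

Every branch here costs a closure allocation and a call, which hints at why building all the core constructs this way would hurt speed.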

I wish Matz would stop wars, hunger, poverty and bring peace around the world, but I guess he's too busy saving us Rubyists first.


Thursday, February 26, 2009

blog buttons for MWRC

Are you planning on coming to MWRC 2009? If so, it's time to grab a spiffy new button to liven up your blog and let everyone know what good taste you have. You can get them here.

'Course, you'll be getting an attendee badge, not an organizer badge ... but hey, you could always step in and help with 2010.

Tuesday, February 24, 2009

MWRC 2009 Mini-Interview: Jim Weirich

In my latest MWRC mini-interview, I talked with Jim Weirich (@jimweirich), who is a second-time presenter. MountainWest RubyConf is well known for the quality of its presenters and attendees; you owe it to yourself to be a part of it this year. At $100 for two days, it's a great value. What are you waiting for? Go register.


There's no description of your talk 'The Building Blocks of Modularity' on the website. Other than oatmeal, what can we expect from it?

Jim I've been thinking about principles of software design lately. You know, the "rules" one programs by. We all have them, whether it is something simple like the "DRY Principle", or the "Keep Methods Short" rules, or more elaborate rules like the set of SOLID principles from Bob Martin.

So, what makes a good set of principles? It seems to me that quite a number of these principles deal with the issues of program complexity and keeping software maintainable and under control while continually changing it. Is there something in the fundamental nature of software that causes us to gravitate toward these common principles?

Meilir Page-Jones likes to talk about software in terms of connascence. Connascence is simply the idea that different parts of a software program must change together in order for the entire program to work correctly. For example, to change the name of a function, I need to change the name everywhere the function is used. Every location that references the name of that function is related to all the other locations through "Connascence of Name". By understanding the different kinds of connascence, we can begin to understand the principles we use to build more modular and maintainable programs.
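A tiny illustration of Connascence of Name, using made-up method names: the definition of order_total and its two call sites are bound together by nothing more than the name, so renaming the method means touching all three places at once.

```ruby
# Hypothetical function; its name is what ties the call sites below to it.
def order_total(prices)
  prices.sum
end

# Call site 1: breaks if order_total is renamed and this line is not updated.
def summary_line(prices)
  "Total: #{order_total(prices)}"
end

# Call site 2: connascent with call site 1 through the same name.
def free_shipping?(prices)
  order_total(prices) >= 50
end

puts summary_line([10, 20])
puts free_shipping?([30, 20])
```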

Your 'Shaving with Occam' was a favorite at MWRC 2008. Which talks from last year stood out to you?

Jim The talk that stands out for me was the Shoulda talk by Tammer Saleh (@tsaleh). That was the first time I had heard of Shoulda and since then I've become a big fan of that library.

What are you most looking forward to at MWRC 2009?

Jim I have to choose? Playstation and Wii talks! Testing talks on Cucumber and GUIs! I think more than the talks I'm looking forward to just meeting great people and talking about Ruby.

What do you think 2009 holds in store for Ruby, the language and the community?

Jim It's always hard to predict the future. With Ruby 1.9.1 out now, I'm hoping that more and more gems and libraries will be upgraded to use it. There are some really exciting features in 1.9 and getting the community on board with 1.9 will be a critical step in bringing Ruby into the future.

If you could attend a Regional Foo Conf for some other language, what language would it be?

Jim I would love to attend a conference on Clojure. I've been following the language a while, but haven't really had the time to actually program in it. With massively multi-core systems becoming commonplace, concurrency is only going to become more and more important. Languages like Erlang and Clojure that attempt to address the concurrency problem head on are going to be big players in that arena.


Tuesday, February 17, 2009

Raganwald, Pareto, and Infrastructure Operations

Raganwald wrote:

My conjecture:

  1. 20% of the features are responsible for 80% of the headaches of software development, and;
  2. 20% of the features are responsible for 80% of the value of the software to its users.
My question:
Are those the same 20% of the features on your project? If not, why not?

What about infrastructure:

  • 20% of the app causes 80% of the operation headaches.
  • 20% bears 80% of the user load.

How much do these overlap?

What are the outliers and how do you deal with them?

If you support more than one application, do these suppositions still apply? If so, how do you deal with them across application developer groups?

Monday, February 16, 2009

Matt Bauer Interview

A little over a year ago, I picked up a copy of Visualizing Data, and wrote a review of it. About a month ago, I discovered that Matt Bauer (@mattbauer) was writing Data Processing and Visualization with Ruby. The book's out as a rough cut, but not yet on bookshelves. It looks plenty interesting though, so I asked Matt to join me for a quick interview.

Update: Just in case you want a direct link to the rough cut, it's here.


Data visualization seems to be an increasingly popular topic. What are some interesting ways in which you're seeing it used?

Matt I think some of the interactive animations of data are really impressive. It's a great way to show a lot of data and their interactions. The recent code swarm videos are an example of this. I've also seen animations that loop and allow for various data dimensions to be added and removed to see whether each has an effect on an aggregate. It's a great way to quickly identify what data dimension is responsible for some observed change. Take, for example, an international shipping company that's having 10% of its shipments from Hong Kong arrive late to San Diego. The company likely has a lot of data such as origin, vessel, crew, inspections, route, weather, destination ports, times, contents, maintenance records, and a ton more data dimensions. Using a looping animation that shows the path of packages on a global map over time, various data dimensions or groups of data dimensions could be added to see if the path color (representing delay time) changes at all. You could even do a Minard style map too. The ability to interact with the data is a much faster way to understand the data than looking at a number of individual static graphs.

Converting data into audio is also an interesting way to represent large amounts of data when looking for abnormalities. The idea is rather simple actually. Each data dimension is a separate track or instrument with the overall beat being determined by one dimension. For example, requests per second could determine the beat, drums the database activity and a hi-hat the memcached cache misses. It can take some time to create a pleasant enough orchestration but once the right instruments are assigned to the data dimensions, it makes it incredibly easy to hear problems. It's much like a mechanic listening to an engine and knowing if it's working properly or not. Again, this works best for doing a quick check of a system such as when a user calls, since listening to it all the time is more likely to cause a headache than avoid one.

How does Tufte fit in to all of this?

Matt Tufte really comes into play for that second set of graphs. If his ideas and principles are followed, you should have a successful graph, illustration, table, report, etc. That's not to say only use the graphs, illustrations, tables, and reports he uses. It's to say come up with your own graphs, illustrations, tables, and reports that work with the data you have. Just make sure you stay true to his ideas and principles.

Can you give us a quick walk through of your approach to finding the right visualization for a dataset?

Matt My approach is twofold as there are two graphs (graph sets) to most data. The first set of graphs is to figure out what the hell the data is. It could have a logarithmic distribution, maybe exponential. Maybe four of the variables are dependent but the other two aren't. The point is, you need a number of graphs to figure it out. I often start with a simple scatter plot and go from there. This isn't so bad with software like Tableau or other graphing programs. Once I know what I'm looking at, then I move to the second set of graphs. The purpose of the second set of graphs is to sell the next person on what you see in the data as quickly as possible. It's the second set of graphs that takes the most time.

Can you talk to me a bit about the commonalities and differences between data mining, collective intelligence, and the kind of data processing you're writing about?

Matt Collective intelligence is made up of multiple components: cognition, cooperation and coordination. Of the three parts, data mining can be used to provide cognition. That is, data mining, or determining patterns from data, can be used to predict future events, which is a necessary part of collective intelligence. What I'm writing about is dimensional data modeling, which is the technique used to enable data warehousing and data mining. When I talk to less technical people I tell them I'm writing a book on how to use all the data they collect to make business decisions which will result in increased profits. The book starts with a couple chapters about dimensional data modeling theory. It then shows how to implement the theories in an RDBMS and use ActiveRecord to query it. ActiveRecord works but it's not the best pattern to use. As a result I next talk about using Coal, a dimensional data modeling framework I've developed and used on a number of projects. I'm in the process of extracting it and open sourcing it; soon I hope. I also talk about extracting, transforming (cleaning up/normalizing) and loading data to and from various systems. The book ends with discussions on visualization techniques ranging from sparklines to mpeg videos.

How does dimensional data modeling fit into non-relational DBs (e.g., CouchDB or BerkeleyDB, which you mentioned earlier)?

Matt The most popular non-general purpose RDBMS systems out there are probably the OLAP systems from companies like Microsoft, Oracle and IBM. I'm not positive but I think oftentimes they're a general purpose RDBMS with additional code for doing cubes and aggregations quickly. CouchDB and BerkeleyDB, as you mention, aren't RDBMS systems. BerkeleyDB is a really excellent, fast, highly concurrent Btree and HashTable for the most part. That's not to belittle it; just the best way to explain it. It's a great place to start if you want to build a database system yourself. In fact, MySQL in the beginning used it as its backend. You could use BerkeleyDB as a store for dimensional data. One thing to remember though is BerkeleyDB doesn't have a query language. So unless you have a fixed set of queries, you'll likely have to write code to break down your query language into gets and puts for BerkeleyDB to work. CouchDB too could work as a dimensional data store. I don't think I would though. It's the same reason I don't like most DBs out there for a dimensional data store: they store the data inefficiently for the task at hand. Most RDBMS are row stores, meaning they store all the attributes (columns) of a row together. This works great for transactional systems where most calls are like User.find(1) and you need to operate on the entire state of the User model. It's not great when you're just concerned with the age attribute for all rows. The real solution is to use a column store like MonetDB or Vertica. I personally would like to build a better open source one but am having problems finding time. With a column store, each column in a row is stored separately on disk. This makes a query on a column for all rows very fast. It also allows for great compression and encoding. Column stores have shown 100x-1000x improvements compared to row stores.
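The row store versus column store distinction can be sketched with plain Ruby data structures. This is only an in-memory picture of the two layouts with made-up sample records and method names; it says nothing about how MonetDB or Vertica lay data out on disk, and it won't reproduce the speedups mentioned above.

```ruby
# Row-oriented layout: each record keeps all of its attributes together.
# The sample data is invented for illustration.
def sample_rows
  [
    { id: 1, name: "Ann", age: 31 },
    { id: 2, name: "Bob", age: 45 },
    { id: 3, name: "Cy",  age: 27 }
  ]
end

# Column-oriented layout: one array per attribute, in row order.
def to_column_store(rows)
  rows.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |row, columns|
    row.each { |attribute, value| columns[attribute] << value }
  end
end

# Row layout: every whole record is visited just to read one attribute.
def average_age_from_rows(rows)
  rows.sum { |row| row[:age] } / rows.size.to_f
end

# Column layout: only the age column is touched.
def average_age_from_columns(columns)
  ages = columns[:age]
  ages.sum / ages.size.to_f
end

columns = to_column_store(sample_rows)
p columns[:age]
p average_age_from_rows(sample_rows) == average_age_from_columns(columns)
```

A column of similar values sitting side by side is also what makes the compression and encoding tricks mentioned above possible.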

Ruby isn't the fastest language around, what makes it the right language for data processing and visualization?

Matt Ruby doesn't have the fastest execution time but I'd argue no language is going to have a fast enough execution time. The truth is when processing large datasets you often run into physical limitations. For example, a 100GB dataset on a Fibre Channel drive theoretically takes about 2 minutes to read. So even before you add code, you're looking at a minimum of 2 minutes. A faster language cannot change that. So in order to speed things up you have to look for better algorithms such as optimized b-trees, encoding, compression, indexes, projections, etc.
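The two-minute figure is easy to sanity check. The interview doesn't say what throughput Matt had in mind, so the sketch below assumes roughly 800 MB/s of sustained sequential read (about what a nominal 8 Gb/s Fibre Channel link offers) and decimal units. Both are my assumptions, and the method name is hypothetical.

```ruby
# Lower bound on read time: dataset size divided by sustained throughput.
# Uses decimal units (1 GB = 1000 MB); throughput is supplied by the caller.
def read_time_seconds(dataset_gb, throughput_mb_per_s)
  dataset_mb = dataset_gb * 1000.0
  dataset_mb / throughput_mb_per_s
end

# 800 MB/s is an assumed figure, not one given in the interview.
seconds = read_time_seconds(100, 800)
puts format("%.0f seconds, about %.1f minutes", seconds, seconds / 60)
```

Under those assumptions the read alone takes about 125 seconds, which matches the rough two minutes quoted, whatever language does the processing afterwards.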

So if execution time isn't important, why Ruby then? Why not use Java, C or Erlang? I think there are two main reasons. The first is Ruby's ability to easily access and transform data and Ruby's ability to integrate with almost anything. The success of a data processing project often rests on the quality and quantity of data to process. Ruby, with its scripting ability and large number of gems, makes it easy to create programs to fetch data from a variety of databases, web services, web sites (scraping), ftp sites, etc. Of course data from multiple sources often have different names for the same thing and this is also where Ruby shines. Ruby's regular expressions, blocks and dynamic typing make data transformation much easier than in other languages.
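Here is a small sketch of the "different names for the same thing" problem, using a regular expression and a block. The alias table, field names, and method names are all invented for illustration; a real project would build its own mapping from its own sources.

```ruby
# Hypothetical cleanup: lowercase the key, collapse punctuation and spaces
# into underscores, then map known aliases onto one canonical name.
def canonical_key(raw_key)
  key = raw_key.to_s.strip.downcase.gsub(/[^a-z0-9]+/, "_").gsub(/\A_+|_+\z/, "")
  aliases = {
    "zip"      => "postal_code",
    "zipcode"  => "postal_code",
    "zip_code" => "postal_code",
    "e_mail"   => "email"
  }
  aliases.fetch(key, key)
end

# Rebuilds a record with canonical keys and trimmed string values.
def normalize_record(record)
  record.each_with_object({}) do |(key, value), normalized|
    normalized[canonical_key(key)] = value.is_a?(String) ? value.strip : value
  end
end

p normalize_record("Zip Code" => " 84101 ", "E-Mail" => "ann@example.com", :Name => "Ann")
```

Dynamic typing is doing quiet work here too: keys can be strings or symbols and values can be anything, without declaring a schema up front.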

The second main reason to use Ruby is to interact with the diverse number of components needed to query and report on data sets. This includes everything from data stores like BerkeleyDB, PostgreSQL, and Vertica, to various visualization libraries like Processing, Graphviz, ImageMagick, and FFMpeg. In short, you can use the best tool for each job and control them all from just one language, Ruby.

What makes you the right person to write about it?

Matt I've been intrigued with data ever since college. My degree is actually in biochemistry but I worked at the Space Science Engineering Center providing support to scientists as they studied weather data from satellites, sea buoys, inframeters, Antarctica ice cores and other remote sensing equipment. Most of the work was done in C or Fortran and the data was typically structured as a number of matrices, some of which were absolutely huge. It wasn't uncommon for a program to take four days to run using the latest SGI Origin hardware available. After college I worked for a number of places that dealt with very large datasets including the United States Postal Service, later at the Federal Reserve and now as a consultant. During my years I've had to build everything from databases to visualization systems. I've also spent much time working with the end users of such systems to understand how they interact with data. This includes everything from typical 2D graphs to complex animations to completely immersive cave systems. It's quite easy to have two people interpret the exact same complex data visualization completely differently. I know what makes a successful project and maybe more importantly I also know what guarantees complete failure.


Friday, February 13, 2009

Stackful Rubinius Interview

Yesterday, Brian Ford (@brixen) unveiled some sharp new work that Evan Phoenix (@evanphx) has been doing on the Rubinius internals. (You can read about it here and here.) It's been a while since I've run an interview with members of the Rubinius team, and this seemed like the kind of news that people might like to follow up on, so I decided to shoot off an email to Brian and Evan. Here are the answers I got.


An almost 2x speed-up is great work (especially with more optimizations to come). Beyond the Blue Book, where everything started, what are your major influences these days?

Brian I've been trying to spend a lot of time in Virtual Machines by Smith/Nair, Garbage Collection by Jones, Advanced Compiler Design and Implementation by Muchnick, Types and Programming Languages by Pierce, and reading up on JVM papers and other papers that Evan or I come across.

Evan I've got the same "Virtual Machines" book, but I mainly read papers I find, talk with people who've solved similar problems (Slava Pestov, Charles Nutter, etc), and read a lot of other people's code.

Getting my head around a problem and how others have solved it is the best way I come up with a solution of my own. Understanding the trade offs of other solutions, their constraints, etc, is key to figuring out how to adapt things.

Lots of people have talked about Rubinius as a possible future host VM for Ruby proper. With Ezra's offer (on behalf of EY) to take on Ruby 1.8.6 maintenance, what are the odds that Rubinius will be a part of the plan?

Evan I don't see any problems with this. EY is clearly looking to be a stable hand on the rudder of 1.8.6, providing bug fixes and such. Their idea, I'm assuming, is to provide the best 1.8.6 they can under the circumstances. Rubinius is in an entirely different boat, because it's working under a completely different set of goals and constraints.

They're both wonderful consumers of the RubySpecs.

Brian I just heard from you about Ezra's offer this morning. Given the Ruby expertise at Engine Yard and their commitment to providing an outstanding experience of Ruby for their customers, it sounds like a great idea. Rubinius has been trying to help out the larger Ruby community since the beginning through the RubySpec effort. Leveraging the RubySpecs as a common thread among the alternative implementations, I expect Rubinius to continue to be a big support for usage of MRI 1.8.6. At the same time, Rubinius and MRI 1.8.6 will always be distinct beasts. The architecture does not permit sharing code (other than Ruby standard library files).

One of the things that attracted a lot of attention for Rubinius was the early pronouncement that Rubinius would be (mostly) Ruby in Ruby. How far along that path is the project? What are the next obstacles you hope to clear?

Evan There is always going to be a barrier between the Ruby layer and whatever is below it. For quite some time, there has been talk of having some kind of translation layer, that could translate a subset of Ruby into a lower language, to run normal Ruby on top of. We've got no problem with this as a goal, but we simply haven't put any work into it lately.

We have to be pragmatic about how to apply our architecture decisions, so it's a balancing act.

Brian Rubinius has been a mixed architecture since the beginning, first with a C-language VM and now with one written in C++. However, our compiler and vast portions of the core library are written in Ruby. We pushed for using FFI so that even more code could be written in Ruby.

As we make the execution of Ruby code even faster, we can potentially replace some of our primitive operations with Ruby code. The VM has a major division between the primitives for data types like Array and the core VM structures. The primitives for data types are an area where an easily compiled dialect of Ruby could be used (Evan worked on this with Cuby, later renamed Garnet). At the same time, better techniques for runtime-type handling may make a dialect unnecessary. It is an open problem that we are looking at.

For the moment, we're focusing on performance and compatibility toward running Merb/Rails and the other micro web frameworks. In doing so, we write a lot of Ruby code, which is a great thing. It's also a great place for contributors to dive into the project. Hint, hint.

There has been some talk of hosting other languages on the Rubinius VM. To date, I don't think that's really come to fruition. Is it realistic? Do you think it will happen?

Brian Sure, it will be done eventually. The other day, the author of Heist popped into the IRC channel with a problem. (Heist is an attempt to write a Scheme in a little Ruby and a lot of Scheme.) Turned out he uncovered a bug in instance_eval. Evan fixed it and Heist is running on Rubinius.

I'm also looking at JamVM and the possibility of running Java code directly on Rubinius. At the core, Rubinius is a pretty slick stack-based VM. You can do a whole lot with that under the hood.


Tuesday, February 10, 2009

MWRC speaker interview: Ben Mabey

Our next MWRC mini-interview is with Ben Mabey (@bmabey or on github). Ben is well known in the Ruby testing community. He's made a number of presentations at URUG meetings, and is going to be doing a presentation on "Outside-In Development with Cucumber" at this year's MountainWest RubyConf. Come see Ben and the rest of our presenters, but register soon so we don't fill up without you.


What experiences have sold you on testing in general?

Ben When I started doing Rails in 2005 I tried my best to TDD my models and write Rails functional tests. Having tests greatly lowered my stress level when I was deploying applications. Unfortunately my earlier test suites fell into disrepair as I became the victim of many common TDD pitfalls. It wasn’t until I was more experienced and had read a lot about the subject that it started to dawn on me that TDD was really about design. Once I realized this, there were big benefits. On my most recent project we had a really cool experience when we were asked to add new functionality to the system. What seemed like a difficult and time consuming task turned out to be quite easy and we accomplished it by writing a 50-line class that delegated to other (already existing) objects. Testing our objects in isolation and refactoring constantly yielded a very flexible and reusable design. It was neat to see how our design emerged out of the red-green-refactor cycle.

What experiences have sold you on BDD?

Ben My best experiences with BDD started when I began working outside-in with the RSpec Story Runner back in October of 2007. (Cucumber is a rewrite of that project which smoothed out the rough edges and has added some really sweet extensions to the grammar.) Recently, I sat down with a stakeholder and reviewed some concrete examples in table form using Cucumber’s Scenario Outline feature. Through the conversation we were able to clear up some misconceptions I had of how the app was expected to work. As a result, it saved me a good deal of time since the actual requirements were far less complex than my original interpretation. I then used these executable examples to drive my entire development process. I have found that BDD not only helps in writing well designed code but helps in preventing unneeded code from ever being written! My presentation will focus on the outside-in approach and how Cucumber fits into this process.

How much emphasis do you place on code coverage?

Ben I use code coverage stats as an indicator, not a dictator. Meaning, if I come into a project and find low code coverage I lose confidence in the system and in the ability the tests give me to refactor quickly. However, if an app has high coverage (even 100%) I don’t assume every part of the project is well tested. I have run across situations where RCov gives you 100% line coverage but by simply looking at the code and tests it is obvious that large paths of functionality are not being tested. I highly recommend Jason Rudolph’s series on how to fail with 100% test coverage.
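A small sketch of how 100% line coverage can hide an untested path. The shipping methods are hypothetical. The point is that a line-based tool marks the one-line method as covered after a single call, even though only one side of the conditional ever ran.

```ruby
# Hypothetical method: one line of code, two paths through it.
def shipping_cost(total)
  total >= 50 ? 0 : 5
end

# Same method with a bug on the path that the check below never exercises.
def buggy_shipping_cost(total)
  total >= 50 ? 0 : -5
end

# A single check like this executes every line of either method,
# so line coverage reports 100% for both the correct and the buggy version.
def only_check_passes?(implementation)
  implementation.call(60) == 0
end

puts only_check_passes?(method(:shipping_cost))
puts only_check_passes?(method(:buggy_shipping_cost))
```

Only a second example for an order under 50 would tell the two versions apart, and no line coverage number would prompt you to write it.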

What about other measures of code quality?

Ben I am a big fan of Jake Scruggs’ Metric-fu library. I like to set it up on all my projects and have it run on a CI server (I generally use CC.rb). The Metric-fu plugin is a collection of useful tools to analyze Ruby code. My favorite tools included are flog (an ABC metric based tool) and Saikuro (cyclomatic complexity). I like them because they point out long or complex methods that are prime candidates for refactoring. While not a code metric tool per se, I also really like kablame which will use your SCM’s blame command to identify who is writing the tests and shame those who are not. Outside of these metric tools I also think tests are a great indication of code quality and design; if code is hard to test then the design is probably lacking.

Other than your own, what MWRC presentation are you most looking forward to?

Ben I am really amazed by the presentations MWRC has lined up this year. The past two years have been great and this year looks to be just as good, if not better. As a TDD/BDD junkie I am really excited to see Brian Marick present. I have heard great things about his presentations and I’m hoping to learn a lot. I also keep wondering what tricks James Britt has in store and can’t wait to try some of them on my own Nintendo Wii. I am of course also very excited to see Jim Weirich speak again; I always walk out of his presentations feeling like a better programmer.


Monday, February 09, 2009

Real World Haskell: Pre-Reading Survey

A long time ago, I was an aficionado of a language that told me that the three traits of a great programmer are laziness, impatience, and hubris. Then I discovered Ruby, which taught me about the Principle Of Least Surprise and that programming should be fun. Now, in Real World Haskell, Bryan O'Sullivan, John Goerzen, and Don Stewart promise me three things as I read their book to learn about Haskell: novelty, power, and enjoyment. That sounds like a pretty good deal to me.

After conducting an interview with Bryan, John, and Don I kept looking for a break in my reading list where I could put RWH, and I finally decided to make the room instead of waiting for it to occur on its own. Yesterday, my copy arrived.

I was immediately struck by the size of the book. Programming in Haskell (which I wrote about, briefly, here) is a relatively modest 155 pages while RWH weighs in at 640, and what I've read so far is very approachable.

Reading through the ToC and Introduction, I've built the following list of questions I want to keep in mind as I read:

  • What value do I gain from strict, static typing? How does this compare to the value I gain from strict, dynamic typing?
  • What about Lazy Evaluation?
  • What about Polymorphism?
  • Why bother with whitespace?
  • How do I think in Haskell?
  • What about Composition and code re-use? (How does it compare to Factor/Forth?)
  • How do I keep code readable? How is this different than in Ruby?
  • How does the FFI work? How does this compare to Ruby's?
  • How does concurrent programming work? How does this compare with Erlang? With Ruby?
  • How does STM fit into things? What should this teach me about threads? About Actors?
  • How do I profile, benchmark, and optimize?
And of course, the biggest question of them all: When should I be reaching for Haskell instead of Ruby, bash, or C?

I've also put together three little goals for myself. By the time I finish the book, I want to use Haskell to write:

  • a wiki
  • a twitter scanner
  • a log analyzer

I'll try to write about my progress through the book, insights into the questions above, and progress toward my three goals. Feel free to share your thoughts as well.


MWRC speaker interview: Andrew Shafer

It's Monday morning, so it must be time for another MWRC mini-interview. Andrew Shafer (@littleidea) was good enough to answer several questions for me.

This interview has a lot of meat to it, so enjoy. Then, go register for MWRC!


You're a programmer working on a very sys-admin-y tool. What could sys admins learn from developers?

Andrew This is a fascinating question on several levels. I like to talk about the Developer and Sysadmin tribes, but the truth is, there are a lot of subcultures. For example, Ruby developers are not Java developers or kernel developers. The fact is there are a lot of smart people out there with tribal optimizations who could all learn a lot from each other. I know less about the different sysadmin tribes, and I hate to overgeneralize, but from my observations the sysadmins can benefit a lot from learning to think more symbolically, especially the way things are going with infrastructure. There is a tendency to get stuck in the procedural details and to avoid abstraction. The other big thing is version control. And forget about CVS, RCS or anything else, go straight to git. Do it now. Seriously. What are you waiting for?

What can programmers learn from sys admins?

Andrew I don't want to overgeneralize here either but it's probably almost the opposite lesson. Ruby programmers tend to work with fairly high levels of abstraction. Sometimes there is a lot of value in the details. If you are only manipulating symbolic abstractions, you often don't know how things really work. Sysadmins tend to know a lot more about the operating systems, hardware and networking. If you want to build modern web applications at scale, the developers have to consider that stuff. This is less technical but sysadmins probably think more holistically about how they fit into an organization, since their customers are often internal and they wear pagers. Admins probably have less tendency to blow people off with a 'works at my desk' shrug. (or then again, they might be BOFH disciples who will make sure it doesn't work at yours...)

Either way, there are a lot of things for these tribes to learn from each other. This might not be so clear when one server can handle the whole stack. I'm a big fan of pairing, but there are definitely big benefits to having devs pair with admins on projects or issues that blur the application/infrastructure boundaries.

Why should an average Rubyist be interested in Puppet?

Andrew There is the obvious answer of being able to automate provisioning things like Rails consistently. For anyone who might not know, Puppet is a framework which allows you to describe how a system should be configured and then makes sure it stays that way. Puppet has its own language, which generates Ruby objects; these get transported to hosts, where Puppet does the work of comparing the system to the desired state and changing whatever isn't in the proper state.
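The compare-and-correct idea Andrew describes can be sketched in a few lines of plain Ruby. This is not Puppet's API or its language; converge, the service names, and the hashes are hypothetical stand-ins used only to illustrate the idea:

```ruby
# Hypothetical sketch: compare actual state to desired state and
# change only what differs. Returns a list of the changes made.
def converge(desired, actual)
  changes = []
  desired.each do |resource, wanted|
    next if actual[resource] == wanted   # already in the proper state
    changes << "#{resource}: #{actual[resource].inspect} -> #{wanted.inspect}"
    actual[resource] = wanted            # stand-in for really fixing the system
  end
  changes
end

desired = { "ntp" => :running, "telnet" => :stopped }
actual  = { "ntp" => :stopped, "telnet" => :stopped }

first  = converge(desired, actual)   # fixes ntp
second = converge(desired, actual)   # nothing left to change
```

Running it a second time changes nothing, which is the "makes sure it stays that way" property: describing the end state, rather than the steps, makes repeated runs safe.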

But Puppet has a lot of interesting pieces of code. The most interesting is probably either the parser or the resource abstraction layer; both are places where some magic happens. Puppet was first released in 2005 and people have learned a lot about DSLs and adopted Ruby idioms along the way. Some Puppet code is an unfamiliar dialect to most Rubyists, but the exercise of understanding how Puppet works will teach most people a few new things about Ruby.

Besides your own talk, what are you most looking forward to this year?

Andrew That's tough, do I have to pick just one thing? Last year, Jim Weirich's keynote blew my mind and I'd go out of my way to hear him talk about almost anything, but Brian Marick is another one of my favorite speakers who always has a few new twists and wrinkles for my cortex to process. If we can have a MWRC awesome speaker celebrity death match between those two, I'll look forward to that the most. Last year, the most practical talk for me was Philippe Hanrigou's jedi knight dtrace talk, so I'm looking forward to getting his insights from Smalltalk. Jay Phillips's Adhearsion talk might be a show stopper too. Ben Mabey and David Brady, I have met at the Utah Ruby User Group, so I want to see them represent. Can I just go through the whole speaker list and say why I'm excited about each talk? I always look forward to hanging with the Ruby tribe.

Why should people come to MWRC?

Andrew MWRC 2008 was my first Ruby conference and I was blown away. If this year is half as good, it is worth 10X the price of admission. Watching Jay Phillips and Yehuda Katz debate about testing in the hallway for 10 minutes was worth the $100. At the time, I was coming up the Ruby onramp from years of Java and C/C++. I came in with enthusiasm but no expectations, and I left thinking I finally found my people. I was struck by the raw passion and the touch of irreverence. It wasn't just a bunch of people talking about a technology; it was a group who was literally bending technology to their will and daring you to join them if you could. You shouldn't come to MWRC because you want to learn about Ruby; Google and Amazon can solve that. Come to MWRC because you want to be inspired.


Thursday, February 05, 2009

Beautiful Architecture: my survey

In my last post, I alluded to my efforts to read more intentionally and admitted that I chose to jump off of the wagon at nearly the first opportunity. I haven't abandoned my attempt, and I thought it might be worth sharing some of my thoughts and notes as I survey[0] Beautiful Architecture.

The first thing that jumped out at me is a statement on the page before the ToC: "All royalties from this book will be donated to Doctors Without Borders." While this has nothing to do with the text, I appreciate passion and commitment. Being willing to make a commitment like this strikes a chord with me.

Moving on to the table of contents, I found that each chapter is an essay from a different person or team. I recognized some, but not all of them. Who are they? Why did Diomidis and Georgios select them? This isn't a knock on the book, but a recognition that I need to do some studying to understand who these people are and why I should be listening to them. Perhaps knowing more about their backgrounds will also help me better understand their positions (and biases) and improve the value of the book to me.

Some of the chapter titles stand out to me as well. "A Tale of Two Systems: a Modern-Day Software Fable", "Data Grows Up: The Architecture of the Facebook Platform", "When the Bazaar Sets Out to Build Cathedrals", and "Rereading the Classics" all evoke a desire to dig into them and see what they have to say to me.

As I step into the preface, I start to find some questions that I really want to find the answers to:

  • How will architecture impact my role in infrastructure and operations?
  • How should I approach data centricity vs. application centricity?
  • What can I learn from functional programming/architectural approaches?
  • What trade-offs should I be looking at between stability, extensibility, performance, and aesthetics?
  • How do I define beautiful architecture, and do I see that beauty in my projects or the systems I work on in my day job?
I'm not sure where I'll come out on the other end of this book, but it looks like it's going to be a fun ride.


[0] Pragmatic Thinking and Learning recommends a five step approach to reading that Andy calls SQ3R: Survey, Question, Read, Recite, Review. Not only does this look like a great idea, it reminds me of the SPQR shirt I used to wear as a classics geek in high school, so you know I've got to try it.

Beautiful Architecture: a first look

I recently got my copy of Beautiful Architecture. Following the SQ3R reading pattern I picked up from Pragmatic Thinking and Learning, I started my survey in the Table of Contents, and what did I find but a chapter on Emacs and Creeping Featurism by my friend, Jim Blandy.

All discipline shot, I stopped my survey and jumped in to read jimb's essay. It was all I expected it to be — jimb's a really smart guy after all. The only nit that I'd pick is that in one spot he sells emacs short, saying:

...Emacs has only a limited understanding of the semantic structure of the programs it edits, and can't offer comparable [refactoring] support.

In truth, all Emacs needs to provide refactoring support for a language is an external program that it can use to provide that support. In Ruby-land, I have high hopes for combinations of tools like reek, flay, and RFactor underlying Emacs and making Ruby refactoring easy.

Tuesday, February 03, 2009

MWRC speaker interview: Jeremy McAnally

Here's the second MWRC speaker interview (you can read the other one, with David Brady and Kirk Haines here). This time, I'm talking with Jeremy McAnally (@jm), who's presenting "Jive Talkin’: DSL Design and Construction".

If you want to learn more about DSLs, or about Ruby in general, make sure you're at the MountainWest RubyConf on March 13th and 14th. Register for your spot now!


What makes DSLs so interesting to Rubyists?

Jeremy There are two technical reasons, I think. First, we already write such readable code, thanks to Ruby's lack of line noise in its syntax, that we like to do that whenever possible. So when we start thinking about syntax for what we're going to be building, we figure why *not* build an internal DSL if it's minimal work on top of what we're going to be doing on the front end and will save us some time on the back end.

Secondly, Ruby is really, really dynamic and so we're able to bend and build objects as we see fit. These features make it super easy to "customize" (for lack of a better term) the syntax of our system.

On a more social level, I think Ruby appeals to people who like clean code, therefore the cleaner the code (i.e., the less extra language-y crap around what really matters) the better. That's essentially what a DSL really is: stripping your "lexicon" down to its purest, most specific form for what you're working on. Ruby lets you do that a lot more than other languages, as I've learned through my C# experience and in my recent Objective-C adventures. Every time I go back and tinker with these languages, I always come back more thankful for Ruby's features and syntax.
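As a rough illustration of the kind of internal DSL Jeremy is talking about, here is a small sketch that leans on two of Ruby's dynamic features, instance_eval and method_missing. The Recipe class, the recipe helper, and the verbs are all hypothetical, invented for this example rather than taken from his talk or any library:

```ruby
# Hypothetical recipe DSL: the "lexicon" is just three verbs.
class Recipe
  VERBS = [:boil, :add, :stir].freeze

  attr_reader :name, :steps

  def initialize(name)
    @name  = name
    @steps = []
  end

  # Words in the lexicon are recorded as steps; anything else
  # falls through to Ruby's normal NoMethodError.
  def method_missing(verb, *args)
    return super unless VERBS.include?(verb)
    @steps << [verb, *args]
    self
  end

  def respond_to_missing?(verb, include_private = false)
    VERBS.include?(verb) || super
  end
end

# Evaluate the block with the new Recipe as self, so bare words
# inside the block become calls on the recipe.
def recipe(name, &block)
  r = Recipe.new(name)
  r.instance_eval(&block)
  r
end

oatmeal = recipe "oatmeal" do
  boil "water"
  add "oats"
  stir
end
```

The block reads as a list of instructions with no receivers, parentheses, or setup code around it, which is the "less extra language-y crap" point in practice.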

How can a Ruby Nuby get a toehold into DSL building?

Jeremy Well, this is sort of the whole topic of my talk in a sense, but in a nutshell: experiment. Play with new language features, try to bend Ruby to fit new ideas, do something crazy with lambdas. Once you have a feel for Ruby's really flexible features, then build your lexicon and start implementing it. Don't be afraid to iterate; none of my DSLs have ever just popped out in their perfect form. The rg DSL took 2-3 passes to even get it into something I would use, much less something I was satisfied with.

This will be your second year at MWRC. What was your favorite part of last year's conference?

Jeremy I feel like MWRC is one of the few real community conferences out there. A lot of regional conferences these days are getting more and more commercial and about the "experience", and less and less about what actually matters: socialization, awesome talks, and hacking. Hopefully MWRC along with the other solid community events can continue to be shining examples of what a conference should look like.

Besides your own talk, what are you most looking forward to this year?

Jeremy I'm hoping James Britt's talk can convince my wife to buy me a Wii, but, uh, more realistically, I'm interested in the talk on Rhodes. I've been trying to get into iPhone development, but it'd be awesome to build my app once (in Ruby no less!) and deploy on iPhone, Palm, etc. Of course, everyone always likes to hear what Jim Weirich has to say, and I'm really excited about the Adhearsion talk and its little sandboxed surprise. :)

Why should people come to MWRC?

Jeremy Uh, to see me speak, of course. Also, you guys concentrate a lot on the social aspect which makes for a great experience, you aren't in it to make money so I don't feel like I'm being accosted by vendors all the time, and there are always awesome talks. I'm still watching videos from two years ago.


Monday, February 02, 2009

MWRC Speaker Interview (from twitter)

Twitter presents opportunities for trying things differently. With the MountainWest RubyConf coming up, I wanted to interview some of the speakers — this proved to be my chance to try things differently. I like the way it turned out.

I started out interviewing David Brady (@dbrady), but Kirk Haines (@wyhaines) joined the party. Then Jim Weirich (@jimweirich) jumped in at the end too.

It was kind of like doing an interview in the hall at RubyConf and having other hackers join in the fun. While I'm sure it wasn't for everyone, I liked the way it turned out and will probably do some more interviews this way in the future. Let me know what you think.


Pat Eyler @dbrady you came to MWRC last year, this year you're speaking there. Why do you think MWRC is worthwhile? [tweet]

David Brady I was completely blown away by MWRC last year. I learned twice as much as I did at RailsConf, for a tenth the cost. Also, people look at single-track conferences like it's a compromise. It's not. 1 track means 1 back channel, 1 coherent group. MWRC had more IQ per dollar than a yugo full of rocket scientists hauling a sled full of border collies. [tweet tweet tweet]

Kirk Haines I'll give you my answer to that. Last 2 years it has been well managed & interesting. I expect the 3rd year to be the same. [tweet]

Pat Eyler @dbrady The notes for your #mwrc presentation say "TourBus", that's it. What are you talking about and why is it important? [tweet]

David Brady TourBus is a web load-testing tool that balances scalability and complexity. It's important because your app doesn't scale. [tweet]

Pat Eyler @wyhaines your #mwrc presentation description isn't much better, "Vertebra". What are you going to be covering in your talk? [tweet]

Kirk Haines Hmm. I sent a much more detailed description. So, my talk: An overview of Vertebra, and then drilling down into some of the interesting Ruby problems & the solutions to them. [tweet tweet]

Pat Eyler @wyhaines can you give an example of an interesting problem? [tweet]

Kirk Haines Building a communications core to handle fault tolerant XMPP communications in a transport-flexible way, without going insane from trying to manage a sea of threads, mutexes, special cases, etc. The solution involves an evented core w/ all segments of work encapsulated in slick little deferrable objects. It's pretty cool. [tweet tweet tweet]

Pat Eyler @dbrady Besides your own talk, what are you most looking forward to this year at #mwrc? [tweet]

David Brady Honestly the thing I look forward to most is the hackfest. Another reason to love 1-tracks: you all conf, then you all go hack. As for the content, I'm excited by the "focused passion" of the talks: Wii's, Craftsmanship, Rhodes, Cucumber, etc. Lastly, I'm a bit of a hero worshipper. I'd pay to hear James Edward Gray, Jim Weirich or Jeremy McAnally talk about oatmeal. [tweet tweet tweet]

Jim Weirich @dbrady Revamping talk to make sure it includes copious references to oatmeal. [tweet]

Pat Eyler @dbrady What are you going to be hacking on at the hackfest? Do you have a project in mind, or will you find something there? [tweet]

David Brady Not a clue. One of my Resolutions is to ship code every single day. Currently I'm into nanoapps, so six weeks is a LONG way away. [tweet]

Pat Eyler @wyhaines what are you looking forward to at #mwrc 2009? [tweet]

Kirk Haines The same as @dbrady, more or less. Hackfest and interacting with great people + opportunity to learn some cool new stuff. [tweet]
