On Ruby: January 2009

Thursday, January 29, 2009

Wicked Cool Ruby Scripts Review

No Starch Press has put out a number of books that I've really enjoyed (The Manga Guide to Statistics and Ruby By Example among them), so I was very excited to see Wicked Cool Ruby Scripts announced.

Given the subtitle, "Useful Scripts That Solve Difficult Problems", I was hoping to find a good collection of idiomatic scripts that I could recommend to folks getting started with Ruby. That's not really what I found, though.

The book contains 58 scripts which represent a fairly wide swath of problems, but most of the programs are so short that they don't really show good, idiomatic Ruby. I also question why some of the scripts are included (e.g., adding a user to a Linux system; writing a wrapper around useradd doesn't seem useful, even as an exercise).

I'm not saying that a reader won't learn something from this book, but I don't think it would be a good first or second book on Ruby. If you're already a rubyist and you're looking for a book with some good ideas, this might be a good book to pick up.

Tuesday, January 27, 2009

CouchDB Contest

Ok, it's contest time again!

O'Reilly has offered up two free keys to the rough-cut edition of Relax with CouchDB, so Jan and I decided to find a cool way to give them out.

Here's the plan: we want to hear about your CouchDB project ideas. You can write about them in the comments here, or just link to your own blog. Just an idea isn't enough, though; let us know why you think CouchDB would be a good fit for it. You can submit ideas until February 13th, after which Jan and I will look them over and pick two winners.

Have fun, and good luck!

Friday, January 23, 2009

Reader Interview: Relax with CouchDB

As I was preparing to do a second interview with the writers of Relax with CouchDB, I decided to approach the community that's already reading the book (even though only 4 chapters have been released so far). Chris, Jan, and Noah are writing this book in a very open fashion to try to improve the final product and to build a community around it before it hits the shelves. I thought it would be interesting to see what some of the readers thought. I was lucky enough to get Rich Morin, @rdmorin, to respond (I've been a fan of his since the Prime Time Freeware days, see below). Rich has been participating in the Freeware community for a long time, and I think he brings some real wisdom to the table.

What attracted you to CouchDB?

Rich I've been looking around for a while for a way to handle the mixture of unstructured, semi-structured, and formally structured data that I encounter in mechanized documentation projects. I've designed a few schemas that make my DBM friends grow pale, but none that looked very easy to use, given that I'd have to encode my queries in SQL.

Ontiki is the current incarnation of a long-term mechanized documentation project. I'm planning to use CouchDB for it, along with Erubis, Git, and Merb. It should stretch the traditional notions of wikis quite a bit.

I heard about CouchDB in a talk by Ezra Zygmuntowicz (of Engine Yard and Merb fame) and decided to look into it. Although it's still a Work In Progress, CouchDB looks extremely promising. I particularly like the fact that it uses Ruby-friendly data structures (eg, lists, hashes, and scalars) and that it should scale extremely well.

Why are you participating in this kind of open review/development of Relax with CouchDB?

Rich As a frequent buyer of technical books, I generally look for books on topics of interest. I've bought WIP books before and have no problem with seeing material that still needs work. In fact, I like the fact that I can help to influence which questions get answered, etc.

What do you think this level of openness will do for the book, good or bad?

Rich Several years ago, I was a small-scale publisher of book/CD combinations (Prime Time Freeware, for anyone who remembers :). When we did the book on MacPerl, we actively solicited user input and review (though we didn't charge for access to the PDFs). The responses ranged from nitpicking to well-reasoned arguments about pedagogical style. Almost all of them were useful (sometimes extremely so) in improving the book.

Author Interview: Relax with CouchDB (Round 2)

I had a great opportunity to trade emails with Jan Lehnardt in a second interview about Relax with CouchDB. This time, we touched on TDD, refactoring, and of course, the book.

The initial chapters have been available for over a month now, gathering feedback. What's been the biggest change you've made due to feedback?

Jan We still have things to integrate, but we took a lot of notes. The biggest thing we've seen is where we tried to explain concepts in CouchDB by contrasting them to how things are done in the RDBMS world. Production systems often do not follow theory to the book because of performance reasons (denormalization comes to mind). So we are saying in CouchDB your data is denormalized, thus fast, and actually true to the "CouchDB Theory" but now people are (rightfully) pointing out that the RDBMS systems have been used wrongly. Fact is: We don't want to say bad things about the RDBMS world, we just tried to explain things by comparison and a lot of people coming to CouchDB have an RDBMS background, so we thought it is a good idea to contrast them.

We learned that this is not the best approach and we are moving things a little towards explaining CouchDB on its own instead of comparing it to relational databases in the first chapters. Again, I'm not saying anybody is more right or wrong here, it was just a poor choice on our part because we didn't know we'd cause such a ruckus :)

PS: CouchDB is not a relational database and we all support the idea of using the right tool for the job. This is sometimes an RDBMS and sometimes CouchDB :)

As I looked over Chapter 4, one blurb stood out to me: Applications "live" inside design documents. You can replicate design documents just like everything else in CouchDB. Because design documents can be replicated, whole CouchApps can be replicated. Can you explain this in a little more depth?

Jan CouchDB is an application server in disguise. It can host HTML+CSS+JavaScript applications like any other web server, but it also provides an HTTP API to a database system. This is the perfect basis to write standalone applications using the web-technologies everybody knows.

CouchDB's replication becomes a distribution channel not only for data (what books do you have in your library?) but also for entire applications (I enhanced the library application to also handle my board game collection, do you want my patch?). Think of GitHub, but for applications and peer-to-peer distribution.

You can also read more about this topic in a series of blog posts by Chris:

Standalone Applications with CouchDB

My Couch or Yours? Shareable Apps Are The Future

Refactoring is on my mind a lot right now, and with that comes testing. How testable are CouchDB apps? What kinds of tools or frameworks exist to do testing?

Jan We are currently working with TDD experts to find a good solution to allow CouchApp developers to test their applications inside-out.

Since this is all web-technology, we expect we can re-use some of the existing tools. We just want to go the extra mile and make it really easy for the developer.

What about refactoring proper, what's the state of the art in CouchDB refactoring?

Jan That depends a bit on what you mean. Refactoring CouchApps has not been tackled yet. But CouchDB is schema-free so you can just play around and change things. Documents (that includes the design documents that hold your application) are versioned, so you can go back to an old revision (not forever, but for a little while) if you screwed up.

About refactoring your data: Say you have an app that stores user profiles and you started out with separate fields for first- and last name. But user-feedback and UI-design found out that a single `name` field is better suited for your app. Your map function to get all first and last names originally looked like this:


 function(doc) {
   emit([doc.firstname, doc.lastname], null);
 }

And your new one looks like this:


 function(doc) {
   emit(doc.name, null);
 }

You can consolidate both to support legacy data:


 function(doc) {
   if(doc.name) {
     emit(doc.name, null);
   } else {
     emit(doc.firstname + " " + doc.lastname, null);
   }
 }

You change your UI-code to deal with a single `name` value and this view will consolidate old and new documents.

Yes, this is a little dirty, but also pretty neat. At some point, you'd want to clean up all your data and get rid of the special cases. Our offhand suggestion is that for minor versions, where you want to add features quickly and make updates painless, you use the duck-typing above, and for major versions you take the time to consolidate the cruft, update your data properly, and prune your map and reduce functions.

This is good advice (hopefully), but we might be able to provide you with tools and libraries that handle the dirty work for you so you can concentrate on improving your app instead of fussing with the database. After all, this should be relaxing!

Thursday, January 22, 2009

Pragmatic Thinking & Learning

I've been reading Pragmatic Thinking and Learning and I really love it. It's one of those books that's really hard to put down, but really hard to read straight through. I find myself constantly going back to sections, making notes, and thinking about how to apply ideas.

I'm working at applying two of the ideas right now. I'm building a personal wiki (using emacs wiki mode) to better manage my personal, scouting, and professional lives. I'm only a couple of days into this, but I foresee a lot of wiki gardening as I sort things into a better management system for myself.

I'm also working on becoming more intentional in my reading, both using the SQ3R model Andy describes and picking the books that I read more carefully.

It's interesting to see how books like this cut across different aspects of your life, improving all of them. I think I'm going to develop the same kind of 'go back and reread it again' relationship with Pragmatic Thinking and Learning that I have with The Pragmatic Programmer. It's just that good.

Looking for serious Ruby Hacktitude?

I just noticed that zenspider and some of my other buddies up in Seattle.rb have hung their shingle out again. If you need some big Ruby guns, these guys are the deal — hire them now before someone else does.

Besides, if they're not working on your project, they're liable to extend Wilson to produce PPC output or something equally dangerous. For the sake of humanity, don't let them sit too long.

Wednesday, January 14, 2009

Ruby Best Practices: mini-Interview 2

As Gregory Brown (@seacreature) and I discussed the Ruby Best Practices Contest, we also talked about the current state of Ruby Best Practices (or get the Rough Cut). You can read the resulting interview below.

Before we get to that though, we should announce the two winners of the contest: Jamis Buck (actually entered by Daniel Berger) and Eric Hodel. Eric and Jamis both get free Rough Cut access to Ruby Best Practices. Who knows, they might even find some of their own ideas in there.

How is the writing process itself going? How are you dealing with feedback from readers/reviewers?

Gregory The process we've put together really seems to be working. When I first pitched the book, I set the pre-condition that we'd need to have a broad panel of advisers from the get-go. If this book is going to have the title "Ruby Best Practices", I want it to be able to stand on more than my own name, for sure. So I asked a few of my friends, a few of my students, and a few folks I think of as truly masterful Rubyists to be part of an active review process. So far, that has been working really well.

Every chapter gets two levels of review, at a minimum. I start by posting to our internal reviewers a plain text file and a PDF, and collect their comments. After about a week, I make revisions based on their suggestions and send things off to O'Reilly for a RoughCut update. Usually within a day or so, the changes hit Safari and those who pre-ordered the book can comment on them. That group has already caught a couple problems and made suggestions, so it's definitely useful to have more eyes on the content. I think it's really important to release changes as often as possible, to encourage further review. Of course, we use some great tech to help us with that.

Because the book's sources are in Asciidoc, there is no separate typesetting process. My plaintext files are the same source that the PDF is generated from. This means I can make whatever revisions I want whenever I want, and I don't need to be worried about getting bogged down by the formatting. We can also go back and make changes to old chapters later on without much worry.

So far, O'Reilly has basically let me run things exactly how I wanted to, and it has been working out great. Fast, tight iterations will help make the book grow organically based on the input from our internal reviewers as well as the wider reader base. That's what I'm hoping for, anyway.

I hear that your conversion of the testing examples from RSpec to Test::Unit is pretty huge. Why are you making the change?

Gregory Oh, for the "Driving Code Through Tests" chapter? I just sent that out to the internal reviewers, so we'll see what they think. :)

But actually, it was their suggestion. The initial RSpec based chapter was meant to reflect the change in the tides in the Ruby community, but it wasn't too hard to convince me that I should reconsider things. I used to think RSpec was easier to teach than Test::Unit until I wrote a chapter about it, and found myself having to show reference implementations of how things like foo.should be_bar work, whereas assert foo.bar? needs no extra explanation. I had to explain what a proxy object was for folks to understand how to build custom matchers, and what a lambda was for people to know how to test their exception raising. I'll be blunt, that sucks.

But to be fair, I use RSpec for a good bit of my work, and I think it's okay once you get the hang of it. But the ideas I wanted to express in RBP weren't really tech-specific. I was mostly talking about overarching strategies like keeping your tests atomic, writing some helpers here and there, that kind of stuff. So some of the 'features' of RSpec were actually getting in the way. Test::Unit (well, now minitest/unit I guess), is standard Ruby, is dirt simple, and was able to demonstrate the strategies I wanted to share with folks with less magic.
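To make the teaching-cost argument concrete, here is a toy sketch in plain Ruby of the extra machinery a reader has to follow before `foo.should be_bar` makes sense, next to the `assert foo.bar?` form. Everything in it (`ToyAssertionFailed`, `BePredicate`, `ToyExpectations`, `be_empty`) is a hypothetical stand-in written for this post; it is not the actual RSpec or Test::Unit source, and it says nothing about how the book's chapter explains it.

```ruby
# Toy stand-ins for illustration only; not the real RSpec or Test::Unit code.
class ToyAssertionFailed < StandardError; end

# Test::Unit style: a plain method call on a plain boolean.
def assert(condition, message = "assertion failed")
  raise ToyAssertionFailed, message unless condition
  true
end

# RSpec style needs a matcher object that remembers which predicate to call...
class BePredicate
  def initialize(name)
    @predicate = "#{name}?"
  end

  def matches?(actual)
    actual.public_send(@predicate)
  end
end

# ...plus a `should` method that hands the receiver to that matcher.
module ToyExpectations
  def should(matcher)
    unless matcher.matches?(self)
      raise ToyAssertionFailed, "#{inspect} did not match"
    end
    true
  end
end

# Hypothetical helper; a real framework would generate these dynamically.
def be_empty
  BePredicate.new(:empty)
end

basket = [].extend(ToyExpectations)

assert basket.empty?    # one concept: call a method, check the result
basket.should be_empty  # three concepts: should, a matcher object, a predicate lookup
```

Both lines check the same thing; the difference is how much has to be explained before the second one stops looking like magic.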

So, maybe it's not such a big change. RSpec, Test::Unit, Minispec, Test::Spec ... How much does the specific tool really matter versus being serious about using it?

Gregory The interesting thing is I didn't really need to change the chapter much, aside from translating the source. Some sections were dropped and others will be added in, but for the most part, it's the same chapter and it follows the same general blueprint. This really underscores the fact that what matters are the ideas behind testing, not the technology. I'm hoping the reader will benefit from the simplicity of Test::Unit for my examples, but then take the knowledge and apply it wherever they want, whether it be to RSpec, Shoulda, or whatever it is the cool kids will be using when the book hits the shelf.

I think the specific tool matters, but only when it comes to comfort, not concepts. It's arguable that certain frameworks encourage you to embrace a concept more than others, but I don't think that's as big a factor as people might think.

How important is mocking as a Best Practice? What about Stubbing? Where do they fit in relation to each other?

Gregory Well, I think that mocking and stubbing are both pretty important when you need them, and completely worthless when you don't. I've seen systems in which every interaction with every other class was fully mocked out with all the behaviors specified. These systems are absolutely nightmarish to work with, because the tests become so brittle that you need to re-write your giant, complicated mocks every time you refactor. Tests should aid in refactoring, not get in the way, whenever possible. I've also seen (*cough* written *cough*) systems that stub out so much and produce canned results for everything which gives you wonderful passing tests but, well... don't actually test anything.

But of course, responsible use of mock objects to minimize your dependencies on expensive external resources can really help make your tests run faster and more clearly show what's going on. Stubbing out a method here or there to prevent you from testing things you don't care about can definitely be helpful. In general, I prefer to use whatever technique is the right combination of cheap, fast, and easy. That combination changes wildly from project to project so anyone who tells you that you should never make use of mocking / stubbing is crazy, and so are the people who want you to do it all the time.

As far as where all these things fit together, I think that's more a question for folks with a stronger grasp of theory than me. The best I can do is point you at Martin Fowler, then tell you that I don't really care about their distinctions at the high level. I only care about using the right approach for the job. Libraries like flexmock that intentionally blur the lines between mocks and stubs are great in the hands of responsible users, if you ask me. :)
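For readers who haven't met the distinction, here is a minimal hand-rolled sketch in plain Ruby. All of the names (`PriceReport`, `StubFetcher`, `RecordingFetcher`, `price_for`) are made up for this illustration, and none of it is flexmock's API. The stub only supplies a canned answer in place of an expensive resource; the recording double also captures the interaction so a test can check it, which is the part that gets brittle when it's overused.

```ruby
# Hypothetical names throughout; a hand-rolled sketch, not any library's API.

# Code under test: formats a report from whatever the fetcher returns.
class PriceReport
  def initialize(fetcher)
    @fetcher = fetcher
  end

  def summary(symbol)
    price = @fetcher.price_for(symbol)
    "#{symbol}: #{format('%.2f', price)} USD"
  end
end

# Stub: a canned answer, so the test never touches the slow external resource.
class StubFetcher
  def price_for(_symbol)
    12.5
  end
end

# Recording double: same canned answer, but it also remembers how it was
# called, so a test can make claims about the interaction itself.
class RecordingFetcher
  attr_reader :calls

  def initialize
    @calls = []
  end

  def price_for(symbol)
    @calls << symbol
    12.5
  end
end

# State-based check: only the result matters.
puts PriceReport.new(StubFetcher.new).summary("RUBY")

# Interaction-based check: the test also depends on *how* the fetcher was used.
recorder = RecordingFetcher.new
PriceReport.new(recorder).summary("RUBY")
puts recorder.calls.inspect
```

The second style is what breaks during refactoring: change how `PriceReport` talks to its fetcher and the recorded calls change, even if the summary string stays the same.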

While Ruby Best Practices is going to end up Free, O'Reilly has been even more open in the writing of a couple of other books that are cropping up. How does this go-round compare with your completely open work on the Ruport Book?

Gregory Well, the Ruport Book taught me a lot, both good and bad, about writing a community oriented book. The idea of having an internal review team is something we had on the Ruport Book as well, and I carried it on to RBP. It seems like because the topic is more general, this has worked out much better on this book than it did for the Ruport Book, simply because most of my internal review team knows a hell of a lot more about Ruby than I do!

But the Ruport Book and RBP are two very different books, so it's hard for me to compare them.

I'm having difficulty expressing my feelings on this, but basically, the way this book is formed in my head doesn't feel much like a textbook. It's more like a quasi-fictional narrative "based on a true story" with notable mentions of various Ruby hackers and their projects. It's sort of like I've stitched together a series of blog posts so that it bleeds into this textbook undercurrent. The book is meant to be read from cover to cover, or at least in whole chapters. This makes it tricky to make wide open in the early phases.

But once the work is done, I want it to be able to bend under the pressures of peer review. I want people to be able to freely use and remix what I've done, however they see fit. And of course, that's already planned to happen and so it will, it will just take some time.

I believe it can be a virtue to keep something a little quiet until you know for sure where it is going. It's sort of like waiting to do a first release of an open source project until you clean up the code a bit and make it useful and approachable to others. That's exactly what I'm doing with RBP now, and hopefully, that will benefit those who might be interested in contributing to the book down the line.

Lots of people coming into Ruby from other languages bring Best Practices (and sometimes Pessimal Practices) with them. What languages/communities do you think offer the best opportunity for finding good practices for Ruby?

Gregory That's a good question. I don't really know. I think we end up getting more bad patterns than good from other languages. This isn't to say anything bad about other languages, but it is to say that best practices aren't always transferable. Look how many patterns from Java aren't even remotely relevant in Ruby. Or how many cool functional programming techniques fall down and die when you try to import them. I think maybe the fact that we only got a few responses to our blog contest here was an indication of that, to some extent.

However, I think many communities within Ruby are a hotbed for best practices. RBP draws almost all of its key points via the influences of notable open source Ruby projects. Rails utterly changed the way people write Ruby, in many ways for better, in other ways for worse. Merb has really made me re-think issues like performance. Smaller libraries also pack a punch. FasterCSV is the source of a lot of inspiration for me, and in turn, I think FasterCSV has been influenced to some extent by Ruport. After all, it has its roots there, anyway.

So if you pay attention to the conversations between well trodden Ruby projects, both in words, and in code, you'll see a whole lot of great strategies emerge. My hope is that this book will give people a snapshot of some of them so that they can pick up new skills in the way many of us hackers are doing: One new patch at a time.

I see that you've set up a twitter account for RBP. How do you think that will impact the book?

Gregory It'll motivate me a little to keep posting updates, give me a way to keep people informed about book related news (such as this interview), and give readers (or potential readers) a way to connect with me and share their thoughts. I'm not a big twitter addict, but I'll at least let people know when to expect new content and whenever something I think might be interesting comes along. Beyond that, I guess we'll see how it goes.

Tuesday, January 13, 2009

Sequel Interview with Jeremy Evans

With the recent release of Sequel 2.9.0, I've finally taken the time to interview Jeremy Evans, the project's maintainer. Back in Jan 2008, I interviewed Sharon Rosner (the former maintainer). Sequel has come a long way since then, so it's about time I revisited the project.

You took over Sequel from Sharon Rosner almost a year ago. How and why did you end up holding the reins on this project?

Jeremy My first experience with Sequel was with Sequel 1.2, in February of 2008. I was looking for an additional ORM to support in my Scaffolding Extensions project, and had already added support for ActiveRecord and DataMapper. At the time, Sequel didn't really have support for model associations, and the recommendation was to just implement your own methods to get associated objects.

I worked on a simple associations implementation for Sequel that was only around 60 lines and handled the three main association types (many_to_one, one_to_many, many_to_many) with full reflection support. I posted on the Sequel mailing list with it and Sharon liked it. It became the basis for the association support Sequel introduced in 1.3, and I was given subversion commit rights.

In the middle of March 2008, Sharon emailed four of the developers with subversion commit rights and asked if we could assume leadership of Sequel, since he was planning to move on from programming and do something completely different. I was one of the two developers that responded, and initially I was only going to be responsible for sequel_model. I ended up taking over maintenance of sequel_core a couple months afterward, since the other developer didn't have enough time to maintain it.

With Merb being merged into Rails, do you see a future for Sequel in Rails 3.0?

Jeremy Yes. I'm guessing the merb_sequel plugin will be ported over to Rails 3.0.

It's fairly easy to use Sequel in Rails already, just by requiring it and setting up a database connection with Sequel.connect. I have 5 Rails projects that use Sequel in this manner.

It's good to have people using the projects they're working on. What kinds of changes have you made to Sequel based on the way you use it?

Jeremy The association code was designed to be easy to use by my Scaffolding Extensions project. The eager loading code was designed to be fairly similar in usage to ActiveRecord's, making it easy for me to convert projects from ActiveRecord to Sequel.

On my largest Rails project, I often had cases where associated objects needed to refer back to their parent objects:


 artist.albums.each{|album| p album.artist}

That causes issues because ActiveRecord will do a query for each album to get the artist. I originally handled this in ActiveRecord by doing something like:


 Artist.has_many :albums, :include=>:artist

I never liked this way of doing things, so I changed Sequel's association implementation so that the parent association is cached (the ActiveRecord parental_control plugin does something similar, I hear).

I only add things to Sequel itself if they are generic and not tied to any specific implementation. The most recent example of this is support I added to Sequel for Giftsmas. Giftsmas runs on PostgreSQL and uses triggers to handle some constraints. I added {create,drop}_{language,function,trigger} methods to the PostgreSQL adapter, which are generic enough they can be used by any PostgreSQL user. I then created the sequel_postgresql_triggers project, which has specific implementations of some common column types (counter/sum cache columns, immutable columns, timestamp columns). Giftsmas uses sequel_postgresql_triggers in its migration to set up the database.

I didn't add the code in sequel_postgresql_triggers to Sequel itself because they are specific implementations, and there are other (perhaps better) ways of doing the same thing.
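Going back to the album/artist example in this answer: the cost of walking back to the parent, and what caching the parent buys you, can be seen in a toy in-memory model. This is only a sketch under my own assumptions. `FakeDB`, its query counter, and the `cache_parent:` flag are hypothetical and are not Sequel's or ActiveRecord's implementation; they just count how many lookups each approach would trigger.

```ruby
# A toy in-memory model; hypothetical classes, not Sequel or ActiveRecord code.
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
    @artists = { 1 => "Elvis" }
    @albums  = [{ title: "Blue Hawaii", artist_id: 1 },
                { title: "G.I. Blues",  artist_id: 1 }]
  end

  # Each lookup stands in for one SQL query.
  def artist_name(id)
    @query_count += 1
    @artists.fetch(id)
  end

  def album_rows_for(artist_id)
    @query_count += 1
    @albums.select { |row| row[:artist_id] == artist_id }
  end
end

class Album
  attr_writer :artist # lets the parent pre-fill the cached association

  def initialize(db, row)
    @db = db
    @row = row
  end

  # Falls back to a lookup only when no parent has been cached.
  def artist
    @artist ||= Artist.new(@db, @row[:artist_id], @db.artist_name(@row[:artist_id]))
  end
end

class Artist
  attr_reader :name

  def initialize(db, id, name)
    @db = db
    @id = id
    @name = name
  end

  def albums(cache_parent: false)
    @db.album_rows_for(@id).map do |row|
      album = Album.new(@db, row)
      album.artist = self if cache_parent
      album
    end
  end
end

def queries_for(cache_parent)
  db = FakeDB.new
  artist = Artist.new(db, 1, "Elvis")
  artist.albums(cache_parent: cache_parent).each { |album| album.artist.name }
  db.query_count
end

puts queries_for(false) # 3: one query for the albums plus one per album
puts queries_for(true)  # 1: only the albums query
```

With two albums the difference is small, but the uncached count grows with the number of albums while the cached count stays at one.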

What kind of tools are you using to help with code quality — test coverage, complexity, refactoring, etc.?

Jeremy I test coverage before every release, anything less than 100% gets fixed. 100% code coverage means nothing, but less than 100% code coverage means something.

I don't use any IDE refactoring support (I code in vim and SciTE depending on the situation). I don't use any ABC metrics to measure code complexity. There are parts of Sequel that could definitely do with refactoring, but I generally wait to refactor until I'm going to make changes to the code that refactoring will help. I don't refactor for its own sake.

I use TDD when the problem space is known and I know what outputs I want for each input. I'd say I do TDD for new features about 50% of the time, and about 90% of the time for bug fixes.

In an interview with InfoQ, you talked about eight things you thought had gotten better with Sequel. What's improved since then?

Jeremy That was shortly after the 2.7.0 release, so the major improvements since then are:

Database stored procedure support in the MySQL and JDBC adapters.
Much better support for database schemas.
Much improved compound SQL statement support (i.e. UNION, EXCEPT, and INTERSECT).

For full details, please see the release notes:

2.8.0
2.9.0

What do you see happening in the next major Sequel release?

Jeremy I expect the 2.10.0 release of Sequel to include:

A Firebird adapter.
A DataObjects (the underlying database connection library used by DataMapper) adapter.
Better handling of MySQL CREATE TABLE options, such as the ability to specify an engine.

I don't have any other major plans for 2.10.0, but most of Sequel's new features originate in the community (in ideas if not code), so I'm sure there will be other improvements.

Sequel 2.10.0 will probably be released in the first half of February 2009.

That's pretty aggressive. How do you keep your release cycle short?

Jeremy I put all but the simplest commits through the same test suites I put the releases through. Releases are generally time based, not feature based. If a feature isn't ready and is going to hold up the release more than a week, it just goes in the next release. For example, the Firebird adapter is already mostly complete and was going to be in 2.9.0, but the developer working on it asked for a little more time to polish it, so it didn't make 2.9.0 and will have to wait for 2.10.0.

How closely do you watch other ORMs (Ruby or non-Ruby)? Which ones seem most interesting to you? What do you learn from them?

Jeremy I browse the DataMapper mailing list and periodically chat with the DataMapper developers on IRC. I generally look at the "What's New in Rails" posts to see what is going on in ActiveRecord. However, I don't follow either very closely. I also spent a bit of time looking at Lone, a PostgreSQL-specific ruby ORM.

DataMapper is interesting in that their focus is very different from Sequel's: they are aiming to be a persistence framework for generic classes. I don't have any experience with their recent code, though; the last time I used DataMapper was in the 0.2.5/0.3 timeframe.

I'm sure ActiveRecord taught me a few things, as I used it for years, but that's probably at a more subconscious level. Lone's prepared statement support gave me some ideas for the prepared statement interface in Sequel. I don't think I spent enough time using DataMapper to learn anything specific.

Why do you think Sequel is a compelling option for a Ruby ORM?

Jeremy

Sequel code is generally clear and concise, and it has a very rubyish syntax.
Sequel::Model's associations are the most powerful of any ruby ORM, as Sequel allows the user much more control over how the associations work.
Sequel is especially compelling whenever you are dealing with sets of objects instead of single objects.
Sequel is very easy to contribute to. You don't need 3 +1s. Submit a pull request, post on the mailing list, or ping me on IRC and I generally review patches quickly and decide if they are a good fit or not.
Bugs in Sequel are generally fixed quickly. The Sequel bug tracker has no open bugs currently, and that is how it is most of the time.
Sequel's internals are clean and easy to extend and modify if you need to.
Sequel's connection pool doesn't require the user or framework clean up connections manually.

Any closing thoughts or advice for people looking at Sequel?

Jeremy The Sequel community is very friendly, so if you have any issues getting things set up or have questions about how things work after reading the documentation, stop by the Google Group or IRC (#sequel on Freenode) and hopefully we can help you out.

Thursday, January 08, 2009

Editor Interview: Talking about Open Content with O'Reilly's Mike Loukides

With a recent string of interviews with authors working on (open) books for O'Reilly, I wanted to see what the folks inside O'Reilly had to say about this trend. Mike Loukides (@mikeloukides) was good enough to answer my questions in a short interview. There's some great stuff in here whether you're an aspiring author, interested in open content, or thinking about how conversations (in the web 2.0 sense) impact book marketing. Read on to see what Mike had to say.

Mike, it looks like you guys are riding a string of Creative Commons books either out or in the pipeline (Real World Haskell, Relax with CouchDB, and Ruby Best Practices). Historically, you've done a number of other open/free books as well. I'd like to pick your brains a bit about your willingness to head down this road.

Noah Slater told me "Our editor told us a surprising rule of thumb, that releasing a good book under a free license makes it sell more copies, and releasing a bad book under a free license makes it sell less copies." Why do you think this is?

Mike First: I'm not trying to back down from what I told Noah, because that is what I tell authors. But there's no hard data, and very little soft data: it's just our sense of what happens. It would be next to impossible to do a controlled experiment.

And I do want to correct one thing. I don't think in terms of good books and bad books; I think in terms of successful and unsuccessful. Plenty of good books are unsuccessful. What really drives success is the community that forms around the technology and the book.

That said, the mechanism is fairly simple. If there's a strong, thriving community around the technology—let's say CouchDB, since you're talking to Noah—the free online edition of the book will increase buzz, and make more people aware of the print book. A lot of people will download the online book, and decide they want the print book.

On the other hand, if the community around the technology is small, isn't thriving for one reason or another, etc., the existence of a free version will soak up what limited demand exists.

So I think free licenses for books are an intensifier: if a book is going to succeed, a free license will make it more successful. If it's going to fail, a free license will make it fail worse.

In the same vein, what can prospective authors do to make sure the open book they want to write is a good book? What makes the proposal stand out and say "I'd be a winner under a free license!"?

Mike Although I have a fair amount of ego tied up in making sure the book itself is as good as possible, I think in the long run it's only partially about the book. It's really about the technology and the community. If a community is growing, and people are excited, a good book will be successful (free license or not).

That said, there is an awful lot that authors can do to make their book more successful. None of this is magic: blogs, trade show talks, tutorials, all of that. A free online version of the book gives you a few more tools to play with. The authors of our Haskell book have done an excellent job of motivating the Haskell community.

As far as the book itself: readers want practical books. Readers want books that help them to solve the problem. If Noah and the other CouchDB authors had wanted to write a couple hundred pages explaining the principles behind CouchDB, non-relational databases, REST, and so on, without a single line of working code, it would be a disaster. I do get lots of proposals like this. They're generally disguised as "books targeted at management".

That's not to say that I'd turn down more abstract books on topics like software engineering. But I also think the case for writing that kind of a book with a free license isn't as strong. Would Martin Fowler's Refactoring or Kent Beck's Test Driven Development have been more successful if there was a version with an open source license? Possibly, but I don't think it's as clear a case.

O'Reilly seems to embrace open, flowing communication. You've got active bloggers, you work closely with User Groups, and you seem to have jumped on Twitter in a big way. How does this willingness to have conversations with your customers change the way you bring books to market?

Mike That's probably a better question for someone in Marketing.

But yes, all of these things give us more tools to work with. It certainly helps to get people talking about a book early on, it certainly helps to get people motivated and excited so that they want to read the book. And with some books, like the Haskell book, we got huge amounts of technical feedback from the public. That was a real help in making it a great book.

Other than technology titles, what other genres do you think would benefit from more openness (license, development, and conversational)?

Mike That's a really interesting question. One thing I like to point out is that we almost lost Shakespeare's entire works because there was no such thing as copyright protection in the 17th century. Plays were trade secrets, and the few plays that were published were generally published in unauthorized editions: several years after the fact, a few actors sat around a table and tried to remember the lines.

At the same time, I think the DMCA was a ridiculous intellectual property land grab. Music has always thrived on artists ripping off bits and pieces from each other: that's really central to how musical creativity works. But in the current climate it's entirely too easy to write a song and end up being sued because it's similar to some song that was published 5 years ago and happened to be buried in your subconscious. (Part of the problem in music, I think, is that you're dealing with relatively short sequences taken from a relatively limited repertoire: 24 notes in a typical singer's range, 88 notes on a piano keyboard.)

It's worth noting that Noel Paul Stookey, from Peter, Paul, and Mary, started something equivalent to the FSF back in the late 70s. So free licensing didn't start with RMS and the GPL. (I'm trying to look up info on what Paul did back then, but I can't find it now. If you dig into this and find anything, I'd appreciate a link. I don't think he got very far with it. But in 78, I don't think it was needed the way it is now.)

Sticking with books: I think anywhere you can build a book out of a conversation, you'll do well. We've got some interesting experiments coming out: 97 Things Every Software Architect Should Know (Feb.) was built on a wiki, with contributions from roughly 40 software architects. We had the book about halfway done, with contributions from a couple dozen architects, and then opened it to the public. The response was great.

So, if you start from a conversation, almost anything is possible. There's a sense in which all the design patterns books are really about conversations. So can I see freely licensed books in fields like software engineering? Definitely. Can I see it in science? It runs against the way scientific institutions currently work, but open, collaborative science—doing science in public, on a wiki, as it were—was discussed a lot at our last SciFOO camp, and you should have seen how excited people were. Science changes a lot when it becomes more open and less tied to traditional publishing institutions.

Sunday, January 04, 2009

New Year's Road Trip

In December, we heard that the Sahpeech chapter was going to hold a Fellowship with a Brotherhood ceremony in Gunnison on January 2-3. Since we had an Ordeal candidate and a Brotherhood candidate just waiting for the opportunity, we decided to take a road trip.

On Friday afternoon, we packed up our gear (and lots of warm clothes) and dinner to eat on the road, and we headed off. The youth talked about scouting, movies, and the things they'd done over Christmas break, then they settled in to a discussion of the OA and upcoming chapter events.

We weren't the only travellers at the fellowship. There were two young men from the Nez Pierce chapter, two young men and an adult from the Sioux chapter, and one from the Todebeda Cheda Toonle chapter down in St. George. It was pretty cool to see arrowmen and candidates from all over the lodge.

The candidates, elangomats, and other arrowmen present put in a lot of service. We cleaned an elementary school from top to bottom, mounted a bunch of video projectors on classroom ceilings, and shoveled a lot of snow to clear the elementary school and nearby high school. We even retired a flag while we were there.

Before wrapping this up, I'd like to welcome Ian, our newest Ordeal member, and congratulate Mike, our newest Brotherhood member. Mike's also the Ordealmaster for our upcoming May Fellowship, and he's looking for elangomats to serve there. If you're interested, please send an email to lakotasecretary@gmail.com and I'll pass your name and contact information along to him.

Thanks to the Sahpeech arrowmen for all the work and planning you put into this.

Oops, somehow I managed to post this to the wrong blog ... oh well. If you're interested in my scouting and OA activities, have fun with it. If not ... just ignore it and I'll post something more pertinent to 'On Ruby' early this week.
