Friday, January 23, 2009

Author Interview: Relax with CouchDB (Round 2)

I had a great opportunity to trade emails with Jan Lehnardt in a second interview about Relax with CouchDB. This time, we touched on TDD, refactoring, and of course, the book.


The initial chapters have been available for over a month now, gathering feedback. What's been the biggest change you've made due to feedback?

Jan We still have things to integrate, but we took a lot of notes. The biggest thing we've seen is where we tried to explain concepts in CouchDB by contrasting them with how things are done in the RDBMS world. Production systems often do not follow theory by the book for performance reasons (denormalization comes to mind). So we are saying that in CouchDB your data is denormalized, thus fast, and actually true to the "CouchDB Theory", but now people are (rightfully) pointing out that the RDBMS systems have been used wrongly. Fact is: We don't want to say bad things about the RDBMS world. We just tried to explain things by comparison, and a lot of people coming to CouchDB have an RDBMS background, so we thought it was a good idea to contrast them.

We learned that this is not the best approach, and we are moving things a little towards explaining CouchDB on its own instead of comparing it to relational databases in the first chapters. Again, I'm not saying anybody is more right or wrong here, it was just a poor choice on our part because we didn't know we'd cause such a ruckus :)

PS: CouchDB is not a relational database and we all support the idea of using the right tool for the job. This is sometimes an RDBMS and sometimes CouchDB :)

As I looked over Chapter 4, one blurb stood out to me: Applications "live" inside design documents. You can replicate design documents just like everything else in CouchDB. Because design documents can be replicated, whole CouchApps can be replicated. Can you explain this in a little more depth?

Jan CouchDB is an application server in disguise. It can host HTML+CSS+JavaScript applications like any other web server, but it also provides an HTTP API to a database system. This is the perfect basis to write standalone applications using the web-technologies everybody knows.

CouchDB's replication becomes a distribution channel not only for data (what books do you have in your library?) but also for entire applications (I enhanced the library application to also handle my board game collection, do you want my patch?). Think of GitHub, but for applications and peer-to-peer distribution.
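As a rough illustration of what triggering such a replication involves, the sketch below only builds the HTTP request for CouchDB's `_replicate` endpoint and does not send it. The helper name `buildReplicationRequest`, the database name `library`, and the remote host are assumptions made up for this example. Because design documents are ordinary documents, replicating the database is what carries the application along with the data.

```javascript
// Hypothetical helper: describes a POST to CouchDB's /_replicate endpoint.
// It only builds the request; sending it is left to your HTTP client.
function buildReplicationRequest(source, target) {
  return {
    method: "POST",
    path: "/_replicate",
    headers: { "Content-Type": "application/json" },
    // source and target are database names or full database URLs
    body: JSON.stringify({ source: source, target: target })
  };
}

// Assumed names: a local "library" database pushed to a friend's CouchDB.
var request = buildReplicationRequest(
  "library",
  "http://friend.example.com:5984/library"
);

console.log(request.method + " " + request.path + " " + request.body);
```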

You can also read more about this topic in a series of blog posts by Chris.

Refactoring is on my mind a lot right now, and with that comes testing. How testable are CouchDB apps? What kinds of tools or frameworks exist to do testing?

Jan We are currently working with TDD experts to find a good solution to allow CouchApp developers to test their applications inside-out.

Since this is all web-technology, we expect we can re-use some of the existing tools. We just want to go the extra mile and make it really easy for the developer.

What about refactoring proper, what's the state of the art in CouchDB refactoring?

Jan That depends a bit on what you mean. Refactoring CouchApps has not been tackled yet. But CouchDB is schema-free, so you can just play around and change things. Documents (that includes the design documents that hold your application) are versioned, so you can go back to an old revision (not forever, but for a little while) if you screwed up.

About refactoring your data: Say you have an app that stores user profiles and you started out with separate fields for first and last name. But user feedback and UI design showed that a single `name` field is better suited for your app. Your map function to get all first and last names originally looked like this:


 function(doc) {
   emit([doc.firstname, doc.lastname], null);
 }

And your new one looks like this:


 function(doc) {
   emit(doc.name, null);
 }

You can consolidate both to support legacy data:


 function(doc) {
   if(doc.name) {
     emit(doc.name, null);
   } else {
     emit(doc.firstname + " " + doc.lastname, null);
   }
 }

You change your UI-code to deal with a single `name` value and this view will consolidate old and new documents.
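To see the consolidated view at work outside of CouchDB, here is a minimal sketch that runs the map function over two sample documents. CouchDB normally provides `emit` itself; the stub `emit`, the `rows` array, the `currentId` variable, and the sample documents are all assumptions added for illustration.

```javascript
// Stand-in for the emit() that CouchDB provides to map functions.
var rows = [];
var currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// The consolidated map function from the interview.
var map = function (doc) {
  if (doc.name) {
    emit(doc.name, null);
  } else {
    emit(doc.firstname + " " + doc.lastname, null);
  }
};

// One legacy document and one document in the new format (made-up data).
var docs = [
  { _id: "a", firstname: "Ada", lastname: "Lovelace" },
  { _id: "b", name: "Grace Hopper" }
];

docs.forEach(function (doc) {
  currentId = doc._id;
  map(doc);
});

console.log(JSON.stringify(rows));
// → [{"id":"a","key":"Ada Lovelace","value":null},{"id":"b","key":"Grace Hopper","value":null}]
```

Both documents end up with a single string key, so UI code reading the view does not need to know which format a document was stored in.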

Yes, this is a little dirty, but also pretty neat. At some point, you'd want to clean up all your data and get rid of the special cases. Our offhand suggestion: for minor versions, where you want to add features quickly and make updates painless, use the duck-typing above. For major versions, take the time to consolidate the cruft, update your data properly, and prune your map and reduce functions.
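For the major-version cleanup, one possible shape is a small function that rewrites a legacy document into the new format. This is only a sketch: `migrateProfile` is a hypothetical name, and fetching the documents and saving them back to CouchDB (which needs the current `_rev`) is deliberately left out.

```javascript
// Hypothetical migration step: turns a legacy profile document into the
// new single-name format. Documents already in the new format, or without
// any name fields at all, are returned unchanged.
function migrateProfile(doc) {
  var hasLegacyFields = doc.firstname !== undefined || doc.lastname !== undefined;
  if (doc.name || !hasLegacyFields) {
    return doc;
  }
  var migrated = {};
  Object.keys(doc).forEach(function (key) {
    // Copy everything (including _id and _rev) except the legacy fields.
    if (key !== "firstname" && key !== "lastname") {
      migrated[key] = doc[key];
    }
  });
  // Skip a missing first or last name instead of emitting "undefined".
  migrated.name = [doc.firstname, doc.lastname].filter(Boolean).join(" ");
  return migrated;
}

var legacy = { _id: "a", _rev: "1-abc", firstname: "Ada", lastname: "Lovelace" };
console.log(JSON.stringify(migrateProfile(legacy)));
// → {"_id":"a","_rev":"1-abc","name":"Ada Lovelace"}
```

Once every document has been migrated this way, the `else` branch in the map function can be dropped.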

This is good advice (hopefully), but we might be able to provide you with tools and libraries that handle the dirty work for you so you can concentrate on improving your app instead of fussing with the database. After all, this should be relaxing!
