Thursday, March 26, 2009

Author Interview - Venkat Subramaniam

Here's the third of three Scala book interviews. Venkat Subramaniam (@venkat_s), author of the Prag's Programming Scala, has a lot to say about Scala and why you should be learning and using it.

You can find more posts about Scala (and Haskell and Erlang) on my Functional Programming page.


Functional Programming languages seem to be an increasingly popular topic. Why? What do FP languages teach or do for us?

Venkat: Functional Programming has been around for a long time. Programmers have spent the past few decades rediscovering it.

A few decades ago low level languages like Assembly and C were popular. As OOP and C++ became prominent, it became increasingly clear that memory management was quite unmanageable. We raised the bar and let the platform deal with memory management and garbage collection. We figured that by relying on a higher level of abstraction we could focus our effort, for the better, on application development rather than memory management. This, among other things, brought languages like Java and C# to prominence. When programming in these languages, we don't have to worry any more about memory management.

As we moved on to create highly responsive applications, we began to realize that dealing with threading and concurrency is another real pain. As soon as you learn how to create a thread, the next thing you learn is how to limit and control it. Multithreaded applications are plagued with issues related to contention, deadlocks, correctness, and the complexity of the low level APIs. The growing availability of multi core processors and multiprocessors is only exacerbating these concerns. We're eager to put an end to the concurrency issues, just as we took care of memory management. This is where functional programming comes in.

Functional Programming advocates assignment-less programming and higher order functions with no side effects. By higher order functions, we mean functions that accept functions as parameters. You pass functions, return functions, and nest functions. Furthermore, these functions rely solely on the input you send them. They're not influenced by any external shared state and, in turn, do not affect any external shared state. They promote immutability. This eliminates the concern of contention and the need for synchronization. It also makes it easier to understand and verify the behavior of such functions. However, threads still need to communicate and exchange information. Functional Programming languages like Scala provide an actor based message passing model to realize this.

What makes Scala the right FP language for people to pick up?

Venkat: Scala is a relatively new language. Other prominent FP languages have come before it. However, quite a few things make Scala special and attractive. It is by far the most powerful statically typed, highly expressive, yet concise language that runs on the Java Virtual Machine (JVM). Scala is fully object oriented, intermixes with Java seamlessly, and provides very strong static typing but without the ceremony that Java adds. Though Scala is an entirely new language, for a moment entertain the thought that Scala is a refactored version of Java, stripped of its ceremony and threading API, then enhanced with sensible type inference and an actor based concurrency model. If Java were to be redesigned in the 21st century, it might look a lot like Scala. So, if you are programming on the JVM, and are interested in taking advantage of functional programming capabilities, Scala is a great choice.

Why is the JVM a good platform for Scala and FP?

Venkat: Or should we pose the question the other way around, asking why Scala and FP are a good choice on the JVM? I think these two questions go hand in hand.

The real strength of Java is not the language but the platform, the JVM. There are three things going for it. First, the capability of the virtual machine itself is superb in terms of performance and scalability. It did not start out that way; however, we've seen tremendous improvements over the past few years. Second, a rich set of libraries and frameworks is available for the various tasks that enterprise applications demand. Name it, and there is something available already for you (the concern in this space is often the availability of too many options, not too few!). Third, influenced by the two factors above, a significant number of enterprise applications already run on the JVM platform.

Some developers have the luxury of starting out on green field projects and do not have to integrate with other existing applications and components. They have the ability to choose any language and style they desire.

However, a large number of developers don't have that choice. They develop applications that run on and integrate with things on the JVM. Migrating to another language or another platform to enjoy the benefits of functional programming is really not an option for them. Scala provides a tremendous opportunity here. They can stay right on the powerful platform they're on, continue to develop and maintain their existing Java applications, and at the same time take advantage of functional programming and the other related benefits that Scala brings to the platform.

Since Scala code compiles down to bytecode and integrates seamlessly with Java, you can take advantage of Scala to the extent that you desire and that makes sense for the project. You can have a small part of your application written in Scala and the rest of the application in Java. As you get comfortable and make the paradigm shift, more and more code and even entire applications can be in Scala. At that point, you can continue to integrate and utilize other components and services running on the JVM. So, the JVM is not simply a good platform. It is a compelling platform on which to utilize and benefit from the functional style of programming.

What kinds of problems are a good fit for Scala?

Venkat: Scala's strengths are strong sensible typing, conciseness, pattern matching, and functional style. The first three strengths can help with your traditional programming needs. It will take you less code to achieve your day to day tasks like working with XML, parsing data, manipulating collections of data, and so on. Scala can help you do the heavy lifting with less effort. The last strength I mentioned, along with the actor based model, will help you develop highly concurrent applications without the pain and perils of thread synchronization. Your concurrent application will be smaller in size, and it will also be free of the uncertainty about program correctness that comes with state contention and deadlocks.

Design patterns and algorithms look different when implemented in different languages. Can you give us some examples of elegant patterns or algorithms in Scala?

Venkat: The language idioms have a great influence on the design. There are a number of patterns that are easier to implement in Java compared to C++. Similarly, languages that support closures and provide conciseness make lots of things easier and more elegant.

Let's explore one example.

You are asked to total values in a range from 1 to a given number. You can write a simple loop to do this. Now, you are asked to total only select values, say only even numbers. You could duplicate the code that totals and put a conditional statement in it. Now, you are asked to support different criteria: select even numbers, or odd numbers, or prime numbers, etc. Spend a few minutes thinking about how you can achieve this in Java.

I can think of two ways to implement this in Java.

Approach 1:

Create an abstract class with an abstract method that will evaluate the selection of a value. Use this abstract method in determining the total. This follows the template method pattern. Here is the code:


// Java code
public abstract class TotalHelper {
 abstract public boolean isOKToUse(int number);

 public int totalSelectValuesInRange(int upperLimit) {
   int sum = 0;
   for(int i = 1; i <= upperLimit; i++) {
     if (isOKToUse(i)) sum += i;
   }
   return sum;
 }
}

Now to use this method, you will have to extend the class and override the isOKToUse() method. Each time you need a different criterion, you have to write yet another class. Quite a bit of work.
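To make that cost concrete, here is a minimal sketch of one such subclass, for the even number criterion. The name EvenTotalHelper is hypothetical and does not appear in the original code; TotalHelper is repeated (without its public modifier) only so that the snippet compiles on its own.

```java
// Java code
// TotalHelper from Approach 1, repeated so this sketch is self-contained.
abstract class TotalHelper {
  abstract public boolean isOKToUse(int number);

  public int totalSelectValuesInRange(int upperLimit) {
    int sum = 0;
    for (int i = 1; i <= upperLimit; i++) {
      if (isOKToUse(i)) sum += i;
    }
    return sum;
  }
}

// Hypothetical subclass: each new criterion needs another class like this one.
class EvenTotalHelper extends TotalHelper {
  @Override
  public boolean isOKToUse(int number) {
    return number % 2 == 0; // select only even numbers
  }
}
```

Calling new EvenTotalHelper().totalSelectValuesInRange(10) totals 2, 4, 6, 8, and 10. Supporting odd or prime numbers would mean writing a second and a third subclass of the same shape.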

Approach 2:

Create an interface with the isOKToUse() method and call that method from within the totalSelectValuesInRange() method as shown below:


//Java code
public interface OKToUse {
 public boolean isOKToUse(int number);
}

//Java code
public class TotalHelper {
 public int totalSelectValuesInRange(int upperLimit, OKToUse okToUse) {
   int sum = 0;
   for(int i = 1; i <= upperLimit; i++) {
     if (okToUse.isOKToUse(i)) sum += i;
   }
   return sum;
 }
}

In this case you don't have to create a separate class to derive from TotalHelper; however, you do have to create (inner) classes that implement the OKToUse interface. Here you are using the strategy pattern to solve the problem.
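As a rough sketch of what a caller then has to write, the code below passes the odd number criterion as an anonymous inner class. The TotalExample class and its totalOfOdds() method are hypothetical names introduced only for illustration; OKToUse and TotalHelper are repeated (without their public modifiers) so that the snippet compiles on its own.

```java
// Java code
// OKToUse and TotalHelper from Approach 2, repeated so this sketch is self-contained.
interface OKToUse {
  boolean isOKToUse(int number);
}

class TotalHelper {
  public int totalSelectValuesInRange(int upperLimit, OKToUse okToUse) {
    int sum = 0;
    for (int i = 1; i <= upperLimit; i++) {
      if (okToUse.isOKToUse(i)) sum += i;
    }
    return sum;
  }
}

// Hypothetical caller: the criterion is supplied as an anonymous inner class.
class TotalExample {
  static int totalOfOdds(int upperLimit) {
    return new TotalHelper().totalSelectValuesInRange(upperLimit, new OKToUse() {
      @Override
      public boolean isOKToUse(int number) {
        return number % 2 == 1; // select only odd numbers
      }
    });
  }
}
```

The criterion itself is the single expression number % 2 == 1; the lines around it exist only to wrap that expression in an object.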

While both of the above approaches will work, it is a lot of work to implement either approach. You can use IDEs to help generate quite a bit of that code, but that is still a lot of code that you have to look at and maintain.

Let's see how you can solve the above problem in Scala. We will follow along the lines of the second approach above, but without the interface. Since you can pass functions to functions, the code will be a lot simpler to write. At the same time, since Scala is statically typed, you have compile time type checking to ensure you are dealing with the right types in your code. Let's take a look:


def totalSelectValuesInRange(upperLimit: Int, isOKToUse: Int => Boolean) = {
 val range = 1 to upperLimit
 (0 /: range) { (sum, number) =>
   sum + (if (isOKToUse(number)) number else 0) }
}

The totalSelectValuesInRange() function takes two parameters. The first one is the upper limit for the range you want to work with. The second parameter is a function which accepts an Int and returns a Boolean. Within the totalSelectValuesInRange() function, you create a range and for each element in the range, you include the element in the sum if it passes the criteria evaluated by the function given in the parameter.

Here are two examples of using the above code:


Console println "Total of even numbers from 1 to 10 is " +
 totalSelectValuesInRange(10, _ % 2 == 0)

Console println "Total of odd numbers from 1 to 10 is " +
 totalSelectValuesInRange(10, _ % 2 == 1)

The output from the above code is shown below:


>scala totalExample.scala
Total of even numbers from 1 to 10 is 30
Total of odd numbers from 1 to 10 is 25

When looking at the above example, don't focus on the syntax. Instead, focus on the semantics. Of course, you can't program in any language without understanding its syntax. But once you have a grasp of the syntax, you can move as fast as the language allows, and this is where the conciseness of the language helps. There are quite a few interesting things to observe from the above example.

  1. The code is very concise.
  2. You did not spend time creating interfaces. Instead you passed functions around.
  3. You did not specify the type repeatedly. When you called totalSelectValuesInRange, Scala verified that the operation you perform on the parameter to the function (represented by the underscore) is a valid operation on an Int. For example, if you wrote
    
    totalSelectValuesInRange(10, _.length() > 0)
    

    Scala will give you a compile time error as shown below:
    
    >scala totalExample.scala
    (fragment of totalExample.scala):15: error: value length is not a member of Int
     totalSelectValuesInRange(10, _.length() > 0)
                                     ^
    one error found
    !!!
    discarding <script preamble>
    

    Notice how it recognized that the parameter represented by _ is an Int.
  4. You did not perform a single assignment (val represents immutable data; you can't change or assign to it once you create it). In the Java example you initialized sum to 0 and then continued to update it. In the Scala example, however, you did not assign a value to any variable at all. This last feature comes in very handy when dealing with concurrency.

Imagine for a minute that the numbers you would like to total are the values of your stock investments. You can write a little code that concurrently fetches the stock prices from the web and determines the total value you have invested in each stock. You can then total these values without any concurrency issues. And, you'd be surprised, you can do that in about two dozen lines of code in Scala, as shown below.


import scala.actors._
import Actor._

val symbols = Map("AAPL" -> 200, "GOOG" -> 500, "IBM" -> 300)
val receiver = self

symbols.foreach { stock =>
 val (symbol, units) = stock
 actor { receiver ! getTotalValue(symbol, units) }
}

Console println "Total worth of your investment as of today is " + totalWorth()

def getTotalValue(symbol : String, units : Int) = {
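 // Date.getYear() returns the year minus 1900, so get the four-digit year from Calendar
 val year = java.util.Calendar.getInstance().get(java.util.Calendar.YEAR)
 val url = "http://ichart.finance.yahoo.com/table.csv?s=" + symbol +
   "&a=00&b=01&c=" + year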

 val data = io.Source.fromURL(url).mkString
 units * data.split("\n")(1).split(",")(4).toDouble
}

def totalWorth() = {
 val range = 1 to symbols.size
 (0.0 /: range) { (sum, index) => sum + receiveWithin(10000) { case price : Double => price } }
}

The actors helped you dispatch separate concurrent calls to the Yahoo web service to fetch the price for each symbol you hold. In each actor, you multiplied the price you got back by the number of units to determine the total value of that holding. Finally, you sent that value as a message back to the calling actor. In the totalWorth() method you received the responses from those calls using the receiveWithin() method and added them up.

