Monthly Archives: March 2012

A First Look at Scala

Scala has been on my “personal radar” for two or three years now, but it hadn’t quite bubbled to the top of the list until recently. I’ve been playing around with it off and on for a couple of months now; not enough to get too deep into it, but I’ve gotten a taste. My first impression – wow.

Scala is a JVM-based language, meaning it gets compiled to Java bytecode and its instructions are executed on the Java Virtual Machine. The future of Java as a language is somewhat questionable at the moment (though I wouldn’t write it off by a long shot), but as a platform, there is every indication that Java will be around for a long, long time. Indeed, it seems that there is a new JVM-based language every month. Groovy is probably the most popular of the list, but there’s also Clojure, JRuby, Jython, and Ceylon, among others. Each of these languages benefits from the tremendous amount of research and development put into the JVM. Oh, and BTW, there is also a port of Scala to the .NET platform, though the JVM is its primary target.

Back in my university days I took a functional programming class in which we studied Haskell. I have fond memories of learning about lambdas and curried functions and the notion of functions as first class citizens and tail recursion and immutable state and lazy vs eager evaluation and mapping, filtering, and folding and all the other things that make functional programming so wickedly cool. My professor, Dr. Karl Abrahamson, would talk about “focusing on the what, not the how.” My “a-ha” moment was when Dr. Abrahamson implemented a quick-sort in Haskell on the board in just a few lines of code. It was so elegant, so succinct. We went on to do some really neat things, including a guided theorem prover.

The problem with all this is that Haskell is a purely functional language, which means that its application is somewhat limited. Oh sure, you could point to many academic and even a few industry projects that use purely functional languages, but it will never be mainstream. You’ll never see Haskell in the everyday business application. Dr. Abrahamson’s take on this was that, even if you couldn’t use a functional language directly, you could take the lessons you’ve learned and apply them to your “everyday programming.” Thinking in a functional way would make you a better programmer. And, while I certainly agree with that statement, it left me somewhat unsatisfied.

What intrigues me about Scala is that it’s a mixed paradigm language, and I think that’s key to gaining mind share. What I mean by that is Scala supports both functional and object oriented paradigms. This means that you can use functional programming where it’s appropriate but sometimes you need to be able to change state, and Scala accommodates this.

With that, here are some notes I’ve taken as I’ve progressed. Again, I am far from Scala mastery, but here are some of the basics.

Scala is more difficult to master than most languages. Maybe I should say “than most mainstream languages.” Groovy would be a much easier transition for most Java programmers, because pretty much anything that compiles in Java will already compile in Groovy. That’s not true of Scala. That said, the syntax is elegant and succinct. But, it does take practice.

Scala is well suited for highly concurrent applications. The name is short for ‘SCAlable LAnguage’, after all (strictly, that refers to a language that grows with the demands of its users, but concurrency is a big part of that story). Our typical approach to solving concurrency issues is to synchronize access to shared, mutable state. But functional programming is all about using immutable values and functions that return well-defined outputs (in the mathematical sense) with zero side effects. If state cannot possibly change, there is no need to synchronize access. Reasoning about synchronization does not come naturally to anyone. It is difficult to do correctly and therefore error-prone. Scala reduces the need to do so by encouraging programmers to prefer immutability where possible.
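To make the immutability point concrete from a Java programmer’s side, here is a minimal sketch of an immutable value shared between two threads with no locking at all. The class and its fields are made up for illustration, and it assumes Java 8 or later for the lambda.

```java
// Hypothetical example: the class and its fields are invented for illustration.
final class ImmutableAccount {
    private final String owner;
    private final long balance;

    ImmutableAccount(String owner, long balance) {
        this.owner = owner;
        this.balance = balance;
    }

    String owner() { return owner; }
    long balance() { return balance; }

    // "Modifying" the account returns a new object; the original never changes.
    ImmutableAccount deposit(long amount) {
        return new ImmutableAccount(owner, balance + amount);
    }

    public static void main(String[] args) throws InterruptedException {
        final ImmutableAccount shared = new ImmutableAccount("alice", 100);

        // Both threads read the same object without any synchronization.
        Runnable work = () -> {
            ImmutableAccount mine = shared.deposit(50);
            System.out.println(Thread.currentThread().getName() + " sees " + mine.balance());
        };
        Thread t1 = new Thread(work, "t1");
        Thread t2 = new Thread(work, "t2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        System.out.println("shared balance is still " + shared.balance());
    }
}
```

Because every field is final and nothing is ever mutated, neither thread can observe a half-updated account. Each one works with its own new copy.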

Scala is statically typed but makes extensive use of type inference. I don’t want to get into a long discussion of static vs. dynamic typing here, but personally, I prefer static typing. And that is my main complaint with Groovy – it’s dynamically typed. Static typing means that an entire class of errors is eliminated from my program because the compiler checks for them at compile time (not at runtime). Type inference means that the compiler can often infer the type from the context, which allows the code to be much less verbose than what you might see in Java. Consider this example:

import java.util.{HashMap, Map}   // the Java collections, for a like-for-like comparison
val myMap: Map[Integer, String] = new HashMap

Compare that to Java:

Map<Integer,String> myMap = new HashMap<Integer,String>();

And, in most cases, the return type for methods can be inferred too. Consider this contrived example:

def addTwo(x: Int) = {
   x + 2
}

Should I really have to specify the return type here?

So, Scala enjoys all the benefits of static typing (chief among them being compile time checking and improved performance/optimization), and at the same time the code is usually as succinct as what you would see in a dynamic language such as Groovy or Ruby.

Everything is an object. There is no notion of a primitive in Scala. Compare this to Java, where you have primitives such as ‘int’, ‘float’, and ‘boolean’. There are no statics in Scala either, because static members belong to a class rather than to an instance; singleton objects take their place. But, more to the point, even functions are objects and therefore can be passed as method arguments. That is an incredibly powerful construct. Take a look at the filter method of Scala’s List class:

def filter(p: A => Boolean): List[A] = this match {
    case Nil => this
    case x :: xs => if (p(x)) x :: xs.filter(p) else xs.filter(p)
}

The argument to ‘filter’ is a function! The function, which we’ll call ‘p’, accepts a value of the list’s element type ‘A’ and returns a Boolean. The output of the filter function itself is a list of A’s: the elements for which ‘p’ returned true.

Scala has sophisticated pattern matching. You might have noticed in the example above that the ‘case’ statement didn’t look like what you might see in C++ or Java, where case statements are limited to matching against ordinal types. The best you can do in those languages is something along the lines of “if x is 2 then do this, if x is 4 then do that.” With Scala, we can pattern match using sequences, types, and even regular expressions. We can use wildcards as well, and even do deep inspections of an object’s variables.

Here’s what it might look like to match on a sequence:

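// Three sample lists, made up so the example runs
val A = List(1, 7, 3)
val B = List(2)
val C = List(4, 5)
for (l <- List(A, B, C)) {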
   l match {
      case List(_,7,_) => println("There are three elements, and the middle one is 7.")
      case List(2)     => println("A singleton list with element 2.")
      case List(_*)    => println("Any other list.")
   }
}

Say goodbye to NPEs. Java programmers know that acronym: the dreaded NullPointerException. The typical scenario is that you invoke some method, expecting to get an Object back. Once you receive that Object you try to invoke one of the methods defined on it, only to find out your ‘Object’ isn’t really an Object after all; it’s ‘null’. To get around this, Java programmers end up putting in a lot of defensive code that just clutters things up.
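For anyone who hasn’t lived through it, the defensive style might look something like the sketch below. The class, map contents, and method name are hypothetical; the only real API used is java.util.Map, whose get method returns null for a missing key.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical example: names are illustrative, not from any real project.
class DefensiveLookup {
    static final Map<String, String> AUTHORS = new HashMap<String, String>();
    static {
        AUTHORS.put("Moby Dick", "Melville");
    }

    // Map.get returns null when the key is absent, so the caller has to
    // check before invoking any method on the result.
    static String authorInUpperCase(String title) {
        String author = AUTHORS.get(title);
        if (author == null) {
            return "UNKNOWN";
        }
        return author.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(authorInUpperCase("Moby Dick"));
        System.out.println(authorInUpperCase("The Time Machine"));
    }
}
```

Forget the null check just once and the second call blows up with a NullPointerException at runtime, with nothing in the method signature to warn you.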

The Scala solution is to encourage the use of the ‘Option’ class (subclassed by ‘Some’ and ‘None’) whenever there is a possibility that the return may not refer to a value. Take a look at the example below that creates a Map and then retrieves objects from it.

val bookMap = Map(
   "Moby Dick" -> "Melville",
   "Great Expectations" -> "Dickens",
   "The Art of War" -> "Sunzi")
 
println("Moby Dick: " + bookMap.get("Moby Dick").getOrElse("unknown"))
println("The Time Machine: " + bookMap.get("The Time Machine").getOrElse("unknown"))

The output of this program would be ‘Moby Dick: Melville’, followed by ‘The Time Machine: unknown’.

No more passing in pointers or references of objects to be modified, or defining silly return structures. There are times you want to return more than one object. To get around this you typically see something like passing in a reference to an object as an argument, even though the argument isn’t an input (it’s an output). Or, going through the overhead of creating a composite data structure whose only purpose in life is to wrap other objects, so they can all be returned as a single object.
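Both workarounds can be sketched in Java as follows. All names here are hypothetical; the point is only to show the ceremony involved in getting two integers back from one call.

```java
// Hypothetical names throughout; a sketch of the two workarounds described above.
class MinMaxDemo {
    // Workaround 1: a class whose only job is to carry two values back to the caller.
    static final class MinMax {
        final int min;
        final int max;

        MinMax(int min, int max) {
            this.min = min;
            this.max = max;
        }
    }

    static MinMax minmax(int a, int b) {
        return (a < b) ? new MinMax(a, b) : new MinMax(b, a);
    }

    // Workaround 2: an "output parameter". The caller passes in an array
    // of length 2 and the method fills it in.
    static void minmaxInto(int a, int b, int[] out) {
        out[0] = Math.min(a, b);
        out[1] = Math.max(a, b);
    }

    public static void main(String[] args) {
        MinMax result = minmax(10, 3);
        System.out.println(result.min + " " + result.max);

        int[] out = new int[2];
        minmaxInto(10, 3, out);
        System.out.println(out[0] + " " + out[1]);
    }
}
```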

Scala neatly solves this with Tuples. I found a nice example on Stack Overflow that illustrates the concept.

// Get the min and max of two integers
def minmax(a: Int, b: Int): (Int, Int) = if (a < b) (a, b) else (b, a)
 
// Call it and assign the result to two variables like this:
val (x, y) = minmax(10, 3)     // x = 3, y = 10

Here we see a function ‘minmax’ that takes as arguments two integers and returns a tuple (of two integers). Pretty cool!

Interfaces can have (optional) implementations. Well, that’s not quite true, but that’s the idea behind traits. In the Java world, classes can have only one parent (single inheritance). A class can implement any number of interfaces, but interfaces don’t have implementations. But, sometimes we do need a class to support multiple abstractions, and some of those abstractions may have boilerplate code that can be implemented in a high level class. That’s very difficult to accomplish in a nice way with Java. C++ supports the notion of multiple inheritance, but that has its own problems (see the diamond problem).

Scala solves this cleverly with Traits. Traits give us the ability to push that boilerplate (reusable) code up. You can think of a Trait as a partial implementation. Look for a separate post on this topic soon.

That just scratches the surface of what Scala is about, but the more I learn, the more I like it (give me the red pill please). I will likely spend several more months delving deeper into the world of Scala, and as I do I’ll write up some of the things I learn.

A Groovy way to handle user input

Several weeks ago I wrote about Prophet, my C/C++ chess program. I described in that post the approach I was using in the ParseXBoard( ) method to map user input to a “handler function” using a table of function pointers.

As it turns out, I actually have TWO chess programs under active development. The other is a Java/Groovy based chess program named chess4j. I will likely have more to say about chess4j in future posts, but for now I wanted to mention how Groovy could be used to map user input to handler functions in even fewer lines of code.

First, here is how it used to look. It is similar to the old Prophet code, except in Java:

public boolean parseXBoardCommand(String command) throws ParseException {
	if (command.startsWith("accepted")) {
		// ...
		return true;
	}
 
	if (command.equalsIgnoreCase("analyze")) {
		// ...
		return true;
	}
 
	if (command.equalsIgnoreCase("black")) {
		// ...
		return true;
	}
 
	if (command.equalsIgnoreCase("bk")) {
		// ...
		return true;
	}
 
	// more commands
 
}

That works just fine, but it’s not very elegant. I wanted to find an approach similar to what I did with Prophet. I wanted to create a map of Strings to functions, maybe using a Java Action. But then I remembered something I read in the excellent ‘Groovy In Action’ book – invokeMethod(String name, Object args)! invokeMethod dynamically invokes the named method on the current object, passing along the arguments, or throws a MissingMethodException if no matching method exists. With invokeMethod(), the Groovy solution is even more succinct than the C solution. There is no need to create a data structure to map user input to functions. We just catch the user input and call invokeMethod( ).

The first step is to create the input handlers, like so:

def moveNow(def arg) {}
 
def name(List args) {
	String opponent = args.get(0)
	println "opponent is: " + opponent
}
 
def newGame(def arg) {
	joinSearchThread();
	_forceMode = false;
	App.getBoard().resetBoard();
}

Once all the input handlers are in place, it’s time to write the parse routine:

public void parseCommand(String command) throws IllegalMoveException,
                MissingMethodException,ParseException {
	def input = command.tokenize()
	def method = input.remove(0)
	if ("new".equals(method)) {  // because 'new' is a reserved word
		invokeMethod("newGame",input)
	} else if ("?".equals(method)) { // can't have a method named '?'
		invokeMethod("moveNow",input)
	} else {
		invokeMethod(method,input)
	}
}

As you can see, it’s pretty simple. There are two special cases to deal with. The XBoard protocol specifies a ‘new’ command, which is a reserved word in most programming languages. Therefore, if ‘new’ is caught the ‘newGame’ method is invoked. Similarly, the XBoard protocol says ‘?’ means ‘move now’, and we can’t have a method called ‘?’. Other than that, input is dynamically mapped to a method of the same name.
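For comparison, the map-of-handlers idea could be sketched in plain Java roughly like this. It is not chess4j code: the Handler interface, the class name, and the returned strings are all hypothetical, and it assumes Java 8 or later for the lambdas.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from chess4j.
class CommandDispatcher {
    // A handler receives whatever tokens followed the command name.
    interface Handler {
        String handle(List<String> args);
    }

    private final Map<String, Handler> handlers = new HashMap<String, Handler>();

    CommandDispatcher() {
        // Keys are plain strings, so "new" and "?" need no special treatment.
        handlers.put("new", args -> "new game");
        handlers.put("?", args -> "move now");
        handlers.put("name", args -> "opponent is: " + args.get(0));
    }

    // Split the input on whitespace, look up the first token, pass on the rest.
    String dispatch(String command) {
        List<String> tokens = Arrays.asList(command.trim().split("\\s+"));
        Handler handler = handlers.get(tokens.get(0));
        if (handler == null) {
            throw new IllegalArgumentException("unknown command: " + tokens.get(0));
        }
        return handler.handle(tokens.subList(1, tokens.size()));
    }

    public static void main(String[] args) {
        CommandDispatcher dispatcher = new CommandDispatcher();
        System.out.println(dispatcher.dispatch("new"));
        System.out.println(dispatcher.dispatch("name Kasparov"));
        System.out.println(dispatcher.dispatch("?"));
    }
}
```

The trade-off is that every command has to be registered in the map by hand, which is exactly the bookkeeping invokeMethod makes unnecessary. On the other hand, the map never needs the two special cases, since a key can be any string.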

By the way, chess4j is hosted on SourceForge.net at http://chess4j.sourceforge.net/. It’s still very early in an ambitious project but feel free to download the source code and take a peek.

java.lang.SecurityException: no manifest section for signature file entry

Just a short post to document a very frustrating problem I wrestled with for a while today.

I’m in the process of converting some older Ant based projects to Maven (yes, I know, Gradle is the rage now). One of these projects is packaged as an uber jar file — meaning ALL its dependencies are just packaged right into one giant file. Thinking I had things just about wrapped up, I ran my utility to be greeted by ….

Exception in thread "main" java.lang.SecurityException: no manifest section for signature file entry org/bouncycastle/cms/CMSSignedDataStreamGenerator$TeeOutputStream.class
	at sun.security.util.SignatureFileVerifier.verifySection(SignatureFileVerifier.java:380)
	at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:231)
	at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:176)
	at java.util.jar.JarVerifier.processEntry(JarVerifier.java:288)
	at java.util.jar.JarVerifier.update(JarVerifier.java:199)
	at java.util.jar.JarFile.initializeVerifier(JarFile.java:323)
	at java.util.jar.JarFile.getInputStream(JarFile.java:388)
	at sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:692)
	at sun.misc.Resource.cachedInputStream(Resource.java:61)
	at sun.misc.Resource.getByteBuffer(Resource.java:144)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:256)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

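The issue turned out to be that I was repackaging a signed jar file. The signature files from the signed jar get copied into the uber jar’s META-INF directory, but the uber jar’s manifest no longer contains the per-class sections those signatures refer to, so verification fails at class-loading time. There are a couple of ways around this that I know of. One option is to simply not repackage the signed jar, but I really didn’t want to deal with external dependencies. Another option is to exclude the signature file entries from the repackaged jar. Using the maven-shade-plugin, this can be done like so: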

<plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-shade-plugin</artifactId>
   <executions>
      <execution>
         <phase>package</phase>
         <goals>
            <goal>shade</goal>
         </goals>
      </execution>
   </executions>
   <configuration>
      <finalName>${project.artifactId}-${project.version}-uber</finalName>
      <filters>
         <filter>
            <artifact>*:*</artifact>
            <excludes>
               <exclude>META-INF/*.SF</exclude>
               <exclude>META-INF/*.DSA</exclude>
               <exclude>META-INF/*.RSA</exclude>
            </excludes>
         </filter>
      </filters>
   </configuration>
</plugin>

And voilà! Now we’ve repackaged the contents of a signed jar into an unsigned one. Whether that’s a good idea or not depends on what you’re doing, but for my purposes it was just fine.
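If you want to double check that the signature entries really are gone from the uber jar, a small utility along these lines should do it. This is a sketch, not part of Maven or the JDK tools: the class and method names are hypothetical, and matching the three extensions case-insensitively, directly under META-INF/, is my own assumption about what counts as a signature entry.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Locale;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper; not part of Maven or the JDK tools.
class SignatureEntryCheck {
    private static final String META_INF = "META-INF/";

    // True for entries directly under META-INF/ ending in .SF, .DSA or .RSA
    // (compared case-insensitively, which is an assumption of this sketch).
    static boolean isSignatureEntry(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        if (!upper.startsWith(META_INF)) {
            return false;
        }
        String rest = upper.substring(META_INF.length());
        if (rest.indexOf('/') >= 0) {
            return false; // nested deeper than META-INF itself
        }
        return rest.endsWith(".SF") || rest.endsWith(".DSA") || rest.endsWith(".RSA");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SignatureEntryCheck <path-to-jar>");
            return;
        }
        // Open without verification so a jar with stale signatures can still be read.
        try (JarFile jar = new JarFile(args[0], false)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isSignatureEntry(name)) {
                    System.out.println("signature entry: " + name);
                }
            }
        }
    }
}
```

Run it against the uber jar before and after adding the filter. With the filter in place it should print nothing.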