Friday, August 01, 2014

Should you ever embark on a complete rewrite?

I embarked on V2 of my project SimpleDBM about three years ago. Last month I finally closed down the V2 branch and merged the useful parts back into the main branch.

V2 was going to be a major refactoring of the system. That is what killed it, because any major refactoring is a large amount of effort. One of the best write-ups on why no one should ever do this is this article at Joel on Software.

That doesn't mean one should not refactor software; it is just that making small incremental changes that are immediately merged into and tested with the mainline is the better way to do it.

Sunday, July 27, 2014

Life after Java

After working exclusively in Java for several years, I have been dabbling in C++ for the last year or so. The question arises: is C++ still a viable language? If the Tiobe Index is to be believed, C++ has been steadily declining in popularity since about 2005; coincidentally, this was the year I decided to move from C++ to Java for my project SimpleDBM. At the time I stated my reasons for the move in my second blog post.

So what has happened in the meantime and is C++ still a viable language?

At my day job, I introduced Java into the realm of financial risk analytics. I led the team that converted a C++ based application to Java, and in the process we proved that the Java implementation was several times faster. This had nothing to do with the language as such; it was just that with Java you can focus on better algorithms and data structures rather than fighting the language, which made all the difference in my view.

And yet it is in the realm of numerical computing where C++ is arguably the best language with the exception perhaps of Fortran (of which I have no experience sadly). The main advantages of C++ are:
  • Ability to seamlessly call C++, Fortran and C libraries - a lot of high performance numerical libraries out there are written in these languages.
  • Control of memory layout of data structures.
  • Efficient array access via pointers - and no bounds checking.
  • Templates for generating type specific code.

C++ is still an ugly language with too many features, but the recent changes in C++11 have made life tolerable if not completely easy. I have been looking at alternatives such as D, Go, Julia, etc. but haven't found a viable one yet. These other languages are either immature or have very restrictive paradigms. JVM-based languages such as Scala have essentially the same issues as Java.



Sunday, May 23, 2010

Java versus Google Go - Part 2

The new Go programming language from Google is very interesting because it attempts to bring to the world of compiled languages some of the benefits of the VM-based languages, such as garbage collection and dynamic interfaces. I am considering porting one of my projects to Go, but before diving in, I would like to explore Go by writing a few small programs and comparing these with the Java versions.

Without further ado, here is a very simple program that reads a file and outputs lines to the console. First, let's look at the Java version:
package org.majumdar;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class CatFile {

        public static void main(String[] args) {
                if (args.length == 0) {
                        usage();
                        return;
                }
                BufferedReader reader = null;
                try {
                        reader = new BufferedReader(new FileReader(args[0]));
                        String line;
                        while ((line = reader.readLine()) != null) {
                                System.out.println(line);
                        }
                } catch (Exception e) {
                        System.err.println("error: " + e.getMessage());
                } finally {
                        close(reader);
                }
        }

        private static void close(Reader reader) {
                if (reader == null)
                        return;
                try {
                        reader.close();
                } catch (IOException e) {
                        // Nothing useful can be done if close itself fails.
                }
        }
 
        private static void usage() {
                System.out.println("usage: CatFile ");
        }
}


Now, the same program implemented in Go:
package main

import (
        "bufio"
        "fmt"
        "io"
        "os"
)

func usage() {
        fmt.Printf("usage: catfile \n")
}

func main() {
        if len(os.Args) < 2 {
                usage()
                return
        }
        // os.Open opens the named file read-only (Go 1 signature).
        f, err := os.Open(os.Args[1])
        if err != nil {
                fmt.Printf("error: %s\n", err)
                return
        }
        defer f.Close()
        r := bufio.NewReader(f)
        for {
                line, err := r.ReadString('\n')
                // Print what was read before checking err, so that a final
                // line without a trailing newline is not lost.
                fmt.Printf("%s", line)
                if err == io.EOF {
                        break
                }
                if err != nil {
                        fmt.Printf("error: %s\n", err)
                        break
                }
        }
}
I am really not sure which one of the two is more readable.

The main differences in the two programs are in how errors are handled, and how resources are cleaned up.

Java offers the finally clause of a try block for cleaning up resources; Go instead lets you schedule a function call, via the defer statement, to run when the enclosing function returns. The Go approach gives the programmer less control over when the cleanup occurs, because it is tied to function return rather than to the end of a block. With a try block, the placement of the cleanup code is more under the programmer's control.

Error handling in Java is based upon exceptions. Go doesn't have exception handling yet, although some form of it is planned. The authors of Go seem opposed to exceptions as a mechanism for error handling; their argument is that the try-catch-finally construct makes the code convoluted and that it encourages programmers to label ordinary errors as exceptions. My personal preference is for the Java approach because it forces you to handle the error condition. By convention in Java (although the language does not enforce this), error conditions are indicated via exceptions and not by return values.

I think with either approach you can write bad code that doesn't handle errors properly. In Java, you can do this by handling the exception incorrectly; in Go, if you forget to check for an error condition, the program will probably fail at runtime in an unexpected way.

My initial thoughts are that I prefer the try-catch-finally approach to the Go approach, both for error handling and for resource cleanup. Of course the Java approach isn't perfect; for example, the usefulness of checked exceptions is doubtful, and there could be better support for resource cleanup - in fact this is coming in Java 7.
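The improved resource cleanup mentioned above arrived in Java 7 as the try-with-resources statement. Here is a minimal sketch of how the CatFile loop might look with it. The class name CatFileTwr and the helper copyLines are made up for this illustration, and the usage text is my own wording.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.Reader;

// Hypothetical rewrite of CatFile using try-with-resources (Java 7 or later).
final class CatFileTwr {

    // Copies every line from in to out and returns the number of lines copied.
    // The BufferedReader, and the Reader it wraps, is closed automatically when
    // the try block exits, whether normally or because of an exception.
    static int copyLines(Reader in, PrintStream out) throws IOException {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.println(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: CatFileTwr <file>");
            return;
        }
        try {
            copyLines(new FileReader(args[0]), System.out);
        } catch (IOException e) {
            System.err.println("error: " + e.getMessage());
        }
    }
}
```

The explicit finally clause and the close() helper disappear, while the cleanup still happens at a point the programmer chooses: the end of the try block.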

The programs listed above are trivial, and the comparison is not really fair as the strengths and weaknesses of the two languages are not clear. I am hoping to compare two additional programs - a simple TCP/IP server implementation, and a Lock Scheduler implementation. I have the Java versions of these, and am hoping to write the Go versions in the next few days.

Friday, April 02, 2010

Testing concurrent programs

Testing concurrent programs is particularly hard, as the interleaving of multiple threads of execution greatly multiplies the number of possible code and data access paths. It is quite challenging to write test cases that properly test such scenarios; usually only a handful can be tested.

A unique tool that helps with testing for concurrency bugs is an IBM product named ConTest. ConTest does not generate any new test cases, but if you already have multi-threaded test scenarios, it increases the likelihood of bugs being triggered by introducing random pauses in thread execution.

A while ago, I tried running ConTest against my test suite for SimpleDBM; I found that execution had slowed considerably. So expect your test cases to take much longer to complete.
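To make the idea of random pauses concrete, here is a hand-written sketch of noise injection. It is not ConTest's API, and I am not claiming this is how ConTest works internally; the names Noise, NoisyCounter and runWorkers are all made up for this illustration.

```java
import java.util.Random;

// Hypothetical helper: pauses the calling thread at random to perturb scheduling.
final class Noise {
    private final Random random;
    private final int maxPauseMillis;

    Noise(long seed, int maxPauseMillis) {
        this.random = new Random(seed);
        this.maxPauseMillis = maxPauseMillis;
    }

    // About half the time, sleep for 0..maxPauseMillis milliseconds.
    void maybePause() {
        if (random.nextBoolean()) {
            try {
                Thread.sleep(random.nextInt(maxPauseMillis + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

// Hypothetical code under test: a counter with a pause between read and write.
final class NoisyCounter {
    private final Noise noise;
    private int value;

    NoisyCounter(Noise noise) {
        this.noise = noise;
    }

    // Because the method is synchronized, the pause cannot cause a lost update.
    // Remove the keyword and the pause makes lost updates much more likely.
    synchronized void increment() {
        int current = value;
        noise.maybePause();
        value = current + 1;
    }

    synchronized int get() {
        return value;
    }

    // Starts the given number of threads, each incrementing `times` times,
    // waits for all of them, and returns the final counter value.
    static int runWorkers(int threads, int times, Noise noise) throws InterruptedException {
        NoisyCounter counter = new NoisyCounter(noise);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter.get();
    }
}
```

The pauses are also why such a test runs slowly: every injected sleep adds directly to the running time.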

Sunday, March 28, 2010

Global Loggers

It seems that all existing logging libraries assume that you want a single global Log Manager, tied to the class loader. Only static methods are provided to access the Log Manager or Logger instances.

I have been rigorously removing all static objects from SimpleDBM, so that the entire object graph of SimpleDBM is rooted in the main Server object. Doing this not only makes the code more robust, it also allows multiple instances of SimpleDBM to coexist in the same classloader without conflict. But where this model has broken down is in the Logger implementation, which is a wrapper for either Log4J or JDK Logging, and neither of these allows non-global instances of the Log Manager.

Much as I am loath to do this, it seems the only solution is to roll my own ...

Is anyone else facing this issue? 
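For what it is worth, here is a rough sketch of what an instance-scoped logger could look like. All the names (InstanceLogManager, InstanceLogger, and this toy Server) are invented for the illustration; none of them are SimpleDBM, Log4J or JDK Logging classes, and keeping the records in memory is only there to make the example easy to inspect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical log manager that is an ordinary object, not a static singleton.
final class InstanceLogManager {
    private final String serverName;
    private final List<String> records =
            Collections.synchronizedList(new ArrayList<String>());

    InstanceLogManager(String serverName) {
        this.serverName = serverName;
    }

    // Loggers are handed out by the manager instance, not by a static factory.
    InstanceLogger getLogger(String name) {
        return new InstanceLogger(this, name);
    }

    void publish(String loggerName, String message) {
        records.add("[" + serverName + "] " + loggerName + ": " + message);
    }

    // Returns a snapshot of everything logged through this manager.
    List<String> records() {
        synchronized (records) {
            return new ArrayList<String>(records);
        }
    }
}

final class InstanceLogger {
    private final InstanceLogManager manager;
    private final String name;

    InstanceLogger(InstanceLogManager manager, String name) {
        this.manager = manager;
        this.name = name;
    }

    void info(String message) {
        manager.publish(name, message);
    }
}

// Toy stand-in for the root object: it owns its log manager, so two servers
// in the same classloader never share logging state.
final class Server {
    private final InstanceLogManager logManager;
    private final InstanceLogger log;

    Server(String name) {
        this.logManager = new InstanceLogManager(name);
        this.log = logManager.getLogger("server");
    }

    void start() {
        log.info("started");
    }

    InstanceLogManager logManager() {
        return logManager;
    }
}
```

The cost of this design is that every component needs a reference to its log manager, or to a logger obtained from it, passed in at construction time.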

Life is too short

Life is too short to be able to master multiple programming languages and tools. So while I would love to learn the new Go Programming Language, Python, AJAX, and a few other cool new things, I keep going back to what I already know, the Java programming language. I can write a small utility faster in Java than if I wrote the same utility in Python; not because Java is particularly productive, but because I do not have to spend time figuring out how to do something in Java.

This is where I think something like the Google Web Toolkit is a godsend for someone like me. Being able to create a web user interface in Java, without having to know the intricacies of AJAX, JSP, JavaServer Faces, etc. is way too cool. Of course, there is a learning curve here too, but it is a less steep curve because the language is already familiar.

I am using GWT to create a small application to demonstrate the use of SimpleDBM. I am just loving the experience. Kudos to the Google team for coming out with GWT!

Sunday, March 21, 2010

Java versus Google Go

An interesting new systems programming language was announced by Google at the end of last year - The Go Programming Language (www.golang.org). Is this what Java should have been?

Go is of course new and old. It is a new language that derives a lot from the past work done by its creators at Bell Labs. You can even see copyright notices from Plan 9 etc. all over the place. Therefore although the language is new, it is built upon years of experience.

In general I like the new language. Two features are particularly nice:
  • Any object can be cast to an interface as long as the object implements the methods declared by the interface (sorry for using Java terminology here).
  • Goroutines are cool, as they overcome the problem created by equating a thread with a flow of execution. In other languages, if your thread blocks, that thread sits idle until it is unblocked. In Go, a goroutine that is blocked is moved out and some other goroutine takes its place on the thread. This will be very good for servers that need the ability to multiplex processing over a limited set of threads.
There are a few ugly things too. My dislikes are the built-in allocation functions new() and make(), and the pointer type. Why this mess? Java is so neat: you either have primitives or references. References are like pointers, except that you cannot do any pointer operations with them.

The top feature that is missing is an exception handling mechanism. I have programmed many years in C and now in Java, and I can tell you that it is far easier to create robust error handling in Java. Of course, checked exceptions were a mistake (I have changed my mind about them) and I think Go should avoid them.

I wish I had the luxury of rewriting SimpleDBM in Go. It would be an interesting and fun thing to do. But I have better things to do...  I am hoping that I can at least create a network client in Go, so that it is possible to talk to the SimpleDBM server from Go.

Tuesday, December 04, 2007

Java on Mac OS X

I posted elsewhere that I now own an iMac and am doing most of my Java development on it. The big problem with Java on Mac is that the JDK is pretty out of date. The official version that comes with Mac OS X 10.4 Tiger is 1.5.0_07. This version seems buggy, as the JVM seems to hang when running test cases for one of my projects. I was getting quite frustrated, and in the end decided to run my test cases under OpenSolaris, which I installed as a VM using VMware Fusion.

Fortunately, Apple has recently made available a pre-release version 1.5.0_13 from its Developer Connection web site. You need to be registered to be able to download this release. The good news is that it fixes the bug that was causing my tests to hang the JVM. I think it has something to do with the java.util.concurrent classes. Of course, if you don't use these, you would not have faced the problems I was facing.

The long gap between the Java releases on Mac seems unacceptable, especially since the JDK is so buggy. The other problem is the lack of support from major vendors for the Mac platform. I would have liked to install and use IBM WebSphere and Rational products on my Mac, but it isn't among the supported platforms. Linux has much better support nowadays.

Fast forward

It has been some time since I last blogged about general programming topics. My day job has kept me from the joys of exploring the JPA. But many other things have happened to me in the meantime; life just goes on, and there is never enough time to catch up on everything.

For a start, I want to retitle this blog, by removing the Java from it. Let this just be a Programmer's Blog.