Sunday, January 25, 2015

Ravi - an attempt to create optional typing for Lua

I am in love with Lua, as blogged previously. While it is a perfect little language, there is scope for improving the performance of Lua. Obviously great work in this area has already been done by Mike Pall, who created LuaJIT. However, there are some issues with LuaJIT that are hard to overcome.
  • Large parts of LuaJIT are written in assembler, which means that it would take a significant investment of time and effort to understand how it works and to fix issues or make enhancements to it.
  • Mike Pall is undoubtedly a genius, but he is the sole developer of LuaJIT. The latest version, 2.1, has not been released yet, as Mike is presumably working on other things, as reported on his sponsorship page. So the destiny of LuaJIT is pretty much tied to how much effort Mike puts into it.
  • LuaJIT was based on Lua 5.1, and for good reasons it has stayed compatible with 5.1, avoiding ABI-incompatible features in later versions. But this is increasingly going to be a problem as newer versions of Lua introduce new features.
  • LuaJIT's FFI is great but is not part of standard Lua, so any code exploiting the FFI will not run on standard Lua.
My solution to the above is to enhance core Lua to support optional typing so that the VM can use type specific bytecode. This will hopefully help interpreter performance, but more importantly it will enable simple JIT compilation of performance critical functions.

I am naming this new dialect of Lua Ravi. Full details of the project can be found at the Ravi GitHub site.
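
To make the idea of type specific bytecode a little more concrete, here is a minimal sketch in Go. It is not Ravi's actual design: the `Value` layout and the `addGeneric` and `addInt` functions are names invented purely for illustration. The point is only that a generic instruction has to check operand types every time it runs, while a typed instruction can skip those checks when the compiler already knows the types.

```go
package main

import "fmt"

// Tag identifies the dynamic type stored in a Value (hypothetical layout).
type Tag int

const (
	TagInt Tag = iota
	TagFloat
)

// Value is a tagged slot, loosely modelled on how a dynamic-language VM
// might store numbers. This is an assumption, not Ravi's actual structure.
type Value struct {
	tag Tag
	i   int64
	f   float64
}

func Int(i int64) Value     { return Value{tag: TagInt, i: i} }
func Float(f float64) Value { return Value{tag: TagFloat, f: f} }

// toFloat converts either kind of number to float64.
func toFloat(v Value) float64 {
	if v.tag == TagInt {
		return float64(v.i)
	}
	return v.f
}

// addGeneric is what an untyped ADD instruction has to do: inspect the
// tags of both operands on every execution before picking an operation.
func addGeneric(a, b Value) Value {
	if a.tag == TagInt && b.tag == TagInt {
		return Int(a.i + b.i)
	}
	return Float(toFloat(a) + toFloat(b))
}

// addInt is what a type specific instruction could do when the compiler
// already knows both operands are integers: no tag checks at all.
func addInt(a, b Value) Value {
	return Int(a.i + b.i)
}

func main() {
	fmt.Println(addGeneric(Int(2), Float(0.5)).f) // 2.5
	fmt.Println(addInt(Int(2), Int(3)).i)         // 5
}
```

With type annotations in the source, a compiler could emit the equivalent of `addInt` wherever both operands are declared as integers, and fall back to the generic path everywhere else.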

Monday, August 25, 2014

Lua is small and small is beautiful

Lua is a tiny language and like C has a tiny standard library. Like many other users of Lua, there are times when I wish it had some language feature (ability to specify types, for example) - but when I think about who Lua is meant for by its designers, I get the logic behind keeping it really small and simple.

Although Lua is powerful enough as a language that it can be used for many complex tasks - see DynASM for an assembler written in Lua - its primary design goal is to provide applications with an extension language. For example, suppose you have a spreadsheet application and you wish to allow users to write their own functions for use in the spreadsheet. Or you have an editor and you wish to allow users to customise it. And so on. In all these use cases, we cannot assume that the end user who is coding in Lua is a competent programmer. Hence Lua needs to be ultra-simple for such users. Having types in the language, for example, would immediately complicate the syntax of Lua.

That Lua fulfils the needs of its users is evident from the fact that a number of attempts have been made to create a Lua clone that is more powerful as a language, but none of these alternative improved Lua clones has any great following. (Note that I exclude LuaJIT from this list: it is 100% faithful to Lua 5.1, so it is another implementation of Lua rather than a clone.) I guess the question you have to ask is:

  • Are you trying to create an extension language for ordinary users who are not programmers?

If not then perhaps Lua is not the language you need. 

Tuesday, August 05, 2014

Lua - A fabulous programming language

Last year I discovered Lua and LuaJIT.

These are both amazing implementations of the programming language Lua.

Ok now I need to explain why I think Lua is fabulous and these implementations are amazing.

Lua is a very small, dynamic, scripting language that is extremely easy to learn, and that can be used standalone as well as an embedded scripting language. The well documented and well crafted C API for extending the language is probably one of the best features of the Lua system.

It takes less than a minute to compile and build the Lua language and its basic libraries. Since the language is written in ANSI C, you can build it on virtually any platform.

LuaJIT is a JIT implementation of Lua created by a guy called Mike Pall who is without a doubt a programming wizard. LuaJIT features an interpreter that is hand-crafted in assembler, and has an amazing FFI library that allows easy extensions in C, including creating new data types. LuaJIT comes with an assembler called DynASM that is itself written in Lua.

Neither of these implementations depends on third-party libraries or tools ... which is an amazing thing in today's world (just look at the dependency list of Julia for comparison).

I hope to use Lua extensively in the future, so much so that I decided to learn assembler in order to be able to understand LuaJIT better.



Friday, August 01, 2014

Should you ever embark on a complete rewrite?

I embarked on V2 of my project SimpleDBM about 3 years back. Last month I finally closed down the V2 branch and merged the useful stuff back into the main branch.

V2 was going to be a major refactoring of the system. That is what killed it, because any major refactoring is a large amount of effort. One of the best write-ups on why no one should ever do this is this article at Joel on Software.

That doesn't mean one should not refactor software - it is just that small incremental changes that are immediately merged and tested with the mainline are the better way to do it.

Sunday, July 27, 2014

Life after Java

After working exclusively in Java for several years, I have been dabbling in C++ for the last year or so. The question arises: is C++ still a viable language? If the TIOBE Index is to be believed, C++ has been steadily declining in popularity since about 2005 - coincidentally, this was the year I decided to move from C++ to Java for my project SimpleDBM. At the time I stated my reasons for the move in my second blog post.

So what has happened in the meantime and is C++ still a viable language?

At my day job, I introduced Java into the realm of financial risk analytics. I led the team that converted a C++ based application to Java, and in the process we proved that the Java implementation was several times faster. The reason for this had nothing to do with the choice of language as such - it was just that with Java you can focus on better algorithms and data structures rather than fighting the language, which made all the difference in my view.

And yet it is in the realm of numerical computing where C++ is arguably the best language with the exception perhaps of Fortran (of which I have no experience sadly). The main advantages of C++ are:
  • Ability to seamlessly call C++, Fortran and C libraries - a lot of high performance numerical libraries out there are written in these languages.
  • Control of memory layout of data structures.
  • Efficient array access via pointers - and no bounds checking.
  • Templates for generating type specific code.

C++ is still an ugly language with too many features, but the recent changes in C++11 have made life tolerable, if not completely easy. I have been looking at alternatives such as D, Go, Julia, etc. but haven't found a viable one yet. These other languages are either immature or have very restrictive paradigms. JVM based languages such as Scala have essentially the same issues as Java.



Sunday, May 23, 2010

Java versus Google Go - Part 2

The new Go Programming language from Google is very interesting because it attempts to bring to the world of compiled languages some of the benefits of the VM based languages, such as garbage collection and dynamic interfaces. I am considering porting one of my projects to Go, but before diving in, I would like to explore Go by writing a few small programs and comparing these with the Java versions.

Without further ado, here is a very simple program that reads a file and outputs lines to the console. First, let's look at the Java version:
package org.majumdar;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class CatFile {

        public static void main(String[] args) {
                if (args.length == 0) {
                        usage();
                        return;
                }
                BufferedReader reader = null;
                try {
                        reader = new BufferedReader(new FileReader(args[0]));
                        String line;
                        while ((line = reader.readLine()) != null) {
                                System.out.println(line);
                        }
                } catch (Exception e) {
                        System.err.println("error: " + e.getMessage());
                } finally {
                        close(reader);
                }
        }

        private static void close(Reader reader) {
                if (reader == null)
                        return;
                try {
                        reader.close();
                } catch (IOException e) {
                        // Nothing useful can be done if close fails, so ignore it.
                }
        }
 
        private static void usage() {
                System.out.println("usage: CatFile ");
        }
}


Now, the same program implemented in Go:
package main

import (
        "bufio"
        "fmt"
        "io"
        "os"
)

func usage() {
        fmt.Printf("usage: catfile \n")
}

func main() {
        if len(os.Args) < 2 {
                usage()
                return
        }
        // os.Open opens the named file read-only.
        f, err := os.Open(os.Args[1])
        if err != nil {
                fmt.Printf("error: %s\n", err)
                return
        }
        defer f.Close()
        r := bufio.NewReader(f)
        for {
                line, err := r.ReadString('\n')
                // ReadString can return data together with io.EOF when the
                // last line has no trailing newline, so print before checking.
                fmt.Printf("%s", line)
                if err == io.EOF {
                        break
                }
                if err != nil {
                        fmt.Printf("error: %s\n", err)
                        break
                }
        }
}
I am really not sure which one of the two is more readable.

The main differences in the two programs are in how errors are handled, and how resources are cleaned up.

Java offers the finally clause in a try block for cleaning up resources; the Go approach is to allow functions to be scheduled to be invoked when the enclosing function returns via the defer statement. The Go approach doesn't offer much programmer control over when the cleanup should occur. With a try block, the placement of the cleanup code is more under the programmer's control.
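
A deferred call is tied to the return of the enclosing function, so one way to get some of that control back is to wrap the work in a small function of its own. The following is a minimal sketch, not a recommendation: the `resource` type, `open` and the `events` log are hypothetical stand-ins for a real file or connection, used only to make the ordering visible.

```go
package main

import "fmt"

// events records the order in which things happen (illustrative only).
var events []string

// resource is a hypothetical stand-in for a file or connection.
type resource struct{ name string }

func open(name string) *resource {
	events = append(events, "open "+name)
	return &resource{name: name}
}

func (r *resource) Close() {
	events = append(events, "close "+r.name)
}

// process wraps the work in a function literal so that the deferred
// Close runs when the literal returns, not when process returns.
func process() {
	func() {
		r := open("a")
		defer r.Close()
		events = append(events, "use "+r.name)
	}()
	events = append(events, "after")
}

func main() {
	process()
	fmt.Println(events) // [open a use a close a after]
}
```

Without the inner function literal, the close would be recorded after "after", because the defer would only fire when `process` itself returned.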

Error handling in Java is based upon exception management. Go doesn't have exception management yet, although some form of it is planned. The authors of Go seem opposed to exception handling as a mechanism for error handling; their argument is that the try-catch-finally construct makes the code convoluted and encourages programmers to label ordinary errors as exceptions. My personal preference is for the Java approach because it forces you to handle the error condition. By convention in Java (although the language does not enforce this), error conditions are indicated via exceptions and not by return values.

I think with either approach you can write bad code that doesn't handle errors properly. In Java, you can do this by handling the exception incorrectly; in Go, if you forget to check for an error condition, the program will probably fail at runtime in an unexpected way.
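
As a small illustration of the Go side of this, here is a sketch of what forgetting (or choosing not) to check an error can look like. The function names `parseIgnoringError` and `parseChecked` are made up for this example; `strconv.Atoi` is the standard library call.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIgnoringError discards the error from strconv.Atoi, so invalid
// input silently turns into 0 and the caller never finds out.
func parseIgnoringError(s string) int {
	n, _ := strconv.Atoi(s)
	return n
}

// parseChecked passes the error back to the caller instead of hiding it.
func parseChecked(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse %q: %w", s, err)
	}
	return n, nil
}

func main() {
	fmt.Println(parseIgnoringError("12x")) // 0, with no sign that anything went wrong

	if _, err := parseChecked("12x"); err != nil {
		fmt.Println("error:", err)
	}
}
```

In the first function the bad value flows on into the rest of the program, which is the kind of unexpected runtime behaviour I have in mind.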

My initial thoughts are that I prefer the try-catch-finally approach to the Go approach, both for error handling and for resource cleanup. Of course the Java approach isn't perfect; for example, the usefulness of checked exceptions is doubtful, and there could be better support for resource cleanup - in fact this is coming in Java 7.

The programs listed above are trivial, and the comparison is not really fair as the strengths and weaknesses of the two languages are not clear. I am hoping to compare two additional programs - a simple TCP/IP server implementation, and a Lock Scheduler implementation. I have the Java versions of these, and am hoping to write the Go versions in the next few days.

Friday, April 02, 2010

Testing concurrent programs

Testing concurrent programs is particularly hard, as the interleaving of multiple threads of execution greatly multiplies the number of possible code and data access paths. It is quite challenging to write test cases that properly test such scenarios; usually only a handful can be tested.

A unique tool that helps with testing for concurrency bugs is an IBM product named ConTest. ConTest does not generate any new test cases, but if you already have multi-threaded test scenarios, it increases the likelihood of bugs being triggered by introducing random pauses in thread execution.
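
The general idea of introducing random pauses can be sketched in a few lines. The example below is in Go and is only an illustration of the technique, not of how ConTest itself works; `counter`, `maybePause` and `run` are hypothetical names. The counter has a deliberate bug: the read and the write are each locked, but the increment as a whole is not atomic.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// counter has an atomicity bug: Get and Set are individually locked,
// but callers perform an increment as two separate steps.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) Get() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func (c *counter) Set(v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n = v
}

// maybePause is a hypothetical noise hook: when enabled it sleeps for a
// short random time, widening the window in which threads can interleave.
func maybePause(enabled bool) {
	if enabled {
		time.Sleep(time.Duration(rand.Intn(200)) * time.Microsecond)
	}
}

// run starts the given number of goroutines, each doing one buggy
// increment, and returns the final value of the counter.
func run(workers int, noise bool) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			v := c.Get()
			maybePause(noise) // pause between the read and the write
			c.Set(v + 1)
		}()
	}
	wg.Wait()
	return c.Get()
}

func main() {
	fmt.Println("without noise:", run(50, false))
	fmt.Println("with noise:   ", run(50, true))
}
```

A correct counter would always end at 50. The results vary from run to run, but with the pauses enabled the lost updates should show up far more often, at the cost of a slower test.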

A while ago, I tried running ConTest against my test suite for SimpleDBM; I found that execution had slowed considerably. So expect your test cases to take much longer to complete.