TL;DR: Source here, live demo here.

ROME project

ROME project

WebGL is an amazing piece of technology that enables browsers to natively render hardware accelerated 3d creations (yay, no o3d plugin needed!). I’ve always been specially amazed by what Mr Doob has been doing with his Three.js framework for quite a while (in particular his participation on the ROME project, which I briefly talked about recently). Nonetheless, there are some other amazing WebGL creations around, such as those featured on Chrome Experiments, those crafted by OOS and the projects recently presented on WebGLCamp (not to mention the amazing Team Fortress 2 level vizualizer).

One thing that has bothered me though, throughout all the examples, is the lack of interactivity (one glaring exception was GLGE‘s car demo) .

This seemed to be from the fact the 3d collision is quite a bit more involved than 2d (MIT’s lecture notes on Computational Geometry, and even on Doom 3′s recently released source code, can give you an idea of how much involved it can get). And since this is considered to be one of the important things  that 3d games need, I was not happy with my hands off answer of this topic on Hacker News. Thus I took the challenge of making a a 3d game in WebGL with collision detection.

Implementing Minecraft Classic (which is playable online for free) seemed like a good candidate for such project, as its mechanics are simple, and yet meaningful (not to mention that I am big fan of Notch’s creation). If you don’t know anything about Minecraft, I wrote a small intro about it a few months ago.

Enterprise created in Minecraft

Enterprise created in Minecraft

Three.js was selected as the rendering lib not only because I really like Mr Doob’s work, but also because it has is quite mature, is open source, has lots of examples and provides a very promising starting point: a visualization of a Minecraft world, which includes a noise function for generating it.

So all that was left was just adding Collision Detection. Well… this is where things started getting interesting.

Minecraft Collision Detection Attempt 1: JigLibJS

Jiglibjs is a port of Jiglibflash, which in itself is a port of Jiglib for c++. It seemed pretty promising, since there was some people using it (along with three.js), and I had great success with using physics libs in the past for 2d games, so it seemed like a natural choice.

After getting it to work with a prototype of the game, it became clear that some things were very wrong. The amazing demos were too simple for Minecraft, and the rotation would get really wrong when walking on simple plane surfaces (something I speculate happening due to some numerical stability differences from ActionScript and Javascript, alongside with some AS3-> Flash compiler bugs).

JigLibJS demo

After some fun 4 dimension matrix hacking, and failing to get around the bugs, I was ready to move on. Thankfully I found another port of JigLibJS which seemed to correct such issues…

Minecraft Collision Detection Attempt 2: JigLibJS2

The fact that someone tried to make a more complex example on JigLibJS, failed, but was tenacious enough to write its own AS3 -> JS compiler to make another port (which worked!) made a very positive impression.

JigLibJS2 Vehicle Physics Demo

It was also very positive the fact that the interface was almost the same to JigLibJS, making the change to the existing code very small. The problems started to come when trying to make the player cube jump: the cube would not always collide with the bottom plane. This was solved by making more collision iterations. Soon enough another problem came up: when trying to move inside an adjacent static cube (even when both the player and the cube were in the same height), sometimes the player would get a rotation. Trying to set it to no rotation every iteration did not actually help, as the player would still sometimes get a vertical velocity when trying to penetrate the block.

Still, rolling my own Physics Engine seemed a bit daunting, and I decided to try out another lib:

Minecraft Collision Detection Attempt 3:  Ammo.js

Ammo.js is a port of the C++ physics engine BulletPhysics, that runs on top of the LLVM‘s Javascript port, aka Emscripten. Everything looked amazing this time: the demos not only looked great, but were very hackable, and had no problems that I previously had.

Ammo.js Demo

But then it came the time to try to get to more Minecrafty world dimensions. It was a bit disappointing when mere 400 cubes made the physics engine go to a crawl (even when using static cubes). It became clear that I needed O(1) collision algorithms, which is very doable for Minecraft Classic, as only the player can move, and there is always a constant amount of cubes the player can collide with at any given time. And now there were no more libs left to evaluate.

Minecraft Collision Detection Attempt 4:  Rays!

Rays are the standard way of detecting line/Object collision in Three.js. A very simple interactive demo by OOS seemed like it could do the trick. It had the advantage of being very simple, and constant time (given that I selected the possible blocks to collide with, as the traditional Ray.intersectObjects actually tries to intersect all objects on the Scene).

Ray Collision Demo

The OOS example had some issues (like trapping the player cube when jumping up and down, while hodling the forward key). This was solvable by using 12 rays correspoding to the edges of the cube that represented the player. Actually, this did not help all the time, as some rays would not collide with the world blocks if the ray’s origin inside a blocks’s face. This is a bug yet to be solved, but I got around it by using 24 rays (two directions for every edge of the player cube).

Things were going well enough that I finally moved into adding textures to the game. However, after making the player cuboid have size dimensions to Minecraft’s, instead of having the same size of the world’s cube (which was what I was experimenting with so far), I noticed that the ray would give false positives depending on the height position of the cube. This only happened when the numbers were all too exact, instead of with minor deltas, as you’d expect from using rays cast from the mouse pointer (as it is usually used on Three.js’ demos).

This, and the fact that collision was taking about half the time of every tick (which was constant, due the improved collision algorithm) made me move on to…

Minecraft Collision Detection Attempt 5: Cube Projection

Up until now I had hopes that I would be able to eventually rotate the player cuboid according to its camera. After having so many troubles with so many collision systems, I simplified the problem: collision of unrotated cuboids. This is a really simple problem: from the Separating Plane Theorem, it is easy to see that  I only need to see if the orthogonal projections of the cuboids, which, from the Separating Axis Theorem, means I only need to check out if the unrotated rectangles from the projected faces collide, which finally means I only need to check if the projected intervals collide. All of this amounts to 18 lines of Coffeescript:

CollisionUtils =
    # The two intervals are [s1, f1] and [s2, f2]
    testIntervalCollision: (s1, f1, s2, f2) ->
        return true if s1 == s2
        return f1 >= s2 if s1 < s2         return f2 >= s1

    #Cubes are objects with vmax, vmin (the vertices with greatest/smallest values)
    #properties. Assumes unrotated cubes.
    testCubeCollision: (cube1, cube2) ->
        fcol = CollisionUtils.testIntervalCollision
        for axis in ['x', 'y', 'z']
            collides = fcol cube1.vmin[axis], cube1.vmax[axis]
            , cube2.vmin[axis], cube2.vmax[axis]
            return false unless collides
        return true

window.CollisionUtils = CollisionUtils

Which in retrospect is what I should have tried in the first place. At least I learned some stuff in the process.

Not quite finished yet: Camera

Paul Irish did a pretty amazing job with the Three.js FirstPersonControls.js, which is the one that powers the Three.js Minecraft visualizer. The problem with this camera is that it makes hard to actually play Minecraft, as its default mode is to be always moving, making hard to place/remove blocks. Real Minecraft uses first person shooter camera which cannot be achieved with current browsers, as there is no way to trap the user’s mouse. Nevertheless, Minecraft on Android uses a touch and drag camera that can easily be implemented in JS. This camera works they same way as the one on Brandon Jones’ Quake 3 WebGL implementation.

Quake 3 demo

Therefore I refactored, and converted to Coffeescript, the FirstPersonControls to work as a click and drag camera. The resulting 86 lines of code can be seen here.

Adding/Removing Blocks

This is where Rays worked really well. In fact, Mr. Doob even have a voxel editor example which shows really well how to make an app that adds/removes cubes on a 3d grid:

WebGL Voxel Editor

The only issue I’ve found was that my floor plane was too big, which messed up the Ray/plane collision in certain angles. So I coded this intersection directly:

    getCubeOnFloorPosition: (ray) ->
        return null if ray.direction.y >= 0
        ret = vec()
        o = ray.origin
        v = ray.direction
        t = (-o.y) / v.y
        ret.y = 0
        ret.x = o.x + t * v.x
        ret.z = o.z + t * v.z
        return @addHalfCube ret

Conclusions

I felt the result was quite satifying (the live demo, and the MIT licensed source, can both be found on Github):

Minecraft Brick Pyramid

WebGL Brick Pyramid

Granted, it starts skipping frames quite often as you add more blocks. The original example from Dr. Doob’s  didn’t because he created a single Mesh (aka object scene) composed of all blocks. Doing such would make adding/removing blocks a lot more involved (or force me to handle some area loading/unloading), which is a project in itself. Note that the rendering, and not the collision system, is the real bottleneck at this point.

All of this made me admire Notch’s work on Minecraft a lot more, as Minecraft can handle over 21×21 loaded chunks of 16x16x128 (over 14 million blocks!) in any given time. The Three.js community had some great insights on how to achieve such performance over WebGL, which requires the use of shaders, which are quite low level (even if you use the respective TDL Google Library for this, or GPipe to write the shaders in Haskell), and would probably require a lot of collision code to be also written in also a very low level language (slash GPipe’s Haskell). I found it also interesting that shaders can be used in some clever ways to improve JS performance.

And finally, it is important to note that a lot of very people a lot smarter than me have been doing some great work to make working with WebGL and making 3d games much simpler.

There are also other renderer libraries besides Three.js: Scene.js, PhiloGL, A3 (recently presented on the WebGL Camp), Coppercube (which is not open source, but can use flash for 3d rendering as well) and GLGE.

Hacker News Comments

A language that doesn’t affect the way you think about programming, is not worth knowing

Alan J. Perlis

In the past few years Javascript has gained a lot of attention and ubiquity: HTML5 technologies leverage a lot of Javascript, which enables people to create amazing dream worlds (like the in the ROME project) with WebGL, V8 brings a lot of JIT techniques to a JavaScript Virtual Machine which helps Google Chrome be a very fast browser, and powers NodeJS (allowing people to create a web page in a single programming language).

Javascript also powers queries in NoSQL databases like Mongo and CouchDB, and it can be used when making QT applications, 3d Games in Unity and even mobile apps with frameworks like WebMynd and PhoneGap. It has been a long way from the old days when it was confined to the browser, and mostly used for form validation.

In spite of all of this attention, Javascript has been so misunderstood that attention to its Good Parts had to drawn. It doesn’t help that its prototype based OO was first introduced by the rather unknown language Self, despite the several advantages it has when compared to traditional class based OO (the paper Organizing Programs Without Classes, written by Google’s Senior VP of Operations Urs Hölzle, who, among other things, also contributed to key JIT techniques like polymorphic inline caching).

Therefore it is not surprising that there are many projects that compile existing languages to Javascript. Many of these were too focused on the web platform (like Google Web Toolkit and the amazing Cappuccino‘s Objective-j). On more recent years languages are targeting the whole JS ecosystem (which makes a very poignant argument that JavaScript is Assembly Language for the Web). Coffeescript is one of such languages, which very fond (I’ve  written about it recently).

About a month ago ClojureScript was released, porting the Clojure language from Java ecosystem to the JS. Clojure is quite an amazing effort of engineering, not only for being a very successful Lisp on the JVM, but also for its novel approach of handling time and state (which its creator, Rich Hickey, explains really well).

I was really excited to see the examples, but I was a bit bummed out that the most interesting example was a Twitter visualization tool (which feels a bit too much like a 2010 app). Since both CoffeeScript and Clojure are fun languages, and I wanted to see how ClojureScript would compare to CoffeeScript,  I took the challenge and crafted a simple game HTML5 physics based game on both languages, using Box2dWeb, a js port of Box2D (the physics engine, created by Erin Catto, that is behind Angry birds).

The game consists of clicking on the objects to destroy them, so that they don’t reach the top of the canvas (and, in another very Tetris like fashion, the elements pop out faster the more you play). It really sticks to the bare minimum of Terrano’s Hierarchy of Gamer Needs. All the code is open source and can be fount on Github. The CoffeeScript‘s version source can be found here, and the ClojureScript‘s here.

Lessons Learned

Disclaimer: ClojureScript is pretty much in alpha status, so many things are likely to improve in the future.

Compiling: The first thing that really pops up is how fast Coffeescript compiles down to JS. The watch behavior allows you to fire the compilation process and forget it. ClojureScript takes me about 5 seconds to compile a single file. Granted it gives warning about unused/undefined variables, but I’d really prefer it to compile instantly and let the browser tell me this on runtime.

Namespaces: Clojure’s namespace are implemented as global variables, which are shadowed by local variables with the same name. For instance, if you are in a namespace called game, don’t use local variables and arguments named game. This is really important, as ClojureScript will use the global namespace for every single function defined in that namespace, so shadowing it is likely to give all sorts of errors.

ClojureScript is not Clojure: Fogus wrote an interesting piece on the lack of eval on ClojureScript. Even though I find it might make sense for some web pages, when making WebGL games, or even Canvas 2d games, the assets size can easily overshadow then entire library’s size. Which is not a big deal if you use HTML5′s Cache manifest. In the end, it felt very much like the opposite of the Lisp spirit (epitomized by Paul Graham on his Five Questions about Language Design: “Give the Programmer as Much Control as Possible“).

The documentation is quite clear that eval is not supported. What it is not clear, is that this argument against eval permeates many other functions: resolve is not implemented (neither ns-resolve, or the *ns* definition). Without both of them, there is no way to transform a string into a function. For people more used to OO languages, like Javascript, Ruby and Python, this essentially means that ClojureScript doesn’t have any reflection APIs. On the game:

createElement: ->
  randomY = (0.2 + 0.4 * Math.random())*  H / @scale
  randomX = (Math.random() * (W - 50) + 25) / @scale
  type = @objectList[randomInt(@objectList.length)]
  @["create#{type}"] randomX, randomY, Math.random() + 1

The last line of the Coffeescript version uses reflection to get the correct  method name (to decide to invoke createTriangle, createCircle or createSquare ). The ClojureScript version had to be translated into:

(defn- create-element [game]
  (let [randomY (/ (* H (+ 0.2 (* 0.4 (rand)))) scale)
        randomX (/ (+ 25 (* (rand) (- W 50))) scale)
        type (rand-nth [:circle :square :triangle])
        method (keyword (str "create-" (name type)))]
    ((@game method) game randomX randomY (inc (rand)))
    )
  )

Which is possible because the game has the respective functions as keyword attributes, which can be easily converted from string interpolation.

User Macros are not supported: At the moment at least (support for it is likely to come on following updates). This makes the situation above much harder to take. But this is mostly due to ClojureScript’s alpha status. This does make the code a bit longer (the CoffeeScript version  has 236 lines, while the ClojureScript has 301 lines). In order to circumvent what I consider that  would be one of the ugliest bits caused by lack of macros (several (set! (. obj attr) value)  calls), I defined a js-set function:

(defn- js-set
  "Sets an attribute name to a value on a javascript object
Returns the original object"
  ([jsobject attr value]
    (do (native-set-wrapper jsobject attr value)
      jsobject))
  ([jsobject & values]
    (do (doseq [[attr value] (apply hash-map values)]
          (native-set-wrapper jsobject attr value))
      jsobject)))

This is actually really against Clojure’s spirit, as Clojure really promotes immutable code. However Javascript libraries, in particular Box2dWeb, really expect mutable state. Therfore handling native js objects require such functions (note that converting them to clojure and keeping it on clojure land can be easily done with the nice undocumented function js->clj function, which is actually used on TwitterBuz).

Therefore we can write functions this way

(defn- create-fixture
  ([shape] (js-set (b2FixtureDef.)
  :density 3
  :friction 0.3
  :restitution 0.9
  :shape shape
))
([] (create-fixture nil))
)

Instead of:

(defn- create-fixture
  ([shape] (let [f (b2FixtureDef.)]
  (set! (. f density) 3)
  (set! (. f friction) 0.3)
  (set! (. f restitution) 0.9)
  (set! (. f shape) shape)
))
([] (create-fixture nil)))

Which makes it look a lot like assoc function for creating maps with updated values. This version inspired me to refactor the Coffeescript version using a similar assoc function:

createFixture = (shape) ->
f = new b2FixtureDef
f.density = 3.0
f.friction = .3
f.restitution = .9
f.shape = shape if shape?
return f

became:

assoc = (o, i) -> o[k] = v for k, v of i; o

createFixture = (shape) ->
assoc new b2FixtureDef,
density: 3
friction: .3
restitution: .9
shape: shape

Which is quite similar to ClojureScript’s version (the colons are on the right instead of the left, and it requires a comma). Which reduces the amount of accidental complexity to a minimum.

Edit: Thanks everybody for pointing out that you can use macros with Clojurescript. However, at the moment, the are clojure macros (so no js), and they require you hacking your clojurescript to add the macro files in the classpath. Hiccups and cljs-3d are two projects that do this, so you can see on their build files how they do this. Even then, you still need to use require-macros. All of this makes macros less of a native feature on Clojurescript, and it makes a lot harder to share code seamlessly.

IDE support: Clojure’s IDE support is really nice. Intellij’s La Clojure (avaiable on its free Community Version) does a lot more than mere syntax highlight: minimal refactoring support, rainbow parenthesis, smart parenthesis, syntax highlighted repl, great autocomplete support, awesome code navigation, autocomplete for java classes and live templates. And it works pretty well for ClojureScript as well. Other IDEs are also great, even though Emacs support can be a bit more intense on its setup (which is not something emacs users are unfamiliar with).

Even though I am really happy with Github’s founder Chris Wanstrath work on the Emacs mode for Coffeescript, it doesn’t have the same support that Clojure does. It is getting more and more support, but nowadays Clojure has the upper hand.

Debugging support: Browser support for debugging languages that compile down to javascript is coming, but at the moment Cofffeescript compiles down to such a readable JS that it not a big problem. This is a known issue with ClojureScript at the moment. Even on pretty print compile mode.

Conclusions

Since Javascript on the web has a much more simple execution model than Java, Clojure’s amazing concurrency control mechanisms are not as shining. Nevertheless ClojureScript is a delight to work with. As it moves out of its Alpha status, many of the issues are likely to be gone. I also expect it to support the full Clojure language (including things like resolve, letfn, macros and eval), as writing web apps in a single language on client and server is a really nice feature. This would also make ClojureScript even more interesting, as it would allow developers do leverage all the power of existing Clojure libraries into their Javascript work (and possibly use it on a more polyglot environment).

Coffeescript is more suitable for production apps right now, but it  is nice to see all these developer efforts to allow people to be more productive and happy with their work on Javascript platforms (this way we don’t have to wait for Google’s NaCl and PNaCL, which promise to bring even more languages to the environment).

For those who don’t know, Minecraft is an inspiring indie game that places the gamer into a sandbox 3d world, where everything is made of blocks. Blocks can also be crafted into other blocks through recipes. Besides having over 10 million users and being feature on Techcrunch, what makes it quite unique is what the users have created with it: from a 3d block replica of the enterprise, to giant spaceships and entire 8 bit CPUs. Users even created a real time kinetic world terraformer, a tool that lets you use a 3d printer to bring to physical world creations from minecraft, and tools to import good old fashioned 3d models into minecraft.

As expected, Minecraft has its tools and mods. And even though there are several libs for view the world of a save file, I found that, in spite of its vibrant community, there were no libs for editing the world. Note that MCEdit allows some  hacking, but it is mostly a GUI editor (and a very good one in my opinion).

So I hacked together a simple library for manipulating the world files called RubyCraft. To illustrate the simplicity it enables, turning the first chunk completely into gold is a simple one line:

Region.fromFile(filename).chunk(0, 0).block_map { :gold }

And making all blocks into orange wool is as simple as

Region.fromFile(filename).chunk(0, 0).each do |b|
  b.name = :wool
  b.color = :orange
end

The result:

The issue with this Api is that it leaks a bit the Minecraft abstraction of how the world is divided. In a nutshell, the world is divided into region files, each one is divided into a 32 x 32 matrix of chunks, which is nothing more than a 16x16x128 cube of blocks. To manipulate the chunks inside a region file, you can request a cube, giving its initial point, width, length and height. The same code above could be written ignoring the chunk abstraction like this:

r = Region.fromFile(filename)
c = r.cube(0, 0, 0, :width => 16, :length => 16, :height => 128)
c.each do |block, z, x, y|
  block.name = :wool
  block.color = :orange
end

A cube can span several chunks, but at the moment it can’t span several regions. It might not be a big issue, as a Region is a pretty large area (it contains over 33 million blocks), and it can take a while to save an entire region (the time it takes to save a Region is proportional to the changed chunks), even in JRuby (which I found to be 3 times as fast than MRI for this particular task).

A Gnuplot in my Minecraft

Edit: The save file for the resulting world can be found here.

After turning Minecraft world file into a 3d matrix, making a two real function plotter quite simple. The plotting_example.rb mostly contains code that decide the area where the graph will be plotted, centering the function on the xz axis, and more importantly, it plots f(x, y) for a given f:

def plot(function, fillFunction)
cube = getCube
middlePointX = length / 2
middlePointZ = width / 2
centeredF = proc do |x, z|
function.call(x - middlePointX, z - middlePointZ).ceil
end
points = Set.new
yzraster(centeredF, points)
yxraster(centeredF, points)
modifyBlocks(cube, centeredF, fillFunction, points)
end

Quite straightforward. The functions yxraster and yzraster have a mild subtlety: just plotting the points of f(x, y) can prevent a look from looking continuous. In general plotting algorithms you have to find a plane or another elementary surface to approximate a small region. As minecraft only contains blocks, I’ve joined all points by discrete line segments, using Bresenham’s line algorithm (source here). This is done by transversing the plotting cube with xy planes, and then with zy planes (therefore only the 2d version of Bresenham algorithm is needed).

Also note that f(x, y) is coerced into integer values by taking the ceil. This is because Bresenham’s algorithm expects points defined on Z x Z, but is expected, as the resulting points would have to be coerced into a integer y coordinate anyway because of the Minecraft world definition.

With all of  this, the following examples are easy to create:

Diamond Cone:

plotWith :diamond_block do |x, z|
   sqrt((x** 2 + z ** 2) / 3) * 5 + 20
end

Water Hyperbolic Paraboloid

    plotWith :water do |x, z|
      (x** 2 - z ** 2) / 3 + 50
    end

Lava Surface 10 of Gnuplot examples

    plotWith :lava do |x, z|
      log(x ** 4 * z ** 2 + 2) + 20
    end

Netherrack Surface15 of Gnuplot examples

    plotWith :netherrack do |x, z|
      (sin(sqrt(z ** 2 + z ** 2)) / sqrt(x ** 2 + z ** 2)) * 30 + 30
    end

Golden rotated Sine

    plotWith :gold do |x, z|
      sin(sqrt((x** 2 + z ** 2)) / 2) * 10 + 30
    end

Ice Sphere (half sphere actually)

    plotWith :ice do |x, z|
      sqrt(18**2 - x**2 - z **2) + 30
    end

Wooden Polynomial

    plotWith :log do |x, z|
      x /= 5
      z /= 5
      (x + z) ** 5 + x**3 + z**2 + 30
    end

Obysidan Polynomial Quotient

    plotWith :obsidian do |x, z|
      p1 = (x + z) ** 6 - x ** 3 + z **2 + 50
      p2 = x ** 7 + 6* z ** 6 - x **4 - z**2 + 30
      p1 / p2 + 10
    end

Colorful Paraboloid

    plot(proc {|x, z| (x** 2 + z ** 2) / 3}, proc do |b, z, x, y|
        b.name = :wool; b.data = OrderedColors[y * 16 / 128].data
      end)

The Ordered Colors of the Colorful Paraboloid are the Wool Colors of minecraft sorted by distance to the black color. The distance definition is the same as the one from the kinetic experiment.

The plotting class cannot plot parametric surfaces at the moment. However, since the graphs are real minecraft objects, they can be manipulated as any other minecraf object. For instance, it is possible to turn the golden rotated sine into a roller coaster (source here):

All the code is open source and can be fount on Github.

Acknowledgements

The examples use an edited version of the Low Dirt Tyken‘s test world. The screenshots were take while flying using Single Player Commands mod. The algorithm for parsing the region file was based on Weeble’s work. Parsing the nbt binary is done through NbtFile gem.

Coffeescript is a very nice (and relatively new) language that compiles down to javascript, making web programming (and making firefox plugins, nodejs apps, and so forth) much more joyful. Its object model is the same as javascript (one of coffeescript’s motto is Unfancy JavaScript), and its compiled js form is quite easy to read and debug. It has many niceties, including classes (effectively making the prototype chain a first class citizen of the language) and array/object comprehensions (heavily influenced by python’s list comprehensions).

Ruby also has a influence on the language, such as optional parenthesis on method/function invocation. In fact, the original version of Coffeescript compiler was written in Ruby (but nowadays coffeescript is a self-hosting language).

Coffeescript has been used by several projects, including a mobile framework written by Rail’s creator 37 signals. I’ve been using for about one year (including some open source work using a HTML 5 Canvas framework called EaselJs, a port of ruby functionalities and even a Firefox plugin).

Because of all the Ruby and Python influence on the language, and the fact that Coffeescript can convey beautiful and concise code, I had a hunch that it could get a really good position on Peter Norvig’s Spelling Corrector implementation collection (Javascript’s version currently has 53 lines, which is a lot more than python‘s 21). With some work, I managed to implement it in 21 lines as well:

words = (text) -> (t for t in text.toLowerCase().split(/[^a-z]+/) when t.length > 0)
Array::or = (arrayFunc) -> if @length > 0 then @ else arrayFunc()
Array::flat = -> if @length == 0 then @ else @[0].concat(@[1..].flat())
train = (features) ->
 model = {}
 (model[f] = if model[f] then model[f] +1 else 2) for f in features
 return model
NWORDS = train(words(require('fs').readFileSync('./lib/big.txt', 'utf8')))
alphabet = 'abcdefghijklmnopqrstuvwxyz'.split ""
edits1 = (word) ->
 s = ([word.substring(0, i), word.substring(i)] for i in [0..word.length])
 deletes = (a.concat b[1..] for [a, b] in s when b.length > 0)
 transposes = (a + b[1] + b[0] + b.substring(2) for [a, b] in s when b.length > 1)
 replaces = (a + c + b.substring(1) for c in alphabet for [a, b] in s when b.length > 0)
 inserts = (a + c + b for c in alphabet for [a, b] in s)
 return deletes.concat transposes.concat replaces.flat().concat inserts.flat()
known_edits2 = (word) -> ((e2 for e2 in edits1(e1) when NWORDS[e2]? for e1 in edits1(word)).flat())
known = (words) -> (w for w in words when NWORDS[w])
correct = (word) ->
 candidates = known([word]).or -> known(edits1(word)).or -> known_edits2(word).or -> [word]
 ({k: w, v: NWORDS[w] or 1} for w in candidates).sort((a, b)-> b.v  - a.v)[0].k

All the code is hosted on github. The code above can be seen in a more readable version here (after line 21 it also contains a full test, using a fixture generated by a slightly modified version of Peter Norvig’s original implementation). There is a more testable version, along with Jasmine BDD tests (which look a lot like Rspec’s in Coffeescript), which run headless on NodeJs, but work just fine in the browsers.

Considerations

Findall regex doesn’t exists natively in Javascript, however it is equivalent to spiting by the complementary regex (line 1)

Array::or (line 2) was needed to be implemented, because Python’s truthfulness allows a collection to be true (actually, any iterable) as long as it is not empty. Array::flat (line 3) has to be implemented because Coffeescript’s loop comprehension is a bit different from python’s: double loops (example: x + y for y in col1 for z in col2) return array of arrays instead of a single array.

Also note that loop comprehension’s order is inverted (x + y for x for y in python is translated as x + y for y for x).

Granted, this version runs really fast on NodeJs 0.4.1, and I was quite happy with how the resulting code looked. I was even happier that I did not have to write the compiled Javascript file and its whooping 147 lines of Spelling Corrector.

To see the reason why this work, check out Peter Norvig’s original post.

Many developers are used to low-level concurrency primitives, such as locks, monitors and semaphores. Java also has higher level concurrency utilities such as Atomic Objects and Fork/Join framework. Such primitives still require a lot of attention to shared variables, and are very easy to get wrong. Ilya Grigorik recently discussed other models of concurrency on his recent post Concurrency with Actors, Goroutines & Ruby, where he even introduced a ruby port of Go‘s concurrency mechanism. These models attempt to make concurrent programming easier.

The actor model is another very simple high level concurrency model: actors can’t respond to more than one message at a time (messages are queued into mailboxes) and can only communicate by sending messages, not sharing variables. As long as the messages are immutable data structures (which is always true in Erlang, but has to be a convention in languages without means of ensuring this property), everything is thread-safe, without need for any other mechanism. This is very similar to request cycle found in web development MVC frameworks.

Scala is famous for coming with an Actor library built-in. However, using Scala libraries in Ruby is not easy[1]. Akka is another great project that implements Actors, however it has a java Api, which makes the JRuby integration easier. Why JRuby? Not only to access Akka’s actor library, but also because JRuby is one of the few ruby implementations that doesn’t have the GIL, therefore it allows true concurrency for all types of applications (IO bounded or not).

Integrating Akka with JRuby

For starters, let’s first create a simple actor in Java:

public class PingActor extends UntypedActor {
	public void onReceive(Object message) throws Exception {
	    if (message instanceof String) {
	    	System.out.println("!!! Acted on: " + message);
	    }
	    else throw new IllegalArgumentException("Unknown message:" + message);
	}
}

This simple actor will just output any message it receives prefixed with “!!! Acted on: “, and will throw exception on any message that is not a string. This example show how simple it is to define an actor: just define a onReceive method that is called whenever a message is sent.

To see this actor working, we need four lines:

		ActorRef actor = actorOf(PingActor.class).start();
		actor.sendOneWay("hello actor world");
		TimeUnit.SECONDS.sleep(1);
		ActorRegistry.shutdownAll();

The first gets an actor reference and starts it. It is important to note that you cannot create an actor just by invoking new. Not in java or scala (we can solve this in Ruby). This is because there is a lot of AOP going on the background[2]. The second line just sends the message to the actor asynchronously. There two other ways of sending messages, which I’ll not cover, but you can read more in Akka’s documentation.

The last two lines just give time to the message reach the actor (remember, the sendOneWay method is non-blocking), and stops all actors on the system. Pretty simple right? Let’s see how we can do the same in JRuby. Setting up the stage:

require 'java'
module Akka
  include_package 'se.scalablesolutions.akka.actor'
end

These lines enable java and make a ruby module with all the classes of se.scalablesolutions.akka.actor package. Basic JRuby setup. Now on to defining the actor:

class PingActor < Akka::UntypedActor
  def self.create(*args)
    self.new(*args)
  end

  def onReceive(message)
    puts "!!! Acted on: #{message}"
  end
end

Here we have our first differences. The onReceive is just a cleaner version of the Java one. No type annotations, no type checking and a simpler string output. However, we have to define a classmethod called create, which just invokes new. This method seems to be created by the AOP part of Akka, which doesn’t seem to work on Ruby subclasses of UntypedActor. However, we can defined it ourselves. Now to actually using the actor:

actor = Akka::UntypedActor.actorOf(PingActor).start
actor.sendOneWay "hello actor world"
sleep 1
Akka::ActorRegistry.shutdownAll

Pretty much the same four lines as on the Java version, with a little less parenthesis, and a terser sleep method. The ruby code can be found on this page, and Java code here.

Fixing the Ruby Interface

Much of the code in the former example is infrastructure, but we can work around the static nature of Java classes in ruby. As factored out in akka.rb, we can gather this functionalities into a base class, and rewrite the first example in 8 lines:

require 'akka'
class PingActor < Actors::Base
  def onReceive(message)
    puts "!!! Acted on: #{message}"
  end
end
PingActor.spawn.sendOneWay "hello actor world"
Actors.delayedShutdown 1

Using closures we can even enhance our JRuby api with some ideas from the Scala api, making it down to 3 lines:

require 'akka'
Actors.spawn { |m| puts "!!! Acted on: #{m}" }.sendOneWay "hello actor world"
Actors.delayedShutdown 1

In this example spawn takes a block, and creates an actor that executes it every time it receives a message. As in the Scala API, spawn starts the actor as well as creating it.

But every Object is an Actor!

Alan Kay, the inventor of Smalltak and of the term OO, once said:

I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea. The big idea is “messaging”.

This is one of the reasons that Erlang with its actors form an object oriented language[3]

If we look into the resemblance of sendOneWay and the reflective method invocation, which in Ruby is made through send or __send__, it is quite easy to adapt the ruby method invocation to Actor message sending. We start with a simple delegator:

MethodParameters = Struct.new :name, :args, :block

class DelegatorActor < Base
    def self.new(target)
      ret = super()
      ret.instance_variable_set(:@target, target)
      return ret
    end

    def onReceive(message)
      param = message
      @target.__send__ param.name, *param.args, &param.block
    end
  end

The important part is the onRecieve message, which takes a MethodParameters object and invoke on the target. The caveat here is that we need to override the new method, because, for some reason, ruby subclasses of Akka UntypedActors will not invoke the initialize method with the arguments passed[4]. However, by turning any object into an actor, this can be the only place such hack is needed.

Now, the next step: adapting the actorRefs to make the ruby method invocation a actor sendOneWay:

class ActorRefHandler
    public_instance_methods.each do |m|
      undef_method m unless m =~ /^__/ or m == 'to_s'
    end

    def initialize(actorRef)
      @actorRef = actorRef
    end

    def method_missing(name, *args, &block)
      @actorRef.sendOneWay MethodParameters.new name, args, block
    end
  end

Which is a pretty standard implementation of message forwarding in ruby: remove all instance methods (except the really private ones, such as __send__), making sure all method calls are forwarded to method_missing.

With all of this we can write a simple example of making any object an actor:

require 'akka'
class HelloWord
  def hi
    puts "hello actor world"
  end
end
Actors.actorOf(HelloWord.new).hi
Actors.delayedShutdown 1

Making it faster

In the heart of all of this lies the problem: making code runs faster by using the machine’s cores more effectively. Here we build the good old canonical map-reduce example: word count. We will count 5.4 MB of Shakespeare‘s texts. The example consists of 3 types of actors: a producer, mappers, and one reducer. The producer generates the chunks of lines to the mappers, which count the words on each chunk and generate a hash of word:count pairs, which the reducer aggregates into a hash of its own.

require 'akka'
require 'regular_word_count'
include Actors
module AkkaDispatcher
  include_package 'se.scalablesolutions.akka.dispatch'
  def self.workStealer(name)
    Dispatchers.newExecutorBasedEventDrivenWorkStealingDispatcher 'mappers'
  end
end

file = File.join(File.dirname(__FILE__), 'shakespeare.txt')
input = IO.readlines(file).each_slice(500).map &:join

This code setups up the code to later define WorkStealer, so that our map actors can share the same message queue. We also load the file in memory, split into 500 lines chunks. If the chunks are too small, the mappers will receive too many messages, which makes the code go slow. If the chunks are too big, the job will not be split evenly among the mappers[5].

start = nil
values = Hash.new 0
linesToRead = input.size
reduceActor = actor do |message|
  linesToRead -= 1
  hash = message
  hash.each do |key, value|
    values[key] += value
  end
  if linesToRead == 0
    puts ">> All over: Just to say we used any computed value: #{values['shakespeare']}"
    finish = Time.now
    puts ">> Total time: #{finish - start}s"
    Akka::ActorRegistry.shutdownAll()
  end
end

The reducer actor is pretty straightforward. When all chunks are read, he shutdowns all actors and outputs the result and the time it took for the whole map-reduce chain to take place.

mapActorsSize = 2
mapActors = []
wordCount = WordCount.new
workStealer = AkkaDispatcher.workStealer 'mappers'
mapActorsSize.times do
  mapActor = actor do |message|
    reduceActor.sendOneWay wordCount.count message
  end
  mapActor.setDispatcher workStealer
  mapActors.push mapActor
end

The mappers delegate the actual work to an immutable WordCount class. The important part is the one that sets the same dispatcher on the actors. More on how this work on Akka’s documentation.

mapActor = mapActors.first
producer = actor do |message|
  for line in input
    mapActor.sendOneWay line
  end
end

allActors = [reduceActor, producer] + mapActors
allActors.each do |a|
  a.start
end
start = Time.now
producer.sendOneWay :start

These lines define the producer, start all actors, set the start time, and send the producer actor a message, which begins the map-reduce chain. It is important to note that the program’s main thread finishes on the last line. This example shows how It is possible to make it wait for the result and then resume the main thread (it requires using the other types of message sending methods, thus I’ll not cover it in detail).

Results: The sequential version runs on my machine (which has 2 cores) in about 4 seconds. This one with map-reduce actors take about 3 seconds, which yields a 25% improvement[6].

Conclusion

This post showed how it is easy to use Akka actors with JRuby and that they can easily enable thread-safe and easy to reason multicore programming. The Akka project has many other tools to help with distributed/parallel programming, such as remote actors, software transactional memory, and integrations with all sorts of persistence/queue systems. This post barely scratches the surface.

All the code on this blog post can be found on github, where all dependencies are easily available, and instructions on how to easily run the code. Give it a try, and see if you agree (or not) with others that writing parallel code can be much easier and fun.

Footnotes

[1] As Daniel Spiewak showed it on his Integrating Scala into JRuby post.

[2] Incidentally Akka was started by Jonas Bonér, who is also one of the creators of the java AOP tool AspectWerkz, which is included in Aspectj nowadays.

[3] From a interview with Joe Armstrong, the creator of Erlang:

Actually it’s a kind of 180 degree turn because I wrote a blog article that said “Why object-oriented programming is silly” or “Why it sucks”. I wrote that years ago and I sort of believed that for years. Then, my thesis supervisor, Seif Haridi, stopped me one day and he said “You’re wrong! Erlang is object oriented!”

[4] In general, you need to create a UntypedActorFactory to pass arguments to the constructor, which we already do (implicitly, using JRuby’s closure to interface coercion), but even then Ruby’s Actors will not work. This could be worked around by changing the Actors module and invoking another hook method that is not initialize.

[5] Thanks to the Akka committers Viktor Klang and Peter Veentjer for the hint.

[6] This is not a real benchmark. Making a real one requires a lot more attention to details like jvm warm-up, jiting from ruby to java, jiting on java bytecode, and so on.

Twitter Updates

  • RT @c_pruett: I spent all evening playing pre-release games that people sent me for feedback. Apparently they believe I know what I am ... 20 hours ago
  • RT @d6: Some people, when confronted with a problem, think "I know, I'll use multithreading". Nothhw tpe yawrve o oblems. 6 days ago
  • RT @CompSciFact: "Sometimes, the elegant implementation is a function. Not a method. Not a class. Not a framework. Just a function." -- ... 6 days ago
  • RT @pvblivs: A company should have working mechanisms to improve management and culture. If it doesn't political friction will be it's b ... 1 week ago
  • RT @olafjacobi: no - a business model of a German startup isn't proven just because of a huge investment round of a similar startup in t ... 1 week ago
Follow

Get every new post delivered to your Inbox.