Metaphysical Developer

RubyUnderscore: A bit of Arc and Scala in Ruby

Posted in Languages, Software Development by Daniel Ribeiro on October 31, 2010

A few months ago I mentioned that one thing that bothered me in Ruby was “No way to create simple blocks”. This is in contrast to other languages, such as Scala, Clojure and Groovy, with their underscore, percent and “it” shortcut notations, respectively. There are other languages with equivalent mechanisms as well. Even newer languages like CoffeeScript have considered adding it. As James Iry mentioned, such constructs are in fact related to delimited continuations.

However, Ruby has syntax tree manipulation (via the ParseTree gem). Using it, I created the RubyUnderscore project, which brings this notation to Ruby, using the underscore symbol (just like Scala and Arc). With it, it is possible to refactor the following:

    classes.reject { |c| c.subclasses.include?(Enumerable) }
    dates.select { |d| d.greater_than(old_date) }
    collection.map { |x| x.invoke }

into:

    classes.reject _.subclasses.include? Enumerable
    dates.select _.greater_than old_date
    collection.map _.invoke

The last case can also use symbol-to-proc coercion (prepending & to a symbol):

    collection.map &:invoke

However, symbol-to-proc coercion is not flexible enough to allow arguments or to invoke a method chain, while the underscore notation handles both. I think this brings a small increase in readability and code quality, not to mention that making closures easier to declare encourages their use. I find this to be a good thing, especially when you start to refactor your loops into maps, selects, rejects, group_bys and reduces.
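The limits of symbol-to-proc coercion can be checked in plain Ruby, without RubyUnderscore. This is a minimal sketch; the `words` array is made up for illustration.

```ruby
words = ["apple", "banana", "cherry"]

# Symbol-to-proc covers a single method call without arguments.
upcased = words.map(&:upcase)

# A method chain needs an explicit block...
chained = words.map { |w| w.upcase.reverse }

# ...and so does a call that takes an argument.
with_an = words.select { |w| w.include?("an") }

p upcased   # ["APPLE", "BANANA", "CHERRY"]
p chained   # ["ELPPA", "ANANAB", "YRREHC"]
p with_an   # ["banana"]
```

The underscore notation aims to make the last two cases as short as the first one.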

This also highlights another issue I mentioned in Improving Ruby: that syntax tree manipulation is too important to be supported only on MRI, and not across other implementations such as JRuby and Rubinius. Python has this built into its standard library (through the ast module and inspect.getsource), and Lisp can also do syntax tree manipulation with its macro system. The importance of such a capability was mentioned by Paul Graham (one of the creators of Arc, which is a dialect of Lisp):

Letting people rewrite your language is a good idea. You, as the language designer, can’t possibly anticipate all the things programmers are going to want to do with it. To the extent they can rewrite the language, you don’t have to.

However, syntax tree manipulation in Ruby is not only unsupported in most implementations, but it is also poorly documented (even though the post on the subject by PostRank’s founder, Ilya Grigorik, is a very good introduction) and a bit awkward to use: the visitor from the sexp-processor gem embraces side effects (mutating the tree nodes while processing them) and the tree nodes are just arrays, unlike Python’s ast module, where there is a class for every node type. It is important to note that if you are willing to pre-process your Ruby code, you can use ruby2ruby to generate equivalent, regular Ruby code, which will work on every implementation.
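To make the "nodes are just arrays" point concrete, here is a small sketch of a side-effect-free rewrite over array-based nodes. The node layout is an assumption made for illustration and is not ParseTree's exact output; `replace_underscore` is a hypothetical helper, not part of RubyUnderscore or sexp-processor.

```ruby
# Assumed node layout (illustrative only):
#   [:call, receiver, method_name, *args]
#   [:underscore]      placeholder for the implicit block argument
#   [:lvar, name]      local variable reference

# Returns a NEW tree in which every [:underscore] node is replaced by
# [:lvar, name]. The input tree is not mutated.
def replace_underscore(node, name)
  return node unless node.is_a?(Array)
  return [:lvar, name] if node == [:underscore]
  node.map { |child| replace_underscore(child, name) }
end

# Roughly the shape of: _.greater_than(old_date)
tree = [:call, [:underscore], :greater_than, [:lvar, :old_date]]
rewritten = replace_underscore(tree, :d)

p rewritten  # [:call, [:lvar, :d], :greater_than, [:lvar, :old_date]]
p tree       # unchanged: [:call, [:underscore], :greater_than, [:lvar, :old_date]]
```

With plain arrays, nothing but convention says which position holds the receiver or the method name, which is where a class per node type would help.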

These shortcomings are expected to be fixed as more people realize the gains syntax tree manipulation brings, and as the improvements find their way into YARV and eventually other Ruby implementations. Ruby is a very nice, clean, productive and elegant language, and it would be a shame if we stopped making it even better.


Closures, Collections and some Functional Programming

Posted in Languages by Daniel Ribeiro on May 2, 2009

Collection libraries are quite common today in almost all languages, and we are very glad that every non-trivial piece of software does not require us to define again what a List is, or what a Tree is. Also, we do not have to keep on reimplementing basic algorithms such as sort. Some languages even provide literal ways to define the most commonly useful collections, such as lists and maps.

Closures (also known as blocks and lambda expressions) are also common in several languages (Java and C++ being notable exceptions), even though they only started becoming more mainstream with Ruby. Besides being useful for creating DSLs, they allow us to easily define callback procedures in GUIs, declare transactions, readily craft simple implementations of the Visitor and Command patterns, enable controlled lazy evaluation in languages that do not natively support it, refactor complex switch statements into a simple map whose values are closures, and so on.
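As a small illustration of the switch-to-map refactoring, here is a sketch in Ruby. The `OPERATIONS` table and the `calculate` method are hypothetical names chosen for this example.

```ruby
# Each key maps to a closure, replacing one branch of a case/switch statement.
OPERATIONS = {
  "add" => lambda { |a, b| a + b },
  "sub" => lambda { |a, b| a - b },
  "mul" => lambda { |a, b| a * b }
}

def calculate(op, a, b)
  # Hash#fetch runs the block when the key is missing, which plays the
  # role of the "default" branch.
  operation = OPERATIONS.fetch(op) { |key| raise ArgumentError, "unknown operation: #{key}" }
  operation.call(a, b)
end

p calculate("add", 2, 3)  # 5
p calculate("mul", 2, 3)  # 6
```

Adding a new case then means adding one entry to the table instead of editing a growing conditional.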

Even though closures and collections are useful on their own, when combined they allow us to abstract iteration mechanisms. Smalltalk is notable for blending closures and collections in a category of methods that form the enumeration protocol. For instance (examples in Ruby, which also features an enumeration protocol similar to Smalltalk’s), we can rewrite this:

novels = []
for b in books
  if b.novel?
    novels << b
  end
end

Into this:

novels = books.select {|b| b.novel?}

This way the code gets clearer, more concise and simpler, while encapsulating the actual iteration algorithm, which allows us to reuse the very same filter whether the books are coming from a list, from a file, from a database, or from a SOAP request. Not only that, but it allows us to refactor several filters of books into one-line methods:

class Books
  def novels()
    return @books.select {|b| b.novel?}
  end

  def older_than(year)
    return @books.select {|b| b.age > year}
  end

  def name_starts_with(str)
    return @books.select {|b| b.name[0, 1] == str}
  end
end
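The claim that the same filter works regardless of where the books come from can be sketched as follows. `Book` and `BookLines` are hypothetical classes written for this example; `BookLines` stands in for a file-like source by parsing lines of text.

```ruby
# Hypothetical book type with the novel? predicate used in the examples above.
Book = Struct.new(:name, :novel) do
  def novel?
    novel
  end
end

# Hypothetical source that builds books from "name,kind" lines.
# Defining each and including Enumerable is all that select needs.
class BookLines
  include Enumerable

  def initialize(lines)
    @lines = lines
  end

  def each
    @lines.each do |line|
      name, kind = line.strip.split(",")
      yield Book.new(name, kind == "novel")
    end
  end
end

novel_filter = lambda { |b| b.novel? }

from_array = [Book.new("Dune", true), Book.new("SICP", false)]
from_lines = BookLines.new(["Dune,novel", "SICP,textbook"])

# The very same filter is applied to both sources.
p from_array.select(&novel_filter).map(&:name)  # ["Dune"]
p from_lines.select(&novel_filter).map(&:name)  # ["Dune"]
```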

But abstract iteration mechanisms are more than just filtering. Common enumeration protocols (such as those from Ruby and Smalltalk) feature several methods to detect elements, to map elements into new collections with a function, to sort elements, to get the maximum and minimum elements (using a closure as the definition of comparison) and so on. Many of these are inspired by functions originally defined in functional programming languages, such as fold and map.
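A few of these methods, as found in Ruby's Enumerable, are shown below on a made-up array of words (inject is Ruby's name for fold):

```ruby
words = ["pear", "fig", "banana"]

first_long = words.detect { |w| w.length > 3 }                  # first match
lengths    = words.map { |w| w.length }                         # transform each element
sorted     = words.sort_by { |w| w.length }                     # sort by a computed key
longest    = words.max { |a, b| a.length <=> b.length }         # closure defines the comparison
shortest   = words.min { |a, b| a.length <=> b.length }
total      = words.inject(0) { |sum, w| sum + w.length }        # fold

p first_long  # "pear"
p lengths     # [4, 3, 6]
p sorted      # ["fig", "pear", "banana"]
p longest     # "banana"
p shortest    # "fig"
p total       # 13
```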

The encapsulation of the iteration mechanism is also useful when we want to start using coarse-grained parallelism, such as that proposed by Fork Join (which, ironically, is written in a language that does not have closures). That is: you have sequential code, and with very little alteration you can get parallel or distributed code (while being cautious of side effects). In our example, Books#novels could very well be selecting the novels in separate threads and joining the results together. It would only be a matter of changing the implementation of @books.
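A rough sketch of that idea, using one thread per slice of the collection. `parallel_select` is a hypothetical helper, not Fork Join and not part of any library; note that on MRI the global interpreter lock limits the speedup for CPU-bound predicates, so this mostly illustrates the shape of the change.

```ruby
# Hypothetical helper: filters each slice in its own thread, then joins
# the partial results in their original order.
def parallel_select(items, slices = 4, &predicate)
  return [] if items.empty?
  slice_size = (items.size / slices.to_f).ceil
  items.each_slice(slice_size)
       .map { |slice| Thread.new { slice.select(&predicate) } }
       .flat_map(&:value)  # Thread#value waits for the thread and returns its result
end

evens = parallel_select((1..10).to_a) { |n| n.even? }
p evens  # [2, 4, 6, 8, 10]
```

Callers still pass a plain block, exactly as they would to select, which is why the switch can stay hidden behind the collection. The predicate must be free of side effects for the result to be reliable.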

Concluding: Closures and Collections allow us to improve code by:

  • making it more concise
  • making it clearer
  • making it more suitable to refactoring
  • making sure you Don’t Repeat Yourself
  • enabling you to easily turn sequential code into parallel

Considering that all of these points are too good to let go, and considering that Java 7 will not get closures and that anonymous inner classes are not only not really closures but also too verbose, I started a little project called Fluent Java. This project, among other things, brings the enumeration protocol to Java, while still keeping it terse and easy to use (Fork Join integration is planned as well).

Edit: Java might get simplified closures after all (at least, for now, it is planned). But it is still too early to say, given that this was already pulled once, and Oracle has some issues to work through with Sun, which may push JDK7 to being released early.

Edit (22 Sep 2010): Yes, it got pulled. From the Chief Architect of the Java Platform Group: “It’s time for … Plan B”. Scala has them though…