Metaphysical Developer

The Issue with Static Typing

Posted in Languages by Daniel Ribeiro on June 30, 2010

“Shouldn’t we use Scala?” is a recurring question my peers make me. I think it is fair, since I have advocated in the past that Scala has a lot of strong points compared to Java. Furthermore, this question is usually made in contrast to dynamic languages, usually Ruby or Python.

Since I don’t want to discuss that common confusion of  strong with static typing, I’ll try to be very clear about what I’m talking about. By static type I mean that the types of all data must be known at compile time. By dynamic I mean that the types may not be known until runtime. Of course, this ignores languages that have mixed typing, such as Groovy, Objective-C and C#4.0 (as it includes the dynamic keyword), but for argument’s sake, these can be ignored as the issues of static typing will apply to these as well whenever you are using it, and will not whenever you aren’t. The common examples of languages are Smalltalk, Clojure, Erlang, Ruby, Python, Lisp, Lua, Javascript, for dynamically typed, and Java, Haskell, Scala, C#, C++ for statically typed.

Statically typed languages have as usual claimed benefits that they are faster than dynamic languages, their types provide documentation of the methods and functions, static analysis tools can be more comprehensive and yield better results, and automated refactoring is a lot easier to accomplish and can also give better results. All of these reflect more the current state of implementation and tooling of such languages than of intrinsic property of the type of the language. Steve Yegge has argued this about the speed property, but one can argue for dynamic language that its tooling could use type information from runtime sources, such as unit-testing, to yield similar results .

But some hidden complexities take place when using static typing, which may overthrow any possible benefits coming from it:

  • The biggest one is type coupling. For instance, in dynamic languages, renaming an interface is trivial as most interfaces are not even declared, just documented (such as “this must implement less than”). Structural typing can ease this, so can type classes. However, even languages that support these can have problems with other refactorings, such as adding methods to an interface. Paul Graham comments about this a little (on the essay Hackers and Painters):

Everyone by now presumably knows about the danger of premature optimization. I think we should be just as worried about premature design– deciding too early what a program should do.

The right tools can help us avoid this danger. A good programming language should, like oil paint, make it easy to change your mind. Dynamic typing is a win here because you don’t have to commit to specific data representations up front. But the key to flexibility, I think, is to make the language very abstract.

  • Some DSLs can’t be built. A common example is the XML DSL. This is because you can’t have in a pure statically typed language a type that accepts any single method call, returning any single possible return value, given any amount of any arguments of any type. In fact, if you do have this type, you in fact have the dynamic type from C#. Dynamic languages usually support this through a mechanism called method lookup alteration and interception, such as those provided by Smalltalk’s doesNotUnderstand, Ruby’s missing_method and Python’s __gettattr__ method.
  • Natural complexity. By this I mean that every statically typed languages is a proper superset of a dynamic language of itself. This is easy to see from the theory behind types, as the untyped lambda calculus is just the typed lambda calculus with one type (the brave ones can find more about this on Physics, Topology, Logic and Computation A Rosetta Stone).
  • False sense of safety. The types do not guarantee that the implementations maintain the invariants of a type (such as those described by Baraba Liskov). The common example is a comparator interface in Java. Just because a class implements the interface, it doesn’t mean that the compare method is transitive, as required by the documentation. James Iry recently commented on languages with more complex type systems (he actually said a lot more about type systems in general), that contain theorem provers in their compilers, such Agda and Epigram, which can solve this issue. However such languages have other limitations, such as not being Turing Complete. This essentially means that testing practices are still as needed as for static languages (even if you have theorem provers, you cannot be sure without some form of acceptance tests that the problem you are trying to solve is actually the one the customer had in his mind).
  • Type variances. This is actually two problems: if your languages does not support them, you have to fall back to type casting, which actually violates type-saftey and static typing in general. If your language supports it, you have to know one more concept, and when to apply it. It may not be always clear when a type should be co-variant or contravariant. This is further complicated by the fact that types with side effects (such as mutable objects) have special rules about this (further about this can be found under Variance of Mutable Types, from Programming Scala). Also this concept is so common that functions are naturally covariant on the return value and contravariant on the parameters. Scala’s two argument function documentation shows this explicitly.

The last two issues are very diminished for languages without subtyping, but these have deeper problems as, in order to relinquish subtyping, you also have to give up on any possibility of code reuse possibility from ad-hoc polymorphism.

It might seem that some points were left out, but these are usually the problem of a particular language, which are commonly and mistakenly considered to be a problem of static typing in general:

  • Lack of metaprogramming support.  Of course C++ has templates, which means that most people know that this is false for all statically typed languages. However, a variant of this issue is: type safe safe metaprogramming. This is also not true. Haskell, for instance, has a type-safe macro system. Note that you cannot make type-safe runtime metaprogramming in general. For instance: even though some languages allow you to create interfaces that do not exist on compile time, the only way to invoke methods from these is through non type-safe ways (such as reflection).
  • Verbosity. Scala is the canonical counter-example (Scala can be as terse as Clojure in some ways). The more type-inference you have, the less type annotation you have to write. This doesn’t necessarily make reading it easier, but IDEs can help here. On the other hand, it will never be harder to read than dynamic languages.

Going back to the question that originated this discussion: static types can come in handy, and they do have better tools these days. But they do bring complexities way beyond having to type a few extra characters. And these should not be taken lightly when considering to express code in a statically typed language.

Improving Ruby

Posted in Languages by Daniel Ribeiro on March 31, 2010

Some may say Ruby is a bad rip-off of Lisp or Smalltalk, and I admit that. But it is nicer to ordinary people.

Yukihiro “Matz” Matsumoto, LL2

When I heard the Smalltalk traits of Ruby, I was intrigued. When I learned more, I enjoyed Ruby’s similarities with one of the most beautiful and powerful languages I’ve known. As I dug deeper, I enjoyed more of its wonderful metaprogramming abilities, which makes Ruby’s classes a lot more dynamic and easy to declare than Smalltalk’s. After reading this in-depth comparison of both, and Kent Beck’s article on the incompatibilities of Smalltalk’s VM implementations, I realized that I was generally more in favor of Ruby than Smalltalk (even though I fear that What Killed Smalltalk Could Kill Ruby, Too).

But Ruby is not perfect. In all fairness, its creator claims that it’s just plain impossible to design a perfect language. But I do believe it could be a bit better. People do point out some controversial rough edges, but these seem a bit trifle when compared to what really bothers me:

  • No scoped open classes. It is an issue that is actually being considered to be solved on 2.0 (and there are some branches of ruby that enable it), but, for the time being, there is no way to make the changes made by opening an existing class only apply to objects while being used on the a lexical scope. This is not the same as adding and removing the changes as, in this case, calls that fall out of scope (such as to another module or file) still see the changes. It would be nice to have both ways of open classes: scoped and not scoped.
  • Method arguments do not interleave with the method’s name (like in Smalltalk). Example: Instead of calling: File.fnmatch(‘*’, ‘/’, File::FNM_PATHNAME) you’d be to call it like: File.fn match: ‘*’ path: ‘/’ flags: File::FNM_PATHNAME. This seems weird, but it is a very powerful feature that allows method invocation to be descriptive, similar to Python’s named arguments or even Ruby’s named arguments with a hash. On the other hand it has a cleaner syntax than the former, and does not require checking hash keys as the later (the later is still useful for methods that want to receive an arbitrary list of named arguments). It would change a bit of the syntax of method_missing and how to deal with varargs and blocks for each parameter, but these can be dealt with as Scala does with its parameter lists.
  • There is no method called (). Procs (who are the obvious beneficiaries of such change ) use the method call, but one could be an alias to the other. There would be a mild ambiguity here, when you call the function that was just returned. For instance, imagine func is a method that receives no arguments and returns a function (that is: an object that implements the () method) wich also receives no arguments. Then func() could mean 1. call func, and 2. call the function returned by func. I am aware this is kinda of a sensitive topic, but the Scala approach to this issue is very simple: func and func() are the same (provided func can be called with no parenthesis). If you want to call the returned function in the same expression, use func()(). With the alias, you’d be able to call it like func.call() , func().call() or even func.call
  • No way to create simple blocks. It would be nice to have something similar to Scala’s underscore or Groovy’s it, which allows method invocations like: collection.map(get_square_of(_)) in Scala and collection.map(get_square_of(it)) in Groovy. Ruby’s symbol coercion to proc (&:method) does not really work on anything besides methods of the arguments of the block.
  • Difficulties of composing callback methods. You could be sure to always invoke super on them, and even meta-program all the classes/objects that do not do such thing. However, it is not easy to actually see which methods will be called, or even manipulate/re-prioritize the blocks of code on runtime (kinda like a Chain of Responsibility), which can be very bad, as these methods can modify a lot of behaviour throughout the Object Space.
  • The reflection API could be more thorough. For instance, you can’t get the source code of a method/class, etc (as you can in Python). You can use Parse Tree and Ruby2Ruby to do it, but Parse Tree is not portable (does not even work on ruby 1.9) and the output can be formated differently than the actual source code (which can be critical on DSLs). Also, methods added do not have information on which line of code they were added (which is less important when adding methods the recommended way: extending/including Modules), and properties created with class methods (such as those created by attr_reader, or some other libraries equivalents) can’t be discovered on runtime (they are like any other method, with no other meta-data whatsoever). Ruby also seems to be missing some helper methods, such as #metaclass.
  • No support for immutability. This is kinda nitpicking, but using recursive freeze (as noted by Dean Wampler) is not really practical (as it is really slow). Neither does it encompass immutable local variables. This is not only useful for concurrency and functional programming issues, but is also useful when writing code that is side-effect free so that it is easier to reason about.
  • The return value of a setter method (that is: one that ends with the equals symbol) is the argument, not the return value of the method. This is an issue that matters more when using immutable objects, as the only way for them to “mutate” is to return a new object. Therefore you can’t use a setter method on an immutable object, as, even if it returns a new one, the runtime will ignore the return value and set to the variable the argument that was received. On the other hand, I don’t think this can be changed without breaking a lot of existing code.

Several of this annoyances can be solved with a heavy dose of open classes, s-expressions manipulation (using Parse Tree) and meta-programming in general. Knuth has said that: Language designers also have an obligation to provide languages that encourage good style, since we all know that style is strongly influenced by the language in which it is expressed. Fully agreeing with such Sapir-Whorf-esque sentence, I feel it would be a good thing if the underlying listed solutions were built into the language itself (and supported across implementations, such as JRuby, Iron Ruby, Rubinius, Maglev), as it would not only improve, even if a little bit, the language itself, but also they way its users write code.

Update: Thanks Michael Fellinger for noting that ruby blocks are fully adherent to method definitions on 1.9, as they allow both default parameters and blocks as arguments.

Scala: The Successor to the Throne

Posted in Languages by Daniel Ribeiro on July 29, 2009

It has been a while since Java was the sole language running over a JVM. Scala is another such language which gained a lot attention recently for being used to scale Twitter‘s backend. Scala differs from most other languages that run on the JVM, such as Groovy, JRuby and Jython, as it is statically typed. This means that, similar to Java and C#, the types must be known at compile time. Scala is usually introduced as being both OO and functional. While this statement is true (and daunting, as many people are uncomfortable with the f*** word), it fails to grasp the important aspects of Scala.

Among the most direct benefits of using Scala feature:

  • Compatible with Java. Kinda obvious (as so are all the other 200+ languages over the JVM), but it is such an important feature that should not be overlooked. This means that Scala can use all Java libraries and frameworks. Which shows respect for people’s and companies investment on the technology.
  • Joint Compilation. This means that, like Groovy, Scala classes are compiled to Java classes, and therefore can be used on Java projects (even by java classes on the same project they are defined). Even if your team decides to make the complete move towards Scala, this can be useful integrating with dynamic languages via JSR 223.
  • Type Inference. If the compiler can guess the type (and it usually can), you don’t have to tell it. This allows Scala code to be as concise as dynamic languages, while still being type safe.
  • Implicit conversion allows you to achieve in a type safe way what extension methods do for C# and open classes (mostly) do for ruby. That is: add methods to types you might not have not defined yourself (such as strings, lists, integers). This is one of the features that make Scala DSL friendly.
  • Object immutability is encouraged and easy to accomplish. Scala even comes with immutable collections built-in.
  • Getters and Setters are automatically generated for you. If you don’t want them (if you only want setters for example), you have to explicitly make them private. Which is not a problem, as the common case is to want them.
  • Scala has first-order functions and implements an enumeration protocol (with the iterable trait), which helps keeping code clearer, more concise, and brings several other benefits.
  • The Actor programming model eases up the development of highly concurrent applications.
  • Exceptions don’t have to be explictly caught or thrown. It can be argued that having checked exceptions does more harm than good.

These features alone would be enough to make Scala a very interesting language, and worth being heralded as the current heir apparent to the Java throne by one of JRuby’s creator, Charles Nutter (a view somewhat shared by Neal Gafter). Or even worth of being endorsed both by Groovy’s creator, James Strachan, and by the inventor of Java, James Gosling.  Nonetheless Scala is deep, and there are several exciting advanced features that allow developers to be more productive. But learning such features before getting a good grasp the basics can be quite frustrating, more so without a good supporting literature (such as IBM‘s, Aritma‘s, Jonas Bonér‘s, Daniel Spiewak‘s, Sven Efftinge‘s, the official one, and several others). However it quite is feasible, not only encouraged, to delve into deeper concepts as you need them.

Even though Scala has academic roots (as it shows on its papers page, and some advanced concepts these tackle), Scala has been successfully used on enterprise projects, besides Twitter, such as Siemens, Électricité de France Trading and WattzOn.

Besides all the good points, Scala does have some rough edges. Even though many people are working on overcoming them, they are likely to be relevant on the short term:

  • Incipient IDE support. As Lift‘s author expressed, IDEs for Scala, while undergoing a lot of development, are not what they are for Java. There is poor refactoring support, code completion and unit test integration. Not to mention the fact that most framework support tools will not play nicely with Scala. This can also put off some newcomers, as an IDE can help people learn the language. On the other hand, Martin Folwer relativizes this IDE situation, as a language that allows you to be more productive can more than make up for the lack of sophisticated tools.
  • Joint Compilation is not supported by most IDEs as well. Again, likely to change as Scala grows in popularity.
  • Immutability on a class is not really immutability, since referring objects may not be immutable themselves. And there is no way at the moment to ensure the whole object graph is immutable.
  • Making JSR 223 work perfectly with Scala can be challenging. On the other hand, making it work good enough is quite attainable.
  • Scala doesn’t support metaprogramming. This can be worked around by combining it with dynamic languages, such as Ruby (following a polyglot programming approach), but if you are going to do heavy use of metaprogramming, than using a whole different language may be a better solution (Fan is another static type language that runs over the JVM, similar to Scala, that has metaprogramming support).
  • Frameworks that expect Java source, such as the client-side GWT, will not play nicely with Scala (note that people have made Scala work with GWT on the server-side though). However there is an ongoing project that will translate Scala into Java source.
  • The syntax and some concepts are bit different from Java, such as: inverted type declaration order, underscore being used instead of wildcards, asterisks and default values, many kinds of nothing, no static methods (you need to use singleton objects instead) and other minor things. The documentation walks through this quite nicely though, but keep in mind that it is not an automatic transition from writing Java to writing Scala code.

As Joe Armstrong said, the need for languages that allow developers to easily make use of CPUs with multiple cores will only increase as such CPUs become cheaper and gain more and more cores. Scala is quite suited for such task, while Java’s development is stuck dealing with issues that come from being widely deployed, uncertanties of how open it will be in the future and political issues with some of its main contributors. Given the situation, Scala seems to fit quite nicely the role of the successor to Java’s throne.

Literal Collections In Java

Posted in Languages by Daniel Ribeiro on May 23, 2009

Terse ways to express Literal Collections such as Maps and Lists can be useful when building an Internal DSL. Java is famous for not having any facility to craft those. It does have literal object arrays, but these are a bit awkward when nesting, which is usually what you are going to do when employing this technique. But using variable arguments functions can supplement this deficiency quite nicely.

Below I show some ways to achieve this using an useful import static from http://code.google.com/p/fluentjava/: import static org.fluentjava.FluentUtils.*. I also comapare with the original ruby examples from Martin Fowler’s DSL Book.

  • Lists and Maps in Java
  • alist("computer",
      alist("processor", map(pair("cores", 2), pair("type", "i386"))),
      alist("disk", map(pair("size", 150))),
      alist("disk", map(
       pair("size", 78),
       pair("speed", 7200),
       pair("interface", "sata")))
      );
  • Lists and Maps in Ruby
  • [:computer,
     [:processor, {:cores => 2, :type => :i386}],
     [:disk, {:size => 150}],
     [:disk, {:size => 75, :speed => 7200, :interface => :sata}]
    ]
  • Just Lists in Java
  • alist("computer",
      alist("processor", alist("cores", 2), alist("type", "i386")),
      alist("disk", alist("size", 150)),
      alist("disk", alist("size", 78), alist("speed", 7200),
        alist("interface", "sata")));
  • Lists and Pairs in Java
  • alist("computer",
     alist("processor", pair("cores", 2), pair("type", "i386")),
     alist("disk", pair("size", 150)),
     alist("disk",
      pair("size", 78), pair("speed", 7200), pair("interface", "sata"))
     );
  • Just Lists In Ruby
  • [:computer,
     [:processor,
      [:cores, 2,],
      [:type, :i386]],
     [:disk,
      [:size, 150]],
     [:disk,
      [:size, 75],
      [:speed, 7200],
      [:interface, :sata]]]
  • Using a symbol as a vararg function (Java only)
  • $("computer",
     $("processor", $("cores", 2), $("type", "i386")),
     $("disk", $("size", 150)),
     $("disk", $("size", 78), $("speed", 7200), $("interface", "sata"))
    );
    
    private FluentList<Object> $(Object... args) {
       return alist(args);
    }

From the comparison, it is easy to see that ruby is bit more appropriate for this technique (and so are other languages such as python, perl and lisp). However, some cleverly defined (or imported) functions can help Java code benefit from it as well.

Tagged with: , , , ,

Closures, Collections and some Functional Programming

Posted in Languages by Daniel Ribeiro on May 2, 2009

Collection libraries are quite common today in almost all languages, and we are very glad that every non trivial piece of software does not require us to define again what a List is, neither what a Tree is. Also, we do not have to keep on reimplementing basic algorithms such as sort. Some languages even provide literal ways to define the most commonly useful collections, such as lists and maps.

Closures (also known as blocks and lamba expressions) are also common on several languages (Java and C++ being notable exceptions), even though it only started becoming more mainstream with Ruby. Besides being useful on creating DSLs , they allow us to easily define callback procedures on GUIs, declare transactions, readily craft simple implementations of Visitors and Commands patterns, enable controlled lazy evaluation on languages that do not natively support it, refactor complex switch statements to a simple map leading to Closures, and so on.

Even though closures and collections are useful on their own, when combined they allow us to abstract iteration mechanisms. Smalltalk is notable for blending closures and collections in a category of methods that form a protocol called enumeration protocol. For instance (examples in ruby, that also features an enumeration protocol, similar to smalltalk), we can rewrite this:

novels = []
for b in books
  if b.novel?
    novels << b
  end
end

Into this:

books.select {|b| b.novel?}

This way the code gets clearer, more concise and more simple, while encapsulating the actual iteration algorithm, which allow us to reuse the very same filter whether the books are coming from a list, from a file, from a database, or from a SOAP request. Not only that, but it allows us to refactor several filters of books into one line methods:

class Books
  def novels()
    return @books.select {|b| b.novel?}
  end

  def older_than(year)
    return @books.select {|b| b.age > year}
  end

  def name_starts_with(str)
    return @books.select {|b| b.name[0, 1] == str}
  end
end

But abstract iteration mechanisms are more than just filtering. Common enumeration protocols (such as those from Ruby and Smalltalk) features several methods to detect elements, to create new maps with a function, sort elements, get the maximum and minimum elements (using a closure as a definition of comparison) and so on. Many of these are inpsired by functions orginally defined on functional programming languages, such as fold and map.

The encapsulation of iteration mechanism is also usefull when we want to start using coarse-grained parallelism, such as proposed by Fork Join (which, ironically is written in a language that does not have closures) . That is: you had a sequential code, and with very little alteration, you can get a parallel/distributed one (while being cautious of side effects). In our example: Books#novels could be very well selecting the novels in separate threads, and joining them all together. It would only be a matter of changing the implementation of @books.

Concluding: Closures and Collections allows us to improve code by:

  • making it more concise
  • making it clearer
  • making it more suitable to refactoring
  • making sure you Don't Repeat Youself
  • enabling you to easily turn sequential code into parallel

Considering that all of these points are too good to let go, and considering that java 7 will not get closures and that not only are anonymous inner classes really closures but also too verbose, I started a little project called Fluent Java. This project, among other things, brings up the enumeration protocol to java, while still keeping it terse, and easy to use (Fork Join integration is planned as well).

Edit: Java might get simplified closures after all (at least, for now, it is planned). But still too early to say regarding how this was already pulled off once, and Oracle has some issues to work through with Sun, which may push JDK7 to being released early.

Edit (22 Sep 2010): Yes, it got pulled away. From the Chief Architect of the Java Platform Group: It’s time for … Plan B. Scala has them though...

Follow

Get every new post delivered to your Inbox.