Metaphysical Developer

Insights on Parallel and Distributed Systems

Posted in Software Development by Daniel Ribeiro on August 31, 2009

At least two trends are making paralell and distributed programming come to focus: computers with multiple cores getting cheaper and getting more cores, and websites leveraging terabytes worth of of user content. There are several services, tools and programming models around to help people cope with such trends, such as Amazon EC2, Hadoop, Fork Join, Actor Models, Non relational Databases, and so on. A couple of things to bear in mind when using multiple cores or multiple computers (with or without these tools):

  • Amdahl’s law: It states that, with a fixed problem size, there is a point for every program after which adding more computational units do not give you improved performance. By computational units, you can take either cores (for paralell programming) or hosts (for distributed programming). Gustafson’s law tackles the issue when you do not fix the problem size, but this does not really help when you need faster response for your current problem size.
  • Brewer’s CAP Theorem: This has to do with distributed programming only, and many people said a lot of about this. But essentially it means that a distributed system can only have at most two out of the three following properties: consistency, availability and partition tolerance (a kind of fault tolerance).

Amdahl’s law is quite troublesome: you cannot really do anything about it, but changing the algorithms involved. But CAP Theorem allows you to trade off one property for another. You can relax consistency into eventual consistency, you can relax on fault tolerance against partition tolerance, or you may live with less nines of up-time. It all depends on your application’s profile which one you will have to abandon. Dealing with this means no longer looking for ACID (Atomicity, Consistency, Isolation, Durability) systems, where consistency is very important, but looking for BASE (Basic available, soft-state or scalable, and eventually consistent) systems. It means listening to Gregor Hohpe’s suggestion and accepting that Two Phase Commit may not be the right way to go in detriment to pursuing the new ACID properties (Associative, Commutative, Idempotent and Distributed).

Not every system requires looking into these trends and thinking about such limitations and trade-offs. But if yours does, then keeping these in mind might come in handy.


2 Responses

Subscribe to comments with RSS.

  1. Diego said, on September 1, 2009 at 2:47 pm

    Very intersting subject.
    Couch DB book has also a nice explanation of how eventual consistency is adopted by them.

  2. Daniel Ribeiro said, on September 17, 2009 at 4:38 pm

    Actually, I was not gonna talk about this topic: I was working on different subjects. But on the last minute, it seemed to me that this topic would be far more relevant. And making all the research involved to actually make the post is actually quite a lot of work. Therefore: one post per month while I have no small topics on sight to talk about…

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: