April 5, 2009

It looks like my complaining might’ve paid off: HADOOP-5450 got committed on Friday, which means Hadoop 0.21 won’t require any patching to make Dumbo work. Having to apply a few patches is far from the end of the world, but it can still be a show-stopper for some people, and using Dumbo on Cloudera’s distribution or Amazon’s Elastic MapReduce might only become feasible once Hadoop supports it “out of the box”.

I didn’t mean to suggest that Hadoop is a badly organized open source project or anything like that, by the way. On the contrary, it’s far better organized than many of the other projects I’m familiar with. The only message I wanted to get across is that it would make sense to look for ways to get patches reviewed and committed more quickly. I heard some rumours about organizing commit fests, for instance, which sounds like a great potential solution to me.