Unfortunately, the list of Hadoop patches required to make Dumbo work properly just got a bit longer: I traced a strange encoding bug back to an issue in Streaming’s typed bytes code. You might therefore want to apply the MAPREDUCE-764 patch to your Hadoop build if you use Dumbo, even though the bug only causes problems in very specific cases and usually isn’t hard to work around. Hopefully this patch will make it into Hadoop 0.21.
This isn’t all bad news, however. The encoding bug was initially reported on the dumbo-user mailing list, which apparently has 12 subscribers already and is starting to attract fairly regular traffic. To be honest, I haven’t promoted this mailing list much so far and never really expected people to actually start using it, but obviously I was wrong. Everyone who reads this blog should consider subscribing; I’m sure you won’t regret it!