Fast Python module for typed bytes

Over the past few days, I spent some time implementing a typed bytes Python module in C. It's probably not quite ready for production use yet, and it still falls back to the pure Python module for floats, but it seems to work fine and already leads to substantial speedups.

For example, the Python program

from typedbytes import Output
Output(open("test.tb", "wb")).writes(xrange(10**7))

needs 18.8 secs to finish on this laptop, whereas it requires only 0.9 secs after replacing typedbytes with ctypedbytes. Similarly, the running time for

from typedbytes import Input
for item in Input(open("test.tb", "rb")).reads(): pass

can be reduced from 22.9 to merely 1.7 secs by using ctypedbytes instead of typedbytes.
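To repeat this kind of measurement on your own machine, a small harness along the following lines might help. It is only a sketch: the helper names `load_typedbytes` and `time_roundtrip` are made up for illustration, it assumes ctypedbytes exposes the same `Output`/`Input` classes as typedbytes (as the snippets above suggest), and it uses a smaller item count than the 10**7 used above.

```python
import importlib
import os
import tempfile
import time


def load_typedbytes():
    """Return ctypedbytes if importable, else typedbytes, else None.

    Hypothetical helper: prefers the C module and falls back to pure Python.
    """
    for name in ("ctypedbytes", "typedbytes"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def time_roundtrip(tb, n, path):
    """Time writing n integers with tb.Output and reading them back with tb.Input.

    Returns (write_seconds, read_seconds, items_read).
    """
    start = time.time()
    with open(path, "wb") as f:
        tb.Output(f).writes(range(n))
    write_secs = time.time() - start

    start = time.time()
    count = 0
    with open(path, "rb") as f:
        for _ in tb.Input(f).reads():
            count += 1
    read_secs = time.time() - start
    return write_secs, read_secs, count


if __name__ == "__main__":
    tb = load_typedbytes()
    if tb is None:
        print("neither ctypedbytes nor typedbytes is installed")
    else:
        path = os.path.join(tempfile.mkdtemp(), "test.tb")
        w, r, count = time_roundtrip(tb, 10**5, path)
        print("%s: wrote in %.2fs, read %d items in %.2fs"
              % (tb.__name__, w, count, r))
```

Passing the module in as an argument makes it easy to time typedbytes and ctypedbytes back to back on the same file.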

Obviously, Dumbo programs can benefit from this faster typed bytes module as well, but the gains probably won’t be as spectacular as for the simple test programs above. To give it a go, make sure you’re using the latest version of Dumbo, build an egg for the ctypedbytes module, and add the following option to your start command:

-libegg <path to ctypedbytes egg>
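As far as I can tell, the Dumbo program itself doesn't have to change for this; only the start command gets the extra option. As a rough illustration, here is the usual word count example, assuming a standard Dumbo setup (the file name `wordcount.py` used below is just an example):

```python
def mapper(key, value):
    # Emit (word, 1) for every whitespace-separated word in the input line.
    for word in value.split():
        yield word, 1


def reducer(key, values):
    # Sum the counts emitted for one word.
    yield key, sum(values)


if __name__ == "__main__":
    try:
        import dumbo
    except ImportError:
        print("dumbo is not installed; nothing to run")
    else:
        dumbo.run(mapper, reducer)
```

The start command would then look something like `dumbo start wordcount.py -input <input path> -output <output path> -libegg <path to ctypedbytes egg>`, plus whatever other options (such as `-hadoop`) your setup already needs.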

From what I've seen so far, using ctypedbytes can speed up Dumbo programs by 30%, which definitely makes it worth the effort if you ask me. In fact, the Dumbo program would now probably beat the Java program in the benchmark discussed here, but, unfortunately, this wouldn't be a very fair comparison. Johan recently pointed out to me that it's better to avoid Java's split() method for strings when you don't need regular expression support, and using a combination of substring() and indexOf() instead seems to make the Java program about 40% faster. So we're not quite as fast as Java yet, but at least the gap has narrowed some more.


