Top Posts Tagged with #python 3.3 performance

Python 3.3 dictionary's hard-to-believe performance

In my post Insights Into Python's Dictionary I talked about what goes on inside python 3.3 dictionary in terms of the underlying structure (a hash table) and a high-level view of the hashing algorithm.

Hashing Performance Using a Textbook Algorithm

Using an open addressing collision strategy described in an algorithms text book, I built a hash table, inserted the words from "A Tale of Two Cities" and plotted the mean number of collisions before each key successfully found a home. This ipython notebook shows the results. (The hash table approximately doubles its size when it becomes half full. The peaks in the plot indicate resizing.) In the end, there are 4371 unique keys and the table size grows to 16381. The average number of collisions is about 3.3.

Interlude

My goal was to do the same for a python 3.3 dictionary. But how could I instrument the code that implements the dictionary so that I can count the collisions for each key inserted? I wanted to get under the hood of the dictionary, put some probes on the wires, and expose what goes on. But how??? First, I show the results.

Hashing Performance Using Python 3.3 Dictionary

This ipython notebook shows the performance results of the python 3.3 algorithm when the same keys, in the same order, are inserted into a python 3.3 dictionary. (The table resizes when it becomes 2/3 full.) The final table size is half the size of the textbook algorithm and the average number of collisions are about 1.5. Wow.

The file cpython/Objects/dictobject.c says:

Most hash schemes depend on having a "good" hash function, in the sense of simulating randomness. Python doesn't: its most important hash functions (for strings and ints) are very regular in common cases.

[Python's non-random algorithm] gives better-than-random behavior in common cases, and that's very desirable.

This was fun but the real challenge was to figure how to probe the internal dictionary machinery. This led me to the ctypes module. A whole new world on the other side of the python/C divide.

My next post will be on my eye-opening Interlude.

#python 3.3 #python dictionary #hash table #python 3.3 performance

•18+ Adults Only

Watch Anya Live on Cam

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.

✓ Live Streaming✓ Interactive Chat✓ Private Shows✓ HD Quality✓ Free Actions

Free to watch • No registration required • HD streaming