Not using iteritems() to iterate over a large dictionary in Python 2

PEP 234 defines iteration interface for objects. It also states it has significant impact on performance of dict iteration.

Note

This anti-pattern only applies to Python versions 2.x. In Python 3.x items() returns an iterator (consequently, iteritems() and Python 2’s iterative range() function, xrange(), have been removed from Python 3.x).

Anti-pattern

The code below defines one large dictionary (created with dictionary comprehension) that generates large amounts of data. When using items() method, the iteration needs to be completed and stored in-memory before for loop can begin iterating. The prefered way is to use iteritems. This uses (~1.6GB).

d = {i: i * 2 for i in xrange(10000000)}

# Slow and memory hungry.
for key, value in d.items():
    print("{0} = {1}".format(key, value))

Best-practice

Use iteritems() to iterate over large dictionary

The updated code below uses iteritems() instead of items() method. Note how the code is exactly the same, but memory usage is 50% less (~800MB). This is the preferred way to iterate over large dictionaries.

d = {i: i * 2 for i in xrange(10000000)}

# Memory efficient.
for key, value in d.iteritems():
    print("{0} = {1}".format(key, value))

References