Python 3.3 enabled hash randomization by default. This is a good thing because it stops us, developers, from relying on a given hash being the same most of the time, as was the case with Python 2 and with Python 3 before 3.3. That “most of the time” has caused headaches for some developers, for example when running doctests on a CI server. We should always assume that dict key order is random!

For those who don’t know doctest yet, it is a way to:

  1. document a piece of code: function, method, class, module
  2. give users a quick view of how to use it
  3. ensure the documentation matches the code, since the examples are run against it

It looks like this:

def hello_world(name='world'):
    """
    >>> hello_world()
    'Hello world!'
    >>> hello_world('Alexandre')
    'Hello Alexandre!'
    """
    return 'Hello {name}!'.format(name=name)

When running the tests (for example with py.test --doctest-modules), each line starting with >>> is evaluated and its return value is compared with the expected one written just below it.
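The same check can also be triggered from Python itself with the standard doctest module. The sketch below is only an illustration: run_examples is a hypothetical helper name (it is not part of doctest), and it assumes the examples live in the function's docstring.

```python
import doctest


def hello_world(name='world'):
    """
    >>> hello_world()
    'Hello world!'
    >>> hello_world('Alexandre')
    'Hello Alexandre!'
    """
    return 'Hello {name}!'.format(name=name)


def run_examples(func):
    """Run the examples found in func's docstring (hypothetical helper).

    Returns a (failed, attempted) tuple.
    """
    # Parse the docstring into a DocTest, exposing func under its own name.
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    # Run every example and compare its output with the expected text.
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


failed, attempted = run_examples(hello_world)
```

Here both examples are attempted and none should fail; a mismatch between the code and its documentation would be reported on standard output.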

Doctest is not intended for huge test cases (you should split those anyway), but it gives a quick overview of a piece of code. See the official documentation for more information on usage, and don’t miss the available flags such as ELLIPSIS or NORMALIZE_WHITESPACE, which are very handy.
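To give an idea of what these two flags do, here is a minimal sketch. The functions make_object, first_numbers and count_failures are hypothetical names made up for the illustration; only the flags and the doctest classes come from the standard library.

```python
import doctest


def make_object():
    """
    >>> make_object()  # doctest: +ELLIPSIS
    <object object at 0x...>
    """
    # The memory address changes on every run, so "..." stands in for it.
    return object()


def first_numbers():
    """
    >>> first_numbers()  # doctest: +NORMALIZE_WHITESPACE
    [0, 1, 2,
     3, 4, 5]
    """
    # The expected output is wrapped on two lines; runs of whitespace
    # (including the newline) are treated as equal to a single space.
    return list(range(6))


def count_failures(func):
    """Return how many docstring examples of func fail (hypothetical helper)."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    return doctest.DocTestRunner(verbose=False).run(test).failed


failures = count_failures(make_object) + count_failures(first_numbers)
```

Without the directives, both examples would fail: the first because of the address, the second because of the line break in the expected output.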

As described in this quick doctest introduction, the result of the evaluated call is compared to the expected one. The problem is that dict keys will be returned in a random order. A solution could be to compare the sorted keys and then the sorted values, but we lose some readability.
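As a rough sketch of that sorted-keys-then-sorted-values approach (get_settings is a hypothetical function used only for illustration):

```python
def get_settings():
    """Return a small dict (hypothetical function used for illustration).

    >>> settings = get_settings()
    >>> sorted(settings.keys())
    ['bar', 'foo']
    >>> sorted(settings.values())
    [1, 2]
    """
    return {'foo': 1, 'bar': 2}


keys = sorted(get_settings().keys())
values = sorted(get_settings().values())
```

The output is stable, but the reader can no longer see which value belongs to which key, and the example needs three lines instead of one.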

Here comes pprint, a pretty-print function from the standard library. Its main goal is to print a nice representation of an object (indent nested lists, align dict keys, etc.), but it also sorts dict keys, which is what we want for dicts in doctests.

>>> from pprint import pprint as _
>>> _({'foo': 1, 'bar': 2})
{'bar': 2, 'foo': 1}
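Used inside a docstring, this could look like the sketch below. get_settings is again a hypothetical function; pformat is the standard library sibling of pprint that returns the formatted text instead of printing it, which makes the sorting easy to check, including for a nested dict.

```python
from pprint import pformat


def get_settings():
    """Return a nested dict (hypothetical function used for illustration).

    >>> from pprint import pprint
    >>> pprint(get_settings())
    {'bar': [3, 4], 'foo': {'a': 2, 'b': 1}}
    """
    return {'foo': {'b': 1, 'a': 2}, 'bar': [3, 4]}


# pformat returns the text that pprint would print, without the final newline.
formatted = pformat(get_settings())
```

Keys are sorted at every nesting level, so the expected output stays the same whatever order the dict was built in.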

Wrapping your dict with pprint ensures a stable key order without losing readability, and with a performance impact that stays small in absolute terms for a test suite:

In [1]: from pprint import pprint as _
In [2]: %timeit _({'foo': [[1, 2], [3, 4]], 'bar': [[5, 6], [7, 8]]})
1000 loops, best of 3: 396 µs per loop
In [3]: %timeit {'foo': [[1, 2], [3, 4]], 'bar': [[5, 6], [7, 8]]}
1000000 loops, best of 3: 1.22 µs per loop
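Outside IPython, a similar measurement can be sketched with the standard timeit module. Note the assumptions: this version uses pformat instead of pprint so that writing to the terminal is not part of the measurement, build and average_seconds are hypothetical helper names, and the numbers will differ from the ones above depending on the machine and Python version.

```python
import timeit
from pprint import pformat


def build():
    """Build the same dict as in the IPython session."""
    return {'foo': [[1, 2], [3, 4]], 'bar': [[5, 6], [7, 8]]}


def average_seconds(func, number=1000):
    """Average duration of one call to func over `number` runs (hypothetical helper)."""
    return timeit.timeit(func, number=number) / number


plain = average_seconds(build)
formatted = average_seconds(lambda: pformat(build()))
print('dict only: {:.2e} s, with pformat: {:.2e} s'.format(plain, formatted))
```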