Please Don’t Abuse repr()

repr() is useful.

Python’s repr() function is really handy, especially when you’re trying to debug issues. It gives you a straightforward way to print a representation (hence the name) of a given Python object to a console, and in many cases it also makes it easy to experiment with that data, since the result of repr() is often something that can be copy-pasted into a Python session to recreate the object.

The representation of Python primitives is straightforward – it’s just the same syntax Python normally uses for such literals. More complex objects like user-defined classes, however, are not so simple to represent – the Python interpreter has no real way to discern what internal state is important and what is not, let alone to provide a way to construct that object (short of something like pickle).
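To make the “same syntax as the literal” point concrete, here’s a small sketch. The sample values are just ones I picked for illustration, and ast.literal_eval from the standard library stands in for pasting the text back into a session:

```python
import ast

# A few built-in values whose repr() is valid Python literal syntax.
samples = [42, 3.5, "hello", [1, 2, 3], {"foo": 1, "bar": 2}, (1, "a"), None]


def round_trips(value):
    """Return True if parsing repr(value) as a literal gives back an equal value."""
    return ast.literal_eval(repr(value)) == value


results = [round_trips(value) for value in samples]
for value, ok in zip(samples, results):
    print(repr(value), "round-trips:", ok)
```

Nothing like this works for an arbitrary user-defined class unless its author writes a __repr__ that makes it work.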

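The default Python behavior for repr() applied to complex objects is to show the type of the object and the memory address of the particular instance. For an old-style Python 2 class it looks like the line below; new-style classes, and every class in Python 3, print “object” in place of “instance”:

"<__main__.Foo instance at 0xd8d998>"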

It’s not a bad default – that information can often solve a lot of the problems that you might be using repr() to debug. Python, however, provides a handy way to create a more useful output: __repr__. By overriding the __repr__ method on your class, you can customize exactly what string is returned when repr() is invoked on an instance of it. Here’s an example:

class Foo(object):
    def __init__(self, bar):
        self.bar = bar

    def __repr__(self):
        # %r shows repr(bar); the one-element tuple keeps this safe if bar is itself a tuple.
        return "<Foo bar:%r>" % (self.bar,)

If you call repr(Foo(42)), you’ll get…

"<Foo bar:42>"

which lets you easily look inside your objects to assess their state (or at least, the important parts of it) while debugging.
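It also pays off in places where you never call repr() yourself. Here’s a rough sketch using a variant of the Foo class from above (this one formats bar with %r, which is my choice for the example), showing that containers and the !r format conversion both go through __repr__:

```python
class Foo(object):
    def __init__(self, bar):
        self.bar = bar

    def __repr__(self):
        return "<Foo bar:%r>" % (self.bar,)


foos = [Foo(1), Foo("two")]

# A list builds its own repr from the repr of each element.
listing = repr(foos)

# The !r conversion in a format string calls repr() as well.
message = "processing {!r}".format(foos[0])

print(listing)  # [<Foo bar:1>, <Foo bar:'two'>]
print(message)  # processing <Foo bar:1>
```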

Until you abuse it.

What do I mean by “abusing” it? I mean pretending that something isn’t what it actually is. For instance, let’s say you implemented a linked list class, LinkedList. You might be tempted to have repr() return a simple Python list with the elements from your LinkedList, because that way it’s easy to take that set of elements and use them in another Python session.

Don’t do that. Why? Because it makes it non-obvious that the real object in question is not actually a basic Python list, but an instance of your custom class. Sure, you might know now that what you’re calling repr() on is a LinkedList, but other developers don’t – and you might not either a year from now when you’re trying to track down some bug.
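Here’s one way that could look instead. This LinkedList is a minimal, hypothetical implementation I wrote for this example (the _Node helper and the constructor signature are assumptions, not any particular library’s API); the only part that matters is that __repr__ names the class:

```python
class LinkedList(object):
    """Minimal singly linked list; a hypothetical sketch for illustration only."""

    class _Node(object):
        def __init__(self, value, next_node=None):
            self.value = value
            self.next_node = next_node

    def __init__(self, values=()):
        self._head = None
        # Build back to front so iteration yields the original order.
        for value in reversed(list(values)):
            self._head = self._Node(value, self._head)

    def __iter__(self):
        node = self._head
        while node is not None:
            yield node.value
            node = node.next_node

    def __repr__(self):
        # Name the class, so nobody mistakes this for a built-in list.
        return "LinkedList(%r)" % (list(self),)


items = LinkedList([1, 2, 3])
print(repr(items))  # LinkedList([1, 2, 3])
```

You still get the convenience of copy-pasting the elements, and the output can no longer be confused with repr([1, 2, 3]).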

Here’s a real-life example of exactly this kind of confusion. A library (mutagen) returns an object whose repr() looks like a Python dictionary, but which isn’t actually a dict. json.dump() therefore fails on it, and the error message (which json.dump() generates using repr()) makes it seem as if json.dump() is behaving inconsistently and failing to encode a simple dict.
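The confusion is easy to reproduce without the library. The TagBag class below is a hypothetical stand-in I made up, not mutagen’s actual API, and I use json.dumps() rather than json.dump() only to avoid needing a file. As far as I know, the exact wording of the error depends on your Python version: older versions embedded the repr() of the object in the message, while more recent ones name the type instead.

```python
import json


class TagBag(object):
    """Hypothetical dict-like container; it is not a dict subclass."""

    def __init__(self, **tags):
        self._tags = dict(tags)

    def __repr__(self):
        # The misleading choice: indistinguishable from a plain dict.
        return repr(self._tags)


tags = TagBag(artist="Someone")
print(repr(tags))              # {'artist': 'Someone'}
print(isinstance(tags, dict))  # False

try:
    json.dumps(tags)
    failed = False
except TypeError as exc:
    # json only knows how to encode real dicts, lists, strings, numbers, etc.
    failed = True
    print("json.dumps failed:", exc)
```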

There are better ways to do this, while still retaining whatever benefits pretending to be a dict would have:

# Don't do this...
repr(my_object)  --->  "{'foo': 1, 'bar': 2}"

# ...do this...
repr(my_object)  --->  "<MyClass {'foo': 1, 'bar': 2}>"

# ...or this.
repr(my_object)  --->  "MyClass(foo=1, bar=2)"

Whether you pick the second option or the third depends on both your goals and the kind of class state you’re trying to represent, but what’s important is that you shouldn’t pick the first. If someone really wants a dictionary from your object, well, that’s what __dict__ is for.
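For what it’s worth, here’s a sketch of both acceptable options side by side. MyClass, its two attributes, and the tagged_repr() helper are all hypothetical names for this example; in real code you would pick one style and put it in __repr__. The dict ordering shown assumes Python 3.7 or later, where dicts keep insertion order.

```python
class MyClass(object):
    """Hypothetical class whose state is two attributes, foo and bar."""

    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __eq__(self, other):
        return isinstance(other, type(self)) and vars(self) == vars(other)

    def __repr__(self):
        # Third option: reads like the constructor call that would rebuild it.
        return "MyClass(foo=%r, bar=%r)" % (self.foo, self.bar)

    def tagged_repr(self):
        # Second option: angle brackets flag "not a literal", state shown inside.
        return "<MyClass %r>" % (vars(self),)


obj = MyClass(foo=1, bar=2)
print(repr(obj))          # MyClass(foo=1, bar=2)
print(obj.tagged_repr())  # <MyClass {'foo': 1, 'bar': 2}>

# The constructor-style repr can be evaluated to get an equal object back.
clone = eval(repr(obj))
print(clone == obj)       # True
```

The constructor style is nice when the state maps cleanly onto constructor arguments; the angle-bracket style is the safer choice when it doesn’t, since it never promises to be evaluable.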

Posted on June 3, 2012, in Software Development. 1 Comment.

  1. Or, a functional repr might be MyClass(**{'foo': 1, 'bar': 2})