Defining __repr__ when subclassing set in Python

Solution 1

Here are three variants that should give you the output you want, along with some benchmarks. Their timings are almost identical, though I suspect they differ in memory usage.

#!/usr/bin/env python

import time

# Variant 1: format the elements through a temporary list.
class Alpha(set):
    def __init__(self, name, s=()):
        super(Alpha, self).__init__(s)
        self.name = name
    def __repr__(self):
        return '%s(%r, set(%r))' % (self.__class__.__name__,
                                    self.name,
                                    list(self))

# Variant 2: format the elements through a temporary set.
# A plain set already reprs as "set([...])", so no extra "set(...)" wrapper.
class Alpha2(set):
    def __init__(self, name, s=()):
        super(Alpha2, self).__init__(s)
        self.name = name
    def __repr__(self):
        return '%s(%r, %r)' % (self.__class__.__name__,
                               self.name,
                               set(self))

# Variant 3: reuse the inherited repr and swap the class name for "set".
class Alpha3(set):
    def __init__(self, name, s=()):
        super(Alpha3, self).__init__(s)
        self.name = name
    def __repr__(self):
        rep = super(Alpha3, self).__repr__()
        rep = rep.replace(self.__class__.__name__, 'set', 1)
        return '%s(%r, %s)' % (self.__class__.__name__,
                               self.name,
                               rep)

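def timeit(exp, repeat=10000):
    # Run the statement `exp` `repeat` times and return the mean
    # wall-clock time per run, in milliseconds.
    results = []
    for _ in xrange(repeat):
        start = time.time()
        exec(exp)
        elapsed = time.time() - start
        results.append(elapsed * 1000)
    return sum(results) / len(results)

if __name__ == "__main__":
    # Note: each timed statement only constructs an instance;
    # __repr__ itself is never called inside the timed code.
    print "Alpha():  ", timeit("a = Alpha('test', (1,2,3,4,5))")
    print "Alpha2(): ", timeit("a = Alpha2('test', (1,2,3,4,5))")
    print "Alpha3(): ", timeit("a = Alpha3('test', (1,2,3,4,5))")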

Results (mean milliseconds per run):

Alpha(): 0.0287627220154

Alpha2(): 0.0286467552185

Alpha3(): 0.0285225152969
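
The statements timed above only construct each instance, so __repr__ is never exercised. A minimal sketch of a benchmark that does call repr(), assuming Python 3 and the standard timeit module; the helper name mean_repr_microseconds and the 1000-element test set are illustrative choices, not part of the original answer, and no timings are claimed here:

```python
import timeit


class Alpha(set):
    """Variant 1: repr built from a temporary list."""

    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name

    def __repr__(self):
        return '%s(%r, set(%r))' % (type(self).__name__, self.name, list(self))


class Alpha3(set):
    """Variant 3: repr built by rewriting the inherited set repr."""

    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name

    def __repr__(self):
        rep = super().__repr__().replace(type(self).__name__, 'set', 1)
        return '%s(%r, %s)' % (type(self).__name__, self.name, rep)


def mean_repr_microseconds(obj, number=10000):
    """Return the mean time of one repr(obj) call, in microseconds."""
    total_seconds = timeit.timeit(lambda: repr(obj), number=number)
    return total_seconds / number * 1e6


if __name__ == "__main__":
    for cls in (Alpha, Alpha3):
        obj = cls('test', range(1000))  # hypothetical test size
        print('%s: %.2f us per repr()' % (cls.__name__,
                                          mean_repr_microseconds(obj)))
```

The object is built once outside the timed callable, so only the repr() call itself is measured.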

Solution 2

I couldn't find a better way than this. I suppose building a temporary list is at least better than building a set and throwing it away.

(Python 2.x)

>>> class Alpha(set):
...     def __init__(self, name, s=()):
...         super(Alpha, self).__init__(s)
...         self.name = name
...     def __repr__(self):
...         return 'Alpha(%r, set(%r))' % (self.name, list(self))
... 
>>> Alpha('test', (1, 2))
Alpha('test', set([1, 2]))

Or, if you don't like the hardcoded class name (though it really shouldn't matter):

>>> class Alpha(set):
...     def __init__(self, name, s=()):
...         super(Alpha, self).__init__(s)
...         self.name = name
...     def __repr__(self):
...         return '%s(%r, set(%r))' % (self.__class__.__name__, self.name, list(self))
... 
>>> Alpha('test', (1, 2))
Alpha('test', set([1, 2]))
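
The same approach carries over to Python 3 with only the super() call simplified. A minimal sketch, assuming Python 3 (the session above is Python 2.x); it also checks that the resulting repr is a valid constructor expression by passing it to eval():

```python
class Alpha(set):
    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name

    def __repr__(self):
        # list(self) makes a temporary list purely for formatting.
        return '%s(%r, set(%r))' % (type(self).__name__, self.name, list(self))


a = Alpha('test', (1, 2))

# The inherited repr knows nothing about `name`.
print(set.__repr__(a))   # Alpha({1, 2})

# The overridden repr includes it.
print(repr(a))           # Alpha('test', set([1, 2]))

# Evaluating the repr rebuilds an equal instance with the same name.
clone = eval(repr(a))
print(clone == a, clone.name)   # True test
```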
Author by

me_and

Updated on June 08, 2022

Comments

  • me_and
    me_and almost 2 years

    I'm trying to subclass the set object in Python, using code similar to the below, but I can't work out a sensible definition of __repr__ to use.

    class Alpha(set):
        def __init__(self, name, s=()):
            super(Alpha, self).__init__(s)
            self.name = name
    

    I'd like to define __repr__ in such a way that I can get the following output:

    >>> Alpha('Salem', (1,2,3))
    Alpha('Salem', set([1, 2, 3]))
    

    However, if I don't override __repr__, the output I get ignores the name value…

    >>> Alpha('Salem', (1,2,3))
    Alpha([1, 2, 3])
    

    …while if I do override __repr__, I can't get direct access to the values in the set without creating a new set instance:

    class Alpha(set):
        …
        def __repr__(self):
            return "%s(%r, %r)" % (self.__class__.__name__, self.name, set(self))
    

    This works, but creating a new set instance for __repr__ that will then be disposed of seems clunky and inefficient to me.

    Is there a better way to define __repr__ for this sort of class?

    Edit: Another solution that has occurred to me: I can store the set locally. It seems slightly neater than the other options (creating and destroying something for every call of __repr__ or using some form of string manipulation), but still seems less than ideal to me.

    class Alpha(set):
        def __init__(self, name, s=()):
            super(Alpha, self).__init__(s)
            self.name = name
            self._set = set(s)
        def __repr__(self):
            return "%s(%r, %r)" % (self.__class__.__name__, self.name, self._set)
    
  • John Doe
    John Doe over 12 years
    Yeah, I'd opt for list() as it seems the most straightforward, but clearly it doesn't matter much either way.
  • gecco
    gecco over 12 years
    I'd not hardcode classname and use self.__class__.__name__ instead (like me_and did)
  • jdi
    jdi over 12 years
    I'm willing to bet that, memory-wise, the first and second are less efficient, since they instantiate objects and then throw them away. But that's just an assumption; I haven't actually checked the memory or CPU usage. The third example only gets a string and reformats it.
  • John Doe
    John Doe over 12 years
    @gecco: Personal preference. I don't see much point in not hardcoding it (unless you're going to have classes that subclass it), especially as some places still require it (super call). It's easy to change anyways. But I've added the non-hardcoded version as well as per request.
  • jdi
    jdi over 12 years
    I agree with @JohnDoe. I don't see it as critical when you already need to use the class name explicitly for the super() calls. But yes, it is a bit more dynamic.
  • John Doe
    John Doe over 12 years
    Possibly. I guess the question is whether set() or list() is faster than the two calls in Alpha3 (the benchmark may be saying yes).
  • jdi
    jdi over 12 years
    list() is more efficient than set() for sure. The string replacement is a little better than set() but gives him the exact formatting he wanted. So it's a toss-up.
  • me_and
    me_and over 12 years
    @JohnDoe: I'm absolutely going to have classes subclassing this one, so using the dynamic name makes much more sense here.
  • me_and
    me_and over 12 years
    None of these feel quite right to me: creating and discarding instances will, I'm willing to bet, be much more expensive for larger lists, while string editing feels quite fragile and "unpythonic".
  • me_and
    me_and over 12 years
    Regardless, +1 for a very complete answer and for including benchmarks.
  • jdi
    jdi over 12 years
    @me_and - Thanks for the +1. Isn't a repr method by nature a string formatting method? So it would seem to me that the 3rd option is reasonable, though instead of formatting your string from scratch you are just reformatting the superclass version.
  • me_and
    me_and over 12 years
    @jdi: good point well made. I think that'll be the option I go for. Thanks!
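
The thread settles on the third option (reformatting the superclass repr) and on a dynamic class name, since subclasses are planned. A minimal sketch of how those two choices combine, assuming Python 3, where the inherited repr reads "Alpha({1, 2, 3})" rather than the Python 2 "Alpha([1, 2, 3])"; the subclass Beta is hypothetical and exists only to show that the inherited __repr__ picks up the subclass name:

```python
class Alpha(set):
    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name

    def __repr__(self):
        cls_name = type(self).__name__
        # set.__repr__ already starts with the (sub)class name,
        # e.g. "Beta({1, 2})"; swap only that first occurrence for "set".
        rep = super().__repr__().replace(cls_name, 'set', 1)
        return '%s(%r, %s)' % (cls_name, self.name, rep)


class Beta(Alpha):
    """Hypothetical subclass; inherits __repr__ unchanged."""


print(repr(Alpha('Salem', (1, 2, 3))))   # Alpha('Salem', set({1, 2, 3}))
print(repr(Beta('Salem')))               # Beta('Salem', set())
```

Limiting the replacement to one occurrence matters: the class name appears first in the inherited repr, so element reprs that happen to contain the same text are left alone.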