Friday, January 09, 2015

Creating safe cyclic reference destructors (without requiring __del__)

It seems common for people to use __del__ in Python, but it's best avoided, mainly for the reasons below:

1. If there's a cycle, the Python VM won't be able to decide in what order the elements should be deleted and will keep them alive forever (unless you manually clear that cycle through the gc module)... This is the behavior up to Python 3.3; from Python 3.4 onwards (PEP 442) such cycles can be collected. Yes, you shouldn't create a cycle in the first place, but there's no guarantee that some client of your library won't create one when they shouldn't.

2. There are caveats related to reviving self during the __del__ (e.g.: creating a new reference to it somewhere else during its __del__ -- which you should definitely not do...).

3. Not all Python VMs work the same, so, unless you explicitly release the object, some resource may stay alive much longer than it should (e.g.: PyPy, Jython...)

4. If an exception is raised in the context, all the objects may stay alive for much longer than anticipated (because the exception keeps a reference to the frame that raised it).
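
Point 4 can be seen with a small sketch. It assumes Python 3 on CPython, and Resource and fail are hypothetical names used only for illustration. While the exception object is kept around, its traceback keeps the frame of fail() alive, and with it the local variable res:

```python
import gc
import weakref


class Resource(object):
    """Hypothetical stand-in for an object that holds some resource."""


refs = []


def fail():
    res = Resource()
    refs.append(weakref.ref(res))  # Lets us observe whether 'res' is still alive.
    raise RuntimeError('boom')


saved = None
try:
    fail()
except RuntimeError as e:
    # Keeping the exception also keeps its traceback, which references
    # the frame of fail() and therefore its local variable 'res'.
    saved = e

alive_while_exception_kept = refs[0]() is not None

saved = None  # Drop the exception (and with it the traceback and the frame).
gc.collect()  # Just in case something ended up in a cycle.
alive_after_release = refs[0]() is not None

print(alive_while_exception_kept, alive_after_release)
```

So anything that stores exceptions (a logger, a test runner, a variable in an interactive session) can delay both __del__ and weakref callbacks.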

Now, if you still want to manage things that way (say, to play safe if the user forgets to use a context manager in some case), at least there's a relatively easy solution for points 1 and 2: instead of using __del__, use the weakref module to get a callback when the object dies and do the needed cleanup there...

The only thing to make sure here is that you don't use 'self' directly inside the callback, only the things it has to clear (otherwise you'd create a cycle to 'self', which is something you want to avoid here).

The example below shows what I mean (StreamWrapperDel is the __del__ based solution which shouldn't be used and StreamWrapperNoDel is the solution you should use):

import weakref

class StreamWrapperDel(object):
    
    def __init__(self, stream):
        self.stream = stream

    def __del__(self):
        print('__del__')
        self.stream.close()
        
class StreamWrapperNoDel(object):
    
    def __init__(self, stream):
        self.stream = stream
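        # The callback closes over 'stream' only (never 'self'), so it doesn't
        # create a cycle that would keep the wrapper alive.
        def on_die(killed_ref):
            print('on_die')
            stream.close()
        # Keep the weakref itself alive: if it dies first, the callback is never called.
        self._del_ref = weakref.ref(self, on_die)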


if __name__ == '__main__':
    class Stub(object):
        def __init__(self):
            self.closed = False
        
        def close(self):
            self.closed = True
            
    s = Stub()
    w = StreamWrapperDel(s)
    del w
    assert s.closed
    
    s = Stub()
    w = StreamWrapperNoDel(s)
    del w
    assert s.closed
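
To see why the callback must not touch 'self', here is a minimal sketch assuming CPython's reference counting. LeakyWrapper and SafeWrapper are hypothetical names for this illustration: the first one uses self inside the callback, the second follows the pattern recommended in this post.

```python
import weakref


class Stub(object):
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class LeakyWrapper(object):
    """Anti-example: the callback closes over 'self'."""

    def __init__(self, stream):
        self.stream = stream
        def on_die(killed_ref):
            self.stream.close()  # Captures 'self': self -> _del_ref -> on_die -> self.
        self._del_ref = weakref.ref(self, on_die)


class SafeWrapper(object):
    """The callback only references what it has to clear."""

    def __init__(self, stream):
        self.stream = stream
        def on_die(killed_ref):
            stream.close()
        self._del_ref = weakref.ref(self, on_die)


leaky_stream = Stub()
w = LeakyWrapper(leaky_stream)
del w  # The cycle keeps the wrapper alive, so nothing is closed here.
leaky_closed_after_del = leaky_stream.closed

safe_stream = Stub()
w = SafeWrapper(safe_stream)
del w  # No cycle: the wrapper dies right away and the callback runs.
safe_closed_after_del = safe_stream.closed
```

With the leaky version the stream is not closed when the last outside reference goes away, which defeats the purpose of the callback.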

Given that, personally I think Python shouldn't allow __del__ at all as there's another way to do it which doesn't have the related caveats.

For some real-world code which uses that approach, see: https://code.activestate.com/recipes/578998-systemmutex (recipe for a system wide mutex).

p.s.: Thanks to Raymond Hettinger the code above is colorized: https://code.activestate.com/recipes/578178-colorize-python-sourcecode-syntax-highlighting

1 comment:

Anonymous said...

Hi,
Thanks for the nice article. I totally agree regarding __del__! However, rather than using the weakref.ref() second parameter for a callback, I believe what you should really be doing in your example is calling weakref.finalize() to create a finalize object associated with the object.
Bye for now,
-John Garbuio
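
The weakref.finalize() approach suggested in the comment above has been in the standard library since Python 3.4. A minimal sketch of how the wrapper might look with it, assuming CPython (StreamWrapperFinalize is an illustrative name, not something from the post):

```python
import weakref


class Stub(object):
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class StreamWrapperFinalize(object):

    def __init__(self, stream):
        self.stream = stream
        # Register stream.close (bound to the stream, not to 'self') to run
        # when this wrapper is garbage collected or at interpreter exit.
        self._finalizer = weakref.finalize(self, stream.close)

    def close(self):
        # Explicit release: a finalizer runs its callback at most once.
        self._finalizer()


s = Stub()
w = StreamWrapperFinalize(s)
del w
closed_on_del = s.closed

s2 = Stub()
w2 = StreamWrapperFinalize(s2)
w2.close()
closed_on_explicit_close = s2.closed
finalizer_alive_after_close = w2._finalizer.alive
```

Unlike a plain weakref.ref, the finalize object is kept alive by the weakref module itself, so storing it on the instance is only needed here to support the explicit close(). The same rule applies as before: the registered function and its arguments must not reference the wrapper.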