Friday, August 3, 2012

Bootstrapping Trial in Python 3

Initially, I had tried an "extensive" approach to porting Twisted - picking a certain error and fixing it in every module. Unfortunately, as I've found out, this isn't very practical: not only is Twisted a large code base, it's also very old. While updating this crufty code might have been doable, Twisted also has a requirement that all changes to code need to be tested (and I think this is very nice!*). This has been enforced quite strictly in the last few years, but of course, the code using the really old, Python3-incompatible idioms, is the same code which has no tests. As such, to make any sort of substantial change I would also need to write tests. This proved to be a little too much, and Itamar suggested I consider a more "intensive" approach - fixing Twisted a module at a time, starting with the core.

In this I also meandered slightly, but after discussing it with exarkun on IRC, we concluded it would be best to pick a file with tests, run it under Python 3 and fix the failures which arise. This is in line with Twisted's programming paradigm, test-driven development, and is a very comfortable way of working. The idea, of course, was to start with modules which have no dependencies on the rest of Twisted, and then work "down" the dependency tree as individual modules are ported. While this sounds ideal, I've encountered two problems: the minor one is that Twisted depends on itself a lot, and it's hard (although not impossible) to identify modules which do not use any others; the major issue is the test runner itself, Trial.

Trial is Twisted's (quite versatile) framework for testing, based on the standard unittest module. In time, the TestCase class was completely rewritten (though in a compatible way) to support various features which make testing easier. Now, when importing a file in Python 3, it needs to be syntax-compatible with Python 3, but all of it's imports need to be compatible too. So now, each test subclasses from twisted.trial.unittest.TestCase and the twisted.trial.unittest module is very large and unfortunately imports a large chunk of Twisted itself (notably, twisted.internet.reactor, but also half the twisted.python package). Therefore, it's impossible for me to actually run the tests, as I need Trial and Trial needs other things and none of this is compatible with Python 3. I had tried writing a large patch to at least make Trial importable, but it was rejected (and for good reason, I now think). Obviously, the huge patchset would need to be broken into smaller tickets, but preferably in a logical way.

Luckily, the solution came via the official unittest module - if I only change the test case to import from the official library, rather than from Trial, it will work! Then a simple ``python3.2 -m unittest twisted.test.test_whatever`` runs the tests. I have successfully used this method for several simpler files but I fear the low-hanging fruit are gone - as was to be expected, many test files do use functionality provided only by Trial's TestCase. I am still trying to "pick around" here and there, and have also submitted tickets which do not fix a specific module, but just a single issue (eg. removing __cmp__ in t.p.versions, removing uses of UserDict). It is clear, however, that this approach will not lead me to my immediate goal - running Trial itself under Python 3.

And this is where I currently am: my goal is to bootstrap Trial, to make it runnable in Python 3, which will make running tests (and, by extension, fixing relevant failures) much easier. The "pick a test_file and fix it" method cannot bring me there and I've been unable to think of a better alternative. One idea was to use an alternative TestCase implementation (where I tried testtools, which unfortunately isn't as-is compatible with Twisted's tests); using a different runner wouldn't help, as the modules would still need to be imported. Another idea is to provide some sort of temporary class, which would extend unittest from the official library with the specific methods I'm lacking; this class would then be deleted as soon as it's possible to run Trial itself. This doesn't strike me as a very clean approach, but it might be the only plausible one, unless someone has a different suggestion...

In the meantime, I'm focusing on fixing what I can (even if it doesn't directly lead to supporting Trial) and making more "general" changes, to lower the size of further patches (but there will be at least a couple of big ones, there's no avoiding it). In fact, I've been focusing on making tickets as small as possible, to ease review burden, though I've still got plenty awaiting review: any help on this front would also be very appreciated. I've also tried reviewing other tickets, to ease the general burden, though the one case where I actually "passed" a review I had to revert the change, so I'm trying to be more careful about it now.

*While I do find it very nice, I do have some issues with this policy and I feel that a few carefully thought-out exceptions would have been very helpful in my project. More thoughts on this in a future blog post.