Wednesday, November 28, 2012

Announcing my Bachelor's thesis: Mock library for MATLAB

After positive experiences blogging about my GSoC projects, I've decided to do the same for my Bachelor's thesis. I've always felt that writing helped me process and refine my thoughts, and I'm hoping to leverage this for my thesis. Therefore, I plan on writing regularly, even about failed approaches.

As the title suggests, my thesis will be about creating a library for Mock objects for MATLAB. During talks with my mentor, we concluded that, while a unit-test library for MATLAB exists (xUnit) there is no support for mock objects. Mocks are very useful as they allow developers to write robust, correct tests faster, so their lack in MATLAB can be felt. As MATLAB supports object-orientated programming, we are reasonably certain that Mocks can also be implemented. Thus, the first part of my thesis is to conduct a feasibility study and develop a simple prototype as a proof-of-concept. To do that, I would need to define exactly which features I am looking for.

Coming from a Python background, my first instinct was to check existing Python Mock libraries (from this nice list). Most of them are ports of existing mocking frameworks to Python, though there are varieties. What was more interesting to me is that there is actually quite a bit of confusion about terminology (in particular, the term "mock" is used to refer to objects of vastly varying complexity). Investigating this, I came across an excellent article by Martin Fowler, Mocks Aren't Stubs. The article is a great read throughout, as it goes into quite a bit of detail on "mocking" and general design issues; it also made me realize that most approaches are quite complicated. Taking a step back, it's fair to assume that the "average" user of MATLAB isn't a dedicated CS specialist, but rather a researcher. As such, I believe ease of use should be valued the highest, otherwise no one would take the time to learn these tools.

This approach led me to mockito (nice presentation on differences between mockito and other frameworks, slides 38-40 provide direct comparisons of code). mockito defines itself as a "Test Spy" framework, where a "spy" is in essence a stub which remembers which interactions were invoked on it. This is a very natural, light-weight approach which is easy to pick up and almost as powerful as more verbose frameworks. After discussing it with my mentor, we both agreed that basing the new framework on mockito is the right approach. The next course of action is to create a basic set of use-cases, based on existing mockito examples, and start implementing them. This first code will be strictly proof-of-concept.

Finally, one of the first decisions we made was to make the final, resulting code public; development will also probably happen on Github. Philosophically speaking, I would prefer that any work I do be actually useful, and the idea behind creating a library is for many people to use it, and that cannot be accomplished in the confines of a single faculty or university. It would also be useful to have some projects which would like to use Mocks, so that the library can be tested (and designed!) with real code in mind. Here I would like to turn to the community - if someone has felt a need for a similar library, please send me your desired use-cases and any other comments and ideas you might have. They would be a tremendous help!

Friday, August 3, 2012

Bootstrapping Trial in Python 3

Initially, I had tried an "extensive" approach to porting Twisted - picking a certain error and fixing it in every module. Unfortunately, as I've found out, this isn't very practical: not only is Twisted a large code base, it's also very old. While updating this crufty code might have been doable, Twisted also has a requirement that all changes to code need to be tested (and I think this is very nice!*). This has been enforced quite strictly in the last few years, but of course, the code using the really old, Python3-incompatible idioms, is the same code which has no tests. As such, to make any sort of substantial change I would also need to write tests. This proved to be a little too much, and Itamar suggested I consider a more "intensive" approach - fixing Twisted a module at a time, starting with the core.

In this I also meandered slightly, but after discussing it with exarkun on IRC, we concluded it would be best to pick a file with tests, run it under Python 3 and fix the failures which arise. This is in line with Twisted's programming paradigm, test-driven development, and is a very comfortable way of working. The idea, of course, was to start with modules which have no dependencies on the rest of Twisted, and then work "down" the dependency tree as individual modules are ported. While this sounds ideal, I've encountered two problems: the minor one is that Twisted depends on itself a lot, and it's hard (although not impossible) to identify modules which do not use any others; the major issue is the test runner itself, Trial.

Trial is Twisted's (quite versatile) framework for testing, based on the standard unittest module. In time, the TestCase class was completely rewritten (though in a compatible way) to support various features which make testing easier. Now, when importing a file in Python 3, it needs to be syntax-compatible with Python 3, but all of it's imports need to be compatible too. So now, each test subclasses from twisted.trial.unittest.TestCase and the twisted.trial.unittest module is very large and unfortunately imports a large chunk of Twisted itself (notably, twisted.internet.reactor, but also half the twisted.python package). Therefore, it's impossible for me to actually run the tests, as I need Trial and Trial needs other things and none of this is compatible with Python 3. I had tried writing a large patch to at least make Trial importable, but it was rejected (and for good reason, I now think). Obviously, the huge patchset would need to be broken into smaller tickets, but preferably in a logical way.

Luckily, the solution came via the official unittest module - if I only change the test case to import from the official library, rather than from Trial, it will work! Then a simple ``python3.2 -m unittest twisted.test.test_whatever`` runs the tests. I have successfully used this method for several simpler files but I fear the low-hanging fruit are gone - as was to be expected, many test files do use functionality provided only by Trial's TestCase. I am still trying to "pick around" here and there, and have also submitted tickets which do not fix a specific module, but just a single issue (eg. removing __cmp__ in t.p.versions, removing uses of UserDict). It is clear, however, that this approach will not lead me to my immediate goal - running Trial itself under Python 3.

And this is where I currently am: my goal is to bootstrap Trial, to make it runnable in Python 3, which will make running tests (and, by extension, fixing relevant failures) much easier. The "pick a test_file and fix it" method cannot bring me there and I've been unable to think of a better alternative. One idea was to use an alternative TestCase implementation (where I tried testtools, which unfortunately isn't as-is compatible with Twisted's tests); using a different runner wouldn't help, as the modules would still need to be imported. Another idea is to provide some sort of temporary class, which would extend unittest from the official library with the specific methods I'm lacking; this class would then be deleted as soon as it's possible to run Trial itself. This doesn't strike me as a very clean approach, but it might be the only plausible one, unless someone has a different suggestion...

In the meantime, I'm focusing on fixing what I can (even if it doesn't directly lead to supporting Trial) and making more "general" changes, to lower the size of further patches (but there will be at least a couple of big ones, there's no avoiding it). In fact, I've been focusing on making tickets as small as possible, to ease review burden, though I've still got plenty awaiting review: any help on this front would also be very appreciated. I've also tried reviewing other tickets, to ease the general burden, though the one case where I actually "passed" a review I had to revert the change, so I'm trying to be more careful about it now.

*While I do find it very nice, I do have some issues with this policy and I feel that a few carefully thought-out exceptions would have been very helpful in my project. More thoughts on this in a future blog post.

Monday, June 18, 2012

Another year, another GSoC!

Well, this blog post took long enough, but I'm happy to announce that I've been accepted once again for the Google Summer of Code, this year for Twisted ("an event-driven networking engine written in Python"). What's more, my project is essentially the same as last year - porting Twisted to Python 3, or at least getting as close as possible (my actual proposal is available on Google Docs). Unfortunately, compared to last year, my school load was much higher this time around, so I've done much less work than I would have liked.

At the start, I've mostly focused on fixing the warnings thrown when running the test suite with "-3", taking care of most of the trivial ones (eg. has_key, apply, classical division). Currently, I'm looking into replacing "buffer()", which was a built-in but is removed in Python 3. While the work is similar, the workflow for getting changes in is quite different from SymPy. Twisted uses a svn repository, and trac for issue tracking; each change must be done in a separate branch and have a corresponding ticket; SymPy uses the classical Github + git workflow, with pull requests and reviews in the online interface. Now, I got too used to git to just give it up easily (especially as the Twisted workflow almost requires additional tools on top of vanilla svn), and this guide was very useful in setting up git svn. Although I'm getting used to the review process (eg. changesets are reviewed in total, not per-commit), I still find the Github (and git) model more productive - it streamlines review and allows small, atomic commits to be made (although I've been trying to keep my changes as small as possible, each such small change requires the opening of another ticket and creation of another SVN branch, so there's a point at which it's too much effort to do it all). Still, Twisted is unlikely to change so I will have to accommodate - at least I can follow my own practices in my own repo.

The next steps are deciding which porting strategy to ultimately pursue - some Twisted developers suggest a single code-base strategy (py2 and py3 compatible), I personally favor a single code base which relies on 2to3 for py3k compatibility and there was even an attempt at a dual code-base, by Antoine Pitrou. While I feel that approach is the least likely to succeed, as it introduces a high burden on the maintainer (and the effort has indeed stalled), the code already there will be helpful in my own work. Often, the changes made in py3k code can be reused in the "main", Python 2 code with little or no changes. Still, all approaches deserve investigation and my mind is still open to other ideas. Twisted currently supports Python 2.6+, which makes my job easier. The final piece of good news is that I may be able to get some sort of help (or at least support) from Canonical, as part of their plan to install only Python 3 in the next desktop Ubuntu release.

Friday, February 10, 2012

Thoughts on Google Code-in 2011

Google Code-in is the high-school equivalent of the Google Summer of Code. The program ran from Nov 21st to Jan 16th, though we've only now gotten around to sending a "summary" mail to the list about it. As Aaron noted, we've had some translation work, some work on SymPy Live and a bevy of documentation and code improvements. With 176 tasks completed, I'd say the whole project was a success for SymPy. I was involved as a mentor, so here are some general thoughts and observations about the process.

E-mail spam. In SymPy we didn't have a clean separation of mentor duties (eg. KDE only allowed tasks for which someone volunteered to mentor), so the initial idea was to add all (most) mentors to all tasks. This meant a lot of mails, an effect worsened by the fact that each commenter to the issue starts another "conversation" when viewed from Gmail (which I even reported to Melange as a feature request/bug). At the height of activity, I could get upwards of 30-40 mails ("conversations") daily, which by far dwarfed my other mail traffic. Then, because each comment is basically a separate mail, I wasted a lot of time looking at issue that someone already addressed (again, most mentors could handle most tasks). For the second round of tasks I didn't add myself to each task, otherwise I'm sure I'd have gotten even more spam. The bug I reported in Melange was fixed, so hopefully this will be less of an issue next year.

Being a mentor takes a lot of time. Partly a consequence of above, partly due to all the work being done, but being a mentor took a lot of time. Many students were unfamiliar with git (and didn't want to read the instructions on development workflow on our excellently-written (in my opinion) GCI Landing Page) and solving issues with them was a constant topic on IRC. Students also lacked follow-through with comments (or, occasionally, expected the work handed down to them) which didn't help. Finally, many students were very anxious, and didn't appreciate that we are all volunteers and cannot be around 24/7. All of this resulted in a process that was frustrating at times and stressful for mentors.

Regardless of all of the above, a lot of work was done for SymPy. While I didn't look at the stats, my feeling is that the biggest improvement could be seen in our SymPy Live interface (and our webpage) and our documentation. Yes, we also saw some code improvements, but they were probably a smaller part of the overall contribution (though by no means less important). Interestingly, I think this exposes the two types of tasks the GCI contest is well-suited to: tasks where there is no "in-house" expertise (anything web related in our case) and uninteresting tasks/chores (writing documentation, in our case and probably for most projects). In the first case, we managed to attract experienced developers who could improve our webpage much faster and better than any of the core developers. Writing documentation is also an important task, but one that is shunned by most developers. Still, it is mostly simple work and (more importantly) doesn't usually require in-depth understanding of the code. This made it ideally suited for new contributors. The financial award (100$ for every 3 completed tasks, up to 500$) was enough of a motivation for students. The all-around improvements to our documentation are probably the single biggest advantage of our participation in GCI.

Translations. In GCI, tasks were divided into categories and we needed to have at least 5 tasks in every category. While we managed to "fill-up" most categories, Translation was probably the biggest problem. As a, basically, command-line library, it does not make a lot of sense for SymPy to be translated in other languages. In the end, we created tasks for translating our webpage and tutorial to the languages covered by the development team and some of these were done, but I consider this a waste of time. Though this issue is "near and dear" to me (I'm not a native speaker of English), I'm of the opinion that it would be impossible for someone without at least a basic knowledge of English to program with SymPy. Simply, however much effort we put into translating, the class and method names will remain in English and there's no helping that. I very much doubt the newly translated documents will be even used and they're bound to fall behind as the original document changes. We also had to start using gettext to manage the translations, which is a non-trivial amount of work (and there are still some issues). In my opinion, it adds another layer of complexity (however small) for very little gain.

In conclusion: did we get stuff done? Yes, without a doubt. Would we have gotten more if the mentors used their mentoring time for coding? Perhaps, but not necessarily. Are some of the students going to keep contributing? Most likely not. Still, I would consider the whole program, and our participation in it, a success. Ideas for next year could be focusing more on stuff none of the core developers can do (eg. the website work), but we can't really say how far along will SymPy development progress during this year or which tasks might be available to students. Hopefully, more people will volunteer to mentor next year, which would help with most issues I raised here. It is interesting, though, that even with our normally very fast development process we couldn't handle the influx of student work. It'd be interesting to see how other organizations coped.

Here's to another GCI this year!