Our choices for python web applications

So, at work, we’re doing some “next generation” versions of a bunch of our backoffice tooling. That involves producing a bunch of cute little web applications that often control not so cute and not so little processes (like transcoding and publishing and whatnot). The coarse-grained architecture pattern is pretty simple and familiar: a database with information about files, jobs, tasks and metadata, some common libraries for interacting with the database, some web application middleware using those libraries, and a web server frontend serving up the middleware.

Pretty much normal bread-and-butter stuff. It’s not quite like document-based CMS work (you don’t really want to store many-gigabyte video in a JCR repo), but a lot of the technology choices are still similar.

Customizing a snake

Based on the various tech we have deployed today, and the skills of the people working on this kind of thing, we’re trying to standardize around two main server-side technologies: java and python. This post explains the choices we made for the python universe. At the moment those choices are actually not so easy, since there’s so much happening and so many projects are moving so fast. We scouted the web quite a bit to figure out what to do.

Lower layers

  • OS: ubuntu 7.10 (still some nodes on 6.10)
  • Database: MySQL 5.0.45 (comes with ubuntu, a bit reconfigured of course) with some little bits of replication
  • Python: Python 2.4.4 (2.5 is not on ubuntu 6.10 and not on all our developer workstations, but we’re testing with it and will upgrade eventually)
  • WSGI server: right now we have a slightly customized cherrypy wsgi server (so that it accepts signals, restarts itself, runs from /etc/init.d, logs in all the right places, etc) behind an apache httpd 2.2 ProxyPass, which also handles SSL/AAA. We want to try and move to mod_wsgi but first we need its mac install to suck a bit less, and so far, cherry is not quite falling over on us. If mod_wsgi doesn’t work out it’ll probably be back to twisted, probably also behind apache for SSL reasons.

Application glue

  • Database access layer: storm, our own slightly modified version. We really like storm, and every now and then we find we are pushing it a bit beyond its limits, which leads to some bits of patching (by people smarter than me!). Fortunately it seems the guys working on it are quite responsive on IRC. I expect there’ll be a few (more) patches from us that flow back upstream. I really hope someone implements support for forking out reads and writes to different nodes (like you get for free with MySQL Connector/J), either in MySQLdb or inside storm.
  • Python web glue: We’re trying to do everything completely WSGI-based, though most everything at the moment is actually inside CherryPy 3.1b1 handlers. The WSGI pattern works just fine and scales nicely enough in our tests.
  • Templating: Genshi 0.4.4 (we had to pick one, there’s a few good choices here)
  • XML bits and pieces: lxml 1.3.6. It’s the best XML support in python so far, but it still isn’t quite as good as what you get in java. All the various bits and pieces just aren’t quite as mature, and the underlying libxml2 doesn’t quite do XML schema support as well, and I also miss something like XMLBeans for python.
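
To make “doing everything the WSGI way” a bit more concrete, here is a minimal sketch of what a WSGI application looks like, independent of CherryPy or any other server. The names `job_status_app` and `call_app` are made up for illustration and are not from our codebase, and the sketch assumes a current Python 3 with the standard-library `wsgiref` module rather than the Python 2.4 setup described above.

```python
from wsgiref.util import setup_testing_defaults


def job_status_app(environ, start_response):
    """Hypothetical WSGI application: a callable taking (environ, start_response)."""
    path = environ.get("PATH_INFO", "/")
    body = ("jobs requested at %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # A WSGI application returns an iterable of byte strings.
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: call a WSGI app in-process, without any server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


if __name__ == "__main__":
    print(call_app(job_status_app, "/jobs"))
```

Because the application is just a callable, the same function can in principle be mounted under the cherrypy wsgi server, mod_wsgi or twisted without changes, which is the main reason the pattern is attractive for us.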

Out of the box?

We took a look at a bunch of the web frameworks out there. We didn’t seriously consider zope, but we took a long stare at pylons, turbogears and django before deciding not to bother with them. We’re not using much of paste either. Basically, in each case we missed one or more of:

  • good support for storm out of the box
  • doing everything the WSGI way
  • good and correct documentation
  • easy to scale / make efficient
  • stable core with excellent compatibility and bugfixing

And perhaps a few other things. On balance we guessed it would be easier to roll our own and integrate components than to strip something else down and maintain lots of vendor branches.

Key point: standardization good

Two years ago I would’ve picked twisted without blinking and invented another fancy wheel on top of it, but I’m happy I don’t have to do that anymore. Twisted has quite a learning curve, not just for app developers, but also for the people that need to deploy and scale the beast.

Two good things happened to the python webapp world: competition and standardization. Now things are progressing rapidly.

Progress is good, but it can result in various kinds of chaos that don’t help the application developer who likes to plan ahead a bit. The new scripting-language-based mega frameworks seem to attract a certain kind of developer and they probably work for a certain set of use cases, but standardizing on patterns and interfaces is much more useful for (opinionated!) people like us (with subtly deviating use cases). So framework authors: please do keep working on bridging the gap between all of them by cutting ’em down into tiny little WSGI middleware bits and pieces, and turn frameworks into libraries where you can.
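
As a rough illustration of what a “tiny little WSGI middleware bit” can look like, the sketch below wraps any WSGI application and adds one response header. The class name `AddHeaderMiddleware`, the header name and value, and the helpers `hello_app` and `run_once` are all invented for this example, and it assumes a current Python 3 rather than the versions mentioned in this post.

```python
from wsgiref.util import setup_testing_defaults


class AddHeaderMiddleware:
    """Hypothetical middleware: wraps a WSGI app and appends one response header."""

    def __init__(self, app, name, value):
        self.app = app
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        def wrapped_start_response(status, headers, exc_info=None):
            # Copy the header list so the wrapped app's own list is not mutated.
            new_headers = list(headers) + [(self.name, self.value)]
            return start_response(status, new_headers, exc_info)

        # The middleware is itself a WSGI app, so middlewares can be stacked.
        return self.app(environ, wrapped_start_response)


def hello_app(environ, start_response):
    """Hypothetical inner application used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def run_once(app):
    """Hypothetical helper: call a WSGI app in-process and collect the result."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    wrapped = AddHeaderMiddleware(hello_app, "X-Served-By", "backoffice-tools")
    print(run_once(wrapped))
```

The point of the shape is that the middleware knows nothing about the framework underneath it: anything that speaks WSGI can be wrapped, and the wrapper can be reused as a library.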

6 thoughts on “Our choices for python web applications”

  1. Not sure why you feel the current manual tweaks which are sometimes, but not always, required for mod_wsgi on Mac OS X are such a big problem.

    The MacPorts issue arises because you are trying to use the non-standard Python installation. Most packages need tweaking to work with MacPorts, and that is the point of the MacPorts compilation files. Such a MacPorts file is referenced from the page you refer to, and mod_wsgi may even be available through MacPorts now. I don’t know what its progress is, but I believe someone submitted it.

    As to the issue with fat binaries on 64bit MacOS X Leopard, that is a screw up on Apple’s part as they didn’t make sure it worked properly for Python on their fat binary capable platform. I’ll have a new iMac soon and may be able to incorporate a workaround for their incomplete Python configuration. But then if this is done, it may well just break when they fix the problem properly.

    So, neither problem is really mod_wsgi’s fault. 😦

  2. Hey Graham! (wow, someone reads my blog! 🙂 )

    There’s not a *big* problem here with mod_wsgi on Mac OS X, it’s just a little bit of obstacle that we don’t need to battle right now! We need an easy way to minimize changes between various environments, including mac laptops of developers that work remotely and who have their own specific preferences on how to set things up.

    So what I need from open source packages is reliable, simple and fast installation on mac os 10.4 and 10.5 as well as ubuntu 6.10 and 7.10, with some possibility to install on windows nodes (where a lot of fiddling is acceptable). Preferably without having to read a manual.

    A (set of) pure python script(s) trivially satisfies that requirement – do an svn checkout, run a ‘sudo python setup.py install’, and things pretty much just work.

    Apache modules are unfortunately a bit trickier to get right, unless they’re part of the surrounding packaging system (i.e. available through DarwinPorts / apt-get / cygwin.exe).

    Of course that hasn’t really got anything to do with mod_wsgi! I think the core reason is that it’s just a relatively new module, so it’s not that readily available from downstream distributors yet. I’m sure some friendly packagers will pick it up eventually (hopefully soon!) and make it a one-liner to get the right installation.

    Even if it turns out all the friendly packagers are busy and that does not happen, we might just eventually do that ourselves, once we start hitting some robustness issues and really need something solid, it just means I have to go figure out how contributing to DarwinPorts works :-).

  3. I got my shiny new iMac and fixing both these MacOS X issues in mod_wsgi was the first thing I did. After all, it has to work for my own system. 🙂

    The changes are in the Subversion trunk for mod_wsgi and will make their way into a 2.0 release candidate quite soon.

  4. Hi Leo,

    Interesting topic. I’ve recently gone through Django’s tutorial, because I am also looking for a Pythonic way of doing web apps. I particularly support your wish of standardization and your preference for libs over frameworks.

    Niek

  5. I know this comment may be a bit late, stumbled onto your blog from Google.

    I am in a similar situation at the moment, leaning towards a ‘stitch it together’ approach rather than trying to mold one of the current frameworks to my needs.

    Having come from a PHP background, I essentially did the same thing there after spending about one year with CodeIgniter, ZendFramework, Symfony, and numerous others; I decided on a DIY solution that pulled on some of the rote classes from ZF, Cake, and my own custom libs.

    I have experimented with Django, Pylons, TurboGears, (Zope is interesting to me, but, the learning curve looked too steep) and felt like I was back where I started when mucking around with PHP frameworks.

    (I guess this was mostly a comment of sympathy)

    The odd part of it, though, is that I get a great deal of joy from re-inventing the wheel, even if that wheel is only one shade different from that other wheel. Kind of a sisyphic (I’m not sure that’s a word) experience.
