So is the database the bottleneck?
Note that compared to the last post, I enabled KeepAlive, i.e. I added the `-k` flag:

```
ab -k -n 10000 -c 100 http://localhost/hello.txt
```
I also toned down the MPM limits a bunch:

```
StartServers         16
MinSpareServers      16
MaxSpareServers     128
MaxClients           64
MaxRequestsPerChild   0
```
(Apache will now complain about reaching MaxClients, but since the database is on the same machine, things end up faster overall this way.)
I then decided to just run `ab` locally, since I couldn’t see much of a difference versus running it over the network.
Then I set myself up a MySQL 5 server with a single InnoDB table containing a single row that reads
(key: 1, value: 'Hello world!'). I also set up a memcached server. Now, look at this…
| What? | Requests per second | Relative performance hit | CPU usage |
|-------|--------------------:|--------------------------|-----------|
| hello.txt | 9700 | – | httpd 2.2% per process |
| python string | 7500 | 23% | httpd 2% per process |
| memcached lookup | 2100 | 23% × 72% | memcached 10%, httpd 2% per process |
| _mysql | 1400 | 23% × 82% | MySQL 77%, httpd 1.2% per process |
| mysql | 880 | 23% × 82% × 37% | MySQL 65%, httpd 4.5% per process |
| SQLAlchemy | 700 | 23% × 82% × 37% × 20% | MySQL 19%, httpd 5.5% per process |
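For reference, the “python string” case above is essentially a WSGI app returning a response held in a global variable. A minimal sketch (the names here are illustrative, not the original handler code):

```python
# The fastest pure-Python case from the table: serve a response
# from a global variable, with no I/O of any kind per request.

RESPONSE = b"Hello world!"

def application(environ, start_response):
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(RESPONSE))),
    ])
    return [RESPONSE]
```

Everything slower in the table adds a per-request lookup (memcached, MySQL, or an ORM) on top of this baseline.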
Using a MySQL database backend costs 82% of the req/s compared to serving a Python string from a global variable. Totally expected of course, and we have also been taught how to deal with it: caching.
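The usual caching remedy is the cache-aside pattern: check memcached first, fall back to MySQL on a miss, and populate the cache on the way out. A self-contained sketch, with a plain dict standing in for the memcached client and sqlite3 standing in for the MySQL table, so it runs without either server:

```python
import sqlite3

# Stand-in for the single-row InnoDB table described above.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hello (key INTEGER PRIMARY KEY, value TEXT)")
db.execute("INSERT INTO hello VALUES (1, 'Hello world!')")

# Stand-in for memcache.Client(['127.0.0.1:11211']).
cache = {}

def get_value(key):
    # Fast path: serve from the cache when possible.
    if key in cache:
        return cache[key]
    # Slow path: hit the database, then populate the cache.
    row = db.execute("SELECT value FROM hello WHERE key = ?",
                     (key,)).fetchone()
    if row is None:
        return None
    cache[key] = row[0]
    return row[0]
```

With python-memcached the dict lookups become `mc.get(...)` / `mc.set(...)`; the decision logic itself is what the 5% overhead figure below accounts for.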
In this example, we’re not actually getting a lot out of memcached – the memcached approach (I’m using python-memcached) is still 72% slower than having the value in in-process memory, though looked at the other way around, it is 50% faster than using the _mysql driver. Nevertheless, this shows that the MySQL query cache is pretty fast too, especially at a 100% cache hit ratio 🙂
Using the SQLAlchemy ORM framework makes things another 20% slower, and it gobbles up a lot of CPU. We expected that too, and we know how to deal with it (hand-optimize the SQL in performance-sensitive areas, or perhaps use a faster ORM, like Storm).
But, did you know that you can take a 37% performance hit just by using the MySQLdb object-oriented wrappers instead of the raw _mysql bindings? Looks like some optimizations should be possible there…
Now, in “real life”, imagine a 90% hit ratio on the memcache, with the other 10% of requests going through the expensive ORM path, plus a 5% performance hit in the logic that decides between the memcache and the ORM layer. Since throughputs combine via latency rather than by averaging req/s, that works out to 1/(0.9/2100 + 0.1/700) ≈ 1750 requests per second, or roughly 1660 once the 5% overhead is taken off.
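The back-of-the-envelope mix is easy to check in a few lines (the rates are the measured ones from the table; the 90% hit ratio and 5% overhead are the assumptions stated above):

```python
# Estimate throughput for a mixed workload: 90% of requests hit
# memcached, 10% fall through to the SQLAlchemy path, and the
# dispatch logic itself costs 5%.

memcached_rps = 2100.0   # measured: memcached lookup
orm_rps = 700.0          # measured: SQLAlchemy path
hit_ratio = 0.90
dispatch_overhead = 0.05

# Throughputs combine through latency (seconds per request),
# not by averaging requests per second directly.
avg_latency = hit_ratio / memcached_rps + (1 - hit_ratio) / orm_rps
mixed_rps = (1.0 / avg_latency) * (1.0 - dispatch_overhead)
# mixed_rps comes out at roughly 1660 requests per second.
```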
This leads to Leo’s WSGI performance rule of thumb number 1: Even if your python webapp is really really really fast, it will still be about 5 times slower than apache serving up static files.