Django profiling again

The process of profiling takes the identification of bottlenecks (parts of the code that run slowly) followed by their more or less successful removal. As described in the previous post about profiling you can get helped by using a profile middleware. The solution is all fine, but the problem remains how to identify a bottleneck.

You don’t want to test all of your views in a huge application by appending ?profile in the urls. At least I don’t. And even if you do, you might detect a view that takes few seconds to response, but it is called only twice a year. While another view taking a millisecond is called million times a day. Therefore the following decorator and a shell script can help you identifying such views.

The decorator stores function’s name, time it spent and its timestamp in a temporary file. The shell script then parses that file and displays sum of time used in each function. All you have to do now is add @execution_time in front of every function you’d like to measure.

class execution_time(object):
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        from datetime import datetime
        n1 = datetime.now()
        result = self.func(*args, **kwargs)
        n2 = datetime.now()
        delta = n2 - n1
        file = open('/tmp/exectime', 'a')
        file.write('%d %s %d %d %d %d %d %d %dn' % (
            delta.microseconds, self.func.__name__,
            n1.year, n1.month, n1.day, n1.hour, n1.minute,
            n1.second, n1.microsecond)
        )
        file.close()
        return result
    def __repr__(self):
        """Return the function's docstring."""
        return self.func.__doc__

The shell script. Cat, (g)awk, and sort. Make a call with the file produced by the decorator as an input parameter.

#!/bin/bash
echo '  absolute    relative     #calls   time/call function'

cat $1 |
gawk '{ all += $1; a[$2] += $1; freq[$2] += 1; }
END { for(x in a) printf "%10d %11g %10d %11g %sn",
    a[x], a[x]/all, freq[x], a[x]/freq[x], x; }' |
sort -n

Finally, the output. Ordered by absolute time spent in function. The second column shows the percentage of time spent in a function, e.g., the model_preview function taking 10,61% of all measured time. Note that this is relative to the sum of time that was spent only by measured views. One of the drawbacks of this decorator is that it is not taking into account the fact that one function might call the other. So if function A calls B, and you measure both, you’ll end with time for A that is actually time for A and B together. But that’s when the profiling using middleware comes handy.

  absolute    relative     #calls   time/call function
    211772 0.000133614         21     10084.4 load
    368835  0.00023271         72     5122.71 sort_boxes
    384175 0.000242389         33     11641.7 remove_from_cart

...

 139603047   0.0880802       4585     30447.8 design
 160016266     0.10096       7074     22620.3 render_image
 168310848    0.106193       4172       40343 model_preview

Share!
  • email
  • Digg
  • del.icio.us
  • Facebook
  • Google Bookmarks
  • StumbleUpon

No Comments

Post a Comment

Your email is never shared. Required fields are marked *