20070112

How to look for long running pages in the Apache Logs using awk

In this tutorial we will learn how to look for pages, that takes 3 or more seconds to be served by Apache.

Usually the slow pages will be some sort of dynamic ones (PHPs, ASPs, Servlets or JSPs) because the static html pages take very short time to be served.


This procedure can help you in find some bottlenecks in your dynamic web application.


First step, change your Apache configuration



Locate your httpd.conf usually located in the $APACHE_HOME/conf directory

Edit the file (vi powa) and change the format of the Log File, adding a %T at the end.

Example:
from:
LogFormat "%h %l %u %t \"%r\" %>s %b" common
to:
LogFormat "%h %l %u %t \"%r\" %>s %b %T" common

Restart your apache server



go to $APACHE_HOME/bin ./apachectl stop and ./apachectl start

Now, if you locate the log file and take a look on the output (tail -f access_log) you can see the last column is the time taken in seconds to serve the page.

Example:

555.254.999.71 - omatech [12/Jan/2007:12:12:47 +0100] "GET /omatech/rulez/pk_microcalls.view_tralari_tralara?p_user_logged_id=1&p_load_id=07 HTTP/1.1" 200 3713814 4
555.254.999.71 - omatech [12/Jan/2007:12:38:35 +0100] "GET /omatech/rulez/pk_microcalls.view_ekipoja?p_user_id=1&p_type=C&p_cust=739 HTTP/1.1" 200 73793 5


In the above example, the first request have taken 4 seconds, and the second 5 seconds.

Extract the long duration requests with awk



This is the tricky one, you must use awk to extract the requests that takes more than 3 seconds, this step is useful because you eventually will have lots and lots of requests that take 0 seconds.


awk '/[^0-2]$/ {print $0}' access_log



This line uses awk to parse the file and extract the lines that DO NOT ends with 0, 1, or 2.


The [^] is for negation, the 0-2 is the range, and the $ stands for "at the end of the line"

I hope it will be useful