How to look for long running pages in the Apache Logs using awk

In this tutorial we will learn how to look for pages, that takes 3 or more seconds to be served by Apache.

Usually the slow pages will be some sort of dynamic ones (PHPs, ASPs, Servlets or JSPs) because the static html pages take very short time to be served.

This procedure can help you in find some bottlenecks in your dynamic web application.

First step, change your Apache configuration

Locate your httpd.conf usually located in the $APACHE_HOME/conf directory

Edit the file (vi powa) and change the format of the Log File, adding a %T at the end.

LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%h %l %u %t \"%r\" %>s %b %T" common

Restart your apache server

go to $APACHE_HOME/bin ./apachectl stop and ./apachectl start

Now, if you locate the log file and take a look on the output (tail -f access_log) you can see the last column is the time taken in seconds to serve the page.


555.254.999.71 - omatech [12/Jan/2007:12:12:47 +0100] "GET /omatech/rulez/pk_microcalls.view_tralari_tralara?p_user_logged_id=1&p_load_id=07 HTTP/1.1" 200 3713814 4
555.254.999.71 - omatech [12/Jan/2007:12:38:35 +0100] "GET /omatech/rulez/pk_microcalls.view_ekipoja?p_user_id=1&p_type=C&p_cust=739 HTTP/1.1" 200 73793 5

In the above example, the first request have taken 4 seconds, and the second 5 seconds.

Extract the long duration requests with awk

This is the tricky one, you must use awk to extract the requests that takes more than 3 seconds, this step is useful because you eventually will have lots and lots of requests that take 0 seconds.

awk '/[^0-2]$/ {print $0}' access_log

This line uses awk to parse the file and extract the lines that DO NOT ends with 0, 1, or 2.

The [^] is for negation, the 0-2 is the range, and the $ stands for "at the end of the line"

I hope it will be useful