Analyzing a log file with gawk

Original article. If you repost it, please credit the source and keep the link to the original.

Today I needed to analyze a log file on a server and find the requests whose response time was greater than 400 ms.

The log looks roughly like this:

Started GET "/xxx/xxx/l1/pnxx9" for 218.107.18.137 at Mon May 07 10:32:00 +0800 2012
  Processing by XXController#action as HTML
  Parameters: {"x"=>"xxx", "xx"=>"xxxx", "x"=>"xxxx", "x"=>nil}
...
...
...
... lot of render
Completed 200 OK in 222ms (Views: 166.1ms | ActiveRecord: 0.0ms)

Records are separated by blank lines, and the file is fairly large, roughly 200 MB.

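I used gawk to process it. Setting RS to the empty string makes each blank-line-separated block one record, and -F'\n' makes each line of the block a field, so $NF is the "Completed" line. The three-argument form of match() is a gawk extension:

gawk -v RS= -F'\n' '{ split($NF, a, " "); if (match(a[5], /^[0-9]+/, b) && b[0] + 0 > 400) print $1 "\n" $2 "\n" $NF "\n" }' production.log > analyze.log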

Processing took less than a minute, which is acceptable.
