Apache log analysis - liang3391 - ChinaUnix blog

Apache log analysis

Category: LINUX

2010-04-18 20:23:07

Assume the Apache log format is:

118.78.199.98 - - [09/Jan/2010:00:59:59 +0800] "GET /Public/Css/index.css HTTP/1.1" 304 - "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6.3)"

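Question 1: Find the 10 IP addresses with the most requests in the Apache log.
awk '{print $1}' apache_log |sort |uniq -c|sort -nr|head
awk first extracts the IP from each log entry; if the log format has been customized, use -F to set the field separator and change which column print outputs;
sort does a first pass so that identical records end up next to each other;
uniq -c merges repeated lines and records how many times each one occurred;
sort -nr sorts numerically in descending order;
head keeps the top ten (ten lines is its default).
   The command I based this on:
   Show the 10 most frequently used commands
   sed -e "s/| /\n/g" ~/.bash_history | cut -d ' ' -f 1 | sort | uniq -c | sort -nr | head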
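
The question 1 pipeline can also be wrapped in a small function so that the log file and the number of results are parameters. This is only a sketch: the name top_ips and its arguments are made up for illustration, and it counts with an awk array instead of sort | uniq -c, which should give the same totals.

```shell
#!/bin/bash
# top_ips FILE [N]: print the N most frequent client IPs (default 10).
# Hypothetical helper, not part of the original post.
top_ips() {
    local log=$1 n=${2:-10}
    # Count every value of field 1 in one pass, then print "count ip".
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$log" |
        sort -k1,1nr -k2,2 |   # highest count first, ties ordered by IP
        head -n "$n"
}
```

For example, top_ips apache_log 20 would list the twenty busiest addresses, one "count ip" pair per line.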

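Question 2: Find the minutes with the most requests in the Apache log.
awk '{print $4}' apache_log |cut -c 14-18|sort|uniq -c|sort -nr|head
awk splits on whitespace, so the fourth column is [09/Jan/2010:00:59:59;
cut -c 14-18 extracts characters 14 through 18, which is the HH:MM part (00:59 here). Because the date is dropped, the same minute on different days is counted together;
the rest works as in question 1.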
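
If the log spans several days, it may be more useful to keep the date in the key. The sketch below assumes the timestamp layout of the sample line and cuts characters 2-18 instead of 14-18; the function name busiest_minutes is hypothetical.

```shell
#!/bin/bash
# busiest_minutes FILE [N]: top N minutes, keyed as DD/Mon/YYYY:HH:MM.
# Hypothetical helper, not part of the original post.
busiest_minutes() {
    local log=$1 n=${2:-10}
    # Field 4 looks like "[09/Jan/2010:00:59:59". Characters 2-18 drop the
    # leading "[" and the trailing ":SS", keeping date, hour and minute.
    awk '{ print $4 }' "$log" | cut -c 2-18 |
        sort | uniq -c | sort -nr | head -n "$n"
}
```

With this key, 00:59 on 9 January and 00:59 on 10 January are counted as two separate minutes.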

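Question 3: Find the most visited pages in the Apache log:
awk '{print $11}' apache_log |sed 's/^.*cn\(.*\)\"/\1/g'|sort |uniq -c|sort -rn|head
Similar to questions 1 and 2. The only special part is the sed substitution, which replaces the quoted URL in column 11 with the part captured by the parentheses, for example "/common/index.php". Note that in the sample format above column 11 is the Referer field, so this counts referring pages on a domain ending in cn; the path of the requested page itself is column 7.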
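
To rank the requested pages themselves, one option is to read field 7, which holds the request path in the sample log line. The following is a sketch under that assumption; the name top_pages is hypothetical, and stripping the query string is a design choice that may not suit every site.

```shell
#!/bin/bash
# top_pages FILE [N]: print the N most requested paths (default 10).
# Hypothetical helper, not part of the original post.
top_pages() {
    local log=$1 n=${2:-10}
    # Field 7 is the request path in the sample format. sub() removes any
    # query string so /a.php?x=1 and /a.php?x=2 count as the same page.
    awk '{ path = $7; sub(/\?.*/, "", path); print path }' "$log" |
        sort | uniq -c | sort -nr | head -n "$n"
}
```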

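Question 4: Find the busiest (most heavily loaded) time periods in the Apache log, measured in minutes, then see which IPs made the most requests during each of them.
Version 1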
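#!/bin/bash
# Analyze an Apache access log
# history
# caoyameng version0.1 2010/01/24

# Take the log file from the first argument, or prompt for it.
if [ -z "$1" ]; then
    read -r -p "Specify logfile:" LOG
else
    LOG=$1
fi

if [ ! -e "$LOG" ]; then
    echo "I can't find apache log file."
    exit 1
fi

# Top ten minutes (HH:MM) by request count, saved as "count HH:MM" lines.
awk '{print $4}' "$LOG" |cut -c 14-18|sort|uniq -c|sort -nr|head >timelog
for i in $(awk '{print $2}' timelog)
do
    # Total number of requests in this minute.
    all=$(awk -v t="$i" '$2 == t {print $1}' timelog)
    echo " $i $all"
    # Top IPs in this minute; ":$i:" stops HH:MM from matching as MM:SS.
    IP=$(grep ":$i:" "$LOG" | awk '{print $1}' |sort |uniq -c|sort -nr|head)
    echo "$IP"
done
rm -f timelog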

Another version of the solution; it only changes how the for loop iterates.

#!/bin/bash
# Analyze an Apache access log
# history
# caoyameng version0.2 2010/01/24

# Take the log file from the first argument, or prompt for it.
if [ -z "$1" ]; then
    read -r -p "Specify logfile:" LOG
else
    LOG=$1
fi

if [ ! -e "$LOG" ]; then
    echo "I can't find apache log file."
    exit 1
fi

awk '{print $4}' "$LOG" |cut -c 14-18|sort|uniq -c|sort -nr|head >timelog
# Walk the ranked lines by number; stop early if there are fewer than ten.
lines=$(wc -l <timelog)
for (( i=1; i<=lines; i=i+1 ))
do
    num=$(sed -n "${i}p" timelog|awk '{print $1}')
    time=$(sed -n "${i}p" timelog|awk '{print $2}')
    echo "####The No.$i "
    echo " "
    echo " $time   $num"
    echo " "
    # Top IPs in this minute; ":$time:" stops HH:MM from matching as MM:SS.
    full=$(grep ":$time:" "$LOG"| awk '{print $1}' |sort |uniq -c|sort -nr|head)
    echo "$full"
    echo " "
done
rm -f timelog
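
Both versions write their intermediate ranking to a file named timelog in the current directory. As a possible variation, the same report can be produced with a pipeline feeding a while read loop, which needs no temporary file. This is a sketch only: the function name report_busy_minutes and its output layout are made up, and it assumes the timestamp layout of the sample line.

```shell
#!/bin/bash
# report_busy_minutes FILE [N]: for each of the N busiest minutes, print
# "== HH:MM  count requests" followed by the top IPs seen in that minute.
# Hypothetical helper, not part of the original post.
report_busy_minutes() {
    local log=$1 n=${2:-10} count minute
    awk '{ print $4 }' "$log" | cut -c 14-18 | sort | uniq -c | sort -nr |
        head -n "$n" |
        while read -r count minute; do
            echo "== $minute  $count requests"
            # Compare the HH:MM part of field 4 exactly, so a value such as
            # 00:59 cannot match the minute:second part of another timestamp.
            awk -v m="$minute" 'substr($4, 14, 5) == m { print $1 }' "$log" |
                sort | uniq -c | sort -nr | head
        done
}
```

Calling report_busy_minutes apache_log 5 would print five such sections, busiest minute first.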

Reposted from the linuxtone website
