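For work, I needed a script that automatically collects the address ranges of ISPs such as China Telecom and China Netcom. Being lazy, I searched online first and found this post:
http://shunter.blog.51cto.com/2183398/1076743
After reading it, I felt there was a lot to question about the author's shell skills and coding style, so I wrote a small script of my own.
First, download the latest address list from the APNIC website:
Here is a brief explanation of the file format; the official documentation reads: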
Format:

    registry|cc|type|start|value|date|status[|extensions...]

Where:

    registry    The registry from which the data is taken.
                For APNIC resources, this will be:

                    apnic

    cc          ISO 3166 2-letter code of the organisation to
                which the allocation or assignment was made.
                May also include the following non-ISO 3166
                code:

                    AP - networks based in more than one
                         location in the Asia Pacific region

    type        Type of Internet number resource represented
                in this record. One value from the set of
                defined strings:

                    {asn,ipv4,ipv6}

    start       In the case of records of type 'ipv4' or
                'ipv6' this is the IPv4 or IPv6 'first
                address' of the range.

    value       In the case of IPv4 address the count of
                hosts for this range. This count does not
                have to represent a CIDR range.
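To make the field layout concrete, here is a minimal sketch that pulls the start address and host count out of a few records with awk. The three sample records are made up for illustration (they only follow the format quoted above), and comparing individual fields is just one possible way to filter, not something prescribed by the README.

```shell
#!/bin/bash
# Sample records in the registry|cc|type|start|value|date|status layout.
# The values are illustrative only.
sample='apnic|CN|ipv4|1.0.1.0|256|20110414|allocated
apnic|JP|ipv4|1.0.16.0|4096|20110412|allocated
apnic|CN|ipv6|2001:250::|35|20000426|allocated'

# Split on "|" and keep records whose cc is CN and whose type is ipv4;
# print field 4 (start address) and field 5 (host count).
cn_ipv4=$(printf '%s\n' "$sample" |
    awk -F'|' '$2 == "CN" && $3 == "ipv4" { print $4, $5 }')

echo "$cn_ipv4"    # prints: 1.0.1.0 256
```

Only the first record survives: the second has a different country code and the third is an IPv6 record.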
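start is the first address of the network, and the value field that follows is the number of hosts. The README does point out that this count need not correspond to a CIDR range, but to keep processing simple I give up a little precision and convert the count into a CIDR subnet mask (prefix length).
We know that hosts = 2 ^ (32 - mask). In other words, 256 hosts means a 24-bit subnet mask, that is, 2 to the power of (32 - 24). Going the other way, from a known host count to the mask, we take the base-2 logarithm of the host count and subtract it from 32 to get the mask.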
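As a quick check of this relationship, the sketch below goes in the forward direction (mask to host count) and tests whether a count is a power of two, which is the case where the conversion loses nothing. The function names hosts_for_mask and is_cidr_count are made up for this illustration.

```shell
#!/bin/bash
# hosts_for_mask MASK: number of addresses in a /MASK network, 2^(32-MASK).
hosts_for_mask() {
    echo $(( 1 << (32 - $1) ))
}

# is_cidr_count COUNT: succeeds only if COUNT is a power of two, which is
# a necessary condition for the range to be a single CIDR block.
is_cidr_count() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

hosts_for_mask 24    # prints 256
hosts_for_mask 16    # prints 65536
is_cidr_count 768 || echo "768 is not a power of two"
```

A count such as 768 (512 + 256) is the kind of value for which a single mask can only be an approximation.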
The author of the post above then computes the mask with bc code written by admirer, a former CU moderator:
mask=$(cat <<EOF | bc | tail -1
pow=32;
define log2(x) {
    if (x<=1) return (pow);
    pow--;
    return(log2(x/2));
}
log2($cnt)
EOF
)
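The author feeds a here document to the bc calculator, defines a bc function, and uses recursion so that the number of function calls yields the subnet mask. Because shell has no ready-made log() function, the host count (the $cnt variable in the code) has to be halved repeatedly with integer division. I admit that recursive code looks flashy, but in C programs, apart from traversing tree structures and searching files recursively, I generally do not reach for recursion first, for efficiency reasons.
Here I simply use a shell loop to compute the subnet mask:
MASK=$(pow=32;for((i=$CNT;i>1;i>>=1)); do :; ((pow--)); done;echo $pow)
A recursive user-defined function in awk works as well:
MASK=$(awk -v c=$CNT 'function log2(x){if(x<2)return(pow);pow--;return(log2(x/2))}BEGIN{pow=32;print log2(c)}')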
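As a sanity check, the sketch below wraps the same halving loop and the same awk recursion in two functions (mask_loop and mask_awk, names chosen here for illustration) and prints both results for a few host counts. It assumes an awk that supports user-defined functions, which any POSIX awk does.

```shell
#!/bin/bash
# mask_loop COUNT: halve COUNT until it reaches 1, decrementing the mask
# from 32 once per halving.
mask_loop() {
    local cnt=$1 pow=32
    while [ "$cnt" -gt 1 ]; do
        cnt=$(( cnt >> 1 ))
        pow=$(( pow - 1 ))
    done
    echo "$pow"
}

# mask_awk COUNT: the same computation as a recursive awk function.
mask_awk() {
    awk -v c="$1" '
        function log2(x) { if (x < 2) return pow; pow--; return log2(x / 2) }
        BEGIN { pow = 32; print log2(c) }'
}

for cnt in 1 256 768 65536 4194304; do
    printf '%-8s loop=/%s awk=/%s\n' "$cnt" "$(mask_loop "$cnt")" "$(mask_awk "$cnt")"
done
```

For a count that is not a power of two, both versions round down to the largest power of two that fits: 768 gives /23, which covers only 512 addresses. That is the small loss of precision accepted earlier.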
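Since we only classify IPv4 addresses allocated to China, we just filter out the records we need and then query them one by one with the whois tool to get the results. I also implemented classification by region and concurrent processing with multiple processes.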
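The multi-process part boils down to a common shell pattern: split the input into chunk files, start one background process per chunk, and wait for all of them. The sketch below shows only that pattern under simplifying assumptions: process_chunk is a hypothetical stand-in for the per-chunk whois work, the five addresses are sample data, and lines are dealt round-robin into two chunks for brevity.

```shell
#!/bin/bash
# Hypothetical worker: stands in for the per-chunk whois lookups and just
# records how many lines the chunk contained.
process_chunk() {
    wc -l < "$1" | tr -d ' ' > "$1.done"
}

workdir=$(mktemp -d)
printf '%s\n' 1.0.1.0 1.0.2.0 1.0.8.0 1.0.32.0 1.1.0.0 > "$workdir/all"

# Deal the lines into 2 chunk files named 0 and 1.
awk -v p=2 -v d="$workdir" '{ print > (d "/" (NR % p)) }' "$workdir/all"

for chunk in "$workdir"/[0-9]; do
    process_chunk "$chunk" &    # one background process per chunk
done
wait                            # block until every worker has exited

total=$(cat "$workdir"/*.done | awk '{ s += $1 } END { print s }')
echo "processed $total addresses"
rm -rf "$workdir"
```

Naming the chunk files with digits lets a simple glob pick them up, and wait with no arguments returns only after all background children have finished.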
#!/bin/bash
#include
#define
TMP=/tmp/apnic_file
FILE=$1
DIR=APNIC
PROG1="whois.sh"
PROG2="merge.sh"
THREAD=30
#function
#main
if [[ -z $1 ]]; then
    echo "Usage: $(basename $0) FILE"
    exit
fi
which whois &>/dev/null
if [[ $? -ne 0 ]]; then
    echo "Please install whois(apt-get install whois)"
    exit 1
fi
rm -rf $DIR [0-9]* $PROG1 $PROG2 2>/dev/null
# Extract the two embedded scripts that follow this one in the same file.
tail -n +$(awk '/^#!/{if(i){print NR;exit}i++}' $0) "$0" > $PROG1
tail -n +$(awk '/^#!/{if(i==2){print NR;exit}i++}' $0) "$0" > $PROG2
chmod +x $PROG1 $PROG2 2>/dev/null
# Keep only the CN IPv4 records: start address and host count.
awk -F"[|]" '/apnic\|CN\|ipv4\|/{print $4,$5}' $FILE > $TMP
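# Split $TMP into at most $THREAD numbered chunk files. The awk program was
# truncated in the source after "getline i"; the rest is an assumed reconstruction.
awk -vp=$THREAD 'BEGIN{while((getline i < ARGV[1]) > 0)n++;close(ARGV[1]);s=int((n+p-1)/p)}{print > (int((NR-1)/s) "")}' $TMP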
for BLOCK in $(ls [0-9]*); do
    ./$PROG1 $BLOCK &
done
wait
echo "The whois query is completed"
mkdir -p $DIR/CHINANET $DIR/UNICOM
for dir in $(ls [0-9]*); do
    cd $DIR/$dir
    for i in $(find . -type f); do
        file=${i#*/}
        cat $file >> ../$file
    done
    cd ../..
done
rm -rf $DIR/[0-9]* 2>/dev/null
rm -rf [0-9]* 2>/dev/null
for file in $(find $DIR -type f); do
    case $file in
        *bug|*error|*print)
            echo "ignore $file"
            ;;
        *)
            ./merge.sh $file
            ;;
    esac
done
rm $PROG1 $PROG2 2>/dev/null
rm /tmp/whois_*
echo "$(basename $0) Completed"
exit 0
#!/bin/bash
#################################################
# hosts = 2 ^ (32 - mask)
# so the mask is 32 minus the base-2 logarithm of the host count.
#
#MASK=$(cat <<EOF | bc | tail -1
#pow=32;
#define log2(x) {
#    if (x<2) return (pow);
#    pow--;
#    return(log2(x/2));
#}
#log2($CNT)
#EOF
#)
#MASK=$(pow=32;for((i=$CNT;i>1;i=i/2)); do :; ((pow--)); done;echo $pow)
#MASK=$(awk -v c=$CNT 'function log2(x){if(x<2)return(pow);pow--;return(log2(x/2))}BEGIN{pow=32;print log2(c)}')
#################################################
#include
#define
FILE=$1
WHOIS=/tmp/whois_$FILE
DIR=APNIC/$FILE
#function
province(){
    case $4 in
        FJ*|fj*|FuZhou|fuzhou)
            echo "$2/$3" >> $DIR/$1/fujian
            ;;
        GD*)
            echo "$2/$3" >> $DIR/$1/guangdong
            ;;
        NM)
            echo "$2/$3" >> $DIR/$1/neimenggu
            ;;
        GZ)
            echo "$2/$3" >> $DIR/$1/guizhou
            ;;
        NX|NINGXIA)
            echo "$2/$3" >> $DIR/$1/ningxia
            ;;
        HL*)
            echo "$2/$3" >> $DIR/$1/heilongjiang
            ;;
        SX|TY)
            echo "$2/$3" >> $DIR/$1/shanxi
            ;;
        SN|SHAANXI)
            echo "$2/$3" >> $DIR/$1/shannxi
            ;;
        HA)
            echo "$2/$3" >> $DIR/$1/henan
            ;;
        BJ)
            echo "$2/$3" >> $DIR/$1/beijing
            ;;
        CQ)
            echo "$2/$3" >> $DIR/$1/chongqing
            ;;
        KM|YN)
            echo "$2/$3" >> $DIR/$1/yunan
            ;;
        HB|DIAQOS1)
            echo "$2/$3" >> $DIR/$1/hubei
            ;;
        XZ)
            echo "$2/$3" >> $DIR/$1/xizang
            ;;
        HE)
            echo "$2/$3" >> $DIR/$1/hebei
            ;;
        SD)
            echo "$2/$3" >> $DIR/$1/shandong
            ;;
        GS)
            echo "$2/$3" >> $DIR/$1/gansu
            ;;
        AH|Anhui)
            echo "$2/$3" >> $DIR/$1/anhui
            ;;
        LN)
            echo "$2/$3" >> $DIR/$1/liaoning
            ;;
        HN|HUNAN)
            echo "$2/$3" >> $DIR/$1/hunan
            ;;
        JS|SZ)
            echo "$2/$3" >> $DIR/$1/jiangsu
            ;;
        XJ)
            echo "$2/$3" >> $DIR/$1/xinjiang
            ;;
        JX)
            echo "$2/$3" >> $DIR/$1/jiangxi
            ;;
        JL)
            echo "$2/$3" >> $DIR/$1/jilin
            ;;
        SH|INSURANCE)
            echo "$2/$3" >> $DIR/$1/shanghai
            ;;
        GX)
            echo "$2/$3" >> $DIR/$1/guangxi
            ;;
        HI)
            echo "$2/$3" >> $DIR/$1/hainan
            ;;
        TJ)
            echo "$2/$3" >> $DIR/$1/tianjin
            ;;
        SC)
            echo "$2/$3" >> $DIR/$1/sichuan
            ;;
        QH|GEERMU)
            echo "$2/$3" >> $DIR/$1/qinghai
            ;;
        HK)
            echo "$2/$3" >> $DIR/$1/xianggang
            ;;
        ZJ)
            echo "$2/$3" >> $DIR/$1/zhejiang
            ;;
        *)
            echo "$2/$3" >> $DIR/$1/_other
            ;;
    esac
}
whois_query(){
    echo -e "Process[$FILE]\twhois [$1]"
    whois $1 > $WHOIS
    return $?
}
ntoa(){
    awk '{c=256;print int($0/c^3)"."int($0%c^3/c^2)"."int($0%c^3%c^2/c)"."$0%c^3%c^2%c}' <<<$1
}
aton(){
    awk '{c=256;split($0,ip,".");print ip[4]+ip[3]*c+ip[2]*c^2+ip[1]*c^3}' <<<$1
}
add_network(){
    echo "$2/$3 $1 $4" >> $DIR/print
    case $1 in
        CHINANET)
            province $1 $2 $3 $4
            ;;
        UNICOM)
            province $1 $2 $3 $4
            ;;
        CMNET)
            echo "$2/$3" >> $DIR/$1
            ;;
        CTTNET)
            echo "$2/$3" >> $DIR/$1
            ;;
        CERNET)
            echo "$2/$3" >> $DIR/$1
            ;;
        *)
            echo "$2/$3 $1 $4" >> $DIR/bug
            echo "$2/$3" >> $DIR/others
            ;;
    esac
}
bool_sub(){
    START=$HEAD
    MASK=32
    local NET
    local i=$((~0))
    while [[ $START -lt $TAIL ]]; do
        ((i<<=1))
        NET=$((HEAD&i))
        START=$((~(NET^i)))
        ((MASK--))
        if [[ $START -eq $TAIL ]]; then
            return 0
        fi
    done
    return 1
}
do_whois(){
    local NET
    local i=$((~0))
    local j
    eval $(awk 'BEGIN{i=256}/^inetnum:/{split($4,ipe,".");ipt=ipe[4]+ipe[3]*i+ipe[2]*i^2+ipe[1]*i^3}END{print "TAIL="ipt}' $WHOIS)
    eval $(awk '/^$/{if(i)exit;}\
    /^netname:/{i++;split($2,a,"-");isp=a[1];area=a[2];if(isp=="CNC"||isp=="UNI"||isp=="uni")isp="UNICOM";\
    if((isp=="UNICOM"&&length(area)) || (isp=="CHINANET"&&length(area)))exit}\
    /^mnt-by:.*CNCGROUP/{n=split($2,a,"-");isp="UNICOM";for(x=1;x<=n;x++){if(a[x]=="CNCGROUP"){area=a[x+1];break}};exit}\
    /^mnt-by:.*CHINANET/{n=split($2,a,"-");isp="CHINANET";for(x=1;x<=n;x++){if(a[x]=="CHINANET"){area=a[x+1];break}};exit}\
    /^mnt-by:.*CERNET/{n=split($2,a,"-");isp="CERNET";for(x=1;x<=n;x++){if(a[x]=="CERNET"){area=a[x+1];break}};exit}\
    /^mnt-by: *MAINT-CN-SNXIAN/{isp="CHINANET";area="SN";exit}\
    /^netname: *guangzhou-.*-corp/{isp="UNICOM";area="GD";exit}\
    /^mnt-lower:.*CERNET/{isp="CERNET";exit}\
    /^mnt-lower:.*CHINANET/{n=split($2,a,"-");isp="CHINANET";for(x=1;x<=n;x++){if(a[x]=="CHINANET"){area=a[x+1];break}};exit}\
    END{print "ISP="isp";AREA="area}' $WHOIS)
    HEAD=$(aton $IP)
    bool_sub
    if [[ $? -eq 0 ]]; then
        add_network $ISP $IP $MASK $AREA
    else
        j=$((32-MASK))
        ((i<<=j))
        while [[ $NET -ne $HEAD ]]; do
            ((i>>=1))
            NET=$((HEAD&i))
            ((MASK++))
        done
        IP=$(ntoa $HEAD)
        add_network $ISP $IP $MASK $AREA
        TAIL=$((~(NET^i)))
        ((TAIL++))
        IP=$(ntoa $TAIL)
        whois_query $IP
        if [[ $? -eq 0 ]]; then
            do_whois
        else
            echo "$IP/$MASK" >> $DIR/error
        fi
    fi
}
#main
FILE=$1
rm -rf $DIR 2>/dev/null
mkdir -p $DIR/CHINANET $DIR/UNICOM
while read IP CNT; do
    START=$(aton $IP)
    END=$((START+CNT-1))
    TAIL=0
    MASK_MAX=$(pow=32;for((i=$CNT;i>1;i>>=1)); do :; ((pow--)); done;echo $pow)
    while [[ $TAIL -lt $END ]]; do
        whois_query $IP
        if [[ $? -eq 0 ]]; then
            do_whois
            ((TAIL++))
            IP=$(ntoa $TAIL)
        else
            echo "$IP/$MASK" >> $DIR/error
        fi
    done
done < $FILE
exit 0
#!/bin/bash
if [[ -z $1 ]]; then
    echo "file not found"
    exit
fi
if [[ ! -f $1 ]]; then
    echo "$1 does not exist"
    exit
fi
TMP=/tmp/merge
# Repeat until a pass merges nothing, i.e. the file size stops changing.
while :; do
    # compl(), lshift(), and() and xor() are gawk extensions.
    awk -F"/" '
    function ntoa(n){c=256;return int(n/c^3)"."int(n%c^3/c^2)"."int(n%c^3%c^2/c)"."n%c^3%c^2%c}
    function aton(d){c=256;split(d,ip,".");return ip[4]+ip[3]*c+ip[2]*c^2+ip[1]*c^3}
    function ntobc(a,b){e=compl(0);f=lshift(e,32-b);s=and(a,f);return compl(xor(s,f))}
    function ntosub(j,k){g=compl(0);h=lshift(g,32-k);return and(j,h)}
    NR>1{
        if($1==ntoa(bc+1) && $2==mask && ip_int==ntosub(ip_int,$2-1)){
            mask=$2-1;bc=ntobc(ip_int,mask);
            next;
        }else{
            print add"/"mask
        }
    }
    {add=$1;ip_int=aton($1);mask=$2;bc=ntobc(ip_int,mask)}
    END{print add"/"mask}
    ' $1 > $TMP
    FILE_SIZE=$(ls -l $1 2>/dev/null | awk '{print $5}')
    TMP_SIZE=$(ls -l $TMP 2>/dev/null | awk '{print $5}')
    if [[ $FILE_SIZE -eq $TMP_SIZE ]]; then
        break
    fi
    cp $TMP $1
done
exit 0
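Because of quirks in the CU blog's code box, download the original file if the code above gives you trouble.
Original code:
apnic.rar
There are three log files under the APNIC directory:
bug holds the addresses that fall outside the major categories above, together with their parsed information; their IP ranges have already been added to others.
print holds the information for every query processed.
error holds the addresses that could not be looked up because whois timed out.
Thanks to waker, blackold, cjaizss and my other good friends at CU.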