Category: LINUX

2013-07-10 09:53:26

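For work I needed a script that automatically collects the address ranges of ISPs such as China Telecom and China Netcom. Being lazy, I searched the web first and found this post:
http://shunter.blog.51cto.com/2183398/1076743
After reading it, I felt that the author's shell skills and coding style left a lot open to debate, so I wrote a small script of my own.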

First, download the latest address list from the APNIC website:


A brief note on the file format. The following is taken from the official documentation:

    Format:
        registry|cc|type|start|value|date|status[|extensions...]
    Where:
        registry  The registry from which the data is taken.
                  For APNIC resources, this will be:
                      apnic
        cc        ISO 3166 2-letter code of the organisation to
                  which the allocation or assignment was made.
                  May also include the following non-ISO 3166
                  code:

                      AP - networks based in more than one
                           location in the Asia Pacific region
        type      Type of Internet number resource represented
                  in this record. One value from the set of
                  defined strings:
                      {asn,ipv4,ipv6}
        start     In the case of records of type 'ipv4' or
                  'ipv6' this is the IPv4 or IPv6 'first
                  address' of the range.
        value     In the case of IPv4 address the count of
                  hosts for this range. This count does not
                  have to represent a CIDR range.
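
To make the field layout concrete, here is a minimal sketch that splits one record on the `|` delimiter. The sample record and the variable names are illustrative assumptions, not something taken from the README.

```shell
#!/bin/bash
# Illustrative sample line in the format described above (an assumption, for demonstration only).
record='apnic|CN|ipv4|1.0.1.0|256|20110414|allocated'

# Split on "|" into the documented fields; anything after "status" lands in "extensions".
IFS='|' read -r registry cc type start value date status extensions <<EOF
$record
EOF

echo "registry=$registry cc=$cc type=$type"
echo "start=$start value=$value date=$date status=$status"
```

For this sample, `start` holds the first address of the range and `value` the number of addresses in it.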

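start is the first address of the network, and the value field after it is the number of hosts. The README does point out that this count need not correspond to a CIDR range, but to keep processing simple, at the cost of a little accuracy, I convert the count into a CIDR prefix length (subnet mask).
We know that hosts = 2 ^ (32-mask): 256 hosts means a 24-bit mask, that is, 2 to the power (32-24). Going the other way, from a known host count to the mask, we take the base-2 logarithm of the host count and subtract it from 32.
The author of the post above computes the mask with bc code written by admirer, a former CU moderator: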

    mask=$(cat <<EOF | bc | tail -1
    pow=32;
    define log2(x) {
        if (x<=1) return (pow);
        pow--;
        return(log2(x/2));
    }
    log2($cnt)
    EOF
    )

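The author feeds a here document to the bc calculator, defines a bc function, and recurses, so that the number of calls determines the mask. Since the shell has no ready-made log() function, the host count (the $cnt variable in the code) has to be halved repeatedly with integer division. I admit the recursive code looks flashy, but in C programs I would not reach for recursion first, for efficiency reasons, except when traversing tree structures or searching for files recursively.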

Here I simply compute the mask with a shell loop:

    MASK=$(pow=32;for((i=$CNT;i>1;i>>=1)); do :; ((pow--)); done;echo $pow)

A recursive user-defined function in awk works as well:

    MASK=$(awk -v c=$CNT 'function log2(x){if(x<2)return(pow);pow--;return(log2(x/2))}BEGIN{pow=32;print log2(c)}')
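
For comparison, the same halving loop can be wrapped in a small function and checked against a few known sizes. This is only a sketch: the function name `count_to_mask` is hypothetical and does not appear in the original script. Counts that are not powers of two are rounded down, which is the loss of accuracy mentioned earlier (768 addresses are reported as /23, which covers only 512).

```shell
#!/bin/bash
# count_to_mask COUNT: print the prefix length for a block of COUNT addresses.
# Hypothetical helper; it halves COUNT until it reaches 1, like the loop above.
count_to_mask() {
    local count=$1 mask=32
    while [ "$count" -gt 1 ]; do
        count=$((count / 2))    # integer division, same effect as i>>=1
        mask=$((mask - 1))
    done
    echo "$mask"
}

count_to_mask 256      # 24
count_to_mask 65536    # 16
count_to_mask 768      # 23 (not a power of two, so rounded down)
```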

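Since we only classify IPv4 addresses allocated to China, we just filter out the records we need and then query each of them with the whois tool. I also implemented classification by region and concurrent processing with multiple processes.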

    #!/bin/bash
    #include
    #define
    TMP=/tmp/apnic_file
    FILE=$1
    DIR=APNIC
    PROG1="whois.sh"
    PROG2="merge.sh"
    THREAD=30
    #function
    #main
    if [[ -z $1 ]]; then
        echo "$(basename $0) "
        exit
    fi
    which whois &>/dev/null
    if [[ $? -ne 0 ]]; then
        echo "Please install whois(apt-get install whois)"
        exit 1
    fi
    rm -rf $DIR [0-9]* $PROG1 $PROG2 2>/dev/null
    # This file holds three scripts, each starting with its own "#!" line.
    # Everything from the 2nd "#!" line on is written to $PROG1, everything from the 3rd on to $PROG2.
    # (The listing is indented for display; in the real file the "#!" lines start at column 0.)
    tail -n +$(awk '/^#!/{if(i){print NR;exit}i++}' $0) "$0" > $PROG1
    tail -n +$(awk '/^#!/{if(i==2){print NR;exit}i++}' $0) "$0" > $PROG2
    chmod +x $PROG1 $PROG2 2>/dev/null
    # Keep the start address and host count of every CN IPv4 record.
    awk -F"[|]" '/apnic\|CN\|ipv4\|/{print $4,$5}' $FILE > $TMP
    # NOTE: the next line is truncated in the published listing (part of it was lost in the blog's code box).
    # Judging from the loop below, it splits $TMP into numbered block files, one per worker process.
    awk -vp=$THREAD 'BEGIN{while(getline i}' $TMP
    # Run one whois worker per block file in the background, then wait for all of them.
    for BLOCK in $(ls [0-9]*); do
        ./$PROG1 $BLOCK &
    done
    wait
    echo "The whois query is completed"
    # Concatenate the per-block results into $DIR.
    mkdir -p $DIR/CHINANET $DIR/UNICOM
    for dir in $(ls [0-9]*); do
        cd $DIR/$dir
        for i in $(find . -type f); do
            file=${i#*/}
            cat $file >> ../$file
        done
        cd ../..
    done
    rm -rf $DIR/[0-9]* 2>/dev/null
    rm -rf [0-9]* 2>/dev/null
    # Merge adjacent networks in every result file except the log files.
    for file in $(find $DIR -type f); do
        case $file in
            *bug|*error|*print)
                echo "ignore $file"
                ;;
            *)
                ./$PROG2 $file
                ;;
        esac
    done
    rm $PROG1 $PROG2 2>/dev/null
    rm /tmp/whois_*
    echo "$(basename $0) Completed"
    exit 0
    #!/bin/bash
    #################################################
    # hosts = 2 ^ (32-mask)
    # so the mask is 32 minus the base-2 logarithm of the host count.
    #
    #MASK=$(cat <<EOF | bc | tail -1
    #pow=32;
    #define log2(x) {
    # if (x<2) return (pow);
    # pow--;
    # return(log2(x/2));
    #}
    #log2($CNT)
    #EOF
    #)
    #MASK=$(pow=32;for((i=$CNT;i>1;i=i/2)); do :; ((pow--)); done;echo $pow)
    #MASK=$(awk -v c=$CNT 'function log2(x){if(x<2)return(pow);pow--;return(log2(x/2))}BEGIN{pow=32;print log2(c)}')
    #################################################
    #include
    #define
    FILE=$1
    WHOIS=/tmp/whois_$FILE
    DIR=APNIC/$FILE
    #function
    # province ISP IP MASK AREA: append IP/MASK to the file of the province matching AREA.
    province(){
        local name
        case $4 in
            FJ*|fj*|FuZhou|fuzhou) name=fujian ;;
            GD*)                   name=guangdong ;;
            NM)                    name=neimenggu ;;
            GZ)                    name=guizhou ;;
            NX|NINGXIA)            name=ningxia ;;
            HL*)                   name=heilongjiang ;;
            SX|TY)                 name=shanxi ;;
            SN|SHAANXI)            name=shannxi ;;
            HA)                    name=henan ;;
            BJ)                    name=beijing ;;
            CQ)                    name=chongqing ;;
            KM|YN)                 name=yunan ;;
            HB|DIAQOS1)            name=hubei ;;
            XZ)                    name=xizang ;;
            HE)                    name=hebei ;;
            SD)                    name=shandong ;;
            GS)                    name=gansu ;;
            AH|Anhui)              name=anhui ;;
            LN)                    name=liaoning ;;
            HN|HUNAN)              name=hunan ;;
            JS|SZ)                 name=jiangsu ;;
            XJ)                    name=xinjiang ;;
            JX)                    name=jiangxi ;;
            JL)                    name=jilin ;;
            SH|INSURANCE)          name=shanghai ;;
            GX)                    name=guangxi ;;
            HI)                    name=hainan ;;
            TJ)                    name=tianjin ;;
            SC)                    name=sichuan ;;
            QH|GEERMU)             name=qinghai ;;
            HK)                    name=xianggang ;;
            ZJ)                    name=zhejiang ;;
            *)                     name=_other ;;
        esac
        echo "$2/$3" >> $DIR/$1/$name
    }
    # whois_query IP: save the whois reply for IP in $WHOIS and return whois's exit status.
    whois_query(){
        echo -e "Process[$FILE]\twhois [$1]"
        whois $1 > $WHOIS
        return $?
    }
    # ntoa N: convert a 32-bit integer to a dotted-quad address.
    ntoa(){
        awk '{c=256;print int($0/c^3)"."int($0%c^3/c^2)"."int($0%c^3%c^2/c)"."$0%c^3%c^2%c}' <<<$1
    }
    # aton A.B.C.D: convert a dotted-quad address to a 32-bit integer.
    aton(){
        awk '{c=256;split($0,ip,".");print ip[4]+ip[3]*c+ip[2]*c^2+ip[1]*c^3}' <<<$1
    }
    # add_network ISP IP MASK AREA: log the network and file it under its ISP.
    add_network(){
        echo "$2/$3 $1 $4" >> $DIR/print
        case $1 in
            CHINANET|UNICOM)
                # Telecom and Unicom networks are additionally split by province.
                province $1 $2 $3 $4
                ;;
            CMNET|CTTNET|CERNET)
                echo "$2/$3" >> $DIR/$1
                ;;
            *)
                echo "$2/$3 $1 $4" >> $DIR/bug
                echo "$2/$3" >> $DIR/others
                ;;
        esac
    }
    # bool_sub: shorten the prefix of HEAD one bit at a time until the broadcast
    # address of the enclosing network reaches or passes TAIL.
    # Returns 0 (and leaves the prefix length in MASK) when that broadcast address equals TAIL.
    bool_sub(){
        START=$HEAD
        MASK=32
        local NET
        local i=$((~0))
        while [[ $START -lt $TAIL ]]; do
            ((i<<=1))
            NET=$((HEAD&i))
            START=$((~(NET^i)))
            ((MASK--))
            if [[ $START -eq $TAIL ]]; then
                return 0
            fi
        done
        return 1
    }
    # do_whois: parse the whois reply in $WHOIS and record the network(s) it covers.
    do_whois(){
        local NET
        local i=$((~0))
        local j
        # TAIL = last address of the inetnum range, as an integer.
        eval $(awk 'BEGIN{i=256}/^inetnum:/{split($4,ipe,".");ipt=ipe[4]+ipe[3]*i+ipe[2]*i^2+ipe[1]*i^3}END{print "TAIL="ipt}' $WHOIS)
        # ISP and AREA are derived from the netname, mnt-by and mnt-lower fields.
        eval $(awk '/^$/{if(i)exit;}\
                    /^netname:/{i++;split($2,a,"-");isp=a[1];area=a[2];if(isp=="CNC"||isp=="UNI"||isp=="uni")isp="UNICOM";\
                        if((isp=="UNICOM"&&length(area)) || (isp=="CHINANET"&&length(area)))exit}\
                    /^mnt-by:.*CNCGROUP/{n=split($2,a,"-");isp="UNICOM";for(x=1;x<=n;x++){if(a[x]=="CNCGROUP"){area=a[x+1];break}};exit}\
                    /^mnt-by:.*CHINANET/{n=split($2,a,"-");isp="CHINANET";for(x=1;x<=n;x++){if(a[x]=="CHINANET"){area=a[x+1];break}};exit}\
                    /^mnt-by:.*CERNET/{n=split($2,a,"-");isp="CERNET";for(x=1;x<=n;x++){if(a[x]=="CERNET"){area=a[x+1];break}};exit}\
                    /^mnt-by: *MAINT-CN-SNXIAN/{isp="CHINANET";area="SN";exit}\
                    /^netname: *guangzhou-.*-corp/{isp="UNICOM";area="GD";exit}\
                    /^mnt-lower:.*CERNET/{isp="CERNET";exit}\
                    /^mnt-lower:.*CHINANET/{n=split($2,a,"-");isp="CHINANET";for(x=1;x<=n;x++){if(a[x]=="CHINANET"){area=a[x+1];break}};exit}\
                    END{print "ISP="isp";AREA="area}' $WHOIS)
        HEAD=$(aton $IP)
        bool_sub
        if [[ $? -eq 0 ]]; then
            add_network $ISP $IP $MASK $AREA
        else
            # Lengthen the prefix again until HEAD is itself the network address,
            # record that block, then continue with the address right after it.
            j=$((32-MASK))
            ((i<<=j))
            while [[ $NET -ne $HEAD ]]; do
                ((i>>=1))
                NET=$((HEAD&i))
                ((MASK++))
            done
            IP=$(ntoa $HEAD)
            add_network $ISP $IP $MASK $AREA
            TAIL=$((~(NET^i)))
            ((TAIL++))
            IP=$(ntoa $TAIL)
            whois_query $IP
            if [[ $? -eq 0 ]]; then
                do_whois
            else
                echo "$IP/$MASK" >> $DIR/error
            fi
        fi
    }
    #main
    FILE=$1
    rm -rf $DIR 2>/dev/null
    mkdir -p $DIR/CHINANET $DIR/UNICOM
    while read IP CNT; do
        START=$(aton $IP)
        END=$((START+CNT-1))
        TAIL=0
        MASK_MAX=$(pow=32;for((i=$CNT;i>1;i>>=1)); do :; ((pow--)); done;echo $pow)
        while [[ $TAIL -lt $END ]]; do
            whois_query $IP
            if [[ $? -eq 0 ]]; then
                do_whois
                ((TAIL++))
                IP=$(ntoa $TAIL)
            else
                echo "$IP/$MASK" >> $DIR/error
            fi
        done
    done < $FILE
    exit 0
    #!/bin/bash
    if [[ -z $1 ]]; then
        echo "file not found"
        exit
    fi
    if [[ ! -f $1 ]]; then
        echo "$1 does not exist"
        exit
    fi
    TMP=/tmp/merge
    # Merge pairs of adjacent networks with the same prefix length into one shorter prefix,
    # and repeat until a pass no longer changes the file size.
    # The bitwise functions and(), xor(), compl() and lshift() are gawk extensions.
    while :; do
        awk -F"/" '
        function ntoa(n){c=256;return int(n/c^3)"."int(n%c^3/c^2)"."int(n%c^3%c^2/c)"."n%c^3%c^2%c}
        function aton(d){c=256;split(d,ip,".");return ip[4]+ip[3]*c+ip[2]*c^2+ip[1]*c^3}
        function ntobc(a,b){e=compl(0);f=lshift(e,32-b);s=and(a,f);return compl(xor(s,f))}
        function ntosub(j,k){g=compl(0);h=lshift(g,32-k);return and(j,h)}
        NR>1{
            # The current network starts right after the previous broadcast address, has the
            # same prefix length, and the previous network is aligned on the shorter prefix.
            if($1==ntoa(bc+1) && $2==mask && ip_int==ntosub(ip_int,$2-1)){
                mask=$2-1;bc=ntobc(ip_int,mask);
                next;
            }else{
                print add"/"mask
            }
        }
        {add=$1;ip_int=aton($1);mask=$2;bc=ntobc(ip_int,mask)}
        END{print add"/"mask}
        ' $1 > $TMP
        FILE_SIZE=$(ls -l $1 2>/dev/null | awk '{print $5}')
        TMP_SIZE=$(ls -l $TMP 2>/dev/null | awk '{print $5}')
        if [[ $FILE_SIZE -eq $TMP_SIZE ]]; then
            break
        fi
        cp $TMP $1
    done
    exit 0
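
The main script hands each worker process its own numerically named block file, but the line that creates those files did not survive in the listing. As a rough sketch of what such a step could look like (this is not the original code), the following splits a file into at most N contiguous chunks with awk. The function name `split_blocks`, its argument order, and the demo data are assumptions made for illustration.

```shell
#!/bin/bash
# split_blocks FILE N: write FILE's lines into at most N files named 1, 2, ...
# in the current directory, keeping the original line order inside each file.
# Hypothetical helper, not the missing original line.
split_blocks() {
    awk -v p="$2" '
        NR == FNR { n++; next }                      # first pass: count the lines
        FNR == 1  { size = int((n + p - 1) / p) }    # lines per chunk, rounded up
        { out = int((FNR - 1) / size) + 1; print > (out "") }
    ' "$1" "$1"
}

# Demo in a scratch directory: 7 lines split into 3 chunks of 3, 3 and 1 lines.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
printf '%s\n' a b c d e f g > list
split_blocks list 3
```

Keep in mind that a glob such as `[0-9]*` expands in lexical order (1, 10, 11, ..., 2), so with ten or more chunks the files are not visited in numeric order.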





The CU blog's code box may mangle the listing, so if you run into problems, download the original file.

Original code:
apnic.rar

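The APNIC directory contains three log files:
bug holds the parsed information for addresses that fall outside the main categories; their IP ranges have already been added to others.
print holds the information for every query that was processed.
error holds the addresses that could not be looked up because whois timed out.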

Thanks to waker, blackold, cjaizss and my other close friends on CU.

zooyo 2018-10-18 20:01:05

jpomichael: PROG1="whois.sh"
    PROG2="merge.sh"
Where are these two files?

The script automatically splits itself into the other two scripts.


jpomichael 2018-08-20 16:38:27

PROG1="whois.sh"
    PROG2="merge.sh"
Where are these two files?