Category: LINUX

2014-07-17 01:01:36

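Anyone who has worked on networking has probably used a packet capture and analysis tool: on Windows that is usually Wireshark, and on Linux most systems ship with tcpdump. This post looks at tcpdump. For how to use it, see the documentation; its command-line options are not complicated, and the real complexity lies in the filter expressions it supports. For both Wireshark and tcpdump, I think two things matter most: the basic capture mechanism and the powerful protocol decoding system.
The latest tcpdump source code and the libpcap library are both available for download.
Reference code: tcpdump 4.5.1, libpcap 1.5.3
Let's start with the main function in tcpdump.c: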

int
main(int argc, char **argv)
{
    register int cnt, op, i;
    bpf_u_int32 localnet, netmask;
    register char *cp, *infile, *cmdbuf, *device, *RFileName, *VFileName, *WFileName;
    pcap_handler callback;
    int type;
    int dlt;
    int new_dlt;
    const char *dlt_name;
    struct bpf_program fcode;
#ifndef WIN32
    RETSIGTYPE (*oldhandler)(int);
#endif
    struct print_info printinfo;
    struct dump_info dumpinfo;
    u_char *pcap_userdata;
    char ebuf[PCAP_ERRBUF_SIZE];
    char VFileLine[PATH_MAX + 1];
    char *username = NULL;
    char *chroot_dir = NULL;
    char *ret = NULL;
    char *end;
#ifdef HAVE_PCAP_FINDALLDEVS
    pcap_if_t *devpointer;
    int devnum;
#endif
    int status;
    FILE *VFile;
#ifdef WIN32
    if(wsockinit() != 0) return 1;
#endif /* WIN32 */

    jflag=-1;    /* not set */
    gndo->ndo_Oflag=1;
    gndo->ndo_Rflag=1;
    gndo->ndo_dlt=-1;
    gndo->ndo_default_print=ndo_default_print;
    gndo->ndo_printf=tcpdump_printf;
    gndo->ndo_error=ndo_error;
    gndo->ndo_warning=ndo_warning;
    gndo->ndo_snaplen = DEFAULT_SNAPLEN;

    cnt = -1;
    device = NULL;
    infile = NULL;
    RFileName = NULL;
    VFileName = NULL;
    VFile = NULL;
    WFileName = NULL;
    dlt = -1;
    if ((cp = strrchr(argv[0], '/')) != NULL)
        program_name = cp + 1;
    else
        program_name = argv[0];

    if (abort_on_misalignment(ebuf, sizeof(ebuf)) < 0)
        error("%s", ebuf);

#ifdef LIBSMI
    smiInit("tcpdump");
#endif

    /* Parse the command-line options; each one sets a flag that later code acts on. */
    while (
        (op = getopt(argc, argv, "aAb" B_FLAG "c:C:d" D_FLAG "eE:fF:G:hHi:" I_FLAG j_FLAG J_FLAG "KlLm:M:nNOp" P_FLAG "qr:Rs:StT:u" U_FLAG "vV:w:W:xXy:Yz:Z:")) != -1)
        switch (op) {

        case 'a':
            /* compatibility for old -a */
            break;

        case 'A':
            ++Aflag;
            break;
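We also need to look at a header file. netdissect.h defines a data structure, struct netdissect_options, that describes every option tcpdump supports, and each option has a corresponding flag. In tcpdump's main, the flags are set or incremented according to the arguments the user passes, and later code acts on those flag values. See the comments in the source for the meaning of each one.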

struct netdissect_options {
  int ndo_aflag;        /* translate network and broadcast addresses */
  int ndo_bflag;        /* print 4 byte ASes in ASDOT notation */
  int ndo_eflag;        /* print ethernet header */
  int ndo_fflag;        /* don't translate "foreign" IP address */
  int ndo_Kflag;        /* don't check TCP checksums */
  int ndo_nflag;        /* leave addresses as numbers */
  int ndo_Nflag;        /* remove domains from printed host names */
  int ndo_qflag;        /* quick (shorter) output */
  int ndo_Rflag;        /* print sequence # field in AH/ESP*/
  int ndo_sflag;        /* use the libsmi to translate OIDs */
  int ndo_Sflag;        /* print raw TCP sequence numbers */
  int ndo_tflag;        /* print packet arrival time */
  int ndo_Uflag;        /* "unbuffered" output of dump files */
  int ndo_uflag;        /* Print undecoded NFS handles */
  int ndo_vflag;        /* verbose */
  int ndo_xflag;        /* print packet in hex */
  int ndo_Xflag;        /* print packet in hex/ascii */
  int ndo_Aflag;        /* print packet only in ascii observing TAB,
                         * LF, CR and SPACE as graphical chars
                         */
  int ndo_Bflag;        /* buffer size */
  int ndo_Iflag;        /* rfmon (monitor) mode */
  int ndo_Oflag;        /* run filter code optimizer */
  int ndo_dlt;          /* if != -1, ask libpcap for the DLT it names*/
  int ndo_jflag;        /* packet time stamp source */
  int ndo_pflag;        /* don't go promiscuous */

  int ndo_Cflag;        /* rotate dump files after this many bytes */
  int ndo_Cflag_count;  /* Keep track of which file number we're writing */
  int ndo_Gflag;        /* rotate dump files after this many seconds */
  int ndo_Gflag_count;  /* number of files created with Gflag rotation */
  time_t ndo_Gflag_time; /* The last time_t the dump file was rotated. */
  int ndo_Wflag;        /* recycle output files after this number of files */
  int ndo_WflagChars;
  int ndo_Hflag;        /* dissect 802.11s draft mesh standard */
  int ndo_suppress_default_print; /* don't use default_print() for unknown packet types */
  const char *ndo_dltname;

  char *ndo_espsecret;
  struct sa_list *ndo_sa_list_head; /* used by print-esp.c */
  struct sa_list *ndo_sa_default;

  char *ndo_sigsecret;     /* Signature verification secret key */

  struct esp_algorithm *ndo_espsecret_xform; /* cache of decoded */
  char *ndo_espsecret_key;

  int ndo_packettype;    /* as specified by -T */

  char *ndo_program_name;    /*used to generate self-identifying messages */

  int32_t ndo_thiszone;    /* seconds offset from gmt to local time */

  int ndo_snaplen;

  /*global pointers to beginning and end of current packet (during printing) */
  const u_char *ndo_packetp;
  const u_char *ndo_snapend;

  /* bookkeeping for ^T output */
  int ndo_infodelay;

  /* pointer to void function to output stuff */
  void (*ndo_default_print)(netdissect_options *,
           register const u_char *bp, register u_int length);
  void (*ndo_info)(netdissect_options *, int verbose);

  int (*ndo_printf)(netdissect_options *,
         const char *fmt, ...)
#ifdef __ATTRIBUTE___FORMAT_OK_FOR_FUNCTION_POINTERS
         __attribute__ ((format (printf, 2, 3)))
#endif
         ;
  void (*ndo_error)(netdissect_options *,
         const char *fmt, ...)
#ifdef __ATTRIBUTE___NORETURN_OK_FOR_FUNCTION_POINTERS
         __attribute__ ((noreturn))
#endif /* __ATTRIBUTE___NORETURN_OK_FOR_FUNCTION_POINTERS */
#ifdef __ATTRIBUTE___FORMAT_OK_FOR_FUNCTION_POINTERS
         __attribute__ ((format (printf, 2, 3)))
#endif /* __ATTRIBUTE___FORMAT_OK_FOR_FUNCTION_POINTERS */
         ;
  void (*ndo_warning)(netdissect_options *,
         const char *fmt, ...)
#ifdef __ATTRIBUTE___FORMAT_OK_FOR_FUNCTION_POINTERS
         __attribute__ ((format (printf, 2, 3)))
#endif
         ;
};
tcpdump.c then defines a global instance:

netdissect_options Gndo;
netdissect_options *gndo = &Gndo;
interface.h in turn defines a set of macros that make the fields of gndo convenient to access:

extern netdissect_options *gndo;

#define bflag gndo->ndo_bflag
#define eflag gndo->ndo_eflag
#define fflag gndo->ndo_fflag
#define jflag gndo->ndo_jflag
#define Kflag gndo->ndo_Kflag
#define nflag gndo->ndo_nflag
#define Nflag gndo->ndo_Nflag
#define Oflag gndo->ndo_Oflag
#define pflag gndo->ndo_pflag
#define qflag gndo->ndo_qflag
#define Rflag gndo->ndo_Rflag
#define sflag gndo->ndo_sflag
#define Sflag gndo->ndo_Sflag
#define tflag gndo->ndo_tflag
#define Uflag gndo->ndo_Uflag
#define uflag gndo->ndo_uflag
#define vflag gndo->ndo_vflag
#define xflag gndo->ndo_xflag
#define Xflag gndo->ndo_Xflag
#define Cflag gndo->ndo_Cflag
#define Gflag gndo->ndo_Gflag
#define Aflag gndo->ndo_Aflag
#define Bflag gndo->ndo_Bflag
#define Iflag gndo->ndo_Iflag
#define suppress_default_print gndo->ndo_suppress_default_print
#define packettype gndo->ndo_packettype
#define sigsecret gndo->ndo_sigsecret
#define Wflag gndo->ndo_Wflag
#define WflagChars gndo->ndo_WflagChars
#define Cflag_count gndo->ndo_Cflag_count
#define Gflag_count gndo->ndo_Gflag_count
#define Gflag_time gndo->ndo_Gflag_time
#define Hflag gndo->ndo_Hflag
#define snaplen gndo->ndo_snaplen
#define snapend gndo->ndo_snapend
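
The combination of one global structure, a pointer to it, and short macro names can be reduced to a few lines. This is only a sketch of the pattern: `sketch_options`, `Gso`, `gso` and the two helper functions are hypothetical names, not identifiers from tcpdump.

```c
/* Hypothetical, cut-down option structure (the real one is netdissect_options). */
typedef struct sketch_options {
    int so_vflag;       /* verbosity level */
    int so_snaplen;     /* capture length */
} sketch_options;

/* One global instance plus a pointer to it, mirroring Gndo and gndo. */
static sketch_options Gso;
static sketch_options *gso = &Gso;

/* Short names that expand to fields of the global instance. */
#define vflag   gso->so_vflag
#define snaplen gso->so_snaplen

/* Code written against the short names reads and writes the global struct. */
static void
sketch_set_defaults(void)
{
    vflag = 0;
    snaplen = 65535;
}

static int
sketch_is_verbose(void)
{
    return vflag > 0;
}
```

The benefit is that older code written against plain globals such as `vflag` keeps compiling, while the state itself lives in a single structure that can be passed around by pointer.
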
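I won't go through how each option is used and handled here.
To monitor a network interface, we normally pass the -i option followed by the interface name, ethX.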

            device = optarg;
            break;
Execution then goes straight to:

#else
        *ebuf = '\0';
        pd = pcap_open_live(device, snaplen, !pflag, 1000, ebuf);
        if (pd == NULL)
            error("%s", ebuf);
        else if (*ebuf)
            warning("%s", ebuf);
#endif /* HAVE_PCAP_CREATE */
If no interface is specified, pcap_lookupdev is called to find one, for example by querying /proc/net/dev from user space.

else {
        /*
         * We're doing a live capture.
         */
        if (device == NULL) {
            device = pcap_lookupdev(ebuf);
            if (device == NULL)
                error("%s", ebuf);
        }
It calls into libpcap to walk a list of devices, similar to the kernel's own device list. In essence, what it queries is /proc/net/dev.


#if !defined(WIN32) && !defined(MSDOS)

/*
 * Return the name of a network interface attached to the system, or NULL
 * if none can be found. The interface must be configured up; the
 * lowest unit number is preferred; loopback is ignored.
 */
char *
pcap_lookupdev(errbuf)
    register char *errbuf;
{
    pcap_if_t *alldevs;
/* for old BSD systems, including bsdi3 */
#ifndef IF_NAMESIZE
#define IF_NAMESIZE IFNAMSIZ
#endif
    static char device[IF_NAMESIZE + 1];
    char *ret;

    if (pcap_findalldevs(&alldevs, errbuf) == -1)
        return (NULL);

    if (alldevs == NULL || (alldevs->flags & PCAP_IF_LOOPBACK)) {
        /*
         * There are no devices on the list, or the first device
         * on the list is a loopback device, which means there
         * are no non-loopback devices on the list. This means
         * we can't return any device.
         *
         * XXX - why not return a loopback device? If we can't
         * capture on it, it won't be on the list, and if it's
         * on the list, there aren't any non-loopback devices,
         * so why not just supply it as the default device?
         */
        (void)strlcpy(errbuf, "no suitable device found",
            PCAP_ERRBUF_SIZE);
        ret = NULL;
    } else {
        /*
         * Return the name of the first device on the list.
         */
        (void)strlcpy(device, alldevs->name, sizeof(device));
        ret = device;
    }

    pcap_freealldevs(alldevs);
    return (ret);
}
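
The selection rule in pcap_lookupdev (take the head of the list unless the list is empty or starts with a loopback device) can be exercised without libpcap. The sketch below uses a hypothetical `struct sketch_if` and flag in place of `pcap_if_t` and `PCAP_IF_LOOPBACK`, and takes a caller-supplied buffer instead of a static one.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_IF_LOOPBACK 0x1  /* hypothetical flag, plays the role of PCAP_IF_LOOPBACK */

/* Hypothetical list node, loosely modelled on pcap_if_t. */
struct sketch_if {
    struct sketch_if *next;
    const char *name;
    unsigned int flags;
};

/*
 * Copy the name of the default device into buf and return buf, or return
 * NULL if the list is empty or starts with a loopback device. Like the
 * function above, this relies on the list being ordered with
 * non-loopback devices first.
 */
static char *
sketch_lookupdev(const struct sketch_if *alldevs, char *buf, size_t buflen)
{
    if (buflen == 0)
        return NULL;
    if (alldevs == NULL || (alldevs->flags & SKETCH_IF_LOOPBACK))
        return NULL;
    strncpy(buf, alldevs->name, buflen - 1);
    buf[buflen - 1] = '\0';
    return buf;
}
```

Passing the buffer in avoids the static storage that makes the original function non-reentrant; that is a design choice of this sketch, not something the libpcap code does.
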
Once the arguments have been processed, it calls pcap_open_live:

pcap_t *
pcap_open_live(const char *source, int snaplen, int promisc, int to_ms, char *errbuf)
{
    pcap_t *p;
    int status;

    p = pcap_create(source, errbuf);
    if (p == NULL)
        return (NULL);
    status = pcap_set_snaplen(p, snaplen);
    if (status < 0)
        goto fail;
    status = pcap_set_promisc(p, promisc);
    if (status < 0)
        goto fail;
    status = pcap_set_timeout(p, to_ms);
    if (status < 0)
        goto fail;
    /*
     * Mark this as opened with pcap_open_live(), so that, for
     * example, we show the full list of DLT_ values, rather
     * than just the ones that are compatible with capturing
     * when not in monitor mode. That allows existing applications
     * to work the way they used to work, but allows new applications
     * that know about the new open API to, for example, find out the
     * DLT_ values that they can select without changing whether
     * the adapter is in monitor mode or not.
     */
    p->oldstyle = 1;
    status = pcap_activate(p);
    if (status < 0)
        goto fail;
    return (p);
fail:
    if (status == PCAP_ERROR)
        snprintf(errbuf, PCAP_ERRBUF_SIZE, "%s: %s", source,
            p->errbuf);
    else if (status == PCAP_ERROR_NO_SUCH_DEVICE ||
        status == PCAP_ERROR_PERM_DENIED ||
        status == PCAP_ERROR_PROMISC_PERM_DENIED)
        snprintf(errbuf, PCAP_ERRBUF_SIZE, "%s: %s (%s)", source,
            pcap_statustostr(status), p->errbuf);
    else
        snprintf(errbuf, PCAP_ERRBUF_SIZE, "%s: %s", source,
            pcap_statustostr(status));
    pcap_close(p);
    return (NULL);
}
This function does the most important work: it sets up communication with the kernel.
Let's look at pcap_activate:

int
pcap_activate(pcap_t *p)
{
    int status;

    /*
     * Catch attempts to re-activate an already-activated
     * pcap_t; this should, for example, catch code that
     * calls pcap_open_live() followed by pcap_activate(),
     * as some code that showed up in a Stack Exchange
     * question did.
     */
    if (pcap_check_activated(p))
        return (PCAP_ERROR_ACTIVATED);
    status = p->activate_op(p);
    if (status >= 0)
        p->activated = 1;
    else {
        if (p->errbuf[0] == '\0') {
            /*
             * No error message supplied by the activate routine;
             * for the benefit of programs that don't specially
             * handle errors other than PCAP_ERROR, return the
             * error message corresponding to the status.
             */
            snprintf(p->errbuf, PCAP_ERRBUF_SIZE, "%s",
                pcap_statustostr(status));
        }

        /*
         * Undo any operation pointer setting, etc. done by
         * the activate operation.
         */
        initialize_ops(p);
    }
    return (status);
}
The key call here is p->activate_op(p). But what is activate_op initialized to? It is assigned in pcap_create_interface, which is called from pcap_create:

pcap_t *
pcap_create_interface(const char *device, char *ebuf)
{
    pcap_t *handle;

    handle = pcap_create_common(device, ebuf, sizeof (struct pcap_linux));
    if (handle == NULL)
        return NULL;

    handle->activate_op = pcap_activate_linux;
    handle->can_set_rfmon_op = pcap_can_set_rfmon_linux;
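
The split between a create step that fills in an operation pointer and a generic activate step that calls through it can be sketched as follows. All names here (`sketch_handle`, `sketch_create`, `sketch_activate`, the error code) are hypothetical; the real libpcap handle carries many more operations and more error handling.

```c
#define SKETCH_ERROR_ACTIVATED (-4) /* hypothetical error code */

/* Hypothetical handle, loosely modelled on pcap_t. */
struct sketch_handle {
    int activated;
    int (*activate_op)(struct sketch_handle *);
};

/* Platform-specific activate routine; a real one would open the capture device. */
static int
sketch_activate_linux(struct sketch_handle *h)
{
    (void)h;
    return 0;
}

/* "Create" step: fill in the operation pointer for this platform. */
static void
sketch_create(struct sketch_handle *h)
{
    h->activated = 0;
    h->activate_op = sketch_activate_linux;
}

/* Generic activate: refuse re-activation, then dispatch through the pointer. */
static int
sketch_activate(struct sketch_handle *h)
{
    int status;

    if (h->activated)
        return SKETCH_ERROR_ACTIVATED;
    status = h->activate_op(h);
    if (status >= 0)
        h->activated = 1;
    return status;
}
```

With this layout the generic code never needs to know which platform it runs on; only the create step does.
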
We can see it is set to pcap_activate_linux. So what does that function do? Inside it, a key function, activate_new, is called:

/* ===== Functions to interface to the newer kernels ================== */

/*
 * Try to open a packet socket using the new kernel PF_PACKET interface.
 * Returns 1 on success, 0 on an error that means the new interface isn't
 * present (so the old SOCK_PACKET interface should be tried), and a
 * PCAP_ERROR_ value on an error that means that the old mechanism won't
 * work either (so it shouldn't be tried).
 */
static int
activate_new(pcap_t *handle)
{
#ifdef HAVE_PF_PACKET_SOCKETS
    struct pcap_linux *handlep = handle->priv;
    const char        *device = handle->opt.source;
    int               is_any_device = (strcmp(device, "any") == 0);
    int               sock_fd = -1, arptype;
#ifdef HAVE_PACKET_AUXDATA
    int               val;
#endif
    int               err = 0;
    struct packet_mreq mr;

    /*
     * Open a socket with protocol family packet. If the
     * "any" device was specified, we open a SOCK_DGRAM
     * socket for the cooked interface, otherwise we first
     * try a SOCK_RAW socket for the raw interface.
     */
    sock_fd = is_any_device ?
        socket(PF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL)) :
        socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
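We can see that, in essence, it creates a socket in the PF_PACKET family with type SOCK_RAW (SOCK_DGRAM for the "any" device) and protocol ETH_P_ALL. The setsockopt calls that follow set a few options, and they also trigger further initialization in the kernel.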
So let's look at how the socket system call is implemented.
In socket.c:

SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
{
    int retval;
    struct socket *sock;
    int flags;

    /* Check the SOCK_* constants for consistency. */
    BUILD_BUG_ON(SOCK_CLOEXEC != O_CLOEXEC);
    BUILD_BUG_ON((SOCK_MAX | SOCK_TYPE_MASK) != SOCK_TYPE_MASK);
    BUILD_BUG_ON(SOCK_CLOEXEC & SOCK_TYPE_MASK);
    BUILD_BUG_ON(SOCK_NONBLOCK & SOCK_TYPE_MASK);

    flags = type & ~SOCK_TYPE_MASK;
    if (flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
        return -EINVAL;
    type &= SOCK_TYPE_MASK;

    if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
        flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;

    retval = sock_create(family, type, protocol, &sock);
    if (retval < 0)
        goto out;

    retval = sock_map_fd(sock, flags & (O_CLOEXEC | O_NONBLOCK));
    if (retval < 0)
        goto out_release;

out:
    /* It may be already another descriptor 8) Not kernel problem. */
    return retval;

out_release:
    sock_release(sock);
    return retval;
}
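
The first few lines of the system call separate the socket type from the flag bits that share the same argument. The sketch below repeats that masking in user space. The constants are assumptions: they are the values I would expect on x86 Linux, renamed with a SKETCH_ prefix so they cannot be mistaken for the kernel's definitions.

```c
/* Assumed values, as on x86 Linux; treat them as illustrative. */
#define SKETCH_SOCK_RAW       3
#define SKETCH_SOCK_TYPE_MASK 0xf
#define SKETCH_SOCK_NONBLOCK  04000
#define SKETCH_SOCK_CLOEXEC   02000000

/*
 * Split the "type" argument into the base socket type and the flag bits.
 * Returns 0 on success, or -1 if unknown flag bits are set (the syscall
 * returns -EINVAL in that case).
 */
static int
sketch_split_type(int arg, int *type, int *flags)
{
    int f = arg & ~SKETCH_SOCK_TYPE_MASK;

    if (f & ~(SKETCH_SOCK_CLOEXEC | SKETCH_SOCK_NONBLOCK))
        return -1;
    *type = arg & SKETCH_SOCK_TYPE_MASK;
    *flags = f;
    return 0;
}
```

For example, `SKETCH_SOCK_RAW | SKETCH_SOCK_CLOEXEC` splits into the type `SKETCH_SOCK_RAW` and the flag `SKETCH_SOCK_CLOEXEC`, while any other high bit is rejected.
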
Here we follow sock_create, which calls __sock_create:

static int __sock_create(struct net *net, int family, int type, int protocol,
             struct socket **res, int kern)
{
    int err;
    struct socket *sock;
    const struct net_proto_family *pf;

    /*
     * Check protocol is in range
     */
    if (family < 0 || family >= NPROTO)
        return -EAFNOSUPPORT;
    if (type < 0 || type >= SOCK_MAX)
        return -EINVAL;

    /* Compatibility.

     This uglymoron is moved from INET layer to here to avoid
     deadlock in module load.
     */
    if (family == PF_INET && type == SOCK_PACKET) {
        static int warned;
        if (!warned) {
            warned = 1;
            printk(KERN_INFO "%s uses obsolete (PF_INET,SOCK_PACKET)\n",
                current->comm);
        }
        family = PF_PACKET;
    }

    err = security_socket_create(family, type, protocol, kern);
    if (err)
        return err;

    /*
     *    Allocate the socket and allow the family to set things up. if
     *    the protocol is 0, the family is instructed to select an appropriate
     *    default.
     */
    sock = sock_alloc();
    if (!sock) {
        if (net_ratelimit())
            printk(KERN_WARNING "socket: no more sockets\n");
        return -ENFILE;    /* Not exactly a match, but its the
                              closest posix thing */
    }

    sock->type = type;

#ifdef CONFIG_MODULES
    /* Attempt to load a protocol module if the find failed.
     *
     * 12/09/1996 Marcin: this makes REALLY only sense, if the user
     * requested real, full-featured networking support upon configuration.
     * Otherwise module support will
     */
    if (net_families[family] == NULL)
        request_module("net-pf-%d", family);
#endif

    rcu_read_lock();
    pf = rcu_dereference(net_families[family]);  /* look up the table of registered protocol families */
    err = -EAFNOSUPPORT;
    if (!pf)
        goto out_release;

    /*
     * We will call the ->create function, that possibly is in a loadable
     * module, so we have to bump that loadable module refcnt first.
     */
    if (!try_module_get(pf->owner))
        goto out_release;

    /* Now protected by module ref count */
    rcu_read_unlock();

    err = pf->create(net, sock, protocol);  /* call the registered family's create function */
    if (err < 0)
        goto out_module_put;

    /*
     * Now to bump the refcnt of the [loadable] module that owns this
     * socket at sock_release time we decrement its refcnt.
     */
    if (!try_module_get(sock->ops->owner))
        goto out_module_busy;

    /*
     * Now that we're done with the ->create function, the [loadable]
     * module can have its refcnt decremented
     */
    module_put(pf->owner);
    /* A kernel security check; not analyzed further here. */
    err = security_socket_post_create(sock, family, type, protocol, kern);
    if (err)
        goto out_sock_release;
    *res = sock;

    return 0;

out_module_busy:
    err = -EAFNOSUPPORT;
out_module_put:
    sock->ops = NULL;
    module_put(pf->owner);
out_sock_release:
    sock_release(sock);
    return err;

out_release:
    rcu_read_unlock();
    goto out_sock_release;
}
This looks up net_families to find the protocol family registered with the kernel and calls its create function:

    err = pf->create(net, sock, protocol);
Now let's look at how PF_PACKET is registered, in the file af_packet.c:

static int __init packet_init(void)
{
    int rc = proto_register(&packet_proto, 0);

    if (rc != 0)
        goto out;

    sock_register(&packet_family_ops);
    register_pernet_subsys(&packet_net_ops);
    register_netdevice_notifier(&packet_netdev_notifier);
out:
    return rc;
}
packet_proto:

static struct proto packet_proto = {
    .name     = "PACKET",
    .owner    = THIS_MODULE,
    .obj_size = sizeof(struct packet_sock),
};
The key step here is the registration call sock_register(&packet_family_ops):

static struct net_proto_family packet_family_ops = {
    .family = PF_PACKET,
    .create = packet_create,
    .owner  = THIS_MODULE,
};
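
Registration and lookup through a table indexed by family number can be modelled in a few lines of plain C. This is a sketch only: the table size, the family number and every `sketch_` name are hypothetical, and the locking, RCU and module reference counting of the real code are left out.

```c
#include <stddef.h>

#define SKETCH_NPROTO    8   /* hypothetical table size */
#define SKETCH_PF_PACKET 3   /* hypothetical family number */

struct sketch_socket {
    int family;     /* filled in by the family's create function */
};

/* Loosely modelled on struct net_proto_family. */
struct sketch_proto_family {
    int family;
    int (*create)(struct sketch_socket *sock, int protocol);
};

static const struct sketch_proto_family *sketch_families[SKETCH_NPROTO];

/* Register a family; fails if the slot is out of range or already taken. */
static int
sketch_sock_register(const struct sketch_proto_family *pf)
{
    if (pf->family < 0 || pf->family >= SKETCH_NPROTO)
        return -1;
    if (sketch_families[pf->family] != NULL)
        return -1;
    sketch_families[pf->family] = pf;
    return 0;
}

/* Look the family up and dispatch to its create function. */
static int
sketch_sock_create(int family, int protocol, struct sketch_socket *sock)
{
    if (family < 0 || family >= SKETCH_NPROTO || sketch_families[family] == NULL)
        return -1;
    return sketch_families[family]->create(sock, protocol);
}

static int
sketch_packet_create(struct sketch_socket *sock, int protocol)
{
    (void)protocol;
    sock->family = SKETCH_PF_PACKET;
    return 0;
}

static const struct sketch_proto_family sketch_packet_family = {
    .family = SKETCH_PF_PACKET,
    .create = sketch_packet_create,
};
```

Before `sketch_sock_register(&sketch_packet_family)` has run, `sketch_sock_create` fails for that family, which corresponds to the -EAFNOSUPPORT path in __sock_create.
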
The create function called above is the one registered here, packet_create:

/*
 *    Create a packet of type SOCK_PACKET.
 */

static int packet_create(struct net *net, struct socket *sock, int protocol)
{
    struct sock *sk;
    struct packet_sock *po;
    __be16 proto = (__force __be16)protocol; /* weird, but documented */
    int err;

    if (!capable(CAP_NET_RAW))
        return -EPERM;
    if (sock->type != SOCK_DGRAM && sock->type != SOCK_RAW &&
        sock->type != SOCK_PACKET)
        return -ESOCKTNOSUPPORT;

    sock->state = SS_UNCONNECTED;

    err = -ENOBUFS;
    sk = sk_alloc(net, PF_PACKET, GFP_KERNEL, &packet_proto);
    if (sk == NULL)
        goto out;

    sock->ops = &packet_ops;
    if (sock->type == SOCK_PACKET)
        sock->ops = &packet_ops_spkt;

    sock_init_data(sock, sk);

    po = pkt_sk(sk);
    sk->sk_family = PF_PACKET;
    po->num = proto;

    sk->sk_destruct = packet_sock_destruct;
    sk_refcnt_debug_inc(sk);

    /*
     *    Attach a protocol block
     */

    spin_lock_init(&po->bind_lock);
    mutex_init(&po->pg_vec_lock);
    po->prot_hook.func = packet_rcv;

    if (sock->type == SOCK_PACKET)
        po->prot_hook.func = packet_rcv_spkt;

    po->prot_hook.af_packet_priv = sk;

    if (proto) {
        po->prot_hook.type = proto;
        dev_add_pack(&po->prot_hook);
        sock_hold(sk);
        po->running = 1;
    }

    write_lock_bh(&net->packet.sklist_lock);
    sk_add_node(sk, &net->packet.sklist);
    sock_prot_inuse_add(net, &packet_proto, 1);
    write_unlock_bh(&net->packet.sklist_lock);
    return 0;
out:
    return err;
}
PF_PACKET is in fact a special socket family that the kernel provides for sniffing packets, which is handy for debugging.
It reinitializes the socket's ops:

    sock->ops = &packet_ops;

static const struct proto_ops packet_ops = {
    .family =       PF_PACKET,
    .owner =        THIS_MODULE,
    .release =      packet_release,
    .bind =         packet_bind,
    .connect =      sock_no_connect,
    .socketpair =   sock_no_socketpair,
    .accept =       sock_no_accept,
    .getname =      packet_getname,
    .poll =         packet_poll,
    .ioctl =        packet_ioctl,
    .listen =       sock_no_listen,
    .shutdown =     sock_no_shutdown,
    .setsockopt =   packet_setsockopt,
    .getsockopt =   packet_getsockopt,
    .sendmsg =      packet_sendmsg,
    .recvmsg =      packet_recvmsg,
    .mmap =         packet_mmap,
    .sendpage =     sock_no_sendpage,
};
It also sets up its own receive handler:

    po->prot_hook.func = packet_rcv;

    if (sock->type == SOCK_PACKET)
        po->prot_hook.func = packet_rcv_spkt;
At creation time the default handler is packet_rcv. The protocol registration function dev_add_pack is then called:

    if (proto) {
        po->prot_hook.type = proto;
        dev_add_pack(&po->prot_hook);
        sock_hold(sk);
        po->running = 1;
    }
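We know that this socket's protocol type is ETH_P_ALL. As covered in the earlier discussion of frame reception and transmission, a handler of that type is registered on the ptype_all list, which is what sniffers use.
Let's recap.
On packet reception, in netif_receive_skb in dev.c: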

    list_for_each_entry_rcu(ptype, &ptype_all, list) {
        if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
            ptype->dev == orig_dev) {
            if (pt_prev)
                ret = deliver_skb(skb, pt_prev, orig_dev);
            pt_prev = ptype;
        }
    }
It walks the registered handlers and calls each one's function, which here is packet_rcv.
On transmission, dev_hard_start_xmit calls dev_queue_xmit_nit:

int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
            struct netdev_queue *txq)
{
    const struct net_device_ops *ops = dev->netdev_ops;
    int rc;

    if (likely(!skb->next)) {
        if (!list_empty(&ptype_all))
            dev_queue_xmit_nit(skb, dev);

/*
 *    Support routine. Sends outgoing frames to any network
 *    taps currently in use.
 */

static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
{
    struct packet_type *ptype;

#ifdef CONFIG_NET_CLS_ACT
    if (!(skb->tstamp.tv64 && (G_TC_FROM(skb->tc_verd) & AT_INGRESS)))
        net_timestamp(skb);
#else
    net_timestamp(skb);
#endif

    rcu_read_lock();
    list_for_each_entry_rcu(ptype, &ptype_all, list) {
        /* Never send packets back to the socket
         * they originated from - MvS (miquels@drinkel.ow.org)
         */
        if ((ptype->dev == dev || !ptype->dev) &&
            (ptype->af_packet_priv == NULL ||
             (struct sock *)ptype->af_packet_priv != skb->sk)) {
            struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
            if (!skb2)
                break;

            /* skb->nh should be correctly
               set by sender, so that the second statement is
               just protection against buggy protocols.
             */
            skb_reset_mac_header(skb2);

            if (skb_network_header(skb2) < skb2->data ||
                skb2->network_header > skb2->tail) {
                if (net_ratelimit())
                    printk(KERN_CRIT "protocol %04x is "
                           "buggy, dev %s\n",
                           skb2->protocol, dev->name);
                skb_reset_network_header(skb2);
            }

            skb2->transport_header = skb2->network_header;
            skb2->pkt_type = PACKET_OUTGOING;
            ptype->func(skb2, skb->dev, ptype, skb->dev);
        }
    }
    rcu_read_unlock();
}
The packet is cloned, and the handler registered for the tap (packet_rcv here) is called to pass the copy up to the listener.
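
The rule "deliver to every tap except the one owned by the sender" is the core of this function and can be modelled without any kernel types. The sketch below uses hypothetical structures (`sketch_pkt`, `sketch_tap`) and, unlike the kernel, does not clone the packet or filter by device.

```c
#include <stddef.h>

struct sketch_pkt {
    const void *sk;     /* originating socket, or NULL */
    int len;
};

/* Loosely modelled on struct packet_type. */
struct sketch_tap {
    struct sketch_tap *next;
    const void *owner;  /* socket that registered the tap (af_packet_priv) */
    void (*func)(struct sketch_tap *self, const struct sketch_pkt *pkt);
    int seen;           /* number of packets delivered to this tap */
};

static struct sketch_tap *sketch_taps;  /* plays the role of ptype_all */

/* Add a tap to the front of the list (compare dev_add_pack). */
static void
sketch_add_tap(struct sketch_tap *t)
{
    t->next = sketch_taps;
    sketch_taps = t;
}

/* Trivial handler: just count the packet. */
static void
sketch_count(struct sketch_tap *self, const struct sketch_pkt *pkt)
{
    (void)pkt;
    self->seen++;
}

/* Hand an outgoing packet to every tap except the one owned by its sender. */
static void
sketch_xmit_nit(const struct sketch_pkt *pkt)
{
    struct sketch_tap *t;

    for (t = sketch_taps; t != NULL; t = t->next)
        if (t->owner == NULL || t->owner != pkt->sk)
            t->func(t, pkt);
}
```

With two taps registered and a packet sent from the first tap's socket, only the second tap's counter goes up, which is the behaviour the "never send packets back to the socket they originated from" comment describes.
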
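When setsockopt is called, which lands in packet_setsockopt, the receive handler is reinitialized according to the options that were set.
This happens in packet_set_ring, which makes packet handling more efficient: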

    po->prot_hook.func = (po->rx_ring.pg_vec) ?
                        tpacket_rcv : packet_rcv;
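After this check, dev_add_pack is called again to re-register the hook, and in practice the handler becomes tpacket_rcv. I won't go into the reasons in detail here.
Here is an example from a real system: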
# cat /proc/net/ptype 
Type Device      Function
0800          ip_rcv+0x0/0x510
0011          llc_rcv+0x0/0x3cc
8863          pppoe_disc_rcv+0x0/0x200
0004          llc_rcv+0x0/0x3cc
8864          pppoe_rcv+0x0/0x240
0806          arp_rcv+0x0/0x16c
88d9 br0      packet_rcv+0x0/0x20
886c br0      packet_rcv+0x0/0x20
86dd          ipv6_rcv+0x0/0x68c



# tcpdump -i eth2 &
# tcpdump: WARNING: eth2: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
#  
# cat /proc/net/ptype 
Type Device      Function
ALL  eth2     tpacket_rcv+0x0/0x20
0800          ip_rcv+0x0/0x510
0011          llc_rcv+0x0/0x3cc
8863          pppoe_disc_rcv+0x0/0x200
0004          llc_rcv+0x0/0x3cc
8864          pppoe_rcv+0x0/0x240
0806          arp_rcv+0x0/0x16c
88d9 br0      packet_rcv+0x0/0x20
886c br0      packet_rcv+0x0/0x20
86dd          ipv6_rcv+0x0/0x68c

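We find that the handler is tpacket_rcv, which is worth noting.
The reason is the following call made by libpcap:
setsockopt(handle->fd, SOL_PACKET, PACKET_RX_RING,(void *) &req, sizeof(req))
It is the PACKET_RX_RING option that triggers the new initialization in the kernel.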
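
To check the registered handler from a program rather than by eye, one line of the /proc/net/ptype listing can be parsed as below. This is a sketch under the assumption that lines look like the sample output in this post (type, optional device, then symbol+offset/size); the column layout may differ on other kernel versions, and `sketch_ptype` and `sketch_parse_ptype` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

struct sketch_ptype {
    char type[8];       /* "ALL" or a hex EtherType such as "0800" */
    char dev[32];       /* device name, empty if the entry is not bound */
    char func[64];      /* handler symbol with the +offset/size suffix removed */
};

/* Parse one data line of /proc/net/ptype; returns 0 on success, -1 otherwise. */
static int
sketch_parse_ptype(const char *line, struct sketch_ptype *out)
{
    char a[64], b[64];
    char *plus;
    int n = sscanf(line, "%7s %63s %63s", out->type, a, b);

    if (n == 3) {               /* type, device, function */
        strncpy(out->dev, a, sizeof(out->dev) - 1);
        out->dev[sizeof(out->dev) - 1] = '\0';
        strcpy(out->func, b);
    } else if (n == 2) {        /* type, function (no device column) */
        out->dev[0] = '\0';
        strcpy(out->func, a);
    } else {
        return -1;
    }
    plus = strchr(out->func, '+');
    if (plus != NULL)
        *plus = '\0';
    return 0;
}
```

A caller would skip the header line, feed each remaining line to `sketch_parse_ptype`, and look for an entry whose type is `ALL` to see whether packet_rcv or tpacket_rcv is in use.
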

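We then return to tcpdump's main function, which finally calls pcap_loop to process packets in an endless loop.
When the kernel calls the packet receive function, there is a run_filter step inside it, which applies the packet filter rules. That needs a dedicated analysis, so I'll leave both the filter rules and the decoding for a later post.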

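linuxDOS 2015-06-18 23:41:43

CH__DTK: I've recently been studying libpcap and pf_ring as well, and after reading your analysis things are finally becoming clearer. Thank you very much for sharing so generously.
May I ask what tools you use to trace the code?
I've always read code with Source Insight, which doesn't allow debugging.
I then tried debugging with gdb, but that way I have no source to follow along with.

Download the source yourself, build it, and debug it dynamically.

CH__DTK 2015-05-28 14:36:17

I've recently been studying libpcap and pf_ring as well, and after reading your analysis things are finally becoming clearer. Thank you very much for sharing so generously.
May I ask what tools you use to trace the code?
I've always read code with Source Insight, which doesn't allow debugging.
I then tried debugging with gdb, but that way I have no source to follow along with.

poke9001 2014-07-18 12:40:37

Very well written.

l7l1l0l 2014-07-17 23:05:53

Thanks for this, I'm learning from it. Very nicely written.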