浅析linux下替代usbhound的usb总线sniffer抓包模块usbmon安装和使用
操作系统: ubuntu 8.10 内核版本: 2.6.27-7-generic
ubuntu 8.10内置了debugfs,也提供了/lib/modules/2.6.27-7-generic/kernel/drivers/usb/mon/usbmon.ko模块, 然后使用如下2行语句加载usbmon.ko模块 luther@gliethttp:~$ sudo mount -t debugfs none_debugs /sys/kernel/debug luther@gliethttp:~$ sudo modprobe usbmon luther@gliethttp:~$ ls /sys/kernel/debug/usbmon/ 0s 0u 1s 1t 1u 2s 2t 2u 3s 3t 3u 4s 4t 4u 5s 5t 5u 6s 6t 6u 7s 7t 7u 如果使用其他操作系统没有usbmon.ko,可以自己编译一个,编译方法如下 1.查看当前内核版本 luther@gliethttp:~$ uname -r 2.去下载相应版本内核 3.解压后进入drivers/usb/mon/ 执行编译操作 luther@gliethttp:~/linux-2.6.27.7/drivers/usb/mon$ make -C /lib/modules/`uname -r`/build M=`pwd` modules 4.插入模块 luther@gliethttp:~/linux-2.6.27.7/drivers/usb/mon$ mount -t debugfs none_debugs /sys/kernel/debug luther@gliethttp:~/linux-2.6.27.7/drivers/usb/mon$ insmod ./usbmon.ko luther@gliethttp:~/linux-2.6.27.7/drivers/usb/mon$ ls /sys/kernel/debug/usbmon/ 0s 0u 1s 1t 1u 2s 2t 2u 3s 3t 3u 4s 4t 4u 5s 5t 5u 6s 6t 6u 7s 7t 7u
下面对数据格式进行分析:[luther.gliethttp] static void mon_text_read_head_u(struct mon_reader_text *rp, struct mon_text_ptr *p, const struct mon_event_text *ep) { char udir, utype;
udir = (ep->is_in ? 'i' : 'o'); switch (ep->xfertype) { case USB_ENDPOINT_XFER_ISOC: utype = 'Z'; break; case USB_ENDPOINT_XFER_INT: utype = 'I'; break; case USB_ENDPOINT_XFER_CONTROL: utype = 'C'; break; default: /* PIPE_BULK */ utype = 'B'; } p->cnt += snprintf(p->pbuf + p->cnt, p->limit - p->cnt, "%lx %u %c %c%c:%d:%03u:%u", ep->id, ep->tstamp, ep->type, utype, udir, ep->busnum, ep->devnum, ep->epnum); } mon_text_read_head_u一共合成如下8个数据域[luther.gliethttp] 1.urb标志,用来表示一个urb 2.时间戳 3.事件类型(S-submission, C-Callback, E-submission error) 4.端点类型I(中断),C(控制),B(Bulk)和Z(ISOC) 5.数据方向(i或者o) 6.bus总线号 7.该bus总线分配到的设备地址[luther.gliethttp] 8.端点号,对于in端点,为异或^0x80,即去掉第8位1值的数据,比如0x81,那么这里数据为0x81 ^ 0x80 = 1;[luther.gliethttp]
接下来的数据根据不同的端点类型将做不同的格式显示 I(中断) : ep->status和ep->interval 显示端点的状态和端点interval中断间隔值[luther.gliethttp] C(控制) : 如果事件类型为S,那么显示s bmRequestType bRequest wValue wIndex wLength 如果事件类型非S,即C,那么只显示ep->status B(Bulk) : ep->status 只显示状态 Z(ISOC) : 如果事件类型为S,那么显示ep->status, ep->interval, ep->start_frame 如果事件类型非S,那么显示ep->status, ep->interval, ep->start_frame, ep->error_count 同时显示ep->numdesc,dp->status, dp->offset, dp->length等信息,是组合最多的一个输出[luther.gliethttp]
接下来的数据就是ep->length长度了[luther.gliethttp].
再接下来就是mon_text_read_data填充数据了[luther.gliethttp]. 1.如果数据长度ep->length为0,那么直接填入一个'\n'换行. 2.如果确实有数据部分,那么先追入' =',然后在=等号后面追加实际数据内容. 3.如果没有数据部分,比如那么将ep->data_flag作为char类型打印出来[luther.gliethttp] ep->data_flag共有5种数值,由mon_text_get_data()函数返回 3.1 L -- 表示期待数据长度为0 3.2 < -- 表示有数据要上传,后面有需要接收的数据,后面会有IN动作, 表示in类型,后面还有In读取操作需要读取数据, 同时为S-submission或者E-submission error 3.3 > -- 表示数据部分已经成功下发 表示out类型,同时为C-Callback 3.4 Z -- 表示transfer_buffer为NULL 3.5 0 -- 表示成功获取data.
static inline char mon_text_get_data(struct mon_event_text *ep, struct urb *urb, int len, char ev_type, struct mon_bus *mbus) { if (len <= 0) return 'L'; if (len >= DATA_MAX) len = DATA_MAX;
if (ep->is_in) { if (ev_type != 'C') return '<'; } else { if (ev_type != 'S') return '>'; }
/* * The check to see if it's safe to poke at data has an enormous * number of corner cases, but it seems that the following is * more or less safe. * * We do not even try to look at transfer_buffer, because it can * contain non-NULL garbage in case the upper level promised to * set DMA for the HCD. */ if (urb->dev->bus->uses_dma && (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) { return mon_dmapeek(ep->data, urb->transfer_dma, len); }
if (urb->transfer_buffer == NULL) return 'Z'; /* '0' would be not as pretty. */
memcpy(ep->data, urb->transfer_buffer, len); return 0; }
luther@gliethttp:~$ lsusb Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 005 Device 002: ID 0461:4d22 Primax Electronics, Ltd Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 001 Device 002: ID 413c:2105 Dell Computer Corp. Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
可以看到Bus 001 Device 002: ID 413c:2105 Dell Computer Corp. 我的Dell键盘就位于bus总线1, 我们捕获一下Caps Lock按键和Num Lock按键 luther@gliethttp:~$ sudo cat /sys/kernel/debug/usbmon/1u f794ea80 373903961 C Ii:1:002:1 0:16 8 = 00000000 00000000 f794ea80 373904003 S Ii:1:002:1 -115:16 8 < f794ea80 375647970 C Ii:1:002:1 0:16 8 = 00003900 00000000 f794ea80 375648013 S Ii:1:002:1 -115:16 8 < f794e200 375648186 S Co:1:002:0 s 21 09 0200 0000 0001 1 = 03 f794e200 375651967 C Co:1:002:0 0 1 > 数据0x03已经成功下发 f794ea80 375791967 C Ii:1:002:1 0:16 8 = 00000000 00000000 f794ea80 375792005 S Ii:1:002:1 -115:16 8 < f794ea80 376511970 C Ii:1:002:1 0:16 8 = 00005300 00000000 f794ea80 376512007 S Ii:1:002:1 -115:16 8 < f794e200 376512216 S Co:1:002:0 s 21 09 0200 0000 0001 1 = 02 f794e200 376515970 C Co:1:002:0 0 1 >数据0x02已经成功下发 f794ea80 376591971 C Ii:1:002:1 0:16 8 = 00000000 00000000 f794ea80 376592007 S Ii:1:002:1 -115:16 8 < f794ea80 378719980 C Ii:1:002:1 0:16 8 = 01000000 00000000 f794ea80 378720015 S Ii:1:002:1 -115:16 8 < f794ea80 378911979 C Ii:1:002:1 0:16 8 = 01000600 00000000 f794ea80 378912016 S Ii:1:002:1 -115:16 8 <
下面来自kernel文档usbmon.txt linux-2.6.27.7/Documentation/usbmon.txt
* Introduction
The name "usbmon" in lowercase refers to a facility in kernel which is used to collect traces of I/O on the USB bus. This function is analogous to a packet socket used by network monitoring tools such as tcpdump(1) or Ethereal. Similarly, it is expected that a tool such as usbdump or USBMon (with uppercase letters) is used to examine raw traces produced by usbmon.
The usbmon reports requests made by peripheral-specific drivers to Host Controller Drivers (HCD). So, if HCD is buggy, the traces reported by usbmon may not correspond to bus transactions precisely. This is the same situation as with tcpdump.
* How to use usbmon to collect raw text traces
Unlike the packet socket, usbmon has an interface which provides traces in a text format. This is used for two purposes. First, it serves as a common trace exchange format for tools while more sophisticated formats are finalized. Second, humans can read it in case tools are not available.
To collect a raw text trace, execute following steps.
1. Prepare
Mount debugfs (it has to be enabled in your kernel configuration), and load the usbmon module (if built as module). The second step is skipped if usbmon is built into the kernel.
# mount -t debugfs none_debugs /sys/kernel/debug # modprobe usbmon #
Verify that bus sockets are present.
# ls /sys/kernel/debug/usbmon 0s 0t 0u 1s 1t 1u 2s 2t 2u 3s 3t 3u 4s 4t 4u #
Now you can choose to either use the sockets numbered '0' (to capture packets on all buses), and skip to step #3, or find the bus used by your device with step #2.
2. Find which bus connects to the desired device
Run "cat /proc/bus/usb/devices", and find the T-line which corresponds to the device. Usually you do it by looking for the vendor string. If you have many similar devices, unplug one and compare two /proc/bus/usb/devices outputs. The T-line will have a bus number. Example:
T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=0557 ProdID=2004 Rev= 1.00 S: Manufacturer=ATEN S: Product=UC100KM V2.00
Bus=03 means it's bus 3.
3. Start 'cat'
# cat /sys/kernel/debug/usbmon/3u > /tmp/1.mon.out
to listen on a single bus, otherwise, to listen on all buses, type:
# cat /sys/kernel/debug/usbmon/0u > /tmp/1.mon.out
This process will be reading until killed. Naturally, the output can be redirected to a desirable location. This is preferred, because it is going to be quite long.
4. Perform the desired operation on the USB bus
This is where you do something that creates the traffic: plug in a flash key, copy files, control a webcam, etc.
5. Kill cat
Usually it's done with a keyboard interrupt (Control-C).
At this point the output file (/tmp/1.mon.out in this example) can be saved, sent by e-mail, or inspected with a text editor. In the last case make sure that the file size is not excessive for your favourite editor.
* Raw text data format
Two formats are supported currently: the original, or '1t' format, and the '1u' format. The '1t' format is deprecated in kernel 2.6.21. The '1u' format adds a few fields, such as ISO frame descriptors, interval, etc. It produces slightly longer lines, but otherwise is a perfect superset of '1t' format.
If it is desired to recognize one from the other in a program, look at the "address" word (see below), where '1u' format adds a bus number. If 2 colons are present, it's the '1t' format, otherwise '1u'.
Any text format data consists of a stream of events, such as URB submission, URB callback, submission error. Every event is a text line, which consists of whitespace separated words. The number or position of words may depend on the event type, but there is a set of words, common for all types.
Here is the list of words, from left to right:
- URB Tag. This is used to identify URBs is normally a kernel mode address of the URB structure in hexadecimal.
- Timestamp in microseconds, a decimal number. The timestamp's resolution depends on available clock, and so it can be much worse than a microsecond (if the implementation uses jiffies, for example).
- Event Type. This type refers to the format of the event, not URB type. Available types are: S - submission, C - callback, E - submission error.
- "Address" word (formerly a "pipe"). It consists of four fields, separated by colons: URB type and direction, Bus number, Device address, Endpoint number. Type and direction are encoded with two bytes in the following manner: Ci Co Control input and output Zi Zo Isochronous input and output Ii Io Interrupt input and output Bi Bo Bulk input and output Bus number, Device address, and Endpoint are decimal numbers, but they may have leading zeros, for the sake of human readers.
- URB Status word. This is either a letter, or several numbers separated by colons: URB status, interval, start frame, and error count. Unlike the "address" word, all fields save the status are optional. Interval is printed only for interrupt and isochronous URBs. Start frame is printed only for isochronous URBs. Error count is printed only for isochronous callback events.
The status field is a decimal number, sometimes negative, which represents a "status" field of the URB. This field makes no sense for submissions, but is present anyway to help scripts with parsing. When an error occurs, the field contains the error code.
In case of a submission of a Control packet, this field contains a Setup Tag instead of an group of numbers. It is easy to tell whether the Setup Tag is present because it is never a number. Thus if scripts find a set of numbers in this word, they proceed to read Data Length (except for isochronous URBs). If they find something else, like a letter, they read the setup packet before reading the Data Length or isochronous descriptors.
- Setup packet, if present, consists of 5 words: one of each for bmRequestType, bRequest, wValue, wIndex, wLength, as specified by the USB Specification 2.0. These words are safe to decode if Setup Tag was 's'. Otherwise, the setup packet was present, but not captured, and the fields contain filler.
- Number of isochronous frame descriptors and descriptors themselves. If an Isochronous transfer event has a set of descriptors, a total number of them in an URB is printed first, then a word per descriptor, up to a total of 5. The word consists of 3 colon-separated decimal numbers for status, offset, and length respectively. For submissions, initial length is reported. For callbacks, actual length is reported.
- Data Length. For submissions, this is the requested length. For callbacks, this is the actual length.
- Data tag. The usbmon may not always capture data, even if length is nonzero. The data words are present only if this tag is '='.
- Data words follow, in big endian hexadecimal format. Notice that they are not machine words, but really just a byte stream split into words to make it easier to read. Thus, the last word may contain from one to four bytes. The length of collected data is limited and can be less than the data length report in Data Length word.
Here is an example of code to read the data stream in a well known programming language:
class ParsedLine { int data_len; /* Available length of data */ byte data[];
void parseData(StringTokenizer st) { int availwords = st.countTokens(); data = new byte[availwords * 4]; data_len = 0; while (st.hasMoreTokens()) { String data_str = st.nextToken(); int len = data_str.length() / 2; int i; int b; // byte is signed, apparently?! XXX for (i = 0; i < len; i++) { // data[data_len] = Byte.parseByte( // data_str.substring(i*2, i*2 + 2), // 16); b = Integer.parseInt( data_str.substring(i*2, i*2 + 2), 16); if (b >= 128) b *= -1; data[data_len] = (byte) b; data_len++; } } } }
Examples:
An input control transfer to get a port status.
d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 < d5ea89a0 3575914560 C Ci:1:001:0 0 4 = 01050000
An output bulk transfer to send a SCSI command 0x5E in a 31-byte Bulk wrapper to a storage device at address 5:
dd65f0e8 4128379752 S Bo:1:005:2 -115 31 = 55534243 5e000000 00000000 00000600 00000000 00000000 00000000 000000 dd65f0e8 4128379808 C Bo:1:005:2 0 31 >
* Raw binary format and API
The overall architecture of the API is about the same as the one above, only the events are delivered in binary format. Each event is sent in the following structure (its name is made up, so that we can refer to it):
struct usbmon_packet { u64 id; /* 0: URB ID - from submission to callback */ unsigned char type; /* 8: Same as text; extensible. */ unsigned char xfer_type; /* ISO (0), Intr, Control, Bulk (3) */ unsigned char epnum; /* Endpoint number and transfer direction */ unsigned char devnum; /* Device address */ u16 busnum; /* 12: Bus number */ char flag_setup; /* 14: Same as text */ char flag_data; /* 15: Same as text; Binary zero is OK. */ s64 ts_sec; /* 16: gettimeofday */ s32 ts_usec; /* 24: gettimeofday */ int status; /* 28: */ unsigned int length; /* 32: Length of data (submitted or actual) */ unsigned int len_cap; /* 36: Delivered length */ unsigned char setup[8]; /* 40: Only for Control 'S' */ }; /* 48 bytes total */
These events can be received from a character device by reading with read(2), with an ioctl(2), or by accessing the buffer with mmap.
The character device is usually called /dev/usbmonN, where N is the USB bus number. Number zero (/dev/usbmon0) is special and means "all buses". However, this feature is not implemented yet. Note that specific naming policy is set by your Linux distribution.
If you create /dev/usbmon0 by hand, make sure that it is owned by root and has mode 0600. Otherwise, unpriviledged users will be able to snoop keyboard traffic.
The following ioctl calls are available, with MON_IOC_MAGIC 0x92:
MON_IOCQ_URB_LEN, defined as _IO(MON_IOC_MAGIC, 1)
This call returns the length of data in the next event. Note that majority of events contain no data, so if this call returns zero, it does not mean that no events are available.
MON_IOCG_STATS, defined as _IOR(MON_IOC_MAGIC, 3, struct mon_bin_stats)
The argument is a pointer to the following structure:
struct mon_bin_stats { u32 queued; u32 dropped; };
The member "queued" refers to the number of events currently queued in the buffer (and not to the number of events processed since the last reset).
The member "dropped" is the number of events lost since the last call to MON_IOCG_STATS.
MON_IOCT_RING_SIZE, defined as _IO(MON_IOC_MAGIC, 4)
This call sets the buffer size. The argument is the size in bytes. The size may be rounded down to the next chunk (or page). If the requested size is out of [unspecified] bounds for this kernel, the call fails with -EINVAL.
MON_IOCQ_RING_SIZE, defined as _IO(MON_IOC_MAGIC, 5)
This call returns the current size of the buffer in bytes.
MON_IOCX_GET, defined as _IOW(MON_IOC_MAGIC, 6, struct mon_get_arg)
This call waits for events to arrive if none were in the kernel buffer, then returns the first event. Its argument is a pointer to the following structure:
struct mon_get_arg { struct usbmon_packet *hdr; void *data; size_t alloc; /* Length of data (can be zero) */ };
Before the call, hdr, data, and alloc should be filled. Upon return, the area pointed by hdr contains the next event structure, and the data buffer contains the data, if any. The event is removed from the kernel buffer.
MON_IOCX_MFETCH, defined as _IOWR(MON_IOC_MAGIC, 7, struct mon_mfetch_arg)
This ioctl is primarily used when the application accesses the buffer with mmap(2). Its argument is a pointer to the following structure:
struct mon_mfetch_arg { uint32_t *offvec; /* Vector of events fetched */ uint32_t nfetch; /* Number of events to fetch (out: fetched) */ uint32_t nflush; /* Number of events to flush */ };
The ioctl operates in 3 stages.
First, it removes and discards up to nflush events from the kernel buffer. The actual number of events discarded is returned in nflush.
Second, it waits for an event to be present in the buffer, unless the pseudo- device is open with O_NONBLOCK.
Third, it extracts up to nfetch offsets into the mmap buffer, and stores them into the offvec. The actual number of event offsets is stored into the nfetch.
MON_IOCH_MFLUSH, defined as _IO(MON_IOC_MAGIC, 8)
This call removes a number of events from the kernel buffer. Its argument is the number of events to remove. If the buffer contains fewer events than requested, all events present are removed, and no error is reported. This works when no events are available too.
FIONBIO
The ioctl FIONBIO may be implemented in the future, if there's a need.
In addition to ioctl(2) and read(2), the special file of binary API can be polled with select(2) and poll(2). But lseek(2) does not work.
* Memory-mapped access of the kernel buffer for the binary API
The basic idea is simple:
To prepare, map the buffer by getting the current size, then using mmap(2). Then, execute a loop similar to the one written in pseudo-code below:
struct mon_mfetch_arg fetch; struct usbmon_packet *hdr; int nflush = 0; for (;;) { fetch.offvec = vec; // Has N 32-bit words fetch.nfetch = N; // Or less than N fetch.nflush = nflush; ioctl(fd, MON_IOCX_MFETCH, &fetch); // Process errors, too nflush = fetch.nfetch; // This many packets to flush when done for (i = 0; i < nflush; i++) { hdr = (struct ubsmon_packet *) &mmap_area[vec[i]]; if (hdr->type == '@') // Filler packet continue; caddr_t data = &mmap_area[vec[i]] + 64; process_packet(hdr, data); } }
Thus, the main idea is to execute only one ioctl per N events.
Although the buffer is circular, the returned headers and data do not cross the end of the buffer, so the above pseudo-code does not need any gathering.
|