It's fairly easy to add support for more protocols to l7-filter. All you need to do is add a new pattern file to
/etc/l7-protocols. This directory and its subdirectories are searched (non-recursively) for pattern files. (Thus, it will find
/etc/l7-protocols/http.pat and
/etc/l7-protocols/protocols/http.pat, but not
/etc/l7-protocols/foo/bar/http.pat.) Please consider submitting any patterns you write for inclusion into the official distribution.
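The search rule above can be sketched in a few lines of Python. This only illustrates the depth limit described in the text; it is not l7-filter's own lookup code, and the function name `find_pattern_files` is made up for this example.

```python
from pathlib import Path


def find_pattern_files(root="/etc/l7-protocols"):
    """Return .pat files in root and in its immediate subdirectories only.

    Hypothetical helper: it mirrors the rule described in the text, not
    the code l7-filter actually uses.
    """
    base = Path(root)
    top_level = base.glob("*.pat")          # e.g. root/http.pat
    one_level_down = base.glob("*/*.pat")   # e.g. root/protocols/http.pat
    # Deeper files such as root/foo/bar/http.pat are deliberately not searched.
    return sorted(set(top_level) | set(one_level_down))


if __name__ == "__main__":
    for path in find_pattern_files():
        print(path)
```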

File format

Basic format
The basic format is very simple:
- The name of the protocol on one line
- A regular expression defining the protocol on the next line (see below)
The name of the file must match the name of the protocol. (If the protocol is "ftp", the file must be "ftp.pat".) Lines starting with '#' and blank lines are ignored. Both the kernel and userspace versions of l7-filter will use the given regular expression. For example, vnc.pat could be:
    vnc
    ^rfb 00[1-9]\.00[0-9]\x0a$

Defining a separate userspace pattern
Sometimes it will be desirable to define a separate regular expression for the kernel and userspace versions or to pass a custom set of flags to the userspace version's regcomp/regexec. (See below for why.) In this case, add either or both of these lines after the two above:
    userspace pattern=
    userspace flags=
For example, smtp.pat could be:
    smtp
    ^220[\x09-\x0d -~]* (e?smtp|simple mail)
    userspace pattern=^220[\x09-\x0d -~]* (E?SMTP|[Ss]imple [Mm]ail)
    userspace flags=REG_NOSUB REG_EXTENDED

Meta-data
Pattern files that are part of the official distribution need some metadata at the top, partly for the use of frontends. The top four lines should look like this:
    #
    # Pattern attributes: [attribute word]*
    # Protocol groups: [group name]*
    # Wiki: [link]*
"Pattern attributes" give information about how good the pattern is on various scales. Attribute words can be any of undermatch, overmatch, superset, subset, great, good, ok, marginal, poor, veryfast, fast, notsofast, or slow. Any number of these may be used. Their definitions are part of the l7-filter documentation.
"Protocol groups" are supposed to give frontends a way to group similar protocols. Group names can be whatever you like, but should match existing names if possible. Any number may be used. More relevant groups should be listed first for sorting purposes. Group names in use as of 2007-01-14 are:
- chat
- document_retrieval
- file
- game
- ietf_draft_standard
- ietf_internet_standard
- ietf_proposed_standard
- ietf_rfc_documented
- mail
- monitoring
- networking
- obsolete
- open_source
- p2p
- printer
- proprietary
- remote_access
- secure
- streaming_audio
- streaming_video
- time_synchronization
- version_control
- voip
- worm
- x_consortium_standard
"Wiki" gives zero or more links to pages documenting the pattern and other methods of identifying the protocol on the project wiki.

Regular expressions
The kernel and userspace versions of l7-filter use different regular expression libraries. They use generally the same syntax, but have some differences.

General information
Because patterns frequently need to use non-printable characters, both versions of l7-filter add hex escape support on top of their stock libraries. This uses \xHH notation, so to match a tab, use "\x09". Note that regexp control characters are still control characters even when written in hex:
    \x24 == $    \x28 == (
    \x29 == )    \x2a == *
    \x2b == +    \x2e == .
    \x3f == ?    \x5b == [
    \x5c == \    \x5d == ]
    \x5e == ^    \x7b == { (only a control character for the userspace version)
    \x7c == |    \x7d == } (only a control character for the userspace version)
Both versions of l7-filter strip out the nulls (\x00 bytes) from network data so that they can treat it as normal C strings. So (1) you can't match on nulls and (2) fields may appear shorter than expected. For example, if a protocol has a 4 byte field and any of those bytes can be null, it can appear to be any length from 0 to 4.
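Both effects described above can be imitated in Python. This is a rough sketch under assumptions: Python's `re` module stands in for the real regexp libraries, and `expand_hex` and `strip_nulls` are made-up names. Python's own `\xHH` handling treats the byte as a literal, so the sketch expands the escapes before compiling in order to mimic the behaviour the text describes.

```python
import re

# Matches one \xHH escape in a pattern string.
HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def expand_hex(pattern):
    """Replace each hex escape with the character it stands for.

    After expansion the character is an ordinary part of the pattern, so
    a hex-escaped "." is still a wildcard, as the text warns.
    """
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), pattern)


def strip_nulls(data):
    """Drop every null byte, as both versions are described as doing."""
    return data.replace(b"\x00", b"")


if __name__ == "__main__":
    expanded = expand_hex(r"a\x2eb")
    print(expanded)                                   # the dot is now a wildcard
    print(re.fullmatch(expanded, "aXb") is not None)  # wildcard matches "X"
    # A 4-byte length field holding the value 300 shrinks to 2 bytes.
    print(len(strip_nulls(b"\x00\x00\x01\x2c")))
```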

Kernel version
The kernel version of l7-filter uses Henry Spencer's 1987 implementation of regular expressions ("V8 regexps"), with a few modifications, noted here. V8 regexps are likely more limited than the regexps you are used to. Notably, you cannot use bounds ("foo{3}"), character classes ("[[:punct:]]") or backreferences.
Because this library does not have a flag for case-sensitivity, the kernel version of l7-filter is always case insensitive. Upper case in patterns is identical to lower case. (This is true even if you write an uppercase letter in hex!)
The kernel version completely ignores any lines in the pattern file after the second non-comment line.
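As a minimal sketch of the rule above, the following Python reads a pattern file the way the kernel version is described as doing: the first two non-comment lines are the protocol name and the pattern, and everything after them is set aside. The function name `read_kernel_fields` is hypothetical, and treating whitespace-only lines as blank is an assumption.

```python
def read_kernel_fields(text, filename=None):
    """Return (name, pattern, ignored_lines) from the text of a .pat file.

    Hypothetical helper for illustration; not l7-filter's parser.
    """
    significant = [
        line for line in text.splitlines()
        # Assumption: whitespace-only lines count as blank lines.
        if line.strip() and not line.startswith("#")
    ]
    if len(significant) < 2:
        raise ValueError("need a protocol name line and a pattern line")
    name, pattern = significant[0].strip(), significant[1]
    if filename is not None and filename != name + ".pat":
        raise ValueError("file name must match the protocol name")
    # Later lines (such as the userspace ones) do not affect the kernel version.
    return name, pattern, significant[2:]


SAMPLE = r"""# example file for illustration only

vnc
^rfb 00[1-9]\.00[0-9]\x0a$
userspace flags=REG_NOSUB REG_EXTENDED
"""

if __name__ == "__main__":
    print(read_kernel_fields(SAMPLE, "vnc.pat"))
```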

Userspace version
The userspace version of l7-filter uses the GNU regular expression library, so its behaviour should be more familiar. This library is documented in man 3 regcomp and man 7 regex.
If only one regular expression is specified in the pattern file (see above), the userspace version compiles it with the flags REG_EXTENDED | REG_ICASE | REG_NOSUB and executes it with no flags.
If the userspace pattern and userspace flags lines are given, the userspace pattern will be used instead of the first one. It will be compiled and executed with the given flags. (l7-filter will sort out which flags go to regcomp and which to regexec.)
If only the userspace pattern line is given, the userspace pattern will be compiled with REG_EXTENDED | REG_ICASE | REG_NOSUB and executed with no flags. If only the userspace flags line is given, the single regular expression will be compiled and executed with the given flags.
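The four cases above can be written down as one small function. This is a sketch of the rules as stated, not l7-filter's code: the function name is invented, flags are kept as plain strings, and the split between regcomp and regexec flags uses only the standard POSIX flag names. How l7-filter treats any other flag name is not covered here.

```python
DEFAULT_COMPILE_FLAGS = ["REG_EXTENDED", "REG_ICASE", "REG_NOSUB"]
# Standard POSIX flag names; other names are rejected by this sketch.
REGCOMP_FLAGS = {"REG_EXTENDED", "REG_ICASE", "REG_NOSUB", "REG_NEWLINE"}
REGEXEC_FLAGS = {"REG_NOTBOL", "REG_NOTEOL"}


def userspace_settings(pattern, userspace_pattern=None, userspace_flags=None):
    """Return (pattern_to_use, regcomp_flags, regexec_flags).

    pattern is the regular expression on the second line of the file;
    the other two arguments are the values of the optional
    "userspace pattern=" and "userspace flags=" lines, or None if absent.
    """
    chosen = userspace_pattern if userspace_pattern is not None else pattern
    if userspace_flags is None:
        # No flags line: default compile flags, no execute flags.
        return chosen, list(DEFAULT_COMPILE_FLAGS), []
    compile_flags, exec_flags = [], []
    for name in userspace_flags.split():
        if name in REGCOMP_FLAGS:
            compile_flags.append(name)
        elif name in REGEXEC_FLAGS:
            exec_flags.append(name)
        else:
            raise ValueError("unknown flag: " + name)
    return chosen, compile_flags, exec_flags


if __name__ == "__main__":
    print(userspace_settings(
        r"^220[\x09-\x0d -~]* (e?smtp|simple mail)",
        r"^220[\x09-\x0d -~]* (E?SMTP|[Ss]imple [Mm]ail)",
        "REG_NOSUB REG_EXTENDED",
    ))
```

Note that in the smtp.pat example REG_ICASE is left out of the flags line, which is what makes the separate mixed-case userspace pattern meaningful.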
" I& @; h+ c0 wWhat l7-filter sees and doesIf you have set up your iptables rules correctly (see the ), l7-filter sees the data going in both directions in the order that it passes through the computer. For instance, in FTP, the first thing it sees is "221 server ready", then "USER bob", then "331 send password", then "PASS frogbeard", and so on.
; \, V- y4 ]* u) ]6 E0 Xl7-filter can match across packets. For instance, with the above FTP example, the match is first attempted on "221 server ready", then on "221 server readyUser bob", then "221 server readyUSER bob331 send password", so you could match it with "220.*user.*331". At each match attempt, the regexp special character ^ will match the beginning of the stream and $ will match the end of the last packet seen so far. Because the Linux kernel's ip_conntrack module tracks connectionless UDP and ICMP sessions as "connections", this works with them as well as TCP.( { j: u# n* a
Usually the identifying characteristics of a connection are found at the beginning of that connection. For this reason, and to save processing time, l7-filter only looks at the first 10 packets or 2kB of each connection, whichever is smaller. Any match made within this time is applied to the rest of the connection as well.
[1] Yes, there should be CRLFs in there, which I omitted for clarity. Picky, picky.
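The matching behaviour described in this section can be simulated with a short Python sketch. It is only a model: Python's `re` stands in for the real regexp libraries, the function name is invented, and the exact way the 10 packet and 2kB limits are enforced here (truncating the buffer and stopping once it is full) is an assumption.

```python
import re


def first_matching_packet(payloads, pattern, max_packets=10, max_bytes=2048):
    """Return the 1-based number of the packet on which pattern first
    matches the accumulated stream, or None if it never matches within
    the inspected window.

    payloads is a list of bytes objects in the order they were seen,
    both directions mixed together; pattern is a bytes regular expression.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    stream = b""
    for number, payload in enumerate(payloads[:max_packets], start=1):
        stream += payload.replace(b"\x00", b"")  # nulls are stripped
        stream = stream[:max_bytes]              # assumed handling of the 2kB cap
        # ^ anchors at the start of the stream, $ at the end of the data so far.
        if regex.search(stream):
            return number
        if len(stream) >= max_bytes:
            break
    return None


if __name__ == "__main__":
    ftp = [b"220 server ready", b"USER bob", b"331 send password", b"PASS frogbeard"]
    print(first_matching_packet(ftp, rb"220.*user.*331"))  # needs three packets
    print(first_matching_packet(ftp, rb"^220"))            # matches immediately
```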

What makes a good pattern
There are three general guidelines:
1) A pattern must be neither too specific nor not specific enough.
Example 1: The pattern "bear" for Bearshare is not specific enough. This pattern could match a wide variety of non-Bearshare connections. For instance, an HTTP request for any URL containing "bear" would be matched.
Example 2: "220 .*ftp.*(\[.*\]|\(.*\))" for FTP is too specific. Not all servers send ()s or []s after their 220. In fact, servers are not even required to send the string "ftp" at any time, but the vast majority do. Good judgement and testing are necessary for instances such as this.
2) It should use a minimum of processing power. If it's possible to reduce the number of instances of *, + and | in your pattern, you should do so. Use the performance testing program included in the patterns package.
3) It should complete its match on the earliest packet possible. The FTP pattern could be "^220[\x09-\x0d -~]*\x0d\x0aUSER[\x09-\x0d -~]*\x0d\x0a331", but that won't match until the third data packet. Instead, we use "^220[\x09-\x0d -~]*ftp", which matches on the first data packet.
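A quick way to get a feel for the first guideline is to run candidate patterns against a few sample strings. The sketch below uses Python's `re` as a stand-in for the real regexp libraries; the HTTP request and FTP banners are invented for illustration and are not captured traffic.

```python
import re


def matches(pattern, data):
    """Case-insensitive search, roughly how the default flags behave."""
    return re.search(pattern, data, re.IGNORECASE) is not None


# Made-up sample data (assumptions, not real captures).
HTTP_REQUEST = "GET /bear.html HTTP/1.1"
FTP_BANNERS = [
    "220 FTP server ready",
    "220 files.example.org FTP server (Version 1.0) ready",
]

TOO_LOOSE = "bear"                              # Example 1 in the text
TOO_SPECIFIC = r"220 .*ftp.*(\[.*\]|\(.*\))"    # Example 2 in the text
BALANCED = r"^220[\x09-\x0d -~]*ftp"            # the pattern from guideline 3

if __name__ == "__main__":
    print("loose pattern on HTTP:", matches(TOO_LOOSE, HTTP_REQUEST))
    for banner in FTP_BANNERS:
        print(repr(banner),
              "specific:", matches(TOO_SPECIFIC, banner),
              "balanced:", matches(BALANCED, banner))
```

The loose pattern fires on unrelated HTTP traffic, the overly specific one misses the banner without brackets, and the shorter anchored pattern accepts both banners while rejecting the HTTP request.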

Miscellaneous tips
    [\x09-\x0d -~] == printable characters, including whitespace
    [\x09-\x0d ] == any whitespace
    [!-~] == non-whitespace printable characters

Recommended procedure for writing patterns
- Find and read the spec for the protocol you wish to match. If it's an Internet standard, RFCs are a good place to start, although not all standards are RFCs. If it is a proprietary protocol, it is likely that someone has written a reverse-engineered spec for it. Do a general web search to find it. Skipping this step is a good way to write patterns that are overly specific!
- Use something like Wireshark (formerly known as Ethereal) to watch packets of this protocol go by in a typical session of its use. (If you failed to find a spec for your protocol, but Wireshark can parse it, reading the Wireshark source code may also be worth your time.)
- Write a pattern that will reliably match one of the first few packets that are sent in your protocol. Test it. Test its performance.
- Send your pattern to l7-filter-developers{/-\T}lists*sf*net for it to be incorporated into the official pattern definitions (you must subscribe first).

HOWTO send a packet dump to the mailing list
If you do not feel that you are able to do all of the above yourself, you may want to send some packets you have captured to the mailing list so that others can do the rest. In order for this to be useful, please follow these guidelines:
- If you have never done anything like this before, use Wireshark. It's easy to use and available for GNU/Linux, Mac and Windows (and FreeBSD, HP-UX, NetBSD, Solaris...). Use File→Save to save the captured packets.
- Make sure that you start capturing packets before the application that you are testing has started using the network. l7-filter looks at the opening packets of a connection. If these are not present in the packet dump, it is useless.
- If it makes sense for the protocol in question, send a recognizable text string so that the relevant connection can be found in the packet dump. For instance, if testing an instant messenger, send a message with "hello hello hello."
- Along with your capture, send us anything that could be helpful in picking out the relevant data. For example, this could include the server's IP address, what network operations you performed, the version numbers of all software used, any strings you expect to appear in the packets (such as instant messenger text, e-mail addresses, gaming handles, etc.), etc.
- Try not to capture an excessive number of packets. In particular:
  - Avoid having other programs use the network during your capture. Assuming their traffic is recognizable, the excess packets can be filtered out, but it's annoying.
  - Avoid sending captures that have many thousands of packets from the same connection. All but the first few are useless.
  - However, if you are not sure when the application opens connections, or if it opens many simultaneous connections, it might be necessary to send a large number of packets. This is ok.
- Send the packets in libpcap format or something else that Wireshark can read. Do not:
  - send only a text hexdump of the packets. This is unnecessarily hard to read.
  - send only the data portion of the packets. The TCP headers in particular are essential for finding streams. You may anonymize addresses if necessary, but try to avoid it.
  - compress the captured packets with anything other than gzip or bzip2. In fact, no compression is needed unless the file is very large.
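Before sending a capture, it may help to confirm that the file really is in one of the formats mentioned above. The following is a small optional sketch, not part of l7-filter: it guesses the format from the well-known magic bytes at the start of the file. The function name and the file name `capture.pcap` are placeholders.

```python
def capture_format(header):
    """Guess a capture file's format from its first few bytes."""
    if header[:2] == b"\x1f\x8b":
        return "gzip"
    if header[:3] == b"BZh":
        return "bzip2"
    magic = header[:4]
    # The libpcap magic number is written in the capturing host's byte order.
    if magic in (b"\xa1\xb2\xc3\xd4", b"\xd4\xc3\xb2\xa1"):
        return "pcap"
    if magic in (b"\xa1\xb2\x3c\x4d", b"\x4d\x3c\xb2\xa1"):
        return "pcap (nanosecond timestamps)"
    if magic == b"\x0a\x0d\x0d\x0a":
        return "pcapng"
    return "unknown"


if __name__ == "__main__":
    # "capture.pcap" is a placeholder file name.
    with open("capture.pcap", "rb") as handle:
        print(capture_format(handle.read(4)))
```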
If you aren't sure how to follow these guidelines, try your best and send the result to us. If it's wrong, we'll be happy to tell you how to fix it.
Last updated 23 April 2008