I previously wrote a rough overview of the basic framework of iptables, i.e. netfilter. It sees heavy use in the security field: small device firewalls, vendor-specific implementations such as Cisco's ACLs, simple packet filtering, and so on can all be built on it. Its biggest drawback is the performance impact; perhaps we should borrow from the mechanisms of tcpdump/wireshark (both of which include packet parsing), or from the DPI engines popular today, which are even used in industrial control. Netfilter is not perfect, but its framework is solid, and we can improve and customize it.
Reference versions: iptables 1.4.21, kernel 3.8.13.
The iptables mechanism consists of two parts:
1. The userspace iptables tool
2. Kernel-side netfilter support
It is split into an IPv4 part and an IPv6 part; here we only analyze IPv4. Kernel source directories:
net/ipv4/netfilter
net/ipv6/netfilter
net/netfilter
Let us start with the kernel side.
As we all know, netfilter has four basic modules:
1. ct: connection tracking
2. filter: packet filtering
3. nat: address translation
4. mangle: packet modification
ct is the core foundation module, the basis of both the stateful firewall and NAT. The other modules each maintain a global table, which is the table we specify when running the iptables command.
They all operate at the kernel's five hook points, which correspond to the concept of chains.
In the actual code these modules are initialized relatively independently, under net/ipv4/netfilter/:
1. iptable_filter.c   -> iptable_filter_init
2. iptable_mangle.c   -> iptable_mangle_init
3. iptable_nat.c      -> iptable_nat_init
4. iptable_raw.c      -> iptable_raw_init
5. iptable_security.c -> iptable_security_init
Connection tracking, being the foundation of the others, is special. Its core init is nf_conntrack_init in net/netfilter/nf_conntrack_core.c, which is in turn called from nf_conntrack_standalone.c:
static int nf_conntrack_net_init(struct net *net)
{
	int ret;

	ret = nf_conntrack_init(net);
	if (ret < 0)
		goto out_init;
	ret = nf_conntrack_standalone_init_proc(net);
	if (ret < 0)
		goto out_proc;
	net->ct.sysctl_checksum = 1;
	net->ct.sysctl_log_invalid = 0;
	ret = nf_conntrack_standalone_init_sysctl(net);
	if (ret < 0)
		goto out_sysctl;
	return 0;

out_sysctl:
	nf_conntrack_standalone_fini_proc(net);
out_proc:
	nf_conntrack_cleanup(net);
out_init:
	return ret;
}

static void nf_conntrack_net_exit(struct net *net)
{
	nf_conntrack_standalone_fini_sysctl(net);
	nf_conntrack_standalone_fini_proc(net);
	nf_conntrack_cleanup(net);
}

static struct pernet_operations nf_conntrack_net_ops = {
	.init = nf_conntrack_net_init,
	.exit = nf_conntrack_net_exit,
};

static int __init nf_conntrack_standalone_init(void)
{
	return register_pernet_subsys(&nf_conntrack_net_ops);
}
register_pernet_subsys needs no lengthy explanation: it is the kernel interface for registering a subsystem with the network namespace machinery.
To get a rough framework out of this much netfilter code, the place to start is the Makefiles.
1. net/netfilter/Makefile
First come the basic core objects, then:
1. The netfilter netlink interface
2. L3/L4 protocol support for connection tracking: nf_conntrack_l4proto_register, nf_conntrack_l3proto_register
3. Connection-tracking helpers: nf_conntrack_helper_register (mainly for related connections, recognition of other ct cases, etc.)
4. Registration of matches and targets
5. Initialization of the core conntrack code
6. The common infrastructure shared by NAT and conntrack
7. The ipset and ipvs tools
And net/ipv4/netfilter/Makefile? It handles IPv4-specific processing:
1. NAT-related helpers and protocol registration
2. Connection tracking
3. The filter, mangle, nat, raw, and security table instances
4. Registration of matches and targets
5. ARP and a few other things
With that overview done, let us start from filter. For a configured table such as filter to take effect, the kernel calls ipt_do_table to look up the configured rules and trigger their targets.
First, the filter initialization flow, in iptable_filter.c:
iptable_filter_init
Registering the hooks requires the registration interface nf_register_hooks:
static struct nf_hook_ops ipt_ops[] __read_mostly = {
	{
		.hook     = ipt_local_in_hook,
		.owner    = THIS_MODULE,
		.pf       = NFPROTO_IPV4,
		.hooknum  = NF_INET_LOCAL_IN,
		.priority = NF_IP_PRI_FILTER,
	},
	{
		.hook     = ipt_hook,
		.owner    = THIS_MODULE,
		.pf       = NFPROTO_IPV4,
		.hooknum  = NF_INET_FORWARD,
		.priority = NF_IP_PRI_FILTER,
	},
	{
		.hook     = ipt_local_out_hook,
		.owner    = THIS_MODULE,
		.pf       = NFPROTO_IPV4,
		.hooknum  = NF_INET_LOCAL_OUT,
		.priority = NF_IP_PRI_FILTER,
	},
};
Clearly filter sits only on the LOCAL_IN, LOCAL_OUT, and FORWARD hooks, and its hook functions all end up calling:
/* Returns one of the generic firewall policies, like NF_ACCEPT. */
unsigned int
ipt_do_table(struct sk_buff *skb,
	     unsigned int hook,
	     const struct net_device *in,
	     const struct net_device *out,
	     struct xt_table *table)
So let us analyze ipt_do_table. Among its parameters, only struct xt_table needs explanation; the others are self-evident.
1. The NF_HOOK invocation looks up the hook functions:
elem = &nf_hooks[pf][hook];
finds the corresponding nf_hook_ops.
2. The hook functions are then called.
Comparing 2.6.32 with 3.8.13, there have been substantial changes, but the principle is the same.
In iptable_filter.c the filter hook functions have been unified into a single function:
static unsigned int
iptable_filter_hook(unsigned int hook, struct sk_buff *skb,
		    const struct net_device *in, const struct net_device *out,
		    int (*okfn)(struct sk_buff *))
{
	const struct net *net;

	if (hook == NF_INET_LOCAL_OUT &&
	    (skb->len < sizeof(struct iphdr) ||
	     ip_hdrlen(skb) < sizeof(struct iphdr)))
		/* root is playing with raw sockets. */
		return NF_ACCEPT;

	net = dev_net((in != NULL) ? in : out);
	return ipt_do_table(skb, hook, in, out, net->ipv4.iptable_filter);
}
For ipt_do_table, the thing we most need to care about is where net->ipv4.iptable_filter comes from; of course, we know it is where the rules are stored. Searching the kernel code:
static int __net_init iptable_filter_net_init(struct net *net)
{
	struct ipt_replace *repl;

	repl = ipt_alloc_initial_table(&packet_filter);
	if (repl == NULL)
		return -ENOMEM;
	/* Entry 1 is the FORWARD hook */
	((struct ipt_standard *)repl->entries)[1].target.verdict =
		forward ? -NF_ACCEPT - 1 : -NF_DROP - 1;

	net->ipv4.iptable_filter =
		ipt_register_table(net, &packet_filter, repl);
	kfree(repl);
	return PTR_RET(net->ipv4.iptable_filter);
}
The assignment to net->ipv4.iptable_filter via ipt_register_table in the code above is what initializes it; net itself is a global created when the inet layer is initialized.
net->ipv4.iptable_filter has type struct xt_table *.
Each functional module maintains one such table, represented by struct xt_table; the registration API is ipt_register_table (a wrapper around xt_register_table).
In include/linux/netfilter/x_tables.h:
/* Furniture shopping... */
struct xt_table {
	struct list_head list;

	/* What hooks you will enter on */
	unsigned int valid_hooks;

	/* Man behind the curtain... */
	struct xt_table_info *private;

	/* Set this to THIS_MODULE if you are a module, otherwise NULL */
	struct module *me;

	u_int8_t af;		/* address/protocol family */
	int priority;		/* hook order */

	/* A unique name... */
	const char name[XT_TABLE_MAXNAMELEN];
};
It contains another important structure: struct xt_table_info *private.
Of course, the rules in these tables are pushed down as configuration from userspace, along with the registered matches and targets; we will cover the structure of the rules and their relation to the table later.
As for maintaining the global tables: every table registered through the API is linked into net->xt.tables.
xt has type struct netns_xt:
struct netns_xt {
	struct list_head tables[NFPROTO_NUMPROTO];
	bool notrack_deprecated_warning;
#if defined(CONFIG_BRIDGE_NF_EBTABLES) || \
    defined(CONFIG_BRIDGE_NF_EBTABLES_MODULE)
	struct ebt_table *broute_table;
	struct ebt_table *frame_filter;
	struct ebt_table *frame_nat;
#endif
};
There is also another global variable, xt, of type struct xt_af (indexed per address family):
struct xt_af {
	struct mutex mutex;
	struct list_head match;
	struct list_head target;
#ifdef CONFIG_COMPAT
	struct mutex compat_mutex;
	struct compat_delta *compat_tab;
	unsigned int number; /* number of slots in compat_tab[] */
	unsigned int cur; /* number of used slots in compat_tab[] */
#endif
};
With the tables in place, let us look at match and target registration: matches and targets are placed on xt.match and xt.target (unlike earlier kernels, tables and matches/targets are now handled separately):
list_add(&match->list, &xt[af].match);
Registration interfaces:
xt_register_table
xt_register_match
xt_register_target
Corresponding structures:
struct xt_table
struct xt_match
struct xt_target
And the rules? Let us first read a comment:
/* This structure defines each of the firewall rules.  Consists of 3 parts which are
   1) general IP header stuff
   2) match specific stuff
   3) the target to perform if the rule matches */
struct ipt_entry
{
...
Back to ipt_do_table.
It finds the first ipt_entry and calls ip_packet_match (five-tuple matching begins, continuing until a matching entry is found), comparing the IP header against ipt_entry->ip; a mismatch returns false:
1. Compare source and destination addresses
2. Compare the in/out interface names
3. Compare the protocol
4. Check IPT_F_FRAG
Once a matching entry is found, the match functions stored in the entry are called.
ipt_entry->elems holds the matches and the target:
- ipt_entry: the standard match structure, mainly holding the packet's source/destination IPs, in/out interfaces, masks, and so on;
- ipt_entry_match: extended matches; a rule may carry zero or more ipt_entry_match structures;
- ipt_entry_target: a rule has exactly one target action, executed only after all standard and extended matches have passed.
The relevant code:
xt_ematch_foreach(ematch, e) {
	acpar.match = ematch->u.kernel.match;
	acpar.matchinfo = ematch->data;
	if (!acpar.match->match(skb, &acpar))
		goto no_match;
}
So where does the match function come from, and where is ipt_entry filled in? We will see the answers later.
If the matches succeed, the corresponding target is invoked. Be careful not to confuse struct xt_match with struct xt_entry_match. For example:
struct xt_entry_match {
	union {
		struct {
			__u16 match_size;

			/* Used by userspace */
			char name[XT_EXTENSION_MAXNAMELEN];
			__u8 revision;
		} user;
		struct {
			__u16 match_size;

			/* Used inside the kernel */
			struct xt_match *match;
		} kernel;

		/* Total length */
		__u16 match_size;
	} u;

	unsigned char data[0];
};
For now, the kernel finds the rules through ipt_entry. A rule is composed of:
ipt_entry + ipt_entry_match + ... + target
But how does iptables deliver this, and how does it get associated with the registered matches and targets? We have seen how the kernel finds the rules through the table and evaluates matches and actions, but nothing yet connects this to the application; time to tie the two together.
Take a simple command, say filtering TCP:
iptables -A OUTPUT -p tcp --dport 31337 -j DROP
In the iptables 1.4.21 source, the main function of the iptables command is iptables_main in iptables-standalone.c, wrapped one more level in xtables-multi.c:
static const struct subcommand multi_subcommands[] = {
#ifdef ENABLE_IPV4
	{"iptables",         iptables_main},
	{"main4",            iptables_main},
	{"iptables-save",    iptables_save_main},
	{"save4",            iptables_save_main},
	{"iptables-restore", iptables_restore_main},
	{"restore4",         iptables_restore_main},
#endif
	{"iptables-xml",     iptables_xml_main},
	{"xml",              iptables_xml_main},
#ifdef ENABLE_IPV6
	{"ip6tables",         ip6tables_main},
	{"main6",             ip6tables_main},
	{"ip6tables-save",    ip6tables_save_main},
	{"save6",             ip6tables_save_main},
	{"ip6tables-restore", ip6tables_restore_main},
	{"restore6",          ip6tables_restore_main},
#endif
	{NULL},
};

int main(int argc, char **argv)
{
	return subcmd_main(argc, argv, multi_subcommands);
}
After building iptables, you can see that the commands under sbin are all symlinks to xtables-multi.
The main function:
int
iptables_main(int argc, char *argv[])
{
	int ret;
	char *table = "filter";
	struct xtc_handle *handle = NULL;

	iptables_globals.program_name = "iptables";
	ret = xtables_init_all(&iptables_globals, NFPROTO_IPV4);
	if (ret < 0) {
		fprintf(stderr, "%s/%s Failed to initialize xtables\n",
			iptables_globals.program_name,
			iptables_globals.program_version);
		exit(1);
	}
#if defined(ALL_INCLUSIVE) || defined(NO_SHARED_LIBS)
	init_extensions();
	init_extensions4();
#endif

	ret = do_command4(argc, argv, &table, &handle, false);
	if (ret) {
		ret = iptc_commit(handle);
		iptc_free(handle);
	}
	...
We can see that command-line parsing happens in do_command4:
opts = xt_params->orig_opts;
while ((cs.c = getopt_long(argc, argv,
	"-:A:C:D:R:I:L::S::M:F::Z::N:X::E:P:Vh::o:p:s:d:j:i:fbvwnt:m:xc:g:46",
	opts, NULL)) != -1) {
And what is opts?
#define opts iptables_globals.opts
Also, during earlier initialization: xt_params = &iptables_globals;
struct xtables_globals *xt_params = NULL;
What does this structure contain?
struct xtables_globals
{
	unsigned int option_offset;
	const char *program_name, *program_version;
	struct option *orig_opts;
	struct option *opts;
	void (*exit_err)(enum xtables_exittype status, const char *msg, ...)
		__attribute__((noreturn, format(printf,2,3)));
};
The key point is struct option: orig_opts and opts end up pointing at the same place, and xt_params points to iptables_globals:
static struct option original_opts[] = {
	{.name = "append", .has_arg = 1, .val = 'A'},
	{.name = "delete", .has_arg = 1, .val = 'D'},
	{.name = "check",  .has_arg = 1, .val = 'C'},
	{.name = "insert", .has_arg = 1, .val = 'I'},
	...
struct xtables_globals iptables_globals = {
	.option_offset = 0,
	.program_version = IPTABLES_VERSION,
	.orig_opts = original_opts,
	.exit_err = iptables_exit_error,
};
You may be curious what struct option actually is. Its prototype lives in getopt.h:
struct option {
	const char *name;
	int has_arg;
	int *flag;
	int val;
};
It is needed by getopt_long; if you are unfamiliar with that function, write a small test program to get a feel for it.
Supplementary notes:
extern char *optarg;  // pointer to the current option's argument
extern int optind;    // on the next getopt call, scanning resumes from the position stored in optind
extern int opterr;    // when opterr == 0, getopt prints no error messages to stderr
extern int optopt;    // when an option character is not in optstring, or a required argument is missing, the character is stored in optopt and getopt returns '?'
In the option string:
1. A single character denotes an option.
2. A character followed by one colon means the option requires an argument, placed immediately after the option or separated by a space; optarg points to it.
3. A character followed by two colons means the option takes an optional argument, which must follow the option with no space; optarg points to it (a GNU extension).
4. optind is the index of the next argument to scan; each parsed option advances it, and argv[optind] shows what comes next.
getopt_long parses long arguments according to the option table it is passed. Single-letter options parse easily, but something like --dport has no entry in the original long-option table, so parsing falls into the default case.
The tcp-related --dport handling lives in libxt_tcp.c, starting with the tcp match registration:
static struct xtables_match tcp_match = {
	.family        = NFPROTO_UNSPEC,
	.name          = "tcp",
	.version       = XTABLES_VERSION,
	.size          = XT_ALIGN(sizeof(struct xt_tcp)),
	.userspacesize = XT_ALIGN(sizeof(struct xt_tcp)),
	.help          = tcp_help,
	.init          = tcp_init,
	.parse         = tcp_parse,
	.print         = tcp_print,
	.save          = tcp_save,
	.extra_opts    = tcp_opts,
};
During registration, xtables_register_match places the match on a pending list:

/* place on linked list of matches pending full registration */
me->next = xtables_pending_matches;
xtables_pending_matches = me;
...
static void tcp_help(void)
{
	printf(
"tcp match options:\n"
"[!] --tcp-flags mask comp match when TCP flags & mask == comp\n"
" (Flags: SYN ACK FIN RST URG PSH ALL NONE)\n"
"[!] --syn match when only SYN flag set\n"
" (equivalent to --tcp-flags SYN,RST,ACK,FIN SYN)\n"
"[!] --source-port port[:port]\n"
" --sport ...\n"
" match source port(s)\n"
"[!] --destination-port port[:port]\n"
" --dport ...\n"
" match destination port(s)\n"
"[!] --tcp-option number match if TCP option set\n");
}
static const struct option tcp_opts[] = {
	{.name = "source-port",      .has_arg = true,  .val = '1'},
	{.name = "sport",            .has_arg = true,  .val = '1'}, /* synonym */
	{.name = "destination-port", .has_arg = true,  .val = '2'},
	{.name = "dport",            .has_arg = true,  .val = '2'}, /* synonym */
	{.name = "syn",              .has_arg = false, .val = '3'},
	{.name = "tcp-flags",        .has_arg = true,  .val = '4'},
	{.name = "tcp-option",       .has_arg = true,  .val = '5'},
	XT_GETOPT_TABLEEND,
};
That is the argument-parsing machinery; every argument is parsed this way. So how do the parsed rules get handed to the kernel?
Recall our iptables command:
iptables -A OUTPUT -p tcp --dport 31337 -j DROP
Let us trace its parsing step by step (note that the default table is filter, i.e. -t filter):
1. Default initialization: char *table = "filter";
2. do_command4 parses the command via getopt_long.
As tested above, options starting with a single dash are parsed directly, while -- options go through the default branch.
1> The first option, -A OUTPUT:
case 'A':
	add_command(&command, CMD_APPEND, CMD_NONE,
		    cs.invert);
	chain = optarg;
	break;
add_command sets command to CMD_APPEND,
cs.invert is 0,
chain is OUTPUT.
Perhaps we should look at the getopt_long table again:
static struct option original_opts[] = {
	{.name = "append", .has_arg = 1, .val = 'A'},
Given this initialization, parsing A with has_arg = 1 clearly means A takes one argument, namely OUTPUT.
2> The second option, -p tcp:
/*
 * Option selection
 */
case 'p':
	set_option(&cs.options, OPT_PROTOCOL, &cs.fw.ip.invflags,
		   cs.invert);

	/* Canonicalize into lower case */
	for (cs.protocol = optarg; *cs.protocol; cs.protocol++)
		*cs.protocol = tolower(*cs.protocol);

	cs.protocol = optarg;
	cs.fw.ip.proto = xtables_parse_protocol(cs.protocol);

	if (cs.fw.ip.proto == 0 &&
	    (cs.fw.ip.invflags & XT_INV_PROTO))
		xtables_error(PARAMETER_PROBLEM,
			      "rule would never match protocol");
	break;
cs.options = OPT_PROTOCOL;
cs.protocol = optarg;  // optarg is "tcp"
cs.fw.ip.proto = xtables_parse_protocol(cs.protocol);  // IPPROTO_TCP
xtables_parse_protocol checks whether the protocol is supported and
returns xtables_chain_protos[i].num, the protocol number.
The protocol table is initialized in libxtables, xtables.c:
const struct xtables_pprot xtables_chain_protos[] = {
	{"tcp",       IPPROTO_TCP},
	{"sctp",      IPPROTO_SCTP},
	{"udp",       IPPROTO_UDP},
	{"udplite",   IPPROTO_UDPLITE},
	{"icmp",      IPPROTO_ICMP},
	{"icmpv6",    IPPROTO_ICMPV6},
	{"ipv6-icmp", IPPROTO_ICMPV6},
	{"esp",       IPPROTO_ESP},
	{"ah",        IPPROTO_AH},
	{"ipv6-mh",   IPPROTO_MH},
	{"mh",        IPPROTO_MH},
	{"all",       0},
	{NULL},
};
cs.fw.ip.proto is now IPPROTO_TCP.
3> The third option, --dport 31337:
This enters the command_default path, which first checks whether cs->target is NULL,
and then whether cs->matches is NULL:
for (matchp = cs->matches; matchp; matchp = matchp->next) {
	m = matchp->match;
On the first pass, of course, it goes straight into load_proto, which looks up the match by protocol name via struct xtables_match *xtables_find_match(const char *name, enum xtables_tryload tryload, struct xtables_rule_match **matches), initializing cs.matches to the tcp match:
find_proto(cs->protocol, XTF_TRY_LOAD,
	   cs->options & OPT_NUMERIC, &cs->matches);
xtables_find_match does two things. First:
	/* Second and subsequent clones */
	clone = xtables_malloc(sizeof(struct xtables_match));
	memcpy(clone, ptr, sizeof(struct xtables_match));
	clone->udata = NULL;
	clone->mflags = 0;
	/* This is a clone: */
	clone->next = clone;

	ptr = clone;
	break;
}
Second:
if (ptr && matches) {
	struct xtables_rule_match **i;
	struct xtables_rule_match *newentry;

	newentry = xtables_malloc(sizeof(struct xtables_rule_match));

	for (i = matches; *i; i = &(*i)->next) {
		if (strcmp(name, (*i)->match->name) == 0)
			(*i)->completed = true;
	}
	newentry->match = ptr;
	newentry->completed = false;
	newentry->next = NULL;
	*i = newentry;
}
This completes initializing cs.matches with the (cloned) tcp match.
Having found the registered tcp match, an xt_entry_match node is allocated (xtables_calloc) and xs_init_match(m) is called; for the tcp match, that is:
static void tcp_init(struct xt_entry_match *m)
{
	struct xt_tcp *tcpinfo = (struct xt_tcp *)m->data;

	tcpinfo->spts[1] = tcpinfo->dpts[1] = 0xFFFF;
}
Then xtables_merge_options copies the tcp match's extra_opts into the global opts; optind is decremented, and --dport is re-processed. command_default is entered again, and this time cs.matches is not NULL:
for (matchp = cs->matches; matchp; matchp = matchp->next) {
	m = matchp->match;

	if (matchp->completed ||
	    (m->x6_parse == NULL && m->parse == NULL))
		continue;
	if (cs->c < matchp->match->option_offset ||
	    cs->c >= matchp->match->option_offset + XT_OPTION_OFFSET_SCALE)
		continue;
	xtables_option_mpcall(cs->c, cs->argv, cs->invert, m, &cs->fw);
	return 0;
}
xtables_option_mpcall then does the parsing:
if (m->x6_parse == NULL) {
	if (m->parse != NULL)
		m->parse(c - m->option_offset, argv, invert,
			 &m->mflags, fw, &m->m);
	return;
}
m->parse is tcp_parse from the previously registered tcp_match. Since the tcp extension's options were already merged into the global table, re-parsing yields cs.c == 2, and optarg is the port number we passed:
static int
tcp_parse(int c, char **argv, int invert, unsigned int *flags,
	  const void *entry, struct xt_entry_match **match)
{
	struct xt_tcp *tcpinfo = (struct xt_tcp *)(*match)->data;

	switch (c) {
	case '1':
		if (*flags & TCP_SRC_PORTS)
			xtables_error(PARAMETER_PROBLEM,
				      "Only one `--source-port' allowed");
		parse_tcp_ports(optarg, tcpinfo->spts);
		if (invert)
			tcpinfo->invflags |= XT_TCP_INV_SRCPT;
		*flags |= TCP_SRC_PORTS;
		break;

	case '2':
		if (*flags & TCP_DST_PORTS)
			xtables_error(PARAMETER_PROBLEM,
				      "Only one `--destination-port' allowed");
		parse_tcp_ports(optarg, tcpinfo->dpts);
		if (invert)
			tcpinfo->invflags |= XT_TCP_INV_DSTPT;
		*flags |= TCP_DST_PORTS;
		break;
Note struct xt_tcp *tcpinfo = (struct xt_tcp *)(*match)->data and optarg: recall that after the match was found, the xt_entry_match's data area was initialized, and here it is cast to an xt_tcp pointer (the flexible array member extends the allocation dynamically).
parse_tcp_ports parses the port information: a numeric string is converted to a number; otherwise the string is treated as a service name and resolved to a port number via getservbyname (which consults /etc/services).
To summarize:
(1) cs->matches = xtables_malloc(sizeof(struct xtables_rule_match));
(2) cs->matches->match = tcp_match;  // cloned
(3) cs->matches->match->m->data is filled in (dport)
4> Next, the fourth option, -j DROP:
cs->options |= OPT_JUMP;
cs->jumpto = optarg (i.e. DROP) -> "standard"
cs->target = standard_target (cloned)  // nearly the same flow as the match lookup; a struct xtables_target
In libxt_standard.c:
static void standard_help(void)
{
	printf(
"standard match options:\n"
"(If target is DROP, ACCEPT, RETURN or nothing)\n");
}

static struct xtables_target standard_target = {
	.family        = NFPROTO_UNSPEC,
	.name          = "standard",
	.version       = XTABLES_VERSION,
	.size          = XT_ALIGN(sizeof(int)),
	.userspacesize = XT_ALIGN(sizeof(int)),
	.help          = standard_help,
};

void _init(void)
{
	xtables_register_target(&standard_target);
}
cs->target->t = xtables_calloc(1, size) allocates the space (a struct xt_entry_target), and cs->target->t is then filled in, e.g. strcpy(cs->target->t->u.user.name, cs->jumpto).
At last our few arguments are initially parsed; other arguments parse similarly. Continuing with the code: after some sanity checks, Shostnetworkmask and Dhostnetworkmask default to 0.0.0.0/0.
iptables also has a -d option to specify a destination address, e.g. iptables -A OUTPUT -d <host> -j DROP, in which case Dhostnetworkmask is assigned and parsed, mainly by resolving the hostname to all of its IP addresses with gethostbyname. Just a small digression.
At this point, take note:
if (!*handle)
	*handle = iptc_init(*table);

At system initialization handle is NULL, and *table is "filter".
iptc_init lives in libip4tc.c (#define TC_INIT iptc_init). It:
1. creates a socket: socket(TC_AF, SOCK_RAW, IPPROTO_RAW);
2. calls fcntl(sockfd, F_SETFD, FD_CLOEXEC);
3. calls getsockopt(sockfd, TC_IPPROTO, SO_GET_INFO, &info, &s).
The info structure:
/* The argument to IPT_SO_GET_INFO */
struct ipt_getinfo {
	/* Which table: caller fills this in. */
	char name[XT_TABLE_MAXNAMELEN];

	/* Kernel fills these in. */
	/* Which hook entry points are valid: bitmask */
	unsigned int valid_hooks;

	/* Hook entry points: one per netfilter hook. */
	unsigned int hook_entry[NF_INET_NUMHOOKS];

	/* Underflow points. */
	unsigned int underflow[NF_INET_NUMHOOKS];

	/* Number of entries */
	unsigned int num_entries;

	/* Size of entries. */
	unsigned int size;
};
It closely resembles the kernel's struct xt_table_info:
/* The table itself */
struct xt_table_info {
	/* Size per table */
	unsigned int size;
	/* Number of entries: FIXME. --RR */
	unsigned int number;
	/* Initial number of entries. Needed for module usage count */
	unsigned int initial_entries;

	/* Entry points and underflows */
	unsigned int hook_entry[NF_INET_NUMHOOKS];
	unsigned int underflow[NF_INET_NUMHOOKS];

	/*
	 * Number of user chains. Since tables cannot have loops, at most
	 * @stacksize jumps (number of user chains) can possibly be made.
	 */
	unsigned int stacksize;
	unsigned int __percpu *stackptr;
	void ***jumpstack;
	/* ipt_entry tables: one per CPU */
	/* Note : this field MUST be the last one, see XT_TABLE_INFO_SZ */
	void *entries[1];
};
The getinfo operation goes down to the kernel to fetch the information:
static int
do_ipt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
{
	int ret;

	if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
		return -EPERM;

	switch (cmd) {
	case IPT_SO_GET_INFO:
		ret = get_info(sock_net(sk), user, len, 0);
		break;
4. It creates and initializes the handle.
5. getsockopt(h->sockfd, TC_IPPROTO, SO_GET_ENTRIES, h->entries, &tmp) fetches the entries:
/* The argument to IPT_SO_GET_ENTRIES. */
struct ipt_get_entries {
	/* Which table: user fills this in. */
	char name[XT_TABLE_MAXNAMELEN];

	/* User fills this in: total entry size. */
	unsigned int size;

	/* The entries. */
	struct ipt_entry entrytable[0];
};
One might assume info and entries start out empty; in fact that is wrong. When the filter table is initialized, iptable_filter_net_init does the important work:
struct ipt_replace *repl;

repl = ipt_alloc_initial_table(&packet_filter);

void *ipt_alloc_initial_table(const struct xt_table *info)
{
	return xt_alloc_initial_table(ipt, IPT);
}
The interesting part is the macro it expands to:
#define xt_alloc_initial_table(type, typ2) ({ \
	unsigned int hook_mask = info->valid_hooks; \
	unsigned int nhooks = hweight32(hook_mask); \
	unsigned int bytes = 0, hooknum = 0, i = 0; \
	struct { \
		struct type##_replace repl; \
		struct type##_standard entries[nhooks]; \
		struct type##_error term; \
	} *tbl = kzalloc(sizeof(*tbl), GFP_KERNEL); \
	if (tbl == NULL) \
		return NULL; \
	strncpy(tbl->repl.name, info->name, sizeof(tbl->repl.name)); \
	tbl->term = (struct type##_error)typ2##_ERROR_INIT; \
	tbl->repl.valid_hooks = hook_mask; \
	tbl->repl.num_entries = nhooks + 1; \
	tbl->repl.size = nhooks * sizeof(struct type##_standard) + \
			 sizeof(struct type##_error); \
	for (; hook_mask != 0; hook_mask >>= 1, ++hooknum) { \
		if (!(hook_mask & 1)) \
			continue; \
		tbl->repl.hook_entry[hooknum] = bytes; \
		tbl->repl.underflow[hooknum] = bytes; \
		tbl->entries[i++] = (struct type##_standard) \
			typ2##_STANDARD_INIT(NF_ACCEPT); \
		bytes += sizeof(struct type##_standard); \
	} \
	tbl; \
})
Why does this matter? Remember iptcc_find_label? It searches handle->chains, and at TC_INIT time, after info and entries are fetched from the kernel, parse_table fills in the handle.
A few supplementary structures:
/* Standard entry. */
struct ipt_standard {
	struct ipt_entry entry;
	struct xt_standard_target target;
};

struct xt_standard_target {
	struct xt_entry_target target;
	int verdict;
};
Now let us see what the filter table initialization actually sets up:
#define IPT_STANDARD_INIT(__verdict) \
{ \
	.entry = IPT_ENTRY_INIT(sizeof(struct ipt_standard)), \
	.target = XT_TARGET_INIT(XT_STANDARD_TARGET, \
				 sizeof(struct xt_standard_target)), \
	.target.verdict = -(__verdict) - 1, \
}
and
#define XT_TARGET_INIT(__name, __size) \
{ \
	.target.u.user = { \
		.target_size = XT_ALIGN(__size), \
		.name = __name, \
	}, \
}
The entry mainly initializes target_offset to sizeof(struct ipt_entry), and next_offset to that plus the target; the target initializes its name to XT_STANDARD_TARGET and its target_size. In total three entries are initialized, one per filter hook.
At xt_register_table time, the struct xt_table_info *newinfo is assigned to table->private.
This is why userspace can fetch the kernel's entries right from the start; only then can the rest of the work proceed.
Back to the command handling in the main function:
next comes the entry-generation stage. This is the key part: it builds the entry that will go into the iptables table, and much of what follows depends on it:
else {
	e = generate_entry(&cs.fw, cs.matches, cs.target->t);
	free(cs.target->t);
}
It shows, in code, exactly what a rule consists of:
static struct ipt_entry *
generate_entry(const struct ipt_entry *fw,
	       struct xtables_rule_match *matches,
	       struct xt_entry_target *target)
{
	unsigned int size;
	struct xtables_rule_match *matchp;
	struct ipt_entry *e;

	size = sizeof(struct ipt_entry);
	for (matchp = matches; matchp; matchp = matchp->next)
		size += matchp->match->m->u.match_size;

	e = xtables_malloc(size + target->u.target_size);
	*e = *fw;
	e->target_offset = size;
	e->next_offset = size + target->u.target_size;

	size = 0;
	for (matchp = matches; matchp; matchp = matchp->next) {
		memcpy(e->elems + size, matchp->match->m,
		       matchp->match->m->u.match_size);
		size += matchp->match->m->u.match_size;
	}
	memcpy(e->elems + size, target, target->u.target_size);

	return e;
}
ipt_entry + xt_entry_match + ... + xt_entry_target (the ... means there may be more than one match, but only one target).
A new ipt_entry *e is allocated; ipt_entry->ip is filled in, and the entry matches and the entry target are copied in.
Next comes the handling of the command itself:
switch (command) {
case CMD_APPEND:
	ret = append_entry(chain, e,
			   nsaddrs, saddrs, smasks,
			   ndaddrs, daddrs, dmasks,
			   cs.options & OPT_VERBOSE,
			   *handle);
	break;
The parameters passed to append_entry need no explanation here; they were all covered above.
1. The ip fields are initialized. 2. iptc_append_entry, which is easy to understand: it appends the entry.
After the command is processed, the original buffers are freed.
Once do_command4 completes, the result naturally has to be committed to the kernel, via iptc_commit.
It works by pushing the configuration down with setsockopt; the details of setsockopt are beyond the scope of this article.