分类: LINUX
2006-11-30 20:59:28
具体来说,它必须拥有以下的功能:
(1)服务转发。能接受来自外部网中的多种基于TCP/IP的服务请求如FTP 、TELNET、 HTTP等,并且将它们转发到当前负载最轻的机器上执行。
(2)动态负载平衡。平衡器能够监视内部网中的实际服务器的负载状况并且找到负载最轻的机器。
(3)连接持续性。来自外部网的同一客户的所有请求必须转发到内部网中的同一台实际服务器上进行处理。
2.环境设置3.1 IP伪装技术
本集群系统主要采用了IP Masquerade(IP伪装)机制。该负载平衡系统采用了NAT(network address translation)机制。NAT机制主要用于内部私有网与外部网之间进行通讯。IP地址中的那些私有地址如10.0.0.0/255.0.0.0, 172.16.0.0/255.240.0.0 以及192.168.0.0/255.255.0.0等是无法直接与Internet上的机器通讯的,如果它们想与Internet上的机器通讯,需要采用网络地址翻译(Network Address Translation,NAT)机制。
NAT意味着将IP地址从一组映射到另外一组,如果这种映射关系是N-N的,则称之为静态网络地址翻译;如果映射是M-N(M>N)的,则叫做动态网络地址翻译;IP伪装机制实际上就是一种M-1的动态网络地址翻译,它能够将多个内部网中的IP地址映射到一个与Internet相连接的外部网IP地址上,这样这些无法直接与Internet上的机器通讯的具有内部网IP地址的机器就可以通过这台映射机器与外界进行通讯了。而网络地址端口翻译是对网络地址翻译的一种扩展,它将许多网络地址以及它们的TCP/UDP端口翻译为一个IP地址和TCP/UDP 端口。本集群系统采用的就是网络地址端口翻译机制。
3.2 IP伪装机制的建立过程
(1) 编译核心使之能够支持IP伪装:(编译核心时要注意以下选项的选择)
* Prompt for development and/or incomplete code/drivers
(CONFIG_EXPERIMENTAL) [Y/n/?]
- YES: though not required for IP MASQ, this option allows
the kernel to create the MASQ modules and enable the option
for port forwarding
-- Non-MASQ options skipped --
* Enable loadable module support (CONFIG_MODULES) [Y/n/?]
- YES: allows you to load kernel IP MASQ modules
-- Non-MASQ options skipped --
* Networking support (CONFIG_NET) [Y/n/?]
- YES: Enables the network subsystem
-- Non-MASQ options skipped --
* Sysctl support (CONFIG_SYSCTL) [Y/n/?]
- YES: Enables the ability to enable disable options such as forwarding,
dynamic IPs, LooseUDP, etc.
-- Non-MASQ options skipped --
* Packet socket (CONFIG_PACKET) [Y/m/n/?]
- YES: Though this is OPTIONAL, this recommended feature will allow you
to use TCPDUMP to debug any problems with IP MASQ
* Kernel/User netlink socket (CONFIG_NETLINK) [Y/n/?]
- YES: Though this is OPTIONAL, this feature will allow the logging of
advanced firewall issues such as routing messages, etc
* Routing messages (CONFIG_RTNETLINK) [Y/n/?]
- NO: This option does not have anything to do with packet firewall logging
-- Non-MASQ options skipped --
* Network firewalls (CONFIG_FIREWALL) [Y/n/?]
- YES: Enables the kernel to be comfigured by the IPCHAINS firewall tool
* Socket Filtering (CONFIG_FILTER) [Y/n/?]
- OPTIONAL: Though this doesn't have anything do with IPMASQ, if you plan
on implimenting a DHCP server on the internal network, you WILL need this
option.
* Unix domain sockets (CONFIG_UNIX) [Y/m/n/?]
- YES: This enables the UNIX TCP/IP sockets mechanisms
* TCP/IP networking (CONFIG_INET) [Y/n/?]
- YES: Enables the TCP/IP protocol
-- Non-MASQ options skipped --
* IP: advanced router (CONFIG_IP_ADVANCED_ROUTER) [Y/n/?]
- YES: This will allow you to configure advanced MASQ options farther down
* IP: policy routing (CONFIG_IP_MULTIPLE_TABLES) [N/y/?]
- NO: Not needed by MASQ though users who need advanced features
such as TCP/IP source address-based or TOS-enabled routing will
need to enable this option.
* IP: equal cost multipath (CONFIG_IP_ROUTE_MULTIPATH) [N/y/?]
- NO: Not needed for normal MASQ functionality
* IP: use TOS value as routing key (CONFIG_IP_ROUTE_TOS) [N/y/?]
- NO: Not needed for normal MASQ functionality
* IP: verbose route monitoring (CONFIG_IP_ROUTE_VERBOSE) [Y/n/?]
- YES: This is useful if you use the routing code to drop IP
spoofed packets (highly recommended) and you want to log them.
* IP: large routing tables (CONFIG_IP_ROUTE_LARGE_TABLES) [N/y/?]
- NO: Not needed for normal MASQ functionality
* IP: kernel level autoconfiguration (CONFIG_IP_PNP) [N/y/?] ?
- NO: Not needed for normal MASQ functionality
* IP: firewalling (CONFIG_IP_FIREWALL) [Y/n/?]
- YES: Enable the firewalling feature
* IP: firewall packet netlink device
(CONFIG_IP_FIREWALL_NETLINK) [Y/n/?]
- OPTIONAL: Though this is OPTIONAL, this feature will allow
IPCHAINS to copy some packets to UserSpace tools for additional
checks
* IP: transparent proxy support (CONFIG_IP_TRANSPARENT_PROXY) [N/y/?]
- NO: Not needed for normal MASQ functionality
* IP: masquerading (CONFIG_IP_MASQUERADE) [Y/n/?]
- YES: Enable IP Masquerade to re-address specific internal to
external TCP/IP packets
* IP: ICMP masquerading (CONFIG_IP_MASQUERADE_ICMP) [Y/n/?]
- YES: Enable support for masquerading ICMP ping packets
(ICMP error codes will be MASQed regardless). This is an
important feature for troubleshooting connections.
* IP: masquerading special modules support
(CONFIG_IP_MASQUERADE_MOD) [Y/n/?]
- YES: Though OPTIONAL, this enables the OPTION to later enable
the TCP/IP Port forwarding system to allow external computers to
directly connect to specified internal MASQed machines.
* IP: ipautofw masq support (EXPERIMENTAL)
(CONFIG_IP_MASQUERADE_IPAUTOFW) [N/y/m/?]
- NO: IPautofw is a legacy method of port forwarding. It is
mainly old code and has been found to have some issues. NOT
recommended.
* IP: ipportfw masq support (EXPERIMENTAL)
(CONFIG_IP_MASQUERADE_IPPORTFW) [Y/m/n/?]
- YES: Enables IPPORTFW which allows external computers on
the Internet to directly communicate to specified internal
MASQed machines. This feature is typically used to access
internal SMTP, TELNET, and WWW servers. FTP port forwarding
will need an additional patch as described in the FAQ section of
the MASQ HOWTO. Additional information on port forwarding is
available in the Forwards section of this HOWTO.
* IP: ip fwmark masq-forwarding support (EXPERIMENTAL)
(CONFIG_IP_MASQUERADE_MFW) [Y/m/n/?]
- OPTIONAL: This is a new method of doing PORTFW. With this option,
IPCHAINS can mark packets that should have additional work on.
Using a UserSpace tool, much like IPMASQADM or IPPORFW, IPCHAINS
would then automaticaly re-address the packets. Currently, this
code is less tested than PORTFW but it looks promising. For now,
the recommended method is to use IPMASQADM and IPPORTFW. If you
have thoughts on MFW, please email me.
* IP: optimize as router not host (CONFIG_IP_ROUTER) [Y/n/?]
- YES: This optimizes the kernel for the network subsystem though
it isn't known if it makes a siginificant performance difference.
* IP: tunneling (CONFIG_NET_IPIP) [N/y/m/?]
- NO: This OPTIONAL section is for IPIP tunnels through IP Masq.
If you need tunneling/VPN functionality, it is recommended to
use either GRE or IPSEC tunnels.
* IP: GRE tunnels over IP (CONFIG_NET_IPGRE) [N/y/m/?]
- NO: This OPTIONAL selection is to enable PPTP and
GRE tunnels through the IP MASQ box
-- Non-MASQ options skipped --
* IP: TCP syncookie support (not enabled per default)
(CONFIG_SYN_COOKIES) [Y/n/?]
- YES: HIGHLY recommended for basic TCP/IP network security
-- Non-MASQ options skipped --
* IP: Allow large windows (not recommended if <16Mb of memory) *
(CONFIG_SKB_LARGE) [Y/n/?]
- YES: This is recommended to optimize Linux's TCP window
-- Non-MASQ options skipped --
* Network device support (CONFIG_NETDEVICES) [Y/n/?]
- YES: Enables the Linux Network device sublayer
-- Non-MASQ options skipped --
* Dummy net driver support (CONFIG_DUMMY) [M/n/y/?]
- YES: Though OPTIONAL, this option can help when debugging problems
== Don't forget to compile in support for your network card !! ==
-- Non-MASQ options skipped --
== Don't forget to compile in support for PPP/SLIP if you use a modem or
use a PPPoE DSL modem ==
-- Non-MASQ options skipped --
* /proc filesystem support (CONFIG_PROC_FS) [Y/n/?]
- YES: Required to enable the Linux network forwarding system
(2)重新编译了核心以后,应该通过以下命令重新编译并安装IP伪装模块:
make modules; make modules_install
3.3 IP端口转发机制的建立过程
现在需要增加适当的转发机制,从而将数据报文转发到合适的机器上去。首先要注意的是IPFWADM已经不再是2.1.x 和2.2.x核心中控制IP伪装规则的工具,这些核心现在使用的工具是IPCHAINS。
(1)首先根据以下的规则创建/etc/rc.d/rc.firewall文件。
|
(2)在编辑完/etc/rc.d/rc.firewall文件后,运行 chmod 700 /etc/rc.d/rc.firewall命令使该文件变为可执行的。
(3)在 /etc/rc.d/rc.local文件中增加一行来在每次重启以后激活IP伪装模块。
|
在依次完成了3.2和3.3小节的操作以后,实际上已经建立起了一个基于Round-Robin调度算法的集群系统,如果用户想根据计算机性能的不同为之赋相应的权重,只需修改rc.firewall中的规则即可。例如,如果要把server2(10.1.1.2)的权重改为2,只须将原来的规则:
|
这样就可以了。
3.4 建立实现动态负载平衡的应用程序
该应用程序监视集群中的各个实际服务器的负载情况,并将用户的请求转发到负载最轻的实际服务器上。具体的实现请参考文章后半部分有关调度模块的实现方法
本集群系统实现了IP级的负载平衡。当客户向平衡器发送一个请求报文时,在平衡器的IP层对此请求报文的目标地址进行了替换工作,将目标地址替换为内部网中的实际服务器中的负载最轻的机器的IP地址。然后将此报文再次转发出去。当内部网中的实际服务器将请求处理完了以后,它将请求回应发向平衡器,平衡器再次在IP层将目标地址替换为发出请求的外部网中的客户的IP地址,然后将此报文再次转发到客户。
对目标地址进行替换的工作是在操作系统的核心中实现的,而选取负载最轻的机器的IP地址是在应用层实现的。之所以这样做是因为在应用层取负载数据可以提高系统的可扩展性,当需要向内部网中增加一台新的实际服务器时,只需要在应用程序的数组变量中增加一项就可以了;而且在应用层可以灵活地决定调度策略,可以采用静态的调度策略如Round Robin、Weighted Round Robin等,也可以采用动态的调度策略如Least Connection 、Weighted Least Connection等。对IP报文进行目标地址改写的工作主要在核心完成,这是因为这样速度很快,省掉了从用户到核心的通讯过程。
当外部网中的客户向负载平衡器发出一个服务请求(如www、ftp、telnet等) 时,从这个请求中可以获得外部网机器的IP地址和端口号(laddr ,lport),以及平衡器的IP地址,根据这些信息查询IP端口转发双向链表看是否有匹配(laddr, lport)的表项存在,如果存在的话,就取出该表项中的(raddr, rport)的值,即内部网中机器的IP地址和端口号,并且替换IP包的目标地址和端口号为(raddr, rport),再将此IP包重新发送到内部网中的对应机器上去。如果没有对应表项,则创建新的IP端口转发表项,以及对应的IP伪装表项。再进行目标地址替换和包重发的工作。
本负载平衡系统主要分为IP伪装模块、IP端口转发模块和调度模块,其中IP伪装模块和IP端口转发模块都是在IP层实现的,在Linux源代码所在目录下都可以找到它们对应的程序。而调度模块是在应用层实现的。
模块名 | 标识符 | 说明 |
ip伪装模块 | ip_masq | 对ip报头进行改写,对ip报文进行转发。 |
ip端口转发模块 | ip_portfw | 接收外界的请求,根据调度算法决定将ip报文转发到哪一台实际服务器上。 |
调度模块 | sched | 根据负载信息收集模块确定的负载最轻的机器的地址将用户请求转发到负载最轻的机器上。 |