raw socket - @sky - ChinaUnix Blog

raw socket

Category: LINUX

2011-06-27 15:35:19

$Id: rawipspoof.sgml,v 1.3 2007/03/30 19:30:32 murat Exp $

Many of the designations used by the manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this document, and the author was aware of the trademark claim, the designations have been followed by the ™ and ® symbols.

The information and source code in this article are provided for educational purposes only. The author cannot be held liable for the consequences of readers applying the given information and published source code.

This article aims to give readers a quick grasp of the raw sockets idea, its design internals, and its successful implementation, the BSD Raw Sockets API. IP spoofing will be discussed in detail with the help of three sample applications that spoof ICMP, UDP and TCP packets respectively. After reading this article, readers should clearly understand why they cannot trust even the TCP protocol for the security of their communication, and why they must employ cryptographic protocols to make sure that their communication is seen only by the authorized parties (Confidentiality), that it is not altered on the way (Integrity), and that communication channels are always available (Availability).

Recent versions of this article can be found at .

This is an enhanced version of the original article in Turkish, which can be found here at

To read and understand this paper, you need basic C knowledge and a moderate amount of prior knowledge of the TCP/IP protocol suite. Experience with gdb and gcc will make life much easier. It is strongly advised that you read W. Richard Stevens' book TCP/IP Illustrated Volume I.

The basis for network I/O in BSD UNIX centers on an abstraction known as the socket. Sun™ defines a socket as one endpoint of a two-way communication link between two programs running on the network. It is the generalization of the UNIX file access mechanism that provides an endpoint for communication [6]. For a userspace application, it is an interface to the underlying transport protocols that are handled by the kernel. Every socket is bound to a port, which designates the application that is to send/receive the data in the transport layer. When an application wants to open a connection to somewhere, a socket specific to that connection is created. Afterwards, data flow between the endpoints is carried over that socket. In BSD sockets terminology, this means that the userspace application is given a socket descriptor to access the kernel-level socket object.

If, having read this far, you feel somewhat uncomfortable with the basics, it is strongly advised that you read some more on the BSD socket API. A tutorial for Turkish speakers has been made available by Baris Simsek.

However, sockets as described above do not fit all our needs. Normal sockets lack some functionality, for example:

We cannot read/write ICMP or IGMP protocols with normal sockets. Our famous ping(8) tool cannot be written using them.
Some operating systems do not process IPv4 protocols other than ICMP, IGMP, TCP or UDP. What if we have a proprietary protocol that we want to handle? How do we send/receive data using that protocol?

Obviously we need a different mechanism to supplement normal sockets, one that enables us to bypass some of the protocol layering. This mechanism is called the "Raw Sockets Interface". With it, we can do much more than we normally can with standard TCP/UDP sockets. Raw sockets give privileged users direct access to the raw protocol, which in turn enables them to design and develop new proprietary protocols on top of the existing protocol layering, and to access modules they cannot reach with normal sockets.

It would be wise to go over the basic concepts first. Be advised, however, that this information alone will not suffice to grasp what's going on here. You will still need to read and completely understand the relevant sections in Stevens' book.

Like any other networking protocol suite, TCP/IP is layered. Every layer is responsible for one aspect of the communication. There are four distinct layers in the TCP/IP protocol suite:

----------------------------------------------------------
| 4. Application | telnet, ftp, dns etc.                 |
----------------------------------------------------------
| 3. Transport   | TCP  UDP                              |
----------------------------------------------------------
| 2. Network     | IP  ICMP  IGMP                        |
----------------------------------------------------------
| 1. Link        | device driver, network adapter        |
----------------------------------------------------------

When your application sends some data using a socket descriptor, the data is delivered to the transport (TCP/UDP) layer. The transport layer determines a local source port and, if a connection has to be established beforehand, opens the connection. A transport-layer header is created and prepended to the data, then the packet is delivered to the network layer (Layer II). The Layer II (e.g. IP) header is created and prepended to the packet, with the source and destination addresses written into the related fields of the IP header. The last station is the link layer, where the packet is injected onto the wire.

Likewise, an incoming packet is demultiplexed as it goes upwards through the stack, and is finally delivered to the userspace application.

The link layer is the very first of the TCP/IP layers. When a packet is received off the wire, the early processing is done here. Its duties include:

send/receive datagrams for the IP protocol
send/receive ARP requests and replies for the ARP protocol
send/receive RARP requests and replies for the RARP protocol

Depending on the hardware used, the header length for this layer varies. Some of the most widely used link layer protocols are Ethernet, PPP, ATM and SLIP.

Figure 1. IP header

In the network layer, IP protocol employs a minimum header size of 20 octets. Below are the IP header fields that are in common use (from /usr/include/netinet/ip.h):

struct ip {
    u_int   ip_hl:4,                /* header length */
            ip_v:4;                 /* ip version */
    u_char  ip_tos;                 /* type of service */
    u_short ip_len;                 /* total length */
    u_short ip_id;                  /* identification */
    u_short ip_off;                 /* fragment offset */
    u_char  ip_ttl;                 /* time to live */
    u_char  ip_p;                   /* protocol */
    u_short ip_sum;                 /* checksum */
    struct in_addr ip_src,ip_dst;   /* source and dest address */
};

Field                           Length           Example
------------------------------  ---------------  -------------------
Version                         4 bits           4
Header length                   4 bits           5
Type of Service                 8 bits           0
Total length of the whole       16 bits          45
  datagram
Identification                  16 bits          43211
Flags                           3 bits           0
Fragment Offset                 13 bits          0
Time to Live (a.k.a TTL)        8 bits           64
Layer III Protocol              8 bits           6 [TCP]
Checksum                        16 bits          0x3a43
Source IP address               32 bits          192.168.1.1
Destination IP address          32 bits          192.168.1.2
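
To make the field layout above concrete, here is a minimal sketch that decodes a few of these fields from a raw byte buffer. It deliberately works on plain bytes instead of struct ip, so it does not depend on the host's bit-field or byte order; the struct ipv4_info type and the parse_ipv4 function are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: decoded view of the fixed part of an IPv4 header. */
struct ipv4_info {
    unsigned version;      /* upper 4 bits of byte 0 */
    unsigned header_len;   /* header length in bytes (the 4-bit field times 4) */
    unsigned total_len;    /* bytes 2-3, stored in network byte order */
    unsigned ttl;          /* byte 8 */
    unsigned protocol;     /* byte 9 */
};

/* Returns 0 on success, -1 if buf cannot hold a valid IPv4 header. */
int parse_ipv4(const uint8_t *buf, size_t len, struct ipv4_info *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 20)
        return -1;
    out->version = buf[0] >> 4;
    out->header_len = (buf[0] & 0x0f) * 4u;
    out->total_len = ((unsigned)buf[2] << 8) | buf[3];
    out->ttl = buf[8];
    out->protocol = buf[9];
    if (out->version != 4 || out->header_len < 20 || out->header_len > len)
        return -1;
    return 0;
}
```

Fed the example values from the table (version 4, header length 5, total length 45, TTL 64, protocol 6), the function reports a 20-byte header carrying TCP.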

The transport layer is responsible for carrying data between applications. The TCP and UDP protocols live in this layer. UDP is a connectionless and unreliable protocol, whereas TCP is connection-oriented, reliable and thus much more complex. The protocol field in the IP header decides which protocol runs in this layer.

Figure 2. UDP header

Below are the protocol headers for the UDP and TCP protocols. UDP protocol fields from /usr/include/netinet/udp.h:

/*
 * Udp protocol header.
 * Per RFC 768, September, 1981.
 */
struct udphdr {
    u_short uh_sport;   /* source port */
    u_short uh_dport;   /* destination port */
    u_short uh_ulen;    /* udp length */
    u_short uh_sum;     /* udp checksum */
};

Field                           Length           Example
------------------------------  ---------------  -------------------
Source Port                     16 bits          12831
Destination Port                16 bits          53
UDP datagram length             16 bits          321
UDP checksum                    16 bits          0xeb8a

Figure 3. TCP header

TCP protocol fields from file /usr/include/netinet/tcp.h:

/*
 * TCP header.
 * Per RFC 793, September, 1981.
 */
struct tcphdr {
    u_short th_sport;   /* source port */
    u_short th_dport;   /* destination port */
    tcp_seq th_seq;     /* sequence number */
    tcp_seq th_ack;     /* acknowledgement number */
    u_int   th_x2:4,    /* (unused) */
            th_off:4;   /* data offset */
    u_char  th_flags;
    u_short th_win;     /* window */
    u_short th_sum;     /* checksum */
    u_short th_urp;     /* urgent pointer */
};

Field                           Length           Example
------------------------------  ---------------  -------------------
Source Port                     16 bits          12783
Destination Port                16 bits          80
Sequence Number                 32 bits          0xbfcdab00
Ack Number                      32 bits          0xaeb10908
x2                              4 bits
Data Offset                     4 bits           5 (32-bit words, i.e. 20 bytes)
Flags                           8 bits           TH_ACK, TH_PUSH
Window size                     16 bits          8192
TCP Checksum                    16 bits          0xebcd
Urgent Pointer                  16 bits          0x0
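
The data offset and flags fields are the two that most often cause confusion, so here is a small sketch that decodes them from the raw header bytes. The function names tcp_header_len and tcp_flags_str and the F_* constants are made up for this example; the bit values are the ones RFC 793 assigns (and match the BSD TH_* constants).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as laid out in byte 13 of the TCP header (RFC 793). */
enum { F_FIN = 0x01, F_SYN = 0x02, F_RST = 0x04,
       F_PSH = 0x08, F_ACK = 0x10, F_URG = 0x20 };

/* Header length in bytes: the 4-bit data offset counts 32-bit words. */
unsigned tcp_header_len(const uint8_t *tcp)
{
    assert(tcp != NULL);
    return (unsigned)(tcp[12] >> 4) * 4u;
}

/* Writes a short flag string such as "SA" into out (at least 7 bytes). */
void tcp_flags_str(const uint8_t *tcp, char *out)
{
    static const char names[] = "FSRPAU";   /* bit 0 .. bit 5 */
    size_t n = 0;

    assert(tcp != NULL && out != NULL);
    for (int bit = 0; bit < 6; bit++)
        if (tcp[13] & (1u << bit))
            out[n++] = names[bit];
    out[n] = '\0';
}
```

With the example row above (TH_ACK and TH_PUSH set, data offset 5), tcp_flags_str yields "PA" and tcp_header_len yields 20.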

The application layer is where the application data is processed (sent/received) in the first place. Protocols like TELNET, SSH, FTP and HTTP all live in the application layer. With normal sockets, the user controls only the application layer data; with raw sockets, it is possible to control the protocol layers from the network layer up to the application layer.

Just like normal sockets, we create raw sockets with the socket(2) system call:

int socket(int domain, int type, int protocol)

However, type and protocol parameters are set to SOCK_RAW and protocol name accordingly:

if ((sd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)) < 0) { .. }

We can start crafting ICMP packets now. This way, however, the Layer II (IP) header will be constructed by the kernel IP code. To tell the kernel that our packet already includes the Layer II header, we set the IP_HDRINCL option for our socket via the setsockopt(2) system call:

const int on = 1;
if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) { .. }

On some occasions, you can even call bind(2) and connect(2) on a raw socket, though you probably won't want to, because it will cost you flexibility.

After creating your socket and building your datagram, you can inject it via the sendto(2) and sendmsg(2) system calls. A raw socket is datagram oriented, so each send call requires a destination address.
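
The setup steps described so far can be collected into two small helpers. This is only a sketch: open_hdrincl_socket and fill_dest are hypothetical names, the socket call normally needs root privileges, and inet_pton(3) is used here instead of the inet_addr(3) call seen later in the article because it reports parse errors unambiguously.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Opens a raw socket with IP_HDRINCL set; returns the descriptor or -1.
 * Expect this to fail with EPERM when not running as root. */
int open_hdrincl_socket(void)
{
    const int on = 1;
    int sd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);

    if (sd < 0)
        return -1;
    if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Fills in the destination that sendto(2) requires; returns 0 on success,
 * -1 if dotted_quad is not a valid IPv4 address. */
int fill_dest(struct sockaddr_in *sin, const char *dotted_quad)
{
    assert(sin != NULL && dotted_quad != NULL);
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    return inet_pton(AF_INET, dotted_quad, &sin->sin_addr) == 1 ? 0 : -1;
}
```

A caller would then pass the filled structure to sendto(2) as (struct sockaddr *)&sin together with sizeof(sin).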

If you're not using raw sockets on a "send-and-forget" basis, you will be interested in reading the reply packet(s) for your raw packet(s). The decision logic for whether a packet will be delivered to a raw socket can be enumerated as follows:

TCP and UDP packets are never delivered to raw sockets; they are always handled by the kernel protocol stack.
Copies of ICMP packets are delivered to a matching raw socket. For some ICMP types (echo request, timestamp request, mask request) the kernel may also do some processing of its own and generate replies.
All IGMP packets are delivered to raw sockets.
All other packets destined for protocols that are not processed by a kernel subsystem (e.g. OSPF packets) are delivered to raw sockets.

The fact that you're dealing with a protocol for which reply packets are delivered to your raw socket does not necessarily mean that you'll get the reply packet. For this you may also need to consider:

setting the protocol accordingly while creating your socket via the socket(2) system call. For instance, if you're sending an ICMP echo-request packet and want to receive the ICMP echo-reply, you can set the protocol argument (3rd argument) to IPPROTO_ICMP.
setting the protocol argument in socket(2) to 0, so any protocol number in the received packet header will match.
defining a local address for your socket (e.g. via bind(2)), so that packets whose destination address matches the socket's local address are also delivered to your application.
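
Once a datagram does arrive on an IPv4 raw socket, the buffer returned by recvfrom(2) starts with the IP header, so the application has to skip it before looking at the ICMP part. The following sketch shows that step on a plain byte buffer; is_echo_reply is a hypothetical helper, and it assumes the caller wants echo replies carrying a known identifier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf holds an IPv4 datagram carrying an ICMP echo reply
 * (type 0, code 0) whose identifier equals want_id, 0 otherwise. */
int is_echo_reply(const uint8_t *buf, size_t len, uint16_t want_id)
{
    assert(buf != NULL);
    if (len < 20 || (buf[0] >> 4) != 4)
        return 0;
    size_t ihl = (size_t)(buf[0] & 0x0f) * 4;   /* IP header length in bytes */
    if (ihl < 20 || len < ihl + 8)               /* need the 8-byte ICMP header */
        return 0;
    if (buf[9] != 1)                             /* protocol 1 = ICMP */
        return 0;
    const uint8_t *icmp = buf + ihl;
    uint16_t id = (uint16_t)((icmp[4] << 8) | icmp[5]);
    return icmp[0] == 0 && icmp[1] == 0 && id == want_id;
}
```

Because other processes' ICMP traffic is copied to the same raw socket, a check like the identifier comparison is what lets an application pick out its own replies.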

So, what about TCP and UDP? To read the reply packets for your TCP/UDP raw socket, the only viable option is to use a packet filter. Use BPF (or an equivalent, e.g. SOCK_PACKET for Linux, NIT for Sun, DLPI for Solaris®, HP-UX® and SCO Openserver®, SNOOP for IRIX®) to get all the packets off the wire.

From this point on, we're going to construct our raw packet bit by bit, and call sendto(2) or sendmsg(2) to inject the packet into the network. We'll exemplify ICMP, UDP and TCP raw packet injection in the following sections.

You can get the source code for the sample applications from . Also, you'll need to use tcpdump(1) tool to see the injected packets live on the wire.

Just before we start coding, it is important that we talk a little bit about byte ordering.

Packets travelling in the network are all big-endian, meaning the most significant byte of a multi-byte field is transferred first. If the host system uses a different byte-ordering scheme (e.g. the i386® architecture is little-endian), data needs to be converted: on the receive path from network byte order (big-endian) to host byte order, and on the send path from host byte order to network byte order.

htons(3) and htonl(3) convert 16-bit and 32-bit quantities, respectively, from host byte order into network byte order. To convert a 16-bit quantity such as a port number into network byte order, one uses the htons(3) macro.

Similarly, the ntohs(3) and ntohl(3) macros convert 16-bit and 32-bit quantities from network byte order into host byte order. To convert the acknowledgement number field of a received TCP header, one uses the ntohl(3) macro.

On machines whose byte order is the same as the network order, these routines are defined as null macros.
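
As a quick illustration of these conversions, the sketch below writes a port number into a header buffer and reads a 32-bit field back out. The helper names put_port and get_u32 are invented for this example; memcpy(3) is used so the code makes no assumption about the alignment of the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Stores a 16-bit port at dst in network byte order, as it would appear
 * in a UDP or TCP header. */
void put_port(uint8_t *dst, uint16_t port)
{
    uint16_t net = htons(port);

    assert(dst != NULL);
    memcpy(dst, &net, sizeof(net));
}

/* Reads a 32-bit field (e.g. a TCP acknowledgement number) off the wire
 * and returns it in host byte order. */
uint32_t get_u32(const uint8_t *src)
{
    uint32_t net;

    assert(src != NULL);
    memcpy(&net, src, sizeof(net));
    return ntohl(net);
}
```

Whatever the host architecture, port 53 ends up on the wire as the bytes 0x00 0x35, and the four bytes ae b1 09 08 are read back as 0xaeb10908.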

Here, we're going to create our ICMP packet (type echo-request) and hand it over to the raw sockets API to deliver it to the network. We are going to explain the source code, while we are going over the concepts.

Here are the header files to be included for the ICMP raw socket application:

#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include

Internet checksum function (from BSD Tahoe). We can use this function to calculate checksums for all layers. The ICMP protocol mandates a checksum, so we have to calculate it.

unsigned short in_cksum(unsigned short *addr, int len)
{
    int nleft = len;
    int sum = 0;
    unsigned short *w = addr;
    unsigned short answer = 0;

    /* add up the data as 16-bit words */
    while (nleft > 1) {
        sum += *w++;
        nleft -= 2;
    }

    /* mop up an odd trailing byte, if any */
    if (nleft == 1) {
        *(unsigned char *) (&answer) = *(unsigned char *) w;
        sum += answer;
    }

    /* fold the carries into the low 16 bits and complement */
    sum = (sum >> 16) + (sum & 0xFFFF);
    sum += (sum >> 16);
    answer = ~sum;
    return (answer);
}
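
To see the checksum arithmetic at work, here is a byte-oriented variant together with a worked example. It is a sketch, not the article's in_cksum: inet_checksum is a hypothetical name, it sums big-endian 16-bit words explicitly, and it returns the result as an ordinary number that the caller must store high byte first. The sample header used in the note below is an arbitrary illustration and does not come from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over a byte buffer, summing big-endian 16-bit words.
 * Returns the checksum as a host-order number. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                 /* odd trailing byte is padded with zero */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)             /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7, the words add up to 0x2479c, which folds to 0x479e, and the complement is 0xb861. Storing b8 61 in the checksum field and running the function over the header again gives 0, which is exactly the check a receiver performs.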

Our main:

int main(int argc, char **argv)
{
    struct ip ip;
    struct udphdr udp;
    struct icmp icmp;
    int sd;
    const int on = 1;
    struct sockaddr_in sin;
    u_char *packet;

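Grab some zeroed space for our packet. Only the first 28 bytes are filled in below, so the remaining payload bytes should be zero rather than whatever malloc(3) happens to return:

packet = (u_char *)calloc(1, 60);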

Fill in the Layer II (IP protocol) fields. The header length (including options) is given in units of 32 bits (4 bytes). Assuming we will not send any IP options, the IP header length is 20 bytes, so we store 20 / 4 = 5 here:

ip.ip_hl = 0x5;

Protocol version is 4, meaning IPv4:

ip.ip_v = 0x4;

Type of Service. Packet precedence:

ip.ip_tos = 0x0;

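Total length of our packet. As said earlier in the discussion of byte ordering, multi-byte fields (fields bigger than 8 bits) need to be converted to network byte order. Be aware that this particular field is platform dependent: some BSD-derived stacks (e.g. FreeBSD before 11.0) expect ip_len and ip_off in host byte order on IP_HDRINCL sockets, while Linux fills in the total length itself: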

ip.ip_len = htons(60);

ID field uniquely identifies each datagram sent by this host:

ip.ip_id = htons(12830);

Fragment offset for our packet. We set this to 0x0 since we don't desire any fragmentation:

ip.ip_off = 0x0;

Time to live. The maximum number of hops that the packet can pass while travelling towards its destination:

ip.ip_ttl = 64;

Upper layer (Layer III) protocol number:

ip.ip_p = IPPROTO_ICMP;

We set the checksum value to zero before passing the packet into the checksum function. Note that this checksum is calculated over the IP header only. Upper layer protocols have their own checksum fields, which must be calculated separately.

ip.ip_sum = 0x0;

Source IP address. This may be any IP address, whether or not it is assigned to one of our interfaces:

ip.ip_src.s_addr = inet_addr("172.17.14.174");

Destination IP address:

ip.ip_dst.s_addr = inet_addr("172.17.14.169");

We pass the IP header and its length into the internet checksum function. The function returns a 16-bit checksum value for the header:

ip.ip_sum = in_cksum((unsigned short *)&ip, sizeof(ip));

We're finished preparing our IP header. Let's copy it into the very beginning of our packet:

memcpy(packet, &ip, sizeof(ip));

As for the Layer III (ICMP) data, the ICMP type:

icmp.icmp_type = ICMP_ECHO;

Code is 0 for an echo request:

icmp.icmp_code = 0;

ID, an arbitrary number:

icmp.icmp_id = 1000;

ICMP sequence number:

icmp.icmp_seq = 0;

Just like with the IP header, we set the ICMP checksum field to zero and pass the ICMP packet into the checksum function. We store the returned value in the checksum field of the ICMP header:

icmp.icmp_cksum = 0;
icmp.icmp_cksum = in_cksum((unsigned short *)&icmp, 8);

We append the ICMP header to the packet at offset 20:

memcpy(packet + 20, &icmp, 8);

We have crafted our packet byte by byte. It's time to inject it into the network. First we create our raw socket:

if ((sd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
    perror("raw socket");
    exit(1);
}

We tell the kernel that we've also prepared the IP header, so there's nothing left for the IP stack to do about it:

if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
    perror("setsockopt");
    exit(1);
}

Still, the kernel is going to prepare the Layer I data for us. For that, we need to specify a destination so that the kernel can decide where to send the raw datagram. We fill in a struct sockaddr_in with the desired destination IP address, and pass this structure to the sendto(2) or sendmsg(2) system call:

memset(&sin, 0, sizeof(sin));
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = ip.ip_dst.s_addr;

As for writing the packet: we cannot use the send(2) system call for this, since the socket is not a "connected" type of socket. As stated in the paragraph above, we need to tell the kernel where to send the raw IP datagram. The sendto(2) and sendmsg(2) system calls are designed to handle this:

    if (sendto(sd, packet, 60, 0, (struct sockaddr *)&sin, sizeof(struct sockaddr)) < 0) {
        perror("sendto");
        exit(1);
    }

    return 0;
}

That's it. Let's compile and run our application. You can see the injected packet via tcpdump(1) on another terminal:

# tcpdump icmp host 172.16.17.160

And...

# make icmp
# ./icmp

In the tcpdump(1) output, you can see that an ICMP echo request is sent and an ICMP echo reply is received. You have to make sure, however, that the reply packets can find their way back to you. If you set the source IP address to one that is not assigned to any of your box's interfaces, i.e. a spoofed one, consider inserting published ARP entries into your ARP cache. tcpdump(1) output:

# tcpdump icmp
tcpdump: listening on fxp0
11:50:43.789297 172.17.14.174 > 172.17.14.169: icmp: echo request
11:50:43.789439 172.17.14.169 > 172.17.14.174: icmp: echo reply
#
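
The header-building steps of the ICMP example can also be condensed into one function that works on a byte buffer. This is a sketch under a few assumptions rather than the article's program: build_echo_request, cksum and put16 are hypothetical names, the packet is the minimal 28 bytes (no payload) instead of 60, and every multi-byte field is written in network byte order, so the platform-specific ip_len handling of some BSD stacks is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum over big-endian 16-bit words (host-order result). */
static uint16_t cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (; len > 1; p += 2, len -= 2)
        sum += ((uint32_t)p[0] << 8) | p[1];
    if (len == 1)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Stores a 16-bit value high byte first (network byte order). */
static void put16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Writes a 28-byte IPv4 + ICMP echo request into pkt. src and dst are
 * 4-byte addresses as they appear on the wire. Returns the packet length. */
size_t build_echo_request(uint8_t pkt[28], const uint8_t src[4],
                          const uint8_t dst[4], uint16_t id, uint16_t seq)
{
    assert(pkt != NULL && src != NULL && dst != NULL);
    memset(pkt, 0, 28);
    pkt[0] = 0x45;                          /* version 4, header length 5 words */
    put16(pkt + 2, 28);                     /* total length */
    put16(pkt + 4, 12830);                  /* identification, as in the article */
    pkt[8] = 64;                            /* time to live */
    pkt[9] = 1;                             /* protocol: ICMP */
    memcpy(pkt + 12, src, 4);
    memcpy(pkt + 16, dst, 4);
    put16(pkt + 10, cksum(pkt, 20));        /* IP header checksum */

    pkt[20] = 8;                            /* ICMP type: echo request */
    pkt[21] = 0;                            /* code 0 */
    put16(pkt + 24, id);
    put16(pkt + 26, seq);
    put16(pkt + 22, cksum(pkt + 20, 8));    /* ICMP checksum */
    return 28;
}
```

With identifier 1000 and sequence number 0, the ICMP words sum to 0x0800 + 0x03e8 = 0x0be8, so the checksum field ends up as f4 17.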

Now a slightly different example. This time, we create a raw UDP packet. UDP checksums are not mandatory; however, most modern operating system IP stacks calculate and set the UDP checksum. If your IP stack sends UDP datagrams without the UDP checksum, the replies to your packets will also lack it, and vice versa. In our example, we're going to calculate the UDP checksum.

We need to use a pseudo header for the calculation of TCP or UDP checksums. This pseudo header contains the source and destination IP addresses, which are normally found in the IP header, a zero pad byte, the protocol number, and the datagram length.

Let's get going, header files to be included:

#include #include #include #include #include #include #include #include #include #include #include #include #include

Structure definition for the pseudo header we're going to use to calculate the UDP checksum. It encapsulates the standard UDP header:

struct psd_udp {
    struct in_addr src;
    struct in_addr dst;
    unsigned char pad;
    unsigned char proto;
    unsigned short udp_len;
    struct udphdr udp;
};

Checksum function:

in_cksum_udp takes the UDP header, its length, and the source and destination IP addresses, puts them into the pseudo header, and feeds it to the internet checksum function:

unsigned short in_cksum_udp(int src, int dst, unsigned short *addr, int len)
{
    struct psd_udp buf;

    memset(&buf, 0, sizeof(buf));
    buf.src.s_addr = src;
    buf.dst.s_addr = dst;
    buf.pad = 0;
    buf.proto = IPPROTO_UDP;
    buf.udp_len = htons(len);
    memcpy(&(buf.udp), addr, len);
    return in_cksum((unsigned short *)&buf, 12 + len);
}
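
For comparison, the same pseudo-header computation can be sketched on plain byte buffers, which makes the 12-byte pseudo header layout explicit. The names udp_checksum and cksum are hypothetical, the 512-byte limit is an arbitrary choice for this illustration, and the final substitution of 0xffff for a computed zero follows RFC 768, where an all-zero checksum field means "no checksum".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum over big-endian 16-bit words (host-order result). */
static uint16_t cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (; len > 1; p += 2, len -= 2)
        sum += ((uint32_t)p[0] << 8) | p[1];
    if (len == 1)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Computes the UDP checksum of a UDP header plus payload (udp_len bytes,
 * checksum field zeroed). src and dst are 4-byte addresses as on the wire.
 * The scratch buffer limits udp_len to 512 bytes in this sketch. */
uint16_t udp_checksum(const uint8_t src[4], const uint8_t dst[4],
                      const uint8_t *udp, size_t udp_len)
{
    uint8_t buf[12 + 512];

    assert(src != NULL && dst != NULL && udp != NULL);
    assert(udp_len >= 8 && udp_len <= 512);
    memcpy(buf, src, 4);                 /* pseudo header: source address */
    memcpy(buf + 4, dst, 4);             /* destination address */
    buf[8] = 0;                          /* zero pad byte */
    buf[9] = 17;                         /* protocol: UDP */
    buf[10] = (uint8_t)(udp_len >> 8);   /* UDP length, network byte order */
    buf[11] = (uint8_t)udp_len;
    memcpy(buf + 12, udp, udp_len);

    uint16_t sum = cksum(buf, 12 + udp_len);
    return sum == 0 ? 0xffff : sum;
}
```

For the article's addresses (172.17.14.174 to 172.17.14.169), source port 45512, destination port 53 and length 8, the words add up to 0x22797, which folds to 0x2799 and gives a checksum of 0xd866.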

Our main:

int main(int argc, char **argv)
{
    struct ip ip;
    struct udphdr udp;
    int sd;
    const int on = 1;
    struct sockaddr_in sin;
    u_char *packet;

Grab some space for our packet:

packet = (u_char *)malloc(60);

Just like in the ICMP example, we fill in the IP header fields:

ip.ip_hl = 0x5;
ip.ip_v = 0x4;
ip.ip_tos = 0x0;
ip.ip_len = 60;
ip.ip_id = htons(12830);
ip.ip_off = 0x0;
ip.ip_ttl = 64;
ip.ip_p = IPPROTO_UDP;
ip.ip_sum = 0x0;
ip.ip_src.s_addr = inet_addr("172.17.14.174");
ip.ip_dst.s_addr = inet_addr("172.17.14.169");
ip.ip_sum = in_cksum((unsigned short *)&ip, sizeof(ip));
memcpy(packet, &ip, sizeof(ip));

We prepare our UDP header, calculate its checksum, and append it to the packet just after the IP header. Source UDP port:

udp.uh_sport = htons(45512);

Destination UDP port:

udp.uh_dport = htons(53);

Length of the UDP datagram (UDP header length + UDP data):

udp.uh_ulen = htons(8);

Set the checksum field to zero, feed the header to the checksum function, and save the returned value in the checksum field:

udp.uh_sum = 0;
udp.uh_sum = in_cksum_udp(ip.ip_src.s_addr, ip.ip_dst.s_addr, (unsigned short *)&udp, sizeof(udp));
memcpy(packet + 20, &udp, sizeof(udp));

The rest is the same as in the ICMP sample:

    if ((sd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
        perror("raw socket");
        exit(1);
    }

    if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
        perror("setsockopt");
        exit(1);
    }

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = ip.ip_dst.s_addr;

    if (sendto(sd, packet, 60, 0, (struct sockaddr *)&sin, sizeof(struct sockaddr)) < 0) {
        perror("sendto");
        exit(1);
    }

    return 0;
}

Having looked at the ICMP and UDP examples, I assume that you've figured out the notion and the underpinnings of raw sockets. The TCP protocol, by contrast, is a connection-oriented protocol, which means there has to be a connection establishment procedure before any user data can be exchanged between the endpoints. This makes spoofing TCP connections somewhat harder: one has to keep connection state similar to the state that exists in the kernel TCP stack.

Compared to ICMP and UDP, TCP might seem harder to spoof, but it is still fairly trivial. If you're going to write a SYN flooder, that is easy anyway, since it does not involve a TCP connection. Here we want to deal with a somewhat more sophisticated TCP raw socket example: we are going to create a spoofed TCP connection, which in fact does not exist in the real world.

TCP connection spoofing involves (but is not limited to) two phases:

Spoofing the three-way connection establishment procedure
Keeping track of sequence numbers and the advertised window size while sending/receiving user data

As said earlier, connection establishment in the TCP protocol involves a three-way handshake between the endpoints:

Client decides on an Initial Sequence Number (ISN), sends it within a TCP packet with the SYN flag set. Client enters SYN_SENT state.
On receiving the SYN packet, server enters SYN_RECEIVED state. Server also decides its own ISN. Server then increases client's ISN by one, and puts that value in the acknowledgment number field in the tcp header. Server sets SYN and ACK bits in the tcp flags field and sends back the reply.
Client receives the server's reply, increases server's ISN by one, puts that in acknowledgement number field; and sends back a tcp packet with the ACK flag set. After these steps, both parties enter the ESTABLISHED state.
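
The sequence number bookkeeping of these three steps is small enough to write down directly. The sketch below computes the seq/ack fields of the three handshake segments from the two initial sequence numbers; struct handshake and plan_handshake are names invented for this illustration, and the arithmetic relies on unsigned 32-bit wraparound, which matches how TCP sequence numbers wrap modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of the seq/ack fields of the three handshake segments. */
struct handshake {
    uint32_t syn_seq;       /* client -> server, SYN */
    uint32_t synack_seq;    /* server -> client, SYN + ACK */
    uint32_t synack_ack;
    uint32_t ack_seq;       /* client -> server, ACK */
    uint32_t ack_ack;
};

struct handshake plan_handshake(uint32_t client_isn, uint32_t server_isn)
{
    struct handshake h;

    h.syn_seq = client_isn;
    h.synack_seq = server_isn;
    h.synack_ack = client_isn + 1;   /* the SYN consumes one sequence number */
    h.ack_seq = client_isn + 1;
    h.ack_ack = server_isn + 1;      /* so does the server's SYN */

    /* the client's next sequence number is what the server acknowledged */
    assert(h.ack_seq == h.synack_ack);
    return h;
}
```

With a client ISN of 0x131123 (1249571) and a server ISN of 3431236214, the SYN + ACK acknowledges 1249572 and the final ACK acknowledges 3431236215.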

Let us try to write some code to simulate those steps in a simplified way. You can get the source code of the tcp spoofer and the source code for the server application inside the tcp folder in the raws-sample.tar.gz tarball.

I'll go over the code and try to explain it here. The application has two threads: the first one sends the initial SYN packet, the second one opens a BPF device and listens for all TCP packets on it. When it receives a SYN + ACK packet, which is the response to our SYN packet, it increases the server's ISN by one and ACKs it by sending the corresponding ACK packet. This creates a full-featured TCP connection in the ESTABLISHED state on the server computer. You can start sending and receiving TCP packets although that connection does not actually exist on your local system.

Note that the destination IP address we use in the TCP spoof example is not in our network. The source IP address is in our network, though it does not belong to any of the interfaces of our host. Reply packets will reach our router, and the router will ask for the MAC address of the spoofed IP address by broadcasting ARP request packets. It will not get an ARP reply back, though, and thus will have no MAC address to forward the packet to. There are two workarounds for this problem. We can:

send spoofed ARP responses
insert an ARP entry into the ARP table of the local system

For simplicity's sake, we'll choose the second option. We'll set the MAC address of our ethernet interface as the MAC address of the spoofed IP:

x-wing# arp -s 172.17.14.90 00:0c:76:0f:9b:5a permanent pub
x-wing# arp -an
? (172.17.14.1) at 00:05:5e:07:dc:c2 on fxp0 [ethernet]
? (172.17.14.90) at 00:0c:76:0f:9b:5a on fxp0 permanent published [ethernet]
? (172.17.14.160) at 00:10:5a:af:3e:b2 on fxp0 [ethernet]
x-wing#

We've made sure that our computer will answer ARP queries for our spoofed IP (172.17.14.90). Now, we'll go over the tcp example line by line:

#include #include #include

TCP header file:

#include <netinet/tcp.h>

int sd;

Just like with UDP, this is the TCP pseudo-header we are going to use to calculate the checksum.

struct psd_tcp {
    struct in_addr src;
    struct in_addr dst;
    unsigned char pad;
    unsigned char proto;
    unsigned short tcp_len;
    struct tcphdr tcp;
};

TCP checksum function:

unsigned short in_cksum_tcp(int src, int dst, unsigned short *addr, int len)
{
    struct psd_tcp buf;
    u_short ans;

    memset(&buf, 0, sizeof(buf));
    buf.src.s_addr = src;
    buf.dst.s_addr = dst;
    buf.pad = 0;
    buf.proto = IPPROTO_TCP;
    buf.tcp_len = htons(len);
    memcpy(&(buf.tcp), addr, len);
    ans = in_cksum((unsigned short *)&buf, 12 + len);
    return (ans);
}

The following sends the first SYN packet:

void send_syn()
{
    struct ip ip;
    struct tcphdr tcp;
    const int on = 1;
    struct sockaddr_in sin;
    u_char *packet;

    packet = (u_char *)malloc(60);

    /* IP header */
    ip.ip_hl = 0x5;
    ip.ip_v = 0x4;
    ip.ip_tos = 0x0;
    ip.ip_len = sizeof(struct ip) + sizeof(struct tcphdr);
    ip.ip_id = htons(12830);
    ip.ip_off = 0x0;
    ip.ip_ttl = 64;
    ip.ip_p = IPPROTO_TCP;
    ip.ip_sum = 0x0;
    ip.ip_src.s_addr = inet_addr("172.17.14.90");
    ip.ip_dst.s_addr = inet_addr("172.16.1.204");
    ip.ip_sum = in_cksum((unsigned short *)&ip, sizeof(ip));
    memcpy(packet, &ip, sizeof(ip));

    /* TCP header with the SYN flag set and our ISN */
    tcp.th_sport = htons(3333);
    tcp.th_dport = htons(33334);
    tcp.th_seq = htonl(0x131123);
    tcp.th_off = sizeof(struct tcphdr) / 4;
    tcp.th_flags = TH_SYN;
    tcp.th_win = htons(32768);
    tcp.th_sum = 0;
    tcp.th_sum = in_cksum_tcp(ip.ip_src.s_addr, ip.ip_dst.s_addr, (unsigned short *)&tcp, sizeof(tcp));
    memcpy((packet + sizeof(ip)), &tcp, sizeof(tcp));

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = ip.ip_dst.s_addr;

    if (sendto(sd, packet, 60, 0, (struct sockaddr *)&sin, sizeof(struct sockaddr)) < 0) {
        perror("sendto");
        exit(1);
    }
}

This routine sends the ACK packet in response to the server's SYN+ACK. It takes the server's Initial Sequence Number from the th_seq field, increases it by one, and puts it in the Acknowledgement Number field of the response packet (the th_ack field). We increase the ISN by one because a SYN flag consumes one sequence number.

void send_syn_ack(int s_seq)
{
    struct ip ip;
    struct tcphdr tcp;
    const int on = 1;
    struct sockaddr_in sin;
    u_char *packet;

    packet = (u_char *)malloc(60);

    /* IP header */
    ip.ip_hl = 0x5;
    ip.ip_v = 0x4;
    ip.ip_tos = 0x0;
    ip.ip_len = sizeof(struct ip) + sizeof(struct tcphdr);
    ip.ip_id = htons(12831);
    ip.ip_off = 0x0;
    ip.ip_ttl = 64;
    ip.ip_p = IPPROTO_TCP;
    ip.ip_sum = 0x0;
    ip.ip_src.s_addr = inet_addr("172.17.14.90");
    ip.ip_dst.s_addr = inet_addr("172.16.1.204");
    ip.ip_sum = in_cksum((unsigned short *)&ip, sizeof(ip));
    memcpy(packet, &ip, sizeof(ip));

    /* TCP header with the ACK flag set */
    tcp.th_sport = htons(3333);
    tcp.th_dport = htons(33334);
    tcp.th_seq = htonl(0x131123 + 1);
    /* acknowledge the server's sequence number, incremented by one */
    tcp.th_ack = htonl(s_seq + 1);
    tcp.th_off = sizeof(struct tcphdr) / 4;
    tcp.th_flags = TH_ACK;
    tcp.th_win = htons(32768);
    tcp.th_sum = 0;
    tcp.th_sum = in_cksum_tcp(ip.ip_src.s_addr, ip.ip_dst.s_addr, (unsigned short *)&tcp, sizeof(tcp));
    memcpy((packet + sizeof(ip)), &tcp, sizeof(tcp));

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = ip.ip_dst.s_addr;

    if (sendto(sd, packet, 60, 0, (struct sockaddr *)&sin, sizeof(struct sockaddr)) < 0) {
        perror("sendto");
        exit(1);
    }
}

The run function creates the raw socket and calls send_syn to send the first packet, which initiates the connection sequence:

void *run(void *arg)
{
    struct ip ip;
    struct tcphdr tcp;
    const int on = 1;
    struct sockaddr_in sin;

    if ((sd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
        perror("raw socket");
        exit(1);
    }

    if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
        perror("setsockopt");
        exit(1);
    }

    /* send_syn uses the global descriptor sd */
    send_syn();
    return NULL;
}

The raw_packet_receiver function is the packet handler that is called whenever a packet is received by the packet filter library (libpcap). After we send the initial packet (with the SYN flag set), we read the reply packet here. The reply packet is a TCP packet with the SYN and ACK flags set. We extract the ISN, increase it by one and craft an ACK packet to complete the connection establishment sequence.

void raw_packet_receiver(u_char *udata, const struct pcap_pkthdr *pkthdr, const u_char *packet)
{
    struct ip *ip;
    struct tcphdr *tcp;
    u_char *ptr;
    int l1_len = (int)udata;    /* link-layer header length passed by pcap_loop */
    int s_seq;

    /* skip the link-layer header, then the (option-less) IP header */
    ip = (struct ip *)(packet + l1_len);
    tcp = (struct tcphdr *)(packet + l1_len + sizeof(struct ip));

    printf("%d\n", l1_len);
    printf("a packet came, ack is: %u\n", ntohl(tcp->th_ack));
    printf("a packet came, seq is: %u\n", ntohl(tcp->th_seq));

    s_seq = ntohl(tcp->th_seq);
    send_syn_ack(s_seq);
    sleep(100);
}
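
The handler above assumes a 20-byte IP header and does not look at the TCP flags. A slightly more defensive way to pull the same fields out of a captured frame might look like the sketch below. It is independent of libpcap and operates on a plain byte buffer; parse_tcp_frame and rd32 are hypothetical names, and l1_len plays the same role as in the article (14 for Ethernet).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit big-endian (network order) value. */
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];
}

/* Extracts seq, ack and flags of a TCP segment from a captured frame.
 * l1_len is the link-layer header length (14 for Ethernet).
 * Returns 0 on success, -1 if the frame is not a complete IPv4/TCP packet. */
int parse_tcp_frame(const uint8_t *frame, size_t len, size_t l1_len,
                    uint32_t *seq, uint32_t *ack, uint8_t *flags)
{
    assert(frame != NULL && seq != NULL && ack != NULL && flags != NULL);
    if (len < l1_len + 20)
        return -1;
    const uint8_t *ip = frame + l1_len;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;   /* honours IP options */
    if ((ip[0] >> 4) != 4 || ihl < 20 || ip[9] != 6)
        return -1;                              /* not IPv4 carrying TCP */
    if (len < l1_len + ihl + 20)
        return -1;                              /* TCP header truncated */
    const uint8_t *tcp = ip + ihl;
    *seq = rd32(tcp + 4);
    *ack = rd32(tcp + 8);
    *flags = tcp[13];
    return 0;
}
```

A caller could then answer only when flags has both the SYN (0x02) and ACK (0x10) bits set, instead of replying to every packet that matches the capture filter.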

The packet capturer thread starts running here. The thread opens a BPF device, sets the packet filter (dst host 172....) and waits for packets to arrive. Whenever a packet arrives matching our filter, dispatcher function raw_packet_receiver is invoked.

void *pth_capture_run(void *arg)
{
    pcap_t *pd;
    char *filter = "dst host 172.17.14.90 and ip";
    char *dev = "fxp0";
    char errbuf[PCAP_ERRBUF_SIZE];
    bpf_u_int32 netp;
    bpf_u_int32 maskp;
    struct bpf_program fprog;   /* Filter Program */
    int dl = 0, dl_len = 0;

    if ((pd = pcap_open_live(dev, 1514, 1, 500, errbuf)) == NULL) {
        fprintf(stderr, "cannot open device %s: %s\n", dev, errbuf);
        exit(1);
    }

    /* compile and install the capture filter */
    pcap_lookupnet(dev, &netp, &maskp, errbuf);
    pcap_compile(pd, &fprog, filter, 0, netp);
    if (pcap_setfilter(pd, &fprog) == -1) {
        fprintf(stderr, "cannot set pcap filter %s: %s\n", filter, errbuf);
        exit(1);
    }
    pcap_freecode(&fprog);

    /* determine the link-layer header length */
    dl = pcap_datalink(pd);
    switch(dl) {
    case 1:     /* Ethernet: 14-byte header */
        dl_len = 14;
        break;
    default:
        dl_len = 14;
        break;
    }

    if (pcap_loop(pd, -1, raw_packet_receiver, (u_char *)dl_len) < 0) {
        fprintf(stderr, "cannot get raw packet: %s\n", pcap_geterr(pd));
        exit(1);
    }

    return NULL;
}

Program main. Creates a thread for the packet capturer, and sends the first SYN packet.

int main(int argc, char **argv)
{
    pthread_t tid_pr;
    int rc;

    /* pthread_create() returns the error number; it does not set errno */
    if ((rc = pthread_create(&tid_pr, NULL, pth_capture_run, NULL)) != 0) {
        fprintf(stderr, "cannot create raw packet reader: %s\n", strerror(rc));
        exit(1);
    }
    printf("raw packet reader created, waiting 1 seconds for packet reader thread to settle down...\n");
    sleep(1);
    run(NULL);
    pthread_join(tid_pr, NULL);
    return 0;
}

Let's fire up our program and see what happens:

x-wing# ./tcp
raw packet reader created, waiting 1 seconds for packet reader thread to settle down...
a packet came, ack is: 1249572
a packet came, seq is: 3431236214
x-wing#

You can see the effect with tcpdump(1):

x-wing# tcpdump host 172.16.1.204
tcpdump: listening on fxp0

The very first packet we spoofed (SYN):

11:35:42.839916 172.17.14.90.3333 > 172.16.1.204.33334: S 1249571:1249571(0) win 32768

The router sends an ARP request for the MAC address of our spoofed IP address 172.17.14.90:

11:35:42.840290 arp who-has 172.17.14.90 tell 172.17.14.1

Since we entered a published ARP entry for the IP address, our host answers the ARP query:

11:35:42.840299 arp reply 172.17.14.90 is-at 0:c:76:f:9b:5a

Server's SYN + ACK packet with our ISN + 1 acked, and its ISN set:

11:35:45.910287 172.16.1.204.33334 > 172.17.14.90.3333: S 3431236214:3431236214(0) ack 1249572 win 32768 (DF)

Our spoofed ACK packet in response to the server's SYN + ACK, with server's ISN + 1 acked:

11:35:45.910467 172.17.14.90.3333 > 172.16.1.204.33334: . ack 1 win 32768

The connection should be established now. You can confirm this with netstat(1) on the remote computer:

$ netstat -an | grep 33334
tcp 0 0 172.16.1.204.33334 172.17.14.90.3333 ESTABLISHED
tcp 0 0 *.33334 *.* LISTEN
$

However, when you look at the connections on your local computer via netstat(1), you'll notice that the connection in fact does not exist there: the handshake was carried out by our program, not by the local kernel's TCP stack.

In this paper, the objective was to document sending and receiving raw datagrams using the raw sockets interface. Three sample programs have been discussed to exemplify the concept and the API.

Two famous tools that use raw sockets are ping(8) and traceroute(8). Apart from the samples presented here, you can have a look at the source code of those tools. The source code for ping(8) is in /usr/src/sbin/ping/ping.c, and the source for traceroute(8) is in the /usr/src/contrib/traceroute/ directory of the FreeBSD source tree.

Apart from those applications, raw sockets are widely used (shall I say abused?) in security-related tools. IP spoofing is a technique commonly used in the underground for a variety of attacks, including man-in-the-middle attacks. If you've read this paper carefully, you should have seen how easy it is to spoof a TCP connection and, as a result, come to understand that normal TCP/UDP sockets are not enough for a so-called "secure" communication channel. On top of those layers, additional cryptographic mechanisms should be considered. Secure Sockets Layer (SSL), together with its successor TLS, is a very good protocol for securing a TCP connection; for UDP, the datagram variant DTLS plays the same role. On the lower layers, the IPsec protocol can be a very effective means of securing IP and the layers above it.

Murat Balaban. March 29, 2007 Beylerbeyi, Istanbul.

Stevens, W. Richard. TCP/IP Illustrated Volume I: The Protocols. Boston, Mass. : Addison-Wesley, 2004. ISBN 0-201-70245-2
Wright G.R., Stevens, W. R. TCP/IP Illustrated Volume II: The Implementation. Boston, Mass. : Addison-Wesley, 1995. ISBN 0-201-63354-X
Stevens, W. Richard. Unix Network Programming Volume I: Network APIs. Boston, Mass. : Addison-Wesley, 2004. ISBN 0-201-70245-2
Marshall Kirk McKusick, George V. Neville-Neil The Design and Implementation of the FreeBSD Operating System. Boston, Mass. : Addison-Wesley, 2004. ISBN 0-201-70245-2
Tanenbaum, A. S. Modern Operating Systems. Prentice Hall, 2001. ISBN 0-13-03358-0
Comer, Douglas E. Internetworking with TCP/IP Volume I: Principles, Protocols and Architecture. Upper Saddle River, New Jersey. : Prentice Hall, 1995. ISBN 0-13-227836-7
Herbert, Thomas F. The Linux TCP/IP Stack: Networking for Embedded Systems. Hingham, Massachusetts. : Charles River Media, 2005. ISBN 1-58450-284-3
RAW IP FAQ:


However, "sockets" as described above do not fit all our needs. Normal sockets lack some functionality, some of which is listed below:

We cannot read/write ICMP or IGMP protocols with normal sockets. Our famous ping(8) tool cannot be written using them.
Some operating systems do not process IPv4 protocols other than ICMP, IGMP, TCP or UDP. What if we have a proprietary protocol that we want to handle? How do we send/receive data using that protocol?

Likewise, an incoming packet is demultiplexed as it goes upwards within the stack hierarchy, and is finally delivered to the userspace application.

This is the very first of the TCP/IP layers, the Link Layer. When a packet is received off the wire, the early processing is done here. Its duties include:

send/receive datagrams for the IP protocol
send/receive ARP requests and replies for the ARP protocol
send/receive RARP requests and replies for the RARP protocol

Depending on the hardware used, the header length for this layer varies. Some of the most widely used Link Layer protocols are Ethernet, PPP, ATM and SLIP.

Figure 1. IP header

In the network layer, IP protocol employs a minimum header size of 20 octets. Below are the IP header fields that are in common use (from /usr/include/netinet/ip.h):
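As a rough illustration of that 20-octet layout, here is a minimal sketch of an IPv4 header without options. The structure and field names are invented for this example and are not the ones declared in netinet/ip.h; all multibyte fields are assumed to hold network byte order values.

```c
#include <stdint.h>

/* Illustrative IPv4 header layout; names are hypothetical, not from <netinet/ip.h>. */
struct ipv4_hdr {
    uint8_t  ver_ihl;   /* version (high 4 bits), header length in 32-bit words (low 4 bits) */
    uint8_t  tos;       /* type of service */
    uint16_t tot_len;   /* total datagram length in bytes */
    uint16_t id;        /* datagram identification */
    uint16_t frag_off;  /* flags (3 bits) and fragment offset (13 bits) */
    uint8_t  ttl;       /* time to live */
    uint8_t  proto;     /* upper layer protocol number */
    uint16_t cksum;     /* header checksum */
    uint32_t src;       /* source address */
    uint32_t dst;       /* destination address */
};

/* The fields are naturally aligned, so no padding is expected. */
_Static_assert(sizeof(struct ipv4_hdr) == 20, "IPv4 header without options is 20 octets");

/* IP version stored in the high nibble of the first octet. */
unsigned ipv4_version(const struct ipv4_hdr *h)
{
    return h->ver_ihl >> 4;
}

/* Header length in bytes: the low nibble counts 32-bit words. */
unsigned ipv4_hdr_len(const struct ipv4_hdr *h)
{
    return (h->ver_ihl & 0x0fu) * 4u;
}
```

With ver_ihl set to 0x45 (version 4, five 32-bit words) the helpers report version 4 and a 20-byte header, the same values the samples in this article put into ip_v and ip_hl.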

Figure 2. UDP header

Below are the protocol headers for the UDP and TCP protocols. UDP protocol fields from /usr/include/netinet/udp.h:

Figure 3. TCP header

TCP protocol fields from file /usr/include/netinet/tcp.h:
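For orientation, the two transport headers can be sketched as follows. The structure, field, and constant names are made up for this example and differ from the ones in netinet/udp.h and netinet/tcp.h; multibyte fields are assumed to hold network byte order values.

```c
#include <stdint.h>

/* Illustrative UDP header: 8 octets. Names are hypothetical. */
struct udp_hdr {
    uint16_t sport;   /* source port */
    uint16_t dport;   /* destination port */
    uint16_t len;     /* UDP header + data length */
    uint16_t cksum;   /* checksum over pseudo header, header and data */
};

/* Illustrative TCP header without options: 20 octets. Names are hypothetical. */
struct tcp_hdr {
    uint16_t sport;     /* source port */
    uint16_t dport;     /* destination port */
    uint32_t seq;       /* sequence number */
    uint32_t ack;       /* acknowledgment number */
    uint8_t  off_rsvd;  /* data offset in 32-bit words (high 4 bits) */
    uint8_t  flags;     /* control bits, see below */
    uint16_t win;       /* advertised window */
    uint16_t cksum;     /* checksum */
    uint16_t urp;       /* urgent pointer */
};

/* TCP control bits as they appear in the flags octet. */
enum {
    SK_TH_FIN = 0x01,
    SK_TH_SYN = 0x02,
    SK_TH_RST = 0x04,
    SK_TH_PSH = 0x08,
    SK_TH_ACK = 0x10,
    SK_TH_URG = 0x20
};

/* TCP header length in bytes, including options. */
unsigned tcp_hdr_len(const struct tcp_hdr *h)
{
    return (unsigned)(h->off_rsvd >> 4) * 4u;
}
```

A SYN + ACK segment, for instance, carries SK_TH_SYN | SK_TH_ACK (0x12) in the flags octet.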

Just like normal sockets, we create raw sockets with the socket(2) system call:

int socket(int domain, int type, int protocol)

However, the type parameter is set to SOCK_RAW, and the protocol parameter to the constant naming the protocol you want to handle:

if ((sd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)) < 0) { .. }

const int on = 1;
if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) { .. }
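Putting the two calls together, a minimal sketch might look like the following. The helper name open_raw_socket is an assumption made for illustration. Creating a raw socket normally requires root privileges, so the function only reports failure through its return value and errno and leaves the error handling to the caller.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: create a raw IPv4 socket for the given protocol and
 * enable IP_HDRINCL so that the caller supplies the IP header itself.
 * Returns the descriptor, or -1 with errno set on failure.
 */
int open_raw_socket(int protocol)
{
    const int on = 1;
    int sd = socket(AF_INET, SOCK_RAW, protocol);

    if (sd < 0)
        return -1;
    if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
        int saved = errno;   /* close() may overwrite errno */
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;
}
```

A caller would typically write sd = open_raw_socket(IPPROTO_ICMP) and call perror(3) when the result is negative.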

On some occasions, you can even call bind(2) and connect(2) on a raw socket, though you probably won't want to do that because it will cost you flexibility.

After creating your socket and building your datagram, you can inject it via the sendto(2) and sendmsg(2) system calls. A raw socket is datagram oriented; each send call requires a destination address.

TCP and UDP packets are never delivered to raw sockets, they are always handled by the kernel protocol stack.
Copies of ICMP packets are delivered to a matching raw socket. For some of the ICMP types (ICMP echo request, ICMP timestamp request, mask request) the kernel, at the same time, may wish to do some processing and generate replies.
All IGMP packets are delivered to raw sockets.
All other packets destined for protocols that are not processed by a kernel subsystem (e.g. OSPF packets) are delivered to raw sockets.

setting the protocol accordingly while creating your socket via the socket(2) system call. For instance, if you're sending an ICMP echo-request packet and want to receive the ICMP echo-reply, you can set the protocol argument (3rd argument) to IPPROTO_ICMP.
setting the protocol argument in socket(2) to 0, so any protocol number in the received packet header will match.
defining a local address for your socket (e.g. via bind(2)), so if the destination address matches the socket's local address, the packet will be delivered to your application as well.

You can get the source code for the sample applications from . Also, you'll need the tcpdump(1) tool to see the injected packets live on the wire.

Just before we start coding, it is important that we talk a little bit about byte ordering.

On machines whose byte order is the same as the network byte order, the conversion routines are defined as null macros.
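A quick way to convince yourself of what the conversion does is to look at the bytes in memory. The sketch below relies on the standard htons(3) routine; the helper name wire_bytes_match is made up for this example.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons, ntohs, htonl, ntohl */

/*
 * Hypothetical helper: returns 1 if host_value, once passed through
 * htons(), sits in memory most significant byte first (network order).
 * This holds on both little-endian and big-endian machines.
 */
int wire_bytes_match(uint16_t host_value)
{
    uint16_t net = htons(host_value);
    uint8_t b[2];

    memcpy(b, &net, sizeof(net));
    return b[0] == (uint8_t)(host_value >> 8) &&
           b[1] == (uint8_t)(host_value & 0xff);
}
```

On a big-endian machine htons() leaves the value untouched, which is why the routines can be null macros there; on a little-endian machine it swaps the two bytes.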

Here are the header files that need to be included for the ICMP raw socket application:

#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include

Internet checksum function (from BSD Tahoe). We can use this function to calculate checksums for all layers. The ICMP protocol mandates a checksum, so we have to calculate it.

Our main:
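A minimal sketch of such a function, following the algorithm described in RFC 1071, could look like the following. It is not the BSD Tahoe code itself, and the signature (a const void pointer and a size_t length) is an assumption, chosen so that calls such as in_cksum((unsigned short *)&ip, sizeof(ip)) still compile.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of the Internet checksum: the one's complement of the one's
 * complement sum of the data, taken as 16-bit words. Words are summed in
 * native byte order, so the result is stored into the header as-is,
 * without htons(). Intended for packet-sized buffers (up to 64 KB).
 */
uint16_t in_cksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;
    uint16_t word;

    while (len > 1) {
        memcpy(&word, p, 2);   /* memcpy avoids unaligned 16-bit reads */
        sum += word;
        p += 2;
        len -= 2;
    }
    if (len == 1) {            /* odd trailing byte, padded with a zero byte */
        word = 0;
        memcpy(&word, p, 1);
        sum += word;
    }
    while (sum >> 16)          /* fold the carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A handy property for debugging: running the function over a header that already contains its correct checksum yields 0.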

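int main(int argc, char **argv)
{
    struct ip ip;
    struct udphdr udp;
    struct icmp icmp;
    int sd;
    const int on = 1;
    struct sockaddr_in sin;
    u_char *packet;

Grab some zero-filled space for our packet. Only the first 28 bytes (IP and ICMP headers) are filled in below, so the remaining 32 bytes must be zero for the checksum computed over the 8-byte ICMP header to hold for the whole 60-byte datagram:

packet = (u_char *)calloc(1, 60);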

ip.ip_hl = 0x5;

Protocol version is 4, meaning IPv4:

ip.ip_v = 0x4;

Type of Service. Packet precedence:

ip.ip_tos = 0x0;

Total length of our packet. As said earlier in section 4.2, all multibyte fields (fields bigger than 8 bits) must be converted to network byte order:

ip.ip_len = htons(60);

ID field uniquely identifies each datagram sent by this host:

ip.ip_id = htons(12830);

Fragment offset for our packet. We set this to 0x0 since we don't desire any fragmentation:

ip.ip_off = 0x0;

Time to live. Maximum number of hops that the packet can pass through on the way to its destination:

ip.ip_ttl = 64;

Upper layer (Layer III) protocol number:

ip.ip_p = IPPROTO_ICMP;

ip.ip_sum = 0x0;

Source IP address. This may well be any IP address, whether or not it is assigned to one of our interfaces:

ip.ip_src.s_addr = inet_addr("172.17.14.174");

Destination IP address:

ip.ip_dst.s_addr = inet_addr("172.17.14.169");

We pass the IP header and its length into the internet checksum function. The function returns a 16-bit checksum value for the header:

ip.ip_sum = in_cksum((unsigned short *)&ip, sizeof(ip));

We're finished preparing our IP header. Let's copy it into the very beginning of our packet:

memcpy(packet, &ip, sizeof(ip));

As for Layer III (ICMP) data, ICMP type:

icmp.icmp_type = ICMP_ECHO;

Code 0. Echo Request.

icmp.icmp_code = 0;

ID, an arbitrary number:

icmp.icmp_id = 1000;

ICMP sequence number:

icmp.icmp_seq = 0;

Just like with the IP header, we set the ICMP header checksum to zero and pass the ICMP packet into the checksum function. We store the returned value in the checksum field of the ICMP header:

icmp.icmp_cksum = 0;
icmp.icmp_cksum = in_cksum((unsigned short *)&icmp, 8);

We append the ICMP header to the packet at offset 20:

memcpy(packet + 20, &icmp, 8);

We crafted our packet byte-by-byte. It's time we inject it into the network. First create our raw socket:

if ((sd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0) {
    perror("raw socket");
    exit(1);
}

We tell the kernel that we've also prepared the IP header; there's nothing that the IP stack will do about it:

if (setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
    perror("setsockopt");
    exit(1);
}

memset(&sin, 0, sizeof(sin));
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = ip.ip_dst.s_addr;

if (sendto(sd, packet, 60, 0, (struct sockaddr *)&sin, sizeof(struct sockaddr)) < 0) {
    perror("sendto");
    exit(1);
}
return 0;
}

That's it. Let's compile and run our application. You can see the injected packet via tcpdump(1) on another terminal:

# tcpdump icmp host 172.16.17.160

And...

# make icmp
# ./icmp

# tcpdump icmp
tcpdump: listening on fxp0
11:50:43.789297 172.17.14.174 > 172.17.14.169: icmp: echo request
11:50:43.789439 172.17.14.169 > 172.17.14.174: icmp: echo reply
#

Let's get going, header files to be included:

#include #include #include #include #include #include #include #include #include #include #include #include #include

Structure definition for the pseudo header we're going to use to calculate the UDP checksum. It encapsulates the standard UDP header:

struct psd_udp {
    struct in_addr src;
    struct in_addr dst;
    unsigned char pad;
    unsigned char proto;
    unsigned short udp_len;
    struct udphdr udp;
};

Checksum function:

in_cksum_udp takes the UDP header, its length, and the source and destination IP addresses, puts them into the pseudo header, and feeds it to the internet checksum function:
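As a hedged sketch of that idea, the function below sums the 12-byte pseudo header and the UDP header plus data into one running one's complement sum instead of copying them into a single structure. The names sum16 and udp_cksum, as well as the argument order, are assumptions made for this illustration and do not reproduce the article's in_cksum_udp.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>    /* htons */
#include <netinet/in.h>   /* IPPROTO_UDP */

/* Add len bytes to a running one's complement sum (native-order words). */
uint32_t sum16(const void *data, size_t len, uint32_t sum)
{
    const uint8_t *p = data;
    uint16_t word;

    for (; len > 1; p += 2, len -= 2) {
        memcpy(&word, p, 2);
        sum += word;
    }
    if (len == 1) {          /* odd trailing byte, padded with a zero byte */
        word = 0;
        memcpy(&word, p, 1);
        sum += word;
    }
    return sum;
}

/*
 * Hypothetical stand-in for in_cksum_udp. src and dst are IPv4 addresses in
 * network byte order; udp points to the UDP header followed by its data,
 * udp_len bytes in total, with the checksum field already set to zero.
 */
uint16_t udp_cksum(uint32_t src, uint32_t dst, const void *udp, uint16_t udp_len)
{
    uint8_t pseudo[12];
    uint16_t len_be = htons(udp_len);
    uint32_t sum;
    uint16_t ck;

    memcpy(pseudo, &src, 4);            /* source address */
    memcpy(pseudo + 4, &dst, 4);        /* destination address */
    pseudo[8] = 0;                      /* padding */
    pseudo[9] = IPPROTO_UDP;            /* protocol */
    memcpy(pseudo + 10, &len_be, 2);    /* UDP length */

    sum = sum16(pseudo, sizeof(pseudo), 0);
    sum = sum16(udp, udp_len, sum);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    ck = (uint16_t)~sum;
    return ck ? ck : 0xffff;            /* an all-zero field means "no checksum" in UDP */
}
```

Because the pseudo header has an even length, summing the two buffers one after the other gives the same result as summing a single contiguous copy.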

Our main:

int main(int argc, char **argv)
{
    struct ip ip;
    struct udphdr udp;
    int sd;
    const int on = 1;
    struct sockaddr_in sin;
    u_char *packet;

Grab some space for our packet:

packet = (u_char *)malloc(60);

Just like in the ICMP example, we fill in the IP header fields:

We prepare our UDP header, calculate its checksum, and append it to the packet just after the IP header. Source UDP port:

udp.uh_sport = htons(45512);

Destination UDP port:

udp.uh_dport = htons(53);

Length of the UDP datagram (UDP header length + UDP data):

udp.uh_ulen = htons(8);

Set the checksum field to zero, feed the header to the checksum function, and save the returned value in the checksum field:

udp.uh_sum = 0;
udp.uh_sum = in_cksum_udp(ip.ip_src.s_addr, ip.ip_dst.s_addr, (unsigned short *)&udp, sizeof(udp));
memcpy(packet + 20, &udp, sizeof(udp));

The rest is the same as in the ICMP sample:

TCP connection spoofing involves (though is not limited to) two phases:

Spoofing the three-way connection establishment procedure
Keeping track of sequence numbers and advertised window size while sending/receiving user data

As said earlier, connection establishment in the TCP protocol involves a three-way handshake between the endpoints:

Client decides on an Initial Sequence Number (ISN), sends it within a TCP packet with the SYN flag set. Client enters SYN_SENT state.
On receiving the SYN packet, server enters SYN_RECEIVED state. Server also decides its own ISN. Server then increases client's ISN by one, and puts that value in the acknowledgment number field in the tcp header. Server sets SYN and ACK bits in the tcp flags field and sends back the reply.
Client receives the server's reply, increases server's ISN by one, puts that in acknowledgement number field; and sends back a tcp packet with the ACK flag set. After these steps, both parties enter the ESTABLISHED state.
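The sequence number bookkeeping of these three steps can be sketched in a few lines. The structure and function names below are hypothetical and serve only to make the arithmetic explicit; sequence numbers are unsigned 32-bit values and wrap around modulo 2^32.

```c
#include <stdint.h>

/* Hypothetical summary of the numbers carried by the three handshake segments. */
struct handshake {
    uint32_t syn_seq;      /* step 1: client SYN, seq = client ISN */
    uint32_t synack_seq;   /* step 2: server SYN+ACK, seq = server ISN */
    uint32_t synack_ack;   /*         ack = client ISN + 1 */
    uint32_t ack_seq;      /* step 3: client ACK, seq = client ISN + 1 */
    uint32_t ack_ack;      /*         ack = server ISN + 1 */
};

/* Compute the sequence and acknowledgment numbers from the two ISNs. */
struct handshake plan_handshake(uint32_t client_isn, uint32_t server_isn)
{
    struct handshake h;

    h.syn_seq = client_isn;
    h.synack_seq = server_isn;
    h.synack_ack = client_isn + 1u;   /* unsigned addition wraps like TCP does */
    h.ack_seq = client_isn + 1u;
    h.ack_ack = server_isn + 1u;
    return h;
}
```

With the ISNs seen in this article's tcpdump trace, 1249571 for the client and 3431236214 for the server, the server acknowledges 1249572 and the final ACK acknowledges 3431236215.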

send spoofed ARP responses
insert an ARP entry to the ARP table of the local system