Network performance within Amazon EC2 and to S3


Performance between Amazon EC2 instances

In this first experiment I boot a couple of EC2 large instances. I make one of them a ’server’ by setting up Apache and copying some large files onto it, and use the other instance as a client by issuing HTTP requests with curl. All file transfers are served from memory-cached blocks, so there’s virtually no disk I/O involved in them.

So, this experimental setup consists of 2 Large EC2 instances:

Server: Apache (non-SSL) serving 1-2GB files (cached in memory)

Client: curl retrieving the large files from the server
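The measurement itself can be approximated in a few lines. The sketch below is not the tooling used in this article (which ran plain curl processes); it assumes only Python's standard library, and the function names are illustrative. It starts several concurrent downloads of the same URL and reports the aggregate rate, taking 1 MB as 10^6 bytes.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # read the response body 1 MiB at a time


def download_bytes(url):
    """Stream one URL to completion and return the number of body bytes read."""
    total = 0
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(CHUNK_SIZE)
            if not chunk:
                break
            total += len(chunk)
    return total


def parallel_throughput(url, streams, fetch=download_bytes):
    """Run `streams` concurrent fetches of `url`; return the aggregate rate in MB/s."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=streams) as pool:
        sizes = list(pool.map(fetch, [url] * streams))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return sum(sizes) / elapsed / 1e6  # 1 MB taken as 10**6 bytes
```

A call such as `parallel_throughput("http://<server>/bigfile", 3)` (hypothetical path) would correspond to the 3-curl case. At rates near 100MB/s the Python interpreter may itself become a bottleneck, so separate curl processes remain the more reliable way to reproduce the figures.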

These two instances appear to be separated by an intermediate router, so they don’t seem to be on the same host. This is the traceroute between them:

traceroute to 10.254.39.48 (10.254.39.48), 30 hops max, 40 byte packets  
1  dom0-10-254-40-170.compute-1.internal (10.254.40.170)  0.123 ms  0.102 ms  0.075 ms  
2  10.254.40.2 (10.254.40.2)  0.380 ms  0.255 ms  0.246 ms  
3  dom0-10-254-36-166.compute-1.internal (10.254.36.166)  0.278 ms  0.257 ms  0.231 ms  
4  domU-12-31-39-00-20-C2.compute-1.internal (10.254.39.48)  0.356 ms  0.331 ms  0.319 ms  

So what results did we get?
Well, using a single curl file retrieval, we were able to get around 75MB/s consistently. Adding more curls uncovered even more network bandwidth, reaching close to 100MB/s. Here are the results:

1 curl -> 75MB/s (cached, i.e., no I/O on the apache server)
2 curls -> 88MB/s (2×44MB/s) (cached)
3 curls -> 96MB/s (33+35+28 MB/s) (cached)

I did not repeat the experiments using SSL. However, I did run some additional tests transferring files with ’scp’ between the same instances. Those tests max out at around 30-40MB/s regardless of the amount of parallelism, because the CPU becomes the bottleneck.

This is really nice: basically we’re getting close to a full gigabit between the instances! Now, let’s take a look at what we get when EC2 instances talk to S3.
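As a rough check on the gigabit comparison, the measured rates can be converted from MB/s to Mbit/s. This sketch assumes 1 MB = 10^6 bytes (curl may report in 2^20-byte units, which would raise the figures by about 5%) and a nominal 1 Gbit/s link, i.e. 125 MB/s of raw line rate. The function name is illustrative.

```python
# Measured aggregate rates from the runs above, in MB/s.
measured_mb_per_s = {1: 75, 2: 88, 3: 96}

LINK_MBIT_PER_S = 1000  # assumption: nominal 1 Gbit/s link


def to_mbit_per_s(mb_per_s):
    """Convert megabytes per second to megabits per second (1 MB = 10**6 bytes)."""
    return mb_per_s * 8


for curls, rate in measured_mb_per_s.items():
    mbit = to_mbit_per_s(rate)
    share = mbit / LINK_MBIT_PER_S
    print(f"{curls} curl(s): {rate} MB/s = {mbit} Mbit/s ({share:.0%} of 1 Gbit/s)")
```

Under these assumptions the 3-curl run works out to 768 Mbit/s of payload, before counting TCP/IP and Ethernet framing overhead.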

Performance between Amazon EC2 and Amazon S3

This experiment is similar to the previous one in the sense that I use curl to download files from, or upload files to, the server. The server, however, is s3.amazonaws.com. (Still using HTTP and HTTPS, since S3 is a REST service.)

So, this experimental setup consists of 1 Large EC2 instance:

Client: curl retrieving objects from, or uploading objects to, S3

Server: Amazon S3 (i.e., s3.amazonaws.com) serving (or storing) 1GB files

The trace to the selected s3 server looks like:

traceroute to s3.amazonaws.com (72.21.206.171), 30 hops max, 40 byte packets
1  dom0-10-252-24-163.compute-1.internal (10.252.24.163)  0.122 ms  0.150 ms  0.209 ms
2  10.252.24.2 (10.252.24.2)  0.458 ms  0.348 ms  0.409 ms
3  othr-216-182-224-9.usma1.compute.amazonaws.com (216.182.224.9)  0.384 ms  0.400 ms  0.440 ms
4  othr-216-182-224-15.usma1.compute.amazonaws.com (216.182.224.15)  0.990 ms  1.115 ms  1.070 ms
5  othr-216-182-224-90.usma1.compute.amazonaws.com (216.182.224.90)  0.807 ms  0.928 ms  0.902 ms
6  othr-216-182-224-94.usma1.compute.amazonaws.com (216.182.224.94)  151.979 ms  152.001 ms  152.021 ms
7  72.21.199.32 (72.21.199.32)  2.050 ms  2.029 ms  2.087 ms
8  72.21.205.24 (72.21.205.24)  2.654 ms  2.629 ms  2.597 ms
9  * * *

So, although the server itself doesn’t respond to ICMP, the trace shows that there’s a significant path to be traversed.

Let’s start with downloads, more specifically with HTTPS downloads. The first thing that I noticed is that the performance of a single download stream is quite good (i.e., around 12.6MB/s). What is also interesting to note is that while download performance doesn’t scale linearly with the number of concurrent curls, it is possible for a large instance to reach higher download speeds when downloading several objects in parallel. The maximum performance seems to flatten out around 50MB/s. At that point the large instance is operating at a CPU usage of around 22% user plus 20% system, which given the SSL encryption going on is nice!

Here are the raw HTTPS numbers:

1 curl -> 12.6MB/s
2 curls -> 21.0MB/s (10.5+10.5 MB/s)
3 curls -> 31.3MB/s (10.2+10.0+11.1 MB/s)
4 curls -> 37.5MB/s (9.0+9.1+9.8+9.6 MB/s)
6 curls -> 46.6MB/s (8.0+7.8+7.6+7.9+7.8+7.5 MB/s)
8 curls -> 49.8MB/s (6.0+6.3+7.0+6.1+6.0+5.9+6.2+6.3 MB/s)
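To make the "doesn’t scale linearly" observation concrete, the per-stream figures above can be summed and compared with what perfectly linear scaling from the single-stream rate would give. This is a small arithmetic sketch over the numbers listed above; the function name is illustrative.

```python
# Per-stream HTTPS download rates in MB/s, copied from the list above.
https_rates = {
    1: [12.6],
    2: [10.5, 10.5],
    3: [10.2, 10.0, 11.1],
    4: [9.0, 9.1, 9.8, 9.6],
    6: [8.0, 7.8, 7.6, 7.9, 7.8, 7.5],
    8: [6.0, 6.3, 7.0, 6.1, 6.0, 5.9, 6.2, 6.3],
}


def scaling_efficiency(per_stream, single_stream_rate):
    """Aggregate rate divided by what perfectly linear scaling would give."""
    return sum(per_stream) / (len(per_stream) * single_stream_rate)


single = https_rates[1][0]
for curls, rates in https_rates.items():
    total = sum(rates)
    efficiency = scaling_efficiency(rates, single)
    print(f"{curls} curls: {total:.1f} MB/s total, {efficiency:.0%} of linear scaling")
```

With 8 curls the aggregate is a little under half of what eight independent 12.6MB/s streams would deliver, which matches the flattening described above.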

The SSL encryption uses RC4-MD5, so there is a fair amount of work for both S3 and the instance to do. So the next natural question is whether there’s more to gain when talking to S3 without SSL. Unfortunately, the answer is no. While the load on the client drops significantly (from 22% to 5% user and from 20% to 14% system when using 8 curls), the available bandwidth using non-SSL is basically the same (i.e., the differences fall within the margin of error). This leads me to believe that in either case the instance is not the bottleneck.

Here are the same data points for non-SSL (HTTP) downloads:

1 curl -> 10.2 MB/s
2 curls -> 20.0 MB/s (10.1+9.9 MB/s)
3 curls -> 29.6 MB/s (10.0+9.7+9.9 MB/s)
4 curls -> 37.6 MB/s (9.1+9.4+9.4+9.7 MB/s)
6 curls -> 46.5 MB/s (7.8+7.8+7.6+7.9+7.8+7.6 MB/s)
8 curls -> 51.5 MB/s (6.6+6.4+6.6+6.3+6.2+6.2+6.7+6.3 MB/s)
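The two sets of download totals can be compared directly. The sketch below uses the aggregate figures reported in the HTTPS and HTTP lists and prints the relative difference for each stream count; the function name is illustrative.

```python
# Aggregate download rates in MB/s, as reported in the two lists above.
https_total = {1: 12.6, 2: 21.0, 3: 31.3, 4: 37.5, 6: 46.6, 8: 49.8}
http_total = {1: 10.2, 2: 20.0, 3: 29.6, 4: 37.6, 6: 46.5, 8: 51.5}


def relative_difference(http_rate, https_rate):
    """Relative change of the HTTP rate with respect to the HTTPS rate."""
    return (http_rate - https_rate) / https_rate


diffs = {n: relative_difference(http_total[n], https_total[n]) for n in https_total}
for n, diff in diffs.items():
    print(f"{n} curls: HTTP {http_total[n]} vs HTTPS {https_total[n]} MB/s ({diff:+.1%})")
```

From 2 curls upward the totals stay within roughly 5% of each other; the single-stream case is the one clear outlier.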

Interestingly enough, a single non-SSL stream seems to get lower throughput than an SSL one (10.2MB/s vs 12.6MB/s). I didn’t check whether the SSL stream uses compression, which may be one reason for this.

So how about uploads? I’ll use the same setup but using curl to upload a 1GB file using a signed S3 URL.
The first interesting thing to notice from the results is that a single upload stream gets roughly half the bandwidth of a single download stream (i.e., 6.9MB/s vs. 12.6MB/s). However, the good news is that the upload bandwidth still scales when using multiple streams.
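For readers unfamiliar with signed S3 URLs, here is a minimal sketch of how one could be built for a PUT upload. The article does not say how its URLs were generated, so this is an assumption: it uses the legacy query-string authentication scheme (Signature Version 2, HMAC-SHA1), which newer S3 regions no longer accept in favor of Signature Version 4. The bucket, key, and credential values are placeholders, and the function names are illustrative.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def sign(secret_key, string_to_sign):
    """Return the base64-encoded HMAC-SHA1 of the string to sign."""
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def presign_put_url(bucket, key, access_key, secret_key, expires_in=3600, now=None):
    """Build a legacy (Signature Version 2) query-string-signed PUT URL."""
    issued = time.time() if now is None else now
    expires = int(issued + expires_in)  # absolute expiry, in epoch seconds
    resource = f"/{bucket}/{quote(key)}"
    # Fields: verb, Content-MD5 (empty), Content-Type (empty), expiry, resource.
    string_to_sign = f"PUT\n\n\n{expires}\n{resource}"
    signature = quote(sign(secret_key, string_to_sign), safe="")
    return (
        f"https://s3.amazonaws.com{resource}"
        f"?AWSAccessKeyId={access_key}&Expires={expires}&Signature={signature}"
    )
```

The resulting URL could then be handed to something like `curl -T bigfile "<url>"`. The empty Content-Type field assumes the client sends no Content-Type header; if it does, the header value has to be part of the signed string.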

Here are the raw numbers for SSL uploads:

1 curl -> 6.9MB/s
2 curls -> 14.2MB/s
4 curls -> 23.6MB/s
6 curls -> 37.6MB/s
8 curls -> 48.0MB/s
12 curls -> 53.8MB/s

In other words: give me some data and I’ll fill up S3 in a hurry. So what about non-SSL uploads? Well, that turned out to be an interesting one… I’ve seen a single curl upload achieve the same performance as a download (that is, 1 curl upload with no SSL can achieve 12.6MB/s). But over quite a number of experiments I’ve seen non-SSL uploads exhibit a weird behavior where some of them mysteriously slow down and linger for a while almost idle (i.e., at a very low bandwidth). The end result is that the average bandwidth at the end of a run varies by a factor of almost 2. I’m still investigating what causes this.

Summary

The bottom line from these experiments is that Amazon is providing very high throughput around EC2 and S3. Results were readily reproducible (except for the problem described with the non-SSL uploads) and definitely support high-bandwidth, high-volume operation. Clearly, if you put together a custom cluster in your own datacenter you can wire things up with more bandwidth, but for a general-purpose system, this is a ton of bandwidth all around.

Archived Comments

myke
Do you have any numbers of traffic outside the amazon network?

blanquer
Do you mean from/to EC2? As far as we know, there’s no explicit throttling going in and out of EC2 from/to the Internet. So I would assume that most likely the bottleneck is your Internet connection (i.e., not many people have Gigabit Ethernet pipes to the Internet…:-) ).

But as far as numbers to back that up…no. I don’t have them.

Josep M.
