Comparing file copy with mmap() and with read()/write() - wwwzyf - ChinaUnix Blog

Category: Linux

2010-04-11 13:59:53

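Recently in class our teacher said that mmap() memory mapping can be used to copy a file, and that it is
noticeably faster than an ordinary file copy. So I implemented both kinds of copy and compared the time
each one takes. First, the code: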

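#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/time.h>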

#define BUFFER_SIZE 1

void my_copy1()
{
    int fin,fout;   // file descriptors
    void *start;
    void *end;
    struct stat sb;
    if((fin = open("file.in",O_RDONLY)) < 0){
        perror("open error");
        exit(EXIT_FAILURE);
    }
    if((fout = open( "file.out",O_RDWR | O_CREAT | O_TRUNC,00600)) < 0 ){
        perror( "open error" );
        exit( EXIT_FAILURE );
    }

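    if(fstat(fin,&sb) < 0){
        perror("fstat error");
        exit(EXIT_FAILURE);
    }

    // fout must first be extended to the required size, because mmap() cannot grow a file.
    // Seeking to the last byte and writing one byte does that; ftruncate(fout,sb.st_size)
    // would achieve the same in a single call. Note that this step fails if file.in is empty.
    if(lseek(fout,sb.st_size-1,SEEK_SET) < 0 ){
        perror("lseek error");
        exit(EXIT_FAILURE);
    }
    if(write(fout,"",1) != 1 ){
        perror("write error");
        exit(EXIT_FAILURE);
    }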

    start = mmap(NULL,(size_t)sb.st_size,PROT_READ,MAP_PRIVATE,fin,0);
    if(start == MAP_FAILED){
        perror("mmap error");
        exit(EXIT_FAILURE);
    }

    end = mmap(NULL,(size_t)sb.st_size,PROT_READ | PROT_WRITE,MAP_SHARED,fout,0);
    if(end == MAP_FAILED){
        perror("mmap error");
        exit(EXIT_FAILURE);
    }

    memcpy(end,start,(size_t)sb.st_size);

    munmap(start,sb.st_size); // remove the mappings
    munmap(end,sb.st_size);

    close(fin);
    close(fout);
    return;
}

void my_copy2()
{
    int fin,fout;
    ssize_t bytes_read,bytes_write;   // read() and write() return ssize_t
    char buffer[BUFFER_SIZE];
    char *ptr;
    if((fin = open("file.in",O_RDONLY)) < 0){
        perror("open error");
        exit(EXIT_FAILURE);
    }
    if((fout = open( "file.out",O_RDWR | O_CREAT | O_TRUNC,00600)) < 0 ){
        perror( "open error" );
        exit( EXIT_FAILURE );
    }

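    while((bytes_read=read(fin,buffer,BUFFER_SIZE))){
        if((bytes_read==-1)&&(errno!=EINTR))
            break;      // a real read error (EINTR just means: try again)
        else if(bytes_read>0){
            ptr=buffer;
            // write() may write fewer bytes than requested, so loop until this chunk is done
            while((bytes_write=write(fout,ptr,bytes_read))){
                if((bytes_write==-1)&&(errno!=EINTR))
                    break;      // a real write error
                else if(bytes_write==bytes_read)
                    break;      // the whole chunk has been written
                else if(bytes_write>0){
                    ptr+=bytes_write;
                    bytes_read-=bytes_write;
                }
            }
            if(bytes_write==-1)
                break;
        }
    }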

    close(fin);
    close(fout);
    return;
}

int main(void)
{
    struct timeval tv;
    long long time_start,time_end;

    gettimeofday(&tv,NULL);
    time_start = tv.tv_sec*1000000LL + tv.tv_usec;
    my_copy1();
    printf("\ndone.\n\n");   // note: this printf is inside the timed region
    gettimeofday(&tv,NULL);
    time_end = tv.tv_sec*1000000LL + tv.tv_usec;
    printf("using \"mmap()\" to copy costs %lld microseconds \n",time_end - time_start);

    gettimeofday(&tv,NULL);
    time_start = tv.tv_sec*1000000LL + tv.tv_usec;
    my_copy2();
    gettimeofday(&tv,NULL);
    time_end = tv.tv_sec*1000000LL + tv.tv_usec;

// This used to print a negative number now and then. After reading up on gettimeofday() I learned
// that the microsecond field carries into the seconds field when it reaches one full second, i.e.
// the microsecond field wraps back to zero. So the correct way to compute time_start and time_end
// is to include the seconds field as well, after first converting it to microseconds.
    printf("using \"read() and write()\" to copy costs %lld microseconds \n",time_end - time_start);
    return 0;
}
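
The negative-time problem described in the comment inside main() can be isolated in a small helper. This is only a sketch: the name elapsed_us is my own and is not part of the program above. It subtracts the seconds and the microseconds separately and then combines them, so a wrap of tv_usec between the two samples does not matter.

```c
#include <sys/time.h>

/* Hypothetical helper (not in the original program): microseconds elapsed
 * between two gettimeofday() samples, using both fields of struct timeval. */
static long long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    long long sec  = (long long)end->tv_sec  - (long long)start->tv_sec;
    long long usec = (long long)end->tv_usec - (long long)start->tv_usec;

    /* usec may be negative when tv_usec wrapped; the seconds term corrects it. */
    return sec * 1000000LL + usec;
}
```

To use it, take one struct timeval before the copy and one after it, then pass both to elapsed_us(). For interval timing, clock_gettime() with CLOCK_MONOTONIC is usually a better clock, because gettimeofday() follows the wall clock and can jump if the system time is adjusted.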

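The code is not difficult. It uses a few Linux C functions; if any of them are unfamiliar, look them up
in the relevant documentation. What I mainly want to do here is compare the time taken by the two copy
implementations and draw some conclusions of my own. When trying the program you can change BUFFER_SIZE
to any number; it is the number of bytes that read() fetches from the file per call. I stress this for
a reason: when BUFFER_SIZE is very small, the two results differ enormously. For example, with
BUFFER_SIZE=1 my run looked like this: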
zhou@zhou:~/LinuxC/file/mmcopy$ ./mmap

done.

using "mmap()" to copy costs 591 microseconds
using "read() and write()" to copy costs 505337 microseconds
zhou@zhou:~/LinuxC/file/mmcopy$
The two are not even in the same order of magnitude. Now a different value.
With BUFFER_SIZE=10000 my run looked like this:
zhou@zhou:~/LinuxC/file/mmcopy$ ./mmap

done.

using "mmap()" to copy costs 594 microseconds
using "read() and write()" to copy costs 585 microseconds
zhou@zhou:~/LinuxC/file/mmcopy$
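Now the two times are very close, as you would expect. With a large BUFFER_SIZE the read()/write()
method is fast. But what if the file to copy is small, say only 100 bytes, while you allocate a
10000-byte buffer every time? Most of that buffer is wasted. This is where mmap() is convenient: the
mapping is sized to the file itself, so there is no buffer size to tune, and it is still fast.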
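
The idea that the mapping is sized to the file can be shown in a more compact form. The function below is a minimal sketch under my own assumptions, not the code that was timed above: the name mmap_copy and its path parameters are hypothetical, it uses ftruncate() instead of the lseek()/write() trick to give the output file its size, and it treats an empty input as a successful copy because mmap() does not accept a zero length.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy src_path to dst_path through two mappings.
 * Returns 0 on success and -1 on any failure. */
static int mmap_copy(const char *src_path, const char *dst_path)
{
    int rc = -1;
    int fin = -1, fout = -1;
    void *src = MAP_FAILED, *dst = MAP_FAILED;
    struct stat sb;
    size_t len = 0;

    if ((fin = open(src_path, O_RDONLY)) < 0)
        goto out;
    if ((fout = open(dst_path, O_RDWR | O_CREAT | O_TRUNC, 0600)) < 0)
        goto out;
    if (fstat(fin, &sb) < 0)
        goto out;

    len = (size_t)sb.st_size;
    if (len == 0) {              /* nothing to map; the empty output already exists */
        rc = 0;
        goto out;
    }

    /* Give the output file its final size; mmap() cannot grow a file. */
    if (ftruncate(fout, sb.st_size) < 0)
        goto out;

    src = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fin, 0);
    if (src == MAP_FAILED)
        goto out;
    dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fout, 0);
    if (dst == MAP_FAILED)
        goto out;

    memcpy(dst, src, len);       /* the copy itself: no fixed-size buffer involved */
    rc = 0;

out:
    if (dst != MAP_FAILED) munmap(dst, len);
    if (src != MAP_FAILED) munmap(src, len);
    if (fout >= 0) close(fout);
    if (fin >= 0) close(fin);
    return rc;
}
```

A call such as mmap_copy("file.in", "file.out") would do the same job as my_copy1(). The single exit path is there so that both descriptors and both mappings are released on every error, which the short version in the listing does not attempt.
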
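So why the difference? My understanding is this: mmap() maps the whole source file into memory, and the
data is then copied into the mapping of the destination file, so the copy needs only a handful of calls
(two mmap() calls and one memcpy(), with the kernel paging the data in and out behind the scenes).
read()/write() is different: the number of calls depends on the BUFFER_SIZE you defined, roughly
(total bytes in the file / BUFFER_SIZE) * 2. Strictly speaking these are system calls rather than disk
operations, since the kernel's page cache sits between the program and the disk, but each one still
costs a switch into the kernel plus a copy, and with BUFFER_SIZE=1 that overhead dominates the run time.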
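
The call-count estimate above can be written as a small function to see how quickly it grows. This is a sketch with a name of my own choosing (copy_syscalls); it assumes every read() and write() transfers a full buffer, and it counts the final read() that returns 0 at end of file, which the rough formula leaves out.

```c
/* Hypothetical helper: estimated number of read() + write() calls made by
 * the loop in my_copy2(), assuming no short reads or writes. */
static unsigned long long copy_syscalls(unsigned long long file_size,
                                        unsigned long long buf_size)
{
    unsigned long long chunks;

    if (buf_size == 0)
        return 0;                                       /* not a valid buffer size */

    chunks = (file_size + buf_size - 1) / buf_size;     /* ceil(file_size / buf_size) */
    return 2 * chunks + 1;                              /* reads + writes + EOF read */
}
```

As an illustration with a made-up size (the size of file.in is not given in this post), a 1 MiB file would need 2097153 calls with a 1-byte buffer and 211 calls with a 10000-byte buffer.
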

OK, that is all for now. If you have any questions, leave a comment and we can discuss. Thanks.
