Category: LINUX

2014-01-18 17:25:06

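It has been a while since I last wrote. My previous post was on December 14 and today is January 18, so more than a month has gone by. No excuses; I was simply lazy. Here is a summary of the last few days of work:
1. Problem
    I need to copy 2048*1080 bytes (2,211,840 bytes) of memory. The simplest way is of course memcpy, but under our timing constraints memcpy takes too long and has to be optimized. We are on a DSP platform, which provides fast instructions for this kind of work, yet my first attempt with them was even slower than memcpy, which meant something was wrong. The code is posted below (please excuse the code quality...).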
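For reference, the memcpy baseline mentioned above could be measured with something like the following sketch. It is plain portable C rather than the DSP toolchain used in this post, and the names FRAME_BYTES and time_memcpy_frame are made up for illustration. It uses clock() from the standard library, so it reports CPU time at whatever resolution the platform provides; a real DSP would more likely offer a cycle counter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* 2048 * 1080 = 2,211,840 bytes, the loop bound used in this post. */
#define FRAME_BYTES (2048u * 1080u)

/* Hypothetical helper: copies one frame `repeats` times with memcpy and
 * returns the average CPU time per copy in seconds.
 * Returns -1.0 if allocation fails or the copy does not verify. */
double time_memcpy_frame(int repeats)
{
    unsigned char *src;
    unsigned char *dst;
    double seconds = -1.0;

    assert(repeats > 0);
    src = malloc(FRAME_BYTES);
    dst = malloc(FRAME_BYTES);

    if (src != NULL && dst != NULL) {
        clock_t start, end;
        int r;

        memset(src, 0xA5, FRAME_BYTES);  /* touch both buffers before timing */
        memset(dst, 0, FRAME_BYTES);

        start = clock();
        for (r = 0; r < repeats; r++) {
            memcpy(dst, src, FRAME_BYTES);
        }
        end = clock();

        /* Checking the result keeps the copy observable to the compiler. */
        if (memcmp(dst, src, FRAME_BYTES) == 0) {
            seconds = (double)(end - start) / CLOCKS_PER_SEC / repeats;
        }
    }

    free(src);
    free(dst);
    return seconds;
}
```

Calling time_memcpy_frame(100) and printing the result gives a number that any hand-written copy loop has to beat; averaging over many repeats smooths out the coarse resolution of clock().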
   


#if 0
/* First attempt (slower than memcpy): every SUPER_LD32R call re-evaluates
 * compareSignal.tempMemP[compareSignal.auxFlag] and videoPacket->buffers[0].data,
 * so the same values are fetched from memory again and again inside the loop. */
for(i=0; i<2211840; i+=64)
{
    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+4),
                (unsigned char *)videoPacket->buffers[0].data+i, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+8),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+12),
                (unsigned char *)videoPacket->buffers[0].data+i+8, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+16),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+20),
                (unsigned char *)videoPacket->buffers[0].data+i+16, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+24),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+28),
                (unsigned char *)videoPacket->buffers[0].data+i+24, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+32),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+36),
                (unsigned char *)videoPacket->buffers[0].data+i+32, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+40),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+44),
                (unsigned char *)videoPacket->buffers[0].data+i+40, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+48),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+52),
                (unsigned char *)videoPacket->buffers[0].data+i+48, 0);

    SUPER_LD32R((signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+56),
                (signed long *)(compareSignal.tempMemP[compareSignal.auxFlag]+i+60),
                (unsigned char *)videoPacket->buffers[0].data+i+56, 0);
}
#else
/* Optimized version: the destination and source base pointers are read once,
 * before the loop, and kept in local variables. */
unsigned char * tempP = compareSignalMem.tempMemP[compareSignal.auxFlag++];
unsigned char * tempPacketP = (unsigned char *)videoPacket->buffers[0].data;
for(i=0; i<2211840; i+=64)
{
    /* As used here, each SUPER_LD32R call takes two 32-bit destinations
     * (offsets +0 and +4) and one source address, so the eight calls
     * cover 64 bytes per iteration. */
    unsigned char * tempP1 = tempP+i;
    SUPER_LD32R((signed long *)(tempP1),    (signed long *)(tempP1+4),  tempPacketP+i,    0);
    SUPER_LD32R((signed long *)(tempP1+8),  (signed long *)(tempP1+12), tempPacketP+i+8,  0);
    SUPER_LD32R((signed long *)(tempP1+16), (signed long *)(tempP1+20), tempPacketP+i+16, 0);
    SUPER_LD32R((signed long *)(tempP1+24), (signed long *)(tempP1+28), tempPacketP+i+24, 0);
    SUPER_LD32R((signed long *)(tempP1+32), (signed long *)(tempP1+36), tempPacketP+i+32, 0);
    SUPER_LD32R((signed long *)(tempP1+40), (signed long *)(tempP1+44), tempPacketP+i+40, 0);
    SUPER_LD32R((signed long *)(tempP1+48), (signed long *)(tempP1+52), tempPacketP+i+48, 0);
    SUPER_LD32R((signed long *)(tempP1+56), (signed long *)(tempP1+60), tempPacketP+i+56, 0);
}
#endif
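2. Improving the code
a) In my tests, reading values from memory was very expensive, more so than arithmetic. So in the optimized version I moved the costly lookup compareSignalMem.tempMemP[compareSignal.auxFlag++]; out of the for loop, which reduced the running time.
b) I also paid attention to doing more work in each pass of the for loop so that the loop runs fewer times. When the body of a for loop is cheap but the iteration count is high, this approach (loop unrolling) reduces the number of iterations and the loop overhead that goes with them.
c) The main lesson from solving this problem: fetching values from memory is still relatively time-consuming.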
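Points a) and b) can be shown together in portable C. This is only a sketch under stated assumptions, not the DSP code from this post: struct frame_pool, copy8 and copy_frame_unrolled are hypothetical names, a fixed 8-byte memcpy stands in for the platform's SUPER_LD32R macro, and the wrap-around of the buffer index is my own assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the buffer bookkeeping in the post. */
struct frame_pool {
    unsigned char *bufs[2];  /* destination buffers */
    int next;                /* index of the buffer to fill next */
};

/* Portable stand-in for one 8-byte transfer. A fixed-size memcpy is
 * usually inlined by the compiler into plain loads and stores. */
static void copy8(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, 8);
}

/* Copies n bytes (n must be a multiple of 64) from src into the pool's
 * next buffer and returns that buffer. */
unsigned char *copy_frame_unrolled(struct frame_pool *pool,
                                   const unsigned char *src, size_t n)
{
    /* Point a): the buffer lookup happens once, outside the loop. */
    unsigned char *dst = pool->bufs[pool->next];
    size_t i;

    assert(n % 64 == 0);
    pool->next = (pool->next + 1) % 2;  /* assumed wrap-around */

    /* Point b): eight 8-byte transfers per iteration, so the loop
     * condition is checked once per 64 bytes instead of once per 8. */
    for (i = 0; i < n; i += 64) {
        unsigned char *d = dst + i;
        const unsigned char *s = src + i;
        copy8(d,      s);
        copy8(d + 8,  s + 8);
        copy8(d + 16, s + 16);
        copy8(d + 24, s + 24);
        copy8(d + 32, s + 32);
        copy8(d + 40, s + 40);
        copy8(d + 48, s + 48);
        copy8(d + 56, s + 56);
    }
    return dst;
}
```

An optimizing compiler may hoist the lookup and unroll the loop by itself when it can prove the pointers do not alias, so whether the manual version helps depends on the toolchain; it is worth measuring both forms.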


That's all for now.
