key_person

How Much Overhead Does a Mutex Add When It Guards a Non-Critical Section?

Category: C/C++

2017-11-22 20:57:48

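Overview

This post measures the performance cost of taking a mutex around code that is not actually a critical section.

When writing code, careful design of the logic can sometimes ensure that a critical section is not accessed concurrently more than 80% of the time. In theory, though, in extreme or low-probability cases it can still be entered by more than one thread, so for the sake of stability it still has to be locked.

However, while reading the Disruptor article [1] recently, I found that it claims:

Even when the resource is not contended, merely taking a lock significantly degrades performance.
In my own project code I always try to design the logic so that fewer threads compete for the same lock. Was that wasted effort?

The article uses a simple test that performs 500 million ++ operations. Since that test was implemented in Java, I use C here and check the result in practice.

Real contention for a resource certainly hurts performance, so the comparison here focuses on entering a "pseudo critical section": a locked region that no other thread ever competes for.

Test code: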


#include <stdio.h>
#include <sys/time.h>
#include <pthread.h>

/* Build without optimization (e.g. -O0); an optimizing compiler may
 * remove the empty counting loops entirely. */
unsigned long gtimes = 2UL * 1000 * 1000 * 1000; /* iterations per test */
unsigned long i;                                 /* loop counter shared by both tests */
struct timeval startTime, endTime;
pthread_mutex_t gmutex; /* global, to ensure it is not a stack variable */

void start_time(void)
{
    gettimeofday(&startTime, NULL);
}

void end_time(void)
{
    gettimeofday(&endTime, NULL);
}

/* Elapsed wall-clock time between start_time() and end_time(), in ms. */
double spend_time(void)
{
    return 1000.0 * (endTime.tv_sec - startTime.tv_sec) +
           (endTime.tv_usec - startTime.tv_usec) / 1000.0;
}

/* Baseline: count down without taking any lock. */
void* test_thread(void* arg)
{
    i = gtimes;
    start_time();
    while (i--);
    end_time();
    printf(" a thread cost time: %.2f ms\n", spend_time());
    return NULL;
}

/* Same loop, wrapped in a single lock/unlock of an uncontended mutex. */
void* test_lockthread(void* arg)
{
    i = gtimes;
    pthread_mutex_init(&gmutex, NULL);
    start_time();
    pthread_mutex_lock(&gmutex);
    while (i--);
    pthread_mutex_unlock(&gmutex);
    end_time();
    pthread_mutex_destroy(&gmutex);
    printf(" a thread with a pthread_mutex, cost time: %.2f ms\n", spend_time());
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t tid;
    //pthread_create(&tid, NULL, test_thread, NULL);
    pthread_create(&tid, NULL, test_lockthread, NULL); /* locked loop in a separate thread */
    pthread_join(tid, NULL);
    test_thread(NULL);     /* unlocked loop in the main thread */
    test_lockthread(NULL); /* locked loop in the main thread */
    return 0;
}

Test results:

Run	No lock (ms)	With lock (ms)	Relative difference	Absolute difference (ms)	With lock, separate thread (ms)
1	990.86	1007.29	1.66%	16.43	987.79
2	996.13	997.04	0.09%	0.91	1001.21
3	988.47	989.19	0.07%	0.72	982.72
4	993.6	992.02	-0.16%	-1.58	986.94
5	984.85	984.57	-0.03%	-0.28	989.66
6	991.59	986.75	-0.49%	-4.84	992.94
7	986.68	986.72	0.00%	0.04	983.4
8	989.16	991.17	0.20%	2.01	987.69
9	987.22	1001.31	1.43%	14.09	985.03
10	986.27	984.09	-0.22%	-2.18	987.14

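The table shows the following.
When the two cases run in different threads, the numbers are not comparable: the difference between them varies from run to run.
Scheduling adds to this, so a relatively large deviation is expected in theory.

When both cases run in the same thread, the gap is small in every run except runs 1 and 9:
The largest deviation is under 5 ms, a relative deviation under 0.5%.
Half of these runs deviate by less than 1 ms, a relative deviation under 0.1%.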

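So where does the extra time in the other two runs come from? Again from time slices. Here I take the Linux time slice to be about 10 ms; the actual value depends on the scheduler and the kernel configuration.
In the program the two functions run back to back. If the time slice expires after the first function has finished and just after the second function has called start_time, the second measurement includes one extra time slice.
If we subtract one time slice from the second function's time, the result is roughly within the acceptable range. In practice there would also be at least two thread switches.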

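Run	No lock (ms)	With lock minus 10 ms (ms)	Relative difference	Absolute difference (ms)	With lock, separate thread (ms)
1	990.86	997.29	0.65%	6.43	987.79
9	987.22	991.31	0.41%	4.09	985.03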

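Conclusion:

In C, taking a mutex that no other thread actually contends for costs practically the same as running the code without a lock.

There is certainly some cost, though. My guess is that the lock implementation first makes a quick check, such as an initial spin, and sees immediately that the resource is available.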

[1] Original Disruptor article: https://mechanitis.blogspot.jp/2011/07/dissecting-disruptor-why-its-so-fast.html
