localtime()函数导致日志性能差-up哥小号-ChinaUnix博客

up哥小号的ChinaUnix博客upge.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

up哥小号

博客访问： 154226
博文数量： 10
博客积分： 207
博客等级：入伍新兵
技术积分： 380
用户组：普通用户
注册时间： 2012-11-10 12:44

文章分类

全部博文（10）

文章存档

2015年（2）

2013年（1）

2012年（7）

我的朋友

fire0225

相关博文

localtime()函数导致日志性能差

分类：系统运维

2015-01-25 00:27:58

前些日子，做了一些日志优化，我的某个程序，打开日志的情况下单机性能非常差。。
看不下去了，着手改了下，开着日志，打了点压力，问题出来了，在压力下延时明显增加，perf看到spin_lock出来了。。
继续加压，性能不能增加，看来各线程都被spin_lock给锁住了。。
我们的日志模式理论上不应该有这种问题，是多个worker线程，通过无锁ring把日志传到某个线程上，由这个线程单独写日志。
我们使用的是多生产者，单消费者模式的无锁ring。
刚开始觉得难道是dpdk这个无锁ring的问题？可是不能啊，实现上就多了那么点CAS指令就这么差了？
无锁ring的性能不可能这么差，搂了下dpdk 无锁ring的实现，并没有什么地方调用spin_lock。。
dpdk 的ring就是多了个CAS指令，可是理论上CAS指令也是非常快的呀。。
遗漏了什么？
回到最初的问题，开日志，性能很差，不开日志，性能还不错。差别在各线程多了构造日志，通过无锁ring发过来。
忽略了构造日志这个基本环节了！

点击(此处)折叠或打开

static void log_fmt_msg(struct log_msg *log, int type, int level, const char *fmt, va_list args)
{
int size;
long ts = time(NULL);
struct tm *ptm = localtime(&ts);
...
}
}

good，都忘了调用的是glibc的localtime()了，下载glibc 2.12 看了下localtime()的实现，

点击(此处)折叠或打开

/* The C Standard says that localtime and gmtime return the same pointer. */
struct tm _tmbuf;
/* Return the `struct tm' representation of *T in local time,
using *TP to store the result. */
struct tm *
__localtime_r (t, tp)
const time_t *t;
struct tm *tp;
{
return __tz_convert (t, 1, tp);
}
weak_alias (__localtime_r, localtime_r)
/* Return the `struct tm' representation of *T in local time. */
struct tm *
localtime (t)
const time_t *t;
{
return __tz_convert (t, 1, &_tmbuf);
}
~
~

点击(此处)折叠或打开

struct tm *
__tz_convert (const time_t *timer, int use_localtime, struct tm *tp)
{
long int leap_correction;
int leap_extra_secs;
if (timer == NULL)
{
__set_errno (EINVAL);
return NULL;
}
__libc_lock_lock (tzset_lock);
/* Update internal database according to current TZ setting.
POSIX.1 8.3.7.2 says that localtime_r is not required to set tzname.
This is a good idea since this allows at least a bit more parallelism.
By analogy we apply the same rule to gmtime_r. */
tzset_internal (tp == &_tmbuf);
if (__use_tzfile)
{
if (! __tzfile_compute (*timer, use_localtime,
&leap_correction, &leap_extra_secs, tp))
tp = NULL;
}
else
{
if (! (__offtime (timer, 0, tp) && tz_compute (tp)))
tp = NULL;
leap_correction = 0L;
leap_extra_secs = 0;
}
if (tp)
{
if (use_localtime)
{
if (!__use_tzfile)
{
int isdst = (*timer >= tz_rules[0].change
&& *timer < tz_rules[1].change);
tp->tm_isdst = isdst;
tp->tm_zone = __tzname[isdst];
tp->tm_gmtoff = tz_rules[isdst].offset;
}
}
else
{
tp->tm_isdst = 0;
tp->tm_zone = "GMT";
tp->tm_gmtoff = 0L;
}
if (__offtime (timer, tp->tm_gmtoff - leap_correction, tp))
tp->tm_sec += leap_extra_secs;
else
tp = NULL;
}
__libc_lock_unlock (tzset_lock);
return tp;
}

good，问题出来了，锁在__tz_convert()中！
那也就是我每个worker线程调用localtime()函数就有很大问题了，相互之间有竞争抢锁问题
网上搜了下，用gettimeofday稍微合适点，但是也有系统调用问题
知道原因就简单了，解决：在某个线程中每10ms调用一次gettimeofday，把时间字符串构造出来，赋值给一个全局变量。每个worker线程直接去读取这个时间字符串变量就好了，毕竟目前我们日志的精度到秒就行了。
改完后试了下，锁没了，性能上去了。

问题：这个故事告诉我们，glibc提供的函数，不是随便可以调用的。

阅读(5885) | 评论(0) | 转发(0) |

上一篇：Redis random()函数问题

下一篇：没有了

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6