Source Code Analysis of strtok() and strtok_r() - haicg - ChinaUnix Blog

Source Code Analysis of strtok() and strtok_r()

Category: C/C++

2013-08-07 23:00:12

The source code of strtok():

#include <string.h>

static char *olds;

#undef strtok

/* Parse S into tokens separated by characters in DELIM.
   If S is NULL, the last string strtok() was called with is
   used.  For example:
	char s[] = "-abc-=-def";
	x = strtok(s, "-");		// x = "abc"
	x = strtok(NULL, "-=");		// x = "def"
	x = strtok(NULL, "=");		// x = NULL
		// s = "abc\0=-def\0"
*/
char *
strtok (s, delim)
     char *s;
     const char *delim;
{
  char *token;

  if (s == NULL)
    s = olds;

  /* Scan leading delimiters.  */
  s += strspn (s, delim);	// advance s to the first character that is not in delim
  if (*s == '\0')
    {
      olds = s;
      return NULL;
    }

  /* Find the end of the token.  */
  token = s;
  s = strpbrk (token, delim);	// find the first character of the token that appears in delim
  if (s == NULL)
    /* This token finishes the string.  */
    olds = __rawmemchr (token, '\0');
  else
    {
      /* Terminate the token and make OLDS point past it.  */
      *s = '\0';
      olds = s + 1;
    }
  return token;
}
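The example in the comment of the source above can be checked directly. The following is a minimal sketch, not part of glibc; the function name run_comment_example is made up for illustration. It replays the three calls and reports whether the tokens match the comment. Note that the leading '-' stays in the buffer, which the last line of the upstream comment leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): replays the three calls from
   the comment on a writable buffer and returns 1 if the tokens match. */
int run_comment_example(char *s)
{
    assert(s != NULL);

    char *x = strtok(s, "-");       /* skips the leading '-', yields "abc" */
    if (x == NULL || strcmp(x, "abc") != 0)
        return 0;

    x = strtok(NULL, "-=");         /* skips "=-", yields "def" */
    if (x == NULL || strcmp(x, "def") != 0)
        return 0;

    x = strtok(NULL, "=");          /* nothing left to split */
    return x == NULL;
}
```

Called on char s[] = "-abc-=-def", the function leaves the bytes -abc\0=-def in the buffer: only the '-' right after "abc" is overwritten with '\0'.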

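As the source above shows, strtok() uses a static pointer variable to remember where the next split should start. When several threads call the function at the same time, that static pointer is overwritten unpredictably and the results become wrong. The same problem appears within a single thread when two strings are parsed in an interleaved fashion.
The code is simple and needs no detailed explanation. The main point is that this function is not thread-safe, so use it with care.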
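The interleaved case can be reproduced without any threads. The following is a minimal sketch; the helper name count_records_nested_strtok and the "a,b;c,d" record format are assumptions made for illustration. The inner loop overwrites the hidden static pointer, so the outer loop stops after the first record.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tries to count the ';'-separated records in text
   while splitting each record on ',' with a nested strtok() loop. */
size_t count_records_nested_strtok(char *text)
{
    assert(text != NULL);

    size_t records = 0;
    for (char *rec = strtok(text, ";"); rec != NULL; rec = strtok(NULL, ";")) {
        records++;
        /* The inner calls replace the position saved by the outer call. */
        for (char *f = strtok(rec, ","); f != NULL; f = strtok(NULL, ","))
            ;
    }
    return records;
}
```

For the input "a,b;c,d" the function reports one record instead of two: once the inner loop runs to the end of "a,b", the outer strtok(NULL, ";") resumes from the inner loop's saved position and finds nothing.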
Next, a look at strtok_r():


#include <string.h>

#undef strtok_r
#undef __strtok_r

#ifndef _LIBC
/* Get specification.  */
# include "strtok_r.h"
# define __strtok_r strtok_r
# define __rawmemchr strchr
#endif

/* Parse S into tokens separated by characters in DELIM.
   If S is NULL, the saved pointer in SAVE_PTR is used as
   the next starting point.  For example:
	char s[] = "-abc-=-def";
	char *sp;
	x = strtok_r(s, "-", &sp);	// x = "abc", sp = "=-def"
	x = strtok_r(NULL, "-=", &sp);	// x = "def", sp = NULL
	x = strtok_r(NULL, "=", &sp);	// x = NULL
		// s = "abc\0-def\0"
*/
char *
__strtok_r (char *s, const char *delim, char **save_ptr)
{
  char *token;

  if (s == NULL)
    s = *save_ptr;

  /* Scan leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* Find the end of the token.  */
  token = s;
  s = strpbrk (token, delim);
  if (s == NULL)
    /* This token finishes the string.  */
    *save_ptr = __rawmemchr (token, '\0');
  else
    {
      /* Terminate the token and make *SAVE_PTR point past it.  */
      *s = '\0';
      *save_ptr = s + 1;
    }
  return token;
}
#ifdef weak_alias
libc_hidden_def (__strtok_r)
weak_alias (__strtok_r, strtok_r)
#endif

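As the code above shows, this function stores the start address of the next split through the save_ptr pointer, that is, in storage supplied by the caller. This removes the shared static state that makes strtok() thread-unsafe.
That is the main difference between the two functions. When using strtok_r(), make sure the save_ptr you pass is the same variable on every call for a given string, because it holds the start of the next split. Do not use a fresh pointer variable for each call.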
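One save pointer per string being parsed is what makes nested or concurrent tokenizing work. The following is a minimal sketch; the helper name count_fields_strtok_r and the "a,b;c,d" record format are assumptions made for illustration, and the _POSIX_C_SOURCE define is only there so that strtok_r() is declared under strict -std= modes.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the ','-separated fields across all
   ';'-separated records and stores the record count in *records_out. */
size_t count_fields_strtok_r(char *text, size_t *records_out)
{
    assert(text != NULL && records_out != NULL);

    char *rec_save;     /* position of the outer (record) loop */
    char *field_save;   /* position of the inner (field) loop  */
    size_t fields = 0;

    *records_out = 0;
    for (char *rec = strtok_r(text, ";", &rec_save); rec != NULL;
         rec = strtok_r(NULL, ";", &rec_save)) {
        (*records_out)++;
        /* The same field_save is passed to every call of this inner loop. */
        for (char *f = strtok_r(rec, ",", &field_save); f != NULL;
             f = strtok_r(NULL, ",", &field_save))
            fields++;
    }
    return fields;
}
```

Because each loop keeps its own save pointer and reuses it on every call, the input "a,b;c,d" yields two records and four fields.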

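Notes:
1. Both functions modify the original string.
2. Neither function causes a memory leak. They only overwrite delimiter characters with '\0' and allocate nothing, so freeing the original buffer is unaffected.
3. Neither function can take a const char * as its string argument. A declaration such as char *p = "hello world" makes p point to a string literal, which is usually stored in read-only memory. Modifying it is undefined behavior, so copy the text into a writable array first.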
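The notes above suggest a common pattern: copy read-only text into a writable buffer and tokenize the copy. The following is a minimal sketch; the helper name split_copy and its parameters are assumptions made for illustration. It allocates nothing, and the returned tokens point into the caller's buffer.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies input into buf, splits the copy on delim and
   stores up to max_tokens token pointers in tokens. Returns the number of
   tokens stored, or 0 if the copy does not fit in buf. */
size_t split_copy(const char *input, const char *delim,
                  char *buf, size_t buf_size,
                  char **tokens, size_t max_tokens)
{
    assert(input != NULL && delim != NULL && buf != NULL && tokens != NULL);

    size_t len = strlen(input);
    if (len + 1 > buf_size)
        return 0;                    /* the copy would not fit */
    memcpy(buf, input, len + 1);     /* the input itself is never modified */

    size_t n = 0;
    char *save;
    for (char *t = strtok_r(buf, delim, &save);
         t != NULL && n < max_tokens;
         t = strtok_r(NULL, delim, &save))
        tokens[n++] = t;             /* each token points into buf */
    return n;
}
```

With the input "hello world" and the delimiter " ", the literal stays intact while the space in the copy is replaced by '\0'. Since no memory is allocated, there is nothing to free beyond the caller's own buffer.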

References:

[1] Thread safety: strtok vs. strtok_r
[2] glibc source code: strtok.c and strtok_r.c
