Common Linux Kernel Macros and Data Structures, Part 2 (Essential for Learning the Kernel) - dengjin

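A hash table suits cases where the elements do not need to be kept sorted and the only requirement is to find a given element quickly. It trades space for time. At bottom it is still a linear list, but one large list is split into many small ones; because a lookup only has to scan one small list, it is much faster than a linear scan of the whole large list. Ideally, with N small lists the search is N times faster. The heads of the small lists are usually gathered into an array whose size is the number of small lists (buckets).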

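The key to hash table speed is the design of the hash function. From fixed fields of each element, the hash function computes an index smaller than the number of buckets (0 to N-1), which selects the list the element is stored in. For given field values the result is always the same, while different field values should spread as evenly as possible over all indexes: the more even the distribution, the closer the small lists are in length and the better the search performance. The hash function should also be as simple as possible to keep the computation cheap; a common approach is to sum the fields and take the result modulo the table size. include/linux/jhash.h already defines several hash functions that can be used directly.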
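
The two paragraphs above can be sketched in plain userspace C. This is only an illustration, not kernel code: every name here (demo_node, demo_table, demo_hash and so on) and the bucket count of 8 are assumptions made for the example. It shows an array of list heads, a "sum the bytes and take the modulus" hash, and a lookup that scans a single bucket.

```c
#include <stddef.h>

#define DEMO_HASH_SIZE 8        /* number of buckets (hypothetical value) */

struct demo_node {
    unsigned int key;           /* the fixed field the hash is computed from */
    int value;
    struct demo_node *next;     /* next element in the same bucket */
};

/* Array of list heads, one per bucket. */
static struct demo_node *demo_table[DEMO_HASH_SIZE];

/* Sum the four bytes of the key and reduce modulo the bucket count. */
static unsigned int demo_hash(unsigned int key)
{
    unsigned int sum = (key & 0xff) + ((key >> 8) & 0xff) +
                       ((key >> 16) & 0xff) + ((key >> 24) & 0xff);

    return sum % DEMO_HASH_SIZE;
}

/* Link the node at the head of the bucket its key hashes to. */
static void demo_insert(struct demo_node *node)
{
    unsigned int idx = demo_hash(node->key);

    node->next = demo_table[idx];
    demo_table[idx] = node;
}

/* Scan only the one bucket selected by the hash. */
static struct demo_node *demo_lookup(unsigned int key)
{
    struct demo_node *n;

    for (n = demo_table[demo_hash(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}
```

Note that the keys 0x01020304 and 0x04030201 land in the same bucket with this hash, because summing bytes ignores their order. Collisions like this are why a better-mixing function such as those in jhash.h is preferred in practice.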

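Hash tables are used heavily in places such as the route cache and the connection tracking table.

For example, connection tracking computes the hash from the tuple: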

// net/ipv4/netfilter/ip_conntrack_core.c

u_int32_t
hash_conntrack(const struct ip_conntrack_tuple *tuple)
{
#if 0
    dump_tuple(tuple);
#endif
    return (jhash_3words(tuple->src.ip,
                         (tuple->dst.ip ^ tuple->dst.protonum),
                         (tuple->src.u.all | (tuple->dst.u.all << 16)),
                         ip_conntrack_hash_rnd) % ip_conntrack_htable_size);
}

// include/linux/jhash.h
static inline u32 jhash_3words(u32 a, u32 b, u32 c, u32 initval)
{
    a += JHASH_GOLDEN_RATIO;
    b += JHASH_GOLDEN_RATIO;
    c += initval;

    __jhash_mix(a, b, c);

    return c;
}

4. Timers (timer)

A Linux kernel timer is described by the following structure:

/* include/linux/timer.h */
struct timer_list {
    struct list_head list;
    unsigned long expires;
    unsigned long data;
    void (*function)(unsigned long);
};

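list: links the timer into the timer list
expires: the expiry time, in jiffies
function: the expiry function, called when the timer expires
data: the argument passed to the expiry function; in practice usually a pointer to a structure, cast to unsigned long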

Timer operations:

Add a timer, linking it into the system timer list:
extern void add_timer(struct timer_list * timer);

Delete a timer, unlinking it from the system timer list:
extern int del_timer(struct timer_list * timer);
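(del_timer() returns 0 when the timer was not pending, that is, it was no longer on the system timer list because it had already expired or been deleted; it returns 1 when it removed a pending timer.)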

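On SMP systems it is better to delete a timer with the following function, which avoids races with the expiry function running on another CPU: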
extern int del_timer_sync(struct timer_list * timer);

Modify a timer, changing its expiry time:
int mod_timer(struct timer_list *timer, unsigned long expires);

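Typical usage:
struct timer_list is usually embedded as a member of a larger data structure. The timer is initialized when the structure is initialized, and its function defines what happens on expiry. Timers can drive scheduled actions, but more often they handle timeouts, with the timer function releasing resources when the timeout fires. Note: the expiry function runs in the bottom half of the timer interrupt, so it must not sleep or do complex work. A complex operation, such as sending data after expiry, should not be done directly in the expiry function; instead, the expiry function should wake up a dedicated kernel thread and let that thread do the work in process context.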
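
The embedding pattern described above can be sketched in userspace C. This is a simulation rather than kernel code: sim_timer only mirrors the expires, data, and function fields of struct timer_list, and sim_session, sim_session_init, and sim_run_timer are hypothetical names standing in for a real object and for the kernel's timer processing. It assumes unsigned long is wide enough to hold a pointer, as it is on Linux.

```c
#include <stddef.h>

/* Userspace stand-in mirroring the fields of struct timer_list (no list member). */
struct sim_timer {
    unsigned long expires;
    unsigned long data;
    void (*function)(unsigned long);
    int pending;                /* 1 while the timer is armed */
};

/* Hypothetical object guarded by a timeout. */
struct sim_session {
    int id;
    int released;               /* set by the timeout handler */
    struct sim_timer timeout;   /* timer embedded in the structure */
};

/* Expiry function: recover the owning structure from data and release it. */
static void sim_session_timeout(unsigned long data)
{
    struct sim_session *s = (struct sim_session *)data;

    s->released = 1;
}

/* Initialize the timer together with the structure that owns it. */
static void sim_session_init(struct sim_session *s, int id, unsigned long expires)
{
    s->id = id;
    s->released = 0;
    s->timeout.expires = expires;
    s->timeout.data = (unsigned long)s;     /* pointer carried as unsigned long */
    s->timeout.function = sim_session_timeout;
    s->timeout.pending = 1;
}

/* Stand-in for the kernel's timer processing: fire the timer once it is due. */
static void sim_run_timer(struct sim_timer *t, unsigned long now)
{
    if (t->pending && (long)(now - t->expires) >= 0) {
        t->pending = 0;
        t->function(t->data);
    }
}
```

In real kernel code the same roles would be played by the functions listed earlier: add_timer() arms the timer, and the kernel, not the caller, invokes the expiry function.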

To compare two times, the kernel defines the following macros:

#define time_after(a,b) ((long)(b) - (long)(a) < 0)
#define time_before(a,b) time_after(b,a)

#define time_after_eq(a,b) ((long)(a) - (long)(b) >= 0)
#define time_before_eq(a,b) time_after_eq(b,a)

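There is a trick here: time in Linux is an unsigned value, and the macros convert it to a signed value before comparing, which takes care of counter wraparound. This only covers a single wraparound, of course: the two times must differ by less than half the range of unsigned long, and a second wraparound cannot be detected. Experiment with it yourself to get a feel for the behavior.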
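
The wraparound behavior can be tried out in userspace with a small sketch. The function names below are made up for the demo. demo_time_after() subtracts in unsigned arithmetic and then casts the difference to long, a variant of the macro above chosen here so that the example avoids signed overflow in portable C; it assumes the usual two's complement conversion.

```c
#include <limits.h>

/* Wraparound-safe "a is after b", in the spirit of time_after(). */
static int demo_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Naive comparison: gives the wrong answer once the counter has wrapped. */
static int naive_time_after(unsigned long a, unsigned long b)
{
    return a > b;
}
```

With before = ULONG_MAX - 5 and after = before + 10 (which wraps around to 4), demo_time_after(after, before) is true while naive_time_after(after, before) is false.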

5. Kernel threads (kernel_thread)

A new kernel thread can be created with the kernel_thread() function, which is defined in kernel/fork.c:

long kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)

fn: the main function of the kernel thread;
arg: the argument passed to the main function;
flags: the flags used to create the thread;

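A kernel thread function usually calls daemonize() to detach itself and run as an independent background thread, and then sets some thread attributes such as its name and signal handling, although none of this is mandatory. It then enters an endless loop, which is the main body of the thread. This loop must not run continuously, or the system will hang right there. Either it is event driven, sleeping until an event arrives, waking to handle it, and going back to sleep; or it sleeps for a fixed interval, does its work on waking, and sleeps again; or it adds itself to a wait queue and relies on schedule() to be given CPU time. In short, it must never monopolize the CPU.

The following example of a kernel thread is taken from kernel/context.c: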

int start_context_thread(void)
{
    static struct completion startup __initdata = COMPLETION_INITIALIZER(startup);

    kernel_thread(context_thread, &startup, CLONE_FS | CLONE_FILES);
    wait_for_completion(&startup);
    return 0;
}

static int context_thread(void *startup)
{
    struct task_struct *curtask = current;
    DECLARE_WAITQUEUE(wait, curtask);
    struct k_sigaction sa;

    daemonize();
    strcpy(curtask->comm, "keventd");
    keventd_running = 1;
    keventd_task = curtask;

    spin_lock_irq(&curtask->sigmask_lock);
    siginitsetinv(&curtask->blocked, sigmask(SIGCHLD));
    recalc_sigpending(curtask);
    spin_unlock_irq(&curtask->sigmask_lock);

    complete((struct completion *)startup);

    /* Install a handler so SIGCLD is delivered */
    sa.sa.sa_handler = SIG_IGN;
    sa.sa.sa_flags = 0;
    siginitset(&sa.sa.sa_mask, sigmask(SIGCHLD));
    do_sigaction(SIGCHLD, &sa, (struct k_sigaction *)0);

    /*
     * If one of the functions on a task queue re-adds itself
     * to the task queue we call schedule() in state TASK_RUNNING
     */
    for (;;) {
        set_task_state(curtask, TASK_INTERRUPTIBLE);
        add_wait_queue(&context_task_wq, &wait);
        if (TQ_ACTIVE(tq_context))
            set_task_state(curtask, TASK_RUNNING);
        schedule();
        remove_wait_queue(&context_task_wq, &wait);
        run_task_queue(&tq_context);
        wake_up(&context_task_done);
        if (signal_pending(curtask)) {
            while (waitpid(-1, (unsigned int *)0, __WALL|WNOHANG) > 0)
                ;
            spin_lock_irq(&curtask->sigmask_lock);
            flush_signals(curtask);
            recalc_sigpending(curtask);
            spin_unlock_irq(&curtask->sigmask_lock);
        }
    }
}

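6. Structure addresses

In C, the address of a structure is the same as the address of its first member, so the Linux kernel often uses the address of the first member to stand for the address of the structure. Keep this in mind when reading code. It is the same idea as the list_entry macro, which recovers the structure address from a member address by subtracting the member's offset; for the first member that offset is zero.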

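For example:
struct my_struct {
    int a;
    int b;
} c;

/* cast both sides to void * so the two pointer types can be compared */
if ((void *)&c == (void *)&c.a) { // always true
    ...
}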
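
The relationship between a member address and the structure address can be checked with a small userspace sketch. demo_item and demo_entry are names invented for this example; demo_entry is written in the style of list_entry using the standard offsetof macro and is not the kernel's own definition.

```c
#include <stddef.h>

struct demo_item {
    int a;      /* first member: offset 0 */
    int b;      /* later member: non-zero offset */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

For a variable struct demo_item item, both demo_entry(&item.a, struct demo_item, a) and demo_entry(&item.b, struct demo_item, b) yield &item. With the first member the subtraction is by zero, which is exactly the shortcut described in this section.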
