6.2 The Linux Timekeeping Architecture - wangzhen11aaa - ChinaUnix blog

6.2 The Linux Timekeeping Architecture

Category: LINUX

2011-10-08 09:20:46

Linux must carry out several time-related activities. For instance, the kernel periodically:
   Updates the time elapsed since system startup.
   Updates the time and date.
   Determines, for every CPU, how long the current process has been running, and preempts it if it has exceeded the time allocated to it. The allocation of time slots (also called "quanta") is discussed in Chapter 7.
   Updates resource usage statistics.
   Checks whether the interval of time associated with each software timer (see the later section "Software Timers and Delay Functions") has elapsed.
Linux's timekeeping architecture is the set of kernel data structures and functions related to the flow of time. Actually, 80x86-based multiprocessor machines have a timekeeping architecture that is slightly different from that of uniprocessor machines:
   In a uniprocessor system, all timekeeping activities are triggered by interrupts raised by the global timer (either the Programmable Interval Timer or the High Precision Event Timer).
   In a multiprocessor system, all general activities (such as handling of software timers) are triggered by the interrupts raised by the global timer, while CPU-specific activities (such as monitoring the execution time of the currently running process) are triggered by the interrupts raised by the local APIC timer.
Unfortunately, the distinction between the two cases is somewhat blurred. For instance, some early SMP systems based on Intel 80486 processors didn't have local APICs. Even nowadays, there are SMP motherboards so buggy that local timer interrupts are not usable at all. In these cases, the SMP kernel must resort to the UP timekeeping architecture. On the other hand, recent uniprocessor systems feature one local APIC, so the UP kernel often makes use of the SMP timekeeping architecture. However, to simplify our description, we won't discuss these hybrid cases and will stick to the two "pure" timekeeping architectures.
Linux's timekeeping architecture also depends on the availability of the Time Stamp Counter (TSC), of the ACPI Power Management Timer, and of the High Precision Event Timer (HPET). The kernel uses two basic timekeeping functions: one to keep the current time up-to-date and another to count the number of nanoseconds that have elapsed within the current second. There are different ways to get the latter value. Some methods are more precise and are available if the CPU has a Time Stamp Counter or an HPET; a less precise method is used in the opposite case (see the later section "The time( ) and gettimeofday( ) System Calls").
6.2.1. Data Structures of the Timekeeping Architecture
The timekeeping architecture of Linux 2.6 makes use of a large number of data structures. As usual, we will describe the most important variables by referring to the 80x86 architecture.
   6.2.1.1. The timer object
   In order to handle the possible timer sources in a uniform way, the kernel makes use of a "timer object," which is a descriptor of type timer_opts consisting of the timer name and of four standard methods shown in Table 6-1.
Table 6-1. The fields of the timer_opts data structure
Field name        Description
name              A string identifying the timer source
mark_offset       Records the exact time of the last tick; it is invoked by the timer interrupt handler
get_offset        Returns the time elapsed since the last tick
monotonic_clock   Returns the number of nanoseconds since the kernel initialization
delay             Waits for a given number of "loops" (see the later section "Delay Functions")
(Translator's note: every field after name is a method, that is, a function pointer.)
The most important methods of the timer object are mark_offset and get_offset. The mark_offset method is invoked by the timer interrupt handler, and records in a suitable data structure the exact time at which the tick occurred. Using the saved value, the get_offset method computes the time in microseconds elapsed since the last timer interrupt (tick). Thanks to these two methods, the Linux timekeeping architecture achieves a sub-tick resolution; that is, the kernel is able to determine the current time with a precision much higher than the tick duration. This operation is called time interpolation.
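The interplay between mark_offset and get_offset can be sketched in user space. This is a minimal model, not the kernel's code: the struct, the function names, and the idea of a free-running cycle counter with a known frequency are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a timer object that interpolates time between
 * ticks by means of a free-running cycle counter (TSC-like). */
struct interp_timer {
    uint64_t cycles_per_usec;   /* assumed counter frequency, cycles per microsecond */
    uint64_t last_tick_cycles;  /* counter value saved at the last tick */
};

/* Analogue of mark_offset: called on every tick, saves the counter value. */
void interp_mark_offset(struct interp_timer *t, uint64_t now_cycles)
{
    t->last_tick_cycles = now_cycles;
}

/* Analogue of get_offset: microseconds elapsed since the last tick. */
uint64_t interp_get_offset(const struct interp_timer *t, uint64_t now_cycles)
{
    return (now_cycles - t->last_tick_cycles) / t->cycles_per_usec;
}

/* Current time in microseconds: the tick-granular time plus the sub-tick offset. */
uint64_t interp_now_usec(const struct interp_timer *t,
                         uint64_t usec_at_last_tick, uint64_t now_cycles)
{
    return usec_at_last_tick + interp_get_offset(t, now_cycles);
}
```

With a counter running at 2000 cycles per microsecond, a reading taken 500,000 cycles after the last tick yields an offset of 250 microseconds, which is far finer than a 1 ms tick.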
The cur_timer variable stores the address of the timer object corresponding to the "best" timer source available in the system. Initially, cur_timer points to timer_none, which is the object corresponding to a dummy timer source used when the kernel is being initialized. During kernel initialization, the select_timer( ) function sets cur_timer to the address of the appropriate timer object. Table 6-2 shows the most common timer objects used in the 80x86 architecture, in order of preference. As you see, select_timer( ) selects the HPET, if available; otherwise, it selects the ACPI Power Management Timer, if available, or the TSC. As the last resort, select_timer( ) selects the always-present PIT. The "Time interpolation" column lists the timer sources used by the mark_offset and get_offset methods of the timer object; the "Delay" column lists the timer sources used by the delay method.
Table 6-2. Typical timer objects of the 80x86 architecture, in order of preference
Timer object name   Description                                                       Time interpolation   Delay
timer_hpet          High Precision Event Timer (HPET)                                 HPET                 HPET
timer_pmtmr         ACPI Power Management Timer (ACPI PMT)                            ACPI PMT             TSC
timer_tsc           Time Stamp Counter (TSC)                                          TSC                  TSC
timer_pit           Programmable Interval Timer (PIT)                                 PIT                  Tight loop
timer_none          Generic dummy timer source (used during kernel initialization)   (none)               Tight loop
Notice that local APIC timers do not have a corresponding timer object. The reason is that local APIC timers are used only to generate periodic interrupts and are never used to achieve sub-tick resolution.
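The preference order of Table 6-2 amounts to a chain of availability checks. The sketch below only mirrors that order; the hw_caps struct and pick_timer( ) are hypothetical names, and the real select_timer( ) works on timer objects rather than on strings.

```c
#include <stdbool.h>

/* Hypothetical description of the timer hardware found at boot. */
struct hw_caps {
    bool has_hpet;
    bool has_pmtmr;
    bool has_tsc;
};

/* Returns the name of the preferred timer object, following the
 * order of preference given in Table 6-2. */
const char *pick_timer(const struct hw_caps *hw)
{
    if (hw->has_hpet)
        return "timer_hpet";
    if (hw->has_pmtmr)
        return "timer_pmtmr";
    if (hw->has_tsc)
        return "timer_tsc";
    return "timer_pit";   /* last resort: the PIT is always present */
}
```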
6.2.1.2. The jiffies variable
The jiffies variable is a counter that stores the number of elapsed ticks since the system was started. It is increased by one when a timer interrupt occurs; that is, on every tick. In the 80x86 architecture, jiffies is a 32-bit variable, therefore it wraps around in approximately 50 days, a relatively short time interval for a Linux server. However, the kernel handles the overflow of jiffies cleanly thanks to the time_after, time_after_eq, time_before, and time_before_eq macros: they yield the correct value even if a wraparound occurred.
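The wraparound-safe comparison rests on reinterpreting an unsigned difference as a signed value. The following user-space sketch shows the idea with fixed-width types; the tick_* names are made up for this example, and the cast assumes the usual two's complement conversion.

```c
#include <stdint.h>
#include <stdbool.h>

/* Analogue of time_after(a, b): true if tick count a is later than b.
 * The unsigned difference is reinterpreted as signed, so the result
 * stays correct across a wraparound as long as the two values are
 * less than 2^31 ticks apart (two's complement conversion assumed). */
bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Analogue of time_before(a, b): true if a is earlier than b. */
bool tick_before(uint32_t a, uint32_t b)
{
    return tick_after(b, a);
}

/* Analogue of time_after_eq(a, b): true if a is later than or equal to b. */
bool tick_after_eq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}
```

A timeout computed as 0xfffffff0 + 0x20 wraps to 0x10; a plain `timeout > now` test would wrongly report that it has already expired, while tick_after( ) still orders the two values correctly.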
You might suppose that jiffies is initialized to zero at system startup. Actually, this is not the case: jiffies is initialized to 0xfffb6c20, which corresponds to the 32-bit signed value -300,000; therefore, the counter will overflow five minutes after the system boot. This is done on purpose, so that buggy kernel code that does not check for the overflow of jiffies shows up very soon in the developing phase and does not pass unnoticed in stable kernels.
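The initial value can be checked with a little unsigned arithmetic: 0xfffb6c20 equals 2^32 - 300,000, so at 1000 ticks per second the counter reaches zero after 300 seconds. The sketch below assumes that tick rate; SKETCH_HZ and both function names are invented for this example.

```c
#include <stdint.h>

#define SKETCH_HZ 1000u   /* assumed tick rate: 1000 ticks per second */

/* Initial value of a 32-bit tick counter that must wrap to zero
 * `seconds` seconds after startup. */
uint32_t initial_ticks(uint32_t seconds)
{
    return (uint32_t)0 - seconds * SKETCH_HZ;
}

/* Number of ticks left before a 32-bit counter holding the nonzero
 * value `start` wraps around to zero. */
uint32_t ticks_until_wrap(uint32_t start)
{
    return (uint32_t)0 - start;
}
```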
In a few cases, however, the kernel needs the real number of system ticks elapsed since the system boot, regardless of the overflows of jiffies. Therefore, in the 80x86 architecture the jiffies variable is equated by the linker to the 32 least significant bits of a 64-bit counter called jiffies_64. With a tick of 1 millisecond, the jiffies_64 variable wraps around in several hundreds of millions of years, thus we can safely assume that it never overflows.
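Both claims (the 32-bit counter is the low half of the 64-bit one, and the two counters wrap after very different intervals) can be verified with a few lines of C. This is only an arithmetic sketch; the function names are hypothetical and a year is taken as 365 days.

```c
#include <stdint.h>

/* The 32-bit tick counter seen as the low half of the 64-bit one. */
uint32_t low_half(uint64_t ticks64)
{
    return (uint32_t)ticks64;
}

/* Whole days before a 32-bit counter wraps at `hz` ticks per second. */
uint32_t days_to_wrap32(uint32_t hz)
{
    return UINT32_MAX / hz / 86400u;
}

/* Whole 365-day years before a 64-bit counter wraps at `hz` ticks per second. */
uint64_t years_to_wrap64(uint32_t hz)
{
    return UINT64_MAX / hz / 86400u / 365u;
}
```

At 1000 ticks per second the 32-bit counter lasts 49 whole days, while the 64-bit one lasts several hundred million years, in line with the figures quoted in the text.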
You might wonder why jiffies has not been directly declared as a 64-bit unsigned long long integer on the 80x86 architecture. The answer is that accesses to 64-bit variables in 32-bit architectures cannot be done atomically. Therefore, every read operation on the whole 64 bits requires some synchronization technique to ensure that the counter is not updated while the two 32-bit half-counters are read; as a consequence, every 64-bit read operation is significantly slower than a 32-bit read operation.

The get_jiffies_64( ) function reads the value of jiffies_64 and returns its value:
    unsigned long long get_jiffies_64(void)
    {
        unsigned long seq;
        unsigned long long ret;
        do {
            seq = read_seqbegin(&xtime_lock); /* sample the sequence counter before reading */
            ret = jiffies_64;   /* translator's note: jiffies_64 is a 64-bit unsigned long long */
        } while (read_seqretry(&xtime_lock, seq)); /* retry if a writer intervened */
        return ret;
    }
The 64-bit read operation is protected by the xtime_lock seqlock (see the section "Seqlocks" in Chapter 5): the function keeps reading the jiffies_64 variable until it knows for sure that it has not been concurrently updated by another kernel control path.
Conversely, the critical region increasing the jiffies_64 variable must be protected by means of write_seqlock(&xtime_lock) and write_sequnlock(&xtime_lock). Notice that the ++jiffies_64 instruction also increases the 32-bit jiffies variable, because the latter corresponds to the lower half of jiffies_64.
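The read-retry protocol can be modeled in user space with C11 atomics. The sketch below is an illustration under stated assumptions, not the kernel's seqlock: it assumes a single writer, stores the 64-bit counter as two 32-bit halves to mimic a 32-bit machine, and uses the default sequentially consistent atomic operations for simplicity. All names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit tick counter kept as two 32-bit halves
 * and guarded by a sequence counter (even = stable, odd = write in
 * progress). */
struct seq_counter64 {
    atomic_uint seq;
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

void seq_init(struct seq_counter64 *c, uint64_t start)
{
    atomic_init(&c->seq, 0);
    atomic_init(&c->lo, (uint32_t)start);
    atomic_init(&c->hi, (uint32_t)(start >> 32));
}

/* Writer side (single writer assumed): make seq odd, update both
 * halves, then make seq even again. */
void seq_increment(struct seq_counter64 *c)
{
    uint32_t lo, hi;

    atomic_fetch_add(&c->seq, 1);        /* write_seqlock analogue */
    lo = atomic_load(&c->lo) + 1;
    hi = atomic_load(&c->hi);
    if (lo == 0)
        hi++;                            /* carry into the high half */
    atomic_store(&c->lo, lo);
    atomic_store(&c->hi, hi);
    atomic_fetch_add(&c->seq, 1);        /* write_sequnlock analogue */
}

/* Reader side: retry while a write is in progress or one completed
 * between the two samples of the sequence counter. */
uint64_t seq_read(struct seq_counter64 *c)
{
    unsigned s;
    uint32_t lo, hi;

    do {
        s = atomic_load(&c->seq);        /* read_seqbegin analogue */
        lo = atomic_load(&c->lo);
        hi = atomic_load(&c->hi);
    } while ((s & 1u) || atomic_load(&c->seq) != s);  /* read_seqretry analogue */
    return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer; it simply repeats the two 32-bit loads until it observes the same even sequence value before and after them.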
6.2.1.3. The xtime variable
The xtime variable stores the current time and date; it is a structure of type timespec having two fields:
   tv_sec    Stores the number of seconds that have elapsed since midnight of January 1, 1970 (UTC)
   tv_nsec   Stores the number of nanoseconds that have elapsed within the last second (its value ranges between 0 and 999,999,999)
The xtime variable is usually updated once in a tick; that is, roughly 1000 times per second. (Translator's note: the tick here is the interrupt raised by the 8253/8254 PIT chip.) As we'll see in the later section "System Calls Related to Timing Measurements," user programs get the current time and date from the xtime variable. The kernel also often refers to it, for instance, when updating inode timestamps (see the section "File Descriptor and Inode" in Chapter 1).

The xtime_lock seqlock avoids the race conditions that could occur due to concurrent accesses to the xtime variable. Remember that xtime_lock also protects the jiffies_64 variable; in general, this seqlock is used to define several critical regions of the timekeeping architecture.

6.2.2. Timekeeping Architecture in Uniprocessor Systems
In a uniprocessor system, all time-related activities are triggered by the interrupts raised by the Programmable Interval Timer on IRQ line 0. As usual, in Linux, some of these activities are executed as soon as possible right after the interrupt is raised, while the remaining activities are carried out by deferrable functions (see the later section "Dynamic Timers").
6.2.2.1. Initialization phase
During kernel initialization, the time_init( ) function is invoked to set up the timekeeping architecture. It usually[*] performs the following operations:
1. Initializes the xtime variable. The number of seconds elapsed since the midnight of January 1, 1970 is read from the Real Time Clock by means of the get_cmos_time( ) function. The tv_nsec field of xtime is set so that the forthcoming overflow of the jiffies variable will coincide with an increment of the tv_sec field; that is, it will fall on a second boundary.
2. Initializes the wall_to_monotonic variable. This variable is of the same type timespec as xtime, and it essentially stores the number of seconds and nanoseconds to be added to xtime in order to get a monotonic (ever increasing) flow of time. In fact, both leap seconds and synchronization with external clocks might suddenly change the tv_sec and tv_nsec fields of xtime so that they are no longer monotonically increased. As we'll see in the later section "System Calls for POSIX Timers," sometimes the kernel needs a truly monotonic time source.
3. If the kernel supports HPET, it invokes the hpet_enable( ) function to determine whether the ACPI firmware has probed the chip and mapped its registers in the memory address space. In the affirmative case, hpet_enable( ) programs the first timer of the HPET chip so that it raises the IRQ 0 interrupt 1000 times per second. Otherwise, if the HPET chip is not available, the kernel will use the PIT: the chip has already been programmed by the init_IRQ( ) function to raise 1000 timer interrupts per second, as described in the earlier section "Programmable Interval Timer (PIT)."
4. Invokes select_timer( ) to select the best timer source available in the system, and sets the cur_timer variable to the address of the corresponding timer object.
5. Invokes setup_irq(0, &irq0) to set up the interrupt gate corresponding to IRQ 0, the line associated with the system timer interrupt source (PIT or HPET). The irq0 variable is statically defined as:
    struct irqaction irq0 = { timer_interrupt, SA_INTERRUPT, 0,
                              "timer", NULL, NULL };
From now on, the timer_interrupt( ) function will be invoked once every tick with interrupts disabled, because the status field of IRQ 0's main descriptor has the SA_INTERRUPT flag set. (Translator's note: a handler registered with SA_INTERRUPT is a "fast" handler.)
[*] The time_init( ) function is executed before mem_init( ), which initializes the memory data structures. Unfortunately, the HPET registers are memory mapped, therefore initialization of the HPET chip has to be done after the execution of mem_init( ). Linux 2.6 adopts a cumbersome solution: if the kernel supports the HPET chip, the time_init( ) function limits itself to triggering the activation of the hpet_time_init( ) function. The latter function is executed after mem_init( ) and performs the operations described in this section.
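The role of wall_to_monotonic in the initialization steps above can be illustrated with a small model: monotonic time is the sum of the wall time and an offset, and whenever the wall clock is stepped, the offset is moved by the opposite amount. This is a sketch under assumptions; the struct ts type and the function names are hypothetical, and the compensation rule is shown only to make the idea concrete.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical seconds/nanoseconds pair, modeled after timespec. */
struct ts {
    int64_t sec;
    long nsec;
};

/* Adds two pairs and normalizes nsec into the range [0, NSEC_PER_SEC). */
struct ts ts_add(struct ts a, struct ts b)
{
    struct ts r = { a.sec + b.sec, a.nsec + b.nsec };

    while (r.nsec >= NSEC_PER_SEC) {
        r.nsec -= NSEC_PER_SEC;
        r.sec++;
    }
    while (r.nsec < 0) {
        r.nsec += NSEC_PER_SEC;
        r.sec--;
    }
    return r;
}

/* Steps the wall clock by `delta` and moves the offset by the opposite
 * amount, so that wall + offset (the monotonic time) does not jump. */
void step_wall_clock(struct ts *wall, struct ts *wall_to_mono, struct ts delta)
{
    struct ts neg = { -delta.sec, -delta.nsec };

    *wall = ts_add(*wall, delta);
    *wall_to_mono = ts_add(*wall_to_mono, neg);
}
```

Setting the wall clock back by 5.2 seconds changes the wall time but leaves the sum of the two variables untouched, which is exactly the property a monotonic clock needs.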
6.2.2.2. The timer interrupt handler
The timer_interrupt( ) function is the interrupt service routine (ISR) of the PIT or of the HPET; it performs the following steps:
1. Protects the time-related kernel variables by issuing a write_seqlock( ) on the xtime_lock seqlock (see the section "Seqlocks" in Chapter 5).
2. Executes the mark_offset method of the cur_timer timer object. As explained in the earlier section "Data Structures of the Timekeeping Architecture," there are four possible cases:
   a. cur_timer points to the timer_hpet object: in this case, the HPET chip is the source of timer interrupts. The mark_offset method checks that no timer interrupt has been lost since the last tick; if, unlikely as it is, some interrupt was lost, it updates jiffies_64 accordingly. Next, the method records the current value of the periodic HPET counter.
   b. cur_timer points to the timer_pmtmr object: in this case, the PIT chip is the source of timer interrupts, but the kernel uses the ACPI Power Management Timer to measure time with a finer resolution. The mark_offset method checks that no timer interrupt has been lost since the last tick and updates jiffies_64 if necessary. Then, it records the current value of the ACPI Power Management Timer counter.
   c. cur_timer points to the timer_tsc object: in this case, the PIT chip is the source of timer interrupts, but the kernel uses the Time Stamp Counter to measure time with a finer resolution. The mark_offset method performs the same operations as in the previous case: it checks that no timer interrupt has been lost since the last tick and updates jiffies_64 if necessary. Then, it records the current value of the TSC counter.
   d. cur_timer points to the timer_pit object: in this case, the PIT chip is the source of timer interrupts, and there is no other timer circuit. The mark_offset method does nothing.
3. Invokes the do_timer_interrupt( ) function, which in turn performs the following actions:
   a. Increases by one the value of jiffies_64. Notice that this can be done safely, because the kernel control path still holds the xtime_lock seqlock for writing.
   b. Invokes the update_times( ) function to update the system date and time and to compute the current system load; these activities are discussed later in the sections "Updating the Time and Date" and "Updating System Statistics."
   c. Invokes the update_process_times( ) function to perform several time-related accounting operations for the local CPU (see the section "Updating Local CPU Statistics" later in this chapter).
   d. Invokes the profile_tick( ) function (see the section "Profiling the Kernel Code" later in this chapter).
   e. If the system clock is synchronized with an external clock (an adjtimex( ) system call has been previously issued), invokes the set_rtc_mmss( ) function once every 660 seconds (every 11 minutes) to adjust the Real Time Clock. This feature helps systems on a network synchronize their clocks (see the later section "The adjtimex( ) System Call").
4. Releases the xtime_lock seqlock by invoking write_sequnlock( ).
5. Returns the value 1 to notify that the interrupt has been effectively handled (see the section "I/O Interrupt Handling" in Chapter 4).
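The five steps of the handler can be traced with a toy model. Nothing here is kernel code: the tick_state struct and model_timer_interrupt( ) are hypothetical, the lock is reduced to a sequence counter, and the accounting, profiling, and Real Time Clock actions of the third step are left out.

```c
#include <stdint.h>

/* Hypothetical model state; none of these names are the kernel's. */
struct tick_state {
    unsigned lock_seq;      /* stands in for the xtime_lock sequence counter */
    uint64_t ticks64;       /* stands in for jiffies_64 */
    uint64_t last_mark;     /* value saved by the mark_offset analogue */
    unsigned long updates;  /* how many times the time and date update ran */
};

/* Handles one tick in the order of the five steps described above
 * and returns 1 to signal that the interrupt was handled. */
int model_timer_interrupt(struct tick_state *s, uint64_t hw_counter)
{
    s->lock_seq++;              /* first step: write_seqlock analogue, seq becomes odd */
    s->last_mark = hw_counter;  /* second step: mark_offset analogue */
    s->ticks64++;               /* third step: increase the tick counter ... */
    s->updates++;               /* ... and update the time and date (stub) */
    s->lock_seq++;              /* fourth step: write_sequnlock analogue, seq even again */
    return 1;                   /* fifth step: report the interrupt as handled */
}
```

The point of the ordering is that the counter and the saved hardware value change only while the sequence counter is odd, so a concurrent reader can tell that it must retry.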
6.2.3. Timekeeping Architecture in Multiprocessor Systems
Multiprocessor systems can rely on two different sources of timer interrupts: those raised by the Programmable Interval Timer or the High Precision Event Timer, and those raised by the CPU local timers.
In Linux 2.6, global timer interrupts raised by the PIT or the HPET signal activities not related to a specific CPU, such as handling of software timers and keeping the system time up-to-date. Conversely, a CPU local timer interrupt signals timekeeping activities related to the local CPU, such as monitoring how long the current process has been running and updating the resource usage statistics.
6.2.3.1. Initialization phase
The global timer interrupt handler is initialized by the time_init( ) function, which has already been described in the earlier section "Timekeeping Architecture in Uniprocessor Systems."
The Linux kernel reserves the interrupt vector 239 (0xef) for local timer interrupts (see Table 4-2 in Chapter 4). During kernel initialization, the apic_intr_init( ) function sets up the IDT's interrupt gate corresponding to vector 239 with the address of the low-level interrupt handler apic_timer_interrupt( ). Moreover, each APIC has to be told how often to generate a local timer interrupt. The calibrate_APIC_clock( ) function computes how many bus clock signals are received by the local APIC of the booting CPU during a tick (1 ms). This exact value is then used to program the local APICs in such a way as to generate one local timer interrupt every tick. This is done by the setup_APIC_timer( ) function, which is executed once for every CPU in the system.
All local APIC timers are synchronized because they are based on the common bus clock signal. This means that the value computed by calibrate_APIC_clock( ) for the boot CPU is also good for the other CPUs in the system.
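The quantity that the calibration produces is simply the number of bus clock cycles that fit in one tick. The sketch below shows the arithmetic in two forms, one from a known bus frequency and one from a measurement; both function names are hypothetical, and averaging a count over several ticks is an assumption about how such a measurement could be made, not a description of calibrate_APIC_clock( ).

```c
#include <stdint.h>

/* Bus clock cycles per tick, given the bus frequency and the tick rate.
 * This is the count a local timer would be programmed with in order to
 * fire once per tick. */
uint32_t bus_clocks_per_tick(uint64_t bus_hz, uint32_t tick_hz)
{
    return (uint32_t)(bus_hz / tick_hz);
}

/* The same quantity obtained by measurement: bus clocks counted over
 * `ticks` whole ticks, averaged (assumed measurement scheme). */
uint32_t clocks_per_tick_measured(uint64_t clocks_counted, uint32_t ticks)
{
    return (uint32_t)(clocks_counted / ticks);
}
```

For a 100 MHz bus clock and a 1 ms tick, both forms give 100,000 bus clocks per tick; because every CPU sees the same bus clock, the value obtained on the boot CPU can be reused for the others.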
6.2.3.2. The global timer interrupt handler
The SMP version of the timer_interrupt( ) handler differs from the UP version in a few points:
1. The do_timer_interrupt( ) function, invoked by timer_interrupt( ), writes into a port of the I/O APIC chip to acknowledge the timer IRQ.
2. The update_process_times( ) function is not invoked, because this function performs actions related to a specific CPU.
3. The profile_tick( ) function is not invoked, for the same reason.
6.2.3.3. The local timer interrupt handler
This handler performs the timekeeping activities related to a specific CPU in the system, namely profiling the kernel code and checking how long the current process has been running on a given CPU.
The apic_timer_interrupt( ) assembly language function is equivalent to the following code:
    apic_timer_interrupt:              /* entry label */
        pushl $(239-256)               /* push the vector number minus 256 */
        SAVE_ALL                       /* save the registers on the stack */
        movl %esp, %eax                /* pass the address of the saved registers */
        call smp_apic_timer_interrupt  /* invoke the high-level handler */
        jmp ret_from_intr
As you can see, the low-level handler is very similar to the other low-level interrupt handlers already described in Chapter 4. The high-level interrupt handler called smp_apic_timer_interrupt( ) executes the following steps:
1. Gets the CPU logical number (say, n).
2. Increases the apic_timer_irqs field of the nth entry of the irq_stat array (see the section "Checking the NMI Watchdogs" later in this chapter).
3. Acknowledges the interrupt on the local APIC.
4. Calls the irq_enter( ) function (see the section "The do_IRQ( ) function" in Chapter 4).
5. Invokes the smp_local_timer_interrupt( ) function.
6. Calls the irq_exit( ) function.
The smp_local_timer_interrupt( ) function executes the per-CPU timekeeping activities. Actually, it performs the following main steps:
1. Invokes the profile_tick( ) function (see the section "Profiling the Kernel Code" later in this chapter).
2. Invokes the update_process_times( ) function to check how long the current process has been running and to update some local CPU statistics (see the section "Updating Local CPU Statistics" later in this chapter).
The system administrator can change the sample frequency of the kernel code profiler by writing into the /proc/profile file. To carry out the change, the kernel modifies the frequency at which local timer interrupts are generated. However, the smp_local_timer_interrupt( ) function keeps invoking the update_process_times( ) function exactly once every tick.
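One way to keep the per-tick accounting unaffected by a faster profiling rate is a countdown: profiling runs on every local interrupt, while the accounting runs only when the countdown expires. The sketch below models that idea; the local_timer struct, its fields, and the function names are hypothetical, and the multiplier is assumed to be at least 1.

```c
/* Hypothetical per-CPU state for the local timer. */
struct local_timer {
    unsigned multiplier;               /* local interrupts per tick (at least 1) */
    unsigned countdown;                /* interrupts left before the next full tick */
    unsigned long profile_calls;       /* times the profile_tick analogue ran */
    unsigned long process_time_calls;  /* times the update_process_times analogue ran */
};

void local_timer_init(struct local_timer *t, unsigned multiplier)
{
    t->multiplier = multiplier;
    t->countdown = multiplier;
    t->profile_calls = 0;
    t->process_time_calls = 0;
}

/* Handles one local timer interrupt. */
void local_timer_interrupt(struct local_timer *t)
{
    t->profile_calls++;              /* profiling: on every interrupt */
    if (--t->countdown == 0) {
        t->countdown = t->multiplier;
        t->process_time_calls++;     /* accounting: exactly once per tick */
    }
}
```

With a multiplier of 4, twelve local interrupts produce twelve profiling samples but only three accounting updates, one for each tick.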
