1.6.5.1. Kernel preemption disabling
禁止内核抢夺
To provide a drastically
simple solution to synchronization problems, some traditional Unix
kernels are nonpreemptive: when a process executes in Kernel Mode, it
cannot be arbitrarily suspended and substituted with another process.
Therefore, on a uniprocessor system, all kernel data structures that
are not updated by interrupts or exception handlers are safe for the
kernel to access.
为了给同步问题提供一个有效的解决方案,一些传统的UNIX
系统内核是非抢夺式的,也就是说当一个进程在kernel
mode下执行的时候,不允许它被随意挂起或者被其它进程替代。这样,在一个单CPU的系统上,所有的内核数据结构,只要不被中断/异常处理更改,就都是
保证安全的。
Of course, a process in
Kernel Mode can voluntarily relinquish the CPU, but in this case, it
must ensure that all data structures are left in a consistent state.
Moreover, when it resumes its execution, it must recheck the value of
any previously accessed data structures that could be changed.
当然,允许一个处于Kernel mode的进程主动地放弃对CPU的占有。但在这种情况下,它必须保证,所有的数据结构都都处于一致的状态。此外,当它恢复执行的时候,它必须重新检查那些它曾经访问过、并且有可能被其它进程更改的数据结构是否已经发生了变化。
A synchronization mechanism
applicable to preemptive kernels consists of disabling kernel
preemption before entering a critical region and reenabling it right
after leaving the region.
对抢夺式的内核来说,一个可行的同步机制是,在进入critical region之前禁止内核抢夺,并且在离开critical region之后重新允许内核抢夺。
Nonpreemptability is not
enough for multiprocessor systems, because two kernel control paths
running on different CPUs can concurrently access the same data
structure.
不可抢夺的内核设计在多处理器环境下是不能保证正确的,因为多个kernel control path可能在不同的CPU上同时运行,并且访问同一个数据结构。
1.6.5.2. Interrupt disabling
禁止中断
Another synchronization
mechanism for uniprocessor systems consists of disabling all hardware
interrupts before entering a critical region and reenabling them right
after leaving it. This mechanism, while simple, is far from optimal. If
the critical region is large, interrupts can remain disabled for a
relatively long time, potentially causing all hardware activities to
freeze.
另外一个针对单处理器系统的同步机制是,在进入一个
critical region之前禁止全部的硬件中断,并且在离开这个critical
region之后重新打开(允许)硬件中断。这一机制虽然简单,但是效率不高。如果critical
region比较大,中断可能被禁止较长的时间,这可能会使硬件被冻结,不能作出响应。
Moreover, on a
multiprocessor system, disabling interrupts on the local CPU is not
sufficient, and other synchronization techniques must be used.
此外,在多处理器系统上,在当前CPU上禁止中断是不能正确完成同步功能的,必须同时使用附加的同步技术。
1.6.5.3. Semaphores
信号量
A widely used mechanism,
effective in both uniprocessor and multiprocessor systems, relies on
the use of semaphores . A semaphore is simply a counter associated with
a data structure; it is checked by all kernel threads before they try
to access the data structure. Each semaphore may be viewed as an object
composed of:
信号量作为一种同步机制,在单处理器和多处理器系统里都是有效的,因而得到广泛使用。一个信号量,实际上是一个和数据结构相关联的计数器。任何内核线程试图访问这个数据结构之前都必须先检查这个计数器。每个信号量都可以被看作是由以下组件构成的一个对象:
An integer variable
一个整型变量
A list of waiting processes
一个等待访问某数据结构的进程列表
Two atomic methods: down( ) and up( )
两个原子操作:down()和up()
The down( ) method
decreases the value of the semaphore. If the new value is less than 0,
the method adds the running process to the semaphore list and then
blocks (i.e., invokes the scheduler). The up( ) method increases the
value of the semaphore and, if its new value is greater than or equal
to 0, reactivates one or more processes in the semaphore list.
Down()方法让信号量的值减少1。如果信号量中变量的新
值小于0,这个方法就会把检查这个信号量的进程加到这个信号量的进程列表上,然后阻塞(即调用系统调度器去执行其它进程)。Up()方法让信号量的值增加
1,如果信号量的新值大于等于0,就从信号量的进程列表中取出一个进程并解除对它的阻塞。
Each data structure to be
protected has its own semaphore, which is initialized to 1. When a
kernel control path wishes to access the data structure, it executes
the down( ) method on the proper semaphore. If the value of the new
semaphore isn't negative, access to the data structure is granted.
Otherwise, the process that is executing the kernel control path is
added to the semaphore list and blocked. When another process executes
the up( ) method on that semaphore, one of the processes in the
semaphore list is allowed to proceed.
每一个需要被保护的数据结构都有它自己的信号量,这个信号量
被初始化为1。当一个kernel control
path需要访问这个数据结构的时候,它在这个信号量上执行down()操作。如果这个信号量的新值是非负的,对该数据结构的访问就被允许。否则,执行
down()操作的进程就被加到这个信号量的进程列表里,然后被阻塞。当另外一个进程执行了这个信号量的up()操作,那个信号量的进程列表里,就会有一
个进程被解除阻塞。
1.6.5.4. Spin locks
回旋锁
In multiprocessor systems,
semaphores are not always the best solution to the synchronization
problems. Some kernel data structures should be protected from being
concurrently accessed by kernel control paths that run on different
CPUs. In this case, if the time required to update the data structure
is short, a semaphore could be very inefficient. To check a semaphore,
the kernel must insert a process in the semaphore list and then suspend
it. Because both operations are relatively expensive, in the time it
takes to complete them, the other kernel control path could have
already released the semaphore.
在多处理器系统中,信号量并不是同步问题的最佳解决方案。有
一些内核数据结构必须被严格保护,使他不被运行在不同CPU上的Kernel control
path同时访问。这种情况下,如果对该数据结构的访问所需的时间很短,那么信号量也许就不是一种有效的同步机制。为了等待访问一个资源的权限,内核需要
把一个进程加到对应信号量的进程列表里,并将其阻塞——这两个操作和访问数据结构这一操作相比,相对代价较高,也许这两个操作还没做完,信号量关联的数据
结构已经被正占用它的kernel control path释放了。
In these cases,
multiprocessor operating systems use spin locks . A spin lock is very
similar to a semaphore, but it has no process list; when a process
finds the lock closed by another process, it "spins" around repeatedly,
executing a tight instruction loop until the lock becomes open.
这种情况下,针对多处理器的操作系统一般选用回旋锁来进行同步。一个回旋锁和一个信号量很像,但是它没有对应的进程列表。当一个进程发现某个回旋锁被另外一个进程锁住的时候,该进程只是反复的回旋,执行一个密集指令循环来等待该锁被释放。
Of course, spin locks are
useless in a uniprocessor environment. When a kernel control path tries
to access a locked data structure, it starts an endless loop.
Therefore, the kernel control path that is updating the protected data
structure would not have a chance to continue the execution and release
the spin lock. The final result would be that the system hangs.
当然,回旋锁在单处理器系统中是没有用的。当一个
kernel control path试图访问一个被加锁的数据结构的时候,就会陷入无限循环,这样,锁住该数据结构的kernel control
path就得不到CPU去完成对该数据结构的更新,这样这个锁就永远不会被打开。结果是死锁,系统挂起。
1.6.5.5. Avoiding deadlocks
避免死锁
Processes or kernel control
paths that synchronize with other control paths may easily enter a
deadlock state. The simplest case of deadlock occurs when process p1
gains access to data structure a and process p2 gains access to b, but
p1 then waits for b and p2 waits for a. Other more complex cyclic waits
among groups of processes also may occur. Of course, a deadlock
condition causes a complete freeze of the affected processes or kernel
control paths.
需要和其它进程或者kernel control
path同步的进程和kernel control
path经常陷入死锁状态。最简单的死锁状态发生在以下情况下:进程P1取得访问数据a的权利,进程P2取得访问数据b的权利,但是此时P1等待对b的访
问权限并且P2等待对a的访问权限。更复杂的、存在于多个进程之间的循环等待造成的死锁也可能发生。一个死锁状态会造成相关的全部进程和kernel
control path被冻结。
As far as kernel design is
concerned, deadlocks become an issue when the number of kernel locks
used is high. In this case, it may be quite difficult to ensure that no
deadlock state will ever be reached for all possible ways to interleave
kernel control paths. Several operating systems, including Linux, avoid
this problem by requesting locks in a predefined order.
就内核设计而言,在内核锁被大量使用的时候,死锁就成了一个问题。这种情况下,要保证各kernel control path在以可能的任何顺序被交织时没有死锁发生,是非常困难的。很多操作系统,包括linux,预先定义了申请加锁必须依照的顺序,来避免死锁。
阅读(1091) | 评论(0) | 转发(0) |