Category: LINUX

2010-06-26 16:07:08

_Arthur (Arthur_@sympatico.ca) on 5/9/06 wrote:
>
>I found that distinction between microkernels and "monolithic" kernels useful:
>With microkernels, when you call a system service, a "message" is generated to
>be handled by the kernel *task*, to be dispatched to the proper handler (task).
>There is likely to be at least 2 levels of task-switching (and ring-level switching) in a microkernel call.

I don't think you should focus on implementation details.

For example, the task-switching could be basically hidden
by hardware, and a "ukernel task switch" is not necessarily
the same as a traditional task switch, because you may have
things - hardware or software conventions - that might
might turn it into something that acts more like a normal
subroutine call.

To make a stupid analogy: a function call is certainly "more
expensive" than a straight jump (because the function call
implies the setup for returning, and the return itself).
But you can optimize certain function calls into plain
jumps - and it's such a common optimization that it has a
name of its own ("tailcall conversion").
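
To make that analogy concrete, here's a minimal C sketch
(assuming gcc or clang at -O2, or any compiler that does the
optimization; the function names are made up for illustration):

    #include <stdio.h>

    /* sum_to() calls itself in tail position: nothing happens
     * after the call returns, so an optimizing compiler can
     * replace the call/return pair with a plain jump
     * (tailcall conversion). */
    static long long sum_to(long long n, long long acc)
    {
        if (n == 0)
            return acc;
        return sum_to(n - 1, acc + n);  /* may compile to a jump */
    }

    int main(void)
    {
        /* With the conversion this runs in constant stack space;
         * without it, a million stack frames get pushed. */
        printf("%lld\n", sum_to(1000000LL, 0));
        return 0;
    }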

In a similar manner, those task switches for the system
call have very specific semantics, so it's possible to do
them as less than "real" task-switches.

So I wouldn't focus on them, since they aren't necessarily
even the biggest performance problem of a ukernel.

The real issue, and it's really fundamental, is the issue
of sharing address spaces. Nothing else really matters.
Everything else ends up flowing from that fundamental
question: do you share the address space with the caller,
or put in slightly different terms: can the callee look at
and change the caller's state as if it were its own
(and the other way around)?

Even for a monolithic kernel, the answer is a very emphatic
no when you cross from user space into kernel space.
Obviously the user space program cannot change kernel state,
but it is equally true that the kernel cannot just consider
user space to be equivalent to its own data structures (it
might use the exact same physical instructions, but it
cannot trust the user pointers, which means that in
practice, they are totally different things from kernel
pointers).
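
As an illustration of that point: here's a sketch of how a
syscall body in a monolithic kernel has to treat a user pointer.
It's modeled on Linux's real copy_from_user() helper, but the
syscall itself is hypothetical and this isn't a buildable module:

    #include <linux/uaccess.h>      /* copy_from_user() */
    #include <linux/errno.h>

    /* ubuf is a user pointer: the eventual machine instruction
     * is an ordinary load, but the pointer cannot be trusted,
     * so every access goes through a checking/copying helper. */
    static long example_syscall(const char __user *ubuf, size_t len)
    {
        char kbuf[64];

        if (len > sizeof(kbuf))
            return -EINVAL;

        /* NOT memcpy(kbuf, ubuf, len): ubuf may be unmapped,
         * or may deviously point into kernel memory. */
        if (copy_from_user(kbuf, ubuf, len))
            return -EFAULT;

        /* kbuf now holds trusted kernel-space data. */
        return 0;
    }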

That's another example of where "implementation" doesn't
much matter, this time in the reverse sense. When a kernel
accesses user space, the actual implementation of
that - depending on hw concepts and implementation - may
be exactly the same as when it accesses its own data
structures: a normal "load" or "store". But despite that
identical low-level implementation, there are high-level
issues that radically differ.

And that separation of "access space" is a really big
deal. I say "access space", because it really is something
conceptually different from "address space". The two parts
may even "share" the address space (in a monolithic kernel
they normally do), and that has huge advantages (no TLB
issues etc), but there are issues that mean that you end
up having protection differences or simply semantic
differences between the accesses.

(Where one common example of "semantic" difference might be
that one "access space" might take a page fault, while
another one is guaranteed to be pinned down - this has some
really huge issues for locking around the access, and for
dead-lock avoidance etc etc).
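
A sketch of that locking consequence, again in Linux-flavored C
(the device structure and function are hypothetical): user memory
can fault, and servicing the fault can sleep, so it must never be
touched while holding a spinlock.

    /* Hypothetical device state protected by a spinlock. */
    struct dev_state {
        spinlock_t lock;
        char buf[64];
    };

    static long dev_write(struct dev_state *dev,
                          const char __user *ubuf, size_t len)
    {
        char tmp[64];

        if (len > sizeof(tmp))
            return -EINVAL;

        /* BROKEN variant: copy_from_user() under the lock could
         * fault, and the fault handler could sleep. That's a
         * deadlock (or "scheduling while atomic") waiting to
         * happen:
         *
         *     spin_lock(&dev->lock);
         *     copy_from_user(dev->buf, ubuf, len);
         *     spin_unlock(&dev->lock);
         */

        /* Correct order: do the fault-prone access first, into
         * pinned kernel memory; only then take the lock to
         * update the shared state. */
        if (copy_from_user(tmp, ubuf, len))
            return -EFAULT;

        spin_lock(&dev->lock);
        memcpy(dev->buf, tmp, len);
        spin_unlock(&dev->lock);
        return 0;
    }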

So in a traditional kernel, you usually would share the
address space, but you'd have protection issues and some
semantic differences that mean that the kernel and user
space can't access each other freely. And that makes for
some really big issues, but a traditional kernel very much
tries to minimize them. And most importantly, a traditional
kernel shares the access space across all the basic system
calls, so that user/kernel difference is the only
access space boundary.

Now, the real problem with split access spaces is
not the performance issue (which does exist), but the
much higher complexity issue. It's ludicrous how microkernel
proponents claim that their system is "simpler" than
a traditional kernel. It's not. It's much much more
complicated, exactly because of the barriers that it has
raised between data structures.

The fundamental result of access space separation is that
you can't share data structures. That means that you can't
share locking, it means that you must copy any shared data,
and that in turn means that you have a much harder time
handling coherency. All your algorithms basically end up
being distributed algorithms.
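
To see what that means in code, compare the two styles. The first
fragment is roughly what a monolithic kernel does; in the second,
every IPC name (ipc_send() and so on) is made up, standing in for
whatever primitives the message layer actually provides:

    /* Monolithic: caller and callee share the address space,
     * so updating shared state is a lock and a store. */
    spin_lock(&inode->i_lock);
    inode->i_size = new_size;
    spin_unlock(&inode->i_lock);

    /* Microkernel-style: the filesystem server and the pager
     * are separate tasks in separate address spaces. The same
     * update becomes a distributed protocol: marshal, copy,
     * send, wait, and decide who owns the truth while the
     * message is in flight. */
    struct set_size_msg m = {
        .op   = OP_SET_SIZE,
        .ino  = ino,
        .size = new_size,
    };
    ipc_send(pager_port, &m, sizeof(m));       /* copy #1 */
    ipc_receive(reply_port, &m, sizeof(m));    /* copy #2, blocks */
    if (m.status != STATUS_OK)
        reconcile(ino);    /* coherency is now your problem */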

And anybody who tells you that distributed algorithms
are "simpler" is just so full of sh*t that it's not even
funny.

Microkernels are much harder to write and maintain
exactly because of this issue. You can do simple
things easily - and in particular, you can do things where
the information only passes in one direction quite easily,
but anything else is much much harder, because there is
no "shared state" (by design). And in the absence of shared
state, you have a hell of a lot of problems trying to make
any decision that spans more than one entity in the
system.

And I'm not just saying that. This is a fact. It's a fact
that has been shown in practice over and over again, not
just in kernels. But it's been shown in operating systems
too - and not just once. The whole "microkernels are
simpler" argument is just bull, and it is clearly shown to
be bull by the fact that whenever you compare the speed
of development of a microkernel and a traditional kernel,
the traditional kernel wins. By a huge amount, too.

The whole argument that microkernels are somehow "more
secure" or "more stable" is also total crap. The fact that
each individual piece is simple and secure does not make
the aggregate either simple or secure. And the
argument that you can "just reload" a failed service and
not take the whole system down is equally flawed.

Anybody who has ever done distributed programming should
know by now that when one node goes down, often the rest
comes down too. It's not always true (but neither is it
always true that a crash in a kernel driver would bring
the whole system down for a monolithic kernel), but it's
true enough if there are any mutual dependencies
and coherency issues.

And in an operating system, there are damn few things that
don't have coherency issues. If there weren't any coherency
issues, it wouldn't be in the kernel in the first place!

(In contrast, if you do distributed physics calculations,
and one node goes down, you can usually just re-assign
another node to do the same calculation over again from
the beginning. That is not true if you have a
really distributed system where you don't even know where
the data was coming from or where it was going).

As to the whole "hybrid kernel" thing - it's just marketing.
It's "oh, those microkernels had good PR, how can we try
to get good PR for our working kernel? Oh, I know,
let's use a cool name and try to imply that it has all the
PR advantages that that other system has".

                   Linus
