Category: LINUX
2012-03-18 22:56:28
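Xen and the guest each have their own init_IRQ function, global irq_desc array, do_IRQ handler, and interrupt-return path. In short, Xen's interrupt handling borrows from the Linux implementation.
The following material from Xen Intro version 1.0 explains this very precisely: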
Registration (or binding) of irqs in guest domains:
Part 1: guest initialization. The guest's irqs are actually bound to event channels (evtchn).
The guest OS calls init_IRQ() when it boots: start_kernel() (file init/main.c) calls init_IRQ(), which is defined in sparse/arch/xen/kernel/evtchn.c. There can be 256 physical irqs, so there is an array called irq_desc with 256 entries (file sparse/include/linux/irq.h).
All elements in this array are initialized in init_IRQ() so that their status is disabled (IRQ_DISABLED).
Now, when a physical driver starts, it usually calls request_irq(). This function eventually calls setup_irq() (both in sparse/kernel/irq/manage.c), which calls startup_pirq(). startup_pirq() sends a hypercall to the hypervisor (HYPERVISOR_event_channel_op) in order to bind the physical irq (pirq). The hypercall is of type EVTCHNOP_bind_pirq. See startup_pirq() in sparse/arch/xen/kernel/evtchn.c.
Note 1: Xen 3.1 no longer ships sparse/kernel/irq/manage.c; the file lives in the Linux kernel tree.
Note 2: the physical driver path corresponds to static struct hw_interrupt_type pirq_type:
static struct hw_interrupt_type pirq_type = {
.typename = "Phys-irq",
.startup = startup_pirq,
};
setup_irq in turn makes the call desc->handler->startup(irq).
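The registration chain described above (init_IRQ marks every descriptor disabled, request_irq reaches setup_irq, and setup_irq calls the startup hook that binds the pirq) can be modelled in plain user-space C. This is only a minimal sketch under stated assumptions: fake_bind_pirq_hypercall, irq_to_evtchn and next_free_port are hypothetical stand-ins for the EVTCHNOP_bind_pirq hypercall and the kernel's bookkeeping, and the struct field is called name rather than typename.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS      256
#define IRQ_DISABLED 0x1u   /* illustrative flag value */

/* Simplified stand-in for hw_interrupt_type: only the startup hook. */
struct hw_interrupt_type {
    const char *name;
    unsigned int (*startup)(unsigned int irq);
};

struct irq_desc {
    unsigned int status;
    const struct hw_interrupt_type *handler;
};

struct irq_desc irq_desc[NR_IRQS];
int irq_to_evtchn[NR_IRQS];   /* -1 means "not bound" (hypothetical table) */
int next_free_port = 1;       /* hypothetical port allocator */

/* Hypothetical stand-in for the EVTCHNOP_bind_pirq hypercall. */
static int fake_bind_pirq_hypercall(unsigned int pirq)
{
    (void)pirq;
    return next_free_port++;
}

/* Startup hook: bind the physical irq to an event channel port. */
static unsigned int startup_pirq(unsigned int irq)
{
    irq_to_evtchn[irq] = fake_bind_pirq_hypercall(irq);
    return 0;
}

static const struct hw_interrupt_type pirq_type = {
    .name    = "Phys-irq",
    .startup = startup_pirq,
};

/* Boot-time step: every descriptor starts out disabled and unbound. */
void init_IRQ(void)
{
    for (unsigned int i = 0; i < NR_IRQS; i++) {
        irq_desc[i].status  = IRQ_DISABLED;
        irq_desc[i].handler = &pirq_type;
        irq_to_evtchn[i]    = -1;
    }
    next_free_port = 1;
}

/* Mirrors the step quoted above: enable, then call the startup hook. */
int setup_irq(unsigned int irq)
{
    assert(irq < NR_IRQS);
    struct irq_desc *desc = &irq_desc[irq];
    desc->status &= ~IRQ_DISABLED;
    desc->handler->startup(irq);
    return 0;
}

/* Driver-facing entry point; here it only forwards to setup_irq(). */
int request_irq(unsigned int irq)
{
    return setup_irq(irq);
}
```

Calling init_IRQ() and then request_irq(5) leaves irq 5 enabled and bound to the first free port, while every other descriptor stays disabled.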
Part 2: Xen
On the hypervisor side, this hypercall is handled in evtchn_bind_pirq() (file /common/event_channel.c), which calls pirq_guest_bind() (file arch/x86/irq.c). pirq_guest_bind() changes the status of the corresponding irq_desc array element to enabled (~IRQ_DISABLED, see note [3]); it also calls the startup() method. When an interrupt arrives from the controller (the APIC), we arrive at do_IRQ() (also in arch/x86/irq.c), just as in the usual Linux kernel.
The hypervisor handles only timer and serial interrupts. Other interrupts are passed to the domains by calling __do_IRQ_guest() (in fact, the IRQ_GUEST flag is set for all interrupts except timer and serial interrupts). __do_IRQ_guest() sends the interrupt by calling send_guest_pirq() for every guest registered on this IRQ. send_guest_pirq() creates an event channel (an instance of evtchn, see note [4]) and sets the pending flag of this event channel by calling evtchn_set_pending(). Then, asynchronously, Xen notifies the domain about this interrupt (unless it is masked).
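Note [3]: the irq_desc here is Xen's irq_desc, whereas the one set to IRQ_DISABLED in Part 1 is the guest's irq_desc.
Note [4]: the statement "The send_guest_pirq() creates an event channel" is inaccurate. The event channel has already been allocated in evtchn_bind_pirq; send_guest_pirq merely looks up that evtchn from the pirq.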
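The distinction drawn in note [4] (the port is allocated at bind time, and delivery only looks it up) can be illustrated with a small user-space model. This is a sketch under assumptions, not Xen's code: struct domain_sketch, bind_pirq and send_pirq are hypothetical names, and port 0 is used here to mean "nothing bound".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PIRQS    256
#define MAX_EVTCHNS 1024

/* Hypothetical per-domain state: pirq -> port table plus pending flags. */
struct domain_sketch {
    int  pirq_to_evtchn[NR_PIRQS];   /* 0 means "no port bound" */
    bool pending[MAX_EVTCHNS];
    int  next_port;
};

void domain_init(struct domain_sketch *d)
{
    for (int i = 0; i < NR_PIRQS; i++)
        d->pirq_to_evtchn[i] = 0;
    for (int i = 0; i < MAX_EVTCHNS; i++)
        d->pending[i] = false;
    d->next_port = 1;
}

/* Bind time: this is the only place a port is allocated. */
int bind_pirq(struct domain_sketch *d, int pirq)
{
    assert(pirq >= 0 && pirq < NR_PIRQS);
    if (d->pirq_to_evtchn[pirq] != 0)
        return -1;                   /* already bound */
    if (d->next_port >= MAX_EVTCHNS)
        return -1;                   /* out of ports */
    d->pirq_to_evtchn[pirq] = d->next_port++;
    return d->pirq_to_evtchn[pirq];
}

/* Delivery time: look the port up and mark it pending, nothing more. */
bool send_pirq(struct domain_sketch *d, int pirq)
{
    assert(pirq >= 0 && pirq < NR_PIRQS);
    int port = d->pirq_to_evtchn[pirq];
    if (port == 0)
        return false;                /* nobody bound this pirq */
    d->pending[port] = true;
    return true;
}
```

In this model, sending an unbound pirq does nothing, and repeated sends after a bind never change next_port, which is the point of the note.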
Interrupt handling: the initialization function init_IRQ is in xen/arch/x86/i8259.c.
When an interrupt occurs, control passes to the Xen common_interrupt routine (see the BUILD_COMMON_IRQ macro in asm/asm_defns.h), which calls the Xen do_IRQ function (in xen/arch/x86/irq.c).
do_IRQ:
Checks who has the responsibility to handle the interrupt:
The VMM: the interrupt is handled internally by the VMM
One or more guest OSes: it calls the __do_IRQ_guest function
__do_IRQ_guest:
For each domain that has a binding to the IRQ, sets the pending flag of the event channel to 1 via send_guest_pirq
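The dispatch decision listed above can be sketched as a small simulation: if the descriptor carries the guest flag, every bound domain gets its pending flag set; otherwise the hypervisor handles the interrupt itself. All names here (xen_irq_desc, bind_guest, do_IRQ_sketch, the table sizes and the IRQ_GUEST value) are assumptions made for illustration and do not reproduce Xen's data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS            16
#define MAX_DOMAINS        4
#define MAX_GUESTS_PER_IRQ 4
#define IRQ_GUEST          0x1u   /* illustrative flag value */

struct xen_irq_desc {
    unsigned int status;
    int nr_guests;
    int guest[MAX_GUESTS_PER_IRQ];   /* ids of the bound domains */
};

struct xen_irq_desc xen_irq_desc[NR_IRQS];
bool pirq_pending[MAX_DOMAINS][NR_IRQS];  /* stands in for evtchn pending bits */
int  handled_by_xen[NR_IRQS];             /* counts hypervisor-internal handling */

void irq_reset(void)
{
    for (int irq = 0; irq < NR_IRQS; irq++) {
        xen_irq_desc[irq].status = 0;
        xen_irq_desc[irq].nr_guests = 0;
        handled_by_xen[irq] = 0;
        for (int d = 0; d < MAX_DOMAINS; d++)
            pirq_pending[d][irq] = false;
    }
}

/* Binding a guest marks the irq as guest-owned. */
int bind_guest(int irq, int domid)
{
    assert(irq >= 0 && irq < NR_IRQS);
    assert(domid >= 0 && domid < MAX_DOMAINS);
    struct xen_irq_desc *desc = &xen_irq_desc[irq];
    if (desc->nr_guests == MAX_GUESTS_PER_IRQ)
        return -1;
    desc->guest[desc->nr_guests++] = domid;
    desc->status |= IRQ_GUEST;
    return 0;
}

/* Guest path: set the pending flag for every domain bound to the irq. */
static void do_IRQ_guest_sketch(int irq)
{
    const struct xen_irq_desc *desc = &xen_irq_desc[irq];
    for (int i = 0; i < desc->nr_guests; i++)
        pirq_pending[desc->guest[i]][irq] = true;
}

/* Entry point: decide who is responsible for this interrupt. */
void do_IRQ_sketch(int irq)
{
    assert(irq >= 0 && irq < NR_IRQS);
    if (xen_irq_desc[irq].status & IRQ_GUEST)
        do_IRQ_guest_sketch(irq);
    else
        handled_by_xen[irq]++;
}
```

A shared irq simply has more than one entry in guest[], so one physical interrupt marks several domains pending.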
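Xen itself only needs to handle two physical interrupts: the serial-port interrupt (ns16550) and the timer interrupt; see the functions ns16550_init_postirq and early_time_init respectively.
Interrupt handling: in Xen, interrupts destined for the Linux guest OS are delivered through the event-channel notification mechanism.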
During startup the guest OS installs two handlers (event and failsafe) via the HYPERVISOR_set_callbacks hypercall:
The event callback is the handler to be called to notify an event to the guest OS
The failsafe callback is used when a fault occurs when using the event callback
linux-2.6-xen-sparse/arch/i386/mach-xen/setup.c contains the following code:
void __init machine_specific_arch_setup(void)
{
int ret;
/* Entry point Xen jumps to when it delivers an event to this guest. */
static struct callback_register __initdata event = {
.type = CALLBACKTYPE_event,
.address = { __KERNEL_CS, (unsigned long)hypervisor_callback },
};
/* Fallback entry point used when delivery via the event callback faults. */
static struct callback_register __initdata failsafe = {
.type = CALLBACKTYPE_failsafe,
.address = { __KERNEL_CS, (unsigned long)failsafe_callback },
};
ret = HYPERVISOR_callback_op(CALLBACKOP_register, &event);
if (ret == 0)
ret = HYPERVISOR_callback_op(CALLBACKOP_register, &failsafe);
/* ... remainder of the function omitted ... */
}
hypervisor_callback is in linux-2.6-xen-sparse/arch/i386/kernel/entry-xen.S; its implementation and role are analyzed in the section on Xen's ret_from_intr.
As shown, the event callback handler is the hypervisor_callback function (installed at startup), which calls evtchn_do_upcall. See the evtchn analysis article for details.
evtchn_do_upcall:
1 Checks for pending events
2 Resets to zero the pending flag
3 Uses the evtchn_to_irq array to identify the IRQ binding for the event channel
4 Calls Linux do_IRQ interrupt handler function
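The four steps above can be sketched as a loop over a pending bitmap. This is a simplified model, assuming a single flat 64-bit bitmap rather than the layout used by the real shared-info page; evtchn_pending, evtchn_mask, irq_count and do_IRQ_stub are hypothetical names, and the mask check is included because masked channels are not delivered.

```c
#include <assert.h>
#include <stdint.h>

#define NR_EVENT_CHANNELS 64
#define NR_IRQS           256

uint64_t evtchn_pending;                 /* one bit per port (simplified) */
uint64_t evtchn_mask;                    /* set bit = port is masked */
int evtchn_to_irq[NR_EVENT_CHANNELS];    /* -1 = port not bound to an irq */
int irq_count[NR_IRQS];                  /* how often each irq was handled */

/* Stand-in for the Linux do_IRQ handler: just count the call. */
static void do_IRQ_stub(int irq)
{
    assert(irq >= 0 && irq < NR_IRQS);
    irq_count[irq]++;
}

void upcall_reset(void)
{
    evtchn_pending = 0;
    evtchn_mask = 0;
    for (int p = 0; p < NR_EVENT_CHANNELS; p++)
        evtchn_to_irq[p] = -1;
    for (int i = 0; i < NR_IRQS; i++)
        irq_count[i] = 0;
}

/* Returns the number of irqs dispatched during this upcall. */
int evtchn_do_upcall_sketch(void)
{
    int handled = 0;
    uint64_t ready = evtchn_pending & ~evtchn_mask;   /* 1: pending events */

    for (int port = 0; port < NR_EVENT_CHANNELS; port++) {
        uint64_t bit = (uint64_t)1 << port;
        if (!(ready & bit))
            continue;
        evtchn_pending &= ~bit;                       /* 2: clear pending */
        int irq = evtchn_to_irq[port];                /* 3: port -> irq */
        if (irq < 0)
            continue;                                 /* unbound port */
        do_IRQ_stub(irq);                             /* 4: run handler */
        handled++;
    }
    return handled;
}
```

A masked port keeps its pending bit, so the event is picked up by a later upcall once the mask is cleared.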
Physical interrupts in Dom0 or a driver domain
http://blog.csdn.net/snailhit/article/details/6413399
startup_pirq, enable_pirq and several other operations all invoke the HYPERVISOR_physdev_op hypercall.
The Xen source contains the following code:
/* DOM0 is permitted full I/O capabilities. */
rc |= irqs_permit_access(dom0, 0, NR_IRQS-1);
Question:
Does a driver domain enable its interrupts through XEN_DOMCTL_irq_permission?
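Xen's ret_from_intr is in xen/arch/x86/x86_32/entry.S
It checks the saved CS to determine whether the interrupt occurred in ring 0. If so, it jumps to the return path (restore_all_xen); otherwise it jumps to test_all_events, where detection and handling of guest events begins.
ENTRY(ret_from_intr)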
GET_CURRENT(%ebx)
movl UREGS_eflags(%esp),%eax
movb UREGS_cs(%esp),%al
testl $(3|X86_EFLAGS_VM),%eax
jnz test_all_events
jmp restore_all_xen
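The test performed by these instructions can be written out in C. The sketch below mirrors the register trick in the assembly: the saved EFLAGS is loaded, its low byte is overwritten with the low byte of the saved CS, and the result is tested against the privilege bits plus the virtual-8086 flag. The function name interrupted_guest is hypothetical; X86_EFLAGS_VM is bit 17 of EFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_VM 0x00020000u   /* virtual-8086 mode flag, bit 17 */

/*
 * Returns true when the interrupted context was a guest (ring 1..3 or
 * virtual-8086 mode), i.e. when the assembly would jump to
 * test_all_events; false means the restore_all_xen path.
 */
bool interrupted_guest(uint32_t saved_eflags, uint16_t saved_cs)
{
    /* movl UREGS_eflags -> %eax, then movb UREGS_cs -> %al */
    uint32_t eax = (saved_eflags & ~0xFFu) | (saved_cs & 0xFFu);

    /* testl $(3|X86_EFLAGS_VM),%eax ; jnz test_all_events */
    return (eax & (3u | X86_EFLAGS_VM)) != 0;
}
```

Overwriting the low byte matters: the low bits of EFLAGS are unrelated to privilege, so only the CS privilege bits and the VM flag take part in the decision.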
test_guest_events first checks upcall_mask; only if it is not set does it go on to check upcall_pending.
test_all_events:
…..
test_guest_events:
movl VCPU_vcpu_info(%ebx),%eax
testb $0xFF,VCPUINFO_upcall_mask(%eax)
jnz restore_all_guest
testb $0xFF,VCPUINFO_upcall_pending(%eax)
jz restore_all_guest
/*process_guest_events:*/
sti
leal VCPU_trap_bounce(%ebx),%edx
movl VCPU_event_addr(%ebx),%eax
movl %eax,TRAPBOUNCE_eip(%edx)
movl VCPU_event_sel(%ebx),%eax
movw %ax,TRAPBOUNCE_cs(%edx)
movb $TBF_INTERRUPT,TRAPBOUNCE_flags(%edx)
call create_bounce_frame
jmp test_all_events
create_bounce_frame:
testl $~3,%eax
jz domain_crash_synchronous
movl %eax,UREGS_cs+4(%esp)
movl TRAPBOUNCE_eip(%edx),%eax
movl %eax,UREGS_eip+4(%esp)
ret
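The two checks visible in the assembly above (mask before pending in test_guest_events, and the rejection of null selectors before the saved CS is overwritten) can be restated in C. This is a minimal sketch: struct vcpu_info_sketch keeps only the two bytes that are tested, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the two per-vcpu bytes that the assembly inspects. */
struct vcpu_info_sketch {
    uint8_t evtchn_upcall_pending;
    uint8_t evtchn_upcall_mask;
};

enum exit_action {
    RESTORE_ALL_GUEST,      /* return to the guest without an upcall */
    PROCESS_GUEST_EVENTS    /* build a bounce frame for the event callback */
};

/* Mask is checked first; pending is only looked at when unmasked. */
enum exit_action test_guest_events_sketch(const struct vcpu_info_sketch *vi)
{
    if (vi->evtchn_upcall_mask != 0)
        return RESTORE_ALL_GUEST;
    if (vi->evtchn_upcall_pending == 0)
        return RESTORE_ALL_GUEST;
    return PROCESS_GUEST_EVENTS;
}

/* testl $~3,%eax ; jz domain_crash_synchronous: selectors 0..3 are null. */
bool selector_is_null(uint16_t cs)
{
    return (cs & ~3u) == 0;
}
```

The order of the two tests means that a guest which has masked upcalls is never interrupted, even while events keep accumulating as pending.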
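If there is an event, a frame is first built via create_bounce_frame. Where do create_bounce_frame's arguments come from? For that we go back to the HYPERVISOR_set_callbacks hypercall mentioned earlier.
Inside Xen, HYPERVISOR_set_callbacks is implemented as
do_set_callbacks => register_guest_callback, which records the callback information passed in by the guest.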
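static long register_guest_callback(struct callback_register *reg)
{
long ret = 0;
struct vcpu *v = current;
switch ( reg->type )
{
case CALLBACKTYPE_event:
/* Remember where to enter the guest when an event is delivered. */
v->arch.guest_context.event_callback_cs = reg->address.cs;
v->arch.guest_context.event_callback_eip = reg->address.eip;
break;
/* ... other callback types omitted ... */
}
return ret;
}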
xen/arch/x86/x86_32/asm-offsets.c contains the following code:
OFFSET(VCPU_event_addr, struct vcpu,
arch.guest_context.event_callback_eip);
Thus hypervisor_callback is what gets prepared as the argument to create_bounce_frame, so when Xen returns to the guest through restore_all_guest, hypervisor_callback is invoked.
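The whole round trip (the guest registers a cs:eip pair, Xen stores it, and the bounce step later redirects the guest's resume address to it) can be modelled in a few lines of C. This is a deliberately reduced sketch: the real create_bounce_frame also saves the interrupted context on the guest stack, which is left out here. The struct layouts, the numeric CALLBACKTYPE values and the functions register_guest_callback_sketch and bounce_to_event_callback are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CALLBACKTYPE_event    0   /* illustrative values */
#define CALLBACKTYPE_failsafe 1

struct xen_callback { uint16_t cs; uint32_t eip; };

struct callback_register {
    int type;
    struct xen_callback address;
};

/* Per-vcpu record of the entry points handed over by the guest. */
struct guest_context_sketch {
    uint16_t event_callback_cs;
    uint32_t event_callback_eip;
    uint16_t failsafe_callback_cs;
    uint32_t failsafe_callback_eip;
};

/* Where the guest will resume when Xen returns to it. */
struct guest_frame { uint16_t cs; uint32_t eip; };

/* Store the callback; unknown types are rejected with -1. */
long register_guest_callback_sketch(struct guest_context_sketch *ctx,
                                    const struct callback_register *reg)
{
    switch (reg->type) {
    case CALLBACKTYPE_event:
        ctx->event_callback_cs  = reg->address.cs;
        ctx->event_callback_eip = reg->address.eip;
        return 0;
    case CALLBACKTYPE_failsafe:
        ctx->failsafe_callback_cs  = reg->address.cs;
        ctx->failsafe_callback_eip = reg->address.eip;
        return 0;
    default:
        return -1;
    }
}

/*
 * Redirect the resume address to the registered event callback.
 * Returns -1 (frame untouched) when the selector is null, which is
 * the case where the assembly crashes the domain instead.
 */
int bounce_to_event_callback(const struct guest_context_sketch *ctx,
                             struct guest_frame *frame)
{
    if ((ctx->event_callback_cs & ~3u) == 0)
        return -1;
    frame->cs  = ctx->event_callback_cs;
    frame->eip = ctx->event_callback_eip;
    return 0;
}
```

In this model the guest ends up executing at the registered address simply because the saved cs:eip was rewritten before the return, which matches the explanation above of why hypervisor_callback runs after restore_all_guest.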