Category: Virtualization

2012-03-15 23:52:46

A system call, or syscall, is the mechanism used by an application program to request service from the operating system.

 

A hypervisor call, or hypercall, is the paravirtualization interface through which a guest operating system requests services from the hypervisor.

How hypercalls are implemented

Implementation in Xen 2.0

On the guest side, in xen-2.0/linux-2.6.11-xen-sparse/include/asm-xen/hypervisor.h:

/* And the trap vector is... */
#define TRAP_INSTR "int $0x82"

static inline int HYPERVISOR_xen_version(int cmd)
{
    int ret;
    unsigned long ignore;

    __asm__ __volatile__ (
        TRAP_INSTR
        : "=a" (ret), "=b" (ignore)
        : "0" (__HYPERVISOR_xen_version), "1" (cmd)
        : "memory" );

    return ret;
}

 

On the Xen side

The trap_init function in xen-2.0/xen/arch/x86/traps.c contains the following code:

#define HYPERCALL_VECTOR   0x82

/* Only ring 1 can access Xen services. */
_set_gate(idt_table+HYPERCALL_VECTOR,14,1,&hypercall);
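The "only ring 1" comment follows from how x86 checks software interrupts: an `int n` instruction is allowed only when the caller's privilege level (CPL) is numerically no greater than the gate's DPL, and otherwise the CPU raises a general protection fault. The following is a minimal sketch of that rule, not Xen code; the helper names `gate_access_byte` and `int_allowed` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Gate type 14 (0xE) is a 32-bit interrupt gate, the value passed to _set_gate. */
enum { GATE_TYPE_INTERRUPT32 = 14 };

/* Hypothetical helper: build the access byte of an IDT gate descriptor.
 * Layout: bit 7 = present, bits 6..5 = DPL, bit 4 = 0, bits 3..0 = type. */
static uint8_t gate_access_byte(unsigned type, unsigned dpl)
{
    assert(type <= 0xF && dpl <= 3);
    return (uint8_t)(0x80u | (dpl << 5) | type);
}

/* Hypothetical helper: a software interrupt (int n) passes the privilege
 * check only when CPL <= DPL of the gate. */
static int int_allowed(unsigned cpl, uint8_t access)
{
    unsigned dpl = (access >> 5) & 3u;
    return cpl <= dpl;
}
```

With type 14 and DPL 1 the access byte is 0xAE. A guest kernel in ring 1 passes the check, while a ring 3 application executing int $0x82 directly does not.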

 

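In xen-2.0/xen/arch/x86/x86_32/entry.S, hypercall_table holds the addresses of the implementing functions. The hypercall entry point jumps through this table according to the value in the eax register.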

ENTRY(hypercall)
    …..
    call *SYMBOL_NAME(hypercall_table)(,%eax,4)

ENTRY(hypercall_table)
    .long SYMBOL_NAME(do_set_trap_table)     /*  0 */
    …..
    .long SYMBOL_NAME(do_xen_version)
    .long SYMBOL_NAME(do_console_io)
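The indexed call above is ordinary table dispatch: the hypercall number in eax selects a function pointer, scaled by the 4-byte entry size. A rough C equivalent might look like the sketch below. Everything prefixed with `demo_` is hypothetical, the table has only two stand-in entries whose indices are not Xen's real numbering, and the bounds check is an addition for the sketch rather than something shown in the excerpt.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*hypercall_fn)(long arg);

/* Stand-in handlers; the real do_* functions live inside Xen. */
static long demo_set_trap_table(long arg) { (void)arg; return 0; }
static long demo_xen_version(long arg)    { (void)arg; return 0x00020000; }

/* Hypothetical two-entry table; indices are illustrative only. */
static const hypercall_fn demo_table[] = {
    demo_set_trap_table,  /* 0 */
    demo_xen_version,     /* 1 */
};

#define DEMO_NR_HYPERCALLS (sizeof demo_table / sizeof demo_table[0])

/* C counterpart of "call *hypercall_table(,%eax,4)": index the table by
 * the number the guest placed in eax. */
static long demo_dispatch(unsigned long nr, long arg)
{
    if (nr >= DEMO_NR_HYPERCALLS)
        return -1;  /* out-of-range number: reject instead of jumping */
    return demo_table[nr](arg);
}
```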

       

      

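Implementation with the hypercall page

Rationale:

As I understand the explanations below, a guest can switch into Xen in several ways (int, sysenter, and so on). With the Xen 2.0 approach, a guest migrated to a slightly different platform would have to be recompiled. Letting Xen provide the hypercall page solves this problem. One thing puzzles me: sysenter/syscall can only be used to jump from ring 3 into ring 0, so in most cases the guest cannot use them (a guest kernel normally runs in ring 1). So far I have only seen them used in hypercall_page_initialise_ring3_kernel for the x86_64 architecture; the hypercall page is still mostly filled with int 82h instructions.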

 

In more recent versions of Xen, hypercalls are issued via an extra layer of indirection. The guest kernel calls a function in a shared memory page (mapped by the hypervisor) with the arguments passed in registers. This allows more efficient mechanisms to be used for hypercalls on systems that support them, without requiring the guest kernel to be recompiled for every minor variation in architecture. Newer chips from AMD and Intel provide mechanisms for fast transitions to and from ring 0 (that is, sysenter on Intel and syscall on AMD). This layer of indirection allows these to be used when available.

 

Hypercalls were generated by a guest kernel in almost the same way as system calls are generated by userspace applications, the difference being that interrupt 82h, instead of 80h, is used.

 

This still works as of Xen 3, but is now deprecated. Instead, hypercalls are issued indirectly via the hypercall page. This is a memory page mapped into the guest's address space when the system is started.

 

The [Xen-devel] thread "Why using hypercall_page" offers an explanation:

This allows a guest to be migrated to a newer or older Xen with a different hypercall invocation convention. Xen fills the hypercall page according to its own convention, and thus frees the guest from hardcoding a specific flow.

 

Hypercalls are issued by CALLing an address within this page. As you know, older Intel/AMD x86 CPUs use INT to invoke kernel services, but newer CPUs introduce two instruction pairs: syscall/sysret and sysenter/sysexit. Because the hypercall page is filled by Xen, it can hide the difference between these two types. The guest OS needs only one uniform format to invoke a hypercall.

 

BTW, for HVM guest's hypercall, we don't use int 0x82 or the sysXXX instructions; we use VMCALL inside VMX guest or something similar (VMMCALL? I'm not sure) inside SVM guest.
Even for PV guest, the hypercall stub codes may have different formats/versions...
We can see these differences in the function hypercall_page_initialise().

So considering compatibility and portability, it's really not OK for a guest to make assumptions about the underlying stub code or to hard-code it.
Using the hypercall-page method, various guests can use one unified method to invoke hypercalls.

In the source figure, exits.s should apparently be entry.s.

Implementation of the hypercall page (Xen 3.1)

The hypercall page is actually a code page divided into 32-byte hypercall entries.

Every entry is something like

 

"mov  $__HYPERVISOR_xxx,%eax

int  $0x82 "

 

It is initialized in hypercall_page_initialise(void *hypercall_page) when the control panel creates the domain. Later, the domain can simply call the corresponding entry to issue a hypercall.

The implementation of hypercall_page_initialise is platform-specific.

 

void hypercall_page_initialise(struct domain *d, void *hypercall_page)
{
    if ( is_hvm_domain(d) )
        hvm_hypercall_page_initialise(d, hypercall_page);
    else if ( supervisor_mode_kernel )
        hypercall_page_initialise_ring0_kernel(hypercall_page);
    else
        hypercall_page_initialise_ring1_kernel(hypercall_page);
}

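Each hypercall occupies 32 bytes in hypercall_page; these 32 bytes are filled with the instructions (such as int 82h) and the call parameter (the hypercall number loaded into eax).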
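To make the layout concrete, here is a minimal sketch that fills a page-sized buffer with one stub per 32-byte slot, each encoding `mov $nr,%eax; int $0x82; ret`. It illustrates the layout described above and is not a copy of Xen's hypercall_page_initialise_ring1_kernel, which may differ in detail. The names `demo_fill_hypercall_page`, `DEMO_PAGE_SIZE` and `DEMO_ENTRY_SIZE` are hypothetical, and padding the unused bytes with int3 (0xCC) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE  4096
#define DEMO_ENTRY_SIZE 32

/* Fill every 32-byte slot with: mov $nr,%eax ; int $0x82 ; ret */
static void demo_fill_hypercall_page(uint8_t *page)
{
    memset(page, 0xCC, DEMO_PAGE_SIZE);         /* pad unused bytes with int3 */
    for (uint32_t nr = 0; nr < DEMO_PAGE_SIZE / DEMO_ENTRY_SIZE; nr++) {
        uint8_t *p = page + nr * DEMO_ENTRY_SIZE;
        p[0] = 0xB8;                            /* opcode: mov $imm32,%eax */
        p[1] = (uint8_t)(nr & 0xFF);            /* imm32, little-endian */
        p[2] = (uint8_t)((nr >> 8) & 0xFF);
        p[3] = (uint8_t)((nr >> 16) & 0xFF);
        p[4] = (uint8_t)((nr >> 24) & 0xFF);
        p[5] = 0xCD;                            /* opcode: int imm8 */
        p[6] = 0x82;                            /* vector 0x82 */
        p[7] = 0xC3;                            /* ret to the guest caller */
    }
}
```

A guest then reaches hypercall number nr by calling the address page + nr * 32; a 4 KiB page has room for 128 such slots.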

Implementation in the guest

Generally, the guest has wrappers named like HYPERVISOR_XXX (in linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/hypercall.h) that invoke _hypercallN. Below is _hypercall0 as an example.

 

#define HYPERCALL_STR(name)                                   \
       "call hypercall_page + ("STR(__HYPERVISOR_##name)" * 32)"

#define _hypercall0(type, name)                  \
({                                       \
       long __res;                          \
       asm volatile (                       \
              HYPERCALL_STR(name)            \
              : "=a" (__res)                \
              :                           \
              : "memory" );                \
       (type)__res;                         \
})
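The macro relies on token pasting and stringification to turn a hypercall name into an assembler operand. The sketch below reproduces only that preprocessor step so the resulting string can be inspected. The `DEMO_` macros and `demo_xen_version_call` are hypothetical, and the value 17 for `__HYPERVISOR_xen_version` is assumed here for illustration rather than taken from the excerpt.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so the argument is macro-expanded before #. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x)  DEMO_STR_(x)

/* Assumed hypercall number, for illustration only. */
#define __HYPERVISOR_xen_version 17

/* Same shape as HYPERCALL_STR: paste the name, expand it, stringify it. */
#define DEMO_HYPERCALL_STR(name) \
    "call hypercall_page + (" DEMO_STR(__HYPERVISOR_##name) " * 32)"

/* Returns the instruction text the assembler would receive for xen_version. */
static const char *demo_xen_version_call(void)
{
    return DEMO_HYPERCALL_STR(xen_version);
}
```

Under that assumption the string is "call hypercall_page + (17 * 32)", so the assembler resolves the target to byte offset 544 inside the hypercall page.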

Implementation in Xen

xen/arch/x86/x86_32/traps.c initializes the interrupt vector:

void __init percpu_traps_init(void)
{..
    /* The hypercall entry vector is only accessible from ring 1. */
    _set_gate(idt_table+HYPERCALL_VECTOR, 14, 1, &hypercall);

 

xen/arch/x86/x86_32/entry.S contains the following code:

ENTRY(hypercall)
        ……
        call *hypercall_table(,%eax,4)

For each guest-side HYPERVISOR_XXX there is a corresponding do_XXX implementation in Xen; see hypercall_table.

 

HYPERVISOR_set_trap_table(guest) => do_set_trap_table() (file xen/arch/x86/traps.c, xen)

 

http://old-list-archives.xen.org/archives/html/xen-devel/2008-10/msg00515.html

    In xen/arch/x86/x86_32/traps.c, if supervisor_mode_kernel is true, the hypercall_page will be initialized by hypercall_page_initialise_ring0_kernel.

    My question is, does supervisor_mode_kernel mean that the guest kernel is also running in ring 0, the same privilege level as the Xen hypervisor?

 

    The book "The Definitive Guide to the Xen Hypervisor" (on page 30) says that hypercalls through int82 are now deprecated and replaced by the hypercall_page.

    But int82 can still be found in hypercall_page_initialise_ring1_kernel. In what situation will it be used?

 

 

Yes, supervisor_mode_kernel means that the dom0 kernel runs in ring 0. It also means that other guests cannot be run. It’s not really very useful these days.

 

To your other question: guests are supposed to call into the hypervisor via the hypercall page, but actually the underlying mechanism is still int 0x82 for 32-bit PV guests. It’s just hidden in the hypervisor-provided hypercall page now.

Hypercalls are normally used by the guest kernel, but applications sometimes need this service as well. For example:

  libxenctrl (tools/libxc/xenctrl.h) is a library for low-level access to the Xen control interfaces.

  libxenguest (tools/libxc/xenguest.h) is a library for guest domain management in Xen.

An application issues a hypercall as follows:

  1. Open the kernel interface provided by Xen: /proc/xen/privcmd
  2. Invoke the hypercall indirectly through the ioctl system call

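/* Needs <fcntl.h>, <sys/ioctl.h> and the Xen privcmd header that defines privcmd_hypercall_t. */
int fd = open("/proc/xen/privcmd", O_RDWR);
privcmd_hypercall_t hcall = {
       __HYPERVISOR_print_string,   /* hypercall number */
       {message, 0, 0, 0, 0}        /* up to five arguments */
   };
ioctl(fd, IOCTL_PRIVCMD_HYPERCALL, &hcall);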

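A more involved hypercall request proceeds as follows (taking the __HYPERVISOR_domctl hypercall as an example):

  1. pyxc_domain_create() collects the information about the domain to be created;
  2. xc_domain_create() creates the control structure variable domctl;
  3. do_domctl() builds the hypercall request;
  4. The request is passed to the OS kernel: do_xen_hypercall();
  5. do_privcmd uses ioctl to make the transition from ring 3 to ring 1 and completes the hypercall.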