2012-09-06 15:46:24
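This article introduces VT-x, the first generation of Intel virtualization technology, and is well worth reading. In my view its greatest value lies in the section "Software-only virtualization with the IA-32 and Itanium® architectures" rather than in the description of VT-x itself.
Challenges facing software-only approaches, and how VT-x addresses them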
Ring deprivileging
A technique that runs all guest software at a privilege level greater than 0. A guest OS could be deprivileged in two distinct ways: it could run either at privilege level 1 (the 0/1/3 model) or at privilege level 3 (the 0/3/3 model).
Ring compression
Protecting the VMM requires either segmentation or paging. Segmentation cannot be used for this in 64-bit mode, and because IA-32 paging does not distinguish privilege levels 0–2, the guest OS must run at privilege level 3 (the 0/3/3 model). Thus, the guest OS runs at the same privilege level as guest applications and is not protected from them. This problem is called ring compression.
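The paging limitation can be sketched with a toy model of the user/supervisor check. This is a simplified illustration, not the full IA-32 paging rules: `page_access_allowed` is a hypothetical helper, and write protection, SMEP/SMAP and similar features are ignored.

```python
def page_access_allowed(cpl: int, user_page: bool) -> bool:
    """Toy model of the paging U/S check.

    Paging only knows two classes: supervisor (CPL 0, 1, 2) and user (CPL 3).
    A supervisor-only page (user_page=False) is reachable from CPL 0-2 alike.
    """
    if not 0 <= cpl <= 3:
        raise ValueError("CPL must be in 0..3")
    is_supervisor = cpl < 3
    return is_supervisor or user_page


# A page holding VMM data, marked supervisor-only.
vmm_page_is_user = False
for cpl in range(4):
    print(f"CPL {cpl}: access allowed = {page_access_allowed(cpl, vmm_page_is_user)}")
```

In this model a guest OS at privilege level 1 can still reach the VMM's supervisor-only page, so paging alone protects the VMM only if the guest OS is pushed down to privilege level 3.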
Ring aliasing
Ring aliasing refers to problems that arise when software is run at a privilege level other than the privilege level for which it was written. For example, an IA-32 guest OS can execute PUSH CS and discover from the pushed selector that it is not running in ring 0.
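A minimal sketch of why PUSH CS leaks the privilege level: the low two bits of a segment selector hold the privilege level, and for CS they reflect the current privilege level. `decode_selector` is a hypothetical helper, and the selector values below are made-up examples.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its architectural fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return {
        "rpl": selector & 0x3,         # bits 1:0, the CPL when read from CS
        "ti": (selector >> 2) & 0x1,   # bit 2, 0 = GDT, 1 = LDT
        "index": selector >> 3,        # bits 15:3, descriptor table index
    }


# Hypothetical values a guest kernel might see after PUSH CS / POP.
native_cs = 0x10          # kernel running natively in ring 0
deprivileged_cs = 0x11    # same descriptor index, but running in ring 1

print(decode_selector(native_cs))
print(decode_selector(deprivileged_cs))
```

A guest kernel that checks the `rpl` field of its own CS sees a nonzero value under the 0/1/3 or 0/3/3 model, which is exactly the aliasing problem described above.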
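VT-x clearly solves the ring aliasing and ring compression problems, because guest software can run at the privilege level for which it was written.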
Address-space compression
A VMM must reserve for itself some portion of the guest's virtual-address space. Option 1: it could run entirely within the guest's virtual-address space, which takes up a relatively large part of the guest's space. Option 2: the VMM can run in a separate address space, but even in that case, the VMM must use a minimal amount of the guest's virtual-address space for the control structures that manage transitions between guest software and the VMM.
For IA-32, these structures include the interrupt-descriptor table (IDT) and
the global-descriptor table (GDT), which reside in the linear-address space.
VT-x's solution: the VMM and the guest each have their own address space (the VMCS contains CR3). The VMX transitions are managed by the VMCS, which resides in the physical-address space.
Non-faulting access to privileged state
The IA-32 and Itanium architectures both include instructions that access privileged state and do not fault when executed with insufficient privilege. For example, a guest can execute the instructions that read, or store, from the registers GDTR, IDTR, LDTR, and TR (SGDT, SIDT, SLDT, and STR) at any privilege level, and can thereby discover that it is not running in ring 0.
VT-x allows guest software running at privilege level 0 to use the
instructions LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, and STR.
Adverse impact on guest system calls
SYSENTER and SYSEXIT switch directly between ring 0 and ring 3, so a VMM that deprivileges the guest OS must emulate them. With VT-x this is not a problem.
Interrupt virtualization
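With ring deprivileging, the usual approach is to make guest attempts to control interrupt masking fault to the VMM. If the guest masks and unmasks interrupts frequently, however, this causes performance problems. In addition, the VMM sometimes needs to inject interrupts into the guest.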
VT-x includes an external-interrupt exiting VM-execution control. When this control is set to 1, a VMM prevents guest control of interrupt masking without gaining control of every guest attempt to modify EFLAGS.IF. In other words, external interrupts cause a VM exit, while the guest can modify EFLAGS.IF as often as it likes (the problem described above) without causing a VM exit.
The VMM uses event injection to deliver interrupts to the guest OS, but what if the guest OS is not ready to receive an interrupt?
VT-x also includes an interrupt-window exiting VM-execution control. When this control is set to 1, a VM exit occurs whenever guest software is ready to receive interrupts. A VMM can set this control when it has a virtual interrupt to deliver to a guest.
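The interplay between a pending virtual interrupt and the interrupt-window exiting control can be sketched with a toy model. Everything here is hypothetical (`ToyVcpu`, `before_vm_entry`, the vector 0x20), and "ready to receive interrupts" is reduced to EFLAGS.IF = 1; real hardware also considers other interruptibility state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVcpu:
    guest_if: bool = False                  # guest EFLAGS.IF
    interrupt_window_exiting: bool = False  # the VM-execution control
    pending: list = field(default_factory=list)   # virtual interrupts queued by the VMM
    injected: list = field(default_factory=list)  # vectors handed to event injection


def before_vm_entry(vcpu: ToyVcpu) -> None:
    """Decide, just before VM entry, whether to inject or to ask for a window."""
    if vcpu.pending and vcpu.guest_if:
        # Guest can take an interrupt now: inject the oldest pending vector.
        vcpu.injected.append(vcpu.pending.pop(0))
    # Request a VM exit at the next interrupt window only if work remains.
    vcpu.interrupt_window_exiting = bool(vcpu.pending)


vcpu = ToyVcpu(pending=[0x20])
before_vm_entry(vcpu)   # guest has interrupts masked: nothing injected
print(vcpu.injected, vcpu.interrupt_window_exiting)

vcpu.guest_if = True    # guest unmasks; hardware would raise an interrupt-window VM exit
before_vm_entry(vcpu)   # now the VMM can inject
print(vcpu.injected, vcpu.interrupt_window_exiting)
```

The point of the design is that the VMM does not have to trap every change to EFLAGS.IF; it asks to be notified only while it actually has something to deliver.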
Frequent access to privileged resources
If the VMM relies on faults to handle a guest's frequent accesses to privileged resources, performance may suffer.
A VMM can configure the VMCS (for VT-x) so that the VMM is invoked only when required.
Access to hidden state
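This mainly concerns the hidden (shadow) parts of the segment registers. Software cannot access them, so it cannot save and restore them. The problem is not serious, though; it only affects performance.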
VT-x includes, in the guest-state area of the VMCS,
fields corresponding to CPU state not represented in any software-accessible
register. The processor loads values from these VMCS fields on every VM entry
and saves into them on every VM exit.
Intel® Virtualization Architecture overview
The overall picture should be clear by now. Some details are listed below.
The guest-state area does not contain fields
corresponding to registers that can be saved and loaded by the VMM itself
(e.g., the general-purpose registers). This lets the VMM optimize how it handles such state according to the actual situation.
Like the IA-32 page tables, each VMCS is referenced
with a physical (not linear) address. This eliminates the need to locate the
VMCS in the guest's linear-address space (which, as noted below, may be
different from that of the VMM).
VM entries load processor state from the guest-state area of the VMCS.
(Note that, because the state loaded includes CR3, the guest may run in a
different linear-address space than the VMM.) In addition to loading guest
state, VM entry can be optionally configured for event injection.
Before hardware support existed, a VMM needed code to emulate x86 exception delivery; VT-x now provides support for this. The following content is mainly taken from http://linux.linti.unlp.edu.ar/images/f/f1/Vtx.pdf
2.8.3 VM-Entry Controls for Event Injection
VM entry can be configured to conclude by delivering an event through the guest IDT (after all guest state and MSRs have been loaded). This process is called event injection and is controlled by the following three VM-entry control fields:
• VM-entry interruption-information field (32 bits). This field provides details about the event to be injected:
  ➢ The vector (bits 7:0) determines which entry in the IDT is used.
  ➢ The interruption type (bits 10:8) determines details of how the injection is performed. It could be an external interrupt, an NMI, a hardware exception, a software interrupt, etc. In general, a VMM should use the type hardware exception for all exceptions other than breakpoint exceptions and overflow exceptions; it should use the type software exception for those.
  ➢ For exceptions, the deliver-error-code bit (bit 11) determines whether delivery pushes an error code on the guest stack.
  ➢ VM entry injects an event if and only if the valid bit (bit 31) is 1.
• VM-entry exception error code (32 bits). This field is used if and only if the valid bit (bit 31) and the deliver-error-code bit (bit 11) are both set in the VM-entry interruption-information field.
• VM-entry instruction length (32 bits). For injection of events whose type is software interrupt, software exception, or privileged software exception, this field is used to determine the value of RIP that is pushed on the stack.
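The bit layout of the VM-entry interruption-information field can be sketched as a small encoder. The bit positions come from the text above; the numeric type codes are not given there and are taken from my reading of the Intel SDM, so treat them as an assumption. `make_intr_info` and `exception_type` are hypothetical helper names.

```python
# Interruption-type codes for bits 10:8 (assumed from the Intel SDM, not from the text above).
EXTERNAL_INTERRUPT = 0
NMI = 2
HARDWARE_EXCEPTION = 3
SOFTWARE_INTERRUPT = 4
PRIVILEGED_SOFTWARE_EXCEPTION = 5
SOFTWARE_EXCEPTION = 6

VALID_BIT = 1 << 31
DELIVER_ERROR_CODE_BIT = 1 << 11


def exception_type(vector: int) -> int:
    """Breakpoint (#BP, 3) and overflow (#OF, 4) use the software-exception type."""
    return SOFTWARE_EXCEPTION if vector in (3, 4) else HARDWARE_EXCEPTION


def make_intr_info(vector: int, event_type: int, deliver_error_code: bool = False) -> int:
    """Build a valid 32-bit VM-entry interruption-information value."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in bits 7:0")
    if not 0 <= event_type <= 7:
        raise ValueError("type must fit in bits 10:8")
    info = vector | (event_type << 8) | VALID_BIT
    if deliver_error_code:
        info |= DELIVER_ERROR_CODE_BIT
    return info


# Inject a general-protection fault (#GP, vector 13), which carries an error code.
gp = make_intr_info(13, exception_type(13), deliver_error_code=True)
print(hex(gp))  # 0x80000b0d

# Inject a breakpoint exception (#BP, vector 3) as a software exception.
bp = make_intr_info(3, exception_type(3))
print(hex(bp))  # 0x80000603
```

For the #GP case a VMM would also fill in the VM-entry exception error code field, and for the #BP case the VM-entry instruction length field, as described in the list above.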