  • Xen Memory Management

    • All low-level memory operations go through Xen.
    • Guest OSes are responsible for allocating and initializing page tables (PTs) for processes, but have read-only access to them
      • the guest allocates and initializes a page, then registers it with Xen to serve as the new PT
    • Direct page table writes are intercepted, validated and applied by the Xen VMM
      • updates can be batched into a single hypercall (reduces the cost of entering/exiting Xen)
    • page_info struct associated with each machine page frame
      • page type (none, l1, l2, l3, l4, LDT, GDT, RW)
      • reference count – number of references to the page
      • page frame can be reused only when unpinned and its reference count is zero
    • Each domain has a maximum and current memory allocation
      • max allocation is set at domain creation time and cannot be modified
    • PT updates
      • hypercall –> mmu_update()
      • writable page tables –> vm_assist()
    • Xen exists in the top 64MB (0xFC000000 – 0xFFFFFFFF) section of every guest virtual address space (TLB flush avoided when entering/leaving the hypervisor)
      • not accessible or remappable by guest OSes.
    • “fast handler” for system calls: direct access from the application into the guest OS, without going through Xen
      • must execute outside Ring 0
    • Each guest supports a “balloon” memory management driver that the VMM uses to dynamically adjust the guest’s memory usage
    • Page fault handling
      • faulting address is written into an extended stack frame on the guest OS stack (normally the faulting address is read from a privileged processor register (CR2))
    • In terms of page protection, Ring1/2 are considered to be part of ‘supervisor mode’. The WP bit in CR0 controls whether read-only restrictions are respected in supervisor mode – if the bit is clear then any mapped page is writable. Xen gets around this by always setting the WP bit and disallowing updates to it. xen/arch/x86/boot/x86_32.S#153
    • Xen provides a domain with a list of machine frames during bootstrapping, and it is the domain’s responsibility to create the pseudo-physical address space from this
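
    The update path in the list above (the guest registers a page as a PT, Xen validates each write against page type and ownership, reference counts decide when a frame can be reused, and several updates travel in one hypercall) can be sketched as a toy model. This is plain Python, not Xen code: `ToyXen`, `PageInfo`, `register_l1`, `mmu_update_batch` and `can_reuse` are hypothetical names, only a subset of the page types is modelled, and the real validation rules are considerably richer.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Per-frame bookkeeping, loosely modelled on the page_info fields above."""
    owner: int             # domain that owns the machine frame
    ptype: str = "none"    # subset of the real types: none, l1, RW
    refcount: int = 0      # number of page-table entries that map this frame
    pinned: bool = False   # pinned frames keep their type


class ToyXen:
    def __init__(self):
        self.frames = {}       # mfn -> PageInfo
        self.tables = {}       # mfn of an l1 table -> {index: target mfn}
        self.hypercalls = 0    # number of times the "hypervisor" was entered

    def add_frame(self, mfn, owner):
        self.frames[mfn] = PageInfo(owner)

    def register_l1(self, dom, mfn):
        """Guest hands an initialised page to Xen to serve as a page table."""
        info = self.frames[mfn]
        if info.owner != dom or info.refcount != 0:
            raise PermissionError("frame cannot become a page table")
        info.ptype, info.pinned = "l1", True
        self.tables[mfn] = {}

    def mmu_update_batch(self, dom, updates):
        """Validate and apply (table_mfn, index, target_mfn or None) requests.

        The whole batch costs a single hypervisor entry.
        """
        self.hypercalls += 1
        applied = 0
        for table_mfn, index, target in updates:
            table = self.frames.get(table_mfn)
            if table is None or table.owner != dom or table.ptype != "l1":
                raise PermissionError("not a page table of this domain")
            if target is not None:
                info = self.frames.get(target)
                # A guest may only map its own frames, and never a page
                # table as an ordinary writable page.
                if info is None or info.owner != dom or info.ptype == "l1":
                    raise PermissionError("illegal mapping target")
            old = self.tables[table_mfn].pop(index, None)
            if old is not None:
                self._put(old)
            if target is not None:
                self.tables[table_mfn][index] = target
                info.ptype = "RW"
                info.refcount += 1
            applied += 1
        return applied

    def _put(self, mfn):
        """Drop one reference; an unreferenced frame loses its type."""
        info = self.frames[mfn]
        info.refcount -= 1
        if info.refcount == 0:
            info.ptype = "none"

    def can_reuse(self, mfn):
        """A frame is reusable only when unpinned and unreferenced."""
        info = self.frames[mfn]
        return not info.pinned and info.refcount == 0
```

    In this sketch two mappings submitted together cost one hypervisor entry instead of two, and a request that targets another domain's frame is rejected before anything is written.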

    No guarantee that a domain will receive a contiguous stretch of physical memory. Most OSes do not have good support for operating in a fragmented physical address space.

    • Machine memory
      • entire amount of memory installed in the machine (physical memory)
      • 4kB machine page frames numbered consecutively starting from 0.
    • Pseudo-physical memory
      • per-domain abstraction.
      • allows a guest OS to consider its memory allocation to consist of a contiguous range of physical page frames starting at physical frame 0.
    • machine-to-physical table
      • globally readable table maintained by Xen
      • records the mapping from machine page frames to pseudo-physical page frames
      • table size is proportional to the amount of RAM installed in the machine
    • physical-to-machine table
      • per-domain table which performs the inverse (physical-to-machine) mapping.
      • table size is proportional to the memory allocation of the given domain.
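
    The relationship between the two tables can be illustrated with a toy model. This is a sketch under stated assumptions, not Xen's data layout: `ToyMemory`, `build_domain`, `pfn_to_mfn` and `mfn_to_pfn` are hypothetical names, the machine-to-physical table is a plain list indexed by machine frame number, and each physical-to-machine table is a per-domain list indexed by pseudo-physical frame number.

```python
class ToyMemory:
    def __init__(self, machine_frames):
        # Global table: one slot per machine frame, so its size follows
        # the amount of RAM installed in the machine.
        self.m2p = [None] * machine_frames
        # Per-domain tables: one slot per frame allocated to that domain.
        self.p2m = {}

    def build_domain(self, dom, mfns):
        """Lay a contiguous pseudo-physical space over scattered frames."""
        if len(set(mfns)) != len(mfns) or any(
            self.m2p[m] is not None for m in mfns
        ):
            raise ValueError("machine frame already in use")
        self.p2m[dom] = list(mfns)
        for pfn, mfn in enumerate(mfns):
            self.m2p[mfn] = pfn

    def pfn_to_mfn(self, dom, pfn):
        return self.p2m[dom][pfn]

    def mfn_to_pfn(self, mfn):
        return self.m2p[mfn]


mem = ToyMemory(machine_frames=8)
mem.build_domain(dom=1, mfns=[7, 2, 5])   # fragmented machine frames
# The guest sees frames 0, 1, 2 even though the machine frames are 7, 2, 5.
```

    The guest works with frame numbers 0..2 internally and converts through its physical-to-machine table only when it writes a page table entry, which must contain machine frame numbers.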

    (XEN) VIRTUAL MEMORY ARRANGEMENT (for DOM0)
    (XEN) Loaded kernel: c0100000→c042e254
    (XEN) Init. ramdisk: c042f000→c07fca00
    (XEN) Phys-Mach map: c07fd000→c086e894 == 454 MB (as can be verified by: xm list)
    (XEN) Start info: c086f000→c0870000
    (XEN) Page tables: c0870000→c0874000 == 16 MB
    (XEN) Boot stack: c0874000→c0875000
    (XEN) TOTAL: c0000000→c0c00000 
    (XEN) ENTRY ADDRESS: c0100000
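
    The sizes annotated in the boot log above can be checked with a little arithmetic. The sketch assumes 4-byte phys-to-machine entries and 1024-entry page-table pages (32-bit, non-PAE); neither figure is stated in the log itself, and the helper names are made up for illustration.

```python
PAGE_SIZE = 4096        # bytes per page frame (4 kB, as stated in the notes)
P2M_ENTRY_SIZE = 4      # assumed bytes per phys-to-machine entry on 32-bit x86
L1_ENTRIES = 1024       # assumed entries per page-table page (non-PAE)


def span(start, end):
    """Length in bytes of a [start, end) virtual address range."""
    return end - start


def p2m_covered_mb(start, end):
    """Memory described by a phys-to-machine map occupying [start, end)."""
    entries = span(start, end) // P2M_ENTRY_SIZE
    return entries * PAGE_SIZE / 2**20


def l1_tables_cover_mb(l1_pages):
    """Address space mapped by the given number of l1 page-table pages."""
    return l1_pages * L1_ENTRIES * PAGE_SIZE // 2**20


p2m_mb = p2m_covered_mb(0xC07FD000, 0xC086E894)          # a little over 454 MB
pt_pages = span(0xC0870000, 0xC0874000) // PAGE_SIZE     # 4 pages
total_mb = span(0xC0000000, 0xC0C00000) // 2**20         # 12 MB
```

    Under these assumptions the phys-to-machine map describes just over 454 MB, matching the annotation. The "16 MB" figure is reproduced if all four page-table pages are counted as l1 tables (4 x 4 MB); counting one of them as the page directory leaves three l1 tables covering 12 MB, which equals the TOTAL span. The log does not say which reading was intended.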


    x86-32 Xen supports only guests with 2-level page tables: PGD = l2, PTE = l1


    How to intercept interrupts from guest domains
    http://lists.xensource.com/archives/html/xen-devel/2006-09/msg00597.html
    http://lists.xensource.com/archives/html/xen-devel/2006-09/msg00604.html

    Page fault handling for Xen guests
    http://lists.xensource.com/archives/html/xen-devel/2006-02/msg00263.html

    show pagetable walk if guest cannot handle page
    http://lists.xensource.com/archives/html/xen-devel/2006-09/msg00612.html

    Memory management, mapping, paging questions...
    http://lists.xensource.com/archives/html/xen-devel/2006-10/msg01151.html

    Information related to shadowing
    http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00319.html
    http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00793.html
    http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00802.html

    How to intercept memory operation in Xen
    http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00659.html
    http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00664.html
    http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00717.html

    alert message from dom0 to domU
    http://lists.xensource.com/archives/html/xen-devel/2006-12/msg00967.html

    Share Memory Between DomainU and Domain0
    http://lists.xensource.com/archives/html/xen-devel/2006-12/msg01008.html

    Call hypercall straightly from user space
    http://lists.xensource.com/archives/html/xen-devel/2006-12/msg01061.html


    xen/arch/x86/traps.c#do_page_fault –> fixup_page_fault –> mm.c#ptwr_do_page_fault


    xen-3.0.2-2/xen/arch/x86/setup.c#__start_xen()
                    |                                 \
                    v                                  \
    xen-3.0.2-2/xen/common/domain.c#domain_create()     \
                    |                                    \
                    v                                     \
    xen-3.0.2-2/xen/arch/x86/domain.c#arch_domain_create() \
                                                            \
                                                             v
                    xen-3.0.2-2/xen/arch/x86/domain_build.c#construct_dom0()
    
    Xen-ELF image vmlinux-syms-2.6.16-xen has a special '__xen_guest' section
    
    
    Xen hypercall table:
    /xen-3.0.2-2/xen/arch/x86/x86_32/entry.S
    
    
    # I think this is called when DOM0 attempts to create a DOMU
    xen-3.0.2-2/xen/common/dom0_ops.c#do_dom0_op()
    
    
    
    
    trousers-0.2.7/src/tspi/spi_tpm.c#Tspi_TPM_Quote()
                    |
                    v
    trousers-0.2.7/src/tcsd_api/calltcsapi.c#TCSP_Quote()
                    |
                    v
    trousers-0.2.7/src/tcsd_api/tcstp.c#TCSP_Quote_TP()
                    |
                    v
    trousers-0.2.7/src/tcsd_api/tcstp.c#sendTCSDPacket()

    Original: https://wiki.cs.dartmouth.edu/nihal/doku.php/xen:memory

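    1. How Xen is embedded in Dom0's linear address space on x86_64
    On IA-32 this is done with segment protection: the top 64 MB is Xen's Ring 0 space;
    the rest of the top 1 GB (1 GB - 64 MB) is the kernel's Ring 1 space;
    the remaining 3 GB goes to applications.

    x86_64 has no segment protection, so page protection must be used: 2^64-2^47 --> 2^64 == kernel space
    0 --> 2^47 == user space
    The addresses between the two halves are non-canonical and cannot be mapped; Xen reserves its own range at the bottom of the upper half, starting at 0xFFFF800000000000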
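
    The two layouts can be written down as simple address classifiers. This is an illustrative sketch, not Xen code: the function names are hypothetical, the IA-32 boundaries come from the figures in these notes (3 GB application space, Xen in the top 64 MB from 0xFC000000), and the x86_64 version only separates the two 2^47-byte halves from the non-canonical range between them.

```python
XEN_START_IA32 = 0xFC000000      # top 64 MB, reserved for Xen (Ring 0)
KERNEL_START_IA32 = 0xC0000000   # start of the top 1 GB (guest kernel, Ring 1)


def classify_ia32(addr):
    """Return which region of the 4 GB IA-32 layout an address falls in."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    if addr >= XEN_START_IA32:
        return "xen"
    if addr >= KERNEL_START_IA32:
        return "kernel"
    return "application"


def classify_x86_64(addr):
    """Return which part of the 64-bit address space an address falls in."""
    if not 0 <= addr < 2**64:
        raise ValueError("not a 64-bit address")
    if addr < 2**47:
        return "user half"
    if addr >= 2**64 - 2**47:
        return "kernel half"
    return "non-canonical"


# The Dom0 kernel entry address from the boot log lands in the kernel region.
region = classify_ia32(0xC0100000)
```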

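    2. Xen uses direct mode == the guest OS uses its own page tables to access host physical addresses (HPAs) directly
    Method: page table entries contain HPAs; the page tables themselves are read-only to the guest OS; ordinary pages can be read and written by the guest OS directly.
    Any attempt to write to a page table raises a page fault. To update or manipulate a page table, the guest calls the corresponding hypercall.
    The VMM also ensures that a guest OS can access only its own memory.

    How a guest OS handles memory:
    1. The guest OS accesses a new memory address (a guest virtual address, GVA) and takes a page fault ==> the guest OS's page table must be updated
    2. The guest OS first finds the guest physical address (GPA) of the page table, and the VMM finds the HPA that corresponds to that GPA (through the P2M)
    ==> this amounts to a page table update, so the page table update hypercall (GPA, HPA) is called
    3. If the lower-level page table does not exist, it has to be hooked in
    ==> this amounts to a page table attach operation, so the page table operation hypercall (linear address, HPA) is called
    4. Access that PT and repeat steps 2-3, finally obtaining a GVA ==> HPA translation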
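
    The end result of the steps above is a page table whose entries hold machine frame numbers, which the hardware walks directly. A minimal sketch of such a walk for the 2-level (non-PAE) case follows. It is a toy model: `memory` is a hypothetical dictionary standing in for machine memory, and the 10/10/12-bit split is the standard 32-bit non-PAE layout, not something spelled out in these notes.

```python
PAGE_SHIFT = 12   # 4 kB pages
INDEX_MASK = 0x3FF  # 10 index bits per level


def split(gva):
    """Split a 32-bit virtual address into (l2 index, l1 index, offset)."""
    return (gva >> 22) & INDEX_MASK, (gva >> PAGE_SHIFT) & INDEX_MASK, gva & 0xFFF


def walk(memory, pgd_mfn, gva):
    """Translate gva through l2 (PGD) and l1 (PTE) tables to a machine address.

    memory maps a machine frame number to that table's {index: mfn} entries.
    """
    l2_idx, l1_idx, offset = split(gva)
    l1_mfn = memory[pgd_mfn].get(l2_idx)
    if l1_mfn is None:
        raise LookupError("page fault: l1 table not hooked in")
    mfn = memory[l1_mfn].get(l1_idx)
    if mfn is None:
        raise LookupError("page fault: page not mapped")
    return (mfn << PAGE_SHIFT) | offset


# Frame 10 is the page directory, frame 11 an l1 table, frame 42 a data page.
memory = {10: {0x300: 11}, 11: {0x100: 42}}
machine_addr = walk(memory, pgd_mfn=10, gva=0xC0100123)
```

    A `LookupError` here corresponds to the page fault in step 1 or the missing lower-level table in step 3; in both cases the guest would ask the VMM to install the missing entry and retry.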

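    3. Writable page tables
    Page table operations are expensive (every one requires a hypercall), so in some cases the scheme can be improved.
    The method: first unhook the page table so that nothing else can access it (in practice only the top-level table, the PD, is involved), and treat it as an ordinary read-write page of the guest OS.
    The guest OS then modifies it freely; after many changes it submits the result through a hypercall, and the VMM completes the whole update in one go.
    Prerequisite: PAE mode, because the PDE has only one PD page.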

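    4. Balloon driver (present in both Dom0 and DomU)
    Allocates and releases memory for Dom0 and DomU
    Can report the memory status of its own domain and of the whole machine
    The balloon driver automatically adjusts the domain's memory size to match the target value set in XenStore.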
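
    The adjustment toward a target can be sketched as a single step function. This is a hedged illustration only: `balloon_step` is a hypothetical helper, memory is counted in pages, and the cap at the domain's maximum allocation follows the earlier note that each domain has a maximum and a current allocation. How the real driver reads the target from XenStore is not modelled.

```python
def balloon_step(current_pages, target_pages, max_pages):
    """Move a domain's allocation toward the target, capped at its maximum.

    Returns (new_current, delta). A negative delta means pages are handed
    back to Xen (the balloon inflates); a positive delta means pages are
    claimed from Xen (the balloon deflates).
    """
    if min(current_pages, target_pages, max_pages) < 0:
        raise ValueError("page counts must be non-negative")
    goal = min(target_pages, max_pages)
    return goal, goal - current_pages


# Shrinking a 1000-page domain to a target of 800 releases 200 pages.
new_current, delta = balloon_step(1000, 800, 2000)
```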

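    5. How shared pages are implemented
    The Start Info Page (including its contents) is assembled by the VMM when the domain is initialized. Its contents include the links to the Shared Info Page and to XenStore. One of the first things a domain does on entry is to use a page table update to map its Shared Info Page onto the page that the VMM has already allocated and recorded in the Start Info Page.
    The PV drivers of an HVM guest (mainly) also need the Shared Info Page, of course; in that case the guest sets up its Shared Info Page itself.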

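    4. Wouldn't it be good for Dom0 to use VT-x as well? Is it used?
    No. Paravirtualization does not need VT-x; the aim is better system performance.
    5. What is PAE mode and what does it affect?
    Physical Address Extension (PAE) allows up to 64 GB of physical memory to be used as regular 4 kB pages, and widens the physical address the kernel can use from 32 to 36 bits.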

    Dom0 uses shadow page tables only during migration; at all other times it accesses physical memory directly.

    Notes:
    gpfn/gfn: guest page frame number (the guest OS uses gpfn/gfn to address the guest physical address space)
    mfn: machine page frame number
    smfn: machine page frame number for shadow pages
    l1e: level 1 page table entry
    gl1e: level 1 guest page table entry
    sl1e: level 1 shadow page table entry
    PV: para-virtualization
    HVM: Hardware-assisted Virtual Machine

     
  • Original article: https://www.cnblogs.com/feisky/p/2439067.html