  • KVM and Qemu as Linux Hypervisor

     instruction set to provide isolation of resources at the hardware level. Since Qemu is a userspace process, the kernel treats it like any other process from a scheduling perspective.

    Before discussing Qemu and KVM, we touch upon Intel VT-x and the instructions it adds.

    VT-x addresses the problem that the original x86 instruction set architecture cannot be virtualized with classic trap-and-emulate techniques.

    It is designed to simplify VMM software by closing virtualization holes by design, namely:

    1. Ring compression
    2. Non-trapping instructions
    3. Excessive trapping

    It also eliminates the need for software virtualization techniques (i.e., paravirtualization and binary translation).

    VT-x adds one more mode, called non-root mode, in which the virtualized guest runs. The guest doesn't necessarily have to be an operating system, though: projects like Dune run a process within the VM environment rather than a complete OS. The VMM runs in root mode; this is the mode in which KVM runs.

    A transition from non-root mode to root mode happens via a VM exit, and a transition from root mode to non-root mode happens via a VM entry. The registers and address spaces are swapped in a single atomic operation.

    Qemu runs as a user process and drives the KVM kernel module, which uses the VT-x extensions to give the guest an environment that is isolated from a memory and CPU perspective.

    The Qemu process owns the guest RAM, which is either a file-backed or an anonymous memory mapping. Each vCPU provided to the guest runs as a thread on the host, which has the advantage that vCPUs are scheduled by the Linux scheduler like any other thread. The only difference is the code those threads execute. Since it is a whole machine that is virtualized, the guest code includes the software BIOS as well as the operating system.
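
    To make the two kinds of guest RAM backing concrete, here is a minimal sketch using Python's standard mmap module. It is not how Qemu itself allocates memory; the size, the file name guest-ram.img and the function names are hypothetical and chosen only for illustration.

```python
import mmap
import os
import tempfile

GUEST_RAM_SIZE = 2 * 1024 * 1024  # hypothetical: 2 MiB of "guest RAM"


def anonymous_ram(size):
    """Anonymous mapping: memory that is not backed by any file."""
    return mmap.mmap(-1, size)


def file_backed_ram(path, size):
    """File-backed mapping: the file's pages hold the memory contents."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)       # give the file the desired length
        return mmap.mmap(fd, size)   # read/write mapping of the file
    finally:
        os.close(fd)                 # the mapping keeps its own handle


def demo_guest_ram():
    anon = anonymous_ram(GUEST_RAM_SIZE)
    anon[0x1000:0x1004] = b"\xde\xad\xbe\xef"   # write at offset 0x1000
    with tempfile.TemporaryDirectory() as tmp:
        backed = file_backed_ram(os.path.join(tmp, "guest-ram.img"),
                                 GUEST_RAM_SIZE)
        backed[0x1000:0x1004] = b"\xca\xfe\xba\xbe"
        result = (bytes(anon[0x1000:0x1004]), bytes(backed[0x1000:0x1004]))
        backed.close()
    anon.close()
    return result
```

    In both cases the owning process sees an ordinary byte range in its own address space; in a real hypervisor it is this range that gets handed to KVM as guest physical memory.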

    Qemu also dedicates a separate thread to I/O. This thread runs an event loop built on non-blocking I/O, and the file descriptors used for I/O are registered with it. Qemu can use paravirtualized drivers like virtio to provide guests with virtio devices, such as virtio-blk for block devices and virtio-net for network devices.
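
    The event-loop idea can be sketched with Python's standard selectors module. This is only an illustration of registering non-blocking file descriptors with a loop and dispatching callbacks, not Qemu's actual main loop; the socket pair stands in for a device file descriptor, and all names are hypothetical.

```python
import selectors
import socket


def run_event_loop(max_events=1):
    """Minimal event loop over non-blocking file descriptors."""
    sel = selectors.DefaultSelector()
    device_end, backend_end = socket.socketpair()  # stand-in for a device fd
    device_end.setblocking(False)
    backend_end.setblocking(False)

    received = []

    def on_readable(sock):
        # Handler the loop calls when the fd has data to read.
        received.append(sock.recv(4096))

    # Register the fd together with the handler to call when it is readable.
    sel.register(device_end, selectors.EVENT_READ, on_readable)

    backend_end.send(b"block request")  # simulate I/O becoming ready

    handled = 0
    while handled < max_events:
        events = sel.select(timeout=1.0)
        if not events:
            break                        # nothing became ready in time
        for key, _mask in events:
            key.data(key.fileobj)        # dispatch to the registered handler
            handled += 1

    sel.unregister(device_end)
    sel.close()
    device_end.close()
    backend_end.close()
    return received
```

    Because the loop never blocks on a single descriptor, one thread can service many devices; run_event_loop() returns the list of payloads its handler saw.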

    The guest running within the Qemu process implements the front-end driver, whereas the host implements the back-end driver. The front-end and back-end drivers communicate over specialized data structures called virtqueues. Any packet that originates in the guest is first placed in a virtqueue, and the host-side driver is then notified via a hypercall so that it can drain the packet and hand it to the device for actual processing. There are two variations of this packet flow.

    1. The packet from the guest is received by Qemu and then pushed to the back-end driver on the host. An example is virtio-net.
    2. The packet from the guest reaches the host directly via what is called a vhost driver. This bypasses the Qemu layer and is relatively faster.
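
    The front-end/back-end handoff described above can be modelled with a toy queue. This sketch deliberately does not follow the real virtio ring layout; the class names, the on_kick hook and the "used" bookkeeping are all assumptions made for illustration.

```python
from collections import deque


class ToyVirtqueue:
    """Toy stand-in for a virtqueue; NOT the real virtio ring layout."""

    def __init__(self, size, on_kick):
        self.size = size
        self.avail = deque()    # buffers the guest has made available
        self.used = deque()     # lengths of buffers the host has consumed
        self.on_kick = on_kick  # hypothetical guest-to-host notification hook

    def add_buf(self, buf):
        """Front end (guest): queue a buffer for the back end."""
        if len(self.avail) >= self.size:
            raise BufferError("queue full")
        self.avail.append(buf)

    def kick(self):
        """Front end (guest): notify the back end that buffers are waiting."""
        self.on_kick(self)


class ToyBackend:
    """Back end (host side): drains the queue when notified."""

    def __init__(self):
        self.transmitted = []

    def drain(self, vq):
        while vq.avail:
            buf = vq.avail.popleft()
            self.transmitted.append(buf)   # hand off to the "device"
            vq.used.append(len(buf))       # report completion to the guest


def demo_virtqueue():
    backend = ToyBackend()
    vq = ToyVirtqueue(size=4, on_kick=backend.drain)
    vq.add_buf(b"packet-1")
    vq.add_buf(b"packet-2")
    vq.kick()                              # one notification, two packets
    return backend.transmitted, list(vq.used)
```

    Note that the guest can queue several buffers before a single notification, which keeps the number of guest-to-host transitions low. In this model the two packet-flow variations differ only in who plays ToyBackend: Qemu in userspace, or a vhost driver in the host kernel.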

    There is also a hotplug capability that makes devices dynamically available in the guest. As an example, this allows block devices to be resized dynamically. There is also a hotplug-dimm module that allows the RAM available to the guest to be resized.

    Finally, a VM is created through a set of ioctl calls to the KVM kernel module, which exposes the device /dev/kvm to userspace. In simplified terms, these are the calls userspace makes to create and launch a VM:

    KVM_CREATE_VM : The new VM has no virtual CPUs and no memory

    KVM_SET_USER_MEMORY_REGION : Map userspace memory for the VM

    KVM_CREATE_IRQCHIP / …PIT, KVM_CREATE_VCPU : Create hardware components and map them to VT-x functionality

    KVM_SET_REGS / …SREGS / …FPU / …, KVM_SET_CPUID / …MSRS / …VCPU_EVENTS / …, KVM_SET_LAPIC : Hardware configuration

    KVM_RUN : Start the VM

    KVM_RUN starts the VM. Internally, the kernel module invokes the VMLAUNCH instruction to put VM code execution into non-root mode, after which the instruction pointer is set to the location of the code in guest memory. This is an oversimplification, as the module does much more to set up the VM, such as setting up the VMCS (Virtual Machine Control Structure).
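
    As a rough sketch of the first steps of this sequence, the snippet below opens /dev/kvm and issues KVM_GET_API_VERSION, KVM_CREATE_VM and KVM_CREATE_VCPU through Python's fcntl.ioctl. The ioctl numbers are derived from the KVMIO type byte (0xAE) in <linux/kvm.h>, and the encoding assumes an x86 or ARM Linux host; the function name and its return convention are hypothetical. It stops before KVM_SET_USER_MEMORY_REGION and KVM_RUN, which need C structures and a mapped run area.

```python
import fcntl
import os

KVMIO = 0xAE  # ioctl "type" byte used by KVM in <linux/kvm.h>


def _io(nr):
    """_IO(KVMIO, nr): argument-less ioctl number (x86/ARM Linux encoding)."""
    return (KVMIO << 8) | nr


KVM_GET_API_VERSION = _io(0x00)
KVM_CREATE_VM = _io(0x01)
KVM_CREATE_VCPU = _io(0x41)


def create_vm_and_vcpu(dev_path="/dev/kvm"):
    """Return the KVM API version after creating a VM and one vCPU,
    or None if KVM is unavailable (hypothetical convention for this sketch)."""
    try:
        kvm_fd = os.open(dev_path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None                      # no KVM device, or no permission
    try:
        api_version = fcntl.ioctl(kvm_fd, KVM_GET_API_VERSION, 0)
        # The new VM has no vCPUs and no memory yet; 0 is the machine type.
        vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        try:
            # Create vCPU 0; the ioctl is issued on the VM fd, not /dev/kvm.
            vcpu_fd = fcntl.ioctl(vm_fd, KVM_CREATE_VCPU, 0)
            os.close(vcpu_fd)
        finally:
            os.close(vm_fd)
        return api_version
    except OSError:
        return None                      # KVM present but the call failed
    finally:
        os.close(kvm_fd)
```

    Each step returns a new file descriptor: the system-level ioctls go to /dev/kvm, the VM-level ones to the VM descriptor, and the per-CPU ones (including KVM_RUN) to the vCPU descriptor.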

    Disclaimer : The views expressed above are personal and not of the company I work for.

  • Original article: https://www.cnblogs.com/dream397/p/14274297.html