  • xm migrate source code analysis

    The Xen command for live-migrating a VM is: xm migrate --live <domain id> <destination machine>

    How migration works

    Xen live migration begins by sending a request, or reservation, to the target specifying the resources the migrating domain will need. If the target accepts the request, the source begins the iterative precopy phase of migration. During this step, Xen copies pages of memory over a TCP connection to the destination host. While this is happening, pages that change are marked as dirty and then recopied. The machine iterates this until only very frequently changed pages remain, at which point it begins the stop and copy phase. Now Xen stops the VM and copies over any pages that change too frequently to efficiently copy during the previous phase. In practice, our testing suggests that Xen usually reaches this point after four to eight iterations. Finally the VM starts executing on the new machine.

    By default, Xen will iterate up to 29 times and stop if the number of dirty pages falls below a certain threshold. You can specify this threshold and the number of iterations at compile time, but the defaults should work fine.
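
The iterative precopy loop described above can be sketched as a small simulation. This is a toy model, not the code in xc_save: the function name, the fixed re-dirty percentage, and the threshold of 50 pages are assumptions chosen for illustration. Only the limit of 29 iterations comes from the text.

```python
def simulate_precopy(total_pages, redirty_percent, max_iters=29, threshold=50):
    """Toy model of iterative precopy (illustrative, not Xen's algorithm).

    Every round sends the pages that are currently dirty. While a round is
    in flight, redirty_percent of the pages just sent are assumed to be
    written again and must be resent.
    Returns (rounds_run, pages_left_for_stop_and_copy).
    """
    to_send = total_pages  # in the first round every page counts as dirty
    rounds = 0
    while rounds < max_iters and to_send >= threshold:
        rounds += 1
        # Pages dirtied during this round form the next round's work list.
        to_send = to_send * redirty_percent // 100
    return rounds, to_send


if __name__ == "__main__":
    rounds, remaining = simulate_precopy(total_pages=262144, redirty_percent=20)
    print("precopy rounds:", rounds)
    print("pages left for stop-and-copy:", remaining)
```

In this model a guest that re-dirties every page it sends never converges, so the loop only ends because of the iteration cap; that is the case the stop-and-copy phase exists to handle.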

    Xen live migration consists of two parts, save and restore. We look at the save part first.
    tools/python/xen/xm/migrate.py

    This function mainly parses the arguments of the command that was entered and then jumps to
     server.xend.domain.migrate(dom, dst, opts.vals.live,
                                       opts.vals.port,
                                       opts.vals.node,
                                       opts.vals.ssl)
     -->XendDomain.domain_migrate()
    tools/python/xen/xend/XendDomain.py


    domain_migrate(self, domid, dst, live=False, port=0, node=-1, ssl=None):
    1. Make sure the VM exists and is running
      dominfo = self.domain_lookup_nr(domid) # get a structure of XendDomainInfo
      if not dominfo:
          raise XendInvalidDomain(str(domid))
      if dominfo.getDomid() == DOM0_ID:
          raise XendError("Cannot migrate privileged domain %s" % domid)
      if dominfo._stateGet() != DOM_STATE_RUNNING:
          raise VMBadState("Domain is not running",
              POWER_STATE_NAMES[DOM_STATE_RUNNING],POWER_STATE_NAMES[dominfo._stateGet()])
    2. Notify all devices of the intended migration
      
      dominfo.testMigrateDevices(True, dst) 
      ---> XendDomainInfo.migrateDevice(n, c, network, dst, DEV_MIGRATE_TEST, self.getName()) 
      ---> XendDomainInfo.getDeviceController(deviceClass).migrate(deviceConfig,
                                            network, dst, step, domName)
      ---> DevController.migrate()
           Because xoptions.get_external_migration_tool() returns an empty value, this actually does nothing and simply returns 0
        
       This function is called for 4 steps:

            If step == 0: Check whether the device is ready to be migrated
                          or can at all be migrated; return a '-1' if
                          the device is NOT ready, a '0' otherwise. If it is
                          not ready ( = not possible to migrate this device),
                          migration will not take place.
               step == 1: Called immediately after step 0; migration
                          of the kernel has started;
               step == 2: Called after the suspend has been issued
                          to the domain and the domain is not scheduled anymore.
                          Synchronize with what was started in step 1, if necessary.
                          Now the device should initiate its transfer to the
                          given target. Since there might be more than just
                          one device initiating a migration, this step should
                          put the process performing the transfer into the
                          background and return immediately to achieve as much
                          concurrency as possible.
               step == 3: Synchronize with the migration of the device that
                          was initiated in step 2.
                          Make sure that the migration has finished and only
                          then return from the call.
         Here DEV_MIGRATE_TEST = 0, i.e. step 0
    3. For live migration, make sure there's memory free for enabling shadow mode
       dominfo.checkLiveMigrateMemory()
       --->XendDomainInfo.checkLiveMigrateMemory()
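           The memory needed for migration is 1 MB per vcpu plus 4 KiB per MiB of RAM
           This memory is obtained through balloon.free(overhead_kb, self)
    4. If the --ssl option is used, establish an SSL connection; otherwise establish a plain TCP connection
    5. Start the migration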
       XendCheckpoint.save(sock.fileno(), dominfo, True, live, dst, node=node)
       --->XendCheckpoint.save()
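
The overhead estimate in step 3 works out as follows. This is a minimal sketch of the arithmetic only, using the rule quoted above (1 MB per vcpu plus 4 KiB per MiB of RAM). The function name and its parameters are hypothetical, and the real checkLiveMigrateMemory() may round or adjust the result further.

```python
def live_migrate_overhead_kb(vcpus, ram_mib):
    """Estimated shadow-mode overhead in KiB for a live migration.

    Rule from the text: 1 MiB per vcpu plus 4 KiB per MiB of guest RAM.
    """
    per_vcpu_kb = 1024     # 1 MiB expressed in KiB
    per_ram_mib_kb = 4     # 4 KiB for every MiB of guest RAM
    return vcpus * per_vcpu_kb + ram_mib * per_ram_mib_kb


if __name__ == "__main__":
    # A guest with 2 vcpus and 4096 MiB of RAM.
    overhead = live_migrate_overhead_kb(2, 4096)
    print(overhead, "KiB =", overhead // 1024, "MiB")
```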
    tools/python/xen/xend/XendCheckpoint.py

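    def save(fd, dominfo, network, live, dst, checkpoint=False, node=-1):
      
      1. First send the SIGNATURE, i.e. LinuxGuestRecord
      2. To avoid a name clash when the VM is migrated to the local host, temporarily rename the source VM to migrating-domain_name
      3. Send the configuration
      4. The actual migration is done in forkHelper(cmd, fd, saveInputHandler, False), which creates a child process to run xc_save while the main process carries on with the following steps:
         Here cmd is /usr/lib64/xen/bin/xc_save fd domid 0 0 str(int(live)|(int(hvm)<<2))
         (the argument format of xc_save is /usr/lib64/xen/bin/xc_save iofd domid maxit maxf flags)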
          def saveInputHandler(line, tochild):
                log.debug("In saveInputHandler %s", line)
                if line == "suspend":
                    log.debug("Suspending %d ...", dominfo.getDomid())
                    dominfo.shutdown('suspend')
                    dominfo.waitForSuspend()
                if line in ('suspend', 'suspended'):
                    dominfo.migrateDevices(network, dst, DEV_MIGRATE_STEP2,
                                           domain_name)
                    log.info("Domain %d suspended.", dominfo.getDomid())
                    dominfo.migrateDevices(network, dst, DEV_MIGRATE_STEP3,
                                           domain_name)
                    if hvm:
                        dominfo.image.saveDeviceModel()

                if line == "suspend":
                    tochild.write("done\n")
                    tochild.flush()
                    log.debug('Written done')
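      5. Send the state of the qemu device model (i.e. /var/lib/xen/qemu-save.7)
      6. If the checkpoint argument is True, resume the suspended VM (the call from domain_migrate() above does not pass checkpoint, so it keeps the default False)
      7. Finally, give the renamed VM its original name back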
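
The xc_save command line from step 4 can be assembled as shown here. This sketch follows only what the text states: the path /usr/lib64/xen/bin/xc_save, maxit and maxf both 0, and a flags word of int(live) | (int(hvm) << 2). The helper name build_xc_save_cmd is hypothetical and is not part of xend.

```python
XC_SAVE = "/usr/lib64/xen/bin/xc_save"  # path as given in the text


def build_xc_save_cmd(fd, domid, live, hvm):
    """Return the argv list: xc_save iofd domid maxit maxf flags."""
    # Bit 0 marks a live migration, bit 2 marks an HVM guest.
    flags = int(live) | (int(hvm) << 2)
    # maxit and maxf are passed as 0, as in the command shown in the text.
    return [XC_SAVE, str(fd), str(domid), "0", "0", str(flags)]


if __name__ == "__main__":
    # Live migration of HVM domain 7 over file descriptor 5.
    print(" ".join(build_xc_save_cmd(fd=5, domid=7, live=True, hvm=True)))
```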
    Next comes the restore part
    tools/python/xen/xend/server/SrvDaemon.py

    When xend starts, it opens a listener on port 8002:
    Daemon()-->run()
    -->relocate.listenRelocation()
    -->tcp.TCPListener(RelocationProtocol, port, interface = interface, hosts_allow = hosts_allow)
    The corresponding parameters are set in the xend configuration file:
    (xend-relocation-server yes)
    (xend-relocation-port 8002)
    (xend-relocation-hosts-allow '^localhost$ ^localhost\\.localdomain$')
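
How a setting such as xend-relocation-hosts-allow might be applied to an incoming connection can be sketched as follows. This assumes the value is a whitespace-separated list of regular expressions, that a peer is accepted when any of them matches its host name, and that an empty value allows every host. The function name host_allowed is hypothetical, and this is not xend's own code.

```python
import re


def host_allowed(hostname, hosts_allow):
    """Return True if hostname matches one of the space-separated regexes."""
    patterns = hosts_allow.split()
    if not patterns:
        # Assumption: an empty setting places no restriction on peers.
        return True
    return any(re.search(pattern, hostname) for pattern in patterns)


if __name__ == "__main__":
    # Same patterns as the config line above, with the S-expression
    # string escaping (the doubled backslash) already removed.
    setting = r"^localhost$ ^localhost\.localdomain$"
    for peer in ("localhost", "localhost.localdomain", "other.example.com"):
        print(peer, host_allowed(peer, setting))
```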
    Once a connection is established it is handled by RelocationProtocol, whose most important function is op_receive:
         def op_receive(self, name, _):
            if self.transport:
                self.send_reply(["ready", name])
                try:
                    XendDomain.instance().domain_restore_fd(
                        self.transport.sock.fileno(), relocating=True)

                except:
                    self.send_error()
                    self.close()
            else:
                log.error(name + ": no transport")
                raise XendError(name + ": no transport")
     XendDomain.domain_restore_fd() in turn does its work through XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating).
    tools/python/xen/xend/XendCheckpoint.py

    def restore(xd, fd, dominfo = None, paused = False, relocating = False):
      
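      1. First receive and verify the SIGNATURE
      2. Receive the configuration
      3. Make sure no VM on this host has the same name or the same UUID as the VM being migrated
      4. Build the XendDomainInfo structure from the received configuration: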
         
         dominfo = xd.restore_(vmconfig)-->XendDomainInfo.restore(config)
      5. If node_to_cpu was set for the source VM, pin the vcpus to the corresponding pcpus
      6. Create the image and allocate memory (shadow_memory + memory_dynamic_max)
         restore_image = image.create(dominfo, dominfo.info)
         balloon.free(memory + shadow, dominfo)
      7. The actual migration is done in forkHelper(cmd, fd, handler.handler, True), which creates a child process to run xc_restore while the main process carries on with the following steps:

         Here cmd is:
         cmd = map(str, [xen.util.auxbin.pathTo(XC_RESTORE),
                            fd, dominfo.getDomid(),
                            store_port, console_port, int(is_hvm), pae, apic])

         (the argument format of xc_restore is /usr/lib64/xen/bin/xc_restore iofd domid store_evtchn console_evtchn hvm pae apic)

          def handler(self, line, _):
            m = re.match(r"^(store-mfn) (\d+)$", line)
            if m:
                self.store_mfn = int(m.group(2))
            else:
                m = re.match(r"^(console-mfn) (\d+)$", line)
                if m:
                    self.console_mfn = int(m.group(2))

      8. Receive the state of the qemu device model (i.e. /var/lib/xen/qemu-save.7)
      9. Set the VM's cpuid and create the VM:

            restore_image.setCpuid()
            os.read(fd, 1)           # Wait for source to close connection
            dominfo.completeRestore(handler.store_mfn, handler.console_mfn)

         completeRestore() in turn does:

            self._introduceDomain()
            self.image = image.create(self, self.info)
            if self.image:
                self.image.createDeviceModel(True)
            self._storeDomDetails()
            self._registerWatches()
            self.refreshShutdown()

      10. Finally, start the VM:

         dominfo.waitForDevices()
         dominfo.unpause()
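
The handler shown above picks two values out of the lines printed by xc_restore. Here is a self-contained sketch of the same parsing, using the regular expressions from the excerpt. The class name RestoreOutput and the sample lines are made up for illustration.

```python
import re


class RestoreOutput:
    """Collects store-mfn and console-mfn from helper output lines."""

    def __init__(self):
        self.store_mfn = None
        self.console_mfn = None

    def handler(self, line):
        m = re.match(r"^(store-mfn) (\d+)$", line)
        if m:
            self.store_mfn = int(m.group(2))
            return
        m = re.match(r"^(console-mfn) (\d+)$", line)
        if m:
            self.console_mfn = int(m.group(2))
        # Any other line is ignored.


if __name__ == "__main__":
    out = RestoreOutput()
    # Sample lines invented for this example.
    for line in ["store-mfn 12345", "unrelated output", "console-mfn 67890"]:
        out.handler(line)
    print("store_mfn:", out.store_mfn)
    print("console_mfn:", out.console_mfn)
```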

    Source code of xc_save and xc_restore

    I have not looked at these closely yet.

  • Original article: https://www.cnblogs.com/feisky/p/2433238.html