  • Scheduler flow for boot, rebuild, resize, and migrate

    Code call flow:

    1. nova.scheduler.client.query.SchedulerQueryClient#select_destinations
    2. nova.scheduler.rpcapi.SchedulerAPI#select_destinations
    3. nova.scheduler.manager.SchedulerManager#select_destinations
    4. nova.scheduler.filter_scheduler.FilterScheduler#select_destinations

    The call from the scheduler's rpcapi to its manager is synchronous: the caller blocks until the manager returns the result.

    In step 3 the scheduler calls the API provided by placement to pre-filter all compute nodes. The placement API returns a dictionary in the following format:

    {
        "provider_summaries": {
            "4cae2ef8-30eb-4571-80c3-3289e86bd65c": {
                "resources": {
                    "VCPU": {
                        "used": 2,
                        "capacity": 64
                    },
                    "MEMORY_MB": {
                        "used": 1024,
                        "capacity": 11374
                    },
                    "DISK_GB": {
                        "used": 2,
                        "capacity": 49
                    }
                }
            }
        },
        "allocation_requests": [
            {
                "allocations": [
                    {
                        "resource_provider": {
                            "uuid": "4cae2ef8-30eb-4571-80c3-3289e86bd65c"
                        },
                        "resources": {
                            "VCPU": 1,
                            "MEMORY_MB": 512,
                            "DISK_GB": 1
                        }
                    }
                ]
            }
        ]
    }
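The response above can be read with plain dictionary access. The following is a minimal sketch, not Nova's own code: the helper names `candidate_provider_uuids` and `free_resources` are made up for illustration, and the sample body is the one shown above.

```python
# Sample body copied from the placement response shown above.
response = {
    "provider_summaries": {
        "4cae2ef8-30eb-4571-80c3-3289e86bd65c": {
            "resources": {
                "VCPU": {"used": 2, "capacity": 64},
                "MEMORY_MB": {"used": 1024, "capacity": 11374},
                "DISK_GB": {"used": 2, "capacity": 49},
            }
        }
    },
    "allocation_requests": [
        {
            "allocations": [
                {
                    "resource_provider": {
                        "uuid": "4cae2ef8-30eb-4571-80c3-3289e86bd65c"
                    },
                    "resources": {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 1},
                }
            ]
        }
    ],
}


def candidate_provider_uuids(body):
    """Collect the UUID of every provider named in allocation_requests."""
    uuids = set()
    for request in body["allocation_requests"]:
        for allocation in request["allocations"]:
            uuids.add(allocation["resource_provider"]["uuid"])
    return uuids


def free_resources(body, uuid):
    """Return capacity minus used for each resource class of one provider."""
    resources = body["provider_summaries"][uuid]["resources"]
    return {rc: v["capacity"] - v["used"] for rc, v in resources.items()}


uuids = candidate_provider_uuids(response)
free = free_resources(response, "4cae2ef8-30eb-4571-80c3-3289e86bd65c")
print(uuids)
print(free)
```

The UUIDs collected this way are the hosts that enter the "get all hosts" step described next.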

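    For the nodes pre-selected by the placement API, the scheduler filters again. Roughly: all hosts => filtering => weighting => random
    1. get all hosts: "all hosts" here does not mean every host in the environment, but the detailed information for all hosts returned through the placement API;
    2. filtering: first apply ignore hosts and force hosts; if a force host or force node is set, the result is returned directly. Then the filters are run one after another according to the available_filters and enabled_filters options in the nova configuration file. A few example filters follow. The entry point for running filters: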

    nova.filters.BaseFilterHandler#get_filtered_objects
     
        def get_filtered_objects(self, filters, objs, spec_obj, index=0):
            list_objs = list(objs)
            LOG.debug("Starting with %d host(s)", len(list_objs))
            part_filter_results = []
            full_filter_results = []
            log_msg = "%(cls_name)s: (start: %(start)s, end: %(end)s)"
            # Loop over the filters specified in the configuration file
            for filter_ in filters:
                if filter_.run_filter_for_index(index):
                    cls_name = filter_.__class__.__name__
                    # Record the number of hosts before this filter runs
                    start_count = len(list_objs)
                    # Run this filter against all hosts; only the hosts that pass are returned
                    objs = filter_.filter_all(list_objs, spec_obj)
                    if objs is None:
                        LOG.debug("Filter %s says to stop filtering", cls_name)
                        return
                    list_objs = list(objs)
                    end_count = len(list_objs)
                    part_filter_results.append(log_msg % {"cls_name": cls_name,
                            "start": start_count, "end": end_count})
                    if list_objs:
                        remaining = [(getattr(obj, "host", obj),
                                      getattr(obj, "nodename", ""))
                                     for obj in list_objs]
                        full_filter_results.append((cls_name, remaining))
                    else:
                        LOG.info(_LI("Filter %s returned 0 hosts"), cls_name)
                        full_filter_results.append((cls_name, None))
                        break
                    LOG.debug("Filter %(cls_name)s returned "
                              "%(obj_len)d host(s)",
                              {'cls_name': cls_name, 'obj_len': len(list_objs)})
            # The code below only logs detailed information and is not covered here
            …………
            return list_objs
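The core of the loop above is "apply each filter to what survived the previous one, and stop once nothing is left". Here is a minimal standalone sketch of that idea. The two filter classes and the dictionary-based hosts are hypothetical stand-ins, not Nova classes; only the `filter_all` method name is taken from the excerpt.

```python
class MinFreeRamFilter:
    """Hypothetical filter: keep hosts with at least the requested RAM."""

    def filter_all(self, hosts, spec):
        return (h for h in hosts if h["free_ram_mb"] >= spec["ram_mb"])


class NotIgnoredFilter:
    """Hypothetical filter: drop hosts listed in spec['ignore_hosts']."""

    def filter_all(self, hosts, spec):
        return (h for h in hosts if h["host"] not in spec["ignore_hosts"])


def run_filters(filters, hosts, spec):
    """Apply each filter in order; stop early once no host is left."""
    remaining = list(hosts)
    for filter_ in filters:
        remaining = list(filter_.filter_all(remaining, spec))
        if not remaining:
            break
    return remaining


hosts = [
    {"host": "node1", "free_ram_mb": 2048},
    {"host": "node2", "free_ram_mb": 256},
    {"host": "node3", "free_ram_mb": 4096},
]
spec = {"ram_mb": 512, "ignore_hosts": ["node3"]}
survivors = run_filters([MinFreeRamFilter(), NotIgnoredFilter()], hosts, spec)
print([h["host"] for h in survivors])
```

Because every filter only sees the survivors of the previous one, putting cheap, highly selective filters first reduces the work done by later ones.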

    Next, a look at a few filters.

    class AvailabilityZoneFilter(filters.BaseHostFilter):
     
        # When several instances are created in one request, AvailabilityZoneFilter runs only once
        run_filter_once_per_request = True
        # Every filter must implement this method
        def host_passes(self, host_state, spec_obj):
            # Get the availability_zone from the request_spec. Note: if --availability-zone was not given at creation time, availability_zone in the request_spec is empty.
            availability_zone = spec_obj.availability_zone
            # An empty availability_zone in the request_spec means the operation is allowed to cross AZs.
            if not availability_zone:
                return True
            # Get the host's availability_zone: look up the aggregates the host belongs to; the aggregate metadata carries the availability_zone
            metadata = utils.aggregate_metadata_get_by_host(
                    host_state, key='availability_zone')
     
            if 'availability_zone' in metadata:
                # Check whether the availability_zone requested in the request_spec is among the host's availability zones.
                hosts_passes = availability_zone in metadata['availability_zone']
                host_az = metadata['availability_zone']
            else:
                hosts_passes = availability_zone == CONF.default_availability_zone
                host_az = CONF.default_availability_zone
     
            if not hosts_passes:
                LOG.debug("Availability Zone '%(az)s' requested. "
                          "%(host_state)s has AZs: %(host_az)s",
                          {'host_state': host_state,
                           'az': availability_zone,
                           'host_az': host_az})
     
            return hosts_passes
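The decision made by `host_passes` above reduces to three cases. The sketch below restates it as a pure function so the cases can be tried in isolation. It is a simplification: `az_passes` is a hypothetical name, the host's aggregate metadata is modeled as a plain set of AZ names, and `"nova"` is assumed as the value of `CONF.default_availability_zone`.

```python
DEFAULT_AZ = "nova"  # assumed stand-in for CONF.default_availability_zone


def az_passes(requested_az, host_azs, default_az=DEFAULT_AZ):
    """Mirror the filter's decision for a single host.

    requested_az: AZ named in the request, or None if none was given.
    host_azs: set of AZ names found in the host's aggregates (may be empty).
    """
    if not requested_az:
        return True  # no AZ requested: any host is acceptable
    if host_azs:
        return requested_az in host_azs  # host belongs to explicit AZ(s)
    return requested_az == default_az  # host falls back to the default AZ


print(az_passes(None, {"az1"}))
print(az_passes("az2", {"az1"}))
print(az_passes("nova", set()))
```
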
    nova.scheduler.filters.image_props_filter.ImagePropertiesFilter#host_passes
     
        # Filters mainly on the property values of the image; used in ironic scheduling.
        def host_passes(self, host_state, spec_obj):
            image_props = spec_obj.image.properties if spec_obj.image else {}
            # Check whether this compute_node supports the values given in the image properties.
            if not self._instance_supported(host_state, image_props,
                                            host_state.hypervisor_version):
                LOG.debug("%(host_state)s does not support requested "
                            "instance_properties", {'host_state': host_state})
                return False
            return True
         
        def _instance_supported(self, host_state, image_props,
                                hypervisor_version):
            img_arch = image_props.get('hw_architecture') # architecture, i686 or x86_64
            img_h_type = image_props.get('img_hv_type') # hypervisor type
            img_vm_mode = image_props.get('hw_vm_mode') # virtualization mode
            …………
            # Get the instance types this compute_node supports; the value is a list, for example:
            # [["x86_64", "baremetal", "hvm"]]
            # [["i686", "qemu", "hvm"], ["i686", "kvm", "hvm"], ["x86_64", "qemu", "hvm"], ["x86_64", "kvm", "hvm"]]
            supp_instances = host_state.supported_instances
            …………
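            # Comparison rule
            def _compare_props(props, other_props):
                # Iterate over every value specified by the image properties
                for i in props:
                    # Check whether this compute_node supports the property value
                    if i and i not in other_props:
                        return False
                return True
            # Iterate over every type this compute_node supports
            for supp_inst in supp_instances:
                # A matching entry means the host passes (the full source also
                # checks the hypervisor version before returning)
                if _compare_props(checked_img_props, supp_inst):
                    return True
            return False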

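    Ironic scheduling relies heavily on ImagePropertiesFilter: the images used for virtual machines and the images used for bare metal carry different property values. Combined with the placement-based scheduling, this ensures that virtual machines are not scheduled onto ironic nodes and that bare-metal instances are not scheduled onto qemu nodes.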
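To see how the comparison rule keeps the two kinds of workload apart, the sketch below reuses the two `supported_instances` examples from the excerpt. The image property tuples and the function names are hypothetical, and the hypervisor version check is left out.

```python
def compare_props(props, other_props):
    """True if every non-empty requested value appears in other_props."""
    for value in props:
        if value and value not in other_props:
            return False
    return True


def instance_supported(requested, supported_instances):
    """True if any (arch, hv_type, vm_mode) entry satisfies the request."""
    return any(compare_props(requested, entry) for entry in supported_instances)


# supported_instances values taken from the examples in the excerpt
ironic_node = [["x86_64", "baremetal", "hvm"]]
kvm_node = [
    ["i686", "qemu", "hvm"], ["i686", "kvm", "hvm"],
    ["x86_64", "qemu", "hvm"], ["x86_64", "kvm", "hvm"],
]

# Hypothetical image properties: (hw_architecture, img_hv_type, hw_vm_mode)
vm_image = ("x86_64", "qemu", None)
bm_image = ("x86_64", "baremetal", None)

print(instance_supported(vm_image, kvm_node), instance_supported(vm_image, ironic_node))
print(instance_supported(bm_image, ironic_node), instance_supported(bm_image, kvm_node))
```

Unset properties (`None`) are skipped by the comparison, so an image that sets no properties at all matches every node. The separation only works when the images actually carry the distinguishing property.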

    3. weighting: compute a weight for each host that passed the filters and sort the hosts best first. A few weigher examples follow:

    class BaseWeightHandler(loadables.BaseLoader):
        object_class = WeighedObject
     
        def get_weighed_objects(self, weighers, obj_list, weighing_properties):
            """Return a sorted (descending), normalized list of WeighedObjects."""
            # obj_list is the list of hosts that passed the filters
            # weighing_properties is the request_spec information
            weighed_objs = [self.object_class(obj, 0.0) for obj in obj_list]
            # If at most one host is left after filtering, there is nothing to compare; return it directly
            if len(weighed_objs) <= 1:
                return weighed_objs
            # Compute the weights one weigher at a time, using the weight_classes from the configuration file
            for weigher in weighers:
                # RAMWeigher is used as the example here
                weights = weigher.weigh_objects(weighed_objs, weighing_properties)
     
                # Normalize the weights
                weights = normalize(weights,
                                    minval=weigher.minval,
                                    maxval=weigher.maxval)
     
                for i, weight in enumerate(weights):
                    obj = weighed_objs[i]
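                    # Add the normalized weight to the host's total, so the contributions of all weighers are summed. To give one weigher more influence, change the matching *_weight_multiplier option; for example, raise ram_weight_multiplier to make memory count for more.
                    obj.weight += weigher.weight_multiplier() * weight
            # Sort by weight in descending order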
            return sorted(weighed_objs, key=lambda x: x.weight, reverse=True)
             
    class RAMWeigher(weights.BaseHostWeigher):
        minval = 0
     
        def weight_multiplier(self):
            """Override the weight multiplier."""
            return CONF.filter_scheduler.ram_weight_multiplier
     
        def _weigh_object(self, host_state, weight_properties):
            """Higher weights win.  We want spreading to be the default."""
            # Return the node's free memory directly: the more free memory a node has, the higher its memory weight.
            return host_state.free_ram_mb
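The effect of normalization and multipliers is easier to see with numbers. The following sketch reimplements the idea in a few lines. It is an approximation of the handler above, not a copy: `normalize` here is a plain min-max scaling written for this example, the disk weigher is hypothetical, and hosts are plain dictionaries.

```python
def normalize(values, minval=None, maxval=None):
    """Scale values into [0, 1] by min-max normalization."""
    if not values:
        return []
    lo = float(min(values) if minval is None else minval)
    hi = float(max(values) if maxval is None else maxval)
    if lo == hi:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def weigh_hosts(hosts, weighers):
    """Return (host, total_weight) pairs sorted best first.

    weighers: list of (multiplier, key_function, minval) tuples.
    """
    totals = [0.0] * len(hosts)
    for multiplier, key, minval in weighers:
        raw = [key(h) for h in hosts]
        for i, w in enumerate(normalize(raw, minval=minval)):
            totals[i] += multiplier * w
    return sorted(zip(hosts, totals), key=lambda pair: pair[1], reverse=True)


def free_ram(host):
    return host["free_ram_mb"]


def free_disk(host):  # hypothetical second weigher
    return host["free_disk_gb"]


hosts = [
    {"host": "node1", "free_ram_mb": 2048, "free_disk_gb": 100},
    {"host": "node2", "free_ram_mb": 8192, "free_disk_gb": 20},
    {"host": "node3", "free_ram_mb": 4096, "free_disk_gb": 60},
]

# Equal multipliers: RAM and disk count the same.
ranked = weigh_hosts(hosts, [(1.0, free_ram, 0), (1.0, free_disk, 0)])
# Doubling the RAM multiplier, as raising ram_weight_multiplier would.
ram_heavy = weigh_hosts(hosts, [(2.0, free_ram, 0), (1.0, free_disk, 0)])

print([h["host"] for h, _ in ranked])
print([h["host"] for h, _ in ram_heavy])
```

With equal multipliers node1 wins on its large free disk; once the RAM multiplier is doubled, node2 with the most free memory moves to the front.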

    4. random: this step is analyzed in detail through the code.

    host_subset_size = CONF.filter_scheduler.host_subset_size
    if host_subset_size < len(weighed_hosts):
        weighed_subset = weighed_hosts[0:host_subset_size]
    else:
        weighed_subset = weighed_hosts
    # Pick one host at random from the top N
    chosen_host = random.choice(weighed_subset)
    weighed_hosts.remove(chosen_host)
    return [chosen_host] + weighed_hosts

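    The host_subset_size option defaults to 1. The official documentation explains it as follows: setting it to an integer greater than 1 reduces the chance that multiple scheduler processes handling similar requests select the same host, which would otherwise create a potential race condition. Choosing at random among the N hosts that best fit the request reduces such conflicts. However, the higher the value, the less optimal the chosen host may be for a given request.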
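The trade-off can be illustrated with a small simulation. This is a sketch under simplifying assumptions: two scheduler processes see the same ranked list, each picks independently, and `choose_host` and `collision_rate` are names invented for the example.

```python
import random


def choose_host(weighed_hosts, host_subset_size, rng=random):
    """Pick one host at random from the best host_subset_size hosts."""
    subset = weighed_hosts[:max(1, host_subset_size)]
    return rng.choice(subset)


def collision_rate(weighed_hosts, host_subset_size, trials=10000, seed=0):
    """Fraction of trials in which two independent schedulers pick the same host."""
    rng = random.Random(seed)
    same = 0
    for _ in range(trials):
        a = choose_host(weighed_hosts, host_subset_size, rng)
        b = choose_host(weighed_hosts, host_subset_size, rng)
        same += a == b
    return same / trials


ranked_hosts = ["node1", "node2", "node3", "node4", "node5"]  # best first

print(collision_rate(ranked_hosts, 1))
print(collision_rate(ranked_hosts, 3))
```

With a subset size of 1 both schedulers always pick the top host, so they always collide. With a subset size of 3 they agree in roughly one trial out of three, at the cost of sometimes choosing the second or third best host.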

  • Original article: https://www.cnblogs.com/gushiren/p/9642325.html