Implementing Physical Queues in Kubernetes with Namespaces: Strongly Binding a Namespace's Pods to Nodes

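In production Kubernetes deployments there are currently two schools of thought. One favors small clusters: one or a few workloads share a small cluster, and the company as a whole runs hundreds or thousands of them. The other favors large clusters: each AZ (availability zone) has one or a few large clusters, and workloads are isolated from one another by Namespace.

Each approach has strengths and weaknesses, but in terms of resource utilization and maintenance cost, large clusters have the clearer advantage. At the same time, large clusters bring considerable challenges in security, availability, and performance, along with their own management overhead.

This article is part of a series on running large multi-tenant Kubernetes clusters. It addresses how to implement traditional physical queue isolation in a multi-tenant setting.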

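"Physical queue" is not a standard industry term. It comes from a cluster resource management model which, in simplified form, looks like this:
- Logical Queue: the smallest unit of virtual resource allocation. A virtual resource quota is configured on the logical queue (for example, 200 standard CPU cores and 800 GB of memory).
  
  A logical queue corresponds to a Kubernetes Namespace. See Resource Quotas.
  
  QoS priorities can be set between logical queues to achieve priority-based scheduling. See "Limit Priority Class consumption by default", which can restrict the priorities that Pods in each Namespace may choose.
  
  A quota has two parts: Requests (guaranteed resources) and Limits (the maximum resources allowed). Only Requests count toward the quota; whether to set Limits is left to the administrator.
- Physical Queue: corresponds to the underlying physical machines. A given machine can belong to only one physical queue. The total resources of a physical queue are the sum of the resources its machines can provide.
  
  Kubernetes currently has no concept that maps to a physical queue.
  
  Logical queues and physical queues are bound many-to-many, so one logical queue can span several physical queues.
  
  Sum of logical queue quotas / sum of physical queue resources = global oversell ratio
- Tenant: a tenant can be bound to several logical queues. The binding only determines whether the tenant is allowed to deploy Pods into the corresponding Namespaces.

[Figure: resource structure of the model above]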
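To make the logical-queue quota concrete, here is a minimal sketch of a ResourceQuota for a Namespace named `public` (the Namespace created in section 2.2), using the example figures from the model (200 CPU cores, 800 GB of memory). The object name `logical-queue-quota` is hypothetical, and `800Gi` is an assumed way of writing the memory figure as a Kubernetes quantity. Limits are left out on purpose, since in this model only Requests count toward the quota.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: logical-queue-quota   # hypothetical name, not from the original setup
  namespace: public
spec:
  hard:
    # Only Requests are capped; Limits are left to the administrator.
    requests.cpu: "200"
    requests.memory: 800Gi    # assumption: 800 GB expressed as 800Gi
```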

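1. Principle

Implementing the physical queue:
- Configure a Label and a Taint on each node. The Label is used for selection; the Taint rejects Pods that do not belong to this physical queue.
Automatically binding the physical queue to a Namespace:
- Enable two Admission Controllers, PodNodeSelector and PodTolerationRestriction. See Admission Controllers.
- Add a default NodeSelector and default Tolerations to the Namespace. They are applied automatically to every new Pod in that Namespace, which binds the Pod to the physical queue.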
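2. Configuration

2.1 Enable the PodNodeSelector and PodTolerationRestriction Admission Controllers on the cluster

I enabled the PodNodeSelector and PodTolerationRestriction admission controllers on a cluster that was already running. The kube-apiserver Pod cannot be edited directly with kubectl edit; saving the change fails with an error. Instead, edit the manifest /etc/kubernetes/manifests/kube-apiserver.yaml and set:

    - --enable-admission-plugins=NodeRestriction,PodNodeSelector,PodTolerationRestriction

The change takes effect as soon as the file is saved. Afterwards you can see that the kube-apiserver Pod has been restarted, which completes the change.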
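For orientation, the fragment below sketches where that flag sits inside the manifest. It assumes a kubeadm-style static Pod layout, which is what the path /etc/kubernetes/manifests suggests; it is not a complete manifest, and every other existing flag should be left exactly as it is.

```yaml
# Fragment of /etc/kubernetes/manifests/kube-apiserver.yaml (assumed kubeadm layout).
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # All other existing flags stay unchanged; only this entry is extended.
    - --enable-admission-plugins=NodeRestriction,PodNodeSelector,PodTolerationRestriction
```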

2.2 Create the Namespace
    apiVersion: v1
    kind: Namespace
    metadata:
      name: public
      annotations:
        scheduler.alpha.kubernetes.io/node-selector: "node-restriction.kubernetes.io/physical_queue=public-phy"
        scheduler.alpha.kubernetes.io/defaultTolerations: '[{"operator": "Equal", "effect": "NoSchedule", "key": "node-restriction.kubernetes.io/physical_queue", "value": "public-phy"}]'
        # scheduler.alpha.kubernetes.io/tolerationsWhitelist: '[{"operator": "Equal", "effect": "NoSchedule", "key": "node-restriction.kubernetes.io/physical_queue", "value": "public-phy"}]'
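Key points:
- The documentation is inaccurate here: the toleration value must be a list (a JSON array). A malformed value causes a JSON parse error at deployment time.
- Once tolerationsWhitelist is configured, the Pod must specify the matching toleration itself, even if an identical defaultTolerations is also configured. For that reason it must not be set here.
- NoSchedule is restrictive enough, so NoExecute is not needed. Use the same effect when configuring the Node. Choose according to your requirements.
- The recommended key for the physical queue is node-restriction.kubernetes.io/physical_queue. This follows the documentation's recommendation, so that the NodeRestriction admission plugin can later be used to stop the kubelet from setting the label itself.
- A Namespace cannot currently be bound to more than one physical queue:
  
  NodeSelector does not support the "in" syntax; see section 4.
  
  defaultTolerations can hold multiple Tolerations.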
2.3 Bind the Node to the physical queue
    kubectl label node node7 node-restriction.kubernetes.io/physical_queue=public-phy
    kubectl taint nodes node7 node-restriction.kubernetes.io/physical_queue=public-phy:NoSchedule
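As a rough check, the relevant part of the Node object after these two commands should look something like the fragment below. This is a sketch of the expected fields only; a real Node object (for example from `kubectl get node node7 -o yaml`) contains many more labels and fields, which are omitted here.

```yaml
# Expected fragment of the Node object; unrelated labels and fields omitted.
apiVersion: v1
kind: Node
metadata:
  name: node7
  labels:
    # Used by the Namespace's node selector to pick this node.
    node-restriction.kubernetes.io/physical_queue: public-phy
spec:
  taints:
  # Rejects Pods that do not tolerate this physical queue.
  - key: node-restriction.kubernetes.io/physical_queue
    value: public-phy
    effect: NoSchedule
```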
3. Testing

3.1 The test Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
      labels:
        app: nginx
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
3.2 Verify that Pods submitted to the physical queue get the NodeSelector and Toleration by default

    kubectl apply -f nginx_deployment.yaml --namespace public
    kubectl describe pod nginx-deployment-574b87c764-khmd7 -n=public
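If the admission controllers are working, the Pod spec should carry the Namespace's defaults even though the Deployment never set them. The fragment below is a sketch of the fields to look for, derived from the Namespace annotations in section 2.2; any other tolerations the cluster adds by default are omitted. In the `kubectl describe pod` output the same information appears under Node-Selectors and Tolerations.

```yaml
# Expected fragment of the admitted Pod's spec; other fields omitted.
spec:
  # Injected by PodNodeSelector from the Namespace annotation.
  nodeSelector:
    node-restriction.kubernetes.io/physical_queue: public-phy
  # Injected by PodTolerationRestriction from defaultTolerations.
  tolerations:
  - key: node-restriction.kubernetes.io/physical_queue
    operator: Equal
    value: public-phy
    effect: NoSchedule
```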
3.3 Verify that a Pod whose NodeSelector conflicts with the one specified for the physical queue cannot be submitted

    kubectl delete deployment nginx-deployment --namespace public
    # Edit nginx_deployment.yaml and add spec.template.spec.nodeSelector:
    #   nodeSelector:
    #     node-restriction.kubernetes.io/physical_queue: second-phy
    # Check whether it can be deployed
    kubectl apply -f nginx_deployment.yaml --namespace public
    # Inspect the ReplicaSet created by the Deployment
    kubectl describe replicaset nginx-deployment-585fcd8d7d --namespace public
      Warning FailedCreate 49s (x15 over 2m11s) replicaset-controller Error creating: pods is forbidden: pod node label selector conflicts with its namespace node label selector
4. Related issues

Q: NodeSelector cannot use set-based syntax, so a logical queue (Namespace) cannot be bound to multiple physical queues

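The plan is to use Node Affinity to express this instead. However, there is currently no ready-made Admission Controller that attaches a default node affinity to a Namespace, so one would have to be developed if needed.

Node Affinity can take over the role of NodeSelector and offers more advanced scheduling features; Tolerations are still required for as long as the nodes carry the Taint. A later step is to explore whether the overall resource scheduling design can be built on it.

In addition, when a logical queue is bound to several physical queues, Node Affinity makes it possible to give the physical queues scheduling weights, so that Pods are placed on a preferred physical queue first.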
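The following is a sketch of what such a Pod could look like, not a tested configuration. It assumes a second physical queue labeled and tainted as `second-phy` (the value used in the conflict test in 3.3), and it assumes the Pod's Namespace carries no node-selector annotation, since PodNodeSelector would otherwise still pin the Pod to a single queue. The Pod name and the weight of 80 are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-multi-queue   # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard rule: the Pod may run on either physical queue.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-restriction.kubernetes.io/physical_queue
            operator: In
            values:
            - public-phy
            - second-phy
      # Soft rule: prefer public-phy when it has room (weight is illustrative).
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: node-restriction.kubernetes.io/physical_queue
            operator: In
            values:
            - public-phy
  # Affinity does not bypass Taints, so both queues must be tolerated.
  tolerations:
  - key: node-restriction.kubernetes.io/physical_queue
    operator: Equal
    value: public-phy
    effect: NoSchedule
  - key: node-restriction.kubernetes.io/physical_queue
    operator: Equal
    value: second-phy
    effect: NoSchedule
  containers:
  - name: nginx
    image: nginx:1.14.2
```

Without an Admission Controller that injects these fields per Namespace, each workload would have to set them itself, which is the gap described above.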

Reference: https://github.com/ninehills/ninehills.github.io/issues/77
Original article: https://www.cnblogs.com/zhangmingcheng/p/13489790.html
