1. Installing the helm command
Check whether helm is already installed: helm version
To install helm, see:
Reference article: https://blog.csdn.net/zhoumengshun/article/details/108161015
Official documentation:
https://helm.sh/docs/intro/install/
https://helm.sh/zh/docs/intro/install/
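The install syntax later in this document differs between Helm v2 and v3, so it helps to know which major version is on the PATH. The following is a minimal sketch: helm_major is a hypothetical helper name, and it assumes the output shape of helm version --short (for example "v3.5.2+g167aac7" on v3, prefixed with "Client: " on v2).

```shell
#!/usr/bin/env bash
# Extract the major version number from the text printed by `helm version --short`.
# Assumed input shapes: "v3.5.2+g167aac7" (Helm v3) or "Client: v2.16.1+g..." (Helm v2).
helm_major() {
  printf '%s\n' "$1" | sed -n 's/^[^v]*v\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

if command -v helm >/dev/null 2>&1; then
  echo "helm major version: $(helm_major "$(helm version --short 2>/dev/null)")"
else
  echo "helm not found; install it following https://helm.sh/docs/intro/install/"
fi
```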
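2. Installing chaosblade
(1) Check whether chaosblade is already installed (the form below is Helm v2 syntax; Helm v3 takes the name as a filter, for example helm ls --all --namespace chaosblade --filter chaosblade-operator):
helm ls --all chaosblade-operator
(2) Install chaosblade:
Helm v2: helm install --namespace chaosblade --name chaosblade-operator chaosblade-operator-VERSION-v2.tgz
Helm v3: helm install chaosblade-operator chaosblade-operator-VERSION-v3.tgz --namespace chaosblade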
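The two install forms can be selected from the Helm major version. This sketch only prints the command and does not run it; chaosblade_install_cmd is a hypothetical helper, and VERSION is left as the placeholder used in the text. Note that Helm v3 does not create the namespace on its own, so kubectl create namespace chaosblade may be needed first.

```shell
#!/usr/bin/env bash
# Print (not run) the install command for a given Helm major version.
# $1: Helm major version (2 or 3); $2: target namespace, defaulting to chaosblade.
chaosblade_install_cmd() {
  local major="$1" ns="${2:-chaosblade}"
  case "$major" in
    2) echo "helm install --namespace $ns --name chaosblade-operator chaosblade-operator-VERSION-v2.tgz" ;;
    3) echo "helm install chaosblade-operator chaosblade-operator-VERSION-v3.tgz --namespace $ns" ;;
    *) echo "unsupported helm major version: $major" >&2; return 1 ;;
  esac
}

chaosblade_install_cmd 3
```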
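(3) Once the chart is installed in the target namespace (chaosblade in the commands above), the ChaosBlade Operator starts one chaosblade-tool Pod on every node plus a single chaosblade-operator Pod. To check the result, list the Pods in that namespace, for example with kubectl get pod -n chaosblade -o wide.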
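To confirm the installation, one option is to count the Pods that are not yet Running. The sketch below assumes the default column layout of kubectl get pod --no-headers (NAME, READY, STATUS, ...) and the chaosblade namespace; count_not_running is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Count Pods whose STATUS column (third field) is not "Running".
# Input: the output of `kubectl get pod --no-headers` on stdin.
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

if command -v kubectl >/dev/null 2>&1; then
  # Prints 0 when every listed Pod is Running (and also when no Pods are listed).
  kubectl get pod -n chaosblade --no-headers 2>/dev/null | count_not_running
fi
```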
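(4) Uninstall chaosblade.
helm delete chaosblade-operator
Check the release status:
helm ls --all chaosblade-operator
Remove the release completely (Helm v2 only; Helm v3 has no --purge flag, and helm uninstall chaosblade-operator --namespace chaosblade removes the release in one step):
helm del --purge chaosblade-operator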
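3. Installing metrics-server
metrics-server is an add-on API server that extends the Kubernetes API. It collects CPU and memory usage for Pods and nodes and serves it through the Metrics API (metrics.k8s.io), which is what kubectl top queries. On a cluster without metrics-server, kubectl top does not work, because the default apiserver has no endpoint that provides these core metrics. kubectl top displays the CPU and memory usage of Pods and nodes (disk usage is not reported), and it works only when the Metrics API is available.
(1) Download the metrics-server YAML manifest: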
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.0/components.yaml
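(2) Edit components.yaml:
Add the argument: - --kubelet-insecure-tls
Change the argument: - --kubelet-preferred-address-types=InternalIP
Change the image: image: phperall/metrics-server:v0.4.1
Comment out the health checks (the livenessProbe and readinessProbe blocks of the container).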
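The argument and image edits can also be scripted instead of made by hand. This is a sketch under assumptions: patch_metrics_server is a hypothetical helper, it assumes components.yaml already contains a --kubelet-preferred-address-types line in the container args, and it does not touch the health-check blocks, which still have to be commented out manually. It is not idempotent; run it against the freshly downloaded file.

```shell
#!/usr/bin/env bash
# Apply the argument and image edits from the text to a manifest read from stdin.
patch_metrics_server() {
  awk '
    /--kubelet-preferred-address-types=/ {
      indent = $0
      sub(/-.*/, "", indent)   # keep only the whitespace before the list dash
      print indent "- --kubelet-preferred-address-types=InternalIP"
      print indent "- --kubelet-insecure-tls"
      next
    }
    /^ *image: .*metrics-server/ {
      sub(/image: .*/, "image: phperall/metrics-server:v0.4.1")
    }
    { print }
  '
}

# Write the result to a new file so the downloaded manifest stays untouched.
if [ -f components.yaml ]; then
  patch_metrics_server < components.yaml > components.patched.yaml
fi
```

The text does not show the deployment step; presumably the edited manifest is then applied with kubectl apply -f components.patched.yaml.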
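(3) Label the nodes.
Look up the node label that components.yaml expects in its nodeSelector field, then apply it to each node:
kubectl label nodes node-name kubernetes.io/os=linux (node-name is the node's name, for example master1 or slave1)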
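When several nodes need the label, the commands can be generated from a list of node names. A small sketch: label_cmds is a hypothetical helper that only prints the commands, and master1 and slave1 are the example names from the text.

```shell
#!/usr/bin/env bash
# Print one `kubectl label` command per node name read from stdin (one name per line).
label_cmds() {
  while IFS= read -r node || [ -n "$node" ]; do
    if [ -n "$node" ]; then
      echo "kubectl label nodes $node kubernetes.io/os=linux"
    fi
  done
}

printf '%s\n' master1 slave1 | label_cmds
```

On a live cluster the names could come from kubectl get nodes -o name instead; that prints entries in the node/<name> form, which would need the node/ prefix stripped before use here.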
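(4) Configure RBAC authorization if kubectl top is rejected with Forbidden errors like the ones below (see the official documentation: https://kubernetes.io/zh/docs/reference/access-authn-authz/rbac/; https://kubernetes.io/zh/docs/reference/access-authn-authz/rbac/#kubectl-auth-reconcile)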
[root@k8s-master cfg]# kubectl top pod -A
Error from server (Forbidden): pods.metrics.k8s.io is forbidden: User "system:kube-proxy" cannot list resource "pods" in API group "metrics.k8s.io" at the cluster scope
[root@k8s-master cfg]# kubectl top pod
Error from server (Forbidden): pods.metrics.k8s.io is forbidden: User "system:kube-proxy" cannot list resource "pods" in API group "metrics.k8s.io" in the namespace "default"
Grant access as follows (set the namespace to whichever namespace kubectl top actually needs to query):
cat metrics_RBAC_rule.yaml
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: metrics-reader
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes"]
  verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: system:kube-proxy  # user name
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: metrics-reader
  apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: metrics-reader
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
- kind: User
  name: system:kube-proxy  # user name
  apiGroup: rbac.authorization.k8s.io
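Apply the configuration:
Dry run, showing the changes that applying the RBAC manifest would make (recent kubectl versions expect an explicit value such as --dry-run=client): kubectl auth reconcile -f metrics_RBAC_rule.yaml --dry-run
Apply the RBAC manifest, preserving any extra permissions in roles and any extra subjects in bindings: kubectl auth reconcile -f metrics_RBAC_rule.yaml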
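The user named in the Forbidden message is the subject that the bindings must reference, and kubectl auth can-i offers a quick check that the grant took effect. This is a sketch, not a definitive procedure: forbidden_user is a hypothetical helper, and the sample message is the one shown in step (4).

```shell
#!/usr/bin/env bash
# Pull the user name out of a kubectl "Forbidden" message read from stdin.
forbidden_user() {
  sed -n 's/.*User "\([^"]*\)" cannot.*/\1/p' | head -n 1
}

msg='Error from server (Forbidden): pods.metrics.k8s.io is forbidden: User "system:kube-proxy" cannot list resource "pods" in API group "metrics.k8s.io" at the cluster scope'
user="$(printf '%s\n' "$msg" | forbidden_user)"
echo "RBAC subject: $user"

# After reconciling, ask the API server whether that user may now list Pod metrics.
# can-i exits non-zero on a "no" answer, so the status is ignored here.
if command -v kubectl >/dev/null 2>&1; then
  kubectl auth can-i list pods.metrics.k8s.io --as "$user" --namespace default || true
fi
```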
4. Fault experiments
(1) Inject a fault (more experiment files: https://github.com/chaosblade-io/chaosblade-operator/tree/v1.3.0/examples)
kubectl apply -f chaosblade_cpu_load.yaml
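The text does not show the contents of chaosblade_cpu_load.yaml. The sketch below writes one possible version, modeled on the node CPU load examples in the chaosblade-operator repository linked above. The experiment name, the node name placeholder, and the cpu-percent value are all assumptions and should be checked against the examples for the operator version in use; write_cpu_load_manifest is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Write an illustrative ChaosBlade manifest to the path given as $1.
# All field values (experiment name, node name, cpu-percent) are assumptions.
write_cpu_load_manifest() {
  cat > "$1" <<'EOF'
apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: cpu-load
spec:
  experiments:
  - scope: node
    target: cpu
    action: fullload
    desc: "increase node cpu load by names"
    matchers:
    - name: names
      value:
      - "node-name"        # replace with a real node name from `kubectl get nodes`
    - name: cpu-percent
      value:
      - "80"
EOF
}

write_cpu_load_manifest chaosblade_cpu_load.yaml
# Then inject the fault as in the text:
# kubectl apply -f chaosblade_cpu_load.yaml
```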
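(2) Destroy the fault
Stop by experiment name (take the name from the kubectl get blade output):
kubectl get blade
kubectl delete blade <name>
Stop with the YAML file:
kubectl delete -f chaosblade_cpu_load.yaml
Stop with the blade command
This works only for experiments that were created with blade:
blade destroy <UID>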
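5. Uninstalling chaosblade
Run helm del --purge chaosblade-operator (Helm v2 syntax; the Helm v3 equivalent is helm uninstall). This stops all experiments and deletes all resources that were created.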