Background
To bring infrastructure resources under unified, platform-level management by Kubernetes, RabbitMQ can also be deployed into a Kubernetes cluster.
This article installs RabbitMQ 3.8.3 using the official image rabbitmq:3.8.3-management, following the official documentation:
- Cluster Formation and Peer Discovery — RabbitMQ
- GitHub - rabbitmq/diy-kubernetes-examples: Examples that demonstrate how deploy a RabbitMQ cluster to Kubernetes, the DIY way
- diy-kubernetes-examples/kind/base at master · rabbitmq/diy-kubernetes-examples · GitHub
- Deploying RabbitMQ to Kubernetes: What's Involved? | RabbitMQ - Blog
Starting with RabbitMQ 3.8.0, RabbitMQ ships with a built-in plugin for direct integration with Prometheus & Grafana.
Plugin name: rabbitmq_prometheus
Note the difference between rabbitmq_prometheus and rabbitmq_exporter: the former focuses on the runtime state of RabbitMQ itself rather than on messaging/business state. It gives deeper insight into RabbitMQ's underlying runtime behavior and base metadata, and the collected data can be used to anticipate RabbitMQ's behavior in areas such as:
- VM configuration
- Initial configuration
- CPU utilization (connection/queue/channel usage)
- Runtime scheduler state
- Thread information
- Erlang process utilization
- Memory allocation
- Open file (file descriptor) limits
For details, see the official runtime documentation: https://www.rabbitmq.com/runtime.html
Deployment Process
Deploying a RabbitMQ cluster on Kubernetes uses the following Kubernetes resource manifests:
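Once rabbitmq_prometheus is enabled, each node exposes metrics over HTTP on port 15692 by default. As an illustration only (the job name and target below are assumptions, and 15692 is not among the ports declared in the manifests later in this article), a minimal Prometheus scrape config against one pod might look like:

```yaml
# Hypothetical scrape config; target uses the pod DNS name from the
# StatefulSet/Service described below. In practice a ServiceMonitor or
# pod-discovery based config is more common in Kubernetes.
scrape_configs:
  - job_name: rabbitmq
    static_configs:
      - targets:
          - 'prod-rabbitmq-0.rabbitmq.prod.svc.cluster.local:15692'
```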
- rbac.yaml
- configmap.yaml
- statefulset.yaml
- service.yaml
- ingress.yaml
- Create rbac.yaml
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rabbitmq
  namespace: prod
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rabbitmq-peer-discovery-rbac
  namespace: prod
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rabbitmq-peer-discovery-rbac
  namespace: prod
subjects:
  - kind: ServiceAccount
    name: rabbitmq
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rabbitmq-peer-discovery-rbac
```
- Create configmap.yaml
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rabbitmq-config
  namespace: prod
data:
  enabled_plugins: |
    [rabbitmq_management,rabbitmq_peer_discovery_k8s,rabbitmq_prometheus].
  rabbitmq.conf: |
    ## Cluster formation. See https://www.rabbitmq.com/cluster-formation.html to learn more.
    cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
    cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
    ## Should RabbitMQ node name be computed from the pod's hostname or IP address?
    ## IP addresses are not stable, so using [stable] hostnames is recommended when possible.
    ## Set to "hostname" to use pod hostnames.
    ## When this value is changed, so should the variable used to set the RABBITMQ_NODENAME
    ## environment variable.
    cluster_formation.k8s.address_type = hostname
    ## How often should node cleanup checks run?
    cluster_formation.node_cleanup.interval = 30
    ## Set to false if automatic removal of unknown/absent nodes
    ## is desired. This can be dangerous, see
    ## * https://www.rabbitmq.com/cluster-formation.html#node-health-checks-and-cleanup
    ## * https://groups.google.com/forum/#!msg/rabbitmq-users/wuOfzEywHXo/k8z_HWIkBgAJ
    cluster_formation.node_cleanup.only_log_warning = true
    cluster_partition_handling = autoheal
    ## See https://www.rabbitmq.com/ha.html#master-migration-data-locality
    queue_master_locator = min-masters
    ## This is just an example.
    ## This enables remote access for the default user with well known credentials.
    ## Consider deleting the default user and creating a separate user with a set of generated
    ## credentials instead.
    ## Learn more at https://www.rabbitmq.com/access-control.html#loopback-users
    loopback_users.guest = false
    ## Memory high watermark; the default is 0.4.
    ## vm_memory_high_watermark.absolute = 3.6GB
    vm_memory_high_watermark.relative = 0.6
    ## Threshold at which messages are paged to disk; the default is 0.5.
    ## vm_memory_high_watermark_paging_ratio = 0.5
    ## Total memory RabbitMQ should consider available; useful for containerized
    ## deployments where the host's memory cannot be detected correctly.
    total_memory_available_override_value = 6GB
```
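The memory settings above interact: with the available-memory override at 6GB and a relative watermark of 0.6, publishers are blocked once memory use crosses 3.6GB, and paging to disk begins at half of that. A small sketch of the arithmetic (using 1 GB = 1024³ bytes, as RabbitMQ does for the GB suffix):

```python
# Sketch: how RabbitMQ derives its memory thresholds from the values
# set in the rabbitmq.conf above.
GB = 1024 ** 3

total_memory_available = 6 * GB   # total_memory_available_override_value = 6GB
high_watermark_relative = 0.6     # vm_memory_high_watermark.relative = 0.6
paging_ratio = 0.5                # vm_memory_high_watermark_paging_ratio default

# Publishers are blocked once memory use crosses the high watermark.
high_watermark_bytes = total_memory_available * high_watermark_relative

# Messages start being paged to disk at paging_ratio * high watermark.
paging_threshold_bytes = high_watermark_bytes * paging_ratio

print(f"high watermark: {high_watermark_bytes / GB:.1f} GB")   # 3.6 GB
print(f"paging starts:  {paging_threshold_bytes / GB:.1f} GB") # 1.8 GB
```

This is why the commented-out `vm_memory_high_watermark.absolute = 3.6GB` line is equivalent to the relative setting for this particular pod size.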
- Create statefulset.yaml
```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: rabbitmq-pdb
  namespace: prod
spec:
  selector:
    matchLabels:
      app: rabbitmq
  maxUnavailable: 1
---
apiVersion: apps/v1
# See the Prerequisites section of https://www.rabbitmq.com/cluster-formation.html#peer-discovery-k8s.
kind: StatefulSet
metadata:
  name: prod-rabbitmq
  namespace: prod
spec:
  serviceName: rabbitmq
  # Three nodes is the recommended minimum. Some features may require a majority of nodes
  # to be available.
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10
      tolerations:
        - operator: "Equal"
          #- operator: "Exists"
          key: "data"
          value: "prod"
      nodeSelector:
        # Use Linux nodes in a mixed OS kubernetes cluster.
        # Learn more at https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-os
        kubernetes.io/os: linux
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/resource
                    operator: In
                    values:
                      - prod-data
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - rabbitmq
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: prod-rabbitmq
          image: hub.qiangyun.com/rabbitmq:3.8.3
          volumeMounts:
            - name: rabbitmq-pvc
              mountPath: /var/lib/rabbitmq
            - name: config-volume
              mountPath: /etc/rabbitmq
            - name: localtime
              mountPath: /etc/localtime
          # Learn more about what ports various protocols use
          # at https://www.rabbitmq.com/networking.html#ports
          resources:
            requests:
              cpu: 500m
              memory: 1024Mi
            limits:
              cpu: '2'
              memory: 6Gi
          ports:
            - name: mq-mgt
              protocol: TCP
              containerPort: 15672
            - name: amqp
              protocol: TCP
              containerPort: 5672
          livenessProbe:
            exec:
              # This is just an example. There is no "one true health check" but rather
              # several rabbitmq-diagnostics commands that can be combined to form increasingly
              # comprehensive and intrusive health checks.
              # Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
              #
              # Stage 2 check:
              command: ["rabbitmq-diagnostics", "status"]
            initialDelaySeconds: 60
            # See https://www.rabbitmq.com/monitoring.html for monitoring frequency recommendations.
            periodSeconds: 60
            timeoutSeconds: 15
          readinessProbe:
            exec:
              # This is just an example. There is no "one true health check" but rather
              # several rabbitmq-diagnostics commands that can be combined to form increasingly
              # comprehensive and intrusive health checks.
              # Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
              #
              # Stage 1 check:
              command: ["rabbitmq-diagnostics", "ping"]
            initialDelaySeconds: 20
            periodSeconds: 60
            timeoutSeconds: 10
          imagePullPolicy: IfNotPresent
          #imagePullPolicy: Always
          env:
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: MY_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: RABBITMQ_USE_LONGNAME
              value: "true"
            # See a note on cluster_formation.k8s.address_type in the config file section
            - name: K8S_SERVICE_NAME
              value: rabbitmq
            - name: RABBITMQ_NODENAME
              value: rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
            - name: K8S_HOSTNAME_SUFFIX
              value: .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
            - name: RABBITMQ_ERLANG_COOKIE
              value: "mycookie"
      volumes:
        - name: config-volume
          configMap:
            name: rabbitmq-config
            items:
              - key: rabbitmq.conf
                path: rabbitmq.conf
              - key: enabled_plugins
                path: enabled_plugins
        - name: localtime
          hostPath:
            path: /etc/localtime
            type: ''
  volumeClaimTemplates:
    - metadata:
        name: rabbitmq-pvc
        annotations:
          volume.alpha.kubernetes.io/storage-class: anything
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 50Gi
        storageClassName: alicloud-disk-essd
```
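The `RABBITMQ_NODENAME` and `K8S_HOSTNAME_SUFFIX` values in the StatefulSet are built by substituting the pod name, the governing service name, and the namespace into a fixed DNS pattern. A small sketch of how they expand for the three replicas (pod names follow the StatefulSet convention `<statefulset-name>-<ordinal>`):

```python
# Sketch of how the env vars in the StatefulSet expand per pod.
K8S_SERVICE_NAME = "rabbitmq"   # serviceName in the StatefulSet spec
NAMESPACE = "prod"              # MY_POD_NAMESPACE via the downward API

def node_name(pod_name: str) -> str:
    # K8S_HOSTNAME_SUFFIX = .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
    suffix = f".{K8S_SERVICE_NAME}.{NAMESPACE}.svc.cluster.local"
    # RABBITMQ_NODENAME = rabbit@$(MY_POD_NAME)<suffix>
    return f"rabbit@{pod_name}{suffix}"

for i in range(3):
    print(node_name(f"prod-rabbitmq-{i}"))
# e.g. rabbit@prod-rabbitmq-0.rabbitmq.prod.svc.cluster.local
```

Because `cluster_formation.k8s.address_type = hostname` in the ConfigMap, these are exactly the names under which the peers discover each other.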
- Create service.yaml
```yaml
kind: Service
apiVersion: v1
metadata:
  namespace: prod
  name: rabbitmq
  labels:
    app: rabbitmq
    type: LoadBalancer
spec:
  type: NodePort
  ports:
    - name: mq-mgt
      protocol: TCP
      port: 15672
      targetPort: 15672
      #nodePort: 31672
    - name: amqp
      protocol: TCP
      port: 5672
      targetPort: 5672
      #nodePort: 30672
  selector:
    app: rabbitmq
```
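In-cluster clients reach the brokers through this Service's cluster DNS name on the `amqp` port. A sketch of the AMQP URI such a client would use (credentials are the image's default `guest`/`guest`, which the ConfigMap comments already suggest replacing in production; `%2F` is the URL-encoded default vhost `/`):

```python
# Sketch: build the AMQP URI for an in-cluster client of the Service above.
from urllib.parse import quote

user, password = "guest", "guest"          # default credentials; rotate in prod
host = "rabbitmq.prod.svc.cluster.local"   # <service>.<namespace>.svc.cluster.local
port = 5672                                # the "amqp" Service port

amqp_uri = f"amqp://{quote(user)}:{quote(password)}@{host}:{port}/%2F"
print(amqp_uri)
# amqp://guest:guest@rabbitmq.prod.svc.cluster.local:5672/%2F
```

External clients would instead use a node IP plus the NodePort (the commented-out `nodePort` values above).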
- Create ingress.yaml
```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: mq-mgt.qiangyun.com
  namespace: prod
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "sys"
spec:
  rules:
    - host: mq-mgt.qiangyun.com
      http:
        paths:
          - path: /
            backend:
              serviceName: rabbitmq
              servicePort: 15672
```