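Configure the spark user: a ServiceAccount, plus a Role and RoleBinding that grant it access to all core API resources in the default namespace.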
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: spark-role
rules:
- apiGroups: [""]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io
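The spark-role above allows every verb on every core API resource. If that is broader than you want, a narrower Role might look like the sketch below. The resource and verb lists are assumptions about what a client-mode driver needs (mainly creating, watching, and deleting executor pods), and the name spark-role-narrow is hypothetical, so check it against your own job before relying on it.

```yaml
# Sketch only: a narrower alternative to spark-role.
# The name "spark-role-narrow" and the exact resource/verb lists are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: spark-role-narrow
rules:
- apiGroups: [""]
  # The driver creates and watches executor pods; services and configmaps
  # are included here as a cautious guess and may not be needed.
  resources: ["pods", "services", "configmaps"]
  verbs: ["get", "list", "watch", "create", "delete"]
```

To use it, roleRef.name in spark-role-binding would have to point at spark-role-narrow instead of spark-role.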
Configure the Spark container. Spark programs are submitted from inside this container in client mode, so this container also acts as the driver.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-client
      component: spark-client
  template:
    metadata:
      labels:
        app: spark-client
        component: spark-client
    spec:
      containers:
      - name: spark-client
        image: spark-py:2.4.6
        workingDir: /opt/spark
        command: ["/bin/bash", "-c", "while true;do echo hello;sleep 6000;done"]
      serviceAccountName: spark
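Because the driver runs inside this pod, the Spark documentation recommends setting spark.kubernetes.driver.pod.name to the pod's name in client mode, so that executor pods are cleaned up together with the driver pod. One way to make the name available is the Kubernetes downward API, sketched below as a fragment of the container spec. The variable name POD_NAME is an assumption made for this example, and the containerPort entry is informational only.

```yaml
# Sketch only: fragment of the pod template's spec in the spark-client Deployment.
containers:
- name: spark-client
  image: spark-py:2.4.6
  workingDir: /opt/spark
  command: ["/bin/bash", "-c", "while true;do echo hello;sleep 6000;done"]
  env:
  # POD_NAME is a hypothetical variable name; the downward API fills it
  # with the name of the pod this container runs in.
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  ports:
  # Documents the driver port used later for spark.driver.port.
  - containerPort: 7321
serviceAccountName: spark
```

With this in place, the submit command could pass --conf spark.kubernetes.driver.pod.name=${POD_NAME}.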
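Configure a Service so that the Spark executors can connect to the Spark driver. Any port number works, as long as it matches the value passed as spark.driver.port.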
apiVersion: v1
kind: Service
metadata:
  namespace: default
  name: spark-client-service
spec:
  selector:
    app: spark-client
  ports:
  - protocol: TCP
    port: 7321
    targetPort: 7321
  clusterIP: None
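Since clusterIP is None, this is a headless Service: its DNS name resolves directly to the driver pod's IP rather than to a proxied virtual IP. Executors also talk to the driver's block manager, whose port Spark chooses at random unless spark.driver.blockManager.port is set. The variant below is a sketch that names both ports explicitly. Port 7322 and the port names are assumptions made for illustration, and with a headless Service listing the second port is likely optional.

```yaml
# Sketch only: the same Service with both driver ports named.
# Port 7322 and the names "driver-rpc" / "blockmanager" are assumptions.
apiVersion: v1
kind: Service
metadata:
  namespace: default
  name: spark-client-service
spec:
  selector:
    app: spark-client
  ports:
  - name: driver-rpc        # matches spark.driver.port
    protocol: TCP
    port: 7321
    targetPort: 7321
  - name: blockmanager      # would match spark.driver.blockManager.port=7322
    protocol: TCP
    port: 7322
    targetPort: 7322
  clusterIP: None
```

If you adopt this variant, add --conf spark.driver.blockManager.port=7322 to the submit command so the driver actually listens on that port.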
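Log in to the Spark container and submit the Spark program in client mode, setting spark.driver.host to the Service name and spark.driver.port to the Service port. The final argument is the input file for wordcount.py, which here is the script itself.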
bin/spark-submit \
  --master k8s://https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT_HTTPS} \
  --deploy-mode client \
  --name spark-test \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=spark-py:2.4.6 \
  --conf spark.driver.host=spark-client-service \
  --conf spark.driver.port=7321 \
  /opt/spark/examples/src/main/python/wordcount.py \
  /opt/spark/examples/src/main/python/wordcount.py