  • Installing and configuring k8s on Ubuntu 16.04

    Required software

    1. kubeadm

    Installation steps

    apt-get update
    

      

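    1. Disable all swap partitions

    swapoff -a

    This only lasts until the next reboot. To keep swap off permanently, also comment out the swap entries in /etc/fstab.

    Use the free command to confirm that swap is disabled (the Swap row should be all zeros):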

    root@gpu-10-0-1-24:~# free
                  total        used        free      shared  buff/cache   available
    Mem:      528016312     6131652   343432968     6595072   178451692   512492696
    Swap:             0           0           0
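The two swap steps above can be sketched as small helpers. This is a minimal sketch, assuming the usual six-field fstab layout in which the third field is the filesystem type; the function names `comment_out_swap` and `swap_total_kb` are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Print a copy of an fstab file with every active swap entry commented out.
# Assumes the usual six-field fstab layout (field 3 = filesystem type).
# Lines that are already comments are passed through unchanged.
comment_out_swap() {
    awk '!/^[[:space:]]*#/ && $3 == "swap" { print "# " $0; next } { print }' "$1"
}

# Print the SwapTotal value (in kB) from a /proc/meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
swap_total_kb() {
    awk '/^SwapTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}
```

One way to use it: write the result to a scratch file with `comment_out_swap /etc/fstab > /tmp/fstab.new`, review it, and only then copy it back as root. After `swapoff -a`, `swap_total_kb` should print 0.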

    2. Stop the firewall (this assumes firewalld is installed; a stock Ubuntu install ships ufw instead)

    systemctl stop firewalld
    systemctl disable firewalld
    

    3. Disable SELinux

    setenforce 0

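    Install the flannel network plugin

    flannel's default manifest uses 10.244.0.0/16 as the pod network, so pass that CIDR to kubeadm init:

    # --skip-preflight-checks is deprecated; use --ignore-preflight-errors instead
    kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.1.18 --kubernetes-version=v1.11.1 --ignore-preflight-errors=all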

      

    Errors

    [preflight] Activating the kubelet service
    failure loading ca certificate: couldn't load the private key file /etc/kubernetes/pki/ca.key: open /etc/kubernetes/pki/ca.key: no such file or directory
    

    Fix: copy the custom PKI key into the corresponding directory (/etc/kubernetes/pki).

    sudo: unable to resolve host gpu-10-0-1-18

    Fix: add a mapping for the hostname to /etc/hosts.
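The hostname fix can be made repeatable with a small helper. This is a sketch only: `ensure_host_entry` is a hypothetical name, and pairing gpu-10-0-1-18 with 10.0.1.18 in the usage line is an assumption based on the --apiserver-advertise-address used earlier.

```shell
#!/bin/sh
# Append "IP HOSTNAME" to a hosts file unless the hostname is already mapped.
# Usage: ensure_host_entry IP HOSTNAME [HOSTS_FILE]
ensure_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Look for the hostname as a whole field on any non-comment line.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Run as root, for example `ensure_host_entry 10.0.1.18 gpu-10-0-1-18`. Calling it twice leaves a single entry.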

    To check the current SELinux mode:

    getenforce
    

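    Adding a worker node

    Run the join command on the new node:

    kubeadm join 10.0.0.39:6443 --token 4g0p8w.w5p29ukwvitim2ti \
        --discovery-token-ca-cert-hash sha256:21d0adbfcb409dca97e655641573b2ee51c77a212f194e20a307cb459e5f77c8

    On the master, list the existing tokens, or create a new one and print the full join command:

    kubeadm token list
    kubeadm token create --print-join-command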
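If only the token is known, the value for --discovery-token-ca-cert-hash can be recomputed from the cluster CA certificate on the master. The sketch below wraps the openssl pipeline from the kubeadm documentation in a function; the name `ca_cert_hash` is made up here, and the pipeline assumes the CA uses an RSA key.

```shell
#!/bin/sh
# Print the sha256 hash of the public key in a CA certificate, in the form
# expected after "sha256:" in --discovery-token-ca-cert-hash.
# Assumes an RSA CA key. Defaults to the kubeadm CA certificate path.
ca_cert_hash() {
    openssl x509 -pubkey -noout -in "${1:-/etc/kubernetes/pki/ca.crt}" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}
```

On the master, `echo "sha256:$(ca_cert_hash)"` prints the value to pass to kubeadm join.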
    
    The new node needs kubelet, kubeadm and kubectl installed before it can join:

    apt-get update && apt-get install -y apt-transport-https curl
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
    cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
    deb https://apt.kubernetes.io/ kubernetes-xenial main
    EOF
    apt-get update
    apt-get install -y kubelet kubeadm kubectl
    apt-mark hold kubelet kubeadm kubectl
    

      

    A newly added node shows ROLES as <none> in kubectl get nodes.

    kubectl get pods -n kube-system | grep flannel
    

      

    kubectl get pods -n kube-system -o wide | grep gpu-10-0-1-24

    References

    https://tomoyadeng.github.io/blog/2018/10/12/k8s-in-ubuntu18.04/index.html

    kubeadm token list returns nothing:

    https://www.serverlab.ca/tutorials/containers/kubernetes/how-to-add-workers-to-kubernetes-clusters/

    https://stackoverflow.com/questions/51380934/unable-to-connect-worker-node-to-kubernetes-cluster 

    https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

    join reports success, but the node does not appear in get nodes:

    https://github.com/kubernetes/kubernetes/issues/61224

    The connection to the server localhost:8080 was refused - did you specify the right host or port?

    https://www.jianshu.com/p/6fa06b9bbf6a
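The localhost:8080 message typically appears when kubectl finds no kubeconfig and falls back to its default address. A minimal sketch of the usual remedy, copying the admin kubeconfig that kubeadm init writes into the user's kubectl config location; `install_kubeconfig` is a hypothetical helper name, and both paths can be overridden.

```shell
#!/bin/sh
# Copy the admin kubeconfig written by kubeadm init to the current user's
# kubectl config location. Both paths can be overridden.
# Usage: install_kubeconfig [SOURCE] [DEST]
install_kubeconfig() {
    src=${1:-/etc/kubernetes/admin.conf}
    dest=${2:-$HOME/.kube/config}
    mkdir -p "$(dirname "$dest")" || return 1
    cp "$src" "$dest" || return 1
    # The file holds cluster admin credentials, so keep it private.
    chmod 600 "$dest"
}
```

For a one-off session as root, `export KUBECONFIG=/etc/kubernetes/admin.conf` has the same effect without copying anything.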

    Attempting to reclaim ephemeral-storage

    ImagePullBackOff

    kubectl -n kube-system logs kube-flannel-ds-jpp96 -c install-cni

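    A node reporting Ready does not mean the flannel network plugin is working.

    flannel itself also runs from a container image.

    A k8s cluster can have multiple master nodes.

    Add a role label to a node: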

    kubectl label node k8s-node1 node-role.kubernetes.io/worker=worker

    Running systemctl restart kubelet triggers image pulls over the network.

    root@cpu-10-0-3-9:~# ks init xps-kubeflow
    INFO Using context "kubernetes-admin@kubernetes" from kubeconfig file "/root/.kube/config"
    INFO Creating environment "default" with namespace "default", pointing to "version:v1.8.0" cluster at address "https://10.0.3.9:6443"
    INFO Generating ksonnet-lib data at path '/root/xps-kubeflow/lib/ksonnet-lib/v1.8.0'
    root@cpu-10-0-3-9:~/xps-k8s# kubectl create -f xps_crd.yaml
    customresourcedefinition.apiextensions.k8s.io/xps.tencent.com created
    

      

    kubectl get crd
    

      

    Pod sandbox changed, it will be killed and re-created.
    

      

    docker run --security-opt=no-new-privileges --cap-drop=ALL --network=none -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.11
    

      

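    An emptyDir volume is shared only within the scope of a single pod, so it is enough to make sure each pod runs one container.

    By default, k8s does not schedule pods onto the master node. Removing the master taint allows it: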

    kubectl taint nodes --all node-role.kubernetes.io/master-
    

      

    List all mxjobs:

    kubectl get mxjobs.kubeflow.org
    

      

    Assigning pods to nodes:

    https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity 
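The simplest form of pod-to-node assignment is a nodeSelector that matches a node label, such as the worker role label added earlier. The sketch below prints such a manifest; `render_pinned_pod` is a made-up helper, and the nginx image and container name are placeholders chosen for illustration.

```shell
#!/bin/sh
# Print a Pod manifest pinned to nodes carrying a given label.
# Usage: render_pinned_pod POD_NAME LABEL_KEY LABEL_VALUE
# The image (nginx) and container name are placeholders for illustration.
render_pinned_pod() {
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  nodeSelector:
    $2: "$3"
  containers:
  - name: main
    image: nginx
EOF
}
```

For example, `render_pinned_pod demo node-role.kubernetes.io/worker worker | kubectl apply -f -` would schedule the pod only on nodes labelled as workers.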

  • Original article: https://www.cnblogs.com/yangwenhuan/p/11484859.html