  • Pitfalls encountered with Kubernetes (continuously updated)

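    1. Official images cannot be downloaded

    Solution: route Docker's traffic through a proxy.

    Configure the proxy for the Docker daemon:

    cat /usr/lib/systemd/system/docker.service
     
    [Service]
    Environment="HTTP_PROXY=http://10.53.16.201:1080/"
     
    Run systemctl daemon-reload so that systemd picks up the changed unit file, then restart the Docker service.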
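As a minimal sketch of the same configuration, the proxy can also be set in a systemd drop-in file rather than by editing docker.service itself, so a package upgrade does not overwrite it. The function name write_docker_proxy_dropin and the HTTPS_PROXY line are assumptions added here for illustration. The original only sets HTTP_PROXY, but registries served over HTTPS are reached through HTTPS_PROXY.

```shell
#!/bin/sh
# Write a systemd drop-in that sets the proxy for the Docker daemon.
# Usage: write_docker_proxy_dropin PROXY_URL [DROPIN_DIR]
# DROPIN_DIR defaults to the standard drop-in directory for docker.service.
write_docker_proxy_dropin() {
    proxy_url="$1"
    dropin_dir="${2:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy_url"
Environment="HTTPS_PROXY=$proxy_url"
EOF
}

# Example (run as root):
#   write_docker_proxy_dropin http://10.53.16.201:1080/
#   systemctl daemon-reload
#   systemctl restart docker
#   systemctl show --property=Environment docker
```

The last command in the example prints the environment systemd passes to the daemon, which is a quick way to confirm the proxy setting took effect.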

    2. The k8s pause image fails to download

    The pod fails to start. Inspect the pod details (kubectl describe pods podname):

        Events:
          FirstSeen LastSeen    Count   From            SubobjectPath   Type        Reason      Message
          --------- --------    -----   ----            -------------   --------    ------      -------
          56s       56s     1   {default-scheduler }            Normal      Scheduled   Successfully assigned nfs-rc-fc2w8 to duni-node1
          11s       11s     1   {kubelet duni-node1}            Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for gcr.io/google_containers/pause-amd64:3.0, this may be because there are no credentials on this request.  details: (Get https://gcr.io/v1/_ping: dial tcp 74.125.203.82:443: i/o timeout)"

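    Solution:

    On a machine that has proxy access, pull the image

    gcr.io/google_containers/pause-amd64:3.0

    and push it to your own private registry as xxx.xxxx.xxxx/pause-amd64:3.0.

    Then change the kubelet configuration to use that image:

    --pod-infra-container-image=xxx.xxxx.xxxx/pause-amd64:3.0

    Restart kubelet.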
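The pull, retag, and push steps described above could be scripted roughly as follows. This is a sketch, not the author's exact procedure: the helper names mirror_ref and mirror_image are hypothetical, and the registry host is the placeholder from the text.

```shell
#!/bin/sh
# Map an upstream image reference to the same name:tag under a private registry.
# Usage: mirror_ref IMAGE REGISTRY
mirror_ref() {
    image="$1"
    registry="$2"
    # Drop everything up to the last "/" so only "name:tag" remains.
    printf '%s/%s\n' "$registry" "${image##*/}"
}

# Pull the upstream image, retag it, and push it to the private registry.
# Usage: mirror_image IMAGE REGISTRY
mirror_image() {
    src="$1"
    dst=$(mirror_ref "$1" "$2")
    docker pull "$src" &&
    docker tag "$src" "$dst" &&
    docker push "$dst"
}

# Example, on the machine that can reach gcr.io:
#   mirror_image gcr.io/google_containers/pause-amd64:3.0 xxx.xxxx.xxxx
```

The name mapping lives in its own function so the target reference can be printed and checked before anything is pushed.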

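    3. Permission problem

    A pod was started from an rc (ReplicationController) config file with privileged set to true, but the pod never reached the Running state. The pod details showed: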

        [root@docker tmp]# kubectl describe pods nfs-rc-acbo1
        Name:       nfs-rc-acbo1
        Namespace:  default
        Node:       duni-node2
        Labels:     role=nfs-server
        Status:     Pending
        IP:     
        Controllers:    ReplicationController/nfs-rc
        Containers:
          nfs-server:
            Image:          192.168.100.90:5000/nfs-data
            Port:           2049/TCP
            Volume Mounts:      <none>
            Environment Variables:  <none>
        Conditions:
          Type      Status
          PodScheduled  True 
        No volumes.
        QoS Class:  BestEffort
        Tolerations:    <none>
        Events:
          FirstSeen LastSeen    Count   From            SubobjectPath   Type        Reason          Message
          --------- --------    -----   ----            -------------   --------    ------          -------
          27s       27s     1   {default-scheduler }            Normal      Scheduled       Successfully assigned nfs-rc-acbo1 to duni-node2
          27s       27s     1   {kubelet duni-node2}            Warning     FailedValidation    Error validating pod nfs-rc-acbo1.default from api, ignoring: spec.containers[0].securityContext.privileged: Forbidden: disallowed by policy

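    Solution:

    Edit the k8s config file on the master and on every node (vim /etc/kubernetes/config) and set:

    KUBE_ALLOW_PRIV="--allow-privileged=true"

    Then restart the affected services. On the master:

    $ systemctl restart kube-apiserver

    The FailedValidation event above is reported by the kubelet, so restart kubelet on each node as well.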


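    4. "getsockopt: connection timed out" problem
    If the installed Docker version is 1.13 or later, the network is reachable, and flannel and etcd are both healthy, yet the "getsockopt: connection timed out" error still appears, the cause may be the iptables configuration. The error looks like this:
    Error: 'dial tcp 10.233.50.3:8443: getsockopt: connection timed out
    Starting with version 1.13, Docker may set the default policy of the iptables FORWARD chain to DROP, which makes pinging a Pod IP on another node fail. When this happens, set the policy back to ACCEPT manually:
    sudo iptables -P FORWARD ACCEPT
    Checking with iptables -nL may still show DROP as the FORWARD policy even though iptables -P FORWARD ACCEPT was run. The reason is that Docker started after the command was executed and set the policy again, so the command would have to be re-run after every Docker start. That is tedious, so change Docker's systemd unit file instead:
    vi /usr/lib/systemd/system/docker.service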
    [Service]
    Type=notify
    # the default is not to use systemd for cgroups because the delegate issues still
    # exists and systemd currently does not support the cgroup feature set required
    # for containers run by docker
    ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS $DOCKER_OPTS $DOCKER_DNS_OPTIONS
    # Added line: after every Docker start, insert an ACCEPT rule at the top of the FORWARD chain
    ExecStartPost=/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
    ExecReload=/bin/kill -s HUP $MAINPID
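To check whether a node is affected, the FORWARD chain's default policy can be read from the output of iptables -S FORWARD, whose first line has the form "-P FORWARD <policy>". The helper below is a small sketch; the name forward_policy is hypothetical.

```shell
#!/bin/sh
# Print the default policy of the FORWARD chain.
# Reads `iptables -S FORWARD` output on stdin.
forward_policy() {
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Example (run as root):
#   iptables -S FORWARD | forward_policy
#   if [ "$(iptables -S FORWARD | forward_policy)" = "DROP" ]; then
#       iptables -P FORWARD ACCEPT
#   fi
```

After changing the unit file, systemctl daemon-reload followed by systemctl restart docker is needed before the ExecStartPost line takes effect.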
     
     
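    5. Problem deploying the Ingress Controller

    Once the backends are deployed, the most important component has to be deployed: the Nginx + Ingress Controller (the official docs call the whole thing the Ingress Controller).
    ➜ ~ kubectl create -f nginx-ingress-controller.yaml
    daemonset "nginx-ingress-lb" created
    Note: the official Ingress Controller has a pitfall, at least in the DaemonSet deployment I looked at: it is not bound to port 80 on the host, meaning the front-end Nginx does not listen on the host's port 80 and cannot serve as the entry point. Download the manifest and add hostNetwork yourself, as follows:

    hostNetwork: true    # add this line
    containers:
    - name: nginx-ingress-controller
      image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0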
     
    
    
    6. kube-proxy cannot find conntrack
    kube-proxy: E0103 18:30:04.060770 40611 proxier.go:977] conntrack return with error: error looking for path of conntrack: exec: "conntrack": executable file not found in $PATH
     
    Solution: install the conntrack tools on the node:
     
    yum install conntrack-tools -y
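A quick way to confirm whether a node is missing the binary before or after installing the package is to look it up in $PATH. A minimal sketch, with a hypothetical helper name missing_cmds:

```shell
#!/bin/sh
# Print each of the given command names that cannot be found in $PATH.
# Usage: missing_cmds CMD...
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example, on a node running kube-proxy:
#   missing_cmds conntrack iptables
```

Empty output means every listed command was found.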
     
     
    7. Multiple api-server instances
    uthentication.go:58] Unable to authenticate the request due to an error: [invalid bearer token, [invalid bearer token, [invalid bearer token, crypto/rsa: verification error, invalid bearer token
     
    Solution: refresh the system CA trust store:
     
    update-ca-trust
     
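    8. Container stuck in the Terminating state
    Force-delete a pod whose STATUS is Terminating. Note that a forced deletion removes the pod object from the API server without waiting for the kubelet to confirm that the containers have stopped: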
    kubectl delete pod kubernetes-dashboard-5c7d5fc568-gq7lj --grace-period=0 --force -n kube-system
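When several pods are stuck, they can be listed first by filtering the STATUS column of kubectl get pods. The sketch below assumes the default column layout of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, STATUS, ...); the helper name terminating_pods is hypothetical.

```shell
#!/bin/sh
# Print "namespace name" for each pod whose STATUS column is Terminating.
# Reads `kubectl get pods --all-namespaces --no-headers` output on stdin.
terminating_pods() {
    awk '$4 == "Terminating" { print $1, $2 }'
}

# Example: review the list first, then force-delete each stuck pod.
#   kubectl get pods --all-namespaces --no-headers | terminating_pods
#   kubectl get pods --all-namespaces --no-headers | terminating_pods |
#   while read -r ns name; do
#       kubectl delete pod "$name" -n "$ns" --grace-period=0 --force
#   done
```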


    9. Problem accessing the dashboard
    Unauthorized error:
    { "kind": "Status",
    "apiVersion": "v1",
    "metadata": { },
    "status": "Failure",
    "message": "Unauthorized",
    "reason": "Unauthorized",
    "code": 401 }
     
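    Solution:
    Create a basic_auth_file file and add this line to it:
    admin,admin,1002
    The file format is: password,username,uid
    Then add --=/etc/kubernetes/basic_auth_file to the api-server config file (the config file mentioned above).
    Save the file and restart kube-apiserver.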
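Because the password comes first in each line, it is easy to swap the fields by accident. A small helper that takes the values in the more familiar user, password order and writes them in the file's order may help; this is a sketch, and the function name add_basic_auth_user is hypothetical.

```shell
#!/bin/sh
# Append one "password,username,uid" entry to a static password file.
# Usage: add_basic_auth_user FILE USER PASSWORD UID
add_basic_auth_user() {
    file="$1"
    user="$2"
    password="$3"
    user_id="$4"
    printf '%s,%s,%s\n' "$password" "$user" "$user_id" >> "$file"
}

# Example (run as root):
#   add_basic_auth_user /etc/kubernetes/basic_auth_file admin admin 1002
#   chmod 600 /etc/kubernetes/basic_auth_file
```

Restricting the file to mode 600 is a precaution, since it stores passwords in plain text.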
     
     



  • Original article: https://www.cnblogs.com/Qing-840/p/9279513.html