  • Fixing an etcd database that has run out of space

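    Handling the etcd disk-space alarm

    etcd's default space quota is 2 GB. Once the database exceeds the quota, etcd raises a NOSPACE alarm and rejects writes, so old data has to be cleaned up regularly.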

    Check the etcd logs

    8月 04 17:00:04 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.354750458s) to execute
    8月 04 17:00:05 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" range_end:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.31986XXXXXXXXXXXXXXXXXXXXX
    8月 04 17:05:09 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.136787261s) to execute
    8月 04 17:05:10 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" range_end:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.68081XXXXXXXXXXXXXXXXXXXXX
    8月 04 17:05:11 1.novalocal etcd[24848]: WARNING: 2020/08/04 17:05:11 grpc: Server.processUnaryRPC failed to write status connection error: desc = "transport is closing"
    8月 04 17:10:14 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.173390639s) to execute
    8月 04 17:10:15 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" range_end:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.42705XXXXXXXXXXXXXXXXXXXXX
    8月 04 17:15:19 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.311071626s) to execute
    8月 04 17:15:20 1.novalocal etcd[24848]: read-only range request "key:"XXXXXXXXXXXXXXXXXXXXX" range_end:"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.22721XXXXXXXXXXXXXXXXXXXXX
    

    The log contains a large number of "took too long" entries, such as (11.42705XXXXXXXXXXXXXXXXXXXXX
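To get a rough sense of how widespread the slow requests are, the log text can be piped through a small counter. This is a minimal sketch: the function name `count_slow_requests` and the abbreviated sample lines are made up for illustration and are not real etcd output.

```shell
#!/bin/sh
# Count "took too long" warnings in etcd log text read from stdin.
count_slow_requests() {
    grep -c 'took too long'
}

# Hypothetical, abbreviated lines in the shape of the log above.
sample_log='etcd[24848]: read-only range request "key:a" took too long (1.35s) to execute
etcd[24848]: grpc: Server.processUnaryRPC failed to write status
etcd[24848]: read-only range request "key:a range_end:b" took too long (11.3s) to execute'

printf '%s\n' "$sample_log" | count_slow_requests   # prints 2
```

Assuming etcd logs to the systemd journal under the rio-etcd unit, the same function could be fed with `journalctl -u rio-etcd --since today | count_slow_requests`.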

    Check the etcd cluster status

    • Check the endpoint status
    ETCDCTL_API=3 ./etcdctl --endpoints=$ip:$port --write-out=table endpoint status
    
    +------------------------+------------------+---------+---------+-----------+-----------+------------+
    |        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
    +------------------------+------------------+---------+---------+-----------+-----------+------------+
    |  http://127.0.0.1:2379 | 728d3145169b227d |  3.3.10 |  2.1 GB |     false |         6 |    3616392 |
    +------------------------+------------------+---------+---------+-----------+-----------+------------+
    
    • List the cluster alarms
    ETCDCTL_API=3 ./etcdctl --endpoints=$ip:$port alarm list
    
    memberID:XXXXXXXXXXXXXXX alarm:NOSPACE
    

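    The alarm reports NOSPACE, so either the etcd space quota (2 GB by default) must be raised or old data must be compacted. Once space is available again, the alarm still has to be cleared with an etcdctl command; otherwise the cluster stays unusable.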
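The two checks above (is a NOSPACE alarm raised, and how full is the database relative to its quota) can be sketched as small shell helpers. The function names are hypothetical, and the byte count is an assumed value chosen to be consistent with the "2.1 GB" shown in the status table, not a measured one.

```shell
#!/bin/sh
# Succeeds when the `alarm list` output on stdin contains a NOSPACE alarm.
has_nospace_alarm() {
    grep -q 'alarm:NOSPACE'
}

# Integer percentage of the quota ($2, bytes) occupied by the database ($1, bytes).
quota_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

default_quota=$(( 2 * 1024 * 1024 * 1024 ))   # etcd default quota: 2 GiB
db_size=2148000000                              # assumption: a size that displays as "2.1 GB"

if printf '%s\n' 'memberID:1 alarm:NOSPACE' | has_nospace_alarm; then
    echo "NOSPACE raised, database at $(quota_usage_percent "$db_size" "$default_quota")% of quota"
fi
```

In practice the input to `has_nospace_alarm` would be the output of the `alarm list` command shown earlier.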

    Raise the etcd quota from 2 GB to 8 GB by adding the following three flags

    vi /etc/systemd/system/rio-etcd.service
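    ## --auto-compaction-retention is measured in hours only with --auto-compaction-mode=periodic;
    ## with --auto-compaction-mode=revision the value is the number of revisions to keep.
    
    --auto-compaction-mode=periodic --auto-compaction-retention=24 --quota-backend-bytes=8589934592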
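The value 8589934592 passed to --quota-backend-bytes is simply 8 GiB expressed in bytes. A minimal sketch of the conversion, with `gib_to_bytes` as a hypothetical helper name:

```shell
#!/bin/sh
# Convert a quota given in GiB to the byte count expected by --quota-backend-bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

quota_bytes=$(gib_to_bytes 8)
echo "--quota-backend-bytes=$quota_bytes"   # prints --quota-backend-bytes=8589934592
```

Because the flags live in a systemd unit file, they only take effect after `systemctl daemon-reload` and a restart of the rio-etcd service.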
    

    Get the current revision of the etcd data

    rev=$(ETCDCTL_API=3 etcdctl --endpoints=$ip:$port endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
    
    echo $rev
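To see what the pipeline above extracts, it can be run against a canned status document instead of a live endpoint. This is a sketch only: the JSON below is a hypothetical, heavily abbreviated stand-in for real `endpoint status` output, and `extract_revision` is an illustrative name. It uses `grep -E` rather than `egrep`, which newer GNU grep releases flag as obsolescent.

```shell
#!/bin/sh
# Pull the first "revision" value out of endpoint-status JSON read from stdin.
extract_revision() {
    grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Hypothetical, abbreviated status JSON for a single endpoint.
sample_status='[{"Endpoint":"127.0.0.1:2379","Status":{"header":{"revision":1516,"raft_term":6},"version":"3.3.10"}}]'

rev=$(printf '%s\n' "$sample_status" | extract_revision)
echo "$rev"   # prints 1516
```

The `head -n 1` keeps a single value when several endpoints are queried at once and each reports its own revision.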
    
    • Compact away all revisions older than the current one
    ETCDCTL_API=3 etcdctl --endpoints=$ip:$port compact $rev
    
    • Defragment to release the space freed by compaction
    ETCDCTL_API=3 etcdctl --endpoints=$ip:$port defrag
    

    Disarm the alarm

    ETCDCTL_API=3 etcdctl --endpoints=$ip:$port alarm disarm
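The order of the steps matters: compact first, then defragment, and only then disarm the alarm. One way to sketch that as a single function that stops at the first failure is shown below. The names `run` and `recover_nospace` and the `DRY_RUN` switch are assumptions introduced for illustration; with `DRY_RUN=1` the commands are only printed, so the sequence can be reviewed before it touches a cluster.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Compact, defragment, then disarm; stop at the first step that fails.
recover_nospace() {
    endpoint=$1   # e.g. 127.0.0.1:2379
    rev=$2        # revision obtained from endpoint status
    export ETCDCTL_API=3
    run etcdctl --endpoints="$endpoint" compact "$rev" &&
    run etcdctl --endpoints="$endpoint" defrag &&
    run etcdctl --endpoints="$endpoint" alarm disarm
}

# Dry run with a hypothetical endpoint and revision: prints the three commands.
DRY_RUN=1
recover_nospace 127.0.0.1:2379 1516
```

Unsetting `DRY_RUN` (or setting it to 0) makes the same call run the commands for real.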
    

    Verify that new data can be written

    ETCDCTL_API=3 etcdctl --endpoints=$ip:$port put newkeytestfornospace 123
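A slightly stricter check writes the test key, reads it back, and then deletes it so nothing is left behind. The sketch below assumes a hypothetical helper name `check_writable` and an `ETCDCTL` variable that lets the binary be swapped out (for example with a stub when testing); neither comes from the original procedure.

```shell
#!/bin/sh
# Write a test key, read it back, delete it; succeeds only if the value round-trips.
check_writable() {
    endpoint=$1
    key=newkeytestfornospace
    ctl=${ETCDCTL:-etcdctl}   # assumption: overridable path to the etcdctl binary
    ETCDCTL_API=3 "$ctl" --endpoints="$endpoint" put "$key" 123 >/dev/null || return 1
    value=$(ETCDCTL_API=3 "$ctl" --endpoints="$endpoint" get "$key" --print-value-only) || return 1
    ETCDCTL_API=3 "$ctl" --endpoints="$endpoint" del "$key" >/dev/null
    [ "$value" = "123" ]
}
```

A call such as `check_writable "$ip:$port" && echo writable` would then confirm that the cluster accepts writes again.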
    

    Reference

    https://www.cnblogs.com/lvcisco/p/10775021.html
    
  • Original article: https://www.cnblogs.com/evescn/p/13438905.html