  • Plugin for monitoring Elasticsearch [check_es_system]

    Plugin page: https://www.claudiokuenzler.com/monitoring-plugins/check_es_system.php

    Download

    #!/bin/bash
    ################################################################################
    # Script:       check_es_system.sh                                             #
    # Author:       Claudio Kuenzler www.claudiokuenzler.com                       #
    # Purpose:      Monitor ElasticSearch Store (Disk) Usage                       #
    # Official doc: www.claudiokuenzler.com/monitoring-plugins/check_es_system.php #
    # License:      GPLv2                                                          #
    # GNU General Public Licence (GPL) http://www.gnu.org/                         #
    # This program is free software; you can redistribute it and/or                #
    # modify it under the terms of the GNU General Public License                  #
    # as published by the Free Software Foundation; either version 2               #
    # of the License, or (at your option) any later version.                       #
    #                                                                              #
    # This program is distributed in the hope that it will be useful,              #
    # but WITHOUT ANY WARRANTY; without even the implied warranty of               #
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the                #
    # GNU General Public License for more details.                                 #
    #                                                                              #
    # You should have received a copy of the GNU General Public License            #
    # along with this program; if not, see <https://www.gnu.org/licenses/>.        #
    #                                                                              #
    # Copyright 2016,2018-2020 Claudio Kuenzler                                    #
    # Copyright 2018 Tomas Barton                                                  #
    # Copyright 2020 NotAProfessionalDeveloper                                     #
    #                                                                              #
    # History:                                                                     #
    # 20160429: Started programming plugin                                         #
    # 20160601: Continued programming. Working now as it should =)                 #
    # 20160906: Added memory usage check, check types option (-t)                  #
    # 20160906: Renamed plugin from check_es_store to check_es_system              #
    # 20160907: Change internal referenced variable name for available size        #
    # 20160907: Output now contains both used and available sizes                  #
    # 20161017: Add missing -t in usage output                                     #
    # 20180105: Fix if statement for authentication (@deric)                       #
    # 20180105: Fix authentication when wrong credentials were used                #
    # 20180313: Configure max_time for Elastic to respond (@deric)                 #
    # 20190219: Fix alternative subject name in ssl (issue 4), direct to auth      #
    # 20190220: Added status check type                                            #
    # 20190403: Check for mandatory parameter checktype, adjust help               #
    # 20190403: Catch connection refused error                                     #
    # 20190426: Catch unauthorized (403) error                                     #
    # 20190626: Added readonly check type                                          #
    # 20190905: Catch empty cluster health status (issue #13)                      #
    # 20190909: Added jthreads and tps (thread pool stats) check types             #
    # 20190909: Handle correct curl return codes                                   #
    # 20190924: Missing 'than' in tps output                                       #
    # 20191104: Added master check type                                            #
    # 20200401: Fix/handle 503 errors with curl exit code 0 (issue #20)            #
    # 20200409: Fix 503 error lookup (issue #22)                                   #
    # 20200430: Support both jshon and jq as json parsers (issue #18)              #
    # 20200609: Fix readonly check on ALL indices (issue #26)                      #
    ################################################################################
    #Variables and defaults
    STATE_OK=0              # define the exit code if status is OK
    STATE_WARNING=1         # define the exit code if status is Warning
    STATE_CRITICAL=2        # define the exit code if status is Critical
    STATE_UNKNOWN=3         # define the exit code if status is Unknown
    export PATH=$PATH:/usr/local/bin:/usr/bin:/bin # Set path
    version=1.8.1
    port=9200
    httpscheme=http
    unit=G
    indexes='_all'
    max_time=30
    parsers=(jshon jq)
    ################################################################################
    #Functions
    help () {
    echo -e "$0 $version (c) 2016-$(date +%Y) Claudio Kuenzler and contributors (open source rulez!)
    
    Usage: ./check_es_system.sh -H ESNode [-P port] [-S] [-u user] [-p pass] -t checktype [-d int] [-o unit] [-w int] [-c int] [-m int] [-X parser]
    
    Options:
    
       *  -H Hostname or ip address of ElasticSearch Node
          -P Port (defaults to 9200)
          -S Use https
          -u Username if authentication is required
          -p Password if authentication is required
       *  -t Type of check (disk, mem, status, readonly, jthreads, tps, master)
       +  -d Available size of disk or memory (ex. 20)
          -o Disk space unit (K|M|G) (defaults to G)
          -i Space separated list of indexes to be checked for readonly (default: '_all')
          -w Warning threshold (see usage notes below)
          -c Critical threshold (see usage notes below)
          -m Maximum time in seconds to wait for response (default: 30)
          -e Expect master node (used with 'master' check)
          -X The json parser to be used jshon or jq (default: jshon)
          -h Help!
    
    *mandatory options
    +mandatory option for types disk,mem
    
    Threshold format for 'disk' and 'mem': int (for percent), defaults to 80 (warn) and 95 (crit)
    Threshold format for 'tps': int,int,int (active, queued, rejected), no defaults
    Threshold format for all other check types: int, no defaults
    
    Requirements: curl, expr and one of $(IFS=,; echo "${parsers[*]}")"
    exit $STATE_UNKNOWN;
    }
    
    authlogic () {
    if [[ -z $user ]] && [[ -z $pass ]]; then echo "ES SYSTEM UNKNOWN - Authentication required but missing username and password"; exit $STATE_UNKNOWN
    elif [[ -n $user ]] && [[ -z $pass ]]; then echo "ES SYSTEM UNKNOWN - Authentication required but missing password"; exit $STATE_UNKNOWN
    elif [[ -n $pass ]] && [[ -z $user ]]; then echo "ES SYSTEM UNKNOWN - Missing username"; exit $STATE_UNKNOWN
    fi
    }
    
    unitcalc() {
    # ES presents the currently used disk space in Bytes
    if [[ -n $unit ]]; then
      case $unit in
        K) availsize=$(expr $available \* 1024); outputsize=$(expr ${size} / 1024);;
        M) availsize=$(expr $available \* 1024 \* 1024); outputsize=$(expr ${size} / 1024 / 1024);;
        G) availsize=$(expr $available \* 1024 \* 1024 \* 1024); outputsize=$(expr ${size} / 1024 / 1024 / 1024);;
      esac
      if [[ -n $warning ]] ; then
        warningsize=$(expr $warning \* ${availsize} / 100)
      fi
      if [[ -n $critical ]] ; then
        criticalsize=$(expr $critical \* ${availsize} / 100)
      fi
      usedpercent=$(expr $size \* 100 / $availsize)
    else echo "UNKNOWN - Shouldnt exit here. No units given"; exit $STATE_UNKNOWN
    fi
    }
    
    availrequired() {
    if [ -z ${available} ]; then echo "UNKNOWN - Missing parameter '-d'"; exit $STATE_UNKNOWN; fi
    }
    
    thresholdlogic () {
    if [ -n "$warning" ] && [ -z "$critical" ]; then echo "UNKNOWN - Define both warning and critical thresholds"; exit $STATE_UNKNOWN; fi
    if [ -n "$critical" ] && [ -z "$warning" ]; then echo "UNKNOWN - Define both warning and critical thresholds"; exit $STATE_UNKNOWN; fi
    }
    
    default_percentage_thresholds() {
    if [ -z $warning ] || [ "${warning}" = "" ]; then warning=80; fi
    if [ -z $critical ] || [ "${critical}" = "" ]; then critical=95; fi
    }
    
    json_parse() {
      json_parse_usage() { echo "$0: [-r] [-q] [-c] [-a] -x arg1 -x arg2 ..." 1>&2; exit; }
    
      local OPTIND opt r q c a x
      while getopts ":rqcax:" opt
      do
        case "${opt}" in
        r)  raw=1;;
        q)  quiet=1;; # only required for jshon
        c)  continue=1;; # only required for jshon
        a)  across=1;;
        x)  args+=("$OPTARG");;
        *)  json_parse_usage;;
        esac
      done
    
      case ${parser} in
      jshon)
        cmd=()
        for arg in "${args[@]}"; do
          cmd+=(-e $arg)
        done
        jshon ${quiet:+-Q} ${continue:+-C} ${across:+-a} "${cmd[@]}" ${raw:+-u}
        ;;
      jq)
        cmd=()
        for arg in "${args[@]}"; do
          cmd+=(.$arg)
        done
        jq ${raw:+-r} $(IFS=; echo ${across:+.[]}"${cmd[*]}")
        ;;
      esac
    }
    
    ################################################################################
    # Check for people who need help - aren't we all nice ;-)
    if [ "${1}" = "--help" -o "${#}" = "0" ]; then help; exit $STATE_UNKNOWN; fi
    ################################################################################
    # Get user-given variables
    while getopts "H:P:Su:p:d:o:i:w:c:t:m:e:X:" Input
    do
      case ${Input} in
      H)      host=${OPTARG};;
      P)      port=${OPTARG};;
      S)      httpscheme=https;;
      u)      user=${OPTARG};;
      p)      pass=${OPTARG};;
      d)      available=${OPTARG};;
      o)      unit=${OPTARG};;
      i)      indexes=${OPTARG};;
      w)      warning=${OPTARG};;
      c)      critical=${OPTARG};;
      t)      checktype=${OPTARG};;
      m)      max_time=${OPTARG};;
      e)      expect_master=${OPTARG};;
      X)      parser=${OPTARG:=jshon};;
      *)      help;;
      esac
    done
    
    # Check for mandatory opts
    if [ -z ${host} ]; then help; exit $STATE_UNKNOWN; fi
    if [ -z ${checktype} ]; then help; exit $STATE_UNKNOWN; fi
    ################################################################################
    # Check requirements
    for cmd in curl expr ${parser}; do
      if ! `which ${cmd} >/dev/null 2>&1`; then
        echo "UNKNOWN: ${cmd} does not exist, please check if command exists and PATH is correct"
        exit ${STATE_UNKNOWN}
      fi
    done
    # Find parser
    if [ -z ${parser} ]; then
      for cmd in ${parsers[@]}; do
        if `which ${cmd} >/dev/null 2>&1`; then
          parser=${cmd}
          break
        fi
      done
      if [ -z "${parser}" ]; then
        echo "UNKNOWN: No JSON parser found. Either one of the following is required: $(IFS=,; echo "${parsers[*]}")"
        exit ${STATE_UNKNOWN}
      fi
    fi
    
    ################################################################################
    # Retrieve information from Elasticsearch
    getstatus() {
    esurl="${httpscheme}://${host}:${port}/_cluster/stats"
    eshealthurl="${httpscheme}://${host}:${port}/_cluster/health"
    if [[ -z $user ]]; then
      # Without authentication
      esstatus=$(curl -k -s --max-time ${max_time} $esurl)
      esstatusrc=$?
      if [[ $esstatusrc -eq 7 ]]; then
        echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
        exit $STATE_CRITICAL
      elif [[ $esstatusrc -eq 28 ]]; then
        echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
        exit $STATE_CRITICAL
      elif [[ $esstatus =~ "503 Service Unavailable" ]]; then
        echo "ES SYSTEM CRITICAL - Elasticsearch not available: ${host}:${port} return error 503"
        exit $STATE_CRITICAL
      fi
      # Additionally get cluster health infos
      if [ $checktype = status ]; then
        eshealth=$(curl -k -s --max-time ${max_time} $eshealthurl)
        if [[ -z $eshealth ]]; then
          echo "ES SYSTEM CRITICAL - unable to get cluster health information"
          exit $STATE_CRITICAL
        fi
      fi
    fi
    
    if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
      # Authentication required
      authlogic
      esstatus=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} $esurl)
      esstatusrc=$?
      if [[ $esstatusrc -eq 7 ]]; then
        echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
        exit $STATE_CRITICAL
      elif [[ $esstatusrc -eq 28 ]]; then
        echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
        exit $STATE_CRITICAL
      elif [[ $esstatus =~ "503 Service Unavailable" ]]; then
        echo "ES SYSTEM CRITICAL - Elasticsearch not available: ${host}:${port} return error 503"
        exit $STATE_CRITICAL
      elif [[ -n $(echo $esstatus | grep -i "unable to authenticate") ]]; then
        echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
        exit $STATE_CRITICAL
      elif [[ -n $(echo $esstatus | grep -i "unauthorized") ]]; then
        echo "ES SYSTEM CRITICAL - User $user is unauthorized"
        exit $STATE_CRITICAL
      fi
      # Additionally get cluster health infos
      if [[ $checktype = status ]]; then
        eshealth=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} $eshealthurl)
        if [[ -z $eshealth ]]; then
          echo "ES SYSTEM CRITICAL - unable to get cluster health information"
          exit $STATE_CRITICAL
        fi
      fi
    fi
    
    # Catch empty reply from server (typically happens when ssl port used with http connection)
    if [[ -z $esstatus ]] || [[ $esstatus = '' ]]; then
      echo "ES SYSTEM UNKNOWN - Empty reply from server (verify ssl settings)"
      exit $STATE_UNKNOWN
    fi
    }
    ################################################################################
    # Do the checks
    case $checktype in
    disk) # Check disk usage
      availrequired
      default_percentage_thresholds
      getstatus
      size=$(echo $esstatus | json_parse -x indices -x store -x "size_in_bytes")
      unitcalc
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds
        thresholdlogic
        if [ $size -ge $criticalsize ]; then
          echo "ES SYSTEM CRITICAL - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_CRITICAL
        elif [ $size -ge $warningsize ]; then
          echo "ES SYSTEM WARNING - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_WARNING
        else
          echo "ES SYSTEM OK - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_OK
        fi
      else
        # No thresholds
        echo "ES SYSTEM OK - Disk usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_disk=${size}B;;;;"
        exit $STATE_OK
      fi
      ;;
    
    mem) # Check memory usage
      availrequired
      default_percentage_thresholds
      getstatus
      size=$(echo $esstatus | json_parse -x nodes -x jvm -x mem -x "heap_used_in_bytes")
      unitcalc
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds
        thresholdlogic
        if [ $size -ge $criticalsize ]; then
          echo "ES SYSTEM CRITICAL - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_CRITICAL
        elif [ $size -ge $warningsize ]; then
          echo "ES SYSTEM WARNING - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_WARNING
        else
          echo "ES SYSTEM OK - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;${warningsize};${criticalsize};;"
          exit $STATE_OK
        fi
      else
        # No thresholds
        echo "ES SYSTEM OK - Memory usage is at ${usedpercent}% ($outputsize $unit from $available $unit)|es_memory=${size}B;;;;"
        exit $STATE_OK
      fi
      ;;
    
    status) # Check Elasticsearch status
      getstatus
      status=$(echo $esstatus | json_parse -r -x status)
      shards=$(echo $esstatus | json_parse -r -x indices -x shards -x total)
      docs=$(echo $esstatus | json_parse -r -x indices -x docs -x count)
      nodest=$(echo $esstatus | json_parse -r -x nodes -x count -x total)
      nodesd=$(echo $esstatus | json_parse -r -x nodes -x count -x data)
      relocating=$(echo $eshealth | json_parse -r -x relocating_shards)
      init=$(echo $eshealth | json_parse -r -x initializing_shards)
      unass=$(echo $eshealth | json_parse -r -x unassigned_shards)
      if [ "$status" = "green" ]; then
        echo "ES SYSTEM OK - Elasticsearch Cluster is green (${nodest} nodes, ${nodesd} data nodes, ${shards} shards, ${docs} docs)|total_nodes=${nodest};;;; data_nodes=${nodesd};;;; total_shards=${shards};;;; relocating_shards=${relocating};;;; initializing_shards=${init};;;; unassigned_shards=${unass};;;; docs=${docs};;;;"
        exit $STATE_OK
      elif [ "$status" = "yellow" ]; then
        echo "ES SYSTEM WARNING - Elasticsearch Cluster is yellow (${nodest} nodes, ${nodesd} data nodes, ${shards} shards, ${relocating} relocating shards, ${init} initializing shards, ${unass} unassigned shards, ${docs} docs)|total_nodes=${nodest};;;; data_nodes=${nodesd};;;; total_shards=${shards};;;; relocating_shards=${relocating};;;; initializing_shards=${init};;;; unassigned_shards=${unass};;;; docs=${docs};;;;"
          exit $STATE_WARNING
      elif [ "$status" = "red" ]; then
        echo "ES SYSTEM CRITICAL - Elasticsearch Cluster is red (${nodest} nodes, ${nodesd} data nodes, ${shards} shards, ${relocating} relocating shards, ${init} initializing shards, ${unass} unassigned shards, ${docs} docs)|total_nodes=${nodest};;;; data_nodes=${nodesd};;;; total_shards=${shards};;;; relocating_shards=${relocating};;;; initializing_shards=${init};;;; unassigned_shards=${unass};;;; docs=${docs};;;;"
          exit $STATE_CRITICAL
      fi
      ;;
    
    readonly) # Check Readonly status on given indexes
      icount=0
      for index in $indexes; do
        if [[ -z $user ]]; then
          # Without authentication
          settings=$(curl -k -s --max-time ${max_time} ${httpscheme}://${host}:${port}/$index/_settings)
          settingsrc=$? # save curl's exit code; $? is overwritten by the next test
          if [[ $settingsrc -eq 7 ]]; then
            echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
            exit $STATE_CRITICAL
          elif [[ $settingsrc -eq 28 ]]; then
            echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
            exit $STATE_CRITICAL
          fi
          rocount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only | grep -c true)
          roadcount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only_allow_delete | grep -c true)
          if [[ $rocount -gt 0 ]]; then
            output[${icount}]=" $index is read-only -"
            roerror=true
          fi
          if [[ $roadcount -gt 0 ]]; then
            output[${icount}]+=" $index is read-only (allow delete) -"
            roerror=true
          fi
        fi
    
        if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
          # Authentication required
          authlogic
          settings=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} ${httpscheme}://${host}:${port}/$index/_settings)
          settingsrc=$?
          if [[ $settingsrc -eq 7 ]]; then
            echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
            exit $STATE_CRITICAL
          elif [[ $settingsrc -eq 28 ]]; then
            echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
            exit $STATE_CRITICAL
          elif [[ -n $(echo $settings | grep -i "unable to authenticate") ]]; then
            echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
            exit $STATE_CRITICAL
          elif [[ -n $(echo $settings | grep -i "unauthorized") ]]; then
            echo "ES SYSTEM CRITICAL - User $user is unauthorized"
            exit $STATE_CRITICAL
          fi
          rocount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only | grep -c true)
          roadcount=$(echo $settings | json_parse -r -q -c -a -x settings -x index -x blocks -x read_only_allow_delete | grep -c true)
          if [[ $rocount -gt 0 ]]; then
            if [[ "$index" = "_all" ]]; then 
              output[${icount}]=" $rocount index(es) found read-only -"
            else output[${icount}]=" $index is read-only -"
            fi
            roerror=true
          fi
          if [[ $roadcount -gt 0 ]]; then
            if [[ "$index" = "_all" ]]; then 
              output[${icount}]+=" $roadcount index(es) found read-only (allow delete) -"
            else output[${icount}]+=" $index is read-only (allow delete) -"
            fi
            roerror=true
          fi
        fi
        let icount++
      done
    
      if [[ $roerror ]]; then
        echo "ES SYSTEM CRITICAL - ${output[*]}"
        exit $STATE_CRITICAL
      else
        echo "ES SYSTEM OK - Elasticsearch Indexes ($indexes) are writeable"
        exit $STATE_OK
      fi
      ;;
    
    jthreads) # Check JVM threads
      getstatus
      threads=$(echo $esstatus | json_parse -r -x nodes -x jvm -x "threads")
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds (compare against -w/-c directly; no size conversion for thread counts)
        thresholdlogic
        if [[ $threads -ge $critical ]]; then
          echo "ES SYSTEM CRITICAL - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
          exit $STATE_CRITICAL
        elif [[ $threads -ge $warning ]]; then
          echo "ES SYSTEM WARNING - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
          exit $STATE_WARNING
        else
          echo "ES SYSTEM OK - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
          exit $STATE_OK
        fi
      else
        # No thresholds
        echo "ES SYSTEM OK - Number of JVM threads is ${threads}|es_jvm_threads=${threads};${warning};${critical};;"
        exit $STATE_OK
      fi
      ;;
    
    tps) # Check Thread Pool Statistics
      if [[ -z $user ]]; then
        # Without authentication
        threadpools=$(curl -k -s --max-time ${max_time} ${httpscheme}://${host}:${port}/_cat/thread_pool)
        threadpoolrc=$?
        if [[ $threadpoolrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $threadpoolrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        fi
      fi
    
      if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
        # Authentication required
        authlogic
        threadpools=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} ${httpscheme}://${host}:${port}/_cat/thread_pool)
        threadpoolrc=$?
        if [[ $threadpoolrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $threadpoolrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $threadpools | grep -i "unable to authenticate") ]]; then
          echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $threadpools | grep -i "unauthorized") ]]; then
          echo "ES SYSTEM CRITICAL - User $user is unauthorized"
          exit $STATE_CRITICAL
        fi
      fi
    
      tpname=($(echo "$threadpools" | awk '{print $1"-"$2}' | sed "s/\n//g"))
      tpactive=($(echo "$threadpools" | awk '{print $3}' | sed "s/\n//g"))
      tpqueue=($(echo "$threadpools" | awk '{print $4}' | sed "s/\n//g"))
      tprejected=($(echo "$threadpools" | awk '{print $5}' | sed "s/\n//g"))
    
      if [ -n "${warning}" ] || [ -n "${critical}" ]; then
        # Handle thresholds. They have to come in a special format: n,n,n (active, queue, rejected)
        thresholdlogic
        wactive=$(echo ${warning} | awk -F',' '{print $1}')
        wqueue=$(echo ${warning} | awk -F',' '{print $2}')
        wrejected=$(echo ${warning} | awk -F',' '{print $3}')
        cactive=$(echo ${critical} | awk -F',' '{print $1}')
        cqueue=$(echo ${critical} | awk -F',' '{print $2}')
        crejected=$(echo ${critical} | awk -F',' '{print $3}')
    
        i=0; for tp in ${tpname[*]}; do
          perfdata[$i]="tp_${tp}_active=${tpactive[$i]};${wactive};${cactive};; tp_${tp}_queue=${tpqueue[$i]};${wqueue};${cqueue};; tp_${tp}_rejected=${tprejected[$i]};${wrejected};${crejected};; "
          let i++
        done
    
        i=0
        for tpa in $(echo ${tpactive[*]}); do
          if [[ $tpa -ge $cactive ]]; then
            echo "Thread Pool ${tpname[$i]} is critical: Active ($tpa) is equal or higher than threshold ($cactive)|${perfdata[*]}"
            exit $STATE_CRITICAL
          elif [[ $tpa -ge $wactive ]]; then
            echo "Thread Pool ${tpname[$i]} is warning: Active ($tpa) is equal or higher than threshold ($wactive)|${perfdata[*]}"
            exit $STATE_WARNING
          fi
          let i++
        done
    
        i=0
        for tpq in $(echo ${tpqueue[*]}); do
          if [[ $tpq -ge $cqueue ]]; then
            echo "Thread Pool ${tpname[$i]} is critical: Queue ($tpq) is equal or higher than threshold ($cqueue)|${perfdata[*]}"
            exit $STATE_CRITICAL
          elif [[ $tpq -ge $wqueue ]]; then
            echo "Thread Pool ${tpname[$i]} is warning: Queue ($tpq) is equal or higher than threshold ($wqueue)|${perfdata[*]}"
            exit $STATE_WARNING
          fi
          let i++
        done
    
        i=0
        for tpr in $(echo ${tprejected[*]}); do
          if [[ $tpr -ge $crejected ]]; then
            echo "Thread Pool ${tpname[$i]} is critical: Rejected ($tpr) is equal or higher than threshold ($crejected)|${perfdata[*]}"
            exit $STATE_CRITICAL
          elif [[ $tpr -ge $wrejected ]]; then
            echo "Thread Pool ${tpname[$i]} is warning: Rejected ($tpr) is equal or higher than threshold ($wrejected)|${perfdata[*]}"
            exit $STATE_WARNING
          fi
          let i++
        done
    
       echo "ES SYSTEM OK - Found ${#tpname[*]} thread pools in cluster|${perfdata[*]}"
       exit $STATE_OK
       fi
    
      # No Thresholds
      i=0; for tp in ${tpname[*]}; do
        perfdata[$i]="tp_${tp}_active=${tpactive[$i]};;;; tp_${tp}_queue=${tpqueue[$i]};;;; tp_${tp}_rejected=${tprejected[$i]};;;; "
        let i++
      done
      echo "ES SYSTEM OK - Found ${#tpname[*]} thread pools in cluster|${perfdata[*]}"
      exit $STATE_OK
      ;;
    
    master) # Check Cluster Master
      if [[ -z $user ]]; then
        # Without authentication
        master=$(curl -k -s --max-time ${max_time} ${httpscheme}://${host}:${port}/_cat/master)
        masterrc=$?
        if [[ $masterrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $masterrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        fi
      fi
    
      if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then
        # Authentication required
        authlogic
        master=$(curl -k -s --max-time ${max_time} --basic -u ${user}:${pass} ${httpscheme}://${host}:${port}/_cat/master)
        masterrc=$?
        if [[ $masterrc -eq 7 ]]; then
          echo "ES SYSTEM CRITICAL - Failed to connect to ${host} port ${port}: Connection refused"
          exit $STATE_CRITICAL
        elif [[ $masterrc -eq 28 ]]; then
          echo "ES SYSTEM CRITICAL - server did not respond within ${max_time} seconds"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $master | grep -i "unable to authenticate") ]]; then
          echo "ES SYSTEM CRITICAL - Unable to authenticate user $user for REST request"
          exit $STATE_CRITICAL
        elif [[ -n $(echo $master | grep -i "unauthorized") ]]; then
          echo "ES SYSTEM CRITICAL - User $user is unauthorized"
          exit $STATE_CRITICAL
        fi
      fi
    
      masternode=$(echo "$master" | awk '{print $NF}')
    
      if [[ -n ${expect_master} ]]; then
        if [[ "${expect_master}" = "${masternode}" ]]; then
          echo "ES SYSTEM OK - Master node is $masternode"
          exit $STATE_OK
        else
          echo "ES SYSTEM WARNING - Master node is $masternode but expected ${expect_master}"
          exit $STATE_WARNING
        fi
      else
        echo "ES SYSTEM OK - Master node is $masternode"
        exit $STATE_OK
      fi
      ;;
    
    *) help
    esac
    
    

    Requirements

    • curl (SUSE: zypper in curl, Debian/Ubuntu: apt-get install curl, CentOS/RHEL: yum install curl)

    • A JSON parser, one of:

      • jshon (SUSE: search for jshon, Debian/Ubuntu: apt-get install jshon)
      • jq (SUSE: zypper in jq, Debian/Ubuntu: apt-get install jq)
    • Bash internal commands/functions (the plugin checks that they exist)
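
The script listing above verifies its requirements at start-up with `which`. A minimal sketch of the same idea follows, using the shell builtin `command -v` instead. The helper name `missing_cmds` and the command name `no_such_command_xyz` are hypothetical and not part of the plugin.

```shell
#!/bin/bash
# Hypothetical helper (not part of check_es_system.sh): print every
# command from the argument list that cannot be found in PATH.
missing_cmds() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Demonstration with one command that exists and one made-up name.
missing_cmds sh no_such_command_xyz   # prints: no_such_command_xyz
```

On a monitoring host the call would more likely be `missing_cmds curl expr jq` (or `jshon`), with an UNKNOWN exit if anything is printed.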

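    Definition of the parameters

    Parameter  Description
    -H *       Hostname or IP address of the ElasticSearch node
    -P         Port (defaults to 9200)
    -S         Use secure HTTP (https)
    -u         Username, if authentication is required
    -p         Password, if authentication is required
    -t *       Type of check to run (disk|mem|status|readonly|jthreads|tps|master)
    -d +       Available disk or memory size (e.g. 20)
    -o         Size unit (K|M|G) (defaults to G)
    -i         Space-separated list of indexes to check for read-only (default: '_all')
    -w         Warning threshold. Format for 'disk' and 'mem': int (percent), defaults to 80 (warn) and 95 (crit). Format for 'tps': int,int,int (active, queued, rejected), no defaults. Format for all other check types: int, no defaults.
    -c         Critical threshold. Format for 'disk' and 'mem': int (percent), defaults to 80 (warn) and 95 (crit). Format for 'tps': int,int,int (active, queued, rejected), no defaults. Format for all other check types: int, no defaults.
    -m         Maximum time in seconds to wait for a response from the Elasticsearch server (default: 30)
    -e         Node expected to be the master of the Elasticsearch cluster (only affects the 'master' check)
    -X         JSON parser to use, either jshon or jq (default: jshon)
    -h         Help!

    * mandatory parameter

    + mandatory for the check types disk and mem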
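
The comma-separated threshold format of the `tps` check can be split into its three parts as sketched below. The plugin itself uses `awk -F','` for this; the function name `split_tps_threshold` is hypothetical, and the sketch assumes exactly three comma-separated integers.

```shell
#!/bin/bash
# Hypothetical helper: split a tps threshold "active,queue,rejected"
# into its three values and print them on one line.
split_tps_threshold() {
  local active queue rejected
  IFS=, read -r active queue rejected <<EOF
$1
EOF
  echo "active=${active} queue=${queue} rejected=${rejected}"
}

split_tps_threshold "10,50,1000"   # prints: active=10 queue=50 rejected=1000
```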

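    Definition of the check types

    Type      Description
    status    Checks the current status of the cluster (green, yellow, red). Additional information (nodes, shards, docs) is shown as well. When the status is yellow or red, the relevant shard information (e.g. initializing or unassigned) is shown.
    mem       Checks the current memory usage and compares it with the available memory defined with the -d parameter. Thresholds possible.
    disk      Checks the current disk usage and compares it with the available disk capacity defined with the -d parameter. Thresholds possible.
    readonly  Checks whether the indexes listed with the -i parameter (default: _all) are flagged read-only.
    jthreads  Monitors the number of Java threads across the ES cluster. Thresholds possible.
    tps       Monitors the thread pool statistics across the ES cluster. For each thread pool, the 'active', 'queue' and 'rejected' counters are checked. A steadily growing counter may indicate a problem in your Elasticsearch cluster. Thresholds possible.
    master    Monitors the current master node of the ES cluster. The -e parameter can be used to check that a certain node is the master and to alert if it is not.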
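
For the `master` check, the script listing reads `/_cat/master` and takes the last whitespace-separated field as the node name. A sketch of that extraction with a made-up response line follows; the node ID, addresses and node name are hypothetical, as is the function name `master_name`.

```shell
#!/bin/bash
# Extract the master node name: the last column of the _cat/master output.
master_name() {
  echo "$1" | awk '{print $NF}'
}

# Hypothetical sample line in the "id host ip node" layout of _cat/master.
sample="AbCdEfGhIjKlMnOpQrStUv 10.0.0.11 10.0.0.11 es01"
master_name "$sample"   # prints: es01
```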

    Usage / running the plugin on the command line

    Usage:

    ./check_es_system.sh -H hostname [-P port] [-S] [-u user] [-p password] -t checktype [-d capacity] [-o unit] [-i indexes] [-w warning] [-c critical] [-m time]

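    Example 1: Classic status check. Here the Elasticsearch cluster runs on escluster.example.com and is accessed over HTTPS (-S enables https) on port 9243 with the basic auth credentials user and password. The output shows the cluster status (green) and some additional information.
    The performance data contains the number of nodes, shard information and the total number of documents.
    Note: When the status changes to yellow (= WARNING) or red (= CRITICAL), the output also contains information on relocating, initializing and unassigned shards. The performance data stays the same, so graphing databases are not confused.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t status
    ES SYSTEM OK - Elasticsearch Cluster is green (3 nodes, 2 data nodes, 114 shards, 8426885 docs)|total_nodes=3;;;; data_nodes=2;;;; total_shards=114;;;; relocating_shards=0;;;; initializing_shards=0;;;; unassigned_shards=0;;;; docs=8426885;;;;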
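
The mapping from cluster colour to plugin exit code used by the `status` check can be sketched as follows. The function name `status_to_code` is hypothetical; the codes are the STATE_* values defined at the top of the script. Returning 3 (UNKNOWN) for any other value is an assumption of this sketch, since the script listing has no branch for that case.

```shell
#!/bin/bash
# Hypothetical helper: print the monitoring exit code for a cluster status.
status_to_code() {
  case "$1" in
    green)  echo 0 ;;  # STATE_OK
    yellow) echo 1 ;;  # STATE_WARNING
    red)    echo 2 ;;  # STATE_CRITICAL
    *)      echo 3 ;;  # STATE_UNKNOWN (assumption for unexpected values)
  esac
}

status_to_code green    # prints: 0
status_to_code yellow   # prints: 1
status_to_code red      # prints: 2
```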

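    Example 2: Disk usage check. The same Elasticsearch cluster as before is accessed. Because this ES runs in the cloud, no host monitoring is available (meaning: we cannot monitor the file system). But we know that 128GB of disk space are available, so we tell the plugin that our capacity is 128GB (-d 128) and let it do the rest:

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t disk -d 128
    ES SYSTEM OK - Disk usage is at 14% (18 G from 128 G)|es_disk=19637018938B;109951162777;130567005798;;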

    Example 3: Memory usage check. As before, ES runs in the cloud and we cannot monitor memory on the host itself. But we have booked an Elasticsearch service with 24GB of RAM.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t mem -d 24
    ES SYSTEM OK - Memory usage is at 58% (14 G from 24 G)|es_memory=15107616304B;20615843020;24481313587;;
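The figures in the disk and memory examples can be reproduced with plain integer arithmetic. The sketch below assumes that the unit G means 1024^3 bytes and that results are rounded down; these assumptions match the numbers shown, but the plugin may compute them differently. The function names are hypothetical.

```shell
#!/bin/sh
# usage_percent USED_BYTES CAPACITY_G -> usage in percent, rounded down
usage_percent() {
    capacity_bytes=$(( $2 * 1024 * 1024 * 1024 ))
    echo $(( $1 * 100 / capacity_bytes ))
}

# threshold_bytes CAPACITY_G PERCENT -> threshold in bytes, rounded down
threshold_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 * $2 / 100 ))
}

usage_percent 19637018938 128   # disk example:   prints 14
threshold_bytes 128 80          # warning (80%):  prints 109951162777
threshold_bytes 128 95          # critical (95%): prints 130567005798
usage_percent 15107616304 24    # memory example: prints 58
```

The two threshold values are the ones that appear in the es_disk performance data, which is why no -w or -c had to be passed: 80 and 95 percent are the defaults.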

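    Example 4: Read-only index check. The plugin checks whether the indexes given with the -i parameter are flagged read-only. If the -i parameter is not used, all indexes are checked.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t readonly -i "filebeat-* logstash-*"
    CRITICAL - Elasticsearch Index filebeat-* is read only (found 53 index(es) set to read only) Elasticsearch Index logstash-* is read only (found 125 index(es) set to read only)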

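    Example 5: JVM threads check. The plugin checks the number of JVM threads across the whole cluster. It should warn when 200 or more threads are running and go critical when 300 or more threads are running.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t jthreads -w 200 -c 300
    ES SYSTEM CRITICAL - Number of JVM threads is 319|es_jvm_threads=319;200;300;;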
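The "or more" threshold logic of this example can be written as a small comparison. This is a sketch of the rule as described in the text, not the plugin's code; `evaluate` is a hypothetical function name.

```shell
#!/bin/sh
# evaluate VALUE WARN CRIT -> prints OK, WARNING or CRITICAL.
# A value equal to a threshold already triggers that state.
evaluate() {
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL"
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

evaluate 319 200 300   # prints: CRITICAL
evaluate 250 200 300   # prints: WARNING
```

The critical threshold is tested first so that a value above both thresholds is reported with the more severe state.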

    Example 6: TPS (thread pool statistics). The plugin goes through all detected thread pools of the cluster. Without thresholds, the plugin only outputs the number of detected thread pools and adds performance data. With thresholds (note the special format: active,queued,rejected), the plugin alerts if one of the thread pools is equal to or above a threshold.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t tps -w 10,50,1000 -c 50,200,2000
    ES SYSTEM OK - Found 46 threadpools in cluster|tp_es02-analyze_active=0;10;50;; tp_es02-analyze_queue=0;50;200;; tp_es02-analyze_rejected=0;1000;2000;; tp_es02-ccr_active=0;10;50;; tp_es02-ccr_queue=0;50;200;; tp_es02-ccr_rejected=0;1000;2000;; tp_es02-fetch_shard_started_active=0;10;50;; tp_es02-fetch_shard_started_queue=0;50;200;; tp_es02-fetch_shard_started_rejected=0;1000;2000;; tp_es02-fetch_shard_store_active=0;10;50;; tp_es02-fetch_shard_store_queue=0;50;200;; tp_es02-fetch_shard_store_rejected=0;1000;2000;; tp_es02-flush_active=0;10;50;; tp_es02-flush_queue=0;50;200;; tp_es02-flush_rejected=0;1000;2000;; tp_es02-force_merge_active=0;10;50;; tp_es02-force_merge_queue=0;50;200;; tp_es02-force_merge_rejected=0;1000;2000;; tp_es02-generic_active=0;10;50;; tp_es02-generic_queue=0;50;200;; tp_es02-generic_rejected=0;1000;2000;; tp_es02-get_active=0;10;50;; tp_es02-get_queue=0;50;200;; tp_es02-get_rejected=0;1000;2000;; tp_es02-index_active=0;10;50;; tp_es02-index_queue=0;50;200;; tp_es02-index_rejected=0;1000;2000;; tp_es02-listener_active=0;10;50;; tp_es02-listener_queue=0;50;200;; tp_es02-listener_rejected=0;1000;2000;; tp_es02-management_active=1;10;50;; tp_es02-management_queue=0;50;200;; tp_es02-management_rejected=0;1000;2000;; tp_es02-ml_autodetect_active=0;10;50;; tp_es02-ml_autodetect_queue=0;50;200;; tp_es02-ml_autodetect_rejected=0;1000;2000;; tp_es02-ml_datafeed_active=0;10;50;; tp_es02-ml_datafeed_queue=0;50;200;; tp_es02-ml_datafeed_rejected=0;1000;2000;; tp_es02-ml_utility_active=0;10;50;; tp_es02-ml_utility_queue=0;50;200;; tp_es02-ml_utility_rejected=0;1000;2000;; tp_es02-refresh_active=1;10;50;; tp_es02-refresh_queue=0;50;200;; tp_es02-refresh_rejected=0;1000;2000;; tp_es02-rollup_indexing_active=0;10;50;; tp_es02-rollup_indexing_queue=0;50;200;; tp_es02-rollup_indexing_rejected=0;1000;2000;; tp_es02-search_active=0;10;50;; tp_es02-search_queue=0;50;200;; tp_es02-search_rejected=0;1000;2000;; tp_es02-search_throttled_active=0;10;50;; tp_es02-search_throttled_queue=0;50;200;; tp_es02-search_throttled_rejected=0;1000;2000;; tp_es02-security-token-key_active=0;10;50;; tp_es02-security-token-key_queue=0;50;200;; tp_es02-security-token-key_rejected=0;1000;2000;; tp_es02-snapshot_active=0;10;50;; tp_es02-snapshot_queue=0;50;200;; tp_es02-snapshot_rejected=0;1000;2000;; tp_es02-warmer_active=0;10;50;; tp_es02-warmer_queue=0;50;200;; tp_es02-warmer_rejected=0;1000;2000;; tp_es02-watcher_active=0;10;50;; tp_es02-watcher_queue=0;50;200;; tp_es02-watcher_rejected=0;1000;2000;; tp_es02-write_active=8;10;50;; tp_es02-write_queue=10;50;200;; tp_es02-write_rejected=0;1000;2000;; tp_es01-analyze_active=0;10;50;; tp_es01-analyze_queue=0;50;200;; tp_es01-analyze_rejected=0;1000;2000;; tp_es01-ccr_active=0;10;50;; tp_es01-ccr_queue=0;50;200;; tp_es01-ccr_rejected=0;1000;2000;; tp_es01-fetch_shard_started_active=0;10;50;; tp_es01-fetch_shard_started_queue=0;50;200;; tp_es01-fetch_shard_started_rejected=0;1000;2000;; tp_es01-fetch_shard_store_active=0;10;50;; tp_es01-fetch_shard_store_queue=0;50;200;; tp_es01-fetch_shard_store_rejected=0;1000;2000;; tp_es01-flush_active=0;10;50;; tp_es01-flush_queue=0;50;200;; tp_es01-flush_rejected=0;1000;2000;; tp_es01-force_merge_active=0;10;50;; tp_es01-force_merge_queue=0;50;200;; tp_es01-force_merge_rejected=0;1000;2000;; tp_es01-generic_active=0;10;50;; tp_es01-generic_queue=0;50;200;; tp_es01-generic_rejected=0;1000;2000;; tp_es01-get_active=0;10;50;; tp_es01-get_queue=0;50;200;; tp_es01-get_rejected=0;1000;2000;; tp_es01-index_active=0;10;50;; tp_es01-index_queue=0;50;200;; tp_es01-index_rejected=0;1000;2000;; tp_es01-listener_active=0;10;50;; tp_es01-listener_queue=0;50;200;; tp_es01-listener_rejected=0;1000;2000;; tp_es01-management_active=1;10;50;; tp_es01-management_queue=0;50;200;; tp_es01-management_rejected=0;1000;2000;; tp_es01-ml_autodetect_active=0;10;50;; tp_es01-ml_autodetect_queue=0;50;200;; tp_es01-ml_autodetect_rejected=0;1000;2000;; tp_es01-ml_datafeed_active=0;10;50;; tp_es01-ml_datafeed_queue=0;50;200;; tp_es01-ml_datafeed_rejected=0;1000;2000;; tp_es01-ml_utility_active=0;10;50;; tp_es01-ml_utility_queue=0;50;200;; tp_es01-ml_utility_rejected=0;1000;2000;; tp_es01-refresh_active=0;10;50;; tp_es01-refresh_queue=0;50;200;; tp_es01-refresh_rejected=0;1000;2000;; tp_es01-rollup_indexing_active=0;10;50;; tp_es01-rollup_indexing_queue=0;50;200;; tp_es01-rollup_indexing_rejected=0;1000;2000;; tp_es01-search_active=0;10;50;; tp_es01-search_queue=0;50;200;; tp_es01-search_rejected=78;1000;2000;; tp_es01-search_throttled_active=0;10;50;; tp_es01-search_throttled_queue=0;50;200;; tp_es01-search_throttled_rejected=0;1000;2000;; tp_es01-security-token-key_active=0;10;50;; tp_es01-security-token-key_queue=0;50;200;; tp_es01-security-token-key_rejected=0;1000;2000;; tp_es01-snapshot_active=0;10;50;; tp_es01-snapshot_queue=0;50;200;; tp_es01-snapshot_rejected=0;1000;2000;; tp_es01-warmer_active=0;10;50;; tp_es01-warmer_queue=0;50;200;; tp_es01-warmer_rejected=0;1000;2000;; tp_es01-watcher_active=0;10;50;; tp_es01-watcher_queue=0;50;200;; tp_es01-watcher_rejected=0;1000;2000;; tp_es01-write_active=8;10;50;; tp_es01-write_queue=20;50;200;; tp_es01-write_rejected=0;1000;2000;;
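The tps thresholds are a comma-separated triplet (active, queued, rejected). The warning values visible in the performance data above are 10, 50 and 1000. The sketch below shows one way such a triplet could be split and validated in shell; it is not taken from the plugin, and `parse_tps_threshold` is a hypothetical name.

```shell
#!/bin/sh
# parse_tps_threshold "ACTIVE,QUEUED,REJECTED"
# The body runs in a subshell (note the parentheses), so changing IFS
# and disabling globbing does not affect the caller.
parse_tps_threshold() (
    IFS=,
    set -f                      # no filename expansion while splitting
    set -- $1                   # split the argument on commas
    [ "$#" -eq 3 ] || { echo "expected int,int,int" >&2; exit 1; }
    for n in "$@"; do
        case "$n" in
            ''|*[!0-9]*) echo "not an integer: $n" >&2; exit 1 ;;
        esac
    done
    echo "active=$1 queued=$2 rejected=$3"
)

parse_tps_threshold 10,50,1000   # prints: active=10 queued=50 rejected=1000
```

A single integer such as 200 is rejected by this sketch, which mirrors the documented requirement that tps thresholds always carry three values.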

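    Example 7: Master check. The plugin checks which node of the given Elasticsearch cluster is the current master. With the optional parameter -e (expect master), a node name can be specified. If the given node name differs from the cluster's current master node, the plugin alerts with a warning.

    ./check_es_system.sh -H escluster.example.com -P 9243 -S -u user -p password -t master -e node1
    ES SYSTEM WARNING - Master node is node2 but expected node1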

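    Command definition

    Command definition in Nagios, Icinga 1.x, Shinken, Naemon

    The following command definition allows all optional parameters to be defined within ARG3.

    # 'check_es_system' command definition
    define command{
      command_name check_es_system
      command_line $USER1$/check_es_system.sh -H $ARG1$ -t $ARG2$ $ARG3$
    }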

    Command definition in Icinga 2.x

    object CheckCommand "check_es_system" {
      import "plugin-check-command"
      command = [ PluginContribDir + "/check_es_system.sh" ]

      arguments = {
        "-H" = {
          value = "$es_address$"
          description = "Hostname or IP address of the ElasticSearch node"
        }

        "-P" = {
          value = "$es_port$"
          description = "Port number (default: 9200)"
        }

        "-S" = {
          set_if = "$es_ssl$"
          description = "Use https"
        }

        "-u" = {
          value = "$es_user$"
          description = "Username if authentication is required"
        }

        "-p" = {
          value = "$es_password$"
          description = "Password if authentication is required"
        }

        "-d" = {
          value = "$es_available$"
          description = "Defines the available disk or memory size of your ES cluster"
        }

        "-t" = {
          value = "$es_checktype$"
          description = "Defines the type of check (disk|mem|status)"
        }

        "-o" = {
          value = "$es_unit$"
          description = "Choose the size unit (K|M|G) - defaults to G for gigabytes"
        }

        "-i" = {
          value = "$es_index$"
          description = "Space-separated list of indexes to check for read-only (default: '_all')"
        }

        "-w" = {
          value = "$es_warn$"
          description = "Warning threshold"
        }

        "-c" = {
          value = "$es_crit$"
          description = "Critical threshold"
        }

        "-m" = {
          value = "$es_max_time$"
          description = "Maximum time in seconds (timeout) for Elasticsearch to respond (default: 30)"
        }

        "-e" = {
          value = "$es_expect_master$"
          description = "The given node name is expected to be the master node of the Elasticsearch cluster."
        }
      }

      vars.es_address = "$address$"
      vars.es_ssl = false
    }

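    Service definition

    Service definition in Nagios, Icinga 1.x, Shinken, Naemon

    In this example, the disk check runs against myescluster.in.the.cloud and assumes 50GB of available disk space. Access to the cluster statistics requires authentication, so the login uses the user "read" with the password "only".

    # Check ElasticSearch Disk Usage
    define service{
      use generic-service
      host_name myesnode
      service_description ElasticSearch Disk Usage
      check_command check_es_system!myescluster.in.the.cloud!disk!-d 50 -u read -p only
    }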
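In Nagios-style configurations the check_command value is split on "!": the first field is the command name, and the remaining fields become $ARG1$, $ARG2$ and so on. The sketch below only illustrates that splitting for the service above; it is not part of Nagios or the plugin, and `split_check_command` is a hypothetical helper.

```shell
#!/bin/sh
# split_check_command 'name!arg1!arg2!...'
# Prints the command name and the numbered arguments, one per line.
# Runs in a subshell so the IFS change stays local.
split_check_command() (
    IFS='!'
    set -f
    set -- $1                 # split on the "!" separators
    echo "command=$1"
    shift
    i=1
    for arg in "$@"; do
        echo "ARG$i=$arg"
        i=$((i + 1))
    done
)

split_check_command 'check_es_system!myescluster.in.the.cloud!disk!-d 50 -u read -p only'
# prints:
# command=check_es_system
# ARG1=myescluster.in.the.cloud
# ARG2=disk
# ARG3=-d 50 -u read -p only
```

The last field keeps its spaces, which is how several optional plugin parameters fit into a single argument.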

    In the next example, the status of the Elasticsearch cluster on myescluster.in.the.cloud is checked:

    # Check ElasticSearch Status
    define service{
      use generic-service
      host_name myesnode
      service_description ElasticSearch Status
      check_command check_es_system!myescluster.in.the.cloud!status
    }

    Service object definition in Icinga 2.x

    In this example, the disk check runs against myescluster.in.the.cloud and assumes 50GB of available disk space. Access to the cluster statistics requires authentication, so the login uses the user "read" with the password "only".

    # Check Elasticsearch Disk Usage
    object Service "ElasticSearch Disk Usage" {
      import "generic-service"
      host_name = "myesnode"
      check_command = "check_es_system"
      vars.es_address = "myescluster.in.the.cloud"
      vars.es_user = "read"
      vars.es_password = "only"
      vars.es_checktype = "disk"
      vars.es_available = "50"
    }

    In this example, the status of the Elasticsearch cluster running on myescluster.in.the.cloud is checked. Access to the cluster statistics requires authentication, so the login uses the user "read" with the password "only".

    # Check Elasticsearch Status
    object Service "ElasticSearch Status" {
      import "generic-service"
      host_name = "myesnode"
      check_command = "check_es_system"
      vars.es_address = "myescluster.in.the.cloud"
      vars.es_user = "read"
      vars.es_password = "only"
      vars.es_checktype = "status"
    }

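    In this example, the thread pool statistics of the Elasticsearch cluster running on myescluster.in.the.cloud are checked. If any thread pool counter (active, queued, rejected) reaches its threshold, the plugin exits with a warning or critical status.

    # Check Elasticsearch Thread Pool Statistics
    object Service "ElasticSearch Thread Pool Statistics" {
      import "generic-service"
      host_name = "myesnode"
      check_command = "check_es_system"
      vars.es_address = "myescluster.in.the.cloud"
      vars.es_user = "read"
      vars.es_password = "only"
      vars.es_checktype = "tps"
      vars.es_warn = "50,170,1000"
      vars.es_crit = "100,200,5000"
    }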

  • Original article: https://www.cnblogs.com/sanduzxcvbnm/p/13140212.html