已ec2自建gp连接数监控为例:
编写监控脚本monitor.sh配置文件:
1 #!/usr/bin/env bash 2 . /home/gpadmin/.bashrc 3 export PGPASSWORD=xxxxx 4 home=/home/gpadmin/scripts/ 5 host=xxxx 6 ## transactions 7 psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.transactions.sql > /home/gpadmin/scripts/logs/transactions.txt 8 ## diskfree 9 psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.diskfree.sql|sort -r > /home/gpadmin/scripts/logs/diskfree.txt 10 ## connections 11 psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.connections.sum.sql > /home/gpadmin/scripts/logs/connections.txt 12 ## bgwriter 13 #psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.bgwriter.sql > /home/gpadmin/scripts/logs/bgwriter.txt 14 ## tps_qps 15 #psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.tpqs.sql > /home/gpadmin/scripts/logs/tqps.txt 16 transactions_idle=$(cat /$home/logs/transactions.txt|jq .idle) 17 transactions_active=$(cat /$home/logs/transactions.txt|jq .active) 18 transactions_waiting=$(cat /$home/logs/transactions.txt|jq .waiting) 19 transactions_prepared=$(cat /$home/logs/transactions.txt|jq .prepared) 20 active_cnt=$(cat /$home/logs/connections.txt|jq .active) 21 idle_cnt=$(cat /$home/logs/connections.txt|jq .idle) 22 total_cnt=$(cat /$home/logs/connections.txt|jq .total) 27 #tps=$(cat /home/gpadmin/scripts/logs/tqps.txt|jq .segment_disk_free_gb|sed -n 4p) 28 29 #checkpoints_timed=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed) 30 #checkpoints_req=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed) 31 #buffers_checkpoint=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed) 32 #buffers_backend=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed) 33 #buffers_alloc=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed) 34 35 aws cloudwatch put-metric-data --metric-name transactions_idle --dimensions Instance=i-xxxx --namespace "GreenPlum" --value $transactions_idle 36 aws cloudwatch put-metric-data --metric-name transactions_active --dimensions Instance=i-xxxx --namespace "GreenPlum" --value $transactions_active 37 aws cloudwatch put-metric-data --metric-name transactions_waiting --dimensions Instance=i-xxx --namespace "GreenPlum" --value $transactions_waiting 38 aws cloudwatch put-metric-data --metric-name transactions_prepared --dimensions Instance=i-xxx --namespace "GreenPlum" --value $transactions_prepared 39 aws cloudwatch put-metric-data --metric-name active_cnt --dimensions Instance=i-xxx --namespace "GreenPlum" --value $active_cnt 40 aws cloudwatch put-metric-data --metric-name idle_cnt --dimensions Instance=i-xxx --namespace "GreenPlum" --value $idle_cnt 41 aws cloudwatch put-metric-data --metric-name total_cnt --dimensions Instance=i-xxx --namespace "GreenPlum" --value $total_cnt
以上监控了gp会话,可用磁盘空间以及事物情况,可以自己根据实际需求修改。
然后把再服务器上放置好配置文件里面所需.sql查询语句。
1 gpadmin@ip-10-10-34-163:~/scripts$ cat sql/pgsql.connections.sum.sql 2 DO LANGUAGE plpgsql $$ 3 DECLARE 4 ver integer; 5 res text; 6 BEGIN 7 SELECT current_setting('server_version_num') INTO ver; 8 9 IF (ver >= 90600) THEN 10 SELECT row_to_json(T) INTO res from ( 11 SELECT 12 sum(CASE WHEN state = 'active' THEN 1 ELSE 0 END) AS active, 13 sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END) AS idle, 14 sum(CASE WHEN state = 'idle in transaction' THEN 1 ELSE 0 END) AS idle_in_transaction, 15 count(*) AS total, 16 count(*)*100/(SELECT current_setting('max_connections')::int) AS total_pct, 17 sum(CASE WHEN waiting_reason IS NOT NULL THEN 1 ELSE 0 END) AS waiting, 18 (SELECT count(*) FROM pg_prepared_xacts) AS prepared 19 FROM pg_stat_activity WHERE datid is not NULL 20 ) T; 21 22 ELSE 23 SELECT row_to_json(T) INTO res from ( 24 SELECT 25 sum(CASE WHEN state = 'active' THEN 1 ELSE 0 END) AS active, 26 sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END) AS idle, 27 sum(CASE WHEN state = 'idle in transaction' THEN 1 ELSE 0 END) AS idle_in_transaction, 28 count(*) AS total, 29 count(*)*100/(SELECT current_setting('max_connections')::int) AS total_pct, 30 sum(CASE WHEN waiting IS TRUE THEN 1 ELSE 0 END) AS waiting, 31 (SELECT count(*) FROM pg_prepared_xacts) AS prepared 32 FROM pg_stat_activity 33 ) T; 34 END IF; 35 36 perform set_config('conn_json_res', res, false); 37 END $$; 38 39 select current_setting('conn_json_res');
每一个监控指标数据,都可以用sql写出来,写好sql查询指标放到指定目录,然后配置好awsconfig,然后在服务器上就可以取到所需值了。
最后把monitor.sh加到crontab,1min推送一次。
1 gpadmin@ip-10-10-34-163:~/scripts$ crontab -l 2 3 #*/1 * * * * script -c "/home/gpadmin/scripts/monitor.sh >>/tmp/crontab.log" 4 */1 * * * * /home/gpadmin/scripts/monitor.sh >>/tmp/crontab.log 2>&1
最后查看cloudwatch指标,发现之前的监控图所示:
有数据收集上来表示没有问题。