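This post presents a Linux system monitoring script. It tracks CPU usage, memory usage, load and I/O wait, and inserts each host's readings into a database so they can be analyzed historically.
Create the monitoring table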
-- Create a monitoring table to record the metrics
CREATE TABLE tbl_monitor (
    id            serial,
    hostname      character varying(64),
    add_time      timestamp without time zone,
    cpu_useratio  numeric(5,2),
    mem_useratio  numeric(5,2),
    load          numeric(5,2),
    io_wa         smallint,
    CONSTRAINT pk_tbl_monitor PRIMARY KEY (id)
);
Shell script
The script that monitors CPU, memory, load and I/O is as follows:
#!/bin/bash
# Collect CPU, memory, load and I/O-wait metrics and store them in PostgreSQL.

export PGHOME=/opt/pgsql
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$PGSQL_HOME/lib:$PROJ_HOME/lib:$GEOS_HOME/lib:$POSTGIS_HOME/lib
export PATH=$PGHOME/bin:$PATH:.

file_dir="/home/postgres/script/tf/monitor"
cpu_file="${file_dir}/cpu_file.txt"
mem_file="${file_dir}/mem_file.txt"

# Take three one-second vmstat samples; only the last line is used below.
vmstat 1 3 > "${cpu_file}"
free -m > "${mem_file}"

v_hostname="$(hostname)"
v_hostip="xxx.xxx.xxx.xxx"
# The date format contains a space, so it must be quoted.
v_time="$(date '+%F %T')"
# Column 15 of vmstat is "id" (idle) and column 16 is "wa" (I/O wait).
v_cpuidle=$(sed -n '$p' "${cpu_file}" | awk '{print $15}')
v_cpuuse=$(echo "scale=2; 100.00-${v_cpuidle}" | bc)
v_memtotal=$(sed -n '2p' "${mem_file}" | awk '{print $2}')
v_memused=$(sed -n '2p' "${mem_file}" | awk '{print $3}')
v_memratio=$(echo "scale=2; ${v_memused}*100/${v_memtotal}" | bc)
# Take the 1-minute load by label; a fixed field number shifts with the uptime format.
v_load=$(uptime | sed 's/.*load average: //' | cut -d, -f1)
v_io=$(sed -n '$p' "${cpu_file}" | awk '{print $16}')
v_email="francs.tan@sky-mobi.com Francs3@163.com"

# Send an alert when CPU idle drops below 95, i.e. usage is above 5%.
if [ "${v_cpuidle}" -lt 95 ]; then
    # v_email is left unquoted on purpose so that it splits into two recipients.
    echo "$(date '+%F %T') ${v_hostip}: CPU usage alarm, please check!" | mutt -s "CPU usage ${v_cpuuse}%, ${v_hostname}" ${v_email}
fi

psql -h 127.0.0.1 -p 1921 -U skytf -d skytf -c "
insert into skytf.tbl_monitor (hostname, add_time, cpu_useratio, mem_useratio, load, io_wa)
values ('${v_hostname}', '${v_time}', ${v_cpuuse}, ${v_memratio}, ${v_load}, ${v_io});"

rm -f "${cpu_file}"
rm -f "${mem_file}"
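The INSERT at the end of the script pastes shell variables straight into the SQL text, so an empty or malformed value (for example when a command fails) produces a broken statement. One possible safeguard, shown here only as a sketch, is to check that every metric looks numeric before calling psql. The helper names `is_number` and `all_numeric` are hypothetical and not part of the original script. The pattern also accepts values such as `.06`, because bc omits the leading zero for results below 1.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if $1 is a non-negative decimal number.
# Accepts "7", "14.00" and ".06" (bc prints values below 1 without a leading zero).
is_number() {
    local re='^([0-9]+(\.[0-9]+)?|\.[0-9]+)$'
    [[ "$1" =~ $re ]]
}

# Hypothetical helper: succeed only if every argument is numeric.
# The first offending value is reported on stderr.
all_numeric() {
    local v
    for v in "$@"; do
        if ! is_number "$v"; then
            echo "non-numeric metric value: '${v}'" >&2
            return 1
        fi
    done
}

# Possible use in the monitoring script, just before the psql call:
#   all_numeric "${v_cpuuse}" "${v_memratio}" "${v_load}" "${v_io}" || exit 1
```

With such a check, a failed run would leave a message in monitor.log instead of sending an invalid statement to the database.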
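Notes
This script collects the current metrics of the Linux operating system and records them in the database:
1. The CPU metric comes from the id (idle) field of the vmstat output; usage is computed as 100 minus idle;
2. The memory metric comes from the free -m output;
3. The load metric comes from the uptime output;
4. The I/O metric comes from the wa field of the vmstat output.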
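The four sources listed above can also be parsed by name rather than by position, which is less sensitive to differences between vmstat, uptime and free versions. The following is a minimal sketch under that assumption; the function names `vmstat_field`, `load_1min` and `mem_ratio` are hypothetical helpers and do not appear in the original script. Each function reads the command output on stdin, so it can be tried on saved output as well as on a live system.

```shell
#!/bin/bash
# Hypothetical helpers that parse captured command output from stdin.

# Print the value of a named vmstat column (e.g. "id" or "wa") taken from the
# last sample line of `vmstat 1 3` output.
vmstat_field() {
    awk -v name="$1" '
        # The column header row is the one whose first field is "r".
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
        # Data rows start with a number; keep the value from the latest one.
        col && $1 ~ /^[0-9]+$/ { value = $col }
        END { if (value != "") print value }
    '
}

# Print the 1-minute load average from `uptime` output, located by its label.
load_1min() {
    sed 's/.*load average: //' | cut -d, -f1
}

# Print used/total memory as a percentage from the "Mem:" row of `free -m`.
# Note: printf rounds to two decimals, whereas bc with scale=2 truncates.
mem_ratio() {
    awk '$1 == "Mem:" && $2 > 0 { printf "%.2f\n", $3 * 100 / $2 }'
}

# Possible use on a live system (assumes vmstat, uptime and free are installed):
#   v_cpuidle=$(vmstat 1 3 | vmstat_field id)
#   v_io=$(vmstat 1 3 | vmstat_field wa)
#   v_load=$(uptime | load_1min)
#   v_memratio=$(free -m | mem_ratio)
```

Looking columns up by header name means the same code keeps working if a vmstat build adds or drops a trailing CPU column.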
Set up the crontab job
# Monitor cpu, memory, io, load
*/5 * * * * /home/postgres/script/tf/monitor/monitor.sh >> /home/postgres/script/tf/monitor/monitor.log 2>&1
Sample data
skytf=> select * From tbl_monitor limit 5;
 id |   hostname   |      add_time       | cpu_useratio | mem_useratio | load | io_wa
----+--------------+---------------------+--------------+--------------+------+-------
 37 | 172_16_3_216 | 2011-04-13 09:55:03 |        14.00 |        99.42 | 1.47 |     7
 38 | 172_16_3_216 | 2011-04-13 09:57:03 |         0.00 |        83.30 | 0.86 |     0
 39 | 172_16_3_216 | 2011-04-13 09:59:03 |         0.00 |        83.26 | 0.11 |     0
 40 | 172_16_3_216 | 2011-04-13 10:05:03 |        13.00 |        84.89 | 0.97 |     0
 41 | 172_16_3_216 | 2011-04-13 10:10:03 |        13.00 |        85.05 | 1.02 |     0
(5 rows)
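The data has been inserted. Recording OS-level data in the database provides a monitoring record that can be consulted when the system runs into problems.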
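As an illustration of the kind of historical analysis these rows allow, the sketch below prints a query that averages each metric per hour for one host. The function name `hourly_report_sql` is hypothetical, and the query is only one possible report over the tbl_monitor columns defined earlier. It relies on psql's `:'host'` variable syntax, which is expanded when the SQL is read from stdin or a file but not when it is passed with `-c`.

```shell
#!/bin/bash
# Hypothetical helper: print an hourly summary query for tbl_monitor.
# :'host' is a psql variable reference, supplied through psql's -v option.
hourly_report_sql() {
    cat <<'SQL'
SELECT date_trunc('hour', add_time) AS hour,
       round(avg(cpu_useratio), 2)  AS avg_cpu,
       round(avg(mem_useratio), 2)  AS avg_mem,
       round(avg(load), 2)          AS avg_load,
       max(io_wa)                   AS max_io_wa
FROM   tbl_monitor
WHERE  hostname = :'host'
GROUP  BY 1
ORDER  BY 1;
SQL
}

# Possible use, with the connection settings copied from the monitoring script:
#   hourly_report_sql | psql -h 127.0.0.1 -p 1921 -U skytf -d skytf -v host="$(hostname)"
```

Passing the hostname through `-v` lets psql quote it as a literal, instead of splicing it into the SQL text by hand.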
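Summary
This script is still at the testing stage and needs further refinement before it is used in production.
Original article by bd101bd101. If you repost it, please credit the source: https://blog.ytso.com/236386.html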