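Exporter (Linux Host Monitoring)
Because the Linux operating system does not natively support Prometheus, the Prometheus project provides Node exporter, written in Go, to collect monitoring data from Linux hosts. It exposes almost all of the standard system metrics, such as CPU, memory, disk space, disk I/O, system load, and network bandwidth. It also exposes a large number of additional metrics published by the kernel, ranging from load average to motherboard temperature.
For a binary installation, find the latest Node exporter release on the official download page https://github.com/prometheus/node_exporter/releases and download the binary for your platform from that release.
I. Deploying Node exporter
My environment runs entirely on Kubernetes, so binary deployment is not covered here. Below is the YAML manifest that runs Node exporter as a DaemonSet. The Node exporter version is node-exporter:v1.3.1: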
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: node-exporter
  namespace: kubesphere-monitoring-system
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 1.3.1
  annotations:
    deprecated.daemonset.template.generation: '1'
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: node-exporter
      app.kubernetes.io/part-of: kube-prometheus
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: node-exporter
        app.kubernetes.io/part-of: kube-prometheus
        app.kubernetes.io/version: 1.3.1
    spec:
      volumes:
        - name: proc
          hostPath:
            path: /proc
            type: ''
        - name: sys
          hostPath:
            path: /sys
            type: ''
        - name: root
          hostPath:
            path: /
            type: ''
      containers:
        - name: node-exporter
          image: 'registry.cn-beijing.aliyuncs.com/kubesphereio/node-exporter:v1.3.1'
          args:
            - '--web.listen-address=127.0.0.1:9100'
            - '--path.procfs=/host/proc'
            - '--path.sysfs=/host/sys'
            - '--path.rootfs=/host/root'
            - '--no-collector.wifi'
            - '--no-collector.hwmon'
            - >-
              --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
            - >-
              --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
          resources:
            limits:
              cpu: '1'
              memory: 500Mi
            requests:
              cpu: 102m
              memory: 180Mi
          volumeMounts:
            - name: proc
              readOnly: true
              mountPath: /host/proc
            - name: sys
              readOnly: true
              mountPath: /host/sys
            - name: root
              readOnly: true
              mountPath: /host/root
              mountPropagation: HostToContainer
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
        - name: kube-rbac-proxy
          image: >-
            registry.cn-beijing.aliyuncs.com/kubesphereio/kube-rbac-proxy:v0.11.0
          args:
            - '--logtostderr'
            - '--secure-listen-address=[$(IP)]:9100'
            - >-
              --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
            - '--upstream=http://127.0.0.1:9100/'
          ports:
            - name: https
              hostPort: 9100
              containerPort: 9100
              protocol: TCP
          env:
            - name: IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
          resources:
            limits:
              cpu: '1'
              memory: 100Mi
            requests:
              cpu: 10m
              memory: 20Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 65532
            runAsGroup: 65532
            runAsNonRoot: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: node-exporter
      serviceAccount: node-exporter
      hostNetwork: true
      hostPID: true
      securityContext:
        runAsUser: 65534
        runAsNonRoot: true
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/edge
                    operator: DoesNotExist
      schedulerName: default-scheduler
      tolerations:
        - operator: Exists
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
  revisionHistoryLimit: 10
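The two `--collector.filesystem.ignored-*` arguments in the manifest are regular expressions, and it can be hard to see at a glance which mounts they drop. The sketch below evaluates the same two patterns with Python's `re` module as a stand-in for the Go regexp engine that Node exporter uses; for these particular patterns the two engines should behave the same. The helper name `is_ignored` and the sample mount points are hypothetical and only illustrate the matching, not Node exporter's actual implementation.

```python
import re

# Patterns copied verbatim from the DaemonSet args above.
IGNORED_MOUNT_POINTS = re.compile(r"^/(dev|proc|sys|var/lib/docker/.+)($|/)")
IGNORED_FS_TYPES = re.compile(
    r"^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|"
    r"hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|"
    r"sysfs|tracefs)$"
)


def is_ignored(mount_point: str, fs_type: str) -> bool:
    """Return True if either pattern matches (hypothetical helper).

    A filesystem is treated as skipped when its mount point matches the
    mount-point pattern or its type matches the fs-type pattern.
    """
    return bool(
        IGNORED_MOUNT_POINTS.search(mount_point)
        or IGNORED_FS_TYPES.search(fs_type)
    )
```

With these patterns, `/proc`, `/sys/fs/cgroup`, and anything below `/var/lib/docker/` are skipped, while `/`, `/data`, and `/var/lib/docker` itself are kept. Because the fs-type pattern is anchored at both ends, a type such as `cgroup2` is not matched by the `cgroup` alternative.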
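II. Integrating with Prometheus
After Node exporter and Prometheus have started, they are still not connected, because nothing in the configuration file links them. At this point the two programs run as independent applications.
The deployed node_exporter now needs to be added to the Prometheus server. On the Prometheus host, locate the main configuration file and use its static configuration feature, static_configs, to scrape the data that node_exporter exposes.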
~ # cat /etc/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]
Starting from the default configuration, edit /etc/prometheus/prometheus.yml and add a job that points at node_exporter. A reference configuration follows:
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node_exporter"
    static_configs:
      - targets: ["192.168.2.121:9100"]
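Since a DaemonSet places one Node exporter on every node, the targets list usually grows beyond a single address. As a rough sketch, the helper below renders the same `node_exporter` job for a list of hosts. The function name `render_node_job` is hypothetical, and any host other than 192.168.2.121 passed to it would be a made-up example rather than something from this setup.

```python
def render_node_job(hosts, port=9100, job_name="node_exporter"):
    """Render one scrape_configs entry as YAML text (hypothetical helper).

    Each target is written as host:port without a scheme, because the
    scheme is a separate job-level setting that defaults to http.
    """
    targets = ", ".join(f'"{host}:{port}"' for host in hosts)
    return (
        f'  - job_name: "{job_name}"\n'
        f"    static_configs:\n"
        f"      - targets: [{targets}]\n"
    )
```

Calling `render_node_job(["192.168.2.121"])` yields the three `node_exporter` lines from the reference configuration above, indented so they can be appended under `scrape_configs:`.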
After the configuration is complete, restart Prometheus or trigger a hot reload so that the modified configuration file takes effect.
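One way to hot reload is the HTTP lifecycle endpoint `/-/reload`, which Prometheus only serves when it was started with `--web.enable-lifecycle`; sending `SIGHUP` to the Prometheus process is the other common option. The sketch below assumes the lifecycle flag is enabled and that the server is reachable at the base URL you pass in. The function names are hypothetical.

```python
import urllib.request


def reload_request(base_url: str) -> urllib.request.Request:
    """Build the POST request for the /-/reload lifecycle endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    # An empty body is enough; the endpoint only cares about the method.
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url: str, timeout: float = 5.0) -> bool:
    """Ask Prometheus to re-read its configuration file.

    Returns True on HTTP 200. urlopen raises urllib.error.HTTPError for
    error responses, for example when the lifecycle API is not enabled.
    """
    with urllib.request.urlopen(reload_request(base_url), timeout=timeout) as resp:
        return resp.status == 200
```

For a server on the same host, `reload_prometheus("http://localhost:9090")` would trigger the reload. If the new configuration is invalid, Prometheus keeps running with the previous one and reports the error.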
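- 1. First, open "Targets" under "Status" on the Prometheus UI home page, as shown in the figure below:
- 2. On the Targets page, the node_exporter job configured above appears in the list with the state "UP". This means the most recent scrape of Node exporter by Prometheus succeeded and the monitored host is currently working normally, as shown in the figure below:
- 3. You can also look up metrics by typing an expression into the search box on the Graph page of the Prometheus UI. That is not covered here.
- 4. Viewing metrics:
- CPU metrics.
- Memory metrics: the data comes from the /proc/meminfo file.
- Disk metrics: the data comes from the /proc/diskstats file.
- Filesystem metrics.
- Network metrics.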
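The checks in the list above can also be run without the UI, through the Prometheus HTTP API endpoint `/api/v1/query`. The sketch below is only an illustration: the expressions in `EXAMPLE_QUERIES` are my own picks, one per metric category, built from metric names that Node exporter v1.x exposes, and the helper names are hypothetical. It assumes the job is named `node_exporter` as in the configuration above.

```python
import json
import urllib.parse
import urllib.request

# One example PromQL expression per category (illustrative choices,
# not taken from the article).
EXAMPLE_QUERIES = {
    "target_up": 'up{job="node_exporter"}',
    "cpu_busy_percent": (
        "100 - (avg by (instance) "
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
    ),
    "memory_available_ratio": (
        "node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes"
    ),
    "disk_read_bytes_per_second": "rate(node_disk_read_bytes_total[5m])",
    "filesystem_available_bytes": 'node_filesystem_avail_bytes{mountpoint="/"}',
    "network_receive_bytes_per_second": (
        "rate(node_network_receive_bytes_total[5m])"
    ),
}


def query_url(base_url: str, expr: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def extract_samples(payload: dict) -> list:
    """Turn an instant-vector response into (labels, value) pairs.

    Assumes resultType "vector", where each sample carries its value as
    [timestamp, "string value"].
    """
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]


def run_query(base_url: str, expr: str, timeout: float = 5.0) -> list:
    """Run one instant query against a reachable Prometheus server."""
    with urllib.request.urlopen(query_url(base_url, expr), timeout=timeout) as resp:
        return extract_samples(json.load(resp))
```

For example, `run_query("http://localhost:9090", EXAMPLE_QUERIES["target_up"])` returns one pair per scraped target, with the value 1.0 for a target whose last scrape succeeded and 0.0 otherwise. That is the same information the Targets page shows as "UP" or "DOWN".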
Original article by wure. If you repost it, please credit the source: https://blog.ytso.com/274457.html