Note: if you are still within your first 12 months, be aware that Amazon CloudWatch Container Insights is not part of the AWS Free Tier, so using it may incur additional charges.
About | |
---|---|
✅ AWS experience | 200 - Intermediate |
⏱ Time to complete | 30 minutes |
🧩 Prerequisites | - AWS account |
📢 Feedback | Any feedback, issues, or just a 👍 / 👎 ? |
⏰ Last updated | 2023-10-02 |
Variable | Example value |
---|---|
ClusterName | my-cluster |
RegionName | us-east-2 |
FluentBitHttpPort
FluentBitReadFromHead
export LogRegion=us-east-2
export FluentBitHttpPort='2020'
export FluentBitReadFromHead='Off'
FluentBitHttpServer
--namespace amazon-cloudwatch \
--cluster ${ClusterName} --role-name fluent-bit \
--attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
--approve --region ${LogRegion} \
--override-existing-serviceaccounts
- Create a Kubernetes manifest named workload.yaml and paste the following content into it.
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: quickstart
  name: quickstart
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "quickstart-nginx-deployment"
  namespace: "quickstart"
spec:
  selector:
    matchLabels:
      app: "quickstart-nginx"
  replicas: 3
  template:
    metadata:
      labels:
        app: "quickstart-nginx"
        role: "backend"
    spec:
      dnsPolicy: Default
      enableServiceLinks: false
      automountServiceAccountToken: false
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - image: public.ecr.aws/nginx/nginx:latest
          imagePullPolicy: Always
          name: "quickstart-nginx"
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          ports:
            - containerPort: 80
          command: ["/bin/sh"]
          args: ["-c", "echo PodName: $MY_POD_NAME NodeName: $MY_NODE_NAME podIP: $MY_POD_IP > /usr/share/nginx/html/index.html && exec nginx -g 'daemon off;'"]
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          volumeMounts:
            - name: cache
              mountPath: /var/cache/nginx
            - name: usr
              mountPath: /var/run
            - name: tmp
              mountPath: /usr/share/nginx/html
      volumes:
        - name: cache
          emptyDir: {}
        - name: tmp
          emptyDir: {}
        - name: usr
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: quickstart-nginx-service
  namespace: quickstart
spec:
  type: NodePort
  selector:
    app: "quickstart-nginx"
    role: "backend"
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: load
  namespace: quickstart
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
  automountServiceAccountToken: false
  containers:
    - name: load
      image: public.ecr.aws/docker/library/busybox:1.36.1
      imagePullPolicy: Always
      command: ["/bin/sh"]
      args: ["-c", "while sleep 0.5; do wget -q -O- http://quickstart-nginx-service; done"]
      resources:
        requests:
          memory: "64Mi"
          cpu: "250m"
        limits:
          memory: "128Mi"
          cpu: "500m"
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
- Deploy the Kubernetes resources in workload.yaml:

kubectl apply -f workload.yaml

namespace/quickstart created
deployment.apps/quickstart-nginx-deployment created
service/quickstart-nginx-service created
pod/load created
- Check the status of the deployed Nginx containers with the following command and make sure they are running:

kubectl get all -n quickstart

NAME                                               READY   STATUS    RESTARTS   AGE
pod/load                                           1/1     Running   0          15s
pod/quickstart-nginx-deployment-7cd757dc7b-9fss6   1/1     Running   0          16s
pod/quickstart-nginx-deployment-7cd757dc7b-fv592   1/1     Running   0          16s
pod/quickstart-nginx-deployment-7cd757dc7b-wpw4x   1/1     Running   0          16s

NAME                               TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/quickstart-nginx-service   NodePort   10.100.233.21   <none>        80:31243/TCP   16s

NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/quickstart-nginx-deployment   3/3     3            3           17s

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/quickstart-nginx-deployment-7cd757dc7b   3         3         3       17s
- Use the following command to view the live logs of the "load" Pod, which continuously sends requests to the Nginx service. Press Ctrl+C to stop.

kubectl logs -f load -n quickstart

PodName: quickstart-nginx-deployment-7cd757dc7b-fv592 NodeName: ip-192-168-177-109.us-east-2.compute.internal podIP: 192.168.164.31
PodName: quickstart-nginx-deployment-7cd757dc7b-fv592 NodeName: ip-192-168-177-109.us-east-2.compute.internal podIP: 192.168.164.31
PodName: quickstart-nginx-deployment-7cd757dc7b-9fss6 NodeName: ip-192-168-119-7.us-east-2.compute.internal podIP: 192.168.112.25
Step 3: Search and analyze container logs with CloudWatch Logs Insights queries
The log group /aws/containerinsights/Cluster_Name/application contains all log files from /var/log/containers on every worker node in the cluster.
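As a small illustration of the naming pattern above, the following sketch builds the application log group name from a cluster name. The helper name `application_log_group` is hypothetical and `my-cluster` is only a sample value; the commented-out lookup assumes the AWS CLI is installed and configured for the right account and region.

```shell
#!/bin/sh
# Hypothetical helper: print the Container Insights application log group
# name for a given EKS cluster name.
application_log_group() {
  printf '/aws/containerinsights/%s/application\n' "$1"
}

# Sample value only; substitute the real cluster name.
application_log_group "my-cluster"

# With the AWS CLI configured, the group could then be looked up, for example:
# aws logs describe-log-groups \
#   --log-group-name-prefix "$(application_log_group my-cluster)"
```

Keeping the name in one helper avoids retyping the path when the same log group is needed in several commands.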
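- Open the CloudWatch console.
- In the navigation pane, choose Logs, then Log groups.
- Click the log group /aws/containerinsights/CLUSTER_NAME/application, where CLUSTER_NAME is the actual name of your EKS cluster.
- Under the log details (top right), click "View in Logs Insights".
- In the CloudWatch Logs Insights query editor, delete the default query. Then enter the following query and choose "Run query":

fields @timestamp, @message
| filter PodName like 'quickstart-nginx-deployment'
| sort @timestamp desc
| limit 200

- Use the time interval selector to choose the time period you want to query.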
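- Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
- In the left navigation pane, open the Insights drop-down menu, then choose Container Insights.
- Under "Container Insights" (at the top), choose "Performance monitoring" from the drop-down menu.
- In the "EKS clusters" drop-down field, choose the name of your cluster.
- Use the other drop-down menus to filter resources, for example "EKS clusters" and "EKS pods".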
- Create a Kubernetes manifest named geo-api.yaml with the following content to deploy a simple backend application called geo-api:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: geo-api
spec:
  selector:
    matchLabels:
      run: geo-api
  replicas: 1
  template:
    metadata:
      labels:
        run: geo-api
    spec:
      containers:
        - name: geo-api
          image: registry.k8s.io/hpa-example
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 250m
              memory: "12Mi"
            requests:
              cpu: 125m
              memory: "10Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: geo-api
  labels:
    run: geo-api
spec:
  ports:
    - port: 80
  selector:
    run: geo-api
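- Deploy the application with the following command:

kubectl apply -f geo-api.yaml

- Create load on the web server by running a container:

kubectl create deployment geo-api-load \
  --image=busybox \
  --replicas=2 -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://geo-api; done"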
- Verify the Pod status:

kubectl get pods --watch

NAME                           READY   STATUS              RESTARTS         AGE
geo-api-load-c9c7bf98c-4rrn8   0/1     ContainerCreating   0                0s
geo-api-load-c9c7bf98c-kzsjs   0/1     ContainerCreating   0                0s
geo-api-load-c9c7bf98c-4rrn8   1/1     Running             0                1s
geo-api-load-c9c7bf98c-kzsjs   1/1     Running             0                2s
geo-api-76f6dcf999-ptpz5       1/1     Running             20 (5m8s ago)    118m
geo-api-76f6dcf999-ptpz5       0/1     OOMKilled           20 (5m13s ago)   118m
lastState:
  terminated:
    containerID: containerd://4bbdfee06a3d3daca0e74f14f18f8a66ac0a415c79720eae44ea9ad4c46bcb37
    exitCode: 137
    finishedAt: "2023-08-26T12:48:37Z"
    reason: OOMKilled
    startedAt: "2023-08-26T12:47:27Z"
name: geo-api
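The exitCode of 137 in the status above follows the common convention that a process killed by a signal is reported as 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL), which is the signal the kernel's OOM killer sends. A minimal sketch of that arithmetic follows; the helper name `describe_exit_code` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: interpret a container exit code using the
# "128 + signal number" convention for signal-terminated processes.
describe_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "exited with status $code"
  fi
}

# The exit code reported for the geo-api container.
describe_exit_code 137
```

Exit codes at or below 128 are treated here as ordinary application exit statuses, which is a simplification.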
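- Let's look at the Container Insights metrics for this Pod:
- Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
- In the navigation pane, choose Metrics, then All metrics.
- Choose the ContainerInsights metric namespace. Choose "ClusterName, Namespace, PodName", then copy and paste PodName="geo-api" into the search bar.
- Select the following metrics to see the CPU units used by the Pod as a percentage of the Pod limit, and the memory used by the Pod as a percentage of the Pod limit:
  - pod_cpu_utilization_over_pod_limit
  - pod_memory_utilization_over_pod_limit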
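To make the "percentage of the Pod limit" idea concrete, the sketch below assumes the metric is simply usage divided by limit, times 100, and uses integer arithmetic. The helper name `percent_of_limit` and the 11 MiB usage figure are made up for illustration; only the 12 MiB limit comes from geo-api.yaml.

```shell
#!/bin/sh
# Hypothetical helper: usage as an integer percentage of a limit.
# Assumes both values are in the same unit (bytes here).
percent_of_limit() {
  usage=$1
  limit=$2
  echo $(( usage * 100 / limit ))
}

limit_bytes=$((12 * 1024 * 1024))   # memory limit from geo-api.yaml (12Mi)
usage_bytes=$((11 * 1024 * 1024))   # illustrative usage value, not measured

percent_of_limit "$usage_bytes" "$limit_bytes"
```

A value that stays close to 100 for the memory metric suggests the container is running near its limit, which is consistent with the OOMKilled restarts seen earlier.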
kubectl delete -f workload.yaml -n quickstart
kubectl delete -f geo-api.yaml
# Delete the CloudWatch agent and Fluent Bit for Container Insights
kubectl delete -f cwagent-fluent-bit-quickstart.yaml
Original article by 奋斗. If you repost it, please credit the source: https://blog.ytso.com/tech/cloud/312921.html