[Recommended] Monitoring EC2 memory and disk metrics with the CloudWatch agent

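The previous post showed how to download scripts onto an EC2 instance and run two Perl files to monitor that instance's memory usage. Today let's look at the officially recommended approach: installing the CloudWatch agent (CW for short) to monitor (okay, snoop on) an EC2 instance's memory and disk usage.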

-- D.C

As usual, start by creating an IAM role for the EC2 instance

[Image: iam_role]

[Image: iam_role_ec2]

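Fix:
Option 1. Attach one more policy to the role CloudWatchAgentServerRole, such as AmazonS3ReadOnlyAccess or AmazonS3FullAccess.
Option 2. Run aws configure on the EC2 instance and fill in every field, including the access key and secret key (AK/SK). Not recommended.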

[Image: iam_role_adds3]
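Option 1 can also be done from the command line instead of the console. The following is a minimal sketch, assuming the AWS CLI is installed and the caller has permission to modify IAM roles; the role and policy names come from the text above, while the `APPLY` switch is a hypothetical safety guard added for this example.

```shell
# Names taken from the text above; use AmazonS3FullAccess only if read-only is not enough.
# Note: in the AWS China regions the ARN partition is "aws-cn" rather than "aws".
ROLE_NAME="CloudWatchAgentServerRole"
POLICY_ARN="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"

# APPLY is a hypothetical switch for this sketch: nothing is changed unless APPLY=1.
if [ "${APPLY:-0}" = "1" ] && command -v aws >/dev/null 2>&1; then
  # Attach the managed policy, then list what is attached to confirm.
  aws iam attach-role-policy --role-name "$ROLE_NAME" --policy-arn "$POLICY_ARN"
  aws iam list-attached-role-policies --role-name "$ROLE_NAME"
else
  echo "dry run: aws iam attach-role-policy --role-name $ROLE_NAME --policy-arn $POLICY_ARN"
fi
```

Run it as `APPLY=1 sh attach-policy.sh` (the file name is only an example) once the printed command looks right.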

Log in to the EC2 instance and install the CW agent

Amazon Linux:

wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm

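If the download is too slow, you can wget this link instead:
https://publicuse.s3.cn-northwest-1.amazonaws.com.cn/amazon-cloudwatch-agent.rpm
Alternatively, download the package from Baidu Cloud Drive and upload it to the EC2 instance.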

Agent download links for other operating systems:

CentOS:
https://s3.amazonaws.com/amazoncloudwatch-agent/centos/amd64/latest/amazon-cloudwatch-agent.rpm

Ubuntu:
https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb

$ sudo rpm -U ./amazon-cloudwatch-agent.rpm
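Choosing the right package by hand is easy to get wrong, so here is a small sketch that picks the download URL from the `ID` field of `/etc/os-release`. The URLs are the three amd64 links listed above; the helper name `agent_package_url` is made up for this example, and the mapping covers only those three systems.

```shell
# Hypothetical helper: map an os-release ID to one of the amd64 package URLs listed above.
agent_package_url() {
  base="https://s3.amazonaws.com/amazoncloudwatch-agent"
  case "$1" in
    amzn)   echo "$base/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm" ;;
    centos) echo "$base/centos/amd64/latest/amazon-cloudwatch-agent.rpm" ;;
    ubuntu) echo "$base/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb" ;;
    *)      echo "unsupported OS id: $1" >&2; return 1 ;;
  esac
}

# Read the distribution ID in a subshell so the current shell's variables stay untouched.
os_id="unknown"
if [ -r /etc/os-release ]; then
  os_id=$(. /etc/os-release && echo "${ID:-unknown}")
fi

# Only print the download command; running it is left to the reader.
if url=$(agent_package_url "$os_id"); then
  echo "wget $url"
fi
```

On Ubuntu the downloaded .deb is installed with `sudo dpkg -i -E ./amazon-cloudwatch-agent.deb` rather than with rpm.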

$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s

/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source default --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/default.tmp
Start configuration validation...
/opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
2020/02/06 08:16:03 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/default.tmp ...
Valid Json input schema.
I! Detecting runasuser...
No csm configuration found.
No log configuration found.
Configuration validation first phase succeeded
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
Created symlink from /etc/systemd/system/multi-user.target.wants/amazon-cloudwatch-agent.service to /etc/systemd/system/amazon-cloudwatch-agent.service.
Redirecting to /bin/systemctl restart amazon-cloudwatch-agent.service

Usage notes for the amazon-cloudwatch-agent-ctl script:

usage: amazon-cloudwatch-agent-ctl -a stop|start|status|fetch-config|append-config|remove-config [-m ec2|onPremise|auto] [-c default|ssm:<parameter-store-name>|file:<file-path>] [-s]

e.g.
1. apply a SSM parameter store config on EC2 instance and restart the agent afterwards:
    amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:AmazonCloudWatch-Config.json -s
2. append a local json config file on onPremise host and restart the agent afterwards:
    amazon-cloudwatch-agent-ctl -a append-config -m onPremise -c file:/tmp/config.json -s
3. query agent status:
    amazon-cloudwatch-agent-ctl -a status

-a: action
    stop:                                   stop the agent process.
    start:                                  start the agent process.
    status:                                 get the status of the agent process.
    fetch-config:                           use this json config as the agent's only configuration.
    append-config:                          append json config with the existing json configs if any.
    remove-config:                          remove json config based on the location (ssm parameter store name, file name)

-m: mode
    ec2:                                    indicate this is on ec2 host.
    onPremise:                              indicate this is on onPremise host.
    auto:                                   use ec2 metadata to determine the environment, may not be accurate if ec2 metadata is not available for some reason on EC2.

-c: configuration
    default:                                default configuration for quick trial.
    ssm:<parameter-store-name>:             ssm parameter store name
    file:<file-path>:                       file path on the host

-s: optionally restart after configuring the agent configuration
    this parameter is used for 'fetch-config', 'append-config', 'remove-config' action only.
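The `-c file:<file-path>` form described above makes it possible to replace the default configuration with one that collects exactly the memory and disk metrics this post is about. The sketch below writes such a file and prints the command that would apply it. The path `/tmp/cwagent-mem-disk.json` and the choice of monitoring only the root filesystem are assumptions made for this example, not settings from the original walkthrough.

```shell
# Hypothetical location for the custom agent config.
CONFIG_FILE="${CONFIG_FILE:-/tmp/cwagent-mem-disk.json}"

# Collect the percentage of memory in use, and the percentage of disk used on "/".
# The quoted 'EOF' keeps the shell from expanding anything inside the JSON.
cat > "$CONFIG_FILE" <<'EOF'
{
  "metrics": {
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"]
      },
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["/"]
      }
    }
  }
}
EOF

# Print the apply command instead of running it, so the sketch has no side effects.
CTL=/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl
echo "apply with: sudo $CTL -a fetch-config -m ec2 -c file:$CONFIG_FILE -s"
```

Because `fetch-config` makes this file the agent's only configuration, use `append-config` instead if the existing settings should be kept.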

A small script for testing memory (saved as testmem.sh):

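#!/bin/bash
# Fill a 1024 MB tmpfs (RAM-backed) mount so memory usage visibly rises. Run as root.
mkdir -p /tmp/memory
mount -t tmpfs -o size=1024M tmpfs /tmp/memory
# dd writes zeros until the tmpfs is full.
dd if=/dev/zero of=/tmp/memory/block
# Hold the memory for an hour so the change shows up in CloudWatch.
sleep 3600
# Clean up: free the memory and remove the mount point.
rm /tmp/memory/block
umount /tmp/memory
rmdir /tmp/memory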

Open CloudWatch to view the metrics

[Image: cw_metrics]

$ sudo bash testmem.sh &
[1] 3337
2097153+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 2.78481 s, 386 MB/s
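The dd numbers above can be checked with a little arithmetic. This sketch assumes dd's default block size of 512 bytes, which applies because the script passes no `bs=` option; the variable names are just for illustration.

```shell
# Values copied from the dd output above.
records_out=2097152
block_size=512   # dd's default block size in bytes when bs= is not given

# 2097152 blocks of 512 bytes each.
bytes=$((records_out * block_size))
mib=$((bytes / 1024 / 1024))

echo "$bytes bytes = $mib MiB"
```

The result matches the `size=1024M` given to the tmpfs mount. The one extra record on the "records in" line is presumably the last block dd read but could not write once the mount was full.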

[Image: cw_metrics_memchange]
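To cross-check the CloudWatch graph from the instance itself, a rough local estimate can be read from `/proc/meminfo`. This is only a sketch: it treats "used" as total minus available, which is an assumption and may not be the exact formula the agent uses, so expect the number to move in the same direction rather than match exactly. The helper name `mem_used_percent` is made up for this example.

```shell
# Hypothetical helper: percent of memory in use, given total and available (both in kB).
mem_used_percent() {
  awk -v total="$1" -v avail="$2" 'BEGIN { printf "%.1f\n", (total - avail) * 100 / total }'
}

# Read the two fields from /proc/meminfo when it is present (Linux only).
if [ -r /proc/meminfo ]; then
  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  if [ -n "$total_kb" ] && [ -n "$avail_kb" ]; then
    echo "local estimate: $(mem_used_percent "$total_kb" "$avail_kb")%"
  fi
fi
```

Running it once before starting testmem.sh and once while the script is sleeping should show the jump that the tmpfs fill causes.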

Persistence matters. Unless you truly have no other choice, don't give up!