Create an HPC cluster in 10 minutes with one command: pcluster (AWS ParallelCluster)

A brief introduction to pcluster on AWS

-- D.C

As the saying goes: no picture, no story.

[Figure: pcluster]

A few things up front

For example:

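pcluster version   OS       Ningxia region AMI       Beijing region AMI
2.5.1              ubuntu   ami-0202652c7cb199eb6    ami-00881ffc995032786
2.5.1              linux    ami-05038e0c41061b799    ami-0bbadb6ff64415ab3
2.5.1              ubuntu   ami-0b0ebbfcd0c50f225    ami-0d3a6e7dd85085042

Name               Description
Template machine   Launched from the official pcluster image; after you install your own software, it is packaged into the analysis AMI
Control machine    Does not need to be launched from the pcluster image; hosts the pcluster command used to control the cluster
Master node        The node that schedules cluster jobs
Compute node       The node that runs the analysis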

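[Must read] Creating the analysis image

Official pcluster image (version 2.7, ami-0c7a09bc17088086c) --> launch the template machine --> install the analysis pipeline --> create an image (snapshot) --> obtain custom_ami

Control machine (Amazon Linux 2 AMI by default) --> install pcluster 2.7 --> edit the config --> run the command to launch the cluster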
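The "create an image" step in the first flow can also be done from the command line instead of the console. The following is a minimal sketch, assuming the AWS CLI is configured; `run`, `create_custom_ami`, the instance id and the AMI name are illustrative placeholders, not values from this guide. It only prints the command unless the DRY_RUN setting is removed.

```shell
#!/usr/bin/env bash
# Sketch: turn a prepared template machine into a custom AMI.
# The instance id and AMI name below are placeholders.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_custom_ami INSTANCE_ID AMI_NAME -> prints the new image id
create_custom_ami() {
    run aws ec2 create-image \
        --instance-id "$1" \
        --name "$2" \
        --query ImageId --output text
}

# Dry run only; drop DRY_RUN=1 to actually call AWS.
DRY_RUN=1 create_custom_ami i-0123456789abcdef0 my-pcluster-2.7-analysis
```

The image id printed by a real run is what goes into custom_ami in the cluster config. Note that create-image reboots the instance by default.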

Launch a virtual machine as the pcluster control server (v2.7.0)

Install the pcluster control software

# Install the python3 build dependencies
sudo yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
sudo yum install gcc -y

# Install python3 and pip3
sudo yum -y install python3
python3 -m pip install --upgrade pip --user

# Use the Tsinghua PyPI mirror
pip3 install pip -U
pip3 config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

# Install parallelcluster
pip3 install aws-parallelcluster --upgrade --user  # installs the latest pcluster release by default
pcluster version
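Because the install command picks up whatever release is newest, while the AMIs and config in this guide target 2.7.0, a small check can catch a mismatch early. This is only a sketch: `check_version` is a hypothetical helper, and it assumes `pcluster version` prints a bare version string, as the 2.x CLI does.

```shell
# check_version WANTED ACTUAL -> succeeds only when the two strings match
check_version() {
    if [ "$1" = "$2" ]; then
        echo "pcluster $2 matches the version used in this guide"
    else
        echo "pcluster $2 found, expected $1" >&2
        return 1
    fi
}

# Real usage on the control machine (needs pcluster installed):
#   check_version 2.7.0 "$(pcluster version)"
check_version 2.7.0 2.7.0
```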

TIPS:

If you installed the wrong pcluster version:

pip3 uninstall aws-parallelcluster # remove the old version
pip3 install aws-parallelcluster==2.7.0 # install a specific version
$ aws configure
AWS Access Key ID [None]: AKXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: wJalrXUtnFEMI/KXXXXXXXXXXXXXXXXX
Default region name [us-east-1]: cn-northwest-1
Default output format [None]: json

# Confirm the AK/SK credentials work (data on S3 is reachable)
$ aws s3 ls
2020-01-29 11:10:27 lovevideo

Generate and edit the cluster config file, then launch pcluster with one command

[config template]

[aws]
aws_region_name = cn-northwest-1 # change if you want
[global]
cluster_template = myname # change if you want, MUST remember this name!
update_check = true
sanity_check = true
[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}
[cluster myname] # change if you changed the name of cluster_template in [global] settings above
key_name = mytest # change to your keypair name
master_instance_type = c5.2xlarge  # master node type
compute_instance_type = c5.large  # compute node type
pre_install =  https://awshcls.s3.cn-northwest-1.amazonaws.com.cn/efsfix.sh # add this line when using EFS in the China regions
pre_install_args = NONE  # add this line when using EFS in the China regions
initial_queue_size = 1  # number of compute nodes when the cluster is created
max_queue_size = 30  # maximum number of compute nodes
maintain_initial_size = true  # keep 1 compute node even when no job is submitted
master_root_volume_size = 25  # master's root disk volume, 17G by default
compute_root_volume_size = 25  # compute's root disk volume, 17G by default
cluster_type = spot  # ondemand/spot
spot_price = 0.4   # change if you use spot instances as compute nodes; check the latest price of the instance type in your EC2 console
base_os = alinux2  # change if you did not choose amazon linux as your template ami OS
scheduler = sge # scheduler: sge, torque, slurm, or awsbatch
custom_ami = ami-xxxxxxxxxxxxxx # ami-0c7a09bc17088086c 2.7.0; ami-0c081e1551e30ee5a 2.6.1 ; change to your customized AMI based on pcluster ami
s3_read_resource = NONE
s3_read_write_resource = NONE
placement = compute
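vpc_settings = default
scaling_settings = custom # apply the [scaling custom] section defined at the end of this file
#ebs_settings = custom1, custom2 # use EBS to be shared as NFS
efs_settings = custom1 # use EFS as shared storage [recommended]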
[vpc default]
vpc_id = vpc-6b111111  # change to your vpc id
master_subnet_id = subnet-a23v24c # change to your subnet id
#compute_subnet_id = subnet-a23v24c # if you want to put compute in private subnet for security
#[ebs custom1]  # change or add more if you want
#shared_dir = data1  # the dir will show in your master or compute nodes
#volume_type = gp2
#volume_size = 80    # GB
#[ebs custom2]  # change or add more if you want
#shared_dir = data2
#volume_type = gp2
#volume_size = 200 # GB
[efs custom1]
shared_dir = myefs # change if you want; mounted as /myefs on the master and compute nodes
encrypted = false
performance_mode = generalPurpose
[scaling custom]
scaledown_idletime = 10  # idle minutes after which a compute node is terminated, for cost control
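A config like the template above links sections by name: cluster_template points at a [cluster NAME] section, and each *_settings key points at a section of the matching type. A typo in any of those names is easy to miss, so a quick cross-reference check before launching can help. The following is a rough sketch, not part of pcluster; `check_config` is a hypothetical helper and it only understands the simple "key = value" layout shown above.

```shell
# check_config FILE -> report *_settings / cluster_template names without a section
check_config() {
    awk '
        { sub(/[ \t]*#.*/, ""); sub(/[ \t]+$/, "") }   # strip comments and trailing blanks
        /^\[.*\]$/ {                                    # section header, e.g. [efs custom1]
            hdr = substr($0, 2, length($0) - 2)
            gsub(/[ \t]+/, " ", hdr)
            seen[hdr] = 1
            next
        }
        /=/ {
            key = $0; sub(/[ \t]*=.*/, "", key); sub(/^[ \t]+/, "", key)
            val = $0; sub(/^[^=]*=[ \t]*/, "", val)
            if (key == "cluster_template") {
                need["cluster " val] = 1
            } else if (key ~ /_settings$/) {
                type = key; sub(/_settings$/, "", type)
                n = split(val, names, /[ \t]*,[ \t]*/)
                for (i = 1; i <= n; i++) need[type " " names[i]] = 1
            }
        }
        END {
            bad = 0
            for (s in need) if (!(s in seen)) { print "missing section: [" s "]"; bad = 1 }
            for (s in seen) if (s ~ / / && !(s in need)) print "not referenced: [" s "]"
            exit bad
        }
    ' "$1"
}

# Small demo file: efs_settings names a section that does not exist.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[global]
cluster_template = myname
[cluster myname]
vpc_settings = default
efs_settings = custom1
[vpc default]
vpc_id = vpc-6b111111
EOF
check_config "$cfg" || echo "config needs fixing"
rm -f "$cfg"
```

Sections reported as "not referenced" are only a hint: they are valid, but nothing in the file selects them, so their settings will not take effect.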

pcluster configure

$ pcluster create -c config myname
Beginning cluster creation for cluster: myname
Creating stack named: parallelcluster-myname
Status: parallelcluster-myname - CREATE_COMPLETE
MasterPublicIP: 161.111.111.111
ClusterUser: ec2-user
MasterPrivateIP: 172.11.11.11
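The output above carries the address needed to reach the master node. As a small sketch, the field can be pulled out with awk; `master_ip` is a hypothetical helper and the sample text simply mirrors the output shown above.

```shell
# master_ip: read pcluster output on stdin and print the MasterPublicIP value
master_ip() {
    awk -F': *' '$1 == "MasterPublicIP" { print $2 }'
}

# Demo with the same lines that `pcluster create` printed above.
printf '%s\n' \
    'Status: parallelcluster-myname - CREATE_COMPLETE' \
    'MasterPublicIP: 161.111.111.111' \
    'ClusterUser: ec2-user' | master_ip
```

On the control machine the same function could be fed from `pcluster status myname`, or the lookup can be skipped entirely with `pcluster ssh myname -i <your key file>`, which uses the ssh alias from the config.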

Log in to the master node and verify that the cluster works

$ cat > test.slurm <<'EOF'
#!/bin/bash
#SBATCH -J array
#SBATCH -p compute
#SBATCH -N 1
#SBATCH --cpus-per-task=1
#SBATCH -t 5:00
#SBATCH -a 0-2

input=(foo bar baz)
echo "This is job #${SLURM_ARRAY_JOB_ID}, with parameter ${input[$SLURM_ARRAY_TASK_ID]}"
echo "There are ${SLURM_ARRAY_TASK_COUNT} task(s) in the array."
echo "  Max index is ${SLURM_ARRAY_TASK_MAX}"
echo "  Min index is ${SLURM_ARRAY_TASK_MIN}"
sleep 5
EOF

Submit the job: sbatch test.slurm
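The array job relies on each task reading its own element of `input` through SLURM_ARRAY_TASK_ID. That mapping can be tried locally without a scheduler; the loop below is only a simulation (Slurm itself sets the variable for each task), and `param_for_task` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Simulate the index-to-parameter mapping of the array job (-a 0-2).
input=(foo bar baz)

# param_for_task INDEX -> the parameter that array task INDEX would use
param_for_task() {
    echo "${input[$1]}"
}

for SLURM_ARRAY_TASK_ID in 0 1 2; do
    echo "task ${SLURM_ARRAY_TASK_ID} -> parameter $(param_for_task "$SLURM_ARRAY_TASK_ID")"
done
```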

$ cat > test.pbs <<EOF
#!/bin/bash
#PBS -l nodes=1:ppn=2

sleep 600
EOF

qsub test.pbs

qstat

$ cat > test.sh <<EOF
#!/bin/bash
sleep 600
EOF

qsub -cwd -pe smp 2 -l vf=2G test.sh

qhost to check your queue

df -h to check your volumes

qsub test.sh to check your cluster function

qstat -f to see job status

Submit your jobs with a command like qsub -cwd -S /bin/bash -V -l vf=2G -pe smp 4 -o output -e output -q all.q yourscript.sh

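About shared storage

We can usually build the NFS share from gp2 EBS volumes, which integrates seamlessly with an on-premises environment. For high-IOPS workloads, however, this may become a bottleneck. Fortunately, pcluster offers other shared storage options:

Taking EFS as an example, replace the EBS settings with the following: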

efs_settings = customfs

[efs customfs]
shared_dir = efs
encrypted = false
performance_mode = generalPurpose

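A cluster built this way gets dependable shared-storage performance from EFS with elastic capacity, billed by the space actually used. If you make good use of EFS lifecycle management (configured on the EFS console page), costs can also be kept well under control.

Note: you can also reuse an existing EFS file system (with the settings shown below), but you must first delete every mount target on it. I was bitten by this once: when I created an EFS file system manually in the EFS console, it automatically got mount targets in every Availability Zone of the region. That is convenient for later use, but pcluster cannot use such a file system. My guess is that cluster creation tries to create a mount target in the cluster's Availability Zone and hangs if one already exists.

The docs put it this way: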

Specifying this option voids all other Amazon EFS options except for shared_dir. If you set this option to config_sanity, it only supports file systems:

That don't have a mount target in the stack's Availability Zone

OR

That do have an existing mount target in the stack's Availability Zone, with inbound and outbound NFS traffic allowed from 0.0.0.0/0.

How to delete the mount targets: EFS console - click the efs id - Network (bottom right) - Manage - remove the mount target of every Availability Zone - save
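The same cleanup can be scripted with the AWS CLI rather than clicked through in the console. The following is a sketch under assumptions: `run` and `delete_mount_targets` are hypothetical helpers, the fsmt- ids are made up, and the example only prints the commands until DRY_RUN is removed.

```shell
#!/usr/bin/env bash
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# delete_mount_targets ID... -> delete each given EFS mount target
delete_mount_targets() {
    for id in "$@"; do
        run aws efs delete-mount-target --mount-target-id "$id"
    done
}

# List the ids first (real AWS call, shown here as a comment):
#   aws efs describe-mount-targets --file-system-id <your fs id> \
#       --query 'MountTargets[].MountTargetId' --output text
DRY_RUN=1 delete_mount_targets fsmt-11111111 fsmt-22222222
```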

efs_settings = customfs

[efs customfs]
shared_dir = efs
efs_fs_id = fs-302c28d5

for debug

for autoscaling with mixed instance types [advanced]

wget https://awshcls.s3.cn-northwest-1.amazonaws.com.cn/pcluster/asgmodify.json

# Edit the ASG name, the LaunchTemplateName and the instance types you need
vi asgmodify.json

aws autoscaling update-auto-scaling-group --cli-input-json file://asgmodify.json --profile zhy

pcluster_User_Guide

The road has no limits for those who walk it. Just do it.