Karpenter Setup
In this section we will configure Karpenter to provision Inferentia and Trainium EC2 instances. When Karpenter detects pending Pods that require an inf2 or trn1 instance, it launches a suitable instance so that the Pods can be scheduled.
You can learn more about Karpenter in the Karpenter module that's provided in this workshop.
Karpenter has been installed in our EKS cluster, and runs as a deployment:
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
...
karpenter   2/2     2            2           11m
Karpenter requires a NodePool to provision nodes. This is the Karpenter NodePool, together with its EC2NodeClass, that we will create:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: aiml
spec:
  template:
    metadata:
      labels:
        instanceType: "neuron"
        provisionerType: "karpenter"
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - on-demand
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values:
            - inf2
            - trn1
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: aiml
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: aiml
spec:
  amiFamily: AL2
  amiSelectorTerms:
    - alias: al2@latest
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        volumeSize: 100Gi
        volumeType: gp3
  role: ${KARPENTER_NODE_ROLE}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${EKS_CLUSTER_NAME}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${EKS_CLUSTER_NAME}
  tags:
    app.kubernetes.io/created-by: eks-workshop
The requirements block determines which instances this NodePool is allowed to provision. Here we have configured the NodePool to allow only on-demand inf2 and trn1 instances.
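Apply the NodePool and EC2NodeClass manifest, replacing the ${KARPENTER_NODE_ROLE} and ${EKS_CLUSTER_NAME} placeholders with the values for your cluster.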
The NodePool is now ready to provision nodes for our training and inference Pods.
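To illustrate how a workload could land on this NodePool, here is a minimal sketch of a Pod that Karpenter would likely see as pending and respond to by launching an inf2 or trn1 instance. The Pod name and container image are hypothetical placeholders and are not part of this workshop. The sketch also assumes the AWS Neuron device plugin is installed in the cluster, so that nodes advertise the aws.amazon.com/neuron resource.

```yaml
# Hypothetical example Pod: the name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: neuron-example
spec:
  restartPolicy: Never
  nodeSelector:
    # Matches the label set in the template of the aiml NodePool.
    instanceType: neuron
  containers:
    - name: app
      image: example.com/neuron-app:latest # placeholder image
      resources:
        limits:
          # Assumes the Neuron device plugin exposes this resource on the node.
          aws.amazon.com/neuron: 1
```

The nodeSelector ties the Pod to nodes carrying the instanceType: neuron label from the NodePool template, and the resource limit requests one Neuron device, which only the inf2 and trn1 families allowed by this NodePool can provide.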