Deleting a managed node group from AWS EKS.
The Problem
I’ve got an issue with an EKS node group that I cannot delete. As a consequence, I cannot delete my EKS cluster either, because it still has a node group attached to it.
I provisioned my EKS cluster with Terraform and configured aws-auth as per the official documentation. I had admin access to the cluster, my worker nodes had joined successfully, and a couple of deployments were already running on the cluster.
My ConfigMap can be seen below:
$ kubectl describe configmap aws-auth -n kube-system
Name:         aws-auth
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
mapRoles:
----
- rolearn: arn:aws:iam::000000000000:role/cluster-role
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes
- rolearn: arn:aws:iam::000000000000:role/node-group-role
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes
- rolearn: arn:aws:iam::000000000000:role/aws-reserved/sso.amazonaws.com/eu-west-2/AWSReservedSSO_AdministratorAccess
  username: {{SessionName}}
  groups:
    - system:masters

Events:  <none>
I then wanted to create a node group that I could test the cluster autoscaler with. Unfortunately, when I attempted to create or delete the node group, either via Terraform or using the AWS Console, I received the following error:
“AccessDenied: The aws-auth ConfigMap in your cluster is invalid.”
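For reference, the same failure can also be seen from the AWS CLI; when a node group deletion gets stuck, the reason should show up under the node group’s health issues. The cluster and node group names below are placeholders for my setup:

$ aws eks delete-nodegroup --cluster-name my-cluster --nodegroup-name my-node-group
# The AccessDenied message shows up in the health issues of the node group:
$ aws eks describe-nodegroup --cluster-name my-cluster --nodegroup-name my-node-group \
    --query 'nodegroup.health.issues'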
A Solution
My goal was to delete the node group from the cluster.
Unfortunately, I could not work out what exactly was invalid about my aws-auth ConfigMap.
I suspect that the managed node group failed to get created because of the self-managed node IAM role role/node-group-role that was already present in the aws-auth ConfigMap. I wanted to reuse the same node IAM role, which evidently did not work. Amazon recommends using a role that is not currently in use by any self-managed node group, but as far as I am aware, this is a recommendation and not a requirement.
The documentation states that if you delete a managed node group that uses a node IAM role that isn’t used by any other managed node group in the cluster, the role is removed from the aws-auth ConfigMap. I believe the converse is also true: when you create a managed node group, its node IAM role is added to the aws-auth ConfigMap.
The official documentation says that when you create an EKS cluster, the IAM entity (user or role) that creates the cluster is automatically granted system:masters permissions in the cluster’s RBAC configuration in the control plane. This IAM entity does not appear in the ConfigMap, but it can still be used to access the cluster. Since it was my IAM user that created the cluster, I would not lose access to EKS even if I removed the aws-auth ConfigMap from the cluster.
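As a quick sanity check (the cluster name and region below are placeholders), you can authenticate as the IAM entity that created the cluster and verify that you still hold cluster-admin permissions, even with the aws-auth ConfigMap gone:

$ aws eks update-kubeconfig --name my-cluster --region eu-west-2
# The cluster creator is in system:masters, so this should return "yes":
$ kubectl auth can-i '*' '*' --all-namespaces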
The solution for me was to delete the aws-auth ConfigMap, which in turn allowed the node group to be removed from the EKS cluster successfully. After deleting the node group, I re-created the aws-auth ConfigMap to minimise disruption.
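In terms of commands, the procedure boils down to something like the sketch below. The cluster and node group names are placeholders, and backing up the ConfigMap first is simply my own precaution:

# Back up the ConfigMap so it can be re-created afterwards (strip
# metadata.resourceVersion, uid and creationTimestamp before re-applying).
$ kubectl get configmap aws-auth -n kube-system -o yaml > aws-auth.yaml
$ kubectl delete configmap aws-auth -n kube-system
# Delete the stuck node group and wait for the deletion to complete.
$ aws eks delete-nodegroup --cluster-name my-cluster --nodegroup-name my-node-group
$ aws eks wait nodegroup-deleted --cluster-name my-cluster --nodegroup-name my-node-group
# Re-create the ConfigMap so worker nodes and SSO users keep their access.
$ kubectl apply -f aws-auth.yaml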
References
https://github.com/aws/containers-roadmap/issues/1287
It turns out that the approach proposed by the AWS documentation for integrating SSO users into clusters is not compatible with the latest version of EKS.
The placeholder {{SessionName}} cannot be evaluated.
Hey Lisenet,
We leverage SSO roles in our aws-auth ConfigMaps. I believe the issue you ran into is due to aws-iam-authenticator not supporting a path in the IAM role ARN. The role should be referenced with the path (the aws-reserved/sso.amazonaws.com/eu-west-2/ part of the ARN) stripped out instead.
Here is how we programmatically strip the path out of the role’s ARN in our Terraform:
Hope that helps!
Hi Seb, thanks. The problem in my case was that the placeholder {{SessionName}} could not be evaluated. AWS support advised me to use something along the lines of admin:{{SessionName}}. In reality, all I needed to do was to wrap the placeholder in quotation marks to resolve it: "{{SessionName}}".
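For anyone running into the same problem, the change is a one-liner in the aws-auth ConfigMap; quoting stops the YAML parser from treating the leading { as the start of a flow mapping:

$ kubectl edit configmap aws-auth -n kube-system
# change:  username: {{SessionName}}
# to:      username: "{{SessionName}}"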