Key Concepts
Node
A worker machine (physical or virtual) that runs pods.
Cordon
Mark a node as unschedulable so that no new pods are scheduled on it.
Drain
Safely evict pods from a node before maintenance.
Taint
Prevent pods from scheduling unless they have matching tolerations.
Required Permissions
| Action | Permission |
|---|---|
| View nodes | iam:project:infrastructure:kubernetes:read |
| Add/Remove nodes | iam:project:infrastructure:kubernetes:write |
| Cordon/Drain/Uncordon | iam:project:infrastructure:kubernetes:write |
| Manage taints/labels | iam:project:infrastructure:kubernetes:write |
How to Add Nodes to a Cluster
Nodes must be reachable via SSH and meet cluster requirements before joining.
How to Remove Nodes
How to Cordon a Node
Cordoning prevents new pods from being scheduled on a node. Existing pods continue running.
How to Drain a Node
Draining evicts all pods from a node before maintenance.
Drain behavior:
- Regular pods are evicted and rescheduled
- DaemonSet pods are skipped
- Static pods are skipped
- Pods with local storage may fail to evict without the force flag
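The drain behavior listed above can be sketched as a small Python model. This is an illustration only, not the platform's implementation: the `Pod` fields (`owner_kind`, `is_static`, `uses_local_storage`) and the `force` parameter are hypothetical names chosen for the example, and the sketch simplifies "may fail" to "fails".

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    owner_kind: str = "ReplicaSet"    # hypothetical: kind of the controller that owns the pod
    is_static: bool = False           # hypothetical: True for static pods
    uses_local_storage: bool = False  # hypothetical: True if the pod uses node-local storage


def plan_drain(pods, force=False):
    """Sort pods into evict / skip / fail buckets, following the list above."""
    plan = {"evict": [], "skip": [], "fail": []}
    for pod in pods:
        if pod.owner_kind == "DaemonSet" or pod.is_static:
            plan["skip"].append(pod.name)   # DaemonSet and static pods are skipped
        elif pod.uses_local_storage and not force:
            plan["fail"].append(pod.name)   # simplified: treated as a failure without force
        else:
            plan["evict"].append(pod.name)  # regular pods are evicted and rescheduled
    return plan


# Made-up pods for illustration.
pods = [
    Pod("web-1"),
    Pod("log-agent", owner_kind="DaemonSet"),
    Pod("etcd-node1", is_static=True),
    Pod("cache-1", uses_local_storage=True),
]
print(plan_drain(pods))
print(plan_drain(pods, force=True))
```

With `force=True`, the pod that uses local storage moves from the `fail` bucket to the `evict` bucket. The skipped pods are unaffected.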
How to Uncordon a Node
How to Add a Taint
Taints prevent pods from scheduling unless they have matching tolerations.
Configure Taint
- Key - Taint identifier (e.g., dedicated, gpu)
- Value - Optional value (e.g., gpu-node)
- Effect - Scheduling behavior
| Effect | Description |
|---|---|
| NoSchedule | Pods without toleration won’t be scheduled |
| PreferNoSchedule | System tries to avoid scheduling pods without toleration |
| NoExecute | Existing pods without toleration will be evicted |
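To make the three effects concrete, here is a minimal Python sketch of how a taint is matched against a pod's tolerations. It follows the standard Kubernetes matching rules (key, operator, value, effect) as an assumption; the class and function names are illustrative and are not part of any platform API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str                   # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"       # "Equal" compares values; "Exists" ignores them
    value: str = ""
    effect: Optional[str] = None  # None tolerates any effect for this key


def tolerates(tol, taint):
    """True if this toleration matches the given taint."""
    if tol.key != taint.key:
        return False
    if tol.effect is not None and tol.effect != taint.effect:
        return False
    return tol.operator == "Exists" or tol.value == taint.value


def scheduling_outcome(taints, tolerations):
    """Return 'blocked', 'discouraged', or 'allowed' for a new pod."""
    untolerated = [t for t in taints if not any(tolerates(tol, t) for tol in tolerations)]
    effects = {t.effect for t in untolerated}
    if effects & {"NoSchedule", "NoExecute"}:
        return "blocked"          # hard effects keep the pod off the node
    if "PreferNoSchedule" in effects:
        return "discouraged"      # soft effect: avoided where possible
    return "allowed"


gpu = Taint("dedicated", "NoSchedule", "gpu-node")
print(scheduling_outcome([gpu], []))
print(scheduling_outcome([gpu], [Toleration("dedicated", value="gpu-node")]))
print(scheduling_outcome([Taint("gpu", "PreferNoSchedule")], []))
```

The sketch covers only the scheduling decision for a new pod. Eviction of pods that are already running under `NoExecute` is not modeled.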
How to Remove a Taint
How to Add a Label
Labels help organize nodes and enable workload targeting.
Configure Label
- Key - Label key (e.g., environment, tier)
- Value - Label value (e.g., production, frontend)
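Before submitting a label, it can help to check the key and value format locally. The sketch below assumes the platform applies the standard Kubernetes label syntax: an optional DNS-subdomain prefix of at most 253 characters, then a name of at most 63 characters that starts and ends with an alphanumeric character. The platform may enforce extra rules that this check does not cover, and the function names are illustrative.

```python
import re

# Name segment: 1-63 chars, alphanumeric at both ends; '-', '_', '.' allowed inside.
_NAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9._-]{0,61}[A-Za-z0-9])?")
# Optional prefix: a lowercase DNS subdomain such as "example.com".
_DNS_LABEL = r"[a-z0-9]([a-z0-9-]*[a-z0-9])?"
_PREFIX = re.compile(rf"{_DNS_LABEL}(\.{_DNS_LABEL})*")


def is_valid_label_key(key):
    """Check 'name' or 'prefix/name' against the assumed Kubernetes syntax."""
    prefix, sep, name = key.rpartition("/")
    if sep and (not prefix or len(prefix) > 253 or not _PREFIX.fullmatch(prefix)):
        return False
    return bool(_NAME.fullmatch(name))


def is_valid_label_value(value):
    """Values may be empty; otherwise they follow the same rule as names."""
    return value == "" or bool(_NAME.fullmatch(value))


for key in ("environment", "example.com/tier", "-tier", "/tier"):
    print(key, is_valid_label_key(key))
for value in ("production", "", "front end"):
    print(repr(value), is_valid_label_value(value))
```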
How to Remove a Label
Troubleshooting
Node shows NotReady status
- Check kubelet is running on the node
- Verify network connectivity to control plane
- Check node conditions for memory, disk, or PID pressure
- Review kubelet logs
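The condition check in the list above can be sketched as a small Python helper. It assumes the node conditions are available as a list of dictionaries with `type` and `status` keys, which is the shape used in a Kubernetes node's `status.conditions`. The function name and the sample data are made up for illustration.

```python
PRESSURE_TYPES = ("MemoryPressure", "DiskPressure", "PIDPressure")


def diagnose_conditions(conditions):
    """Return human-readable findings from a list of {'type', 'status'} dicts."""
    by_type = {c["type"]: c["status"] for c in conditions}
    findings = []
    ready = by_type.get("Ready", "Unknown")  # a missing Ready condition counts as Unknown
    if ready != "True":
        findings.append(f"Ready is {ready}: check the kubelet and control plane connectivity")
    for kind in PRESSURE_TYPES:
        if by_type.get(kind) == "True":
            findings.append(f"{kind} is True: the node is under resource pressure")
    return findings


# Made-up sample conditions for illustration.
sample = [
    {"type": "Ready", "status": "False"},
    {"type": "MemoryPressure", "status": "False"},
    {"type": "DiskPressure", "status": "True"},
    {"type": "PIDPressure", "status": "False"},
]
for finding in diagnose_conditions(sample):
    print(finding)
```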
Drain operation fails
- Some pods may have PodDisruptionBudgets preventing eviction
- Pods with local storage may not drain without force
- Check drain summary for specific failures
Pods not scheduling on node
- Check if node is cordoned
- Verify node has sufficient resources
- Check for taints that may prevent scheduling
- Ensure pods have matching tolerations
Cannot add taint or label
- Verify you have write permission
- Check key format is valid
- System labels may be protected
FAQ
What's the difference between cordon and drain?
Cordon only marks the node as unschedulable. Drain cordons the node and also evicts the pods running on it.
What happens to pods when I drain?
Pods are gracefully terminated and rescheduled on other nodes. DaemonSet and static pods are skipped.
How do taints and tolerations work?
Taints on nodes repel pods. Tolerations on pods allow them to schedule on tainted nodes. A pod must tolerate all taints on a node to schedule there.
Can I remove a master node?
Yes, but ensure you have other masters for high availability. Removing the last master makes the cluster unavailable.