Taints and Tolerations

Taints and tolerations work together to keep pods off nodes they should not run on. A taint is set on a node and a toleration is set on a pod; together they control which workloads can run on which nodes.

How Taints and Tolerations Work

Taints mark nodes as unsuitable for certain workloads, while tolerations allow pods to overcome these restrictions. This relationship creates a selective scheduling system where:

  • Nodes with taints repel pods that don’t have matching tolerations
  • Pods with tolerations can be scheduled on nodes with matching taints (but aren’t required to be)
  • How strictly a pod without a matching toleration is kept away depends on the taint’s effect: NoSchedule and NoExecute are hard rules, while PreferNoSchedule is only a preference

Taint Effects

Taints have three possible effects that determine how pods without matching tolerations are treated:

Effect            | Behavior for Pods without Matching Toleration
NoSchedule        | Will not be scheduled on the tainted node
PreferNoSchedule  | System tries to avoid scheduling on the node, but not guaranteed
NoExecute         | Won’t be scheduled AND existing pods will be evicted if they lack the toleration
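
For reference, taints are stored on the Node object under spec.taints, each entry carrying a key, an optional value, and one effect. The fragment below is a minimal sketch of a node carrying one taint per effect, combined here only for illustration. The keys and values (dedicated=batch, maintenance=soon, retiring=true) are hypothetical placeholders, not Kubernetes defaults.

```yaml
# Sketch only: keys and values are illustrative assumptions.
apiVersion: v1
kind: Node
metadata:
  name: node1
spec:
  taints:
  # Hard rule: pods without a matching toleration are not scheduled here.
  - key: "dedicated"
    value: "batch"
    effect: "NoSchedule"
  # Soft rule: the scheduler tries to avoid this node but may still use it.
  - key: "maintenance"
    value: "soon"
    effect: "PreferNoSchedule"
  # Hard rule that also evicts running pods lacking a matching toleration.
  - key: "retiring"
    value: "true"
    effect: "NoExecute"
```

In practice, taints are usually set with kubectl taint rather than by editing the Node object directly.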

Example: Applying Taints and Tolerations

First, taint a node to mark it for specific workloads:

kubectl taint nodes node1 key=value:NoSchedule

Then, create a pod with a matching toleration:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-toleration
spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
  containers:
  - name: nginx
    image: nginx
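
The manifest above matches the taint exactly with operator Equal. As a hedged variation, the fragment below sketches the other operator, Exists, which matches on the key alone and takes no value. Leaving effect out makes the toleration match every effect for that key. The two entries are alternatives shown together for comparison (the second already covers the first), and "key" is the same placeholder used above.

```yaml
# Variation on the pod's tolerations list; "key" is the placeholder taint key.
tolerations:
# Tolerates a NoSchedule taint with this key, whatever its value.
- key: "key"
  operator: "Exists"
  effect: "NoSchedule"
# Tolerates any taint with this key, whatever its value or effect.
- key: "key"
  operator: "Exists"
```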

Common Use Cases for Taints and Tolerations

Taints and tolerations are particularly useful for several common cluster management scenarios:

Use Case | Implementation | Benefit
Dedicated Nodes | Taint nodes with dedicated=purpose:NoSchedule | Reserve nodes for specific workloads
Special Hardware | Taint GPU nodes with hardware=gpu:NoSchedule | Prevent general workloads from consuming specialized resources
Automatic Node Problem Handling | Kubernetes automatically applies taints like node.kubernetes.io/not-ready | Prevent scheduling on problematic nodes
Zone Isolation | Taint nodes by zone for controlled scheduling | Implement advanced availability patterns
Gradual Node Decommissioning | Apply NoExecute taints; pods set tolerationSeconds in their tolerations | Drain nodes gradually for maintenance
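
As an illustration of the Special Hardware case, here is a minimal sketch of a pod that tolerates hardware=gpu:NoSchedule. Because a toleration only permits scheduling onto the tainted nodes and does not steer the pod there, the sketch also adds a nodeSelector. It assumes the GPU nodes carry a hardware=gpu label, which this section does not state. The pod name and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload          # hypothetical name
spec:
  tolerations:
  # Allows the pod onto nodes tainted hardware=gpu:NoSchedule.
  - key: "hardware"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  nodeSelector:
    hardware: gpu             # assumption: GPU nodes are labeled hardware=gpu
  containers:
  - name: app
    image: nginx              # placeholder image
```

Without the nodeSelector, the pod could still land on an untainted general-purpose node.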

Tolerations and Pod Eviction

The NoExecute effect can be combined with tolerationSeconds to control how long a pod can run on a node after a taint is applied:

tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300  # Pod is evicted if the taint is still present after 5 minutes

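This gives the pod a grace period to ride out transient node problems or short maintenance before it is evicted and rescheduled elsewhere. A NoExecute toleration without tolerationSeconds lets the pod stay on the tainted node indefinitely.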
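
In a default cluster configuration, Kubernetes adds tolerations for node.kubernetes.io/not-ready and node.kubernetes.io/unreachable with tolerationSeconds: 300 to pods that do not declare their own, so the 5-minute value above mirrors that default. A pod that should fail over faster can set shorter values explicitly. The 30-second figure in this sketch is an arbitrary illustration, not a recommendation.

```yaml
# Sketch: override the default 300-second grace period for node failures.
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 30   # assumption: illustrative value only
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 30   # assumption: illustrative value only
```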

Further Reading and Resources