This content originally appeared on DEV Community and was authored by Guptaji Teegela
When every Pod screams for CPU and memory, who decides who lives, who waits, and who gets evicted?
Kubernetes isn't just a scheduler — it's a negotiator of fairness and efficiency.
Every second, it balances hundreds of workloads, deciding what runs, what waits, and what gets terminated — while maintaining reliability and cost efficiency.
This article unpacks how Quality of Service (QoS), Priority Classes, Preemption, and Bin-Packing Scoring come together to keep your cluster stable and fair.
⚙️ The Challenge: Competing Workloads in Shared Clusters
When multiple workloads share cluster resources, conflicts are inevitable:
- High-traffic apps starve lower-priority workloads.
- Batch jobs hog memory.
- Pods without limits cause unpredictable evictions.
Kubernetes addresses this by applying a layered decision-making model — QoS, Priority, Preemption, and Scoring.
🧭 QoS (Quality of Service): Who Gets Evicted First
Kubernetes assigns each Pod a QoS class based on the CPU and memory requests and limits of its containers:
| QoS Class | Description | Eviction Priority |
|---|---|---|
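| Guaranteed | Every container sets CPU and memory requests equal to its limits | Evicted last |
| Burstable | At least one container sets a request or limit, but the Pod does not meet the Guaranteed criteria | Evicted after BestEffort |
| BestEffort | No container sets any CPU or memory request or limit | Evicted first |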
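💡 Lesson: Always define requests and limits. Under node pressure the kubelet first evicts Pods whose usage exceeds their requests, so BestEffort Pods typically go first and Guaranteed Pods last.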
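The classification rules can be sketched in a few lines of Python. This is a minimal illustration, not the kubelet's implementation: the `qos_class` function and the dictionary shape for containers are assumptions made for the example, and quantities are compared as plain strings (so `"500m"` and `"0.5"` would be treated as different).

```python
from typing import Dict, List

# Hypothetical container shape: {"requests": {...}, "limits": {...}}
Container = Dict[str, Dict[str, str]]

def qos_class(containers: List[Container]) -> str:
    """Classify a Pod from its containers' CPU/memory requests and limits."""
    any_set = False      # does any container set any request or limit?
    guaranteed = True    # does every container have requests == limits?
    for c in containers:
        limits = c.get("limits", {})
        # When only a limit is set, the request defaults to the limit.
        requests = {**limits, **c.get("requests", {})}
        for resource in ("cpu", "memory"):
            if resource in requests or resource in limits:
                any_set = True
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"

print(qos_class([{"requests": {"cpu": "100m"}}]))  # Burstable
```

Note that a single container without limits is enough to move an otherwise Guaranteed Pod into the Burstable class.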
🧱 Priority Classes: Who Runs First
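QoS influences who stays under pressure, while Priority Classes influence who gets scheduled first.
A PriorityClass maps a name to an integer value. Pods reference it through priorityClassName, and Pods with higher values rank ahead in the scheduling queue.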
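apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-services
# Higher value means higher priority; user-defined classes can go up to 1000000000
value: 100000
# Do not apply this class to Pods that omit priorityClassName
globalDefault: false
description: "Critical platform workloads"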
💡 Lesson: Reserve high priorities for mission-critical services.
Overusing "high" priority leads to chaos — not resilience.
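To make the ranking concrete, here is a small sketch of how pending Pods could be ordered: higher priority first, and among equal priorities, whichever was queued earlier. The `PendingPod` type and its `queued_at` field are hypothetical names for this example; the real scheduler queue has more moving parts (backoff, unschedulable Pods, and so on).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PendingPod:
    name: str
    priority: int     # value resolved from the Pod's PriorityClass
    queued_at: float  # hypothetical enqueue timestamp, in seconds

def scheduling_order(pods: List[PendingPod]) -> List[str]:
    """Return Pod names ordered by priority (descending), then by enqueue time."""
    ranked = sorted(pods, key=lambda p: (-p.priority, p.queued_at))
    return [p.name for p in ranked]

queue = [
    PendingPod("batch-report", priority=10, queued_at=1.0),
    PendingPod("payments-api", priority=100000, queued_at=3.0),
    PendingPod("checkout-api", priority=100000, queued_at=2.0),
]
print(scheduling_order(queue))
# ['checkout-api', 'payments-api', 'batch-report']
```

If every workload is given the same "high" value, the ordering collapses to first come, first served, which is why high priorities are worth rationing.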
⚔️ Preemption: Controlled Sacrifice, Not Chaos
When a high-priority Pod can't be scheduled:
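- The scheduler looks for a node where removing lower-priority Pods would free enough resources.
- It deletes those victim Pods, which still receive their graceful termination period.
- The pending high-priority Pod is scheduled once the resources are released.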
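The scheduler tries to respect PodDisruptionBudgets (PDBs) when choosing victims, but only on a best-effort basis: if no other victims are available, it preempts Pods even when that violates a PDB.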
💡 Lesson: Preemption is controlled resilience — ensuring important workloads run while maintaining order.
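The victim selection on a single node can be sketched roughly as follows. This is a simplified model under stated assumptions: only CPU (in millicores) is considered, PDBs are ignored, and `select_victims` and the tuple layout are hypothetical names for the example. The idea is to assume all lower-priority Pods are removed, then let as many as possible stay, starting with the highest-priority ones.

```python
from typing import List, Optional, Tuple

Pod = Tuple[str, int, int]  # (name, priority, cpu_millicores)

def select_victims(node_capacity: int, running: List[Pod],
                   incoming: Pod) -> Optional[List[str]]:
    """Return names of Pods to preempt, or None if preemption cannot help."""
    _, incoming_priority, needed = incoming
    # Pods with equal or higher priority are never preempted.
    keep = [p for p in running if p[1] >= incoming_priority]
    # Candidates are considered for a reprieve from highest priority down.
    candidates = sorted((p for p in running if p[1] < incoming_priority),
                        key=lambda p: -p[1])
    used = sum(p[2] for p in keep)
    if used + needed > node_capacity:
        return None  # even removing every candidate would not make room
    victims = []
    for name, _, cpu in candidates:
        if used + cpu + needed <= node_capacity:
            used += cpu           # this Pod can stay
        else:
            victims.append(name)  # this Pod must go
    return victims

running = [("db", 100000, 2000), ("web", 1000, 500), ("batch", 10, 1000)]
print(select_victims(4000, running, ("critical", 50000, 1500)))  # ['batch']
```

An empty list means the Pod fits without evicting anything; `None` means this node is not a useful preemption target.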
⚖️ Scoring & Bin-Packing: Finding the Right Home
Once eligible nodes are filtered, Kubernetes enters the scoring phase to find the best fit.
Plugins involved:
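- NodeResourcesFit (LeastRequestedPriority in older releases) → with its default LeastAllocated strategy, favors underutilized nodes; the MostAllocated strategy packs nodes tightly instead.
- NodeResourcesBalancedAllocation (previously BalancedResourceAllocation) → balances CPU & memory use.
- ImageLocality → prefers nodes that already have the Pod's images cached.
- NodeAffinity → honors preferred node affinity rules.
- PodTopologySpread → favors placements that spread Pods across zones or other topology domains.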
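The difference between favoring underutilized nodes and bin-packing comes down to which direction the resource score points. The sketch below shows the two mirror-image formulas, assuming equal weight for each resource and using floating-point arithmetic for readability; the function names are made up for this example.

```python
from typing import Dict

def least_allocated(requested: Dict[str, int], capacity: Dict[str, int]) -> float:
    """Score 0-100: higher when more of the node is still free (spreading)."""
    scores = [
        max(capacity[r] - requested[r], 0) * 100 / capacity[r]
        for r in capacity
    ]
    return sum(scores) / len(scores)

def most_allocated(requested: Dict[str, int], capacity: Dict[str, int]) -> float:
    """Score 0-100: higher when more of the node is already used (bin-packing)."""
    scores = [
        min(requested[r], capacity[r]) * 100 / capacity[r]
        for r in capacity
    ]
    return sum(scores) / len(scores)

# Requests on the node, including the Pod being placed.
capacity = {"cpu": 4000, "memory": 8192}
requested = {"cpu": 1000, "memory": 4096}
print(least_allocated(requested, capacity))  # 62.5
print(most_allocated(requested, capacity))   # 37.5
```

A cluster tuned for cost efficiency would lean on the second formula so that lightly used nodes drain and can be scaled down, at the price of less headroom per node.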
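Each scoring plugin gives every candidate node a normalized score from 0 to 100.
Each plugin's score is multiplied by its configured weight and the results are summed; the node with the highest total wins: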
final_score = (w1*s1) + (w2*s2) + ...
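As a worked example of the weighted sum, the snippet below scores two nodes and picks the winner. The plugin keys, weights, and per-node scores are illustrative assumptions chosen for the arithmetic, not values read from a real cluster.

```python
from typing import Dict

def final_score(plugin_scores: Dict[str, int], weights: Dict[str, int]) -> int:
    """Sum of weight * score over all plugins that scored the node."""
    return sum(weights[name] * score for name, score in plugin_scores.items())

def pick_node(nodes: Dict[str, Dict[str, int]], weights: Dict[str, int]) -> str:
    """Return the node name with the highest combined score."""
    return max(nodes, key=lambda n: final_score(nodes[n], weights))

# Illustrative weights and scores (assumptions for this example).
weights = {"NodeResourcesFit": 1, "ImageLocality": 1, "PodTopologySpread": 2}
nodes = {
    "node-a": {"NodeResourcesFit": 80, "ImageLocality": 0, "PodTopologySpread": 50},
    "node-b": {"NodeResourcesFit": 40, "ImageLocality": 100, "PodTopologySpread": 60},
}
print(final_score(nodes["node-a"], weights))  # 180
print(final_score(nodes["node-b"], weights))  # 260
print(pick_node(nodes, weights))              # node-b
```

Here node-a has more free capacity, yet node-b wins because the cached image and better spread outweigh it once the weights are applied.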
QoS defines survivability.
Priority defines importance.
Scoring defines placement.
Together, they shape a stable and efficient cluster.
🧩 Visual Flow: Kubernetes Scheduling & Bin-Packing
[Figure: Kubernetes scheduling and bin-packing flow]
🧠 Key Lessons for SREs & Platform Teams
✅ Always define CPU/memory requests & limits.
✅ Use PriorityClasses sparingly.
✅ Test evictions under simulated stress.
✅ Combine QoS + PDB + Priority for controlled resilience.
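✅ Observe scheduling metrics regularly (for example kube_pod_status_phase from kube-state-metrics, plus the scheduler's own scheduler_pending_pods and scheduler_preemption_attempts_total).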
🚀 Takeaway
Kubernetes doesn't just schedule Pods — it negotiates priorities.
Reliability doesn't come from overprovisioning, but from predictable, fair, and disciplined scheduling.
Resilience = Consistency in scheduling decisions.
Guptaji Teegela | Sciencx (2025-11-20T00:53:43+00:00) Beyond Scheduling: How Kubernetes Uses QoS, Priority, and Scoring to Keep Your Cluster Balanced. Retrieved from https://www.scien.cx/2025/11/20/beyond-scheduling-how-kubernetes-uses-qos-priority-and-scoring-to-keep-your-cluster-balanced/