Kubernetes Interview Series – Part 1

🎯 Master the Kubernetes Scheduler: 15 Must-Know Questions

From fundamentals to production scenarios – test your scheduler expertise!


📚 Fundamentals (Questions 1-5)

1️⃣ Where does the Kubernetes Scheduler fit into the Kubernetes architecture?

Answer: The Scheduler is a core control plane component that runs on master/control-plane nodes alongside:

  • API Server
  • Controller Manager
  • etcd

🔑 Key Point: It does NOT run on worker nodes. Worker nodes run kubelet, kube-proxy, and container runtime.

Pro Tip: In HA setups, multiple scheduler instances run, but only one is active (leader election).
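A sketch of how leader election is enabled in the scheduler's own config file (field names come from KubeSchedulerConfiguration; the durations shown are illustrative defaults):

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true               # only the current lease holder schedules
  resourceNamespace: kube-system  # where the Lease object lives
  resourceName: kube-scheduler
  leaseDuration: 15s
  renewDeadline: 10s
  retryPeriod: 2s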


2️⃣ What is the main function of the Kubernetes Scheduler?

Answer: Its primary job is Pod-to-Node assignment.

The Process:

  1. 👀 Watches for unscheduled Pods (no nodeName set)
  2. 🔍 Evaluates all nodes based on:
    • Resource availability (CPU, memory)
    • Constraints (affinity, taints, tolerations)
    • Policies and priorities
  3. ✅ Selects the best-fit node
  4. 📝 Updates Pod spec with nodeName

Not the Scheduler’s Job: Actually running containers (that’s kubelet’s responsibility)
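To make step 2 concrete, here is a hypothetical Pod exercising each class of input the Scheduler evaluates (the disktype label and dedicated taint are invented for illustration):

apiVersion: v1
kind: Pod
metadata:
  name: constrained-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:          # input: resource availability (CPU, memory)
        cpu: 500m
        memory: 256Mi
  nodeSelector:          # input: constraint on a hypothetical node label
    disktype: ssd
  tolerations:           # input: tolerate a hypothetical dedicated-node taint
  - key: dedicated
    operator: Equal
    value: web
    effect: NoSchedule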


3️⃣ How does the Scheduler know which Pods need scheduling?

Answer: The Scheduler uses a watch mechanism on the API Server to monitor Pods with:

  • spec.nodeName = empty/unset
  • status.phase = Pending

Example Pod in need of scheduling:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeName: ""  # Empty = needs scheduling
  containers:
  - name: nginx
    image: nginx

After Scheduling:

spec:
  nodeName: "worker-node-1"  # Scheduler assigns this

4️⃣ What happens after the Scheduler selects a node for a Pod?

Answer: The Handoff Process:

  1. Scheduler → Posts a Binding via the API Server, which sets spec: nodeName: "selected-node"
  2. API Server → Persists to etcd
  3. Kubelet (on selected node) → Sees the assignment:
    • Pulls container image
    • Creates containers via container runtime
    • Manages Pod lifecycle
  4. Pod starts running → Status updated to Running

Analogy: Scheduler is like a dispatcher assigning taxis to customers. The driver (kubelet) actually picks them up!
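Under the hood, step 1 is not a plain Pod update: the Scheduler POSTs a Binding object to the Pod's binding subresource, and the API Server translates it into spec.nodeName. Roughly what that object looks like:

apiVersion: v1
kind: Binding
metadata:
  name: my-pod          # the Pod being bound
target:
  apiVersion: v1
  kind: Node
  name: worker-node-1   # the node the Scheduler picked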


5️⃣ What are the two main phases the Scheduler uses to pick a node?

Answer:

Phase 1: Filtering (Predicates) 🔍 Eliminates nodes that can’t run the Pod:

  • ❌ Insufficient CPU/memory
  • ❌ Taints without matching tolerations
  • ❌ Node selector mismatch
  • ❌ Pod affinity/anti-affinity violations
  • ❌ Volume binding conflicts

Phase 2: Scoring (Priorities) 📊 Ranks remaining nodes (0-100 score):

  • ⚖️ Balanced resource allocation
  • 📦 Pod spreading across zones
  • 🏷️ Image locality (already pulled)
  • 🎯 Priority class weights

Example:

  • 100 nodes in cluster
  • After filtering: 20 feasible nodes
  • After scoring: Node with score 95 is selected

🔧 Deep Dive (Questions 6-10)

6️⃣ Does the Scheduler directly run Pods on nodes?

Answer: Absolutely NOT!

What Scheduler Does:

  • ✅ Decides WHERE a Pod should run
  • ✅ Updates Pod spec with nodeName

What Kubelet Does:

  • ✅ Actually creates containers
  • ✅ Pulls images
  • ✅ Manages container lifecycle
  • ✅ Reports back to API Server

Real-World Analogy:

  • Scheduler = Airport control tower (decides which gate)
  • Kubelet = Ground crew (actually parks the plane)

Common Interview Trap: “Does the Scheduler run containers?” → NO!


7️⃣ What are the key sub-components or plugins inside the Scheduler?

Answer:

Scheduling Framework Components:

1. Scheduling Queue 📥

  • Stores Pods waiting to be scheduled
  • Priority queue (higher priority Pods scheduled first)
  • BackoffQ for failed scheduling attempts

2. Filter Plugins 🔍

  • NodeResourcesFit: Checks CPU/memory
  • NodeAffinity: Evaluates node selectors
  • TaintToleration: Checks taints/tolerations
  • PodTopologySpread: Distributes Pods evenly
  • VolumeBinding: Ensures PVC availability

3. Score Plugins 📊

  • NodeResourcesBalancedAllocation: Prefers balanced usage
  • ImageLocality: Prefers nodes with images cached
  • InterPodAffinity: Considers Pod affinity rules
  • NodeResourcesLeastAllocated: Spreads Pods across nodes (in recent releases this lives on as the LeastAllocated scoring strategy of NodeResourcesFit)

4. Bind Plugin 🔗

  • Final step: binds Pod to selected node
  • Updates API Server

Extension Points: PreFilter, Filter, PostFilter, PreScore, Score, Reserve, Permit, PreBind, Bind, PostBind
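These plugins are toggled and weighted through KubeSchedulerConfiguration. A minimal sketch that raises the weight of ImageLocality for the default profile (the weight value is illustrative):

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  plugins:
    score:
      disabled:
      - name: ImageLocality   # drop the default registration...
      enabled:
      - name: ImageLocality   # ...and re-enable it with a higher weight
        weight: 2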


8️⃣ How does the Scheduler interact with other control plane components?

Answer:

Component Interactions:

🔷 With API Server (Primary Interface):

  • Watches for unscheduled Pods
  • Reads Node resource info
  • Updates Pod bindings
  • All communication goes through API Server

🔷 With etcd (Indirect via API Server):

  • All scheduling decisions persisted in etcd
  • Reads cluster state

🔷 With Controller Manager:

  • Controllers create Pods (ReplicaSet, Deployment)
  • Scheduler assigns them to nodes
  • Controllers handle Pod lifecycle

🔷 With Kubelet (Indirect):

  • Scheduler writes nodeName
  • Kubelet reads and acts on assignment

Data Flow Example:

Deployment → Controller Manager → Creates Pods → API Server → 
Scheduler watches → Assigns node → API Server → Kubelet → Runs Pod


9️⃣ Can a cluster have more than one Scheduler?

Answer: YES! Multiple schedulers are supported and commonly used.

Use Cases:

1. Custom Scheduling Logic

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  schedulerName: gpu-scheduler  # Custom scheduler
  containers:
  - name: ml-app
    image: tensorflow/tensorflow:latest-gpu

2. Default Scheduler (if not specified)

spec:
  schedulerName: default-scheduler  # Implicit if omitted

3. Multiple Schedulers Running Simultaneously

  • Default Kubernetes Scheduler
  • Custom GPU scheduler
  • Custom batch job scheduler
  • Third-party schedulers (Volcano, YuniKorn)

How It Works:

  • Each scheduler watches for Pods with matching schedulerName
  • Only one scheduler processes each Pod
  • Leader election for multiple instances of same scheduler

Real-World Example: ML team uses custom scheduler for GPU workloads while web apps use default scheduler.
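Running a second scheduler is mostly ordinary Kubernetes plumbing. A rough sketch of deploying one (the image tag, ServiceAccount, and ConfigMap names are assumptions, and the RBAC mirroring system:kube-scheduler is omitted):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-scheduler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-scheduler
  template:
    metadata:
      labels:
        app: gpu-scheduler
    spec:
      serviceAccountName: gpu-scheduler     # assumed SA with scheduler-equivalent RBAC
      containers:
      - name: kube-scheduler
        image: registry.k8s.io/kube-scheduler:v1.29.0   # match your control plane version
        command:
        - kube-scheduler
        - --config=/etc/kubernetes/gpu-scheduler.yaml   # config sets schedulerName: gpu-scheduler
        volumeMounts:
        - name: config
          mountPath: /etc/kubernetes
      volumes:
      - name: config
        configMap:
          name: gpu-scheduler-config        # assumed ConfigMap containing the file above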


🔟 What would happen if the Scheduler goes down?

Answer:

Impact Analysis:

✅ What STILL Works:

  • Existing Pods keep running (kubelet manages them)
  • Services, Ingress, ConfigMaps continue functioning
  • Controllers keep monitoring existing resources
  • kubectl commands for existing resources work

❌ What BREAKS:

  • New Pods remain in Pending state indefinitely
  • Scaling operations stuck (new replicas won’t schedule)
  • Deployments/StatefulSets can’t create new Pods
  • Pod rescheduling after node failures won’t work

Real-World Scenario:

# Before Scheduler fails
$ kubectl get pods
NAME          STATUS    RESTARTS   AGE
app-1         Running   0          5m

# Scheduler goes down

$ kubectl scale deployment app --replicas=3
$ kubectl get pods
NAME          STATUS    RESTARTS   AGE
app-1         Running   0          5m
app-2         Pending   0          30s  # ⚠️ Stuck!
app-3         Pending   0          30s  # ⚠️ Stuck!

# Check events — nothing is reported, because the component that
# emits FailedScheduling events (the scheduler) is down
$ kubectl describe pod app-2
Events:          <none>

Recovery:

  • In HA setups: Another scheduler instance takes over (leader election)
  • Manual restart: on kubeadm clusters the scheduler is a static Pod, so move /etc/kubernetes/manifests/kube-scheduler.yaml out and back (or restart the kubelet); deleting the mirror Pod via kubectl only recreates the API object
  • Pending Pods automatically scheduled once Scheduler is back

Pro Tip: Always run multiple scheduler replicas in production!


🚀 Advanced Scenarios (Questions 11-15)

1️⃣1️⃣ How does Priority and Preemption work in the Scheduler?

Answer:

Priority Scheduling allows critical Pods to be scheduled before others, and even evict lower-priority Pods if needed.

Setting Up Priority:

1. Create PriorityClass

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000  # Higher = more important
globalDefault: false
description: "Critical production workloads"

2. Assign to Pod

apiVersion: v1
kind: Pod
metadata:
  name: critical-app
spec:
  priorityClassName: high-priority
  containers:
  - name: app
    image: my-app

Preemption Process:

Scenario: Cluster at capacity, high-priority Pod arrives

  1. Scheduler finds no available nodes (all resources used)
  2. Identifies preemption candidates:
    • Lower priority Pods on suitable nodes
    • Calculates minimum Pods to evict
  3. Evicts lower-priority Pods (graceful termination)
  4. Schedules high-priority Pod once resources free up

Example:

# Before: Cluster full with low-priority Pods
$ kubectl get pods
NAME          PRIORITY    STATUS
web-1         100         Running
web-2         100         Running
web-3         100         Running

# High-priority Pod arrives
$ kubectl create -f critical-pod.yaml

# After: Low-priority Pod evicted
$ kubectl get pods
NAME          PRIORITY    STATUS
critical-app  1000000     Running
web-1         100         Running
web-2         100         Terminating  # ⚠️ Preempted
web-3         100         Running

Real-World Use Cases:

  • Database backups (lower priority) vs live queries (higher priority)
  • Batch jobs (can be preempted) vs API services (critical)
  • Development workloads vs production workloads

Best Practice: Always set PodDisruptionBudgets so preemption doesn't evict too many Pods at once (the scheduler honors PDBs during preemption on a best-effort basis); a minimal example is below.
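A minimal PodDisruptionBudget covering the web Pods from the example above (the app: web label is an assumption):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2        # preemption will try to keep at least 2 web Pods running
  selector:
    matchLabels:
      app: web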


1️⃣2️⃣ You have a Pod stuck in Pending state with “0/5 nodes are available: Insufficient cpu”. How do you troubleshoot?

Answer:

Step-by-Step Troubleshooting:

1. Check Pod Resource Requests

$ kubectl describe pod stuck-pod
...
Requests:
  cpu:     4000m  # Requesting 4 CPUs
  memory:  8Gi
Events:
  Warning  FailedScheduling  0/5 nodes are available: 5 Insufficient cpu.

2. Check Node Allocatable Resources

$ kubectl describe nodes | grep -A 5 "Allocated resources"
Allocated resources:
  Resource           Requests    Limits
  cpu                3800m (95%)  4000m (100%)
  memory             7.5Gi (94%)  8Gi (100%)

3. Identify Resource Hogs

$ kubectl top nodes
NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
worker-node-1  3850m        96%    7680Mi          96%
worker-node-2  3900m        97%    7890Mi          98%

4. Check for Taints/Tolerations

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
NAME           TAINTS
worker-node-1  [map[effect:NoSchedule key:node.kubernetes.io/disk-pressure]]

Solutions:

Option 1: Reduce Pod Resource Requests

spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: 2000m     # Reduced from 4000m
        memory: 4Gi    # Reduced from 8Gi

Option 2: Scale Down Other Pods

$ kubectl scale deployment low-priority-app --replicas=0

Option 3: Add More Nodes

# In cloud environments
$ eksctl scale nodegroup --cluster=my-cluster --name=ng-1 --nodes=3

Option 4: Use the Cluster Autoscaler, which automatically adds nodes when Pods can't be scheduled

Diagnostic Commands:

# See all pending Pods and reasons
$ kubectl get events --field-selector involvedObject.kind=Pod --sort-by='.lastTimestamp'

# Check scheduler logs
$ kubectl logs -n kube-system kube-scheduler-xxx

# Check a specific node's remaining headroom
$ kubectl describe node worker-node-1 | grep -A 10 "Allocated resources"

Real-World Tip: Set requests close to typical usage so scheduling decisions reflect reality, and use limits as a safety cap!


1️⃣3️⃣ What is Pod Topology Spread Constraints and when would you use it?

Answer:

Pod Topology Spread Constraints ensure Pods are evenly distributed across failure domains (zones, nodes, racks) for high availability.

The Problem It Solves:

Without constraints, the scheduler might place all replicas in the same zone or on the same node:

Zone A: 5 Pods  ⚠️ All eggs in one basket!
Zone B: 0 Pods
Zone C: 0 Pods

With Topology Spread:

Zone A: 2 Pods  ✅ Distributed
Zone B: 2 Pods  ✅ Distributed
Zone C: 1 Pod   ✅ Distributed

Configuration Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 6
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                    # Max difference between zones
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule  # or ScheduleAnyway
        labelSelector:
          matchLabels:
            app: web
      containers:
      - name: nginx
        image: nginx

Key Parameters:

1. maxSkew

  • Maximum allowed difference in Pod count
  • maxSkew: 1 → Zones can differ by at most 1 Pod
  • Lower = more even distribution

2. topologyKey

  • Node label to use as topology domain
  • Common keys:
    • topology.kubernetes.io/zone (AZs)
    • kubernetes.io/hostname (individual nodes)
    • topology.kubernetes.io/region

3. whenUnsatisfiable

  • DoNotSchedule: Strict (Pod stays Pending if it can't be spread)
  • ScheduleAnyway: Soft (tries to spread, but schedules anyway)

Real-World Scenarios:

Scenario 1: High Availability Across AZs

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  # Ensures no single AZ failure takes down all Pods

Scenario 2: Even Load Across Nodes

topologySpreadConstraints:
- maxSkew: 2
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  # Spreads across nodes but doesn't block scheduling

Scenario 3: Multi-Region Distribution

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/region
  whenUnsatisfiable: DoNotSchedule
  # For global applications

Comparison with Pod Anti-Affinity:

Feature      | Topology Spread        | Pod Anti-Affinity
-------------|------------------------|-------------------
Granularity  | Fine control (maxSkew) | Binary (yes/no)
Use Case     | Even distribution      | Avoid co-location
Flexibility  | More options           | Less flexible
Performance  | Better at scale        | Can be slow
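For contrast, the anti-affinity way to say "never co-locate web Pods on the same node" is binary, with no maxSkew-style tuning:

spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname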

Best Practice for Production:

topologySpreadConstraints:
# Zone-level spreading (primary)
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
# Node-level spreading (secondary)
- maxSkew: 2
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway

Debugging:

# Check Pod distribution
$ kubectl get pods -o wide -l app=web --sort-by=.spec.nodeName

# See why Pod didn't spread
$ kubectl describe pod my-pod | grep -A 10 "Events"

1️⃣4️⃣ How would you implement custom scheduling logic without writing a full custom scheduler?

Answer:

Three Approaches:


Option 1: Scheduler Extender (Webhook-Based) 🔌

Extends default scheduler with HTTP webhooks.

How It Works:

  1. Default scheduler runs Filter/Score phases
  2. Calls your extender webhook at specific points
  3. Your service returns additional filtering/scoring
  4. Scheduler combines results

Example Extender Configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: scheduler-config
  namespace: kube-system
data:
  scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    extenders:
    - urlPrefix: "http://my-extender:8080"
      filterVerb: "filter"
      prioritizeVerb: "prioritize"
      weight: 1
      enableHTTPS: false
      nodeCacheCapable: false  # the /filter handler below expects full Node objects
      ignorable: false

Extender Service (Python Example):

from flask import Flask, request, jsonify

app = Flask(__name__)

def get_available_memory(node):
    # Simplified: use allocatable memory (e.g. "8151552Ki") as a stand-in
    # for "available"; parse the Kubernetes quantity into bytes
    mem = node['status']['allocatable']['memory']
    units = {'Ki': 1024, 'Mi': 1024**2, 'Gi': 1024**3}
    for suffix, factor in units.items():
        if mem.endswith(suffix):
            return int(mem[:-len(suffix)]) * factor
    return int(mem)

@app.route('/filter', methods=['POST'])
def filter_nodes():
    # Request body is ExtenderArgs; its JSON keys are capitalized
    # ("Pod", "Nodes") because the upstream Go types carry no json tags
    data = request.json
    nodes = data['Nodes']['items']

    # Custom filtering logic
    # Example: only keep nodes that advertise a GPU
    filtered_nodes = [
        node for node in nodes
        if 'nvidia.com/gpu' in node['status']['allocatable']
    ]

    return jsonify({
        'Nodes': {'items': filtered_nodes},
        'FailedNodes': {},
        'Error': ''
    })

@app.route('/prioritize', methods=['POST'])
def prioritize_nodes():
    data = request.json
    nodes = data['Nodes']['items']

    # Custom scoring logic
    # Example: prefer nodes with more available memory
    # Extender scores must be in the 0-10 range (MaxExtenderPriority)
    scores = []
    for node in nodes:
        available_mem = get_available_memory(node)
        score = min(available_mem // (1024**3), 10)  # ~1 point per GiB, capped at 10
        scores.append({'Host': node['metadata']['name'], 'Score': score})

    return jsonify(scores)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)

Use Cases:

  • GPU-aware scheduling
  • License-based placement
  • Custom cost optimization
  • Integration with external systems

Option 2: Scheduling Profiles 📋

Configure different scheduling behaviors without code.

Example:

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
# Profile 1: Default (spreads Pods via the LeastAllocated strategy)
- schedulerName: default-scheduler
  plugins:
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1

# Profile 2: Bin Packing (pack tightly)
# The old NodeResourcesMostAllocated plugin was folded into
# NodeResourcesFit; bin packing is now its MostAllocated strategy.
- schedulerName: bin-packing-scheduler
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: MostAllocated   # Opposite of the default!
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1

# Profile 3: GPU Optimized (bin-pack scarce GPU capacity)
- schedulerName: gpu-scheduler
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: MostAllocated
        resources:
        - name: nvidia.com/gpu
          weight: 5
        - name: cpu
          weight: 1

Using Different Profiles:

# Batch job - use bin packing
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing
spec:
  template:
    spec:
      schedulerName: bin-packing-scheduler
      containers:
      - name: processor
        image: data-processor

# ML training - use GPU scheduler
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  schedulerName: gpu-scheduler
  containers:
  - name: trainer
    image: tensorflow/tensorflow:latest-gpu

Option 3: Admission Controllers + Mutating Webhooks 🎯

Modify Pod specs before scheduling.

Example Use Case: Auto-add node selectors based on namespace

Mutating Webhook:

// Package mutation holds the Pod-mutating logic; the webhook server
// wiring (TLS, AdmissionReview decoding) is omitted for brevity.
package mutation

import corev1 "k8s.io/api/core/v1"

// mutatePod adjusts a Pod's scheduling fields before the scheduler sees it.
func mutatePod(pod *corev1.Pod) {
    // Add node selector for production namespace
    if pod.Namespace == "production" {
        if pod.Spec.NodeSelector == nil {
            pod.Spec.NodeSelector = make(map[string]string)
        }
        pod.Spec.NodeSelector["tier"] = "production"
        pod.Spec.NodeSelector["ssd"] = "true"
    }

    // Add toleration for batch jobs
    if pod.Labels["workload"] == "batch" {
        pod.Spec.Tolerations = append(pod.Spec.Tolerations,
            corev1.Toleration{
                Key:      "batch",
                Operator: corev1.TolerationOpEqual,
                Value:    "true",
                Effect:   corev1.TaintEffectNoSchedule,
            })
    }
}
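The function above covers only the mutation itself; to have the API Server call it, you register the webhook. A hedged sketch (the webhook/Service names and path are placeholders, and TLS/caBundle wiring is omitted):

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: pod-scheduling-defaults
webhooks:
- name: pod-defaults.example.com      # hypothetical
  admissionReviewVersions: ["v1"]
  sideEffects: None
  clientConfig:
    service:
      name: pod-mutator               # hypothetical Service fronting the code above
      namespace: kube-system
      path: /mutate
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]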

Comparison Table:

Approach            | Complexity | Flexibility | Use Case
--------------------|------------|-------------|------------------------------------
Scheduler Extender  | Medium     | High        | Custom logic, external integration
Scheduling Profiles | Low        | Medium      | Different policies per workload
Admission Webhooks  | Medium     | High        | Pre-scheduling Pod modifications

Recommendation:

  • Simple needs → Scheduling Profiles
  • External integration → Scheduler Extender
  • Pod mutation → Admission Webhooks
  • Complex custom logic → Full custom scheduler

1️⃣5️⃣ In a multi-tenant cluster, how would you ensure fair resource allocation and prevent one team from starving others?

Answer:

Multi-Layered Approach:


Layer 1: ResourceQuotas (Namespace-Level Limits) 🎯

Prevent teams from consuming more than their fair share.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    requests.cpu: "100"        # Total CPU across all Pods
    requests.memory: 200Gi     # Total memory
    limits.cpu: "200"          # Max CPU with limits
    limits.memory: 400Gi       # Max memory
    persistentvolumeclaims: "10"
    pods: "50"                 # Max Pod count
    services.loadbalancers: "3"

Check Quota Usage:

$ kubectl describe resourcequota -n team-alpha
Name:                   team-alpha-quota
Resource                Used   Hard
--------                ----   ----
requests.cpu            85     100    # ⚠️ 85% used!
requests.memory         180Gi  200Gi
pods                    45     50

Layer 2: LimitRanges (Pod-Level Defaults) 📏

Prevent individual Pods from being too large.

apiVersion: v1
kind: LimitRange
metadata:
  name: pod-limits
  namespace: team-alpha
spec:
  limits:
  # Container limits
  - type: Container
    max:
      cpu: "4"            # No single container > 4 CPU
      memory: 8Gi         # No single container > 8Gi
    min:
      cpu: 100m           # Minimum request
      memory: 128Mi
    default:
      cpu: 500m           # Default if not specified
      memory: 512Mi
    defaultRequest:
      cpu: 250m
      memory: 256Mi
  
  # Pod limits
  - type: Pod
    max:
      cpu: "8"            # No Pod > 8 CPU total
      memory: 16Gi

Effect:

# Pod without requests/limits
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-image
    # No resources specified

# After LimitRange mutation:
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: 250m      # Auto-added
        memory: 256Mi
      limits:
        cpu: 500m      # Auto-added
        memory: 512Mi

Layer 3: Priority Classes (Workload Importance) 🏆

Ensure critical workloads get scheduled first.

# Production workloads
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: production-high
value: 1000000
globalDefault: false
description: "Production critical services"

---
# Development workloads
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: development-low
value: 100
preemptionPolicy: Never  # Can't preempt others
description: "Development and testing"

Assign to Pods:

# Production Pod
apiVersion: v1
kind: Pod
metadata:
  name: api-server
  namespace: team-alpha
spec:
  priorityClassName: production-high
  containers:
  - name: api
    image: api-server

# Dev Pod  
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  namespace: team-beta
spec:
  priorityClassName: development-low
  containers:
  - name: test
    image: test-app

Layer 4: Node Pools / Taints (Physical Isolation) 🏭

Dedicate nodes to specific teams.

Setup:

# Label nodes for each team
$ kubectl label nodes node-1 node-2 node-3 team=alpha
$ kubectl label nodes node-4 node-5 node-6 team=beta

# Add taints to prevent other teams
$ kubectl taint nodes node-1 node-2 node-3 team=alpha:NoSchedule
$ kubectl taint nodes node-4 node-5 node-6 team=beta:NoSchedule

Team-A Pods (with toleration):

apiVersion: v1
kind: Pod
metadata:
  name: team-a-pod
spec:
  nodeSelector:
    team: alpha
  tolerations:
  - key: team
    operator: Equal
    value: alpha
    effect: NoSchedule
  containers:
  - name: app
    image: team-a-image

Result:

  • Team A Pods → Only run on nodes 1-3
  • Team B Pods → Only run on nodes 4-6
  • No cross-team interference

Layer 5: Cluster Autoscaler Configuration ⚙️

Fair autoscaling per team.

# Separate node groups per team with autoscaling
# In cloud provider (AWS EKS example):

# Team Alpha node group
eksctl create nodegroup \
  --cluster=my-cluster \
  --name=team-alpha-ng \
  --node-labels=team=alpha \
  --node-taints=team=alpha:NoSchedule \
  --nodes-min=3 \
  --nodes-max=10

# Team Beta node group
eksctl create nodegroup \
  --cluster=my-cluster \
  --name=team-beta-ng \
  --node-labels=team=beta \
  --node-taints=team=beta:NoSchedule \
  --nodes-min=3 \
  --nodes-max=10

Complete Multi-Tenant Setup Example:

# 1. Create namespace per team
apiVersion: v1
kind: Namespace
metadata:
  name: team-alpha
  labels:
    team: alpha
    environment: production

---
# 2. ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    pods: "50"

---
# 3. LimitRange
apiVersion: v1
kind: LimitRange
metadata:
  name: team-alpha-limits
  namespace: team-alpha
spec:
  limits:
  - type: Container
    max:
      cpu: "4"
      memory: 8Gi
    defaultRequest:
      cpu: 250m
      memory: 256Mi

---
# 4. NetworkPolicy (bonus: network isolation)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: team-alpha-isolation
  namespace: team-alpha
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          team: alpha  # Only allow traffic from same team

Monitoring Fair Allocation:

# Check quota usage across namespaces
$ kubectl get resourcequota --all-namespaces

# Check which team is using most resources
$ kubectl top pods --all-namespaces --sort-by=cpu | head -20

# See pending Pods (potential starvation)
$ kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Audit events for quota exceeded
$ kubectl get events --all-namespaces | grep "exceeded quota"

Alert When Team Hits Limits:

# Prometheus alert rule (kube-state-metrics; note ignoring(type)
# so the "used" and "hard" series actually join)
- alert: NamespaceQuotaNearLimit
  expr: |
    kube_resourcequota{type="used"}
      / ignoring(type)
    kube_resourcequota{type="hard"} > 0.9
  for: 5m
  labels:
    severity: warning
  annotations:
    description: "Team {{ $labels.namespace }} is above 90% of quota for {{ $labels.resource }}"

Best Practices:

  1. ✅ Start with generous quotas, adjust based on usage
  2. ✅ Use PriorityClasses to protect production workloads
  3. ✅ Monitor quota usage and set alerts
  4. ✅ Document team resource allocation policies
  5. ✅ Regular reviews and adjustments
  6. ✅ Consider cost allocation tools (Kubecost, OpenCost)

🎓 Bonus Tips for Interviews

Red Flags to Avoid ❌

  • ❌ “Scheduler runs containers” (No, that’s kubelet!)
  • ❌ “Only one scheduler per cluster” (Multiple are supported!)
  • ❌ “Scheduler directly talks to kubelet” (All via API Server!)
  • ❌ “Scheduler stores state” (It’s stateless, etcd has state!)

Pro Interview Answers ✅

  • ✅ Mention “Filter and Score” phases
  • ✅ Reference specific plugins (NodeResourcesFit, NodeAffinity)
  • ✅ Discuss production scenarios (preemption, topology spread)
  • ✅ Show troubleshooting knowledge
  • ✅ Mention HA and leader election

Follow-Up Topics to Study 📚

  • Custom Resource Definitions (CRDs) for scheduling
  • Descheduler (rebalances Pods post-scheduling)
  • Cluster Autoscaler integration
  • Volcano / YuniKorn schedulers
  • Scheduling latency optimization
  • Multi-cluster scheduling

📊 Quick Reference Cheat Sheet

SCHEDULER WORKFLOW
==================
1. Watch API Server for Pods with nodeName = ""
2. FILTER: Remove incompatible nodes
3. SCORE: Rank remaining nodes (0-100)
4. BIND: Update Pod with selected nodeName
5. Kubelet takes over → Runs container

KEY PLUGINS
===========
Filter:
• NodeResourcesFit - CPU/memory check
• NodeAffinity - Node selectors
• TaintToleration - Taints/tolerations
• PodTopologySpread - Even distribution

Score:
• NodeResourcesBalancedAllocation - Balanced usage
• ImageLocality - Cached images
• InterPodAffinity - Pod affinity
• NodeResourcesLeastAllocated - Spread Pods

TROUBLESHOOTING COMMANDS
========================
kubectl describe pod <pod-name>
kubectl get events --sort-by=.lastTimestamp
kubectl logs -n kube-system kube-scheduler-xxx
kubectl top nodes
kubectl describe nodes | grep -A 5 "Allocated"


🚀 What’s Next?

Practice Scenarios:

  1. Set up Priority Classes in a test cluster
  2. Implement Pod Topology Spread Constraints
  3. Configure ResourceQuotas for multi-tenancy
  4. Write a simple scheduler extender
  5. Troubleshoot real Pending Pods


Found this helpful? Share your scheduler war stories in the comments! 👇

Next in series: Part 2 – Kubernetes Controllers Deep Dive

#Kubernetes #K8s #DevOps #SRE #Interview #CloudNative #Scheduler #TechInterview #Learning


💡 Pro Tip: Star this for your next interview prep! These questions cover 90% of scheduler-related interview topics.
