Abstract
Benefits
- Self-healing
- Automatic rollbacks
- Horizontal scaling
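The benefits above are mostly delivered through a Deployment. A minimal sketch (name, image, and probe path are illustrative, not from these notes):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # illustrative name
spec:
  replicas: 3                # horizontal scaling: desired number of pod replicas
  strategy:
    type: RollingUpdate      # a bad rollout can be reverted with `kubectl rollout undo`
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          livenessProbe:     # self-healing: a failing probe gets the container restarted
            httpGet:
              path: /
              port: 80
```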
Attention
- Can be complex to maintain
- Costs associated with running nodes
Managed control planes can help mitigate complexity.
Sandbox to play with k8s
Play with Kubernetes provides you with Linux machines that have k8s preinstalled.
Control Plane
- Runs on multiple nodes across data center zones for high availability
Key Components
Controller Manager
- Replication Controller: Maintains the desired number of pod replicas
- Deployment Controller: Handles rolling updates and rollbacks
Scheduler
- Schedules pods onto worker nodes, making placement decisions
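One way placement decisions can be influenced is a nodeSelector; a minimal sketch (the label key/value is a hypothetical example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  nodeSelector:
    disktype: ssd      # scheduler only considers nodes carrying this label
  containers:
    - name: web
      image: nginx
```

Without such constraints, the scheduler picks any node with sufficient free resources.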
etcd
- Distributed key-value store
- Stores cluster state, available resources, and health information
- Used by other control plane components
API Server
- REST API front end for the cluster; the other control plane components and the worker nodes all communicate through it
Worker nodes
- Run containers, which are encapsulated within pods
- Pods are the smallest deployable units in Kubernetes
- Pods provide shared storage and networking for containers
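The shared storage and networking point can be sketched with a two-container pod (images and paths illustrative): both containers mount the same emptyDir volume, and they also share one network namespace, so they can reach each other on localhost.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo
spec:
  volumes:
    - name: shared
      emptyDir: {}           # pod-scoped scratch volume, shared by both containers
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "while true; do date >> /data/log; sleep 5; done"]
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: reader
      image: busybox
      command: ["sh", "-c", "tail -f /data/log"]   # reads what the writer produced
      volumeMounts:
        - name: shared
          mountPath: /data
```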
Key Components
Kubelet
- Communicates with the control plane
- Ensures the desired state of pods is maintained
Container Runtime
- Can be Docker or another compatible runtime
- Runs containers on worker nodes
- Pulls images, starts/stops containers
Kube-proxy
- Routes traffic to the correct pods
- Handles load balancing
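A Service is what kube-proxy acts on: for each Service, it programs iptables/IPVS rules so traffic to the Service IP is spread across the matching pods. A minimal sketch (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web         # traffic is load-balanced across pods with this label
  ports:
    - port: 80       # port clients hit on the Service
      targetPort: 8080   # port the pod containers listen on
```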
- Cluster networking ensures that pods on different nodes can reach each other directly, with traffic routed between nodes transparently
Containerization Workflow
- Kubelet (node agent) receives Pod spec
- It talks to the CRI runtime (containerd, CRI-O)
- Kubelet asks the runtime to create the containers and the Pod-level cgroup
- The containers inside the Pod share the Pod cgroup and some namespaces (network and IPC are shared; others, like PID, are isolated by default)
- Kubernetes writes the Pod's CPU and memory limits into the cgroup controllers
- The kernel enforces those resource restrictions dynamically
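The limits written into the cgroup come from the container's resources stanza; a minimal sketch (values illustrative):

```yaml
resources:
  requests:
    cpu: "250m"        # used by the scheduler for placement
    memory: "128Mi"
  limits:
    cpu: "500m"        # enforced by the kernel via the CPU cgroup controller (throttling)
    memory: "256Mi"    # enforced via the memory controller; exceeding it gets the container OOM-killed
```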
GPU Scheduling
- By default, Kubernetes does not know GPUs exist on a node. The NVIDIA Device Plugin is a DaemonSet that runs on every GPU node and registers nvidia.com/gpu as a schedulable resource
```yaml
# Pod spec requesting a GPU
resources:
  limits:
    nvidia.com/gpu: 1
```
- On EKS, GPU-enabled node groups use instances with NVIDIA GPUs (p4d, p5, g5, g6) and the EKS-optimized GPU AMI, which comes with NVIDIA drivers pre-installed
How it works
The flow is: GPU instance (hardware) → NVIDIA drivers (in AMI) → NVIDIA Device Plugin (DaemonSet, exposes GPUs to k8s scheduler) → Pod requests nvidia.com/gpu in resource limits. Without the Device Plugin, the GPU hardware is physically present on the node but invisible to the Kubernetes scheduler, and no pod can request or use it.
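Once the Device Plugin is running, the GPU shows up on the node object itself; roughly what the relevant excerpt of the node's status looks like (exact figures depend on the instance):

```yaml
# Excerpt of a GPU node's status (e.g. from `kubectl get node <name> -o yaml`)
status:
  capacity:
    nvidia.com/gpu: "1"     # registered by the Device Plugin
  allocatable:
    nvidia.com/gpu: "1"     # what the scheduler can hand out to pods
```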
Cost optimization
Use Karpenter to auto-provision GPU nodes only when pods need them and scale to zero when idle. GPU instances are expensive, so avoiding idle nodes is critical.
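A rough sketch of what that Karpenter setup can look like as a NodePool; field names follow the karpenter.sh/v1 API, but treat the specifics (instance categories, timings) as illustrative assumptions, since Karpenter's schema has changed across versions:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g", "p"]        # GPU instance families only
      taints:
        - key: nvidia.com/gpu       # keep non-GPU pods off these expensive nodes
          effect: NoSchedule
  disruption:
    consolidationPolicy: WhenEmpty  # scale to zero: remove nodes with no GPU pods
    consolidateAfter: 60s
```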
