This is my practical Kubernetes handbook. It is organised so you can either learn progressively or jump straight to troubleshooting and operations topics.

- Beginner: start with fundamentals, then move to config and networking.
- Intermediate: build reliability with observability, storage, and scheduling.
- Advanced: focus on security hardening, administration, and architecture.
These are the fundamental building blocks that run your containers. Understanding these resources is essential before moving to more advanced topics.
The smallest deployable unit in Kubernetes. Think of it as a wrapper around one or more containers that share the same network and storage. Pods are designed to be temporary. They can be created, destroyed, and replaced at any time. This is why you rarely create Pods directly. Instead, you use controllers like Deployments that manage Pods for you.
| Command | Description |
|---|---|
kubectl get pods | List pods in current namespace |
kubectl describe pod <name> | Show detailed pod info and events |
kubectl delete pod <name> | Delete a pod (controller recreates it) |
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    ports:
    - containerPort: 80
```

Pods progress through distinct phases during their lifetime. Understanding these phases helps with debugging and designing resilient applications.
| Phase | Description | Common Causes |
|---|---|---|
Pending | Pod accepted but containers not running | Image pulling, PVC binding, resource constraints |
Running | At least one container is running | Normal operation |
Succeeded | All containers terminated successfully | Job or one-off task completed
Failed | All containers terminated, at least one failed | Non-zero exit code, OOMKilled, eviction
Unknown | Cannot determine pod state | Node communication lost |
Within each Pod, individual containers have their own state:
| State | Meaning |
|---|---|
Waiting | Container waiting to start (image pull, init container running) |
Running | Container executing |
Terminated | Container exited (success or failure) |
```shell
# Check container states
kubectl get pod <name> -o jsonpath='{.status.containerStatuses[*].state}'
# Check init container status
kubectl get pod <name> -o jsonpath='{.status.initContainerStatuses[*].name}'
```

The most common way to run stateless applications like web servers or APIs. Deployments manage ReplicaSets, which manage Pods. This layered approach enables rolling updates. When you update the container image, the Deployment creates new Pods with the new version, waits for them to be ready, then gradually removes the old Pods. Your application stays online throughout the update.
| Command | Description |
|---|---|
kubectl create deploy <name> --image=<img> | Create deployment imperatively |
kubectl scale deploy <name> --replicas=3 | Scale to 3 replicas |
kubectl rollout status deploy <name> | Watch rollout progress |
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
```

Beyond rolling updates, Kubernetes supports advanced deployment strategies for zero-downtime releases and risk mitigation.
| Pattern | Description | Use Case |
|---|---|---|
| Blue-Green | Run both versions simultaneously, switch traffic instantly | Zero-downtime deployments, easy rollback |
| Canary | Route small % of traffic to new version, gradually increase | Risk mitigation, A/B testing |
Deploy two versions simultaneously. Switch traffic by updating the Service selector. Instant rollback by switching back.
```shell
# 1. Create blue deployment (current version)
kubectl create deployment nginx-blue --image=nginx:1.14 --replicas=3
# 2. Create green deployment (new version)
kubectl create deployment nginx-green --image=nginx:1.15 --replicas=3
# 3. Create service pointing to blue (pods from "kubectl create deployment"
#    carry an app=<name> label, so select on that)
kubectl expose deployment nginx-blue --name=nginx --port=80 --selector=app=nginx-blue
# 4. Switch traffic to green (instant) - edit the service selector
kubectl edit service nginx   # change selector to app=nginx-green
# 5. Roll back to blue if issues arise - switch the selector back
kubectl edit service nginx   # change selector to app=nginx-blue
# 6. Clean up the old deployment
kubectl delete deployment nginx-blue
```

Start with 1 replica of the new version alongside the old. Gradually scale up canary while scaling down stable. Monitor for errors before full rollout.
```shell
# 1. Stable deployment (9 replicas)
kubectl create deployment nginx --image=nginx:1.14 --replicas=9
# 2. Canary deployment (1 replica)
kubectl create deployment nginx-canary --image=nginx:1.15 --replicas=1
# 3. Both sets of Pods need a shared label on the *pod template* so one
#    Service can select them (labelling the Deployment object itself
#    does not relabel its Pods)
kubectl patch deployment nginx -p '{"spec":{"template":{"metadata":{"labels":{"track":"web"}}}}}'
kubectl patch deployment nginx-canary -p '{"spec":{"template":{"metadata":{"labels":{"track":"web"}}}}}'
# 4. Monitor canary - check logs, metrics
kubectl logs -l app=nginx-canary --tail=100
# 5. Gradually increase canary
kubectl scale deployment nginx-canary --replicas=3
kubectl scale deployment nginx --replicas=7
# 6. Continue until fully migrated
kubectl scale deployment nginx-canary --replicas=10
kubectl delete deployment nginx
```

Note: For traffic percentage control, use an Ingress controller with canary annotations (Nginx, Traefik) or a service mesh (Istio, Linkerd).
The engine inside a Deployment that keeps your application running. ReplicaSets ensure the correct number of Pod copies are always running. If a Pod crashes or a node fails, the ReplicaSet automatically creates a new Pod to replace it. You typically do not work with ReplicaSets directly. Deployments manage them for you. Understanding ReplicaSets helps explain why applications remain available during failures.
| Command | Description |
|---|---|
kubectl get rs | List ReplicaSets |
kubectl describe rs <name> | Show ReplicaSet details and events |
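Although Deployments create and manage these for you, seeing a standalone manifest clarifies the selector/template relationship; a minimal sketch (the nginx names are illustrative):

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3
  selector:
    matchLabels:      # Must match the pod template labels below
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
```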
Use StatefulSets for applications that need stable identities, like databases or message queues. Unlike Deployments where Pods are interchangeable, StatefulSet Pods get predictable names (mysql-0, mysql-1) and are created in order. Each Pod maintains its own persistent storage that survives restarts. This makes StatefulSets ideal for clustered databases where each node needs to know its identity.
| Command | Description |
|---|---|
kubectl get sts | List StatefulSets |
kubectl scale sts <name> --replicas=3 | Scale StatefulSet |
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```

Ensures exactly one Pod runs on every node in your cluster. Perfect for system-level services that need to be present everywhere. Common examples include log collectors that gather container logs from each node, monitoring agents that report node metrics, or network plugins that set up networking on each machine. When you add a new node to the cluster, the DaemonSet automatically deploys its Pod there without any manual intervention.
| Command | Description |
|---|---|
kubectl get ds | List DaemonSets |
kubectl describe ds <name> | Show DaemonSet details |
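This section has no manifest, so here is a minimal log-collector sketch (the fluentd image and hostPath are illustrative assumptions):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: fluentd
        image: fluentd        # illustrative log-collector image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:             # read the node's own log directory
          path: /var/log
```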
Jobs run Pods that complete a task and then exit. Use a Job for one-time tasks like database migrations or batch data processing. Use a CronJob to run Jobs on a schedule, such as nightly backups or periodic cleanup tasks. Unlike Deployments which keep Pods running indefinitely, Jobs and CronJobs are designed for tasks that have a definite end point.
| Command | Description |
|---|---|
kubectl create job <name> --image=<img> | Create a one-time job |
kubectl get jobs | List jobs and completion status |
kubectl get cronjobs | List scheduled CronJobs |
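A plain one-time Job, sketched minimally (the migration image and script are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3          # Retry up to 3 times on failure
  template:
    spec:
      containers:
      - name: migrate
        image: migrate-tool:latest          # placeholder image
        command: ["/bin/sh", "-c", "migrate.sh"]
      restartPolicy: Never   # Jobs must use Never or OnFailure
```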
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup
spec:
  schedule: "0 2 * * *"   # Daily at 2am
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["/bin/sh", "-c", "backup.sh"]
          restartPolicy: OnFailure
```

Pods can contain multiple containers that share network and storage. Common patterns:
| Pattern | Use Case |
|---|---|
| Init Container | Runs before main container (wait for DB, download config) |
| Sidecar | Runs alongside main container (log shipper, proxy) |
| Ambassador | Proxy outbound connections (localhost to external) |
| Adapter | Transform output for consumption (log formatting) |
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 2; done']
  containers:
  - name: app
    image: myapp:latest
```

| Characteristic | Behavior |
|---|---|
| Execution Order | Run sequentially, not in parallel |
| Success Requirement | Must complete successfully (exit 0) before next init or app containers start |
| Resource Limits | Have separate resource limits from app containers |
| Features | Support all container features (volumes, env vars, security contexts) |
| Restart Policy | Restarted on failure according to the Pod's restartPolicy (with Never, the Pod fails instead) |
Common Use Cases: waiting for a dependency (database, API) to become reachable, downloading configuration or assets, running schema migrations before the app starts.
```shell
# Check init container status
kubectl get pod <name>
kubectl describe pod <name> | grep -A 20 "Init Containers"
# View init container logs
kubectl logs <pod> -c <init-container-name>
```

Sidecar Pattern (Log Shipping):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-logger
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  - name: log-aggregator
    image: fluentd
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  volumes:
  - name: logs
    emptyDir: {}
```

Adapter Pattern (Data Transformation):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-adapter
spec:
  containers:
  - name: main-app
    image: myapp
    volumeMounts:
    - name: data
      mountPath: /data
  - name: adapter
    image: log-adapter
    volumeMounts:
    - name: data
      mountPath: /input
    command: ['sh', '-c', 'transform /input/logs.json']
  volumes:
  - name: data        # shared volume both containers mount
    emptyDir: {}
```

Ambassador Pattern (Proxy):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy
spec:
  containers:
  - name: main-app
    image: myapp
    env:
    - name: DB_HOST
      value: "localhost:5432"
  - name: db-proxy
    image: pg-proxy
    ports:
    - containerPort: 5432
```
Pods managed directly by kubelet on a node, not by the API server. Control plane components (kube-apiserver, etcd) run as static pods. Manifests live in /etc/kubernetes/manifests/.
| Command | Description |
|---|---|
ls /etc/kubernetes/manifests/ | List static pod manifests |
kubectl get pods -n kube-system | Static pods appear with node name suffix |
Choose the right controller based on your application's requirements. Each controller optimises for different use cases.
| Controller | Best For | Pod Identity | Storage |
|---|---|---|---|
| Deployment | Stateless apps (web servers, APIs) | Interchangeable | Shared (ReadWriteMany) |
| StatefulSet | Stateful apps (databases, message queues) | Stable (pod-0, pod-1) | Dedicated per Pod |
| DaemonSet | Node agents (logging, monitoring) | One per node | hostPath or emptyDir |
| Job | One-time tasks (migrations, batch) | Completes and exits | Ephemeral |
| CronJob | Scheduled tasks (backups, cleanup) | Creates Jobs on schedule | Ephemeral |
Ask these questions: Does the app keep state that must survive restarts? Should it run on every node? Does it run to completion rather than serve continuously? Does it need to run on a schedule?
Applications need configuration. This section covers how to inject settings, store sensitive data securely, and control resource usage.
Store non-sensitive configuration data like application settings, feature flags, or configuration files. You can inject ConfigMap data into Pods as environment variables or mount it as files. Important note: changing a ConfigMap does not automatically restart running Pods. They will see the new values only after they restart.
| Command | Description |
|---|---|
kubectl create cm <name> --from-literal=key=value | Create from literal values |
kubectl create cm <name> --from-file=config.txt | Create from file |
kubectl get cm <name> -o yaml | View ConfigMap contents |
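The same configuration declaratively, as a sketch (the app-config name and log_level key match the consumption snippets in this section; the config.txt content is illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  log_level: debug    # consumed via configMapKeyRef below
  config.txt: |       # whole-file entry, appears as a file when mounted
    feature_x=true
```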
```yaml
# As environment variable
env:
- name: LOG_LEVEL
  valueFrom:
    configMapKeyRef:
      name: app-config
      key: log_level
```

```yaml
# As mounted file
volumes:
- name: config
  configMap:
    name: app-config
volumeMounts:
- name: config
  mountPath: /etc/config
```

Store sensitive data like passwords, API keys, and TLS certificates. Secrets are base64-encoded but not encrypted by default. Anyone with access to the cluster can read them. For production, use tools like Sealed Secrets or External Secrets Operator to encrypt secrets before storing them in Git. Like ConfigMaps, Secrets can be mounted as files or injected as environment variables.
| Command | Description |
|---|---|
kubectl create secret generic <name> --from-literal=pass=secret | Create generic secret |
kubectl create secret tls <name> --cert=tls.crt --key=tls.key | Create TLS secret |
kubectl get secret <name> -o jsonpath='{.data.pass}' \| base64 -d | Decode secret value
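To see why base64 offers no protection, this snippet round-trips a value with plain coreutils, no cluster needed:

```shell
# base64 is an encoding, not encryption: anyone can reverse it
encoded=$(printf '%s' 's3cr3t' | base64)
echo "$encoded"                      # czNjcjN0 - readable encoding
printf '%s' "$encoded" | base64 -d   # s3cr3t - original recovered
```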
A way to divide a cluster into multiple virtual clusters. Use namespaces to separate different environments like development, staging, and production, or to isolate team resources. Each namespace has its own DNS scope. Services in different namespaces can still communicate using the full DNS name like <service>.<namespace>.svc.cluster.local.
| Command | Description |
|---|---|
kubectl create ns <name> | Create namespace |
kubectl get pods -n <namespace> | List pods in specific namespace |
kubectl config set-context --current --namespace=<ns> | Set default namespace |
Control how much CPU and memory your containers can use. Requests tell the scheduler the minimum resources a container needs. The scheduler uses this to decide which node can run the Pod. Limits set the maximum resources a container can consume. If a container exceeds its memory limit, Kubernetes kills it with an OOMKilled error. If it exceeds its CPU limit, the CPU gets throttled but the container keeps running.
```yaml
containers:
- name: app
  image: myapp
  resources:
    requests:
      memory: "128Mi"
      cpu: "250m"    # 0.25 CPU cores
    limits:
      memory: "256Mi"
      cpu: "500m"    # 0.5 CPU cores
```

A ResourceQuota limits aggregate resource consumption per namespace. Prevents one team from consuming all cluster resources. Can limit CPU, memory, storage, and object counts (pods, services, secrets).
| Command | Description |
|---|---|
kubectl get resourcequota -n <ns> | Show quota usage |
kubectl describe resourcequota <name> | Show limits and current usage |
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```

Sets default resource requests/limits for containers in a namespace. Prevents pods from being created without resource specs. Can also enforce min/max constraints.
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:            # Default limits
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:     # Default requests
      cpu: "100m"
      memory: "128Mi"
```

Security settings at Pod or container level: run as specific user, drop capabilities, read-only filesystem, prevent privilege escalation. Critical for hardening production workloads.
```yaml
spec:
  securityContext:             # Pod level
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    securityContext:           # Container level
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
```

When things go wrong, you need to understand what is happening. This section covers health checks, logs, and troubleshooting techniques.
Kubernetes uses health checks to monitor containers. Each probe serves a different purpose. Use liveness probes to detect when an application has crashed or deadlocked. Kubernetes restarts the container if this probe fails. Use readiness probes to know when an application is ready to accept traffic. Kubernetes removes the Pod from Service endpoints until it passes. Use startup probes for slow-starting applications. This probe disables the other probes until the container is ready, preventing premature restarts.
| Probe | Purpose |
|---|---|
| livenessProbe | Is the container alive? Failure = container restart |
| readinessProbe | Is the container ready for traffic? Failure = removed from Service |
| startupProbe | For slow-starting apps. Disables other probes until success |
```yaml
containers:
- name: app
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 5
  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 3
```

| Field | Default | Description |
|---|---|---|
initialDelaySeconds | 0 | Wait time before first probe |
periodSeconds | 10 | How often to probe |
timeoutSeconds | 1 | Probe timeout |
successThreshold | 1 | Consecutive successes to pass |
failureThreshold | 3 | Consecutive failures to fail |
HTTP GET (most common):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
    - name: Custom-Header
      value: Awesome
```

TCP Socket:

```yaml
livenessProbe:
  tcpSocket:
    port: 3306
```

Exec Command:

```yaml
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
```

Startup Probe (for slow-starting apps):

```yaml
startupProbe:
  httpGet:
    path: /startup
    port: 8080
  failureThreshold: 30
  periodSeconds: 10   # 30 * 10 = 5 minutes max
```

View container output to debug issues. Kubernetes captures stdout and stderr from containers. You can view logs from running Pods or from previous instances after a crash. In production, set up a log aggregation system like Loki or Elasticsearch because logs disappear when Pods are deleted.
| Command | Description |
|---|---|
kubectl logs <pod> | View pod logs |
kubectl logs <pod> -c <container> | Logs from specific container |
kubectl logs <pod> -f --tail=100 | Follow last 100 lines |
Events are the first place to look when something goes wrong. They show scheduling decisions, image pulls, crashes, and more. Events are kept for 1 hour by default.
| Command | Description |
|---|---|
kubectl get events --sort-by='.lastTimestamp' | Recent events sorted by time |
kubectl describe pod <name> | Pod details + events (most useful!) |
kubectl get pods -o wide | Show node placement and IPs |
| Symptom | Check |
|---|---|
| Pod stuck Pending | Resources, node selectors, taints, PVC binding |
| Pod CrashLoopBackOff | Logs, command/args, probes, permissions |
| Pod ImagePullBackOff | Image name, tag, registry auth (imagePullSecrets) |
| Service not routing | Selector labels match, endpoints exist, port numbers |
Pod Pending - Diagnostic Steps:
```shell
# 1. Check events for scheduling issues
kubectl describe pod <name> | grep -A 10 Events
# 2. Check resource constraints
kubectl describe node <node> | grep -A 5 "Allocated resources"
# 3. Check PVC binding status
kubectl get pvc
kubectl describe pvc <name>
# 4. Check node taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# 5. Verify node selectors match
kubectl get pod <name> -o jsonpath='{.spec.nodeSelector}'
kubectl get nodes --show-labels | grep <label>
```

CrashLoopBackOff - Diagnostic Steps:
```shell
# 1. Check logs from previous container
kubectl logs <pod> --previous
# 2. Check events for OOMKilled
kubectl describe pod <name> | grep -i "killed\|oom"
# 3. Verify command and args
kubectl get pod <name> -o jsonpath='{.spec.containers[0].command}{.spec.containers[0].args}'
# 4. Check probe configuration
kubectl describe pod <name> | grep -A 5 "Liveness\|Readiness"
# 5. Check resource limits
kubectl describe pod <name> | grep -A 3 "Limits\|Requests"
```

ImagePullBackOff - Diagnostic Steps:
```shell
# 1. Verify image name and tag
kubectl get pod <name> -o jsonpath='{.spec.containers[0].image}'
# 2. Check if image exists in registry
docker pull <image>   # or crictl pull on the node
# 3. Check imagePullSecrets
kubectl get pod <name> -o jsonpath='{.spec.imagePullSecrets}'
# 4. Create registry secret if needed
kubectl create secret docker-registry regcred \
  --docker-server=<registry> \
  --docker-username=<user> \
  --docker-password=<pass>
```

Service Not Routing - Diagnostic Steps:
```shell
# 1. Check service selector matches pod labels
kubectl get svc <service> -o jsonpath='{.spec.selector}'
kubectl get pods -l <selector-key>=<selector-value> --show-labels
# 2. Verify endpoints exist
kubectl get endpoints <service>
# 3. Check port configuration
kubectl get svc <service> -o jsonpath='{.spec.ports}'
kubectl get pod <pod> -o jsonpath='{.spec.containers[0].ports}'
# 4. Test connectivity from within cluster
kubectl run debug --rm -it --image=busybox -- /bin/sh
wget -O- http://<service>.<namespace>.svc.cluster.local:<port>
```

Applications need to talk to each other. This section covers how Pods discover and communicate with other services, both inside and outside the cluster.
Pods come and go. Their IP addresses change constantly. Services solve this by providing a stable IP address and DNS name that always routes to healthy Pods. You define a Service with a label selector. Kubernetes automatically maintains a list of matching Pods and load balances traffic between them. If a Pod dies and a new one starts, the Service automatically updates to send traffic to the new Pod.
| Type | Description |
|---|---|
ClusterIP | Internal-only (default). Accessible within cluster only. |
NodePort | Exposes on each node's IP at a static port (30000-32767). |
LoadBalancer | Provisions external load balancer (cloud or MetalLB). |
ExternalName | Maps to external DNS name (CNAME record). |
| Command | Description |
|---|---|
kubectl expose deploy <name> --port=80 | Create ClusterIP service |
kubectl get svc | List services |
kubectl get endpoints <svc> | Show which Pods back the service |
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: ClusterIP
  selector:
    app: nginx         # Must match Pod labels
  ports:
  - port: 80           # Service port
    targetPort: 8080   # Container port
```
Layer 7 (HTTP/HTTPS) routing rules. Instead of exposing multiple NodePorts, you expose one Ingress controller (Traefik, Nginx) and route by hostname or path: grafana.example.com → Grafana Service.
| Command | Description |
|---|---|
kubectl get ingress | List ingress resources |
kubectl describe ingress <name> | Show routing rules and backend services |
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls
```

Firewall rules for Pods. By default, all Pods can talk to all Pods. NetworkPolicies restrict this: "Only backend Pods can reach database on port 5432." Requires a CNI plugin that supports policies (Calico, Cilium).
| Command | Description |
|---|---|
kubectl get netpol | List network policies |
kubectl describe netpol <name> | Show policy rules |
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}   # Empty selector applies to all pods
  policyTypes:
  - Ingress         # Deny all incoming traffic
```

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - port: 5432
```
Every Service gets a DNS name automatically. CoreDNS runs as a Deployment in kube-system and resolves cluster DNS.
| DNS Pattern | Example |
|---|---|
| Same namespace | <service> |
| Cross namespace | <service>.<namespace> |
| Fully qualified | <service>.<namespace>.svc.cluster.local |
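For example, a Pod in one namespace can reach a Service in another via the fully qualified name; a sketch (the postgres Service and data namespace are hypothetical):

```yaml
env:
- name: DB_HOST
  # <service>.<namespace>.svc.cluster.local
  value: postgres.data.svc.cluster.local
```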
Pods are temporary. When they restart, they lose all data. This section covers how to store data that survives Pod restarts.
A PersistentVolume represents actual storage in the cluster (NFS share, cloud disk, or local SSD). Each PV has properties: capacity (size), accessModes (how it can be mounted), and storageClassName (type of storage). When created, a PV waits for a PVC that matches these properties to claim it. With dynamic provisioning (StorageClasses), you don't create PVs manually. They're created automatically when a PVC requests them.
| Command | Description |
|---|---|
kubectl get pv | List PersistentVolumes |
kubectl describe pv <name> | Show PV details and claim binding |
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/pv
```

A request for storage. When you create a PVC, Kubernetes looks for a PV that matches three criteria: (1) accessModes must match, (2) requested storage must fit within PV capacity, (3) storageClassName must match. If no PV exists and you specified a StorageClass, a new PV is created automatically. The PVC then binds to the PV, making it unavailable to other claims.
| Command | Description |
|---|---|
kubectl get pvc | List PVCs and their status |
kubectl describe pvc <name> | Show bound PV and events |
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: longhorn
```

Static Provisioning (Admin creates PV): an administrator creates PVs ahead of time, and each PVC binds to a matching one.

Dynamic Provisioning (StorageClass): the PVC names a StorageClass, and its provisioner creates a matching PV on demand.

Key Point: The Pod only references the PVC by name, never the PV directly.
Defines how to create storage automatically. A StorageClass specifies which provisioner to use (Longhorn, AWS EBS, GCP PD, etc.) and parameters like replication or disk type. When a PVC requests storageClassName: longhorn, the Longhorn provisioner creates a new PV automatically. No manual disk allocation needed.
| Command | Description |
|---|---|
kubectl get sc | List StorageClasses |
kubectl describe sc <name> | Show provisioner and parameters |
Longhorn (Distributed Storage):
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
reclaimPolicy: Delete
volumeBindingMode: Immediate
```

AWS EBS (Cloud Provider):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

Local SSD (Performance):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
# Requires pre-created PVs on nodes
```

Key Parameters:
• reclaimPolicy: Delete (default) or Retain
• volumeBindingMode: Immediate or WaitForFirstConsumer
• allowVolumeExpansion: Enable PVC resizing
Beyond PVCs, Kubernetes supports ephemeral and special volume types:
| Type | Use Case |
|---|---|
emptyDir | Scratch space shared between containers. Deleted when Pod dies. |
hostPath | Mounts directory from the node. Use sparingly (ties Pod to node). |
configMap / secret | Mount ConfigMap or Secret as files. |
projected | Combine multiple sources into one volume. Useful for service account tokens with additional metadata. |
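A projected volume sketch that combines a short-lived ServiceAccount token with a ConfigMap (the app-config name and expiry value are illustrative assumptions):

```yaml
volumes:
- name: combined
  projected:
    sources:
    - serviceAccountToken:
        path: token              # Token file name inside the mount
        expirationSeconds: 3600  # Rotated by kubelet before expiry
    - configMap:
        name: app-config         # hypothetical ConfigMap
```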
Volumes are defined at two levels in a Pod spec. Understanding how they connect is essential for working with storage.
| Level | Field | Purpose |
|---|---|---|
| Pod spec | volumes | Defines the volume source (PVC, ConfigMap, Secret, etc.) |
| Container spec | volumeMounts | Mounts the volume into the container at a specific path |
The name field is the link. The volumeMount references a volume by name, and that volume must be defined in the Pod's volumes array.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-volume-pod
spec:
  # Pod-level: Define the volumes
  volumes:
  - name: data                  # Reference name
    persistentVolumeClaim:
      claimName: my-pvc         # Actual PVC
  - name: config                # Reference name
    configMap:
      name: app-config          # Actual ConfigMap
  - name: secrets               # Reference name
    secret:
      secretName: app-secrets   # Actual Secret
  - name: cache                 # Reference name
    emptyDir: {}                # Ephemeral storage
  containers:
  - name: app
    image: myapp:latest
    # Container-level: Mount the volumes
    volumeMounts:
    - name: data                # Matches volumes[0].name
      mountPath: /data          # Where it appears in container
    - name: config              # Matches volumes[1].name
      mountPath: /etc/config    # Config files here
      readOnly: true            # ConfigMaps are read-only
    - name: secrets             # Matches volumes[2].name
      mountPath: /etc/secrets   # Secrets here
      readOnly: true            # Secrets are read-only
    - name: cache               # Matches volumes[3].name
      mountPath: /tmp/cache     # Temp storage here
```

Key points: The volume names (data, config, etc.) must match between volumes and volumeMounts. One volume can be mounted in multiple containers, and each container can choose its own mountPath.
Defines how many nodes can mount a volume simultaneously and in what mode. Choose based on your application architecture.
| Mode | Meaning | Use Case |
|---|---|---|
ReadWriteOnce (RWO) | Single node can mount read-write | Most common. Databases, single-instance apps. |
ReadOnlyMany (ROX) | Multiple nodes can mount read-only | Shared config files, static assets. |
ReadWriteMany (RWX) | Multiple nodes can mount read-write | Shared file storage (NFS, CephFS). |
Not every node is suitable for every workload. Sometimes you need Pods on machines with SSDs, or you want to spread replicas across different availability zones. This section covers how to influence where Pods run.
The simplest way to control which nodes run your Pods. Add labels to nodes, then specify those labels in your Pod spec. The scheduler only places the Pod on nodes that match all the labels you specify. For example, you might label certain nodes as having SSD storage, then use nodeSelector to ensure database Pods only run on those nodes.
```yaml
spec:
  nodeSelector:
    disktype: ssd
    zone: us-west-1a
```

| Command | Description |
|---|---|
kubectl label node <name> disktype=ssd | Add label to node |
kubectl get nodes --show-labels | Show all node labels |
A more powerful alternative to nodeSelector with three types. Node affinity works like nodeSelector but supports expressions and soft preferences. Pod affinity places Pods near other Pods, useful for co-locating related services. Pod anti-affinity spreads Pods across different nodes, ensuring high availability by preventing all replicas from running on the same machine.
| Type | Behavior |
|---|---|
requiredDuringScheduling | Hard requirement - must match or Pod won't schedule |
preferredDuringScheduling | Soft preference - scheduler tries but not guaranteed |
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west-1a
            - us-west-1b
```

```yaml
# Spread replicas across nodes
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname
```

Taints and tolerations work together to control Pod placement. Taints are applied to nodes and prevent Pods from scheduling there. Tolerations are applied to Pods and allow them to ignore specific taints. Control plane nodes use taints to prevent regular workloads from running on them. You can also use taints to reserve certain nodes for specific workloads, like GPU machines for machine learning tasks.
| Effect | Behavior |
|---|---|
NoSchedule | New Pods won't schedule without toleration |
PreferNoSchedule | Scheduler avoids but not guaranteed |
NoExecute | Evicts existing Pods without toleration |
| Command | Description |
|---|---|
kubectl taint node <n> key=value:NoSchedule | Add taint to node |
kubectl taint node <n> key:NoSchedule- | Remove taint (note the minus) |
```yaml
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```

Security in Kubernetes involves controlling who can access resources and how workloads run. This section covers authentication, authorisation, and workload security. These concepts appear in almost every real deployment, making them essential to understand early.
RBAC controls what users and applications can do in the cluster. The system works with four resources that define and assign permissions. First, a Role or ClusterRole defines what actions are allowed on which resources. Then, a RoleBinding or ClusterRoleBinding connects those permissions to users, groups, or ServiceAccounts. Think of it like giving someone a key card. The Role defines which doors the card opens, and the Binding gives the card to a specific person.
| Resource | Scope | Purpose |
|---|---|---|
Role | Namespace | Defines permissions within a namespace |
ClusterRole | Cluster-wide | Defines cluster-wide permissions |
RoleBinding | Namespace | Binds Role to users/groups/ServiceAccounts |
ClusterRoleBinding | Cluster-wide | Binds ClusterRole cluster-wide |
| Command | Description |
|---|---|
kubectl auth can-i create pods | Check if you can create pods |
kubectl auth can-i list secrets --as=system:serviceaccount:default:mysa | Check permissions as ServiceAccount |
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: ServiceAccount
  name: dev-sa
  namespace: dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
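For cluster-scoped resources like nodes, a namespaced Role is not enough; you need the ClusterRole/ClusterRoleBinding pair from the table above. A minimal sketch (names and namespace are hypothetical):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-viewer          # ClusterRoles have no namespace
rules:
- apiGroups: [""]
  resources: ["nodes"]       # nodes are cluster-scoped
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-nodes
subjects:
- kind: ServiceAccount
  name: monitoring-sa        # example ServiceAccount
  namespace: monitoring
roleRef:
  kind: ClusterRole
  name: node-viewer
  apiGroup: rbac.authorization.k8s.io
```

Note that a ClusterRole can also be bound with a plain RoleBinding, which grants its permissions within that one namespace only — a handy way to define permissions once and reuse them per namespace.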
Gives Pods an identity to authenticate with the Kubernetes API. Every namespace automatically gets a default ServiceAccount, but you should create custom ones for applications that need specific permissions. Combine ServiceAccounts with RBAC Roles to control what each application can access. For example, a backup tool might need permission to read Secrets, while a monitoring agent might only need to read Pod metrics.
| Command | Description |
|---|---|
kubectl create sa <name> | Create ServiceAccount |
kubectl get sa | List ServiceAccounts |
spec:
  serviceAccountName: my-custom-sa
  automountServiceAccountToken: false # Disable if the app never calls the API

Every RBAC rule has three components: apiGroups (which API group), resources (which objects), and verbs (which actions). Understanding these helps you write precise permissions without over-privileging.
| apiGroups | Common Resources |
|---|---|
"" (core) | pods, services, nodes, configmaps, secrets, namespaces |
apps | deployments, statefulsets, daemonsets, replicasets |
networking.k8s.io | ingresses, networkpolicies |
rbac.authorization.k8s.io | roles, clusterroles, rolebindings |
metrics.k8s.io | nodes, pods (CPU/memory metrics from metrics-server) |
batch | jobs, cronjobs |
storage.k8s.io | storageclasses, volumeattachments |
| Verb | Action | Use Case |
|---|---|---|
get | Read a single resource by name | View specific pod details |
list | List all resources of a type | Show all pods in namespace |
watch | Stream changes in real-time | Controllers watching for changes |
create | Create new resources | Deploy new applications |
update | Modify existing resources | Change pod labels or annotations |
patch | Partially modify resources | Update specific fields only |
delete | Delete a single resource | Remove pods or services |
Read-only access to pods:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]

Full admin on deployments:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["*"]

Metrics access (for monitoring tools):
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes", "pods"]
  verbs: ["get", "list"]

ConfigMap reader only:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]

Always test permissions before deploying. Use kubectl auth can-i to verify:
# Check your own permissions
kubectl auth can-i get pods
kubectl auth can-i create deployments

# Check permissions for a ServiceAccount
kubectl auth can-i get pods \
  --as=system:serviceaccount:default:my-sa

# Check permissions in a specific namespace
kubectl auth can-i list secrets -n production \
  --as=system:serviceaccount:monitoring:prometheus

Tip: If you get "no" or a 403 error, check that the ServiceAccount has the right Role/ClusterRole bound to it.
These topics focus on managing the cluster infrastructure itself.
The control plane is the brain of the cluster. These components run on the control plane nodes and manage the entire cluster. Understanding what each component does helps with troubleshooting cluster issues.
| Component | Purpose |
|---|---|
kube-apiserver | The front door. All kubectl commands and internal communication go through here. |
etcd | The database. Stores all cluster configuration and state. |
kube-scheduler | Decides which node should run each new Pod based on resources and constraints. |
kube-controller-manager | Runs background loops that ensure the actual state matches the desired state. |
kubelet | Runs on every node. Creates and manages containers, reports node health. |
kube-proxy | Runs on every node. Manages network rules for Services. |
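On a kubeadm-style cluster, you can see these components for yourself — the control plane runs as static Pods in the kube-system namespace, defined by manifest files on the node (paths below are kubeadm defaults):

```shell
# Control plane components appear as Pods in kube-system
kubectl get pods -n kube-system -o wide

# The kubelet creates them from static Pod manifests on the control plane node
ls /etc/kubernetes/manifests
# Typically: etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
```

This matters for troubleshooting: if the API server is down, kubectl stops working, but you can still fix things by editing the static Pod manifests directly — the kubelet watches that directory and restarts the components.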
The standard tool for creating and managing Kubernetes clusters. While you can use k3s for a homelab, the CKA exam tests kubeadm extensively. It handles initializing the control plane, joining worker nodes, and upgrading cluster versions.
| Command | Description |
|---|---|
kubeadm init | Initialize control plane node |
kubeadm join | Join worker node to cluster |
kubeadm upgrade plan | Check available upgrades |
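A cluster upgrade with kubeadm follows a fixed order: control plane first, then kubelets. A sketch of the control plane steps (the version and apt package names are examples — adjust for your target release and package manager):

```shell
# 1. See which versions are available
kubeadm upgrade plan

# 2. Upgrade the control plane components to the chosen version
kubeadm upgrade apply v1.30.1

# 3. Upgrade kubelet and kubectl on the node, then restart the kubelet
apt-get install -y kubelet=1.30.1-* kubectl=1.30.1-*
systemctl daemon-reload && systemctl restart kubelet
```

Worker nodes follow the same pattern but run `kubeadm upgrade node` instead of `upgrade apply`, and should be drained first (see node maintenance below).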
etcd stores everything about your cluster. All deployments, services, configurations, and secrets live there. Without backups, a corrupted etcd means rebuilding your entire cluster from scratch. Regular snapshots are essential for production environments.
| Command | Description |
|---|---|
ETCDCTL_API=3 etcdctl snapshot save backup.db | Create snapshot backup |
ETCDCTL_API=3 etcdctl snapshot restore backup.db | Restore from snapshot |
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

Take nodes offline safely without disrupting running applications. First cordon the node to prevent new Pods from scheduling there. Then drain the node to gracefully evict all running Pods, giving them time to shut down cleanly; they will be rescheduled on other nodes. Finally, uncordon the node when maintenance is complete so new Pods can run there again.
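Restoring is the mirror image: the restore command writes a fresh data directory, and you then point etcd at it. A sketch for a kubeadm cluster (the restore path is an example):

```shell
# Restore the snapshot into a NEW data directory
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restore

# Then edit the etcd static Pod manifest so its hostPath volume
# points at /var/lib/etcd-restore instead of /var/lib/etcd:
#   /etc/kubernetes/manifests/etcd.yaml
# The kubelet restarts etcd automatically once the manifest changes.
```

Practise this before you need it — a restore under pressure is not the time to discover the data-dir flag.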
| Command | Description |
|---|---|
kubectl cordon <node> | Mark node unschedulable (no new Pods) |
kubectl drain <node> --ignore-daemonsets | Evict Pods and cordon |
kubectl uncordon <node> | Mark node schedulable again |
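Putting the commands above together, a typical maintenance run looks like this (assuming a worker named worker-1):

```shell
# drain implies cordon, but cordoning first makes the intent explicit
kubectl cordon worker-1

# evict workloads; DaemonSet Pods are skipped, and emptyDir data is
# deleted if any Pods use it (hence the explicit flag)
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data

# ... perform OS patching, reboot, hardware work ...

# allow scheduling again once the node is healthy
kubectl uncordon worker-1
```

Note that uncordon does not move Pods back; they return only as new Pods are scheduled or existing ones are replaced.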
Advanced topics for building and deploying applications on Kubernetes.
Extend Kubernetes with your own resource types. This is how tools like ArgoCD add Application resources and Prometheus adds ServiceMonitor resources to the cluster. A CRD defines the schema, and a Controller watches for those resources and takes action. Together they form an Operator, which can automate complex application management.
| Command | Description |
|---|---|
kubectl get crd | List all Custom Resource Definitions |
kubectl api-resources | List all resources including CRDs |
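To make the CRD idea concrete, here is a minimal sketch of a definition for a hypothetical Backup resource (the example.com group and field names are invented for illustration):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.example.com      # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
  - name: v1
    served: true                 # this version is available via the API
    storage: true                # this version is persisted in etcd
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              schedule:
                type: string     # e.g. a cron expression
```

Once applied, `kubectl get backups` works like any built-in resource — but nothing *happens* until a controller watches for Backup objects and acts on them. The CRD is only the schema half of an Operator.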
ArgoCD is a GitOps tool that uses CRDs to manage deployments. This example shows an Application resource that tells ArgoCD to sync a folder from Git to your cluster:
apiVersion: argoproj.io/v1alpha1 # Custom API group from CRD
kind: Application                # Custom resource type
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myrepo.git
    targetRevision: HEAD      # Watch this Git branch
    path: apps/my-app         # Folder containing K8s manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: production     # Deploy to this namespace
  syncPolicy:
    automated:
      prune: true             # Delete resources removed from Git
      selfHeal: true          # Fix drift from desired state

Key points: This is not a built-in Kubernetes resource. The argoproj.io/v1alpha1 API group comes from the ArgoCD CRD. The ArgoCD controller watches for these resources and automatically syncs your cluster to match the Git repository.
The package manager for Kubernetes. Complex applications like databases or monitoring stacks might need dozens of YAML files. Helm packages these into charts with configurable values. You install a chart with a single command and customise it through values files. Helm tracks versions and supports rollbacks if an upgrade goes wrong.
| Command | Description |
|---|---|
helm install <name> <chart> | Install a chart |
helm upgrade <name> <chart> | Upgrade release |
helm list | List installed releases |
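A typical install flow adds a chart repository, installs with a custom values file, and keeps rollback as an escape hatch. A sketch (release name, namespace, and values file are examples; the prometheus-community repo is a commonly used public chart source):

```shell
# Register the chart repository and refresh the index
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install with overrides from a values file
helm install monitoring prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace -f values.yaml

# Inspect and, if an upgrade misbehaves, roll back to a previous revision
helm history monitoring -n monitoring
helm rollback monitoring 1 -n monitoring
```

The values file is the interface: keep it in Git and you get reproducible installs without touching the chart's dozens of templated manifests.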
| Strategy | Description |
|---|---|
| RollingUpdate | Default. Gradually replaces old Pods with new ones. |
| Recreate | Kill all old Pods, then create new ones (downtime). |
| Blue/Green | Run both versions, switch traffic at once (via Service selector). |
| Canary | Route small % of traffic to new version, gradually increase. |
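Blue/Green as described in the table needs no special tooling — it is just two Deployments carrying a version label, with the Service selector deciding which one receives traffic. A sketch (labels and ports are illustrative):

```yaml
# Two Deployments exist side by side, labelled
#   app: web, version: blue    (current)
#   app: web, version: green   (new release)
# The Service below sends ALL traffic to blue; editing the selector
# to version: green cuts traffic over at once, and back if needed.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: blue      # flip to "green" to switch
  ports:
  - port: 80
    targetPort: 8080
```

The trade-off versus RollingUpdate: the switch is instant and trivially reversible, but you pay for running both versions at full capacity during the transition.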
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # Max Pods over desired count during update
      maxUnavailable: 0   # Max Pods unavailable during update

Automatically scales the number of Pods in a Deployment, StatefulSet, or ReplicaSet based on observed metrics like CPU or memory utilisation. HPA is essential for production workloads to handle traffic spikes without manual intervention. Requires metrics-server to be installed in the cluster.
| Command | Description |
|---|---|
kubectl get hpa | List HPAs and current metrics |
kubectl describe hpa <name> | Show scaling events and conditions |
kubectl top pod | View CPU/memory usage (needs metrics-server) |
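For quick experiments, the same CPU-based autoscaler can be created imperatively instead of writing a manifest (deployment name is an example):

```shell
# Create an HPA targeting 70% average CPU across 2-10 replicas
kubectl autoscale deployment backend --min=2 --max=10 --cpu-percent=70
```

The imperative form only supports CPU targets; for memory or multiple metrics you need the full autoscaling/v2 manifest shown below.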
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Essential commands and shortcuts for daily Kubernetes operations.
| Shortcut | Full Name |
|---|---|
po | pods |
svc | services |
cm | configmaps |
secret | secrets |
ing | ingresses |
pvc | persistentvolumeclaims |
pv | persistentvolumes |
deploy | deployments |
sts | statefulsets |
ds | daemonsets |
rs | replicasets |
ns | namespaces |
netpol | networkpolicies |
sa | serviceaccounts |
| Flag | Description |
|---|---|
-o yaml | Output as YAML |
-o json | Output as JSON |
-o wide | Extra columns (node, IP) |
-o name | Just resource names |
--show-labels | Include all labels |
| Flag | Description |
|---|---|
-n <namespace> | Target specific namespace |
--all-namespaces or -A | All namespaces |
-l <selector> | Filter by label selector |
--selector=<selector> | Same as -l |
--dry-run=client | Preview changes without applying |
--field-selector=<field> | Filter by field (status.phase=Running) |
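These flags compose naturally. A sketch combining several of them (namespace and label values are examples):

```shell
# Running web Pods in production, with node and IP columns
kubectl get pods -n production -l app=web \
  --field-selector=status.phase=Running -o wide

# Preview a change without touching the cluster
kubectl apply -f manifest.yaml --dry-run=client -o yaml
```

Label selectors filter on metadata you control; field selectors filter on object state the API server tracks, so the two answer different questions.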
| Command | Purpose |
|---|---|
kubectl create -f manifest.yaml | Create resources from file |
kubectl apply -f manifest.yaml | Create or update resources |
kubectl delete -f manifest.yaml | Delete resources from file |
kubectl get all -n <ns> | List all resources in namespace |
kubectl describe <resource> <name> | Detailed info and events |
kubectl logs <pod> --previous | Logs from previous container instance |
kubectl exec -it <pod> -- /bin/sh | Shell into running container |
kubectl cp <pod>:/path ./local | Copy files from pod to local |
kubectl port-forward svc/<name> 8080:80 | Forward local port to service |
kubectl top pod | Show CPU/memory usage (needs metrics-server) |
kubectl explain <resource> | Show resource documentation |
kubectl api-resources | List all available resources |
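A time-saver worth adding to the list: combine --dry-run=client with -o yaml to generate manifest skeletons instead of writing them from scratch (names and images below are examples):

```shell
# Generate a Deployment manifest without creating anything
kubectl create deployment web --image=nginx --dry-run=client -o yaml > web-deploy.yaml

# Same trick for a one-off Pod skeleton
kubectl run debug --image=busybox --dry-run=client -o yaml
```

Edit the generated file, delete the empty status/creationTimestamp fields if you like, then `kubectl apply -f` it — much faster than recalling YAML structure from memory.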
Production-grade k3s homelab running on a repurposed laptop (32GB RAM, 512GB SSD) with Ubuntu Server 24.04 LTS.
| Layer | Technology |
|---|---|
| Kubernetes | k3s (lightweight distribution) |
| Remote Access | Tailscale (encrypted VPN) |
| GitOps | ArgoCD (Bootstrap Pattern) |
| Monitoring | kube-prometheus-stack (Prometheus, Grafana, Alertmanager) |
| Ingress | Traefik + Tailscale Operator |
| Secret Management | Sealed Secrets |
| Storage | Longhorn (distributed block storage) |
| Component | K8s Resource | Purpose |
|---|---|---|
| Grafana | Deployment | Stateless web dashboard |
| Prometheus | StatefulSet | Metrics storage with persistent data |
| Alertmanager | StatefulSet | Alert routing with state persistence |
| ArgoCD | Deployment | GitOps controller |
| All Services | ClusterIP | Internal communication |
| Web UIs | Ingress | HTTPS routing via Tailscale |
| Metrics/Logs | PVC | Longhorn-backed persistent storage |
| Apps | Namespace | monitoring, argocd, tailscale |
| Grafana Dashboards | ConfigMap | Version-controlled configuration |
| Credentials | SealedSecret | Git-safe encrypted secrets |
With the Bootstrap Pattern, deployments follow a simple flow: push YAML to Git and ArgoCD syncs automatically. No more kubectl apply commands.
| Directory | Contents |
|---|---|
apps/ | Application manifests (Deployments, Services) |
infrastructure/ | System manifests (monitoring, ingress, storage) |
argocd/ | Application CRDs (pointers to the above folders) |
The magic happens through a recursive pattern. You apply one file manually. Everything else happens automatically.
Step 1: Apply bootstrap.yaml once
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bootstrap
  namespace: argocd
spec:
  source:
    repoURL: https://github.com/your/repo
    path: argocd/       # Watch this folder
  destination:
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Step 2: ArgoCD reads the argocd/ folder
Inside argocd/ are more Application CRDs. For example, argocd/monitoring.yaml:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring
  namespace: argocd
spec:
  source:
    repoURL: https://github.com/your/repo
    path: infrastructure/monitoring/   # Points to manifests
  destination:
    namespace: monitoring
  syncPolicy:
    automated: {}   # empty object enables automated sync with defaults

The result: Bootstrap creates Applications that create more Applications. You push to Git, ArgoCD syncs everything. No more manual kubectl apply commands.