kubectl Cheatsheet: A Top-Down Flow Map for DevOps & SREs

Tags: Kubernetes, kubectl, DevOps, SRE, Cheatsheet

I keep forgetting the same flags. So here is the one place I will look from now on - kubectl at the top, everything I reach for flowing down. The diagrams are written in Mermaid, which means GitHub renders them natively, my Astro blog renders them client-side, and any decent markdown viewer can do the same.

Heavy bias towards the day-to-day: get, describe, logs, exec, rollout, drain. The exotic stuff (kustomize, convert, krew plugins) is here too, but at lower density.


The whole thing in one picture

flowchart TD
    K(["<b>kubectl</b>"])

    K --> C1["Cluster &amp; Config"]
    K --> C2["View / Inspect"]
    K --> C3["Create / Apply"]
    K --> C4["Modify"]
    K --> C5["Delete"]
    K --> C6["Debug"]
    K --> C7["Rollout"]
    K --> C8["Scale"]
    K --> C9["Nodes"]
    K --> C10["Auth / RBAC"]
    K --> C11["Networking"]
    K --> C12["Output / Flags"]

    C1 --> C1a["cluster-info<br/>version<br/>api-resources<br/>config *"]
    C2 --> C2a["get<br/>describe<br/>top<br/>explain<br/>events"]
    C3 --> C3a["apply -f / -k<br/>create<br/>run<br/>expose"]
    C4 --> C4a["edit<br/>set<br/>patch<br/>replace<br/>label / annotate"]
    C5 --> C5a["delete -f<br/>delete RES NAME<br/>delete -l<br/>--force --grace-period=0"]
    C6 --> C6a["logs<br/>exec<br/>port-forward<br/>cp<br/>debug<br/>attach"]
    C7 --> C7a["rollout status<br/>rollout history<br/>rollout undo<br/>rollout restart<br/>rollout pause/resume"]
    C8 --> C8a["scale<br/>autoscale"]
    C9 --> C9a["cordon / uncordon<br/>drain<br/>taint<br/>top node"]
    C10 --> C10a["auth can-i<br/>auth whoami<br/>certificate"]
    C11 --> C11a["port-forward<br/>proxy"]
    C12 --> C12a["-o yaml/json/wide<br/>-o jsonpath<br/>-o custom-columns<br/>-l selector<br/>--field-selector<br/>-A / -n<br/>-w / --watch"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    classDef cat fill:#1f2a44,stroke:#5b8def,color:#fff;
    class K root;
    class C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12 cat;

Each section below zooms into one of those branches.


1. Cluster & config

Where am I? What am I talking to? What can it do?

flowchart TD
    K(["kubectl"]) --> CI["cluster-info"]
    K --> VR["version"]
    K --> AR["api-resources"]
    K --> AV["api-versions"]
    K --> CFG["config"]

    CI --> CID["dump<br/><i>full diagnostic dump</i>"]
    VR --> VRS["--client<br/>-o yaml | json"]

    AR --> ARN["--namespaced=true|false"]
    AR --> ARV["--verbs=get,list,watch"]
    AR --> ARO["-o name"]

    CFG --> CFV["view<br/>--minify --raw"]
    CFG --> CFC["current-context"]
    CFG --> CFG1["get-contexts"]
    CFG --> CFU["use-context NAME"]
    CFG --> CFS["set-context --current<br/>--namespace=NS"]
    CFG --> CFR["rename-context OLD NEW"]
    CFG --> CFD["delete-context NAME"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
kubectl cluster-info
kubectl version   # --short was removed in kubectl 1.28; the short form is now the default
kubectl api-resources --namespaced=true -o name
kubectl config get-contexts
kubectl config use-context prod-eu-west-1
kubectl config set-context --current --namespace=payments

Pin the namespace to the context once and stop typing -n payments for the rest of the day.
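
If you hop between namespaces a lot, that one-liner is worth wrapping. Here is a minimal sketch, assuming bash; `kns` is a name I made up, not a kubectl feature.

```shell
# kns: hypothetical helper (not part of kubectl).
#   kns            -> print the namespace pinned to the current context
#   kns payments   -> pin "payments" to the current context
kns() {
  if [ "$#" -eq 0 ]; then
    # --minify trims the kubeconfig down to the current context only
    kubectl config view --minify -o jsonpath='{..namespace}'
    echo
  else
    kubectl config set-context --current --namespace="$1"
  fi
}
```

With no argument it prints whatever is pinned, which will be empty if the context has never had a namespace set.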


2. Viewing & inspecting

The bread-and-butter loop: get → describe → logs.

flowchart TD
    K(["kubectl"]) --> GET["get"]
    K --> DSC["describe"]
    K --> TOP["top"]
    K --> EXP["explain"]
    K --> EVT["events"]

    GET --> GR["RESOURCE [NAME]<br/>pods, deploy, svc, ing,<br/>cm, secret, pv, pvc,<br/>node, ns, sa, role, rb,<br/>hpa, pdb, job, cj, ds, sts"]
    GET --> GF["-o wide<br/>-o yaml / json<br/>-o name<br/>-o jsonpath<br/>-o custom-columns"]
    GET --> GS["-l app=foo<br/>--field-selector status.phase=Running<br/>--show-labels<br/>--sort-by=.metadata.creationTimestamp"]
    GET --> GW["-w / --watch<br/>--watch-only"]
    GET --> GA["-A / --all-namespaces<br/>-n NS"]

    DSC --> DR["RESOURCE NAME<br/><i>events, conditions, env</i>"]
    TOP --> TP["pod [-A] [--containers]"]
    TOP --> TN["node"]

    EXP --> EX["RESOURCE.field<br/>--recursive"]

    EVT --> EVS["--sort-by=.lastTimestamp<br/>--for pod/NAME<br/>--types=Warning"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Wide pod listing across the cluster, sorted by node
kubectl get pods -A -o wide --sort-by=.spec.nodeName

# Pods whose phase is not Running or Succeeded (a CrashLoopBackOff pod still reports Running, so check RESTARTS too)
kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded

# Last 50 cluster events (sorted ascending, so the newest are at the bottom)
kubectl get events -A --sort-by=.lastTimestamp | tail -n 50

# What fields does a Deployment have?
kubectl explain deployment.spec.strategy --recursive

# Per-pod CPU/mem
kubectl top pod -A --containers --sort-by=memory

describe is where the truth lives - look at the Events: section before anything else.
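
Since the Events: block sits at the very bottom of the describe output, I sometimes cut straight to it. This is a rough sketch, assuming the section starts with a line beginning `Events:` (which is how describe prints it today); `kevents` is a hypothetical name.

```shell
# kevents: hypothetical helper that prints only the Events: section
# of `kubectl describe`, e.g. kevents pod web-abc123 -n payments
kevents() {
  # sed prints from the line starting with "Events:" through the end of the output
  kubectl describe "$@" | sed -n '/^Events:/,$p'
}
```

`kubectl events --for pod/NAME` from the diagram gets you similar information without the text scraping, on kubectl versions that ship the events command.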


3. Creating & applying

flowchart TD
    K(["kubectl"]) --> APP["apply"]
    K --> CRT["create"]
    K --> RUN["run"]
    K --> EXP["expose"]
    K --> DIF["diff"]

    APP --> AF["-f file.yaml<br/>-f dir/<br/>-R recursive"]
    APP --> AK["-k overlay/<br/><i>kustomize</i>"]
    APP --> AS["--server-side<br/>--force-conflicts<br/>--field-manager=NAME"]
    APP --> APR["--prune -l app=foo<br/>--dry-run=client|server"]

    CRT --> CD["deployment NAME --image=IMG<br/>--replicas=N --port=P"]
    CRT --> CS["secret generic NAME<br/>--from-literal=k=v<br/>--from-file=path"]
    CRT --> CC["configmap NAME<br/>--from-literal --from-file --from-env-file"]
    CRT --> CSA["serviceaccount NAME"]
    CRT --> CRB["rolebinding NAME<br/>--role=R --user=U --serviceaccount=NS:SA"]
    CRT --> CJ["job NAME --image=IMG -- cmd"]
    CRT --> CCJ["cronjob NAME --image=IMG --schedule='*/5 * * * *'"]

    RUN --> RP["NAME --image=IMG<br/>--restart=Never<br/>--rm -it -- sh"]

    EXP --> EXD["deployment NAME<br/>--port=80 --target-port=8080<br/>--type=ClusterIP|NodePort|LoadBalancer"]

    DIF --> DF["-f file.yaml<br/><i>show drift before apply</i>"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# The two commands that pay rent
kubectl apply -f manifests/
kubectl apply -k overlays/prod

# Server-side apply with explicit field ownership
kubectl apply -f deploy.yaml --server-side --field-manager=ci

# Throwaway debug shell on the cluster network
kubectl run tmp --rm -it --image=nicolaka/netshoot --restart=Never -- bash

# Generate manifests instead of creating - great for committing to git
kubectl create deployment web --image=nginx --replicas=3 --dry-run=client -o yaml > web.yaml

Run kubectl diff -f file.yaml before every apply against a real cluster. It will save you at least one outage per quarter.
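
To make the habit stick, the diff and the apply can be chained. This is a sketch only: `diff_then_apply` is a hypothetical wrapper, and it relies on the documented exit codes of kubectl diff (0 = no differences, 1 = differences found, above 1 = an error).

```shell
# diff_then_apply: hypothetical wrapper, not a kubectl command.
# Shows the diff, then asks before applying when something would change.
diff_then_apply() {
  local file="$1" rc=0 answer
  kubectl diff -f "$file" || rc=$?
  case "$rc" in
    0) echo "no drift, nothing to apply" ;;
    1) printf 'apply these changes? [y/N] ' >&2
       read -r answer
       if [ "$answer" = "y" ]; then
         kubectl apply -f "$file"
       else
         echo "skipped"
       fi ;;
    *) echo "kubectl diff failed (exit $rc)" >&2
       return "$rc" ;;
  esac
}
```

Treating any exit code above 1 as a hard stop matters: a failed diff tells you nothing about drift, so applying after it would defeat the point.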


4. Modifying in place

flowchart TD
    K(["kubectl"]) --> ED["edit"]
    K --> SET["set"]
    K --> PT["patch"]
    K --> RP["replace"]
    K --> LB["label"]
    K --> AN["annotate"]
    K --> SC["scale"]

    ED --> ER["RESOURCE NAME<br/>--output-patch<br/>--save-config"]

    SET --> SI["image deploy/NAME<br/>container=IMG"]
    SET --> SE["env deploy/NAME<br/>KEY=VAL --from=cm/NAME --from=secret/NAME"]
    SET --> SR["resources deploy/NAME<br/>--limits=cpu=1,memory=1Gi<br/>--requests=cpu=100m,memory=128Mi"]
    SET --> SS["serviceaccount deploy/NAME SA"]
    SET --> SSEL["selector svc NAME k=v"]

    PT --> PJ["-p '{...}' --type=strategic|merge|json"]
    RP --> RF["-f file.yaml<br/>--force <i>delete + recreate</i>"]

    LB --> LR["RESOURCE NAME k=v<br/>k- <i>remove</i><br/>--overwrite<br/>--all"]
    AN --> AR["RESOURCE NAME k=v<br/>k- <i>remove</i>"]

    SC --> SCR["deploy/NAME --replicas=N<br/>--current-replicas=N <i>guard</i>"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Rotate the image without touching git
kubectl set image deploy/web web=ghcr.io/me/web:v1.2.3

# Give the JVM more room: raise memory requests/limits on the Deployment (this triggers a rollout)
kubectl set resources deploy/api --requests=memory=512Mi --limits=memory=1Gi

# JSON-patch a single field
kubectl patch deploy/web --type=json \
  -p='[{"op":"replace","path":"/spec/replicas","value":5}]'

# Strategic merge patch from a file
kubectl patch deploy/web --patch-file patch.yaml

# Quarantine a pod for forensics - relabel so the Service stops sending traffic
kubectl label pod web-abc123 app- quarantined=true --overwrite

label with the dash suffix (app-) removes the label. That trick - relabel a misbehaving pod off the Service so it stops taking traffic but stays alive for kubectl exec - has saved me more than once.


5. Deleting

flowchart TD
    K(["kubectl"]) --> DEL["delete"]
    DEL --> DF["-f file.yaml<br/>-k overlay/"]
    DEL --> DR["RESOURCE NAME<br/>RESOURCE/NAME"]
    DEL --> DS["-l app=foo<br/>--field-selector=..."]
    DEL --> DA["--all<br/>--all-namespaces"]
    DEL --> DG["--grace-period=N<br/>--force <i>(0 + force = drop from the API now, no kubelet confirmation)</i>"]
    DEL --> DC["--cascade=background|foreground|orphan"]
    DEL --> DW["--wait=true|false<br/>--timeout=30s"]
    DEL --> DN["--now <i>same as grace 1s</i>"]
    DEL --> DI["--ignore-not-found"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Force-delete a stuck pod (last resort - the API object goes away at once, but the kubelet may not have released resources)
kubectl delete pod web-abc123 --grace-period=0 --force

# Purge everything matching a label, in every namespace
kubectl delete deploy,svc,cm,secret -l owner=ephemeral -A

# Delete by file but ignore "not found" so it is idempotent in CI
kubectl delete -f manifests/ --ignore-not-found=true

# Orphan children instead of cascade-deleting them
kubectl delete deploy web --cascade=orphan

--cascade=orphan is invaluable when migrating ownership - delete the Deployment but keep the ReplicaSet and Pods alive, then re-adopt them.


6. Debugging pods

This is the section I open most often.

flowchart TD
    K(["kubectl"]) --> LOG["logs"]
    K --> EXC["exec"]
    K --> PF["port-forward"]
    K --> CP["cp"]
    K --> DBG["debug"]
    K --> ATC["attach"]

    LOG --> LP["POD<br/>deploy/NAME<br/>job/NAME"]
    LOG --> LF["-f / --follow"]
    LOG --> LC["-c CONTAINER<br/>--all-containers<br/>--prefix"]
    LOG --> LH["--previous <i>last crash</i>"]
    LOG --> LT["--tail=N<br/>--since=10m<br/>--since-time=RFC3339<br/>--timestamps"]
    LOG --> LSEL["-l app=foo<br/>--max-log-requests=N"]

    EXC --> EI["-it POD -- sh"]
    EXC --> EC2["-c CONTAINER"]
    EXC --> ES["-- env<br/>-- ps -ef<br/>-- cat /etc/...<br/>-- curl localhost:PORT"]

    PF --> PFP["POD / svc/NAME / deploy/NAME"]
    PF --> PFM["LOCAL:REMOTE<br/>:REMOTE <i>random local</i>"]
    PF --> PFA["--address 0.0.0.0"]

    CP --> CPS["NS/POD:/path ./local"]
    CP --> CPD["./local NS/POD:/path"]
    CP --> CPC["-c CONTAINER<br/>--retries=N"]

    DBG --> DBP["POD --image=busybox<br/>--target=CONTAINER <i>shared PID ns</i>"]
    DBG --> DBN["node/NODE --image=ubuntu<br/><i>chroot /host</i>"]
    DBG --> DBC["--copy-to=NEW --set-image=*=IMG<br/>--share-processes <i>only with --copy-to</i><br/><i>clone &amp; tweak</i>"]

    ATC --> ATP["POD -c CONTAINER -i -t"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Logs from the previous container instance after a crashloop
kubectl logs deploy/api --previous --tail=200

# Live tail across every pod behind a Deployment
kubectl logs -f -l app=api --max-log-requests=20 --prefix

# Drop into a running container
kubectl exec -it deploy/api -c app -- bash

# Forward a Service port locally
kubectl port-forward svc/grafana 3000:80

# Copy a heap dump out of a pod for offline analysis
kubectl cp prod/api-7d-abc:/tmp/heap.hprof ./heap.hprof -c app

# Ephemeral debug container next to a running pod (no rebuild, no restart)
kubectl debug -it api-7d-abc --image=nicolaka/netshoot --target=app

# Debug the node itself
kubectl debug node/ip-10-0-1-23 -it --image=ubuntu

kubectl debug is the single most underused command. Distroless image with no shell? Attach a netshoot ephemeral container in the same PID namespace and you can strace the real process.


7. Rollouts

flowchart TD
    K(["kubectl"]) --> RO["rollout"]
    RO --> RS["status deploy/NAME<br/>-w / --watch<br/>--timeout=10m"]
    RO --> RH["history deploy/NAME<br/>--revision=N <i>show one</i>"]
    RO --> RU["undo deploy/NAME<br/>--to-revision=N"]
    RO --> RR["restart deploy/NAME<br/>ds/NAME  sts/NAME"]
    RO --> RP["pause deploy/NAME"]
    RO --> RZ["resume deploy/NAME"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Wait for a deploy to become healthy (CI uses this)
kubectl rollout status deploy/api --timeout=5m

# Which revisions exist, and what did one of them actually change?
kubectl rollout history deploy/api
kubectl rollout history deploy/api --revision=42

# Roll back one revision
kubectl rollout undo deploy/api

# Roll-restart all pods (reads new ConfigMap/Secret without changing the image)
kubectl rollout restart deploy/api

# Pause mid-rollout, change something, resume - bundles multiple set commands into one rollout
kubectl rollout pause deploy/api
kubectl set image deploy/api app=img:v2
kubectl set env  deploy/api LOG_LEVEL=debug
kubectl rollout resume deploy/api

Pause / change / resume is the cleanest way to apply a multi-field change as a single rollout instead of triggering N successive rollouts.
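
The one risk with this pattern is forgetting the resume. Below is a small sketch that always resumes, assuming bash; `batch_rollout` and `api_changes` are hypothetical names, and the image and env values are the placeholders from the example above.

```shell
# batch_rollout: hypothetical wrapper. Pauses TARGET, runs the given
# command (usually a function holding several `kubectl set ...` calls),
# then resumes whether or not the changes succeeded, so the Deployment
# is never left paused by accident.
batch_rollout() {
  local target="$1" rc=0
  shift
  kubectl rollout pause "$target" || return
  "$@" || rc=$?
  kubectl rollout resume "$target" || rc=$?
  return "$rc"
}

# Example change set (image and env values are placeholders)
api_changes() {
  kubectl set image deploy/api app=img:v2 &&
    kubectl set env deploy/api LOG_LEVEL=debug
}
```

Call it as `batch_rollout deploy/api api_changes`. Resuming after a failed change means a partial change set still rolls out. If you would rather inspect first, drop the unconditional resume and resume by hand.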


8. Scaling

flowchart TD
    K(["kubectl"]) --> SC["scale"]
    K --> AS["autoscale"]

    SC --> SCR["deploy/NAME --replicas=N<br/>rs/NAME  sts/NAME  rc/NAME"]
    SC --> SCG["--current-replicas=N<br/><i>refuse if mismatch</i>"]
    SC --> SCT["--timeout=30s"]

    AS --> ASR["deploy/NAME<br/>--min=N --max=M<br/>--cpu-percent=70"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
kubectl scale deploy/api --replicas=10
kubectl scale deploy/api --replicas=10 --current-replicas=5   # refuses if drifted
kubectl autoscale deploy/api --min=2 --max=20 --cpu-percent=70

--current-replicas is your concurrency guard. Use it in any script that scales by name without holding a lock.
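
In a script, the guard pairs naturally with a read of the current count. This is a sketch under the assumption that the target exposes .spec.replicas (Deployments, ReplicaSets and StatefulSets do); `scale_by` is a hypothetical name.

```shell
# scale_by: hypothetical helper. Reads the current replica count, then
# scales relative to it. --current-replicas makes the scale fail if
# someone else changed the count between the read and the write.
scale_by() {
  local target="$1" delta="$2" current
  current="$(kubectl get "$target" -o jsonpath='{.spec.replicas}')" || return
  kubectl scale "$target" \
    --current-replicas="$current" \
    --replicas="$((current + delta))"
}
```

`scale_by deploy/api 3` adds three replicas and `scale_by deploy/api -2` removes two. If an autoscaler or a colleague got there first, the scale is refused rather than silently overwriting their change.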


9. Node operations

flowchart TD
    K(["kubectl"]) --> CD["cordon NODE"]
    K --> UC["uncordon NODE"]
    K --> DR["drain NODE"]
    K --> TN["taint nodes NODE"]
    K --> TPN["top node"]
    K --> GTN["get nodes"]

    DR --> DR1["--ignore-daemonsets"]
    DR --> DR2["--delete-emptydir-data"]
    DR --> DR3["--force <i>evict unmanaged pods</i>"]
    DR --> DR4["--grace-period=N<br/>--timeout=5m"]
    DR --> DR5["--pod-selector='app!=critical'"]
    DR --> DR6["--disable-eviction <i>skip PDB</i>"]

    TN --> T1["key=value:NoSchedule"]
    TN --> T2["key=value:PreferNoSchedule"]
    TN --> T3["key=value:NoExecute"]
    TN --> T4["key:effect- <i>remove taint</i>"]

    GTN --> GTN1["-o wide<br/>-l role=worker<br/>--show-labels"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Standard node drain - what you do before patching the host
kubectl cordon ip-10-0-1-23
kubectl drain ip-10-0-1-23 --ignore-daemonsets --delete-emptydir-data --timeout=10m
# ... do work ...
kubectl uncordon ip-10-0-1-23

# Add a taint so only tolerating workloads land here
kubectl taint nodes gpu-0 dedicated=ml:NoSchedule

# Remove that taint
kubectl taint nodes gpu-0 dedicated:NoSchedule-

# Pressure check
kubectl top nodes
kubectl get nodes -o wide --sort-by=.status.capacity.memory

Always test drain with --dry-run=server first if you have PDBs - it tells you which pods would block the drain before you commit.
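
That check is easy to bake into a wrapper so the real drain never runs unless the rehearsal passed. Treat this as a sketch: `safe_drain` is a hypothetical name and the flags are simply the ones from the example above.

```shell
# safe_drain: hypothetical wrapper around the drain shown above.
# Step 1 is a server-side dry run; the real drain only starts if it passes.
safe_drain() {
  local node="$1"
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data \
    --dry-run=server || {
    echo "dry run failed for $node, not draining" >&2
    return 1
  }
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=10m
}
```

A passing dry run is a snapshot, not a guarantee: pods and PDB budgets can change between the two calls, so the --timeout on the real drain still matters.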


10. Auth & RBAC

flowchart TD
    K(["kubectl"]) --> AU["auth"]
    K --> CRT["certificate"]

    AU --> AC["can-i VERB RESOURCE<br/>--namespace=NS<br/>--as=USER<br/>--as-group=GRP<br/>--all-namespaces<br/>--list"]
    AU --> AW["whoami"]
    AU --> AR["reconcile -f rbac.yaml<br/><i>create/update RBAC</i><br/>--remove-extra-permissions<br/>--remove-extra-subjects"]

    CRT --> CA["approve CSR-NAME"]
    CRT --> CD["deny CSR-NAME"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Can I do the thing?
kubectl auth can-i delete pods -n prod
kubectl auth can-i '*' '*' --all-namespaces

# Can the CI service account do the thing?
kubectl auth can-i create deployments -n prod \
  --as=system:serviceaccount:ci:deployer

# Who am I right now?
kubectl auth whoami

# List every permission I have in this namespace
kubectl auth can-i --list -n payments

--as impersonation against auth can-i is the safest way to debug RBAC without giving anyone production credentials.
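
The service account username is long enough that I mistype it, so here is a sketch that builds it for me. `can_sa` is a hypothetical helper, and it assumes your own user is allowed to impersonate service accounts.

```shell
# can_sa: hypothetical helper. Asks "can service account SA in namespace
# SA_NS do VERB on RESOURCE in TARGET_NS?" via impersonation.
# TARGET_NS defaults to the service account's own namespace.
can_sa() {
  local sa_ns="$1" sa="$2" verb="$3" resource="$4" target_ns="${5:-$1}"
  kubectl auth can-i "$verb" "$resource" -n "$target_ns" \
    --as="system:serviceaccount:${sa_ns}:${sa}"
}
```

`can_sa ci deployer create deployments prod` reproduces the check above. kubectl auth can-i prints yes or no and exits non-zero on no, so the helper also works inside an if.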


11. Networking access

flowchart TD
    K(["kubectl"]) --> PF["port-forward"]
    K --> PRX["proxy"]

    PF --> PF1["pod/NAME<br/>svc/NAME<br/>deploy/NAME"]
    PF --> PF2["LOCAL:REMOTE<br/>:REMOTE <i>random local</i><br/>multiple pairs ok"]
    PF --> PF3["--address 0.0.0.0<br/><i>expose beyond localhost</i>"]

    PRX --> PR1["--port=8001<br/>--address=127.0.0.1"]
    PRX --> PR2["--accept-paths<br/>--reject-paths<br/>--api-prefix=/k8s"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Forward a Service port - works through the API server, no LB needed
kubectl port-forward svc/postgres 5432:5432

# Hit the API directly - everything kubectl does is just HTTP
kubectl proxy --port=8001 &
curl http://localhost:8001/api/v1/namespaces/default/pods

kubectl proxy plus curl is the cleanest way to learn the API. Watch the request kubectl actually makes with -v=8.


12. Output, selectors, and the flags you forget

flowchart TD
    K(["kubectl"]) --> OUT["-o / --output"]
    K --> SEL["selectors"]
    K --> NS["namespace"]
    K --> WAT["watch"]
    K --> DRY["dry-run"]
    K --> VRB["verbosity"]

    OUT --> O1["yaml | json<br/>name | wide"]
    OUT --> O2["jsonpath='{.items[*].metadata.name}'"]
    OUT --> O3["jsonpath-as-json"]
    OUT --> O4["go-template / go-template-file"]
    OUT --> O5["custom-columns=NAME:.metadata.name,IP:.status.podIP"]
    OUT --> O6["custom-columns-file=cols.txt"]

    SEL --> S1["-l key=value<br/>-l 'key in (a,b)'<br/>-l 'key!=value'<br/>-l 'key'  <i>exists</i><br/>-l '!key' <i>not exists</i>"]
    SEL --> S2["--field-selector status.phase=Running<br/>--field-selector spec.nodeName=NODE<br/>--field-selector metadata.namespace!=kube-system"]

    NS --> N1["-n NS<br/>-A / --all-namespaces"]

    WAT --> W1["-w / --watch<br/>--watch-only"]
    WAT --> W2["wait --for=condition=Ready pod/NAME<br/>--for=delete<br/>--timeout=5m"]

    DRY --> D1["--dry-run=client<br/>--dry-run=server<br/>--validate=strict"]

    VRB --> V1["-v=6 <i>requests</i><br/>-v=7 <i>+ request headers</i><br/>-v=8 <i>+ bodies, truncated</i><br/>-v=9 <i>bodies in full</i>"]

    classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
    class K root;
# Just the pod names
kubectl get pods -o jsonpath='{.items[*].metadata.name}'

# Pods + node assignment as a table
kubectl get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,IP:.status.podIP

# Wait until ready
kubectl wait --for=condition=Available deploy/api --timeout=5m
kubectl wait --for=delete pod/web-abc123 --timeout=2m

# Render a manifest server-side without persisting it - schema validation included
kubectl apply -f deploy.yaml --dry-run=server -o yaml

# What is kubectl actually sending? (debug RBAC, networking, webhooks)
kubectl get pods -v=8

kubectl wait is the right answer in scripts. Polling get in a loop is what people write before they discover it.
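
What makes wait script-friendly is its exit code: non-zero when the timeout expires. Here is a minimal sketch of using that in CI; `wait_available` is a hypothetical name and the 5m default is arbitrary.

```shell
# wait_available: hypothetical wrapper for CI scripts. Blocks until the
# target reports condition Available, and fails loudly on timeout.
wait_available() {
  local target="$1" timeout="${2:-5m}"
  if kubectl wait --for=condition=Available "$target" --timeout="$timeout"; then
    echo "available: $target"
  else
    echo "not available after $timeout: $target" >&2
    return 1
  fi
}
```

Something like `kubectl apply -f deploy.yaml && wait_available deploy/api 10m` then fails the pipeline step on its own, with no sleep loop to tune.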


A few more that did not fit cleanly

# Render kustomize without applying
kubectl kustomize overlays/prod

# What would change on the server if I applied this file?
kubectl diff -f deploy.yaml

# Convert v1beta1 manifests to a current API version (needs the separate kubectl-convert plugin; convert is no longer built into kubectl)
kubectl convert -f legacy.yaml --output-version apps/v1

# Plugin manager (krew) - for kubectl-tree, kubectl-neat, kubectl-stern, etc
kubectl krew install tree
kubectl tree deploy api

# Load bash completion into the current shell
source <(kubectl completion bash)

# Alias I cannot live without
alias k=kubectl
complete -o default -F __start_kubectl k

How this page renders

The diagrams are written as standard ```mermaid fences in the markdown source. GitHub renders Mermaid natively, so the file in the repo renders identically there. On this site, a tiny script imports mermaid from a CDN, walks every pre.mermaid block, and swaps in an SVG. The same source works on GitHub, on this site, and in any Mermaid-aware markdown viewer, with no images committed to git.

If you spot a command I missed - especially a flag - send it my way and I will add it to the diagram.