kubectl Cheatsheet: A Top-Down Flow Map for DevOps & SREs
I keep forgetting the same flags. So here is the one place I will look from now on - kubectl at the top, everything I reach for flowing down. The diagrams are written in Mermaid, which means GitHub renders them natively, my Astro blog renders them client-side, and any decent markdown viewer can do the same.
Heavy bias towards the day-to-day: get, describe, logs, exec, rollout, drain. The exotic stuff (kustomize, convert, krew plugins) is here too, at lower density.
The whole thing in one picture
flowchart TD
K(["<b>kubectl</b>"])
K --> C1["Cluster & Config"]
K --> C2["View / Inspect"]
K --> C3["Create / Apply"]
K --> C4["Modify"]
K --> C5["Delete"]
K --> C6["Debug"]
K --> C7["Rollout"]
K --> C8["Scale"]
K --> C9["Nodes"]
K --> C10["Auth / RBAC"]
K --> C11["Networking"]
K --> C12["Output / Flags"]
C1 --> C1a["cluster-info<br/>version<br/>api-resources<br/>config *"]
C2 --> C2a["get<br/>describe<br/>top<br/>explain<br/>events"]
C3 --> C3a["apply -f / -k<br/>create<br/>run<br/>expose"]
C4 --> C4a["edit<br/>set<br/>patch<br/>replace<br/>label / annotate"]
C5 --> C5a["delete -f<br/>delete RES NAME<br/>delete -l<br/>--force --grace-period=0"]
C6 --> C6a["logs<br/>exec<br/>port-forward<br/>cp<br/>debug<br/>attach"]
C7 --> C7a["rollout status<br/>rollout history<br/>rollout undo<br/>rollout restart<br/>rollout pause/resume"]
C8 --> C8a["scale<br/>autoscale"]
C9 --> C9a["cordon / uncordon<br/>drain<br/>taint<br/>top node"]
C10 --> C10a["auth can-i<br/>auth whoami<br/>certificate"]
C11 --> C11a["port-forward<br/>proxy"]
C12 --> C12a["-o yaml/json/wide<br/>-o jsonpath<br/>-o custom-columns<br/>-l selector<br/>--field-selector<br/>-A / -n<br/>-w / --watch"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
classDef cat fill:#1f2a44,stroke:#5b8def,color:#fff;
class K root;
class C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12 cat;
Each section below zooms into one of those branches.
1. Cluster & config
Where am I? What am I talking to? What can it do?
flowchart TD
K(["kubectl"]) --> CI["cluster-info"]
K --> VR["version"]
K --> AR["api-resources"]
K --> AV["api-versions"]
K --> CFG["config"]
CI --> CID["dump<br/><i>full diagnostic dump</i>"]
VR --> VRS["--client<br/>-o yaml|json"]
AR --> ARN["--namespaced=true|false"]
AR --> ARV["--verbs=get,list,watch"]
AR --> ARO["-o name"]
CFG --> CFV["view<br/>--minify --raw"]
CFG --> CFC["current-context"]
CFG --> CFG1["get-contexts"]
CFG --> CFU["use-context NAME"]
CFG --> CFS["set-context --current<br/>--namespace=NS"]
CFG --> CFR["rename-context OLD NEW"]
CFG --> CFD["delete-context NAME"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
kubectl cluster-info
kubectl version   # short output is the default on recent kubectl; the old --short flag has been removed
kubectl api-resources --namespaced=true -o name
kubectl config get-contexts
kubectl config use-context prod-eu-west-1
kubectl config set-context --current --namespace=payments
Pin the namespace to the context once and stop typing -n payments for the rest of the day.
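A minimal sketch of that habit as a shell helper. The name kns is hypothetical and not part of kubectl; it only wraps the two config subcommands, and it assumes kubectl is on the PATH with a current context set.

```bash
# kns is a hypothetical helper name, not a kubectl feature.
# No argument: print the namespace pinned to the current context.
# One argument: pin that namespace to the current context.
kns() {
  if [ "$#" -eq 0 ]; then
    kubectl config view --minify -o jsonpath='{..namespace}'
    echo   # jsonpath output has no trailing newline
  else
    kubectl config set-context --current --namespace="$1"
  fi
}

# Usage:
#   kns payments   # pin the namespace
#   kns            # print whatever is currently pinned
```

An empty line from a bare kns means no namespace is pinned, so kubectl falls back to default.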
2. Viewing & inspecting
The bread-and-butter loop: get → describe → logs.
flowchart TD
K(["kubectl"]) --> GET["get"]
K --> DSC["describe"]
K --> TOP["top"]
K --> EXP["explain"]
K --> EVT["events"]
GET --> GR["RESOURCE [NAME]<br/>pods, deploy, svc, ing,<br/>cm, secret, pv, pvc,<br/>node, ns, sa, role, rb,<br/>hpa, pdb, job, cj, ds, sts"]
GET --> GF["-o wide<br/>-o yaml / json<br/>-o name<br/>-o jsonpath<br/>-o custom-columns"]
GET --> GS["-l app=foo<br/>--field-selector status.phase=Running<br/>--show-labels<br/>--sort-by=.metadata.creationTimestamp"]
GET --> GW["-w / --watch<br/>--watch-only"]
GET --> GA["-A / --all-namespaces<br/>-n NS"]
DSC --> DR["RESOURCE NAME<br/><i>events, conditions, env</i>"]
TOP --> TP["pod [-A] [--containers]"]
TOP --> TN["node"]
EXP --> EX["RESOURCE.field<br/>--recursive"]
EVT --> EVS["--sort-by=.lastTimestamp<br/>--for pod/NAME<br/>--types=Warning"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Wide pod listing across the cluster, sorted by node
kubectl get pods -A -o wide --sort-by=.spec.nodeName
# Pods that are neither Running nor Succeeded (Pending, Failed, Unknown)
kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded
# Last 50 cluster events, newest last (the sort is ascending, so tail keeps the most recent)
kubectl get events -A --sort-by=.lastTimestamp | tail -n 50
# What fields does a Deployment have?
kubectl explain deployment.spec.strategy --recursive
# Per-pod CPU/mem
kubectl top pod -A --containers --sort-by=memory
describe is where the truth lives - look at the Events: section before anything else.
3. Creating & applying
flowchart TD
K(["kubectl"]) --> APP["apply"]
K --> CRT["create"]
K --> RUN["run"]
K --> EXP["expose"]
K --> DIF["diff"]
APP --> AF["-f file.yaml<br/>-f dir/<br/>-R recursive"]
APP --> AK["-k overlay/<br/><i>kustomize</i>"]
APP --> AS["--server-side<br/>--force-conflicts<br/>--field-manager=NAME"]
APP --> APR["--prune -l app=foo<br/>--dry-run=client|server"]
CRT --> CD["deployment NAME --image=IMG<br/>--replicas=N --port=P"]
CRT --> CS["secret generic NAME<br/>--from-literal=k=v<br/>--from-file=path"]
CRT --> CC["configmap NAME<br/>--from-literal --from-file --from-env-file"]
CRT --> CSA["serviceaccount NAME"]
CRT --> CRB["rolebinding NAME<br/>--role=R --user=U --serviceaccount=NS:SA"]
CRT --> CJ["job NAME --image=IMG -- cmd"]
CRT --> CCJ["cronjob NAME --image=IMG --schedule='*/5 * * * *'"]
RUN --> RP["NAME --image=IMG<br/>--restart=Never<br/>--rm -it -- sh"]
EXP --> EXD["deployment NAME<br/>--port=80 --target-port=8080<br/>--type=ClusterIP|NodePort|LoadBalancer"]
DIF --> DF["-f file.yaml<br/><i>show drift before apply</i>"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# The two commands that pay rent
kubectl apply -f manifests/
kubectl apply -k overlays/prod
# Server-side apply with explicit field ownership
kubectl apply -f deploy.yaml --server-side --field-manager=ci
# Throwaway debug shell on the cluster network
kubectl run tmp --rm -it --image=nicolaka/netshoot --restart=Never -- bash
# Generate manifests instead of creating - great for committing to git
kubectl create deployment web --image=nginx --replicas=3 --dry-run=client -o yaml > web.yaml
Run kubectl diff -f file.yaml before every apply against a real cluster. It will save you at least one outage per quarter.
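One way to make the diff-first habit hard to skip is to wrap both steps. The sketch below relies on the documented exit codes of kubectl diff (0 for no differences, 1 for differences, greater than 1 for an error). The function name diff_then_apply is hypothetical.

```bash
# diff_then_apply is a hypothetical wrapper, not a kubectl subcommand.
# It shows the diff, applies only when something would change, and
# refuses to apply when the diff itself failed.
diff_then_apply() {
  local file="$1" rc=0
  kubectl diff -f "$file" || rc=$?
  case "$rc" in
    0) echo "no changes in $file, skipping apply" ;;
    1) kubectl apply -f "$file" ;;
    *) echo "kubectl diff failed (exit $rc), not applying" >&2
       return "$rc" ;;
  esac
}

# Usage:
#   diff_then_apply deploy.yaml
```

In an interactive workflow you would probably want a confirmation prompt between the diff and the apply; this version is the unattended variant.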
4. Modifying in place
flowchart TD
K(["kubectl"]) --> ED["edit"]
K --> SET["set"]
K --> PT["patch"]
K --> RP["replace"]
K --> LB["label"]
K --> AN["annotate"]
K --> SC["scale"]
ED --> ER["RESOURCE NAME<br/>--output-patch<br/>--save-config"]
SET --> SI["image deploy/NAME<br/>container=IMG"]
SET --> SE["env deploy/NAME<br/>KEY=VAL --from=cm/NAME --from=secret/NAME"]
SET --> SR["resources deploy/NAME<br/>--limits=cpu=1,memory=1Gi<br/>--requests=cpu=100m,memory=128Mi"]
SET --> SS["serviceaccount deploy/NAME SA"]
SET --> SSEL["selector svc NAME k=v"]
PT --> PJ["-p '{...}' --type=strategic|merge|json"]
RP --> RF["-f file.yaml<br/>--force <i>delete + recreate</i>"]
LB --> LR["RESOURCE NAME k=v<br/>k- <i>remove</i><br/>--overwrite<br/>--all"]
AN --> AR["RESOURCE NAME k=v<br/>k- <i>remove</i>"]
SC --> SCR["deploy/NAME --replicas=N<br/>--current-replicas=N <i>guard</i>"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Rotate the image without touching git
kubectl set image deploy/web web=ghcr.io/me/web:v1.2.3
# Raise the memory request/limit on a Deployment (this edits the pod template, so it triggers a rollout)
kubectl set resources deploy/api --requests=memory=512Mi --limits=memory=1Gi
# JSON-patch a single field
kubectl patch deploy/web --type=json \
-p='[{"op":"replace","path":"/spec/replicas","value":5}]'
# Strategic merge patch from a file
kubectl patch deploy/web --patch-file patch.yaml
# Quarantine a pod for forensics - relabel so the Service stops sending traffic
kubectl label pod web-abc123 app- quarantined=true --overwrite
label with the dash suffix (app-) removes the label. That trick - relabel a misbehaving pod off the Service so it stops taking traffic but stays alive for kubectl exec - has saved me more than once.
5. Deleting
flowchart TD
K(["kubectl"]) --> DEL["delete"]
DEL --> DF["-f file.yaml<br/>-k overlay/"]
DEL --> DR["RESOURCE NAME<br/>RESOURCE/NAME"]
DEL --> DS["-l app=foo<br/>--field-selector=..."]
DEL --> DA["--all<br/>--all-namespaces"]
DEL --> DG["--grace-period=N<br/>--force <i>(0 + force = drop the API object without waiting for the kubelet)</i>"]
DEL --> DC["--cascade=background|foreground|orphan"]
DEL --> DW["--wait=true|false<br/>--timeout=30s"]
DEL --> DN["--now <i>same as grace 1s</i>"]
DEL --> DI["--ignore-not-found"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Force-delete a stuck pod (last resort - the kubelet may not have released resources)
kubectl delete pod web-abc123 --grace-period=0 --force
# Purge everything matching a label, in every namespace
kubectl delete deploy,svc,cm,secret -l owner=ephemeral -A
# Delete by file but ignore "not found" so it is idempotent in CI
kubectl delete -f manifests/ --ignore-not-found=true
# Orphan children instead of cascade-deleting them
kubectl delete deploy web --cascade=orphan
--cascade=orphan is invaluable when migrating ownership - delete the Deployment but keep the ReplicaSet and Pods alive, then re-adopt them.
6. Debugging pods
This is the section I open most often.
flowchart TD
K(["kubectl"]) --> LOG["logs"]
K --> EXC["exec"]
K --> PF["port-forward"]
K --> CP["cp"]
K --> DBG["debug"]
K --> ATC["attach"]
LOG --> LP["POD<br/>deploy/NAME<br/>job/NAME"]
LOG --> LF["-f / --follow"]
LOG --> LC["-c CONTAINER<br/>--all-containers<br/>--prefix"]
LOG --> LH["--previous <i>last crash</i>"]
LOG --> LT["--tail=N<br/>--since=10m<br/>--since-time=RFC3339<br/>--timestamps"]
LOG --> LSEL["-l app=foo<br/>--max-log-requests=N"]
EXC --> EI["-it POD -- sh"]
EXC --> EC2["-c CONTAINER"]
EXC --> ES["-- env<br/>-- ps -ef<br/>-- cat /etc/...<br/>-- curl localhost:PORT"]
PF --> PFP["POD / svc/NAME / deploy/NAME"]
PF --> PFM["LOCAL:REMOTE<br/>:REMOTE <i>random local</i>"]
PF --> PFA["--address 0.0.0.0"]
CP --> CPS["NS/POD:/path ./local"]
CP --> CPD["./local NS/POD:/path"]
CP --> CPC["-c CONTAINER<br/>--retries=N"]
DBG --> DBP["POD --image=busybox<br/>--target=CONTAINER <i>shared PID ns</i><br/>--share-processes"]
DBG --> DBN["node/NODE --image=ubuntu<br/><i>chroot /host</i>"]
DBG --> DBC["--copy-to=NEW --set-image=*=IMG<br/><i>clone & tweak</i>"]
ATC --> ATP["POD -c CONTAINER -i -t"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Logs from the previous container instance after a crashloop
kubectl logs deploy/api --previous --tail=200
# Live tail across every pod behind a Deployment
kubectl logs -f -l app=api --max-log-requests=20 --prefix
# Drop into a running container
kubectl exec -it deploy/api -c app -- bash
# Forward a Service port locally
kubectl port-forward svc/grafana 3000:80
# Copy a heap dump out of a pod for offline analysis
kubectl cp prod/api-7d-abc:/tmp/heap.hprof ./heap.hprof -c app
# Ephemeral debug container next to a running pod (no rebuild, no restart)
kubectl debug -it api-7d-abc --image=nicolaka/netshoot --target=app
# Debug the node itself
kubectl debug node/ip-10-0-1-23 -it --image=ubuntu
kubectl debug is the single most underused command. Distroless image with no shell? Attach a netshoot ephemeral container with --target so it shares the container's PID namespace, and you can inspect the real process, strace included if the debug container is granted ptrace.
7. Rollouts
flowchart TD
K(["kubectl"]) --> RO["rollout"]
RO --> RS["status deploy/NAME<br/>-w / --watch<br/>--timeout=10m"]
RO --> RH["history deploy/NAME<br/>--revision=N <i>show one</i>"]
RO --> RU["undo deploy/NAME<br/>--to-revision=N"]
RO --> RR["restart deploy/NAME<br/>ds/NAME sts/NAME"]
RO --> RP["pause deploy/NAME"]
RO --> RZ["resume deploy/NAME"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Wait for a deploy to become healthy (CI uses this)
kubectl rollout status deploy/api --timeout=5m
# Which revisions are on record, and what did revision 42 change?
kubectl rollout history deploy/api
kubectl rollout history deploy/api --revision=42
# Roll back one revision
kubectl rollout undo deploy/api
# Roll-restart all pods (reads new ConfigMap/Secret without changing the image)
kubectl rollout restart deploy/api
# Pause mid-rollout, change something, resume - bundles multiple set commands into one rollout
kubectl rollout pause deploy/api
kubectl set image deploy/api app=img:v2
kubectl set env deploy/api LOG_LEVEL=debug
kubectl rollout resume deploy/api
Pause / change / resume is the cleanest way to apply a multi-field change as a single rollout instead of triggering N successive rollouts.
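The pattern can be wrapped so the resume is never forgotten on the happy path. This is a sketch under assumptions: batch_rollout and api_changes are hypothetical names, and leaving the Deployment paused when a change fails is a design choice (it avoids rolling out a half-edited pod template), not something kubectl does for you.

```bash
# batch_rollout is a hypothetical helper. It pauses TARGET, runs the
# given command (usually a function holding several `kubectl set`
# calls), and resumes only if that command succeeded.
batch_rollout() {
  local target="$1"; shift
  kubectl rollout pause "$target" || return
  if "$@"; then
    kubectl rollout resume "$target"
  else
    local rc=$?
    echo "changes failed (exit $rc); $target left paused" >&2
    return "$rc"
  fi
}

# Usage (placeholder names and images):
#   api_changes() {
#     kubectl set image deploy/api app=img:v2 &&
#     kubectl set env deploy/api LOG_LEVEL=debug
#   }
#   batch_rollout deploy/api api_changes
```

If the helper reports a failure, fix the spec and run kubectl rollout resume by hand; a paused Deployment will not roll out later changes either.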
8. Scaling
flowchart TD
K(["kubectl"]) --> SC["scale"]
K --> AS["autoscale"]
SC --> SCR["deploy/NAME --replicas=N<br/>rs/NAME sts/NAME rc/NAME"]
SC --> SCG["--current-replicas=N<br/><i>refuse if mismatch</i>"]
SC --> SCT["--timeout=30s"]
AS --> ASR["deploy/NAME<br/>--min=N --max=M<br/>--cpu-percent=70"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
kubectl scale deploy/api --replicas=10
kubectl scale deploy/api --replicas=10 --current-replicas=5 # refuses if drifted
kubectl autoscale deploy/api --min=2 --max=20 --cpu-percent=70
--current-replicas is your concurrency guard. Use it in any script that scales by name without holding a lock.
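A sketch of the guard used as a compare-and-swap: read the current count, compute the new one, and let --current-replicas reject the write if someone else scaled in between. The helper name scale_by is hypothetical, and reading .spec.replicas assumes a workload that exposes that field (Deployment, StatefulSet, ReplicaSet).

```bash
# scale_by is a hypothetical helper. It scales TARGET by DELTA replicas
# (DELTA may be negative), never below zero, and fails instead of
# overwriting a concurrent change.
scale_by() {
  local target="$1" delta="$2" current desired
  current=$(kubectl get "$target" -o jsonpath='{.spec.replicas}') || return
  desired=$((current + delta))
  if [ "$desired" -lt 0 ]; then desired=0; fi
  kubectl scale "$target" --replicas="$desired" --current-replicas="$current"
}

# Usage:
#   scale_by deploy/api 3     # add three replicas
#   scale_by deploy/api -2    # remove two
```

A caller that wants to retry on a mismatch can simply loop on the function's exit status, since each attempt re-reads the current count.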
9. Node operations
flowchart TD
K(["kubectl"]) --> CD["cordon NODE"]
K --> UC["uncordon NODE"]
K --> DR["drain NODE"]
K --> TN["taint nodes NODE"]
K --> TPN["top node"]
K --> GTN["get nodes"]
DR --> DR1["--ignore-daemonsets"]
DR --> DR2["--delete-emptydir-data"]
DR --> DR3["--force <i>evict unmanaged pods</i>"]
DR --> DR4["--grace-period=N<br/>--timeout=5m"]
DR --> DR5["--pod-selector='app!=critical'"]
DR --> DR6["--disable-eviction <i>skip PDB</i>"]
TN --> T1["key=value:NoSchedule"]
TN --> T2["key=value:PreferNoSchedule"]
TN --> T3["key=value:NoExecute"]
TN --> T4["key:effect- <i>remove taint</i>"]
GTN --> GTN1["-o wide<br/>-l role=worker<br/>--show-labels"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Standard node drain - what you do before patching the host
kubectl cordon ip-10-0-1-23
kubectl drain ip-10-0-1-23 --ignore-daemonsets --delete-emptydir-data --timeout=10m
# ... do work ...
kubectl uncordon ip-10-0-1-23
# Add a taint so only tolerating workloads land here
kubectl taint nodes gpu-0 dedicated=ml:NoSchedule
# Remove that taint
kubectl taint nodes gpu-0 dedicated:NoSchedule-
# Pressure check
kubectl top nodes
kubectl get nodes -o wide --sort-by=.status.capacity.memory
Always test drain with --dry-run=server first if you have PDBs - it tells you which pods would block the drain before you commit.
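A complementary pre-check, sketched here as a different technique from the dry run: list PodDisruptionBudgets whose status.disruptionsAllowed is 0 and refuse to drain while any exist. The function names are hypothetical, and the check is deliberately conservative, since it looks at every PDB in the cluster rather than only those covering pods on the node.

```bash
# blocking_pdbs and drain_if_clear are hypothetical helpers.
# blocking_pdbs prints NAMESPACE/NAME for every PDB that currently
# allows zero disruptions.
blocking_pdbs() {
  kubectl get pdb -A -o jsonpath='{range .items[?(@.status.disruptionsAllowed==0)]}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}'
}

# drain_if_clear drains NODE only when no PDB is at zero.
drain_if_clear() {
  local node="$1" blockers
  blockers=$(blocking_pdbs) || return
  if [ -n "$blockers" ]; then
    echo "PDBs with no disruptions allowed:" >&2
    echo "$blockers" >&2
    return 1
  fi
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=10m
}

# Usage:
#   drain_if_clear ip-10-0-1-23
```

On a busy cluster an unrelated PDB at zero will block this helper, so treat it as a prompt to look, not as a hard gate.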
10. Auth & RBAC
flowchart TD
K(["kubectl"]) --> AU["auth"]
K --> CRT["certificate"]
AU --> AC["can-i VERB RESOURCE<br/>--namespace=NS<br/>--as=USER<br/>--as-group=GRP<br/>--all-namespaces<br/>--list"]
AU --> AW["whoami"]
AU --> AR["reconcile -f rbac.yaml<br/><i>create/update RBAC</i><br/>--remove-extra-permissions<br/>--remove-extra-subjects"]
CRT --> CA["approve CSR-NAME"]
CRT --> CD["deny CSR-NAME"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Can I do the thing?
kubectl auth can-i delete pods -n prod
kubectl auth can-i '*' '*' --all-namespaces
# Can the CI service account do the thing?
kubectl auth can-i create deployments -n prod \
--as=system:serviceaccount:ci:deployer
# Who am I right now?
kubectl auth whoami
# List every permission I have in this namespace
kubectl auth can-i --list -n payments
--as impersonation against auth can-i is the safest way to debug RBAC without giving anyone production credentials.
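Because kubectl auth can-i exits 0 for "yes" and non-zero for "no", the impersonation check scripts well. The sketch below prints one line per verb for a ServiceAccount. The function name rbac_report and its argument order are hypothetical, and the caller needs permission to impersonate.

```bash
# rbac_report is a hypothetical helper.
# Arguments: NAMESPACE SA_NAMESPACE SA_NAME RESOURCE VERB...
# Prints "allow" or "deny" for each verb, checked while impersonating
# the ServiceAccount.
rbac_report() {
  local ns="$1" sa_ns="$2" sa="$3" resource="$4"; shift 4
  local verb
  for verb in "$@"; do
    if kubectl auth can-i "$verb" "$resource" -n "$ns" \
         --as="system:serviceaccount:${sa_ns}:${sa}" >/dev/null 2>&1; then
      echo "allow $verb $resource"
    else
      echo "deny  $verb $resource"
    fi
  done
}

# Usage:
#   rbac_report prod ci deployer deployments get create delete
```

Errors (for example, no impersonation rights) also show up as "deny" here because stderr is discarded; drop the redirect when the output looks wrong.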
11. Networking access
flowchart TD
K(["kubectl"]) --> PF["port-forward"]
K --> PRX["proxy"]
PF --> PF1["pod/NAME<br/>svc/NAME<br/>deploy/NAME"]
PF --> PF2["LOCAL:REMOTE<br/>:REMOTE <i>random local</i><br/>multiple pairs ok"]
PF --> PF3["--address 0.0.0.0<br/><i>expose beyond localhost</i>"]
PRX --> PR1["--port=8001<br/>--address=127.0.0.1"]
PRX --> PR2["--accept-paths<br/>--reject-paths<br/>--api-prefix=/k8s"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Forward a Service port - works through the API server, no LB needed
kubectl port-forward svc/postgres 5432:5432
# Hit the API directly - everything kubectl does is just HTTP
kubectl proxy --port=8001 &
curl http://localhost:8001/api/v1/namespaces/default/pods
kubectl proxy plus curl is the cleanest way to learn the API. Watch the request kubectl actually makes with -v=8.
12. Output, selectors, and the flags you forget
flowchart TD
K(["kubectl"]) --> OUT["-o / --output"]
K --> SEL["selectors"]
K --> NS["namespace"]
K --> WAT["watch"]
K --> DRY["dry-run"]
K --> VRB["verbosity"]
OUT --> O1["yaml | json<br/>name | wide"]
OUT --> O2["jsonpath='{.items[*].metadata.name}'"]
OUT --> O3["jsonpath-as-json"]
OUT --> O4["go-template / go-template-file"]
OUT --> O5["custom-columns=NAME:.metadata.name,IP:.status.podIP"]
OUT --> O6["custom-columns-file=cols.txt"]
SEL --> S1["-l key=value<br/>-l 'key in (a,b)'<br/>-l 'key!=value'<br/>-l 'key' <i>exists</i><br/>-l '!key' <i>not exists</i>"]
SEL --> S2["--field-selector status.phase=Running<br/>--field-selector spec.nodeName=NODE<br/>--field-selector metadata.namespace!=kube-system"]
NS --> N1["-n NS<br/>-A / --all-namespaces"]
WAT --> W1["-w / --watch<br/>--watch-only"]
WAT --> W2["wait --for=condition=Ready pod/NAME<br/>--for=delete<br/>--timeout=5m"]
DRY --> D1["--dry-run=client<br/>--dry-run=server<br/>--validate=strict"]
VRB --> V1["-v=6 <i>request URLs</i><br/>-v=7 <i>+ headers</i><br/>-v=8 <i>+ bodies, truncated</i><br/>-v=9 <i>bodies untruncated</i>"]
classDef root fill:#ffffff,stroke:#5b8def,color:#0b1220,stroke-width:3px,font-weight:bold;
class K root;
# Just the pod names
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
# Pods + node assignment as a table
kubectl get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,IP:.status.podIP
# Wait until ready
kubectl wait --for=condition=Available deploy/api --timeout=5m
kubectl wait --for=delete pod/web-abc123 --timeout=2m
# Render a manifest server-side without persisting it - schema validation included
kubectl apply -f deploy.yaml --dry-run=server -o yaml
# What is kubectl actually sending? (debug RBAC, networking, webhooks)
kubectl get pods -v=8
kubectl wait is the right answer in scripts. Polling get in a loop is what people write before they discover it.
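A small sketch of what that looks like in a script: apply a Job manifest, then block on its Complete condition instead of polling. The helper name run_job and the 5m default are assumptions for illustration.

```bash
# run_job is a hypothetical helper.
# Arguments: MANIFEST JOB_NAME [TIMEOUT, default 5m]
# Applies the manifest, then blocks until the Job reports
# condition=Complete or the timeout expires.
run_job() {
  local file="$1" name="$2" timeout="${3:-5m}"
  kubectl apply -f "$file" || return
  kubectl wait --for=condition=Complete "job/$name" --timeout="$timeout"
}

# Usage:
#   run_job migrate.yaml migrate 10m
```

Note that a Job which fails never reaches Complete, so this version only notices the failure when the timeout runs out; keep the timeout realistic.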
A few more that did not fit cleanly
# Render kustomize without applying
kubectl kustomize overlays/prod
# What would change on the server if I applied this file now?
kubectl diff -f deploy.yaml
# Convert v1beta1 manifests to a current API version (requires the separate kubectl-convert plugin; convert is no longer built into kubectl)
kubectl convert -f legacy.yaml --output-version apps/v1
# Plugin manager (krew) - for kubectl-tree, kubectl-neat, kubectl-stern, etc
kubectl krew install tree
kubectl tree deploy api
# Load bash completion into the current shell
source <(kubectl completion bash)
# Alias I cannot live without
alias k=kubectl
complete -o default -F __start_kubectl k
How this page renders
The diagrams are written as standard ```mermaid fences in the markdown source. GitHub renders Mermaid natively, so the file in the repo renders identically there. On this site, a tiny script imports mermaid from a CDN, walks every pre.mermaid block, and swaps in an SVG. The same source, three viewers, no images committed to git.
If you spot a command I missed - especially a flag - send it my way and I will add it to the diagram.