GitOps Fundamentals
GitOps is a specific discipline of CD: a git repo holds the declared desired state of your system, and a controller running inside the cluster continuously reconciles actual state to match. No kubectl apply from laptops. No state-changing access to prod for humans. Every change flows through PR → merge → controller. Rollback is git revert.
The four principles (from OpenGitOps)
- Declarative — the whole system is described as code.
- Versioned and immutable — the desired state is stored in a source that supports immutable history (git).
- Pulled automatically — software agents pull the desired state.
- Continuously reconciled — agents observe actual state and converge it toward the desired state.
Missing any of the four → not GitOps. “We have a repo we apply from” is not GitOps; it’s CI/CD. The reconcile loop is the distinguishing feature.
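To make the reconcile loop concrete, here is a minimal sketch in Python. It is not how ArgoCD or Flux are implemented; the names plan and reconcile, and the use of plain dicts for desired and actual state, are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, dict]  # resource name -> spec (hypothetical model)
Action = Tuple[str, str]     # (verb, resource name)


def plan(desired: Resources, actual: Resources, prune: bool = True) -> List[Action]:
    """Return the actions needed to move `actual` toward `desired`."""
    actions: List[Action] = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    if prune:
        # Anything running that is no longer declared gets removed.
        for name in actual:
            if name not in desired:
                actions.append(("delete", name))
    return actions


def reconcile(desired: Resources, actual: Resources, prune: bool = True) -> List[Action]:
    """One pass of the loop: compute the plan, then apply it to `actual` in place."""
    actions = plan(desired, actual, prune)
    for verb, name in actions:
        if verb == "delete":
            del actual[name]
        else:
            actual[name] = dict(desired[name])
    return actions


desired = {"deployment/myapp": {"replicas": 3}, "configmap/myapp": {"LOG": "info"}}
actual = {"deployment/myapp": {"replicas": 5}, "job/old": {}}

first = reconcile(desired, actual)
second = reconcile(desired, actual)  # already converged, so nothing to do
```

The second pass returns an empty plan. That idempotence is what lets a controller run the same loop on every interval without side effects.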
The push vs pull distinction
Classic CI/CD pushes:
CI pipeline ──(kubectl apply, helm upgrade)──► cluster
- CI needs cluster credentials (blast radius problem).
- The cluster is the passive target; if something drifts, it stays drifted.
- Success is reported as “the pipeline finished”, which doesn’t mean the workload is healthy.
GitOps pulls:
        ┌──────────────┐
CI ──►  │   git repo   │ ◄──── controller (in cluster)
        └──────────────┘            │
                                    ▼
                              cluster state
                                    │
                                    ▼
                              continuously
                               reconciled
- CI never touches the cluster. It updates the git repo.
- Cluster credentials live nowhere outside the cluster.
- Drift (someone ran kubectl edit on a resource) is auto-corrected or flagged.
- Healthy state is observable: the controller reports sync + health.
The two-repo pattern
The common layout is two repos: app source and deployment manifests.
   app-repo                      gitops-repo (aka "config" / "manifests")
┌─────────────┐                  ┌─────────────────────┐
│ src/        │   CI: build →    │ apps/               │
│ Dockerfile  │   ──────►        │   myapp/            │
│ tests/      │   update tag     │     base/           │
└─────────────┘                  │     overlays/       │
                                 │       dev/          │
                                 │       prod/         │
                                 └─────────────────────┘
                                            ▲
                                            │ pull + reconcile
                                  ┌─────────┴───────┐
                                  │  ArgoCD / Flux  │
                                  │   controller    │
                                  └─────────┬───────┘
                                            ▼
                                       Kubernetes
Flow for a code change:
1. Dev merges a PR in app-repo.
2. CI builds a container image tagged ghcr.io/acme/myapp:sha-abc123.
3. CI updates the image tag in gitops-repo/apps/myapp/overlays/dev/kustomization.yaml and opens a PR (or auto-commits for dev).
4. PR merged → controller sees the change → pulls new manifests → applies → health-checks.
Prod typically gates on human PR review in step 3.
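Step 3 is usually a small text edit made by CI. The sketch below shows one way to do it in Python with only the standard library. It assumes the overlay pins the tag through Kustomize's images: list, with newTag: on the line directly after name:. The function name set_image_tag and the sample file contents are hypothetical.

```python
import re


def set_image_tag(kustomization: str, image: str, new_tag: str) -> str:
    """Return the kustomization text with `newTag` replaced for `image`.

    Assumes the entry is written as `- name: <image>` immediately followed
    by a `newTag:` line. Other layouts would need a real YAML parser.
    """
    pattern = re.compile(
        r"(-\s+name:\s*" + re.escape(image) + r"\s*\n\s*newTag:\s*)(\S+)"
    )
    updated, count = pattern.subn(lambda m: m.group(1) + new_tag, kustomization)
    if count != 1:
        raise ValueError(f"expected exactly one entry for {image}, found {count}")
    return updated


# Hypothetical contents of apps/myapp/overlays/dev/kustomization.yaml
before = """\
resources:
  - ../../base
images:
  - name: ghcr.io/acme/myapp
    newTag: sha-0000000
"""

after = set_image_tag(before, "ghcr.io/acme/myapp", "sha-abc123")
```

Failing loudly when the entry is missing or duplicated is deliberate: a silent no-op here would let CI report success while the cluster keeps running the old image.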
The two main tools
| | ArgoCD | Flux |
|---|---|---|
| Project | Intuit / CNCF | Weaveworks / CNCF |
| Model | Application CRD, UI-first | GitRepository + Kustomization/HelmRelease, CLI-first |
| UI | Strong, full visibility & history | Minimal (use Weave GitOps / Capacitor) |
| Multi-cluster | ApplicationSet + hub-and-spoke | Multi-tenant via Kustomization sources |
| Helm support | Direct + parameter override | Via HelmRelease |
| Best for | Teams wanting a dashboard + drag-and-drop feel | GitOps-native / CLI-first teams |
Both are CNCF-graduated. Functionally interchangeable for most use-cases. ArgoCD is more popular in enterprises that want the UI; Flux is preferred by teams that want the minimal declarative approach.
A minimal ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/acme/gitops-repo.git
    targetRevision: main
    path: apps/myapp/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp-prod
  syncPolicy:
    automated:
      prune: true     # remove resources no longer declared
      selfHeal: true  # revert manual drift
    syncOptions:
      - CreateNamespace=true
Translate: “watch apps/myapp/overlays/prod/ on main; apply it to the namespace; prune dropped resources; auto-revert drift.”
A minimal Flux setup
# GitRepository: tells Flux where to pull from
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: gitops-repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/acme/gitops-repo.git
  ref:
    branch: main
---
# Kustomization: tells Flux what to apply
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 5m
  path: ./apps/myapp/overlays/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: gitops-repo
Image automation: who updates the tag?
A design decision: how does a new image tag get into the gitops repo?
- CI opens a PR — CI, after a successful build, edits the manifest and opens a PR. Human reviews + merges. Most auditable.
- CI auto-commits — CI pushes directly to the gitops repo. Fine for dev; questionable for prod.
- Image automation controller — Flux Image Automation or ArgoCD Image Updater watches the registry for new tags matching a pattern, updates the manifest automatically.
- Promotion via commit: a human or tool copies the tag from the dev/ overlay to staging/ and then to prod/.
Mature setups combine these: image automation for dev (fast feedback), PR-based promotion to staging + prod.
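The image automation option depends on a policy that decides which registry tag counts as newest. Below is a minimal sketch of a semver-style policy in Python. It is an illustration only, not the logic used by Flux Image Automation or ArgoCD Image Updater; the function name latest_semver and the sample tag list are made up.

```python
import re
from typing import Iterable, Optional

# Plain MAJOR.MINOR.PATCH with an optional leading "v"; pre-releases are excluded.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def latest_semver(tags: Iterable[str]) -> Optional[str]:
    """Pick the highest plain semver tag, ignoring anything else."""
    best = None
    best_key = None
    for tag in tags:
        match = SEMVER.match(tag)
        if not match:
            continue
        # Compare numerically so that 1.10.0 sorts above 1.9.0.
        key = tuple(int(part) for part in match.groups())
        if best_key is None or key > best_key:
            best, best_key = tag, key
    return best


tags = ["latest", "1.9.0", "1.10.0", "sha-abc123", "1.10.0-rc1"]
choice = latest_semver(tags)
```

Comparing the parts as integers matters: a plain string sort would rank 1.9.0 above 1.10.0 and quietly roll dev back to an older build.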
Rendered manifests pattern
A recent preference among experienced teams: instead of committing raw Helm / Kustomize sources + letting the controller render, you commit the fully-rendered YAML.
- Source-of-truth repo: Helm/Kustomize templates.
- Rendered repo: CI runs helm template / kustomize build and commits the output.
- Controller watches the rendered repo.
Pros: what-you-see-is-what-you-apply (PR diff = the actual change). Cons: more CI plumbing.
Worth it for big installations where Helm-value changes can be surprisingly wide-reaching.
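The CI plumbing for this pattern can be small. The sketch below is a Python helper that runs a render command, writes the output, and reports whether anything changed so the job knows whether to commit. The helper name render and the file path are hypothetical, and a stand-in command replaces kustomize build so the example runs without Kustomize installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def render(command: Sequence[str], output: Path) -> bool:
    """Run a render command and write its stdout to `output`.

    Returns True when the rendered file changed, so CI knows whether to commit.
    """
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    previous = output.read_text() if output.exists() else None
    if result.stdout == previous:
        return False
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(result.stdout)
    return True


# In CI the command would be something like:
#   ["kustomize", "build", "apps/myapp/overlays/prod"]
# The stand-in below just prints one line of YAML.
stand_in = [sys.executable, "-c", "print('kind: ConfigMap')"]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "rendered" / "myapp-prod.yaml"
    first_run = render(stand_in, target)   # file created
    second_run = render(stand_in, target)  # identical output, nothing to commit
    rendered_text = target.read_text()
```

Skipping the commit when the output is unchanged keeps the rendered repo's history limited to real changes, which is the point of the pattern.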
Secrets in GitOps
You can’t commit plaintext secrets to git. Three approaches:
| Approach | How |
|---|---|
| Sealed Secrets (Bitnami) | Encrypt secret client-side with a cluster public key; commit the ciphertext; controller decrypts in cluster. Simple. |
| SOPS (originally Mozilla, now CNCF) + age / KMS | Encrypt YAML in place; Flux decrypts natively on reconcile, while ArgoCD needs a plugin (for example KSOPS). Can encrypt just the data: field. Broadly used. |
| External Secrets Operator | Commit an ExternalSecret resource that points at Vault / AWS Secrets Manager / GCP Secret Manager; operator syncs the real Secret. Preferred if you already have a secret store. |
Detailed in Secrets Management.
Drift detection
A killer feature of GitOps controllers: they diff actual vs declared on every reconcile interval.
- Someone ran kubectl scale on a deployment outside of git → controller sees drift → either auto-reverts (selfHeal: true) or reports it.
- Someone deleted a ConfigMap in-cluster → controller re-creates it.
- Manifest exists in git but never landed → controller reports OutOfSync.
This single property replaces a lot of audit and drift tooling.
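The comparison behind drift detection can be sketched as a field-by-field diff of declared versus live state. The Python below is a simplified illustration, not the diffing logic of ArgoCD or Flux. The names diff and handle and the status strings other than OutOfSync are assumptions, and real controllers handle defaulted fields, lists, and server-side changes far more carefully.

```python
import copy
from typing import Any, Dict, List, Tuple

Drift = Tuple[str, Any, Any]  # (field path, declared value, actual value)


def diff(declared: Dict[str, Any], actual: Dict[str, Any], prefix: str = "") -> List[Drift]:
    """List every declared field whose live value differs.

    Fields present only in `actual` are ignored, on the assumption that they
    are defaults or status written by the cluster rather than drift.
    """
    drifts: List[Drift] = []
    for key, want in declared.items():
        path = f"{prefix}.{key}" if prefix else key
        have = actual.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            drifts.extend(diff(want, have, path))
        elif want != have:
            drifts.append((path, want, have))
    return drifts


def handle(declared: Dict[str, Any], actual: Dict[str, Any], self_heal: bool) -> str:
    """Return the status a controller-like loop would report for one resource."""
    if not diff(declared, actual):
        return "Synced"
    if self_heal:
        # Simplistic revert: overwrite the declared top-level fields.
        actual.update(copy.deepcopy(declared))
        return "Synced (drift reverted)"
    return "OutOfSync"


declared = {"spec": {"replicas": 3, "template": {"image": "myapp:sha-abc123"}}}
live = {
    "spec": {"replicas": 10, "template": {"image": "myapp:sha-abc123"}},
    "status": {"ready": 10},
}

found = diff(declared, live)                       # someone scaled to 10
reported = handle(declared, live, self_heal=False)
healed = handle(declared, live, self_heal=True)
```

Only declared fields are compared, so the cluster-written status block does not register as drift.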
Rollback
Rollback in GitOps is git revert:
git revert abc123
git push
Controller notices the commit, reconciles, workload returns to previous state.
Advantages:
- Full audit trail — you can see who reverted what, when, why.
- Easy to explain to auditors.
- Works the same for apps, infra, policies — same tool chain.
GitOps for infrastructure (not just apps)
GitOps started with K8s apps but the model extends:
- Crossplane — define cloud resources as K8s CRDs → GitOps provisions VPCs, S3 buckets, RDS.
- Cluster API — cluster lifecycle (create / upgrade / scale) declared in manifests.
- Argo CD + Terraform controllers — bridge git → terraform apply (still trickier than K8s-native).
The direction of travel: everything in the platform is a declarative custom resource, reconciled by an operator. Infra code becomes less Terraform, more Kubernetes.
When GitOps is overkill
- Single-developer projects, few services → PR-based kubectl apply with a makefile is fine.
- Very dynamic workloads (short-lived jobs, batch CronJobs): still do it, but state may churn more than is useful to commit.
- Non-Kubernetes systems — GitOps tooling is K8s-centric. For VMs, lean on Terraform + CI + drift detection rather than forcing a GitOps controller.
Common mistakes
- Manual kubectl apply persisting. Old habits die hard. Block it with RBAC; give humans read-only prod.
- Mirroring a monolithic app repo with a monolithic gitops repo. Organise by team / tenancy, not by CI repo.
- Forgetting prune: true → deleted manifests stay deployed.
- Hard-coded cluster URLs in manifests. Use ApplicationSet templates / per-env overlays.
- Image tags = latest. The controller has nothing to diff; updates happen invisibly. Always use immutable tags (SHA or semver).
- Running the GitOps controller from inside the cluster it manages, with no bootstrap plan. How do you bring the controller back if the cluster dies? Script it; test the rebuild at least once.
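The mutable-tag mistake is easy to catch in CI before a manifest merges. Here is a minimal sketch of such a check in Python. It scans manifest text for image: lines with a regular expression rather than parsing YAML, and the function name mutable_images and the list of tags treated as mutable are assumptions for illustration.

```python
import re
from typing import List

IMAGE_LINE = re.compile(r"^\s*-?\s*image:\s*[\"']?([^\s\"']+)", re.MULTILINE)
MUTABLE_TAGS = {"latest", "main", "master", "dev"}  # illustrative list


def mutable_images(manifest: str) -> List[str]:
    """Return image references that are untagged or use a mutable tag."""
    flagged = []
    for ref in IMAGE_LINE.findall(manifest):
        if "@sha256:" in ref:
            continue  # pinned by digest
        # Look for the tag only after the last "/", so a registry port
        # such as registry.local:5000 is not mistaken for a tag.
        last_segment = ref.rsplit("/", 1)[-1]
        tag = last_segment.split(":", 1)[1] if ":" in last_segment else None
        if tag is None or tag in MUTABLE_TAGS:
            flagged.append(ref)
    return flagged


manifest = """\
containers:
  - name: app
    image: ghcr.io/acme/myapp:sha-abc123
  - name: sidecar
    image: ghcr.io/acme/proxy:latest
  - image: "registry.local:5000/tools/debug"
"""

flagged = mutable_images(manifest)
```

An untagged reference is flagged as well, since the container runtime resolves it to latest.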