Observability

Definition

Three pillars: metrics, logs, traces. Observability = ability to ask new questions about your system without shipping new code.

Where it appears

🌐 Networking

  • SNMP — counters, interface stats
  • NetFlow / sFlow / IPFIX — flow records
  • Packet capture — tcpdump, Wireshark
  • Streaming telemetry — gNMI, OpenConfig

🐧 Linux

  • journald / syslog — structured logs
  • Prometheus node_exporter — host metrics
  • eBPF — bpftrace, bcc, Pixie

☁️ Cloud

  • AWS — CloudWatch, VPC Flow Logs, X-Ray, CloudTrail (audit)
  • Azure — Monitor, Log Analytics, Network Watcher, Application Insights

📦 Containers

  • Prometheus + Grafana
  • OpenTelemetry — unified metrics/logs/traces
  • Jaeger / Tempo — distributed tracing
  • Loki / Elasticsearch — log aggregation

🔐 Cybersecurity

  • SIEM — Splunk, Sentinel, Elastic, Wazuh
  • SOAR — orchestration
  • IDS logs — Suricata, Zeek

See also