5 Commits

Author SHA1 Message Date
Gonçalo Rodrigues
d4ccff518e feat: switch to gugagr.xyz with TLS via Let's Encrypt (#39)
Adds a Traefik Helm release (kube-system) with an ACME HTTP-01 challenge
configured for Let's Encrypt. It replaces the Traefik bundled with k3s,
which is disabled.

Migrates all hostnames from *.homelab.local to *.gugagr.xyz and upgrades
all ingresses to HTTPS with certresolver=letsencrypt annotations.

Adds var.domain (default homelab.local) to Terraform so the domain is
a single config point for monitoring and Gitea ingresses.

Gateway reads the DOMAIN env var at runtime and falls back to homelab.local,
so local k3d dev continues to work without changes.

Co-authored-by: Gonçalo Rodrigues <guga@Goncalos-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 21:45:19 +01:00
Gonçalo Rodrigues
8436295bbc feat(infra): gate observability stack behind var.enable_monitoring (#38)
Adds enable_monitoring variable (default true) that controls whether
Prometheus/Grafana, Loki, Fluent Bit, and Jaeger are deployed.
Setting it to false saves ~1.5 GB RAM, making the stack viable on
a 2–4 GB VPS without touching the application services.

Also caps MongoDB WiredTiger cache at 256 MB (--wiredTigerCacheSizeGB=0.25)
so it doesn't balloon on memory-constrained hosts.

Co-authored-by: Gonçalo Rodrigues <guga@Goncalos-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 17:44:14 +01:00
Gonçalo Rodrigues
079ffae90b fix(infra): remove double-dollar escape in Fluent Bit label_keys
In Terraform quoted strings $var is literal; only ${var} triggers
interpolation. The $$ was passing through as literal $$kube_* to
Fluent Bit, causing a record accessor syntax error on startup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-20 15:23:46 +01:00
Gonçalo Rodrigues
99ed992d98 obs: request access log middleware + Loki label enrichment (#36)
Adds two targeted observability improvements across all homelab services.

pkg/logger/access.go (new)
  HTTP access log middleware that logs one structured line per request:
    method, path, status, ms, trace_id
  The trace_id comes from the OTel span already in context (created by
  trace.Middleware which runs outside this one), so each log entry in
  Loki has a clickable link into Jaeger. Health/metrics endpoints are
  excluded to avoid noise. Level is ERROR for 5xx, WARN for 4xx, INFO
  otherwise.

pkg/setup/setup.go
  Wire the new middleware between trace.Middleware (which creates the
  span) and metrics.Middleware:
    trace → AccessMiddleware → metrics → mux
  Order matters: span must exist before AccessMiddleware reads it.

infrastructure/terraform/monitoring.tf
  Fluent Bit was shipping all container logs to Loki with a single
  static label (job=fluent-bit), making it impossible to filter logs
  by service. Added a `nest/lift` filter that flattens the kubernetes
  metadata block to top-level fields (kube_namespace_name,
  kube_container_name, …), then promoted those as Loki label_keys.
  After this change you can query:
    {kube_namespace_name="finance"} |= "trace_id"
  and Loki will return only log lines from the finance namespace
  (finance-api) that contain "trace_id".

Co-authored-by: Gonçalo Rodrigues <guga@Goncalos-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-20 15:15:06 +01:00
Gonçalo Rodrigues
13b7149614 First Commit
2026-06-13 11:25:23 +01:00