AGENTS.md
Repo map
apps/<namespace>/services/<name>/ # one service per directory
├── main/ # Go service entrypoint (only if Go)
│ ├── main.go
│ └── handler.go
├── Dockerfile # build context = project root
├── Makefile # single include line (see below)
├── k8s/ # deployment.yaml, service.yaml, ingress.yaml
└── package.json # only for Astro frontend services
infrastructure/
├── k3d/k3d.sh # cluster create/delete
├── Makefile/service.mk # shared build/deploy targets
├── terraform/ # All infrastructure (MongoDB, monitoring, namespaces)
└── mongodb/deploy.sh # unused standalone script (Terraform-managed now)
pkg/ # shared Go packages (logger, setup, auth, mongo)
packages/ui/ # @homelab/ui Astro primitive library
Commands
# Full dev cycle (requires running `make up` first)
make dev # k3d cluster → terraform infra → build+deploy all services
# Cluster lifecycle
make up # create k3d cluster
make down # delete k3d cluster
make infra # terraform apply + Traefik metrics + copy MongoDB secret
# Service lifecycle (run from any service dir)
make build-deploy # docker build → k3d import → kubectl apply
# Bulk operations
make deploy-all # build+load+deploy every discovered service
make restart-all # rollout restart all deployments
Build conventions
- Docker build context is the project root, not the service directory. The `Dockerfile` references paths relative to root.
- Go services: listen on `:8080` (set by `setup.Default`). K8s Service maps `80 → 8080`.
- Astro services: Node build → nginx serving `/dist` on port 80.
- Image naming: `homelab/<service-name>:latest` (inferred from directory name by `service.mk`).
- `imagePullPolicy: IfNotPresent` on all deployments; images are loaded via `k3d image import`.
- Go base image: `golang:1.25-alpine` builder → `alpine:3.21` runtime.
- Node base image: `node:26-alpine` builder → `nginx:alpine`.
Service Makefiles
Every per-service Makefile is a single include:
# Go service:
PROJECT_ROOT := ../../../../
include ../../../../infrastructure/Makefile/service.mk
# Astro:
PROJECT_ROOT = $(abspath ../../../..)
SERVICE_DIR = .
include ../../../../infrastructure/Makefile/service.mk
`SERVICE_NAME` and `NAMESPACE` are auto-inferred (`NAMESPACE` from the `apps/<namespace>/...` path; `SERVICE_NAME` from the directory name). `service.mk` distinguishes Go from Node services by the presence of `package.json`.
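The inference rule above can be illustrated with a small Go sketch. It only mirrors the documented convention (`apps/<namespace>/services/<name>` and `homelab/<service-name>:latest`); `service.mk` does this in Make, and the `inferService` and `ServiceInfo` names are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// ServiceInfo holds the values that service.mk is described as inferring.
type ServiceInfo struct {
	Namespace string
	Name      string
	Image     string
}

// inferService derives namespace, service name, and image tag from a
// directory laid out as apps/<namespace>/services/<name>.
// Illustration of the convention only; this is not service.mk's code.
func inferService(dir string) (ServiceInfo, error) {
	parts := strings.Split(filepath.ToSlash(filepath.Clean(dir)), "/")
	for i := 0; i+3 < len(parts); i++ {
		if parts[i] == "apps" && parts[i+2] == "services" {
			name := parts[i+3]
			return ServiceInfo{
				Namespace: parts[i+1],
				Name:      name,
				Image:     "homelab/" + name + ":latest",
			}, nil
		}
	}
	return ServiceInfo{}, fmt.Errorf("%q does not match apps/<namespace>/services/<name>", dir)
}

func main() {
	info, err := inferService("apps/auth/services/gateway")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %s %s\n", info.Namespace, info.Name, info.Image)
}
```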
Observability
Traces (OpenTelemetry OTLP gRPC)
- `OTEL_EXPORTER_OTLP_ENDPOINT=jaeger.monitoring.svc:4317` set on gateway, users, example-service deployments
- `pkg/trace` provides OTLP gRPC trace exporter + HTTP middleware (creates spans per request)
- Jaeger all-in-one deployed in `monitoring` namespace, ingress at `jaeger.homelab.local`
- Every service uses `trace.Middleware(metrics.Middleware(mux))` via `setup.Run`
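To make the wrapping order of `trace.Middleware(metrics.Middleware(mux))` concrete, here is a minimal sketch using only the standard library. The `logStep` middleware is a hypothetical stand-in for the real `pkg/trace` and `pkg/metrics` middlewares, and `/healthz` is an assumed path; the point is only that the outer (trace) layer sees the request first and finishes last.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// logStep returns a stand-in middleware that records when it is entered and
// left. It is not the pkg/trace or pkg/metrics implementation.
func logStep(name string, events *[]string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*events = append(*events, name+":enter")
			next.ServeHTTP(w, r)
			*events = append(*events, name+":leave")
		})
	}
}

// runChain sends one request through trace(metrics(mux)) and returns the
// order in which the layers ran.
func runChain() []string {
	var events []string
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		events = append(events, "handler")
		w.WriteHeader(http.StatusOK)
	})

	traceMW := logStep("trace", &events)
	metricsMW := logStep("metrics", &events)
	handler := traceMW(metricsMW(mux))

	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	handler.ServeHTTP(httptest.NewRecorder(), req)
	return events
}

func main() {
	for _, e := range runChain() {
		fmt.Println(e)
	}
}
```

With this ordering, a span opened by the trace layer covers the time spent in the metrics layer and the handler.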
Metrics (Prometheus)
- `pkg/metrics` exposes: `http_requests_total{method,path,status}`, `http_request_duration_seconds{method,path}`, `http_requests_in_flight`
- `/metrics` endpoint added automatically by `setup.Run` via `promhttp.Handler()`
- Go runtime metrics from default Prometheus registry
- ServiceMonitors (with `release: kps` label required by Prometheus operator):
  - `gateway` (auth): scrapes `:http/metrics`
  - `users` (auth): scrapes `:http/metrics`
  - `example-service` (test): scrapes `:http/metrics`
  - `traefik` (monitoring): scrapes `:9100/metrics`
- Prometheus operator selects ServiceMonitors via `serviceMonitorSelector.matchLabels.release: kps`
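As a rough picture of what a counter labelled by `{method,path,status}` involves, the sketch below counts requests with a status-capturing `ResponseWriter`. It uses a plain map instead of the Prometheus client, so `counter`, `countRequests`, and the `/ok` route are all hypothetical stand-ins, not the `pkg/metrics` code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// statusRecorder captures the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// counter is a stand-in for a labelled Prometheus counter, keyed by
// "method path status".
type counter struct {
	mu     sync.Mutex
	counts map[string]int
}

func (c *counter) inc(method, path string, status int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[fmt.Sprintf("%s %s %d", method, path, status)]++
}

// countRequests wraps next and increments the counter once per request.
// The status defaults to 200 when the handler never calls WriteHeader.
func countRequests(c *counter, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, r)
		c.inc(r.Method, r.URL.Path, rec.status)
	})
}

// demoCounts sends three requests and returns the resulting label counts.
func demoCounts() map[string]int {
	c := &counter{counts: map[string]int{}}
	mux := http.NewServeMux()
	mux.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {})
	h := countRequests(c, mux)
	for _, p := range []string{"/ok", "/ok", "/missing"} {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, p, nil))
	}
	return c.counts
}

func main() {
	fmt.Println(demoCounts())
}
```

Using the raw URL path as a label is fine for a sketch, but unbounded paths would create unbounded label values in a real metrics setup.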
Traefik Metrics
- HelmChartConfig in `kube-system` enables prometheus metrics on port 9100
- Traefik service patched to expose `metrics` port 9100
- ServiceMonitor in `monitoring` namespace scrapes it
Auth system
- Traefik ForwardAuth: `auth-forward-auth` Middleware in `auth` namespace. Any Ingress can use it via annotation `traefik.ingress.kubernetes.io/router.middlewares: auth-forward-auth@kubernetescrd`.
- The `/verify` endpoint on gateway returns a 302 redirect to login (not 401) so unauthenticated browser users get redirected seamlessly.
- Cookie is set with `Domain: homelab.local` so it works on all subdomains.
- Gateway calls the users service internally via `USERS_SERVICE=http://users` (port 80).
- Users service auto-seeds admin on first startup from `ADMIN_EMAIL`/`ADMIN_PASSWORD` env vars.
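The `/verify` behaviour described above (302 to login instead of 401, cookie scoped to `homelab.local`) might look roughly like the following. This is a sketch under assumptions: the cookie name `session`, the login URL `http://auth.homelab.local/login`, and the helper names are hypothetical, and the gateway's real handler presumably validates the cookie rather than just checking that it exists.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical values: the document does not name the cookie or login URL.
const (
	sessionCookie = "session"
	loginURL      = "http://auth.homelab.local/login"
)

// verify answers a ForwardAuth check. A request carrying a session cookie
// gets 200; anything else is redirected to the login page with 302 instead
// of being rejected with 401. Real code would also validate the value.
func verify(w http.ResponseWriter, r *http.Request) {
	if c, err := r.Cookie(sessionCookie); err == nil && c.Value != "" {
		w.WriteHeader(http.StatusOK)
		return
	}
	http.Redirect(w, r, loginURL, http.StatusFound)
}

// newSessionCookie builds a cookie scoped to homelab.local so that every
// subdomain sends it back.
func newSessionCookie(value string) *http.Cookie {
	return &http.Cookie{
		Name:     sessionCookie,
		Value:    value,
		Domain:   "homelab.local",
		Path:     "/",
		HttpOnly: true,
	}
}

// check runs verify once, optionally with a cookie, and reports the status
// code and Location header.
func check(withCookie bool) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/verify", nil)
	if withCookie {
		req.AddCookie(newSessionCookie("token"))
	}
	rec := httptest.NewRecorder()
	verify(rec, req)
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	fmt.Println(check(false))
	fmt.Println(check(true))
}
```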
Frontend
- npm workspaces at root. Shared primitives in `packages/ui/` (`@homelab/ui`), consumed via Vite alias (not workspace exports) to avoid `.astro` resolution issues across packages.
- Tailwind v4: `@source "../"` in shared CSS so JIT scans `packages/ui/` for class usage.
- App-specific components go in `apps/<app>/services/ui/src/components/`, not in `packages/ui/`.
Infra (Terraform at infrastructure/terraform/)
Architecture (local-exec for all native K8s resources)
The Terraform Kubernetes provider (hashicorp/kubernetes v2.32.0 and v2.38.0) hangs on all write operations (Create) against k3d v1.33.6's API server. The helm provider works fine. Therefore:
- Namespaces: `terraform_data` + `local-exec` with `kubectl create namespace --dry-run=client -o yaml | kubectl apply -f -`
- MongoDB Secret, Service, StatefulSet: `terraform_data` + `local-exec` with inline YAML piped to `kubectl apply -f -`
- Helm releases (kube-prometheus-stack, Jaeger, Loki, Fluent Bit): `helm_release` resource, which works fine
- random_password: used for MongoDB root password and Grafana admin password
- Provider auth: explicit client certificate/key/CA from k3d kubeconfig (decoded from `client-certificate-data`/`client-key-data`/`certificate-authority-data`), `0.0.0.0` → `127.0.0.1` in server URL. `config_path` causes provider crash. `insecure=true` conflicts with `cluster_ca_certificate`.
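The two kubeconfig transformations in the provider auth bullet (base64-decoding the embedded credential fields and rewriting `0.0.0.0` to `127.0.0.1`) are shown below as a Go sketch. Terraform does this with its own functions; the helper names and the `6443` port here are assumptions for illustration only.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// localServer rewrites the 0.0.0.0 host in a kubeconfig server URL to
// 127.0.0.1, leaving scheme and port untouched.
func localServer(server string) string {
	return strings.Replace(server, "://0.0.0.0", "://127.0.0.1", 1)
}

// decodeField decodes one base64-encoded kubeconfig field (for example
// client-certificate-data) into its PEM text.
func decodeField(b64 string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return "", fmt.Errorf("decode kubeconfig field: %w", err)
	}
	return string(raw), nil
}

func main() {
	// The port is an assumption; use whatever the kubeconfig contains.
	fmt.Println(localServer("https://0.0.0.0:6443"))

	// Placeholder payload, not a real certificate.
	sample := base64.StdEncoding.EncodeToString([]byte("-----BEGIN CERTIFICATE-----\n"))
	pem, err := decodeField(sample)
	fmt.Printf("%q %v\n", pem, err)
}
```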
Terraform state contents
- 5× `terraform_data` (auth, home, test, monitoring, mongodb namespaces + mongodb_secret, mongodb_service, mongodb_statefulset via local-exec)
- 2× `random_password` (mongodb, grafana)
- 4× `helm_release` (kube-prometheus-stack, jaeger, loki, fluent-bit)
Monitoring stack
- kube-prometheus-stack (Prometheus + Grafana), Jaeger v2, Loki, Fluent Bit: all installed via `helm_release` into the `monitoring` namespace.
- Prometheus operator selects ServiceMonitors via `serviceMonitorSelector.matchLabels.release: kps`.
- Grafana: user `admin`, password stored in the `kps-grafana` K8s Secret, ingress at `grafana.homelab.local`.
- Jaeger: OTLP gRPC `jaeger.monitoring.svc:4317`, OTLP HTTP `:4318`, UI at `jaeger.homelab.local`.
- Traefik metrics: Pre-enabled in k3d, but the `metrics` port must be added to the Traefik service manually (`kubectl patch svc -n kube-system traefik`). A `ServiceMonitor` in `monitoring` namespace scrapes it.
MongoDB
- StatefulSet `mongo:8` deployed by `terraform_data` (local-exec kubectl apply)
- Secret `mongodb` in `mongodb` namespace with `MONGO_INITDB_ROOT_PASSWORD`, `MONGO_URI`, `MONGO_DB`
- MongoDB secret is copied to `auth`, `finance`, `test` namespaces (as both `mongodb` and `mongo` names) via `infrastructure/copy-mongo-secret.sh` (run by `make infra`)
- Deployments reference it as `mongo` via `envFrom.secretRef`
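Because the secret is injected with `envFrom.secretRef`, its keys arrive in the container as plain environment variables. A service could read them along these lines; `MongoConfig` and `loadMongoConfig` are hypothetical names, and the shared `pkg/mongo` package may handle this differently.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// MongoConfig holds the connection settings taken from the copied secret.
type MongoConfig struct {
	URI string
	DB  string
}

// loadMongoConfig reads MONGO_URI and MONGO_DB through getenv and fails
// early when either is missing, which usually means the secret was not
// copied into the service's namespace.
func loadMongoConfig(getenv func(string) string) (MongoConfig, error) {
	cfg := MongoConfig{URI: getenv("MONGO_URI"), DB: getenv("MONGO_DB")}
	if cfg.URI == "" {
		return MongoConfig{}, errors.New("MONGO_URI is not set")
	}
	if cfg.DB == "" {
		return MongoConfig{}, errors.New("MONGO_DB is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadMongoConfig(os.Getenv)
	if err != nil {
		fmt.Println("mongo config error:", err)
		return
	}
	// Print only the database name; the URI contains credentials.
	fmt.Println("using database", cfg.DB)
}
```

Passing `getenv` as a parameter keeps the function testable without touching the process environment.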
Traefik metrics
- HelmChartConfig in `infrastructure/traefik-metrics.sh` (applied by `make infra`)
- Requires `kubectl patch svc traefik -n kube-system` to add metrics port
DNS
All subdomains must resolve to 127.0.0.1. Currently configured in /etc/hosts. Run via sudo:
sudo sed -i '' '/homelab.local/d' /etc/hosts && \
echo '127.0.0.1 homelab.local auth.homelab.local grafana.homelab.local jaeger.homelab.local finance.homelab.local' | \
sudo tee -a /etc/hosts
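To check which names are still missing from the hosts file, something like the following sketch could be used. It parses hosts-file text passed in as a string (so it does not read `/etc/hosts` itself), and `missingHosts` is a hypothetical helper, not part of the repo.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// missingHosts returns the names from want that hostsFile does not map to
// 127.0.0.1. Comments and entries for other addresses are ignored.
func missingHosts(hostsFile string, want []string) []string {
	mapped := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(hostsFile))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != "127.0.0.1" {
			continue
		}
		for _, name := range fields[1:] {
			mapped[name] = true
		}
	}
	var missing []string
	for _, name := range want {
		if !mapped[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Sample content; a real check would read /etc/hosts instead.
	hosts := "127.0.0.1 localhost\n127.0.0.1 homelab.local auth.homelab.local\n"
	want := []string{"homelab.local", "auth.homelab.local", "grafana.homelab.local"}
	fmt.Println(missingHosts(hosts, want))
}
```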
Known issues
kube-prometheus-stack upgrade hangs
Any change to the kube_prometheus_stack helm_release (even create_namespace: false → true) triggers an upgrade that hangs for 2+ minutes due to CRD processing. Workaround: avoid changing it, or set create_namespace = true and leave it unchanged. If stuck in pending-upgrade, rollback via helm rollback kps <revision> -n monitoring.
Local dev
- `k3d` cluster must be running (`k3d cluster list` to check).
- No lint/typecheck/test commands exist yet.
- No CI, no pre-commit hooks.