observability-metrics
Validate and operate application /health, /metrics, and logging behavior. Use when adding metrics, changing observability auth, debugging production-like issues, or verifying deployment health.
Version
1.1.0
Maturity
draft
Repository
agent-skills
License
Proprietary
Skill metadata
SKILL.md
Observability metrics
Use this skill when
- Adding a metric, a health probe, or touching `/metrics` or `/health` behaviour.
- Changing auth posture for observability endpoints.
- Verifying that a deployment is reachable and reporting sane telemetry.
- Debugging production-like issues where the question is whether telemetry itself is healthy.
Do not use this skill when
- The core risk is secret handling or request-handling safety (use `security-basics`).
- The work is wiring new deployment infrastructure rather than validating telemetry endpoints.
Inputs to gather
- The service port and any token needed to reach `/metrics` in the current environment.
- Whether metrics auth is enabled in this tier.
- The specific metric name or endpoint response being validated.
First move
`curl` the `/health` and `/metrics` endpoints directly and read the status and body before assuming application-level issues.
Standard endpoints
| Endpoint | Purpose | Default auth |
|---|---|---|
| `GET /health` | Liveness/readiness check | Public |
| `GET /metrics` | Prometheus-format operational metrics | Optional; protect in production |
Validation
curl -i http://localhost:<PORT>/health
curl -i http://localhost:<PORT>/metrics
# If metrics auth is enabled:
curl -i -H "Authorization: Bearer $AUTH_TOKEN" http://localhost:<PORT>/metrics
- `/health` → expect `200 OK`
- `/metrics` → expect `200 OK` with Prometheus text format (`# HELP`, `# TYPE` lines)
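When the check needs to run in a script or test rather than by eye, the body returned by `/metrics` can be sanity-checked for the same markers. The following is a minimal sketch in Go, not a full parser; the function name `looksLikePrometheusText` and the sample metric name are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikePrometheusText reports whether body contains at least one
// "# HELP" line and at least one "# TYPE" line. It is a quick sanity
// check on a /metrics response body, not a validator of the format.
func looksLikePrometheusText(body string) bool {
	var hasHelp, hasType bool
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "# HELP "):
			hasHelp = true
		case strings.HasPrefix(line, "# TYPE "):
			hasType = true
		}
	}
	return hasHelp && hasType
}

func main() {
	// Hypothetical sample body; the metric name is illustrative only.
	sample := "# HELP http_requests_total Total HTTP requests.\n" +
		"# TYPE http_requests_total counter\n" +
		"http_requests_total 42\n"

	fmt.Println(looksLikePrometheusText(sample))             // true
	fmt.Println(looksLikePrometheusText("<html>404</html>")) // false
}
```

A `200 OK` with a body that fails this check usually means something other than the metrics handler answered, for example a proxy error page or a catch-all route.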
Security
/metrics exposes error rates, latency, queue depths, and operational state — treat it as sensitive.
- Enable auth protection for `/metrics` in production (env var or reverse proxy ACL).
- `/health` can remain public; it must be reachable by load balancer health checks.
- Structured logs must not contain secrets, auth tokens, full URLs with credentials, or PII.
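One way the auth posture above could look in application code is a small bearer-token middleware wrapped around `/metrics` only, leaving `/health` public. This is a sketch under assumptions, not the service's actual implementation: the names `bearerTokenOK`, `requireBearer`, `newMux`, and `status`, the token value, and the placeholder metrics handler are all hypothetical. It uses `httptest` so it runs without opening a port.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerTokenOK reports whether an Authorization header value carries the
// expected bearer token. An empty expected token never matches, and the
// comparison is constant-time for equal-length inputs.
func bearerTokenOK(header, want string) bool {
	const prefix = "Bearer "
	if want == "" || !strings.HasPrefix(header, prefix) {
		return false
	}
	got := strings.TrimPrefix(header, prefix)
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// requireBearer rejects requests without the expected token with 401.
func requireBearer(token string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !bearerTokenOK(r.Header.Get("Authorization"), token) {
			w.Header().Set("WWW-Authenticate", "Bearer")
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newMux wires a public /health and a token-protected /metrics.
// The metrics handler here is a placeholder for the real one.
func newMux(metricsToken string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	metrics := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# placeholder for the real metrics output")
	})
	mux.Handle("/metrics", requireBearer(metricsToken, metrics))
	return mux
}

// status sends an in-memory GET request and returns the response code.
func status(h http.Handler, path, auth string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newMux("example-token") // hypothetical token value
	fmt.Println(status(mux, "/health", ""))                      // 200
	fmt.Println(status(mux, "/metrics", ""))                     // 401
	fmt.Println(status(mux, "/metrics", "Bearer example-token")) // 200
}
```

If protection is handled by a reverse proxy ACL instead, the application code stays unchanged and the same three requests are still a useful check from outside the proxy.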
Adding a metric
- Define the metric using your metrics library (e.g. `prometheus.NewCounter`, `prometheus.NewHistogram`).
- Register it during app initialisation, not per-request.
- Instrument the code path where the metric is recorded.
- Start the server and verify the metric appears in `GET /metrics` output.
Troubleshooting
| Symptom | Fix |
|---|---|
| `/health` returns non-200 | Check DB connectivity and app startup logs |
| `/metrics` returns 401 | Pass `Authorization: Bearer <token>` or check `METRICS_AUTH_ENABLED` setting |
| Expected metric not in output | Confirm it was registered at startup; confirm the instrumented code path executed |
Guardrails
- Protect `/metrics` in production; treat it as sensitive operational data.
- Keep `/health` reachable by load balancers and free of internal-state leakage.
- Register metrics at startup, not per request, to avoid duplicate-registration panics.
- Never let structured logs carry secrets, tokens, full credential URLs, or PII.
Support files
- Read `references/examples.md` when you need concrete user utterances, expected behaviour, or a model answer shape to mirror.
- Read `references/edge-cases.md` when the request is a near miss, partially matches this skill, or the first attempt fails.