rcourtman
7204a76e3a
refactor(error-handling): log rename failure in logging rotation
...
rotateLocked() silently swallowed rename errors during log rotation.
Now reports failures to stderr (can't use log package from within the
logging package itself).
2026-02-11 14:08:53 +00:00
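The change described in the commit above can be sketched roughly as follows. The body of rotateLocked() is not shown in this log, so `rotate`, the `.1` suffix, and the message format are hypothetical stand-ins; only the idea of writing the rename error straight to stderr comes from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rotate is a hypothetical stand-in for rotateLocked. It renames the active
// log file to path+".1" and reports a failed rename on stderr instead of
// dropping the error. It returns true when the rename succeeded.
func rotate(path string) bool {
	if err := os.Rename(path, path+".1"); err != nil {
		// Writing to os.Stderr directly avoids calling back into the
		// logger whose file is being rotated.
		fmt.Fprintf(os.Stderr, "logging: rotate %s: %v\n", path, err)
		return false
	}
	return true
}

func main() {
	dir, err := os.MkdirTemp("", "rotate-demo")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "app.log")
	if err := os.WriteFile(path, []byte("line\n"), 0o600); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}

	// The first call moves app.log aside. The second call finds no source
	// file, so the rename fails and the error is printed to stderr.
	fmt.Println("first rotation ok:", rotate(path))
	fmt.Println("second rotation ok:", rotate(path))
}
```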
rcourtman
df43f08cf2
fix: address linting issues and test adjustments
...
cmd/eval/main.go:
- Fix fmt.Errorf format string lint warning (use %s instead of bare string)
internal/logging/logging_test.go:
- Update tests to account for LogBroadcaster wrapper in baseWriter
- Use string representation checks instead of direct pointer comparison
- Verify both the underlying writer and broadcaster are present
2026-02-01 23:27:11 +00:00
rcourtman
bb8fbfb411
feat(backend): implement real-time log broadcasting and handlers
2026-01-30 19:01:58 +00:00
rcourtman
c6bd8cb74c
Improve internal package test coverage
2025-12-29 17:25:21 +00:00
rcourtman
6a8258be14
chore: remove dead code and unused exports
...
Remove ~900 lines of unused code identified by static analysis:
Go:
- internal/logging: Remove 10 unused functions (InitFromConfig, New,
FromContext, WithLogger, etc.) that were built but never integrated
- cmd/pulse-sensor-proxy: Remove 7 dead validation functions for a
removed command execution feature
- internal/metrics: Remove 8 unused notification metric functions and
10 Prometheus metrics that were never wired up
Frontend:
- Delete ActivationBanner.tsx stub component
- Remove unused exports: stopMetricsSampler, getSamplerStatus,
formatSpeedCompact, parseMetricKey, getResourceAlerts
2025-11-27 13:17:39 +00:00
rcourtman
611740087c
style: fix additional staticcheck warnings
...
- Lowercase error messages (ST1005)
- Use context.Background() instead of nil (SA1012)
- Fix rand.Intn(1) which always returns 0 (SA4030)
- Remove unnecessary nil check before len() (S1009)
2025-11-27 09:21:11 +00:00
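For readers unfamiliar with the staticcheck codes in the commit above, here is a small illustration of each one. The function and variable names are hypothetical and are not taken from the repository.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
)

// ST1005: error strings start with a lowercase letter and carry no
// trailing punctuation, because they are often wrapped in other messages.
var errNotFound = errors.New("node not found")

// S1009: len of a nil slice is 0, so a separate nil check is redundant.
func hasItems(items []string) bool {
	return len(items) > 0
}

// SA4030: rand.Intn(1) draws from the half-open range [0, 1), so the only
// possible result is 0. alwaysZero confirms this over many draws.
func alwaysZero() bool {
	for i := 0; i < 1000; i++ {
		if rand.Intn(1) != 0 {
			return false
		}
	}
	return true
}

// SA1012: functions taking a context should never receive nil. Callers
// with no better context pass context.Background().
func pending(ctx context.Context) bool {
	return ctx.Err() == nil
}

func main() {
	fmt.Println(errNotFound)
	fmt.Println(hasItems(nil), hasItems([]string{"pve1"}))
	fmt.Println(alwaysZero())
	fmt.Println(pending(context.Background()))
}
```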
rcourtman
0dc0235f77
chore: remove dead code and unused files
...
Remove 604 lines of unreachable code identified by deadcode analysis:
- internal/config/credentials.go: unused credential resolver
- internal/config/registration.go: unused registration config
- internal/monitoring/poller.go: unused channel-based polling (keep types)
- internal/api/middleware.go: unused TimeoutHandler, JSONHandler, NewAPIError, ValidationError
- internal/api/security.go: unused IsLockedOut, SecurityHeaders
- internal/api/auth.go: unused min helper
- internal/config/config.go: unused SaveConfig
- internal/config/client_helpers.go: unused CreatePBSConfigFromFields
- internal/logging/logging.go: unused NewRequestID
2025-11-27 00:05:04 +00:00
rcourtman
01f7d81d38
style: fix gofmt formatting inconsistencies
...
Run gofmt -w to fix tab/space inconsistencies across 33 files.
2025-11-26 23:44:36 +00:00
rcourtman
2786afdff0
feat: comprehensive diagnostics and observability improvements
...
Upgrade diagnostics infrastructure from 5/10 to 8/10 production readiness
with enhanced metrics, logging, and request correlation capabilities.
**Request Correlation**
- Wire request IDs through context in middleware
- Return X-Request-ID header in all API responses
- Enable downstream log correlation across request lifecycle
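A minimal sketch of the request-correlation idea, assuming a plain net/http middleware. The names `withRequestID`, `requestID`, and `ctxKey` are hypothetical, and reusing an incoming X-Request-ID header is an assumption not stated in the commit.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ctxKey is an unexported key type so other packages cannot collide with it.
type ctxKey struct{}

// newID returns 16 random hex characters, or "unknown" if the system
// random source fails.
func newID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "unknown"
	}
	return hex.EncodeToString(b)
}

// withRequestID stores a request ID in the context and echoes it in the
// X-Request-ID response header. Honoring a caller-supplied ID is an
// assumption made for this sketch.
func withRequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			id = newID()
		}
		w.Header().Set("X-Request-ID", id)
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// requestID returns the ID stored by withRequestID, or "" if none is set.
func requestID(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// roundTrip sends one in-memory request through the middleware and returns
// the response header value and the ID seen by the inner handler.
func roundTrip(incoming string) (header, seen string) {
	h := withRequestID(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen = requestID(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "/api/example", nil)
	if incoming != "" {
		req.Header.Set("X-Request-ID", incoming)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get("X-Request-ID"), seen
}

func main() {
	header, seen := roundTrip("")
	fmt.Println("header matches context:", header == seen)
}
```

Log lines written further down the call chain can then include `requestID(ctx)` so that all entries for one request share the same value.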
**HTTP/API Metrics** (18 new Prometheus metrics)
- pulse_http_request_duration_seconds - API latency histogram
- pulse_http_requests_total - request counter by method/route/status
- pulse_http_request_errors_total - error counter by type
- Path normalization to control label cardinality
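Path normalization matters because every distinct label value creates a new Prometheus time series. The sketch below shows one possible approach; the rules (all-digit segments and long hex segments become `:id`) and the example paths are assumptions, not the logic in internal/api/middleware.go.

```go
package main

import (
	"fmt"
	"strings"
)

// isDigits reports whether s is non-empty and contains only ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// isLongHex reports whether s has at least 16 characters, all hexadecimal.
func isLongHex(s string) bool {
	if len(s) < 16 {
		return false
	}
	for _, r := range s {
		isHex := (r >= '0' && r <= '9') || (r >= 'a' && r <= 'f') || (r >= 'A' && r <= 'F')
		if !isHex {
			return false
		}
	}
	return true
}

// normalizePath replaces ID-like path segments with ":id" so that a route
// label takes a bounded number of values regardless of how many resources
// exist.
func normalizePath(p string) string {
	segs := strings.Split(p, "/")
	for i, s := range segs {
		if isDigits(s) || isLongHex(s) {
			segs[i] = ":id"
		}
	}
	return strings.Join(segs, "/")
}

func main() {
	// Hypothetical paths, used only to exercise the function.
	fmt.Println(normalizePath("/api/guests/104/metrics"))
	fmt.Println(normalizePath("/api/diagnostics"))
}
```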
**Per-Node Poll Metrics**
- pulse_monitor_node_poll_duration_seconds - per-node timing
- pulse_monitor_node_poll_total - success/error counts per node
- pulse_monitor_node_poll_errors_total - error breakdown per node
- pulse_monitor_node_poll_last_success_timestamp - freshness tracking
- pulse_monitor_node_poll_staleness_seconds - age since last success
- Enables multi-node hotspot identification
**Scheduler Health Metrics**
- pulse_scheduler_queue_due_soon - ready queue depth
- pulse_scheduler_queue_depth - by instance type
- pulse_scheduler_queue_wait_seconds - time in queue histogram
- pulse_scheduler_dead_letter_depth - failed task tracking
- pulse_scheduler_breaker_state - circuit breaker state
- pulse_scheduler_breaker_failure_count - consecutive failures
- pulse_scheduler_breaker_retry_seconds - time until retry
- Enables alerting on DLQ spikes, breaker opens, queue backlogs
**Diagnostics Endpoint Caching**
- pulse_diagnostics_cache_hits_total - cache performance
- pulse_diagnostics_cache_misses_total - cache misses
- pulse_diagnostics_refresh_duration_seconds - probe timing
- 45-second TTL prevents thundering herd on /api/diagnostics
- Thread-safe with RWMutex
- X-Diagnostics-Cached-At header shows cache freshness
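The caching behaviour listed above can be sketched as a small TTL cache guarded by a sync.RWMutex. `diagCache` and its fields are hypothetical; only the 45-second TTL, the RWMutex, and the cached-at timestamp come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// diagCache holds one cached payload and the time it was produced.
type diagCache struct {
	mu       sync.RWMutex
	ttl      time.Duration
	payload  string
	cachedAt time.Time
}

// get returns the cached payload while it is younger than the TTL and
// otherwise calls refresh. The refresh runs under the write lock, so
// concurrent callers wait for one refresh instead of each starting their own.
func (c *diagCache) get(now time.Time, refresh func() string) (string, time.Time) {
	c.mu.RLock()
	payload, at := c.payload, c.cachedAt
	c.mu.RUnlock()
	if !at.IsZero() && now.Sub(at) < c.ttl {
		return payload, at
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check: another goroutine may have refreshed while we waited.
	if !c.cachedAt.IsZero() && now.Sub(c.cachedAt) < c.ttl {
		return c.payload, c.cachedAt
	}
	c.payload, c.cachedAt = refresh(), now
	return c.payload, c.cachedAt
}

// demoRefreshCount requests the payload at 0s, 10s, and 50s with a 45s TTL
// and returns how many times the refresh function ran.
func demoRefreshCount() int {
	c := &diagCache{ttl: 45 * time.Second}
	refreshes := 0
	refresh := func() string {
		refreshes++
		return fmt.Sprintf("diagnostics #%d", refreshes)
	}
	t0 := time.Date(2025, 10, 21, 12, 0, 0, 0, time.UTC)
	for _, offset := range []time.Duration{0, 10 * time.Second, 50 * time.Second} {
		c.get(t0.Add(offset), refresh)
	}
	return refreshes
}

func main() {
	fmt.Println("refreshes:", demoRefreshCount())
}
```

A handler built on this would write the returned timestamp into the X-Diagnostics-Cached-At header so clients can see how fresh the response is.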
**Debug Log Performance**
- Gate high-frequency debug logs behind IsLevelEnabled() checks
- Reduces CPU waste in production when debug logging is disabled
- Covers scheduler loops, poll cycles, API handlers
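The gating pattern can be illustrated with the standard library's log/slog, whose `Enabled` method plays the role that IsLevelEnabled() plays in the project. This is an analogy under that assumption, not the project's logging API; `pollOnce` and `summaryCalls` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"log/slog"
)

// pollOnce builds its debug summary only when debug logging is enabled, so
// the cost of summarize is skipped entirely at higher log levels.
func pollOnce(logger *slog.Logger, summarize func() string) {
	if logger.Enabled(context.Background(), slog.LevelDebug) {
		logger.Debug("poll cycle complete", "summary", summarize())
	}
}

// summaryCalls runs three poll cycles and returns how many times the
// expensive summary function was invoked.
func summaryCalls(debug bool) int {
	level := slog.LevelInfo
	if debug {
		level = slog.LevelDebug
	}
	logger := slog.New(slog.NewTextHandler(io.Discard, &slog.HandlerOptions{Level: level}))

	calls := 0
	for i := 0; i < 3; i++ {
		pollOnce(logger, func() string {
			calls++
			return "expensive summary"
		})
	}
	return calls
}

func main() {
	fmt.Println("calls with debug off:", summaryCalls(false))
	fmt.Println("calls with debug on:", summaryCalls(true))
}
```

Without the guard, the arguments to the debug call would be evaluated on every loop iteration even though the handler would then discard the record.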
**Persistent Logging**
- File logging with automatic rotation
- LOG_FILE, LOG_MAX_SIZE, LOG_MAX_AGE, LOG_COMPRESS env vars
- MultiWriter sends logs to both stderr and file
- Gzip compression support for rotated logs
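The two mechanisms named above, fan-out through io.MultiWriter and gzip compression of rotated files, can be sketched with in-memory buffers standing in for stderr and the log file. The helper names are hypothetical, and the rotation logic itself is omitted.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// fanOut writes msg once through an io.MultiWriter and returns what each
// destination received. The two buffers stand in for stderr and the file.
func fanOut(msg string) (string, string) {
	var console, file bytes.Buffer
	w := io.MultiWriter(&console, &file)
	fmt.Fprint(w, msg)
	return console.String(), file.String()
}

// gzipRoundTrip compresses s the way a rotated log could be compressed,
// then decompresses it again to show that the content survives.
func gzipRoundTrip(s string) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		return "", err
	}
	// Close flushes remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return "", err
	}
	zr, err := gzip.NewReader(&buf)
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	console, file := fanOut("level=info msg=started\n")
	fmt.Println("destinations match:", console == file)

	restored, err := gzipRoundTrip("rotated log contents\n")
	fmt.Println("round trip ok:", err == nil && restored == "rotated log contents\n")
}
```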
Files modified:
- internal/api/diagnostics.go (caching layer)
- internal/api/middleware.go (request IDs, HTTP metrics)
- internal/api/http_metrics.go (NEW - HTTP metric definitions)
- internal/logging/logging.go (file logging with rotation)
- internal/monitoring/metrics.go (node + scheduler metrics)
- internal/monitoring/monitor.go (instrumentation, debug gating)
Impact: production troubleshooting gains per-node visibility, scheduler
health metrics, persistent logs, and cached diagnostics, which should
shorten incident response for multi-node deployments.
2025-10-21 12:37:39 +00:00
rcourtman
7d422d2909
feat: add professional logging with runtime configuration and performance optimization
...
Implements a structured logging package with LOG_LEVEL/LOG_FORMAT env support, debug-level guards for hot paths, enriched error messages with actionable context, and stack trace capture for production debugging. Improves observability and reduces log overhead in high-frequency polling loops.
2025-10-20 15:13:38 +00:00
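Runtime configuration through LOG_LEVEL and LOG_FORMAT, as described in the commit above, might look like the sketch below. It uses the standard library's log/slog as a stand-in for the project's logging package. The accepted values ("debug", "info", "warn", "error" and "json" versus text) and the fallback to info are assumptions.

```go
package main

import (
	"io"
	"log/slog"
	"os"
	"strings"
)

// parseLevel converts a LOG_LEVEL value into a slog.Level and falls back to
// info for empty or unrecognized input.
func parseLevel(raw string) slog.Level {
	var lvl slog.Level
	if err := lvl.UnmarshalText([]byte(strings.TrimSpace(raw))); err != nil {
		return slog.LevelInfo
	}
	return lvl
}

// levelName returns the canonical name of the parsed level, for display.
func levelName(raw string) string {
	return parseLevel(raw).String()
}

// newLogger builds a logger for the given level and format strings.
// "json" selects JSON output; anything else selects key=value text.
func newLogger(w io.Writer, level, format string) *slog.Logger {
	opts := &slog.HandlerOptions{Level: parseLevel(level)}
	if strings.EqualFold(strings.TrimSpace(format), "json") {
		return slog.New(slog.NewJSONHandler(w, opts))
	}
	return slog.New(slog.NewTextHandler(w, opts))
}

func main() {
	logger := newLogger(os.Stderr, os.Getenv("LOG_LEVEL"), os.Getenv("LOG_FORMAT"))
	logger.Info("logger configured", "level", levelName(os.Getenv("LOG_LEVEL")))
	logger.Debug("visible only when LOG_LEVEL=debug")
}
```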