rcourtman
85ffe10aed
docs: add Mermaid diagrams to improve visual documentation
Enhance documentation with six Mermaid diagrams to better explain
complex system implementations:
- Adaptive polling lifecycle flowchart showing enqueue→execute→feedback
  cycle with scheduler, priority queue, and worker interactions
- Circuit breaker state machine diagram illustrating Closed↔Open↔Half-open
  transitions with triggers and recovery paths
- Temperature proxy architecture diagram highlighting trust boundaries,
  security controls, and data flow between host/container/cluster
- Sensor proxy request flow sequence diagram showing auth, rate limiting,
  validation, and SSH execution pipeline
- Alert webhook pipeline flowchart detailing template resolution, URL
  rendering, HTTP dispatch, and retry logic
- Script library workflow diagram illustrating dev→test→bundle→distribute
  lifecycle emphasizing modular design
These diagrams make it easier for operators and contributors to
understand Pulse's architecture.
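
The Closed↔Open↔Half-open transitions named above can be sketched as a small state machine. This is a minimal illustration only, not Pulse's implementation: the `Breaker` type, the failure threshold, and the cooldown are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// State is one of the three breaker states named in the commit message.
type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

func (s State) String() string {
	return [...]string{"closed", "open", "half-open"}[s]
}

// Breaker is a hypothetical type for illustration; it is not Pulse's circuitBreaker.
type Breaker struct {
	state     State
	failures  int
	threshold int           // assumed: consecutive failures before opening
	cooldown  time.Duration // assumed: wait before a half-open probe is allowed
	retryAt   time.Time
}

func NewBreaker(threshold int, cooldown time.Duration) *Breaker {
	return &Breaker{threshold: threshold, cooldown: cooldown}
}

// Allow reports whether a call may proceed at time now.
// An open breaker moves to half-open once the cooldown has elapsed.
func (b *Breaker) Allow(now time.Time) bool {
	if b.state == Open {
		if now.Before(b.retryAt) {
			return false
		}
		b.state = HalfOpen
	}
	return true
}

// Record feeds the outcome of a call back into the breaker.
// Any success closes it; a failed half-open probe reopens it immediately.
func (b *Breaker) Record(success bool, now time.Time) {
	if success {
		b.state, b.failures = Closed, 0
		return
	}
	b.failures++
	if b.state == HalfOpen || b.failures >= b.threshold {
		b.state = Open
		b.retryAt = now.Add(b.cooldown)
	}
}

// demoTransitions walks one closed -> open -> half-open -> closed cycle.
func demoTransitions() []string {
	now := time.Unix(0, 0)
	b := NewBreaker(2, 30*time.Second)
	var seen []string

	b.Record(false, now)
	b.Record(false, now) // second failure reaches the threshold
	seen = append(seen, b.state.String())

	later := now.Add(31 * time.Second)
	b.Allow(later) // cooldown elapsed, so one probe is let through
	seen = append(seen, b.state.String())

	b.Record(true, later) // the probe succeeded
	seen = append(seen, b.state.String())
	return seen
}

func main() {
	fmt.Println(demoTransitions()) // [open half-open closed]
}
```

Passing the clock in as a parameter keeps the sketch deterministic, which makes the transitions easy to test without sleeping.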
2025-10-21 10:40:33 +00:00
rcourtman
fd0a4f2b0a
docs: update documentation for v4.24.0 features
Updates documentation to reflect features implemented in recent commits:
**Security & API Enhancements:**
- Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After)
- Audit logging for rollback actions and scheduler health
- Runtime logging configuration tracking
**Scheduler Health API:**
- Document new v4.24.0 endpoint features
- Per-instance circuit breaker status
- Dead-letter queue tracking
- Staleness metrics
- Enhanced response format with backward compatibility
**Version & Health Endpoints:**
- Updated /api/version response fields
- Optional health endpoint fields
- Deployment type and update availability
**Configuration & Installation:**
- HTTP config fetch via PULSE_INIT_CONFIG_URL
- Updated environment variable documentation
- Enhanced FAQ entries
**Monitoring & Operations:**
- Adaptive polling architecture documentation
- Rollback procedure references
- Production deployment guidance
All documentation changes align with implemented features from commits:
- 656ae0d25 (PMG test fix)
- dec85a4ef (PBS/PMG stubs + HTTP config)
- Earlier commits: scheduler health API, rollback, rate limiting
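
As an illustration of how a client might consume the rate limit headers listed under Security & API Enhancements, here is a minimal sketch. The `RateInfo` type and `ParseRateInfo` function are hypothetical names, the header values are sample data, and the code assumes `Retry-After` carries a number of seconds (HTTP also permits a date in that header, which this sketch ignores).

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// RateInfo holds the values of the three headers named in the commit message.
// It is a hypothetical type for this example, not part of Pulse.
type RateInfo struct {
	Limit      int           // X-RateLimit-Limit, or -1 if absent or malformed
	Remaining  int           // X-RateLimit-Remaining, or -1 if absent or malformed
	RetryAfter time.Duration // Retry-After in seconds, or 0 if absent
}

// ParseRateInfo extracts rate limit information from response headers.
func ParseRateInfo(h http.Header) RateInfo {
	atoi := func(name string) int {
		n, err := strconv.Atoi(h.Get(name))
		if err != nil {
			return -1
		}
		return n
	}
	info := RateInfo{
		Limit:     atoi("X-RateLimit-Limit"),
		Remaining: atoi("X-RateLimit-Remaining"),
	}
	// Assumption: the server sends delta-seconds rather than an HTTP date.
	if secs, err := strconv.Atoi(h.Get("Retry-After")); err == nil && secs > 0 {
		info.RetryAfter = time.Duration(secs) * time.Second
	}
	return info
}

// demoRateInfo parses a set of sample headers (values are made up).
func demoRateInfo() RateInfo {
	h := http.Header{}
	h.Set("X-RateLimit-Limit", "500")
	h.Set("X-RateLimit-Remaining", "0")
	h.Set("Retry-After", "12")
	return ParseRateInfo(h)
}

func main() {
	info := demoRateInfo()
	if info.Remaining == 0 && info.RetryAfter > 0 {
		fmt.Println("rate limited; retry in", info.RetryAfter)
	}
}
```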
2025-10-20 16:08:10 +00:00
rcourtman
160adeb3b8
feat: add scheduler health API endpoint (Phase 2 Task 8)
Task 8 of 10 complete. Exposes read-only scheduler health data including:
- Queue depth and distribution by instance type
- Dead-letter queue inspection (top 25 tasks with error details)
- Circuit breaker states (instance-level)
- Staleness scores per instance
New API endpoint:
GET /api/monitoring/scheduler/health (requires authentication)
New snapshot methods:
- StalenessTracker.Snapshot() - exports all staleness data
- TaskQueue.Snapshot() - queue depth & per-type distribution
- TaskQueue.PeekAll() - dead-letter task inspection
- circuitBreaker.State() - exports state, failures, retryAt
- Monitor.SchedulerHealth() - aggregates all health data
Documentation updated with API spec, field descriptions, and usage examples.
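
The snapshot methods listed above follow a common pattern: take a lock, copy out read-only statistics, and return the copy so callers never touch live state. The sketch below shows that pattern for queue depth and per-type distribution. All types here (`task`, `queue`, `QueueSnapshot`) are hypothetical stand-ins, and the instance type strings are example values; none of this is Pulse's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// task and queue are hypothetical stand-ins, not Pulse's real types.
type task struct {
	Instance string
	Type     string // example values: "pve", "pbs", "pmg"
}

type queue struct {
	mu    sync.Mutex
	tasks []task
}

// QueueSnapshot is a point-in-time copy of queue statistics.
// It shares no memory with the queue, so readers cannot race with writers.
type QueueSnapshot struct {
	Depth   int
	PerType map[string]int
}

// Push appends a task under the lock.
func (q *queue) Push(t task) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.tasks = append(q.tasks, t)
}

// Snapshot counts tasks by type while holding the lock and returns the copy.
func (q *queue) Snapshot() QueueSnapshot {
	q.mu.Lock()
	defer q.mu.Unlock()
	snap := QueueSnapshot{Depth: len(q.tasks), PerType: make(map[string]int)}
	for _, t := range q.tasks {
		snap.PerType[t.Type]++
	}
	return snap
}

// demoSnapshot fills a queue with sample tasks and snapshots it.
func demoSnapshot() QueueSnapshot {
	q := &queue{}
	q.Push(task{Instance: "node-a", Type: "pve"})
	q.Push(task{Instance: "node-b", Type: "pve"})
	q.Push(task{Instance: "backup-1", Type: "pbs"})
	return q.Snapshot()
}

func main() {
	snap := demoSnapshot()
	fmt.Println("depth:", snap.Depth)
	fmt.Println("pve:", snap.PerType["pve"], "pbs:", snap.PerType["pbs"])
}
```

An aggregating method in the style of `Monitor.SchedulerHealth()` could then call each component's snapshot in turn and combine the results into one response, without holding more than one lock at a time.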
2025-10-20 15:13:38 +00:00
rcourtman
5fbdf6099f
docs: add adaptive polling architecture guide (Phase 2 Task 10)
Comprehensive documentation for Phase 2 adaptive polling:
- Architecture overview with component diagram
- Configuration guide (env vars, defaults, feature flag)
- Prometheus metrics reference (7 new metrics)
- Circuit breaker & backoff behavior explanation
- Dead-letter queue operational guidance
- Rollout plan (dev/QA → staged → full)
- Troubleshooting guide for common issues
Task 10 of 10 complete. Phase 2: 8/10 tasks implemented (80%).
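
For readers unfamiliar with the backoff behavior mentioned in the list above, a capped exponential backoff can be sketched as follows. The base delay, the cap, and the doubling factor are assumptions chosen for illustration; the guide itself is the reference for the values Pulse actually uses.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; these are not Pulse's defaults.
const (
	baseDelay = 5 * time.Second
	maxDelay  = 60 * time.Second
)

// backoff returns the delay before retry number attempt (0-based),
// doubling from base on each attempt and never exceeding max.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		return max
	}
	return d
}

// demoSchedule returns the delays for the first n attempts.
func demoSchedule(n int) []time.Duration {
	out := make([]time.Duration, 0, n)
	for attempt := 0; attempt < n; attempt++ {
		out = append(out, backoff(attempt, baseDelay, maxDelay))
	}
	return out
}

func main() {
	fmt.Println(demoSchedule(6)) // [5s 10s 20s 40s 1m0s 1m0s]
}
```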
2025-10-20 15:13:37 +00:00