📉 Adaptive Polling
Pulse uses an adaptive scheduler to optimize polling based on instance health and activity.
🧠 Architecture
- Scheduler: Calculates intervals based on health/staleness.
- Priority Queue: Min-heap keyed by NextRun.
- Circuit Breaker: Prevents hot loops on failing instances using success/failure counters.
- Backoff: Exponential retry delays (5s min to 5m max).
- Worker Pool: One worker per configured instance (PVE/PBS/PMG), capped at 10.
- Global Concurrency Cap: At most 2 polling cycles run at once to avoid resource spikes.
🔬 Implementation Details (Developer Info)
Staleness Scoring
The AdaptiveScheduler (internal/monitoring/scheduler.go) relies on the StalenessTracker to compute a StalenessScore (0.0 to 1.0) based on how long it has been since the last successful poll.
- 0.0 = fresh (recent success)
- 1.0 = very stale or never succeeded
The staleness score is normalized against AdaptivePollingMaxInterval (default 5 minutes).
The scheduler applies Exponential Smoothing (alpha 0.6) and a small jitter (5%) to avoid oscillation.
Additional influences:
- Error penalty: retries tighten the interval based on the error count.
- Queue stretch: large queues gently stretch intervals to avoid overload.
Circuit Breaker Recovery
The circuitBreaker (internal/monitoring/circuit_breaker.go) follows a standard state machine:
- Closing the Circuit: One successful poll moves Half-Open → Closed and resets failure count.
- Backoff Calculation: Retries use exponential backoff starting at 5s (multiplier 2, jitter 0.2) capped at 5m.
- Transient vs. Permanent:
- Transient errors (retryable) are retried up to 5 times before moving to the Dead Letter Queue.
- Permanent errors move directly to the Dead Letter Queue.
Note: When AdaptivePollingMaxInterval is set to 15 seconds or less, the retry backoff is shortened (750ms initial, 4s max) to keep fast feedback loops during tight polling windows.
⚙️ Configuration
Adaptive polling is disabled by default.
UI
There is currently no dedicated UI for adaptive polling in v5.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| ADAPTIVE_POLLING_ENABLED | false | Enable/disable. |
| ADAPTIVE_POLLING_BASE_INTERVAL | 10s | Healthy poll rate. |
| ADAPTIVE_POLLING_MIN_INTERVAL | 5s | Active/busy rate. |
| ADAPTIVE_POLLING_MAX_INTERVAL | 5m | Idle/backoff rate. |
system.json
You can also set adaptivePollingEnabled (and related interval fields) in system.json and restart Pulse.
📊 Metrics
Exposed at :9091/metrics.
| Metric | Type | Description |
|---|---|---|
| pulse_monitor_poll_total | Counter | Total poll attempts. |
| pulse_monitor_poll_duration_seconds | Histogram | Poll latency. |
| pulse_monitor_poll_staleness_seconds | Gauge | Age since last success. |
| pulse_monitor_poll_queue_depth | Gauge | Queue size. |
| pulse_monitor_poll_errors_total | Counter | Error counts by category. |
| pulse_scheduler_queue_due_soon | Gauge | Tasks due in the next 12 seconds. |
⚡ Circuit Breaker
| State | Trigger | Recovery |
|---|---|---|
| Closed | Normal operation. | — |
| Open | ≥3 failures. | Backoff (max 5m). |
| Half-open | Retry window elapsed. | Success = Closed; Fail = Open. |
Dead Letter Queue: After 5 transient failures or 1 permanent failure, a task moves to the DLQ (30m retry).
🩺 Health API
GET /api/monitoring/scheduler/health (Auth required)
Returns:
- Queue depth & breakdown.
- Dead-letter tasks.
- Circuit breaker states.
- Per-instance staleness.
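A hedged sketch of calling the endpoint from Go. Several things here are assumptions: the X-API-Token header name, the base URL and port, and the idea that the response is a JSON object (it is decoded into a generic map because the response schema is not described here). Adjust these to match your deployment's authentication setup.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"time"
)

const healthPath = "/api/monitoring/scheduler/health"

// newHealthRequest builds the authenticated GET request.
// The X-API-Token header name is an assumption, not confirmed by this page.
func newHealthRequest(baseURL, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, strings.TrimRight(baseURL, "/")+healthPath, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-API-Token", token)
	return req, nil
}

// fetchHealth sends the request and decodes the body into a generic map.
func fetchHealth(client *http.Client, baseURL, token string) (map[string]any, error) {
	req, err := newHealthRequest(baseURL, token)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var body map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return body, nil
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// Placeholder URL and token; replace with your own values.
	health, err := fetchHealth(client, "http://localhost:7655", "example-token")
	if err != nil {
		fmt.Println("health check failed:", err)
		return
	}
	for key := range health {
		fmt.Println("field:", key)
	}
}
```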