This change addresses intermittent "Guest details unavailable" and "Disk stats
unavailable" errors affecting users with large VM deployments (50+ VMs) or
high-load Proxmox environments.

Changes:
- Increased default guest agent timeouts (3-5s → 10-15s) to better handle
environments under load
- Added automatic retry logic (1 retry by default) for transient timeout failures
- Made all timeouts and the retry count configurable via environment variables:
  * GUEST_AGENT_FSINFO_TIMEOUT (default: 15s)
  * GUEST_AGENT_NETWORK_TIMEOUT (default: 10s)
  * GUEST_AGENT_OSINFO_TIMEOUT (default: 10s)
  * GUEST_AGENT_VERSION_TIMEOUT (default: 10s)
  * GUEST_AGENT_RETRIES (default: 1)
- Added comprehensive documentation in VM_DISK_MONITORING.md with configuration
examples for different deployment scenarios

These improvements allow Pulse to gracefully handle intermittent API timeouts
without immediately displaying errors, while remaining configurable for
different network conditions and environment sizes.

Fixes: https://github.com/rcourtman/Pulse/discussions/592