The HTTP mode installer now includes 127.0.0.1/32 in allowed_source_subnets
to permit self-monitoring queries from localhost. This fixes 403 Forbidden
errors when nodes query their own sensor-proxy instance.
Related to HTTP mode implementation for external PVE hosts.
## HTTP Server Fixes
- Add source IP middleware to enforce allowed_source_subnets
- Fix missing source subnet validation for external HTTP requests
- HTTP health endpoint now respects subnet restrictions
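For illustration, a minimal sketch of how such a source-IP middleware can look in Go's net/http with net/netip. The handler names, port, and config wiring here are hypothetical, not the actual sensor-proxy code; the /health route sits behind the same check, and 127.0.0.1/32 covers the self-monitoring case described above.

```go
package main

import (
	"log"
	"net"
	"net/http"
	"net/netip"
)

// allowSourceSubnets rejects requests whose peer address is not inside any
// of the configured prefixes, e.g. 127.0.0.1/32 for self-monitoring plus
// the Pulse server's subnet.
func allowSourceSubnets(next http.Handler, subnets []netip.Prefix) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		host, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		addr, err := netip.ParseAddr(host)
		if err != nil {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		addr = addr.Unmap() // normalize IPv4-mapped IPv6 peers
		for _, p := range subnets {
			if p.Contains(addr) {
				next.ServeHTTP(w, r)
				return
			}
		}
		http.Error(w, "source address not in allowed_source_subnets", http.StatusForbidden)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok")) // health endpoint sits behind the same subnet check
	})
	subnets := []netip.Prefix{
		netip.MustParsePrefix("127.0.0.1/32"), // self-monitoring from localhost
	}
	log.Fatal(http.ListenAndServe("0.0.0.0:8080", allowSourceSubnets(mux, subnets)))
}
```

A simple linear scan over the prefixes is enough here, since an installer only configures a handful of subnets.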
## Installer Improvements
- Auto-configure allowed_source_subnets with Pulse server IP
- Add cluster node hostnames to allowed_nodes (not just IPs)
- Fix node validation to accept both hostnames and IPs
- Add Pulse server reachability check before installation
- Add port availability check for HTTP mode
- Add automatic rollback on service startup failure
- Add HTTP endpoint health check after installation
- Fix config backup and deduplication (prevent duplicate keys)
- Fix IPv4 validation with loopback rejection
- Improve registration retry logic with detailed errors
- Add automatic LXC bind mount cleanup on uninstall
## Temperature Collection Fixes
- Add local temperature collection for self-monitoring nodes
- Fix node identifier matching (use hostname not SSH host)
- Fix JSON double-encoding in HTTP client response
Related to #XXX (temperature monitoring fixes)
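For illustration, a minimal sketch of what tolerating a double-encoded response can look like, assuming the bug was the proxy returning the sensor payload as a JSON string that itself contained JSON. The type and function names are hypothetical, not the actual client code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sensorPayload is an illustrative shape; the real model differs.
type sensorPayload struct {
	Sensors map[string]any `json:"sensors"`
}

// decodeSensorResponse accepts both a plain JSON object and a JSON string
// that itself contains JSON (the double-encoded case).
func decodeSensorResponse(body []byte) (*sensorPayload, error) {
	var p sensorPayload
	if err := json.Unmarshal(body, &p); err == nil && p.Sensors != nil {
		return &p, nil // already a plain object
	}
	var inner string
	if err := json.Unmarshal(body, &inner); err != nil {
		return nil, fmt.Errorf("decode sensor response: %w", err)
	}
	if err := json.Unmarshal([]byte(inner), &p); err != nil {
		return nil, fmt.Errorf("decode inner sensor JSON: %w", err)
	}
	return &p, nil
}

func main() {
	// Double-encoded body: a JSON string whose value is the real JSON object.
	body := []byte(`"{\"sensors\":{\"coretemp-isa-0000\":{}}}"`)
	p, err := decodeSensorResponse(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("sensors:", p.Sensors)
}
```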
When cluster IPC validation fails (due to systemd hardening), the proxy
falls back to allowlist-based validation. The installer now automatically
populates allowed_nodes with:
- Cluster mode: All discovered cluster member IPs
- Standalone mode: localhost IP addresses (including 127.0.0.1/localhost)
- Fallback mode: localhost IPs when pvecm unavailable
This ensures out-of-the-box temperature monitoring works on fresh installs
without manual configuration.
Root cause: pulse-sensor-proxy runs with strict systemd hardening that prevents
access to Proxmox corosync IPC (abstract UNIX sockets). When pvecm fails with
IPC errors, the code incorrectly treated it as "standalone mode" and only
discovered localhost addresses, rejecting legitimate cluster members and external
nodes.
Changes:
1. **Distinguish IPC failures from true standalone mode**
- Detect ipcc_send_rec and access control list errors specifically
- These indicate a cluster exists but isn't accessible (LXC, systemd restrictions)
- Return an error that disables cluster validation instead of misusing the standalone-mode logic (see the sketch below)
2. **Graceful degradation when cluster validation fails**
- When cluster IPC is unavailable, fall through to permissive mode
- Log debug message suggesting allowed_nodes configuration
- Allows requests to proceed rather than blocking all temperature monitoring
3. **Improve local address discovery for true standalone nodes**
- Use Go's native net.Interfaces() instead of shelling out to 'ip addr'
- More reliable and works with AF_NETLINK restrictions
- Add helpful logging when only hostnames are discovered
4. **Systemd hardening adjustments**
- Add AF_NETLINK to RestrictAddressFamilies (for net.Interfaces())
- Remove RemoveIPC=true (attempted fix for corosync, insufficient)
- Add ReadWritePaths=-/run/corosync (optional path, corosync uses abstract sockets anyway)
Result: Temperature monitoring now works in all three scenarios:
- Clustered Proxmox hosts (falls back to permissive mode when IPC is blocked)
- LXC containers (correctly detects the IPC failure and allows requests)
- Standalone nodes (proper local address discovery with IPs)
Workaround for maximum security: Configure allowed_nodes in /etc/pulse-sensor-proxy/config.yaml
when cluster validation cannot be used.
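A minimal sketch of the IPC-failure classification from change 1 above. The error-string matching mirrors the messages quoted throughout these notes (ipcc_send_rec, "Unable to load access control list", "Unknown error -1"), but the function and error names are illustrative rather than the actual proxy code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errClusterIPCUnavailable means a cluster likely exists but its corosync
// IPC is not reachable (LXC, systemd hardening); callers should disable
// cluster validation and fall through, not switch to standalone logic.
var errClusterIPCUnavailable = errors.New("cluster IPC unavailable")

// classifyPvecmFailure inspects pvecm output after a non-zero exit and
// distinguishes blocked IPC from a genuinely standalone node.
func classifyPvecmFailure(output string) error {
	lower := strings.ToLower(output)
	switch {
	case strings.Contains(lower, "ipcc_send_rec"),
		strings.Contains(lower, "unable to load access control list"),
		strings.Contains(lower, "unknown error -1"):
		return errClusterIPCUnavailable
	default:
		return nil // no cluster indicators: treat as true standalone mode
	}
}

func main() {
	err := classifyPvecmFailure("ipcc_send_rec[1] failed: Connection refused")
	fmt.Println(errors.Is(err, errClusterIPCUnavailable)) // true
}
```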
Root cause: The systemd service hardening blocked AF_NETLINK sockets,
preventing IP address discovery on standalone nodes. The proxy could
only discover hostnames, causing node_not_cluster_member rejections
when users configured Pulse with IP addresses.
Changes:
1. Add AF_NETLINK to RestrictAddressFamilies in all systemd services
- pulse-sensor-proxy.service
- install-sensor-proxy.sh (both modes)
- pulse-sensor-cleanup.service
2. Replace shell-based 'ip addr' with Go's native net.Interfaces() API (sketched below)
- More reliable and doesn't require external commands
- Works even with strict systemd restrictions
- Properly filters loopback, link-local, and down interfaces
3. Improve error logging and user guidance
- Warn when no IP addresses can be discovered
- Provide clear instructions about allowed_nodes workaround
- Include address counts in logs for debugging
This fix ensures standalone Proxmox nodes can properly validate
temperature requests by IP address without requiring manual
allowed_nodes configuration.
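A sketch of the native interface discovery described in change 2, assuming the goal is a list of usable unicast IPs with loopback, link-local, and down interfaces filtered out. The function name is illustrative.

```go
package main

import (
	"fmt"
	"net"
)

// discoverLocalAddrs returns non-loopback, non-link-local unicast IPs from
// interfaces that are up, using net.Interfaces() instead of shelling out to
// `ip addr`. On Linux this relies on netlink, which is why AF_NETLINK must
// be permitted in RestrictAddressFamilies.
func discoverLocalAddrs() ([]string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	var out []string
	for _, iface := range ifaces {
		if iface.Flags&net.FlagUp == 0 || iface.Flags&net.FlagLoopback != 0 {
			continue // skip down and loopback interfaces
		}
		addrs, err := iface.Addrs()
		if err != nil {
			continue
		}
		for _, a := range addrs {
			ipnet, ok := a.(*net.IPNet)
			if !ok {
				continue
			}
			ip := ipnet.IP
			if ip.IsLoopback() || ip.IsLinkLocalUnicast() {
				continue
			}
			out = append(out, ip.String())
		}
	}
	return out, nil
}

func main() {
	addrs, err := discoverLocalAddrs()
	if err != nil {
		fmt.Println("address discovery failed:", err)
		return
	}
	fmt.Println("discovered local addresses:", addrs)
}
```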
- Replace unreliable git fetch --dry-run check
- Use git rev-parse to compare local and remote commits
- Prevents false warnings about diverged branches
- Check VERSION file matches before triggering workflow
- Validate working directory is clean
- Confirm on main branch and up to date
- Load release notes from /tmp/release_notes_X.Y.Z.md
- Prevents wasting CI time on misconfigured releases
Users were abandoning Pulse due to catastrophic temperature monitoring setup failures. This commit addresses the root causes:
**Problem 1: Silent Failures**
- Installations reported "SUCCESS" even when proxy never started
- UI showed green checkmarks with no temperature data
- Zero feedback when things went wrong
**Problem 2: Missing Diagnostics**
- Service failures logged only in journald
- Users saw "Something going on with the proxy" with no actionable guidance
- No way to troubleshoot from error messages
**Problem 3: Standalone Node Issues**
- Proxy daemon logged continuous pvecm errors as warnings
- "ipcc_send_rec" and "Unknown error -1" messages confused users
- These are expected for non-clustered/LXC setups
**Solutions Implemented:**
1. **Health Gate in install.sh (lines 1588-1629)**
- Verify service is running after installation
- Check socket exists on host
- Confirm socket visible inside container via bind mount
- Fail loudly with specific diagnostics if any check fails
2. **Actionable Error Messages in install-sensor-proxy.sh (lines 822-877)**
- When service fails to start: dump full systemctl status + 40 lines of logs
- When socket missing: show permissions, service status, and remediation command
- Include common issues checklist (missing user, permission errors, lm-sensors, etc.)
- Direct link to troubleshooting docs
3. **Better Standalone Node Detection in ssh.go (lines 585-595)**
- Recognize "Unknown error -1" and "Unable to load access control list" as LXC indicators
- Log at INFO level (not WARN) since this is expected behavior
- Clarify message: "using localhost for temperature collection"
**Impact:**
- Eliminates "green checkmark but no temps" scenario
- Users get immediate actionable feedback on failures
- Standalone/LXC installations work silently without error spam
- Reduces support burden from #571 (15+ comments of user frustration)
Related to #571
Snap-installed Docker does not automatically create a docker group,
causing permission denied errors when the pulse-docker service user
tries to access /var/run/docker.sock.
Changes:
- Auto-detect Snap Docker installations
- Create docker group if missing when Snap Docker is detected
- Restart Snap Docker after group creation to refresh socket ACLs
- Add socket access validation before starting the service
- Handle symlinked Docker sockets in systemd unit ReadWritePaths
- Document troubleshooting steps in DOCKER_MONITORING.md
Bare metal installations couldn't serve Windows host agent downloads because
the Windows and macOS binaries weren't included in the universal tarball. The
download endpoint would return 404 when Windows users tried to install the
host agent from a bare metal Pulse deployment (Proxmox LXC, Debian VM, etc.).
Changes:
- build-release.sh: Copy Windows/macOS host agent binaries into universal tarball
- build-release.sh: Create symlinks for Windows binaries without .exe extension
- validate-release.sh: Add Windows 386 binary and symlink to Docker validation
- validate-release.sh: Add explicit validation that universal tarball contains all Windows/macOS binaries
The universal tarball now matches the Docker image, ensuring both deployment
methods can serve the complete set of downloadable binaries for the /download/
endpoint.
Replace mv with the install command to ensure the correct SELinux context.
The mv command preserves the user_tmp_t label from /tmp, which
prevents systemd from executing the binary on SELinux systems.
The install command creates a new file with the correct label for
/usr/local/bin. Also add an automatic restorecon call on SELinux
systems to ensure policy compliance.
Related to #688
Following best practices for release format transitions:
- build-release.sh now generates both formats from same sha256sum run
- Workflow uploads both checksums.txt and individual .sha256 files
- Validation ensures both formats exist and match
This provides a safe transition period for users with older install scripts
while maintaining the cleaner checksums.txt format going forward. After 2-3
releases when most users have updated scripts, we can remove .sha256 generation.
Related: Install script already supports both formats (falls back gracefully).
Linux host-agent binaries don't have separate archives - they're included in
the main pulse-v*.tar.gz files. Only macOS and Windows have separate archives.
Removed validation checks for standalone binaries that are no longer
uploaded to GitHub releases. These binaries are only needed in Docker
images for the /download/ endpoint.
Updated required assets list to include all versioned tarballs/zips
instead of standalone binaries.
Removed:
- Individual .sha256 files (checksums.txt already contains all checksums)
- Standalone binaries without version numbers (users should download versioned tarballs/zips)
Standalone binaries are only needed in Docker images for the /download/ endpoint.
GitHub releases should only contain versioned archives for user downloads.
This reduces release assets from ~54 files to ~19 files per release.
Users don't care about CI/CD improvements, release workflows, build
processes, or testing infrastructure. Only include user-visible changes.
Related to #671
Commit hashes clutter the release notes and aren't useful for end users.
Only include issue references when explicitly mentioned in commits.
Related to #671
Remove # symbol from commit hash references so GitHub auto-links them.
Format: (abc123) instead of (#abc123)
Issue references still use #: (#123)
Related to #671
- Use exact template format from v4.28.0 and prior releases
- Include all standard sections: New Features, Bug Fixes, Improvements, Breaking Changes
- Add complete installation instructions (systemd, Docker, Manual Binary, Helm)
- Include Downloads section with all artifact types
- Add Notes section for important highlights and upgrade considerations
- Ensure LLM outputs format exactly matching previous releases
Related to #671 (automated release workflow)
- Create scripts/generate-release-notes.sh to auto-generate release notes from git commits
- Supports both Anthropic Claude and OpenAI APIs
- Uses Claude Haiku 4.5 (claude-haiku-4-5-20251001) for cost efficiency ($1/$5 per million tokens)
- Falls back to OpenAI gpt-4o-mini if Anthropic key not available
- Integrates into release workflow between validation and release creation
- Compares current version with previous git tag to generate changelog
- Outputs categorized, user-friendly release notes with installation instructions
Workflow now automatically:
1. Finds previous release tag
2. Analyzes all commits since last release
3. Generates structured release notes via LLM
4. Uses generated notes for draft release body
Requires ANTHROPIC_API_KEY or OPENAI_API_KEY in GitHub secrets.
Related to #671 (automated release workflow)
The script does pushd into RELEASE_DIR, so tarball paths should not include
the RELEASE_DIR prefix. Also fixed checksum validation glob patterns to
exclude .sha256 files from matching.
Tarballs are created with ./bin/pulse paths (relative from inside staging dir)
but validation was looking for bin/pulse paths. Updated all tar -tzf checks
to use correct ./ prefix.
The validation script was looking for tarballs in the current directory
instead of the release/ directory, causing all validations to fail.
Now properly prepends $RELEASE_DIR to all file paths.
This commit introduces a comprehensive GitHub Actions workflow for
creating releases, ensuring all artifacts are validated before upload.
Changes:
- Add .github/workflows/release.yml: Manual workflow_dispatch trigger
that builds, validates, and creates draft releases
- Update scripts/validate-release.sh: Add --skip-docker flag to allow
validation without Docker image checks
Key features:
- Validation runs BEFORE any assets are uploaded
- If validation fails, no release is created
- checksums.txt and artifacts come from the same build
- No manual steps between validation and upload
- Checksums uploaded first, then all other assets
- Creates draft release for manual review before publishing
The workflow ensures that checksums.txt cannot drift from binaries
by running the entire build-validate-upload pipeline atomically.
This fixes two bugs that prevented temperature monitoring from working
after running install-sensor-proxy.sh on LXC deployments, plus a minor
documentation inconsistency:
1. CRITICAL: Pulse service not restarted after systemd override
- The installer wrote PULSE_SENSOR_PROXY_SOCKET env var to systemd
drop-in and ran daemon-reload, but never restarted Pulse service
- Running Pulse instances continued using old environment variables
- Temperatures wouldn't work until manual Pulse restart
- Now: Automatically restart Pulse if running after writing override
2. Added guard to check if Pulse service exists before configuring
- Installer would write systemd override even if Pulse not installed
- Left orphaned drop-in files that confused users
- Now: Check if pulse.service exists, warn and skip if not found
3. MINOR: Fix inconsistent Docker mount instructions
- docker-compose.yml showed :ro (read-only) mount
- Installer output showed :rw (read-write) mount
- Changed installer to match compose file (:ro is correct and secure)
Impact: Users in #600 reported "socketFound=false" even after running
installer successfully. This was because Pulse never picked up the new
socket path without a restart.
Adds build support for 32-bit Windows (windows-386) for pulse-host-agent.
Changes:
- Add windows-386 build to Dockerfile host-agent build section
- Add windows-386 binary copy and symlink to Dockerfile
- Add windows-386 build to build-release.sh
- Add windows-386 zip package to release artifacts
- Include windows-386 binary in standalone binary copies
This enables pulse-host-agent to run on 32-bit Windows systems, which are still in use in legacy and industrial monitoring environments as of late 2025.
Adds build support for 32-bit x86 (i386/i686) and ARMv6 (older Raspberry Pi models) architectures across all agents and install scripts.
Changes:
- Add linux-386 and linux-armv6 to build-release.sh builds array
- Update Dockerfile to build docker-agent, host-agent, and sensor-proxy for new architectures
- Update all install scripts to detect and handle i386/i686 and armv6l architectures
- Add architecture normalization in router download endpoints
- Update update manager architecture mapping
- Update validate-release.sh to expect 24 binaries (was 18)
This enables Pulse agents to run on older/legacy hardware including 32-bit x86 systems and Raspberry Pi Zero/Zero W devices.
Fixes two critical bugs in refresh_smart_cache() that prevented SMART
temperature collection from working:
1. Invalid smartctl parameter: Changed -n standby,after to -n standby
The 'after' parameter is not valid in smartctl 7.4 and causes:
"INVALID ARGUMENT TO -n: standby,after"
Valid syntax is standby[,STATUS[,STATUS2]] where STATUS must be numeric.
2. Broken process detection: Replaced exec -a with lock file approach
The original exec -a pulse-sensor-wrapper-refresh bash line replaced
the subshell with a new bash process that had no script to run, causing
the function to exit immediately without collecting any SMART data.
New approach uses a lock file ($CACHE_DIR/smart-refresh.lock) with
trap-based cleanup to prevent concurrent refresh operations.
Credits to @ZaDarkSide for identifying these issues in PR #672.
The download endpoint had a dangerous fallback that silently served the
wrong binary when the requested platform/arch combination was missing.
If a Docker image shipped without Windows binaries, the installer would
receive a Linux ELF instead of a Windows PE, causing ERROR_BAD_EXE_FORMAT.
Changes:
- Download handler now operates in strict mode when platform+arch are
specified, returning 404 instead of serving mismatched binaries (see the sketch below)
- PowerShell installer validates PE header (MZ signature)
- PowerShell installer verifies PE machine type matches requested arch
- PowerShell installer fetches and verifies SHA256 checksums
- PowerShell installer shows diagnostic info: OS arch, download URL,
file size for better troubleshooting
This prevents silent failures and provides clear error messages when
binaries are missing or corrupted.
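A minimal sketch of the strict-mode behavior, assuming a query-parameter based handler. The URL shape, parameter names, and binary directory are hypothetical; only the no-fallback 404 logic reflects the change described above.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"path/filepath"
)

const binDir = "/opt/pulse/bin" // illustrative location for bundled agent binaries

// handleDownload serves a host-agent binary for an explicit platform/arch
// pair. In strict mode there is no fallback: if the exact file is absent,
// the client gets 404 instead of a mismatched binary (e.g. a Linux ELF
// handed to a Windows installer).
func handleDownload(w http.ResponseWriter, r *http.Request) {
	platform := r.URL.Query().Get("platform") // e.g. "windows"
	arch := r.URL.Query().Get("arch")         // e.g. "amd64"
	if platform == "" || arch == "" {
		http.Error(w, "platform and arch are required", http.StatusBadRequest)
		return
	}
	name := fmt.Sprintf("pulse-host-agent-%s-%s", platform, arch)
	if platform == "windows" {
		name += ".exe"
	}
	path := filepath.Join(binDir, name)
	if _, err := os.Stat(path); err != nil {
		// Strict mode: never substitute another platform's binary.
		http.Error(w, "binary not available for "+platform+"/"+arch, http.StatusNotFound)
		return
	}
	http.ServeFile(w, r, path)
}

func main() {
	http.HandleFunc("/download/host-agent", handleDownload)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```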
The self-heal timer runs 'systemctl list-unit-files | grep -q' every hour.
When grep matches and exits early, systemctl logs "Failed to print table:
Broken pipe" to syslog. This is cosmetic but floods Proxmox logs and
confuses operators.
Changes:
- Redirect stderr from systemctl to /dev/null
- Prevents the broken pipe message from reaching syslog
- Self-heal functionality unchanged
This addresses the concern raised in discussion #628.
Windows 11 25H2 also ships on ARM64 hardware. When users on ARM64
attempt to install the host agent, the Service Control Manager fails to
load the amd64 binary with ERROR_BAD_EXE_FORMAT, surfaced as "The Pulse
Host Agent is not compatible with this Windows version".
Changes:
- Dockerfile: Build pulse-host-agent-windows-arm64.exe alongside amd64
- Dockerfile: Copy windows-arm64 binary and create symlink for download endpoint
- install-host-agent.ps1: Use RuntimeInformation.OSArchitecture to detect ARM64
- build-release.sh: Build darwin-amd64, darwin-arm64, windows-amd64, windows-arm64
- build-release.sh: Package Windows binaries as .zip archives
- validate-release.sh: Check for windows-arm64 binary and symlink
- validate-release.sh: Add architecture validation for all darwin/windows variants
The installer now correctly detects ARM64 and downloads the appropriate binary.
Extends temperature monitoring to collect SMART temps for SATA/SAS disks,
addressing issue #652 where physical disk temperatures showed as empty.
Architecture:
- Deploys pulse-sensor-wrapper.sh as SSH forced command on Proxmox nodes
- Wrapper collects both CPU/GPU temps (sensors -j) and disk temps (smartctl)
- Implements 30-min cache with background refresh to avoid performance impact
- Uses smartctl -n standby,after to skip sleeping drives without waking them
- Returns unified JSON: {sensors: {...}, smart: [...]}
Backend changes:
- Add DiskTemp model with device, serial, WWN, temperature, lastUpdated
- Extend Temperature model with SMART []DiskTemp field and HasSMART flag
- Add WWN field to PhysicalDisk for reliable disk matching
- Update parseSensorsJSON to handle both legacy and new wrapper formats
- Rewrite mergeNVMeTempsIntoDisks to match SMART temps by WWN → serial → devpath (sketched below)
- Preserve legacy NVMe temperature support for backward compatibility
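For illustration, a sketch of parsing the unified wrapper payload with a legacy fallback, assuming JSON field names that mirror the DiskTemp model above (the real tags and helper names may differ).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiskTemp roughly mirrors the model above; the JSON tags are assumptions.
type DiskTemp struct {
	Device      string  `json:"device"`
	Serial      string  `json:"serial"`
	WWN         string  `json:"wwn"`
	Temperature float64 `json:"temperature"`
}

// wrapperPayload is the unified wrapper format: {"sensors": {...}, "smart": [...]}.
type wrapperPayload struct {
	Sensors json.RawMessage `json:"sensors"`
	Smart   []DiskTemp      `json:"smart"`
}

// parseSensorsOutput accepts the new wrapper format or legacy raw
// `sensors -j` output from nodes still using the old forced command.
func parseSensorsOutput(raw []byte) (sensors json.RawMessage, smart []DiskTemp, err error) {
	var p wrapperPayload
	if json.Unmarshal(raw, &p) == nil && p.Sensors != nil {
		return p.Sensors, p.Smart, nil // new format: sensors + smart
	}
	// Legacy path: the whole document is the `sensors -j` object itself.
	if !json.Valid(raw) {
		return nil, nil, fmt.Errorf("unrecognized sensor output")
	}
	return raw, nil, nil
}

func main() {
	out, smart, err := parseSensorsOutput([]byte(`{"sensors":{"coretemp-isa-0000":{}},"smart":[]}`))
	fmt.Println(string(out), len(smart), err)
}
```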
Performance considerations:
- SMART data cached for 30 minutes per node to avoid excessive smartctl calls
- Background refresh prevents blocking temperature requests
- Respects drive standby state to avoid spinning up idle arrays
- Staggered disk scanning with 0.1s delay to avoid saturating SATA controllers
Install script:
- Deploys wrapper to /usr/local/bin/pulse-sensor-wrapper.sh
- Updates SSH forced command from "sensors -j" to wrapper script
- Backward compatible - falls back to direct sensors output if wrapper missing
Testing note:
- Requires real hardware with smartmontools installed for full functionality
- Empty smart array returned gracefully when smartctl unavailable
- Legacy sensor-only nodes continue working without changes
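And a sketch of the WWN → serial → devpath matching priority used when merging SMART temperatures into disks. The structs here are trimmed, hypothetical stand-ins for the real DiskTemp/PhysicalDisk models.

```go
package main

import "fmt"

// DiskTemp and PhysicalDisk are illustrative stand-ins for the real models.
type DiskTemp struct {
	Device      string
	Serial      string
	WWN         string
	Temperature float64
}

type PhysicalDisk struct {
	DevPath string
	Serial  string
	WWN     string
	Temp    *float64
}

// mergeSMARTTemps assigns SMART temperatures to disks by WWN first, then
// serial, then device path, so renumbered /dev/sdX names don't mis-assign
// readings.
func mergeSMARTTemps(disks []PhysicalDisk, temps []DiskTemp) {
	for i := range disks {
		if t, ok := findTemp(disks[i], temps); ok {
			v := t.Temperature
			disks[i].Temp = &v
		}
	}
}

func findTemp(d PhysicalDisk, temps []DiskTemp) (DiskTemp, bool) {
	for _, t := range temps { // 1. WWN: stable across reboots and path changes
		if d.WWN != "" && d.WWN == t.WWN {
			return t, true
		}
	}
	for _, t := range temps { // 2. serial number
		if d.Serial != "" && d.Serial == t.Serial {
			return t, true
		}
	}
	for _, t := range temps { // 3. device path as a last resort
		if d.DevPath != "" && d.DevPath == t.Device {
			return t, true
		}
	}
	return DiskTemp{}, false
}

func main() {
	disks := []PhysicalDisk{{DevPath: "/dev/sda", Serial: "ABC123", WWN: "0x5000c500aaaa0001"}}
	temps := []DiskTemp{{Device: "/dev/sda", Serial: "ABC123", WWN: "0x5000c500aaaa0001", Temperature: 34}}
	mergeSMARTTemps(disks, temps)
	fmt.Printf("%s: %.0f°C\n", disks[0].DevPath, *disks[0].Temp)
}
```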
The checksum generation was including pulse-host-agent-v*-darwin-arm64.tar.gz
twice: once from the *.tar.gz pattern and once from the pulse-host-agent-*
pattern. Fixed by using extglob to exclude .tar.gz and .sha256 files from
the agent binary patterns since tarballs are already matched separately.