Commit Graph

24 Commits

Author SHA1 Message Date
rcourtman
dd9bd65a2e fix: Add hasCPU/hasNVMe flags to prevent false 'no CPU sensor' errors
Addresses #101

v4.23.0 introduced a regression where systems with only NVMe temperatures
(no CPU sensor) would display "No CPU sensor" in the UI. This was caused
by the Available flag being set to true when NVMe temps existed, even
without CPU data, triggering the error message in the frontend.

Backend changes:
- Add HasCPU and HasNVMe boolean fields to Temperature model
- Extend CPU sensor detection to support more chip types: zenpower,
  k8temp, acpitz, it87 (case-insensitive matching)
- HasCPU is set based on CPU chip detection (coretemp, k10temp, etc.),
  not value thresholds
- This prevents false negatives when sensors report 0°C during resets
- CPU temperature values now accepted even when 0 (checked with !IsNaN
  instead of > 0)
- extractTempInput returns NaN instead of 0 when no data found
- Available flag means "any temperature data exists" for backward compatibility
- Update mock generator to properly set the new flags
- Add unit tests for NVMe-only and 0°C scenarios to prevent regression
- Removed amd_energy from CPU chip list (power sensor, not temperature)

Frontend changes:
- Add hasCPU and hasNVMe optional fields to Temperature interface
- Update NodeSummaryTable to check hasCPU flag with fallback to available
  for backward compatibility with older API responses
- Update NodeCard temperature display logic with same fallback pattern
- Systems with only NVMe temps now show "-" instead of error message
- Fallback ensures UI works with both old and new API responses

Testing:
- All unit tests pass including NVMe-only and 0°C test cases
- Fix prevents false "no CPU sensor" errors when sensors temporarily report 0°C
- Fix eliminates false "no CPU sensor" errors for NVMe-only systems
2025-10-13 10:17:17 +00:00
rcourtman
97066d8351 docs: Update socket paths and add monitoring section to TEMPERATURE_MONITORING.md
Updated documentation to reflect new directory-level bind mount architecture:
- Changed socket path from /var/run/pulse-temp-proxy.sock to /run/pulse-temp-proxy/pulse-temp-proxy.sock
- Updated LXC bind mount syntax to directory-level (create=dir instead of create=file)
- Added "Monitoring the Proxy" section with manual monitoring commands
- Documents systemd restart-on-failure reliance for v1
- Notes future pulse-watchdog integration planned

Related to #528
2025-10-12 22:42:38 +00:00
rcourtman
2244ff0314 fix: Change RuntimeDirectoryMode to 0775 for container access
The pulse user in the container (UID 1001) needs to access the
/run/pulse-temp-proxy directory owned by root:root. Changed from
0770 (owner+group only) to 0775 (add world read+execute) so the
pulse user can access the socket.

Related to #528
2025-10-12 22:39:18 +00:00
rcourtman
c7bb76c12e fix: Switch proxy socket to directory-level bind mount for stability
Fixes LXC bind mount issue where socket-level mounts break when the
socket is recreated by systemd. Following Codex's recommendation to
bind mount the directory instead of the file.

Changes:
- Socket path: /run/pulse-temp-proxy/pulse-temp-proxy.sock
- Systemd: RuntimeDirectory=pulse-temp-proxy (auto-creates /run/pulse-temp-proxy)
- Systemd: RuntimeDirectoryMode=0770 for group access
- LXC mount: Bind entire /run/pulse-temp-proxy directory
- Install script: Upgrades old socket-level mounts to directory-level
- Install script: Detects and handles bind mount changes

This survives socket recreations and container restarts. The directory
mount persists even when systemd unlinks/recreates the socket file.

Related to #528
2025-10-12 22:33:53 +00:00
rcourtman
58971aa982 feat: Add --local-binary flag to install-temp-proxy.sh for pre-release testing
Allows testing the proxy installation with a locally-built binary instead
of requiring a GitHub release. This enables proper E2E testing before
shipping a release.

Usage:
  ./install-temp-proxy.sh --ctid 112 --local-binary /path/to/pulse-temp-proxy

The script will:
- Use the provided binary instead of downloading from GitHub
- Still handle all setup (systemd service, SSH keys, bind mounts)
- Allow full integration testing without a release

This solves the chicken-and-egg problem of needing a release to test
the installation process.

Related to #528
2025-10-12 22:25:49 +00:00
rcourtman
dd468a9e26 docs: Update temperature monitoring security notice for proxy architecture
Replaced outdated security warnings with accurate information about
the pulse-temp-proxy architecture:

- Removed scary 'legacy feature' and 'compromised container' warnings
- Explains secure proxy architecture for containerized deployments
- Notes that SSH keys are stored on Proxmox host (not in container)
- Clarifies container compromise does not expose credentials
- Includes information for both containerized and native installs
- More factual and less alarmist tone

The old message implied temperature monitoring was insecure for
containers, which is no longer true with pulse-temp-proxy.

Related to #528
2025-10-12 22:11:12 +00:00
rcourtman
5c1dec14e1 fix: Remove duplicate sshPublicKey argument in PVE setup script
The setup script generator was passing sshPublicKey twice but only
using it once, causing a Go fmt.Sprintf formatting error that leaked
into the generated bash script as '%!(EXTRA string=...)'.

This resulted in bash syntax errors when running the setup script.

Fixes #528
2025-10-12 22:01:16 +00:00
rcourtman
47116bedb5 docs: Add comprehensive Operations & Troubleshooting section
Addresses operational documentation gaps for pulse-temp-proxy:

- Service management (restart, stop, start, enable/disable)
- Log locations and viewing commands
- SSH key rotation procedures (recommended every 90 days)
- Key revocation when nodes leave cluster
- Failure modes (proxy down, socket issues, pvecm absent, off-cluster)
- Known limitations (one per host, cluster membership, cross-cluster)
- Common issues with troubleshooting steps
- Diagnostic info collection for bug reports

This provides operators with everything they need to manage the proxy service
in production environments.
2025-10-12 21:50:55 +00:00
rcourtman
1b65e792e2 build: Add pulse-temp-proxy to release build process
Updates build script and release checklist to include pulse-temp-proxy binaries:
- Build pulse-temp-proxy for all architectures (amd64, arm64, armv7)
- Include in tarballs alongside pulse and pulse-docker-agent
- Copy standalone binaries to release/ for install-temp-proxy.sh
- Update release checklist to upload standalone binaries as assets

This ensures install-temp-proxy.sh can download binaries from GitHub releases.
2025-10-12 21:46:48 +00:00
rcourtman
6d4694f019 security: Add SO_PEERCRED authentication to temperature proxy
Addresses security concern raised in code review:
- Socket permissions changed from 0666 to 0660
- Added SO_PEERCRED verification to authenticate connecting processes
- Only allows root (UID 0) or proxy's own user
- Prevents unauthorized processes from triggering SSH key rollout
- Documented passwordless root SSH requirement for clusters

This prevents any process on the host or in other containers from
accessing the proxy RPC endpoints.
2025-10-12 21:42:22 +00:00
rcourtman
e7bc338891 feat: Implement secure temperature proxy for containerized deployments
Addresses #528

Introduces pulse-temp-proxy architecture to eliminate SSH key exposure in containers:

**Architecture:**
- pulse-temp-proxy runs on Proxmox host (outside LXC/Docker)
- SSH keys stored on host filesystem (/var/lib/pulse-temp-proxy/ssh/)
- Pulse communicates via unix socket (bind-mounted into container)
- Proxy handles cluster discovery, key rollout, and temperature fetching

**Components:**
- cmd/pulse-temp-proxy: Standalone Go binary with unix socket RPC server
- internal/tempproxy: Client library for Pulse backend
- scripts/install-temp-proxy.sh: Idempotent installer for existing deployments
- scripts/pulse-temp-proxy.service: Systemd service for proxy

**Integration:**
- Pulse automatically detects and uses proxy when socket exists
- Falls back to direct SSH for native installations
- Installer automatically configures proxy for new LXC deployments
- Existing LXC users can upgrade by running install-temp-proxy.sh

**Security improvements:**
- Container compromise no longer exposes SSH keys
- SSH keys never enter container filesystem
- Maintains forced command restrictions
- Transparent to users - no workflow changes

**Documentation:**
- Updated TEMPERATURE_MONITORING.md with new architecture
- Added verification steps and upgrade instructions
- Preserved legacy documentation for native installs
2025-10-12 21:35:35 +00:00
rcourtman
c8e3c93516 fix: Add security gates for containerized temperature monitoring
Addresses #528

- Added opt-in confirmation prompt to setup script with security notice
- Added runtime warning when containerized Pulse uses SSH temperature monitoring
- Documented security considerations and hardening recommendations
- Users must explicitly confirm understanding before enabling in containers
2025-10-12 21:01:25 +00:00
rcourtman
bebe5efc3d fix: Setup script now verifies temperature SSH connectivity from Pulse
When Pulse runs in a container (LXC/Docker), the setup script would claim
temperature monitoring was enabled on cluster nodes, but Pulse couldn't
actually SSH to them. The script ran on the Proxmox host which could SSH
fine, but didn't verify connectivity from Pulse itself.

Changes:
- Added /api/system/verify-temperature-ssh endpoint that tests SSH from Pulse
- Setup script now calls this endpoint after configuring cluster nodes
- Detects when Pulse is containerized and provides ProxyJump config instructions
- Shows clear success/failure status for each node

Addresses #528
2025-10-12 20:36:48 +00:00
rcourtman
ce602a989a fix: Build multi-arch docker-agent binaries in Docker image
When Pulse runs in Docker, ARM users couldn't download the docker-agent
because only the host architecture binary was built. The Dockerfile now
builds amd64, arm64, and armv7 binaries and includes them at /opt/pulse/bin/
so the download endpoint can serve all architectures.
2025-10-12 20:01:10 +00:00
rcourtman
ee96d11194 fix: Update modal now properly detects when update completes
The update progress modal was stuck showing 'initializing' even after the
backend restarted and websocket reconnected. Users could see the connection
status badge reconnecting behind the modal, but the modal never cleared.

Now the modal:
- Watches websocket connection status during update
- Detects when backend disconnects and reconnects
- Verifies health after reconnection
- Automatically reloads the page when update is complete
- Shows clearer messaging about restart progress
v4.23.0
2025-10-12 18:25:38 +00:00
rcourtman
a1ba3c00c1 fix: Prevent caching of Docker agent install script and binaries
Add no-cache headers to both the install script and agent binary download endpoints to prevent browsers and curl from serving stale cached versions. This ensures users always get the latest install script with URL normalization fixes for trailing slash issues.

Fixes #528
2025-10-12 18:04:57 +00:00
rcourtman
c18cf3d4b8 Fix node config API to preserve fields on partial updates
The PUT /api/config/nodes/{id} endpoint was corrupting node configurations
when making partial updates (e.g., updating just monitorPhysicalDisks):

- Authentication fields (tokenName, tokenValue, password) were being cleared
  when updating unrelated settings
- Name field was being blanked when not included in request
- Monitor* boolean fields were defaulting to false

Changes:
- Only update name field if explicitly provided in request
- Only switch authentication method when auth fields are explicitly provided
- Preserve existing auth credentials on non-auth updates
- Applied fix to all node types (PVE, PBS, PMG)

Also enables physical disk monitoring by default (opt-out instead of opt-in)
and preserves disk data between polling intervals.
2025-10-12 17:50:55 +00:00
rcourtman
a328dbd8e6 chore: bump version to v4.23.0 2025-10-12 16:35:48 +00:00
rcourtman
18a88cb4cc Improve NVMe temperature handling 2025-10-12 16:06:55 +00:00
rcourtman
2163d6f5a8 Use guest meminfo available for VM memory usage 2025-10-12 11:03:56 +00:00
rcourtman
7ba89da12c Fix docker agent installer for BusyBox wget (#255) 2025-10-12 10:57:19 +00:00
rcourtman
274f36daa8 Improve dashboard responsiveness and temperature handling 2025-10-12 10:34:06 +00:00
rcourtman
a74baed121 feat: capture Proxmox memory snapshots in diagnostics 2025-10-12 10:25:43 +00:00
rcourtman
f46ff1792b Fix settings security tab navigation v4.22.0 2025-10-11 23:29:47 +00:00