Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-02-18 00:17:39 +01:00

Author	SHA1	Message	Date
courtmanr@gmail.com	6803556dec	feat: auto-remove legacy agents during unified installation	2025-11-25 12:56:31 +00:00
courtmanr@gmail.com	7a204eab52	feat: add managed agents list and cleanup legacy scripts	2025-11-25 12:54:13 +00:00
courtmanr@gmail.com	92f8426ee7	feat: unify agent installation UI and scripts	2025-11-25 12:23:22 +00:00
courtmanr@gmail.com	930c086556	WIP: Save all pending changes including frontend updates and unified agent scaffolding	2025-11-25 11:27:07 +00:00
courtmanr@gmail.com	3ec7b401a3	Improve installer UX with pauses and popups on failure Fixes #755. Adds interactive pauses and graphical popups (where available) to installer scripts when critical errors occur, ensuring troubleshooting guides are readable. Also clarifies 'build from source' instructions.	2025-11-25 11:17:37 +00:00
courtmanr@gmail.com	bddb90229b	Improve setup script clarity: reduce verbosity and fix confusing messages	2025-11-25 10:13:20 +00:00
courtmanr@gmail.com	0c6fd01ff2	Improve setup script output by hiding irrelevant Docker/proxy info	2025-11-25 10:01:41 +00:00
courtmanr@gmail.com	7c69b75363	Fix checksum verification on macOS by replacing awk with grep	2025-11-25 09:36:21 +00:00
courtmanr@gmail.com	0c4b295ac7	refactor(scripts): replace legacy install-docker-agent.sh with bundled v2 script	2025-11-25 08:36:24 +00:00
courtmanr@gmail.com	7e8d7d1b5f	fix(scripts): improve checksum verification robustness against whitespace	2025-11-25 08:24:26 +00:00
courtmanr@gmail.com	193ef979ad	chore: remove unnecessary development files and docs - Remove CLEANUP_TODO.md and MIGRATION_SCAFFOLDING.md (internal notes) - Remove temporary scripts: copy_and_run.sh, work.sh - Remove AI assistant utility scripts: backup-claude-md.sh, codex-router.sh These files were used during development but don't belong in the repository.	2025-11-24 23:09:22 +00:00
courtmanr@gmail.com	c91add36d2	fix: filter out qdevice from cluster node discovery	2025-11-24 22:54:58 +00:00
olagrasli	9e18986558	Update install-docker-agent.sh to handle log_error and calculated checksum containing linebreak Added missing function log_error Updated checksum check to handle \r at the end of calculated_checksum	2025-11-24 21:09:44 +01:00
courtmanr@gmail.com	82ba508b59	chore: remove outdated docs, update cleanup script and release workflow	2025-11-24 19:14:54 +00:00
courtmanr@gmail.com	450081a8b0	Fix workflow name in trigger-release.sh	2025-11-24 18:10:13 +00:00
courtmanr@gmail.com	4168eb41f8	Fix host agent registration verification issues (#746) - Change default server listen addresses to empty string (listen on all interfaces including IPv6) - Add short hostname matching fallback in host lookup API to handle FQDN vs short name mismatches - Implement retry loop (30s) in both Windows and Linux/macOS installers for registration verification - Fix lint errors: remove unnecessary fmt.Sprintf and nil checks before len() This resolves the 'Installer could not yet confirm host registration with Pulse' warning by addressing timing issues, hostname matching, and network connectivity.	2025-11-24 14:28:09 +00:00
courtmanr@gmail.com	4640633430	Improve agent update logging and installer warnings (related to #737)	2025-11-23 22:07:37 +00:00
courtmanr@gmail.com	64a509e3da	Fix install-host-agent.sh function order, remove duplicate, and improve dev serving	2025-11-23 12:27:11 +00:00
courtmanr@gmail.com	a5fbe52a59	Fix pvecm status parsing for QDevice flags (#738)	2025-11-22 23:44:01 +00:00
rcourtman	d0d7a3dcbd	Fix mp mount detection pattern for pulse-sensor-proxy The grep pattern was looking for 'pulse-sensor-proxy' as a standalone string, but the actual mount line contains paths like: mp0: /run/pulse-sensor-proxy,mp=/mnt/pulse-proxy,replicate=0 This caused the removal logic to never execute, leaving the old mp mount in place and preventing the migration to lxc.mount.entry format. Changed pattern to match either path component: - /pulse-sensor-proxy (source path) - /mnt/pulse-proxy (mount point) Also removed space after colon in pattern to match actual format. This completes the fix for temperature proxy setup on LXC containers.	2025-11-22 22:34:26 +00:00
rcourtman	3858397f76	Fix LXC config modification for Proxmox pmxcfs filesystem The /etc/pve/ directory is a clustered FUSE filesystem (pmxcfs) managed by Proxmox. Direct modifications using sed -i or echo >> don't work reliably on this filesystem, and LXC config files contain snapshot sections that must be preserved. Changes: - Use temp file approach: copy config, modify temp, copy back to trigger sync - Only modify main config section (before first [snapshot] marker) - Properly handle both mp mount removal and lxc.mount.entry addition - Apply fix to both install.sh and install-sensor-proxy.sh This fixes temperature proxy setup failures where the socket mount entry wasn't being persisted to the container configuration. Related to #628	2025-11-22 22:19:00 +00:00
rcourtman	596bdbfb13	Handle standby SMART temps and capture disk identity	2025-11-22 07:35:13 +00:00
rcourtman	78ffb14493	Prevent token manager auth swap and fix docker agent perms (Related to #740)	2025-11-22 07:18:42 +00:00
rcourtman	a3d88ed7fe	Guard host-agent installs on noexec filesystems (Related to #718)	2025-11-21 23:00:47 +00:00
rcourtman	3b85436c0f	Related to #738: make pulse proxy mount migration-safe	2025-11-21 21:29:14 +00:00
rcourtman	28c0d3d39c	Harden release validation for host agent downloads (related to #735)	2025-11-21 10:47:53 +00:00
rcourtman	408e113f35	Add TrueNAS SCALE persistence for host agent (Related to #718)	2025-11-21 10:07:14 +00:00
rcourtman	2e10447773	Initialize ObservedValues in Windows installer	2025-11-20 21:01:44 +00:00
rcourtman	f0166dcab6	fix(installer): handle legacy sensor-proxy config commands	2025-11-20 20:33:51 +00:00
courtmanr@gmail.com	37b1517bd8	feat: implement atomic config management in sensor proxy	2025-11-20 19:01:24 +00:00
rcourtman	f8e59839ba	Add agent-id support to host agent installers (Related to #721)	2025-11-20 18:14:18 +00:00
courtmanr@gmail.com	c8b4d4a0d8	Implement sensor proxy installation and configuration updates	2025-11-20 13:23:21 +00:00
courtmanr@gmail.com	d8e2b40086	Fix macOS build for sensor-proxy and improve hot-dev script	2025-11-20 12:28:01 +00:00
courtmanr@gmail.com	212484885f	Improve login page UI and fix hot-dev script for macOS	2025-11-20 12:21:49 +00:00
courtmanr@gmail.com	11477546f8	Update config persistence, crypto, and dev script	2025-11-20 11:46:20 +00:00
rcourtman	bd0c47ed1b	Improve token collision handling and installer subnet support	2025-11-20 09:45:36 +00:00
rcourtman	3c5a1b273c	Improve Windows installer arch detection (related to #723)	2025-11-20 09:37:45 +00:00
rcourtman	7d0bbaf961	WIP: Fix temperature proxy registration persistence (incomplete) This commit contains multiple fixes for temperature proxy registration, but the core issue remains unresolved. ## What's Fixed: 1. Added config pointer and reloadFunc to TemperatureProxyHandlers 2. Added SetConfig method to keep handler in sync with router config changes 3. Added config reload after registration to prevent monitor from overwriting 4. Fixed installer port conflict detection and duplicate YAML key issues 5. Added comprehensive debug logging throughout registration flow ## What's Still Broken: The TemperatureProxyURL, TemperatureProxyToken, and TemperatureProxyControlToken fields are NOT persisting to nodes.enc after SaveNodesConfig is called. Debug logs confirm: - HandleRegister correctly updates nodesConfig.PVEInstances[matchedIndex] - The correct data is passed to SaveNodesConfig (verified in logs) - SaveNodesConfig completes without errors - Config reload executes successfully - BUT after Pulse restart, the fields are empty when loaded from disk The bug is in SaveNodesConfig serialization or file writing logic itself. Related files: - internal/api/temperature_proxy.go: Registration handler - internal/config/persistence.go: SaveNodesConfig implementation - internal/config/config.go: PVEInstance struct definition	2025-11-19 20:12:19 +00:00
rcourtman	714c2b753d	fix(sensor-proxy): ensure correct config.yaml permissions after modifications Fixed bug where config.yaml would end up with root:root 600 permissions after the installer modified it, causing service startup failures with "permission denied" errors. Root cause: Two code paths modified config.yaml without resetting ownership: 1. ensure_control_plane_config() - used mktemp (creates root-owned file), then mv'd it over config.yaml without chown/chmod 2. HTTP mode configuration - appended to config.yaml without resetting perms Fix: Added chown/chmod after both modifications: - Line 1601-1602: After control-plane config update - Line 1860-1861: After HTTP mode config append Now config.yaml maintains pulse-sensor-proxy:pulse-sensor-proxy 644 permissions after all modifications, allowing the service to start correctly. This bug was discovered during repair logic testing - the service failed to start after the installer ran, even though the fmt.Sprintf argument alignment fix was working correctly.	2025-11-19 14:53:44 +00:00
courtmanr@gmail.com	c4e76a7c97	fix: local dev setup, encryption key generation, and pnpm lockfile	2025-11-19 14:48:09 +00:00
rcourtman	e205473a4b	fix(docker): always reinstall sensor-proxy to refresh tokens/config Same turnkey UX issue as PVE quick setup: when local sensor-proxy socket exists and validates, install-docker.sh would skip reinstallation (line 498). This left stale control-plane tokens and URLs. Fix: Always reinstall when LOCAL_PROXY_EXISTED_AT_START=true - Track socket existence at script start - Change message: 'will refresh to update tokens/config' - Set SKIP_INSTALLATION=false to force reinstall - Installer is idempotent (Phase 2), safe to rerun This completes the turnkey repair fix across all entry points (PVE quick setup and Docker installer).	2025-11-19 13:37:06 +00:00
rcourtman	497f94f4e8	feat(sensor-proxy): improve turnkey setup experience with Pulse restart handling - Update installer to use v4.32.0 Phase 2 binaries with file-based config - Add automatic detection of Pulse service (systemd/hot-dev/docker) - Add --restart-pulse flag for automatic Pulse restart in dev/test environments - Default behavior shows clear instructions to restart Pulse manually (safe for production) - Add prominent restart notice with command suggestions based on detected deployment - Improve UX by making restart step impossible to miss Related to Phase 2 sensor-proxy architecture improvements	2025-11-19 12:44:07 +00:00
rcourtman	5c2379b4b4	fix(sensor-proxy): eliminate race in migration script Stop pulse-sensor-proxy service before modifying config.yaml to prevent races with the running daemon or concurrent CLI commands. The migration script now: 1. Stops service if running 2. Updates config atomically (temp + rename in same directory) 3. Restarts service if it was running This achieves complete architectural isolation - ALL config file writers are now coordinated (either through Phase 2 CLI locking or by ensuring the service is stopped during modification). Addresses final Codex review feedback.	2025-11-19 11:04:58 +00:00
rcourtman	0177e438e5	fix(sensor-proxy): ensure migrate script atomic write is same-filesystem Create temp file in same directory as config.yaml to ensure mv is truly atomic (won't degrade to copy+unlink on different filesystems). Added comments noting this is a legacy migration script with minor race risk (no file locking) that should be deprecated once all users upgrade to v4.32+.	2025-11-19 11:02:14 +00:00
rcourtman	d6084e29dd	fix(sensor-proxy): fix remaining unsafe config writers 1. Self-heal script: Add BINARY_PATH variable so CLI migration actually runs - Previously logged "Binary not available" and skipped migration 2. migrate-sensor-proxy-control-plane.sh: Use atomic write (temp + rename) - Prevents partial writes if script is interrupted - Reduces race window with running service These were the remaining gaps identified by Codex review. NOTE: migrate-sensor-proxy-control-plane.sh still uses Python manipulation instead of the Phase 2 CLI, but as a one-time migration script for upgrades from v4.31, the atomic write provides sufficient protection. Future versions can deprecate this script entirely.	2025-11-19 10:59:54 +00:00
rcourtman	d554c9dbb2	fix(sensor-proxy): eliminate all uncoordinated config writers Remove all code paths that manipulate config files without Phase 2 locking: 1. Installer: Remove ensure_allowed_nodes_file_reference() call (line 1674) - Migration now handled exclusively by config migrate-to-file 2. Installer: Make migration failures fatal in update_allowed_nodes() - Prevents fallback to unsafe Python manipulation 3. Daemon sanitizer: Remove os.WriteFile() call - Now only sanitizes in-memory copy, doesn't write back to disk - Logs warning instructing admin to run `config migrate-to-file` 4. Self-heal script: Replace 132 lines of Python with CLI call - sanitize_allowed_nodes() now calls `config migrate-to-file` - Eliminates uncoordinated Python-based config rewriting All config mutations now flow exclusively through Phase 2 CLI with atomic operations and file locking. No code paths remain that can create duplicate allowed_nodes blocks. Addresses Codex review feedback on Phase 2 gaps.	2025-11-19 10:55:01 +00:00
rcourtman	28cd487889	feat(sensor-proxy): complete Phase 2 with CLI-based config migration Add `config migrate-to-file` command and update installer to eliminate all shell/Python config manipulation, ensuring atomic operations throughout. Changes: - Add `config migrate-to-file` command to atomically migrate inline allowed_nodes blocks to file-based configuration - Update installer's update_allowed_nodes() to call CLI exclusively - Simplify migrate_inline_allowed_nodes_to_file() to use CLI - Remove dependency on Python/sed for config manipulation - Implement dual-file locking (config.yaml + allowed_nodes.yaml) to prevent race conditions during migration All config mutations now flow through the Phase 2 CLI with: - File locking (flock) - Atomic writes (temp + rename + fsync) - Proper YAML parsing/generation This completes Phase 2 architecture and eliminates the root cause of config corruption issues. Related to prior commits: `53dec6010`, `3dc073a28`, `804a638ea`, `131666bc1`	2025-11-19 10:35:49 +00:00
rcourtman	1162a208cc	fix(sensor-proxy): critical Phase 2 locking and validation fixes Fixes critical issues found by Codex code review: 1. Fixed file locking race condition (CRITICAL) - Lock file was being replaced by atomic rename, invalidating the lock - New approach: lock a separate `.lock` file that persists across renames - Ensures concurrent writers (installer + self-heal timer) are properly serialized - Without this fix, corruption was still possible despite Phase 2 2. Fixed validation to honor configured allowed_nodes_file path - validate command now uses loadConfig() to read actual config - Respects allowed_nodes_file setting instead of assuming default path - Prevents false positives/negatives when path is customized 3. Allow empty allowed_nodes lists - Empty lists are valid (admin may clear for security, or rely on IPC validation) - validate no longer fails on empty lists - set-allowed-nodes --replace with zero nodes now supported - Critical for operational flexibility 4. Installer error propagation - update_allowed_nodes failures now exit installer with error - Prevents silent failures that leave stale allowlists - Self-heal will abort instead of masking CLI errors Technical Details: - withLockedFile() now locks `<path>.lock` instead of target file - Lock held for entire duration of read-modify-write-rename - atomicWriteFile() completes while lock is still held - Empty lists represented as `allowed_nodes: []` in YAML Testing: ✅ Lock file created and persists across operations ✅ Empty list can be written with --replace ✅ Validation passes with empty lists ✅ Config path from allowed_nodes_file honored ✅ Concurrent operations properly serialized These fixes ensure Phase 2 actually eliminates corruption by design. Identified by Codex code review Related to Phase 2 commit `3dc073a28`	2025-11-19 09:47:43 +00:00
rcourtman	0565781655	feat(sensor-proxy): Phase 2 - atomic config management with CLI Implements bullet-proof configuration management to completely eliminate allowed_nodes corruption by design. This builds on Phase 1 (file-only mode) by replacing all shell/Python config manipulation with proper Go tooling. New Features: - `pulse-sensor-proxy config validate` - parse and validate config files - `pulse-sensor-proxy config set-allowed-nodes` - atomic node list updates - File locking via flock prevents concurrent write races - Atomic writes (temp file + rename) ensure consistency - systemd ExecStartPre validation prevents startup with bad config Architectural Changes: 1. Installer now calls config CLI instead of embedded Python/shell scripts 2. All config mutations go through single authoritative writer 3. Deduplication and normalization handled in Go (reuses existing logic) 4. Sanitizer kept as noisy failsafe (warns if corruption still occurs) Implementation Details: - New cmd/pulse-sensor-proxy/config_cmd.go with cobra commands - withLockedFile() wrapper ensures exclusive access - atomicWriteFile() uses temp + rename pattern - Installer update_allowed_nodes() simplified to CLI calls - Both systemd service modes include ExecStartPre validation Why This Works: - Single code path for all writes (no shell/Python divergence) - File locking serializes self-heal timer + manual installer runs - Validation gate prevents proxy from starting with corrupt config - CLI uses same YAML parser as the daemon (guaranteed compatibility) Phase 2 Benefits: - Corruption impossible by design (not just detected and fixed) - No more Python dependency for config management - Atomic operations prevent partial writes - Clear error messages on validation failures The defensive sanitizer remains active but now logs loudly if triggered, allowing us to confirm Phase 2 eliminates corruption in production before removing the safety net entirely. This completes the fix for the recurring temperature monitoring outages. Related to Phase 1 commit `53dec6010`	2025-11-19 09:37:49 +00:00
rcourtman	5f4143f0ab	fix(sensor-proxy): eliminate allowed_nodes config corruption Phase 1 hotfix to address recurring config file corruption that causes 99% of temperature monitoring failures. The root cause was the installer oscillating between inline and file-based allowlist modes, creating duplicate `allowed_nodes:` keys in config.yaml. Changes: - Force file-based allowlist mode exclusively (refuse versions < v4.31.1) - Add automatic migration from inline to file-based config - Remove inline mode code path from update_allowed_nodes() - Migration runs on every install/self-heal to clean up existing corruption The self-heal timer runs every 5 minutes and was the primary source of corruption when version detection failed or encountered edge cases. This eliminates the dual code paths and ensures config.yaml is never edited for allowlist changes - only /etc/pulse-sensor-proxy/allowed_nodes.yaml is modified. Phase 2 (next release) will implement proper Go-based config management with atomic writes, locking, and systemd validation to prevent corruption by design. Related to recurring temperature monitoring outages	2025-11-19 09:21:54 +00:00

220 Commits