Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-02-18 00:17:39 +01:00

Author	SHA1	Message	Date
rcourtman	b72fc2ab79	docs: align sensor proxy config with current defaults	2025-11-20 12:40:01 +00:00
rcourtman	e39c6a3660	docs(sensor-proxy): comprehensive config management documentation Adds complete documentation for the new sensor-proxy config management CLI implemented in Phase 2. Addresses user-facing aspects of the corruption fix. New Documentation: - docs/operations/sensor-proxy-config-management.md (469 lines) - Complete operations runbook for config management - Full CLI reference with examples - Migration guide from inline config - Architecture explanation - Common operational tasks - Troubleshooting guide - Best practices and automation Updated Documentation: - cmd/pulse-sensor-proxy/README.md - Configuration Management CLI section - Allowed Nodes File format - Enhanced troubleshooting - Config corruption recovery - docs/TEMPERATURE_MONITORING.md - Config validation failure troubleshooting - Configuration Management quick reference - Cross-links to detailed docs - docs/TROUBLESHOOTING.md - Sensor proxy config validation errors - Comprehensive diagnosis steps - Automatic and manual recovery - README.md & docs/README.md - Added new runbook to operations index - Positioned for discoverability Coverage: - Both CLI commands fully documented - Phase 1 & Phase 2 architecture explained - Migration path from pre-v4.31.1 - Config corruption recovery procedures - Safe config editing practices - Automation examples - Troubleshooting all failure modes Documentation Quality: - Cross-linked from 5 different documents - Clear examples for common use cases - Target audience: system administrators - Follows project documentation style - Production-ready This completes the sensor-proxy config corruption fix by providing users with comprehensive guidance for the new config management system. Related to Phase 2 commits `3dc073a28`, `804a638ea`, `131666bc1`	2025-11-19 10:01:33 +00:00
rcourtman	c176f9eb51	Document proxy control-plane refresh	2025-11-18 14:31:08 +00:00
rcourtman	f9341ae1fc	Improve temperature proxy workflow	2025-11-17 14:25:46 +00:00
rcourtman	47d5c14aef	Improve temperature proxy control-plane flow	2025-11-15 21:49:51 +00:00
rcourtman	4752a9baff	docs: reference log forwarding runbook in sensor proxy guides	2025-11-14 10:37:09 +00:00
rcourtman	25ae527c95	Clarify sensor proxy HTTPS workflow in docs	2025-11-14 00:48:41 +00:00
rcourtman	c0942d93f0	Explain HTTPS-first temperature architecture	2025-11-14 00:45:20 +00:00
rcourtman	61f011af1d	Improve temperature proxy diagnostics and tests	2025-11-13 22:31:53 +00:00
rcourtman	1a3abf7f3f	Fix pulse-host-agent temperature collection on all Linux distros (related to #661) The temperature collection in pulse-host-agent was broken on all Linux distributions due to an incorrect platform check. Root cause: - collectTemperatures() checked `if a.platform != "linux"` at agent.go:316 - normalisePlatform() returns the raw distro name from gopsutil (debian, ubuntu, pve) - This caused temperature collection to be skipped on ALL Linux hosts Fix: - Changed check to `if runtime.GOOS != "linux"` which correctly identifies Linux - runtime.GOOS returns "linux" regardless of distribution Also fixed documentation typo: - Changed "Servers tab" to "Hosts tab" in HOST_AGENT.md and TEMPERATURE_MONITORING.md - Reported by user in issue #661 comments Testing: - Verified build succeeds - Confirmed runtime.GOOS returns "linux" on Linux systems Related to #661	2025-11-08 10:25:01 +00:00
rcourtman	2b7492ac59	feat: Add temperature collection to pulse-host-agent (related to #661) Implements temperature monitoring in pulse-host-agent to support Docker-in-VM deployments where the sensor proxy socket cannot cross VM boundaries. Changes: - Create internal/sensors package with local collection and parsing - Add temperature collection to host agent (Linux only, best-effort) - Support CPU package/core, NVMe, and GPU temperature sensors - Update TEMPERATURE_MONITORING.md with Docker-in-VM setup instructions - Update HOST_AGENT.md to document temperature feature The host agent now automatically collects temperature data on Linux systems with lm-sensors installed. This provides an alternative path for temperature monitoring when running Pulse in a VM, avoiding the unix socket limitation. Temperature collection is best-effort and fails gracefully if lm-sensors is not available, ensuring other metrics continue to be reported. Related to #661	2025-11-07 22:54:40 +00:00
rcourtman	52bc23b850	docs: Fix remaining :rw mount references to :ro Updates all remaining references to read-write socket mounts in TEMPERATURE_MONITORING.md to use read-only (:ro) mounts for security. Changes: - Manual installation section - Docker-only responsibilities section - Ansible playbook example All socket mounts should be :ro to prevent container tampering.	2025-11-07 17:14:47 +00:00
rcourtman	f9dc2f6466	docs: Add comprehensive security audit documentation Adds complete documentation for 2025-11-07 security audit and hardening: - SECURITY_AUDIT_2025-11-07.md: Full professional audit report - 9 security issues identified and fixed (4 critical, 4 medium, 1 low) - Detailed findings, remediations, and testing - Security posture improved from B+ to A - 85%+ reduction in exploitable attack surface - SECURITY_CHANGELOG.md: Detailed changelog with migration guide - Complete implementation details for all fixes - Configuration examples - Backwards compatibility notes - New metrics and features - DEPLOYMENT_CHECKLIST.md: Step-by-step deployment guide - Pre-deployment backup procedures - Deployment steps for Docker and LXC - Verification procedures - Rollback procedures - Troubleshooting guide - Success criteria - README.md: Updated with security hardening highlights - Links to audit report - Key security features added Audit performed by Claude (Sonnet 4.5) + Codex collaboration. All implementations by Codex based on Claude specifications. 100% remediation rate (9/9 issues fixed). 17 new tests added, all passing. Related to security audit 2025-11-07.	2025-11-07 17:10:21 +00:00
rcourtman	48fabdd827	Improve Docker temperature monitoring documentation for clarity (related to #600) Updated the Quick Start for Docker section in TEMPERATURE_MONITORING.md to be more user-friendly and address common setup issues: - Added clear explanation of why the proxy is needed (containers can't access hardware) - Provided concrete IP example instead of placeholder - Showed full docker-compose.yml context with proper YAML structure - Added sudo to commands where needed - Updated docker-compose commands to v2 syntax with note about v1 - Expanded verification steps with clearer success indicators - Added reminder to check container name in verification commands These improvements should help users who encounter blank temperature displays due to missing proxy installation or bind mount configuration.	2025-11-07 15:09:42 +00:00
rcourtman	becda56897	Fix critical rollback download URL bug and doc inconsistencies Issues found during systematic audit after #642: 1. CRITICAL BUG - Rollback downloads were completely broken: - Code constructed: pulse-linux-amd64 (no version, no .tar.gz) - Actual asset name: pulse-v4.26.1-linux-amd64.tar.gz - This would cause 404 errors on all rollback attempts - Fixed: Construct correct tarball URL with version - Added: Extract tarball after download to get binary 2. TEMPERATURE_MONITORING.md referenced non-existent v4.27.0: - Changed to use /latest/download/ for future-proof docs 3. API.md example had wrong filename format: - Changed pulse-linux-amd64.tar.gz to pulse-v4.30.0-linux-amd64.tar.gz - Ensures example matches actual release asset naming The rollback bug would have affected any user attempting to roll back to a previous version via the UI or API.	2025-11-06 14:25:32 +00:00
rcourtman	dfe960deb4	Fix container SSH detection and improve troubleshooting for issue #617 Related to #617 This fixes a misconfiguration scenario where Docker containers could attempt direct SSH connections (producing [preauth] log spam) instead of using the sensor proxy. Changes: - Fix container detection to check PULSE_DOCKER=true in addition to system.InContainer() heuristics (both temperature.go and config_handlers.go) - Upgrade temperature collection log from Error to Warn with actionable guidance about mounting the proxy socket - Add Info log when dev mode override is active so operators understand the security posture - Add troubleshooting section to docs for SSH [preauth] logs from containers The container detection was inconsistent - monitor.go checked both flags but temperature.go and config_handlers.go only checked InContainer(). Now all locations consistently check PULSE_DOCKER || InContainer().	2025-11-06 09:57:53 +00:00
rcourtman	a5e3469da8	Add comprehensive automation documentation for temperature proxy installation This addresses the need for users who deploy Pulse via infrastructure-as-code tools (Ansible, Terraform, Salt, Puppet) to have scriptable, well-documented installation procedures. Changes: Comprehensive Automation Section: - Documented all installer script flags and options - Required: --ctid (LXC) or --standalone (Docker) - Optional: --quiet, --pulse-server, --version, --local-binary, --skip-restart - Documented idempotency, exit codes, and non-interactive behavior Real-World Examples: - Ansible playbook for LXC deployments - Ansible playbook for Docker deployments (includes docker-compose.yml management) - Terraform null_resource example with remote-exec - Manual step-by-step configuration (no script) Configuration Documentation: - Complete YAML config file format with all options - Environment variable overrides (PULSE_SENSOR_PROXY_ALLOWED_SUBNETS, etc.) - Example systemd service overrides - Rate limiting, metrics, ACL, and subnet configuration Quick Reference: - Added link at top of doc for automation users to jump directly to automation section - Clear examples of re-running after changes (adding nodes, upgrading versions) Key Features for Automation: - --quiet flag for non-interactive execution - Idempotent design (safe to re-run) - Verifiable exit codes - Environment variable configuration - Local binary support (no internet required) This makes it straightforward for infrastructure teams to integrate Pulse temperature monitoring into their existing automation workflows without relying on interactive scripts or manual steps.	2025-11-05 18:18:04 +00:00
rcourtman	a1fb79ae6a	Fix temperature proxy documentation and setup script for Docker vs LXC clarity This addresses confusion around temperature monitoring setup for Docker deployments where users expected a turnkey experience similar to LXC. The core issue: The setup script and documentation suggested that temperature monitoring was "automatically configured" for all containerized deployments, but in reality only LXC containers have a fully automatic setup. Docker requires manual steps. Changes: Setup Script (config_handlers.go): - Fixed "unknown environment" path to show separate instructions for LXC vs Docker - Docker instructions now correctly show --standalone flag (was incorrectly showing --ctid) - Added docker-compose.yml bind mount instructions inline - Added restart command for Docker deployments Documentation (TEMPERATURE_MONITORING.md): - Added prominent "Deployment-Specific Setup" callout at the top - Clarified that LXC is fully automatic, Docker requires manual steps - Reorganized "Setup (Automatic)" section to clearly distinguish: - LXC: Fully turnkey (no manual steps) - Docker: Manual proxy installation required - Node configuration: Works for both - Updated "Host-side responsibilities" to specify it's Docker-only - Fixed architecture benefits to reflect LXC vs Docker differences Why this matters: - LXC setup script auto-detects the container and runs install-sensor-proxy.sh --ctid - Docker deployments can't be auto-detected and require --standalone flag - Users running Docker were getting incorrect instructions (--ctid instead of --standalone) - Documentation suggested everything was automatic, leading to confusion Now the documentation and setup script accurately reflect that: - LXC = Turnkey (automatic) - Docker = Manual steps required (but well-documented) - Native = Direct SSH (no proxy) Related to GitHub Discussion #605	2025-11-05 18:18:04 +00:00
rcourtman	26144ae558	Improve temperature proxy setup guidance for Docker deployments This addresses GitHub Discussion #605 where users were unclear about configuring the temperature proxy when running Pulse in Docker. Changes: install-sensor-proxy.sh: - Add Docker-specific post-install instructions when --standalone flag is used - Show required docker-compose.yml bind mount configuration - Provide verification commands for Docker deployments - Link to full documentation for troubleshooting TEMPERATURE_MONITORING.md: - Add prominent "Quick Start for Docker Deployments" section at the top - Move Docker instructions earlier in the document for better visibility - Provide complete 4-step setup process with verification commands These changes ensure Docker users immediately see: 1. How to install the proxy on the Proxmox host 2. What bind mount to add to docker-compose.yml 3. How to restart and verify the setup 4. Where to find detailed troubleshooting The installer now provides actionable next steps instead of just confirming installation, reducing confusion for containerized deployments.	2025-11-05 18:18:04 +00:00
rcourtman	d52ac6d8b5	Fix CSRF token validation and improve token management - Add Access-Control-Expose-Headers to allow frontend to read X-CSRF-Token response header - Implement proactive CSRF token issuance on GET requests when session exists but CSRF cookie is missing - Ensures frontend always has valid CSRF token before making POST requests - Fixes 403 Forbidden errors when toggling system settings This resolves CSRF validation failures that occurred when CSRF tokens expired or were missing while valid sessions existed.	2025-11-05 09:23:44 +00:00
rcourtman	5c4be1921c	chore: snapshot current changes	2025-11-02 22:47:55 +00:00
rcourtman	ff4dc49ae4	Update Pulse install flow and related components	2025-10-21 19:58:53 +00:00
rcourtman	ddc9a7a068	docs: comprehensive documentation for rate limit fix and configurability Document the pulse-sensor-proxy rate limiting bug fix and new configurability across all relevant documentation: TEMPERATURE_MONITORING.md: - Added 'Rate Limiting & Scaling' section with symptom diagnosis - Included sizing table for 1-3, 4-10, 10-20, and 30+ node deployments - Provided tuning formula: interval_ms = polling_interval / node_count TROUBLESHOOTING.md: - Added 'Temperature data flickers after adding nodes' section - Step-by-step diagnosis using limiter metrics and scheduler health - Quick fix with config example CONFIGURATION.md: - Added pulse-sensor-proxy/config.yaml reference section - Documented rate_limit.per_peer_interval_ms and per_peer_burst fields - Included defaults and example override pulse-sensor-proxy-runbook.md: - Updated quick reference with new defaults (1 req/sec, burst 5) - Added 'Rate Limit Tuning' procedure with 4 deployment profiles - Included validation steps and monitoring commands TEMPERATURE_MONITORING_SECURITY.md: - Updated rate limiting section with new defaults - Added configurable overrides guidance - Documented security considerations for production deployments Related commits: - `46b8b8d08`: Initial rate limit fix (hardcoded defaults) - `ca534e2b6`: Made rate limits configurable via YAML - `e244da837`: Added guidance for large deployments (30+ nodes)	2025-10-21 11:36:07 +00:00
rcourtman	c91b7874ac	docs: comprehensive v4.24.0 documentation audit and updates Complete documentation overhaul for Pulse v4.24.0 release covering all new features and operational procedures. Documentation Updates (19 files): P0 Release-Critical: - Operations: Rewrote ADAPTIVE_POLLING_ROLLOUT.md as GA operations runbook - Operations: Updated ADAPTIVE_POLLING_MANAGEMENT_ENDPOINTS.md with DEFERRED status - Operations: Enhanced audit-log-rotation.md with scheduler health checks - Security: Updated proxy hardening docs with rate limit defaults - Docker: Added runtime logging and rollback procedures P1 Deployment & Integration: - KUBERNETES.md: Runtime logging config, adaptive polling, post-upgrade verification - PORT_CONFIGURATION.md: Service naming, change tracking via update history - REVERSE_PROXY.md: Rate limit headers, error pass-through, v4.24.0 verification - PROXY_AUTH.md, OIDC.md, WEBHOOKS.md: Runtime logging integration - TROUBLESHOOTING.md, VM_DISK_MONITORING.md, zfs-monitoring.md: Updated workflows Features Documented: - X-RateLimit-* headers for all API responses - Updates rollback workflow (UI & CLI) - Scheduler health API with rich metadata - Runtime logging configuration (no restart required) - Adaptive polling (GA, enabled by default) - Enhanced audit logging - Circuit breakers and dead-letter queue Supporting Changes: - Discovery service enhancements - Config handlers updates - Sensor proxy installer improvements Total Changes: 1,626 insertions(+), 622 deletions(-) Files Modified: 24 (19 docs, 5 code) All documentation is production-ready for v4.24.0 release.	2025-10-20 17:20:13 +00:00
Richard Courtman	02701ca22b	fix: gracefully handle standalone node cleanup limitation - Cleanup script now detects forced command restriction on standalone nodes - Logs helpful message explaining limitation (security by design) - Does not fail when standalone nodes cannot be cleaned up - Documents that standalone node cleanup is limited by forced command security - Automatic cleanup works fully for cluster nodes - Manual cleanup command provided for standalone nodes if needed	2025-10-18 07:34:18 +00:00
Richard Courtman	b328a09e45	docs: add automatic cleanup documentation for node removal	2025-10-18 07:03:42 +00:00
Richard Courtman	de3bb47930	fix: improve turnkey temperature monitoring for standalone nodes - Fix script input handling to work with standard curl | bash pattern by prioritizing /dev/tty - Add Raspberry Pi temperature sensor support (cpu_thermal chip and generic temp sensors) - Add comprehensive documentation for turnkey standalone node setup - Fix printf formatting error in setup script	2025-10-18 06:51:56 +00:00
rcourtman	a5d4d57097	docs: implement Codex recommendations for temperature monitoring Add comprehensive documentation improvements based on architectural review: 1. Enhanced Known Limitations section: - Document single proxy failure mode - Explain sensors output parsing brittleness with mitigation steps - Clarify cluster discovery dependencies and fallback options - Describe SSH fan-out scaling considerations for large clusters 2. Documented SSH key rotation workflow: - Promote automated rotation script as recommended approach - Include dry-run, execution, and rollback examples - Provide manual fallback process - Reference existing pulse-proxy-rotate-keys.sh script 3. Added Future Improvements roadmap: - Proxmox API integration (when available) - Agent-based architecture option - SNMP/IPMI support - Schema validation - Caching and throttling - Automated rotation timer - Health check endpoint Instrumentation verified: proxy already has comprehensive Prometheus metrics (RPC/SSH requests, latency, queue depth, rate limiting) and structured logging.	2025-10-17 12:03:31 +00:00
rcourtman	07fe382553	docs: update temperature monitoring guide to reflect removed UI button - Replace references to 'Ensure cluster keys' button with instructions to re-run setup script - Update troubleshooting section for new cluster nodes - The setup script already handles SSH key distribution automatically	2025-10-17 11:46:31 +00:00
rcourtman	d3d4b9811a	docs: add manual pulse-sensor-proxy install steps	2025-10-13 19:36:50 +00:00
rcourtman	fcd8b62705	refactor: Rename install-temp-proxy.sh to install-sensor-proxy.sh Complete the pulse-sensor-proxy rename by updating the installer script name and all references to it. Updated: - Renamed scripts/install-temp-proxy.sh → scripts/install-sensor-proxy.sh - Updated all documentation references - Updated install.sh references - Updated build-release.sh comments	2025-10-13 13:23:53 +00:00
rcourtman	b952444837	refactor: Rename pulse-temp-proxy to pulse-sensor-proxy The name "temp-proxy" implied a temporary or incomplete implementation. The new name better reflects its purpose as a secure sensor data bridge for containerized Pulse deployments. Changes: - Renamed cmd/pulse-temp-proxy/ to cmd/pulse-sensor-proxy/ - Updated all path constants and binary references - Renamed environment variables: PULSE_TEMP_PROXY_* to PULSE_SENSOR_PROXY_* - Updated systemd service and service account name - Updated installation, rotation, and build scripts - Renamed hardening documentation - Maintained backward compatibility for key removal during upgrades	2025-10-13 13:17:05 +00:00
rcourtman	97066d8351	docs: Update socket paths and add monitoring section to TEMPERATURE_MONITORING.md Updated documentation to reflect new directory-level bind mount architecture: - Changed socket path from /var/run/pulse-temp-proxy.sock to /run/pulse-temp-proxy/pulse-temp-proxy.sock - Updated LXC bind mount syntax to directory-level (create=dir instead of create=file) - Added "Monitoring the Proxy" section with manual monitoring commands - Documents systemd restart-on-failure reliance for v1 - Notes future pulse-watchdog integration planned Related to #528	2025-10-12 22:42:38 +00:00
rcourtman	47116bedb5	docs: Add comprehensive Operations & Troubleshooting section Addresses operational documentation gaps for pulse-temp-proxy: - Service management (restart, stop, start, enable/disable) - Log locations and viewing commands - SSH key rotation procedures (recommended every 90 days) - Key revocation when nodes leave cluster - Failure modes (proxy down, socket issues, pvecm absent, off-cluster) - Known limitations (one per host, cluster membership, cross-cluster) - Common issues with troubleshooting steps - Diagnostic info collection for bug reports This provides operators with everything they need to manage the proxy service in production environments.	2025-10-12 21:50:55 +00:00
rcourtman	6d4694f019	security: Add SO_PEERCRED authentication to temperature proxy Addresses security concern raised in code review: - Socket permissions changed from 0666 to 0660 - Added SO_PEERCRED verification to authenticate connecting processes - Only allows root (UID 0) or proxy's own user - Prevents unauthorized processes from triggering SSH key rollout - Documented passwordless root SSH requirement for clusters This prevents any process on the host or in other containers from accessing the proxy RPC endpoints.	2025-10-12 21:42:22 +00:00
rcourtman	e7bc338891	feat: Implement secure temperature proxy for containerized deployments Addresses #528 Introduces pulse-temp-proxy architecture to eliminate SSH key exposure in containers: Architecture: - pulse-temp-proxy runs on Proxmox host (outside LXC/Docker) - SSH keys stored on host filesystem (/var/lib/pulse-temp-proxy/ssh/) - Pulse communicates via unix socket (bind-mounted into container) - Proxy handles cluster discovery, key rollout, and temperature fetching Components: - cmd/pulse-temp-proxy: Standalone Go binary with unix socket RPC server - internal/tempproxy: Client library for Pulse backend - scripts/install-temp-proxy.sh: Idempotent installer for existing deployments - scripts/pulse-temp-proxy.service: Systemd service for proxy Integration: - Pulse automatically detects and uses proxy when socket exists - Falls back to direct SSH for native installations - Installer automatically configures proxy for new LXC deployments - Existing LXC users can upgrade by running install-temp-proxy.sh Security improvements: - Container compromise no longer exposes SSH keys - SSH keys never enter container filesystem - Maintains forced command restrictions - Transparent to users - no workflow changes Documentation: - Updated TEMPERATURE_MONITORING.md with new architecture - Added verification steps and upgrade instructions - Preserved legacy documentation for native installs	2025-10-12 21:35:35 +00:00
rcourtman	c8e3c93516	fix: Add security gates for containerized temperature monitoring Addresses #528 - Added opt-in confirmation prompt to setup script with security notice - Added runtime warning when containerized Pulse uses SSH temperature monitoring - Documented security considerations and hardening recommendations - Users must explicitly confirm understanding before enabling in containers	2025-10-12 21:01:25 +00:00
rcourtman	18a88cb4cc	Improve NVMe temperature handling	2025-10-12 16:06:55 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00
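
Note on commit 1a3abf7f3f: the platform-check fix it describes can be illustrated with a small Go sketch. This is not the project's code. `shouldCollectTemperatures` and its `goos` parameter are hypothetical names introduced here, and the distro strings are the examples listed in the commit message. The sketch only shows why comparing a distro name against "linux" skips every Linux host, while comparing `runtime.GOOS` does not.

```go
package main

import (
	"fmt"
	"runtime"
)

// shouldCollectTemperatures is a hypothetical helper (not from the Pulse source).
// It gates collection on the operating system name, which is what runtime.GOOS
// reports, rather than on a distribution name.
func shouldCollectTemperatures(goos string) bool {
	return goos == "linux"
}

func main() {
	// Distro names such as these never equal "linux", so a check written
	// against them would skip temperature collection on every Linux host.
	for _, platform := range []string{"debian", "ubuntu", "pve"} {
		fmt.Printf("platform=%q equals \"linux\": %v\n", platform, platform == "linux")
	}

	// runtime.GOOS is the target operating system of the running binary.
	fmt.Printf("runtime.GOOS=%q collect=%v\n", runtime.GOOS, shouldCollectTemperatures(runtime.GOOS))
}
```

Passing the OS name in as a parameter keeps the helper testable on any build host. The commit message states that the real check lives in `collectTemperatures()`.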

39 Commits