Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-02-18 00:17:39 +01:00

Author	SHA1	Message	Date
rcourtman	6753727a04	docs: update API documentation and config file references Comprehensive documentation updates: API.md: - Add /api/security/change-password endpoint - Add AI provider test endpoints - Add assistant chat & session management endpoints - Add legacy chat sessions endpoints - Add alert investigation and patrol autonomy endpoints - Add findings & investigations endpoints - Add approvals & command execution endpoints - Add remediation plans endpoints - Add intelligence & forecasting endpoints - Add knowledge base endpoints - Add debug endpoint - Add Socket.IO compatibility endpoint Config files: - Document sso.enc, ai_chat_sessions.json - Document profile-versions.json, profile-changelog.json, profile-deployments.json	2026-02-01 23:26:42 +00:00
rcourtman	ec802c4864	Update documentation with configuration and deployment details - CONFIGURATION.md: Add comprehensive system.json keys table with descriptions for all polling, discovery, and UI settings - DEPLOYMENT_MODELS.md: Document audit signing key, agent profile files, org metadata, and multi-tenant storage layout - METRICS_HISTORY.md: Update resourceType values, add maxPoints param, document Pro license requirement for ranges beyond 7d - MULTI_TENANT.md: Add storage layout and migration section, remove completed TODO items from backlog - CENTRALIZED_MANAGEMENT.md: Update links and clarify architecture - API.md: Update endpoint documentation - UNIFIED_AGENT.md: Document --version and --self-test flags	2026-02-01 22:24:48 +00:00
rcourtman	6873913e64	fix: install script and docs improvements - Fixed --disable-docker not being passed to systemd service file. Related to #1151 - Added init: true requirement to HTTPS/TLS docs for Docker. Related to #1166	2026-01-26 20:48:57 +00:00
rcourtman	f1c2d7c12c	docs: add logging overrides to configuration reference Document LOG_FILE, LOG_MAX_SIZE, LOG_MAX_AGE, and LOG_COMPRESS environment variables for log file configuration.	2026-01-22 00:44:33 +00:00
rcourtman	0ca6001bad	docs: update documentation after sensor proxy deprecation Update docs to reflect the simplified temperature monitoring architecture: - Remove references to pulse-sensor-proxy throughout - Update TEMPERATURE_MONITORING.md to focus on unified agent approach - Update CONFIGURATION.md, DEPLOYMENT_MODELS.md, FAQ.md - Remove SECURITY_CHANGELOG.md (proxy-specific security notes) - Clarify current recommended setup in various guides	2026-01-21 12:00:59 +00:00
rcourtman	ee63d438cc	docs: standardize markdown syntax and remove deprecated sensor-proxy docs	2026-01-20 09:43:49 +00:00
rcourtman	035436ad6e	fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking The agent was crashing with 'fatal error: concurrent map writes' when handleCheckUpdatesCommand spawned a goroutine that called collectOnce concurrently with the main collection loop. Both code paths access a.prevContainerCPU without synchronization. Added a.cpuMu mutex to protect all accesses to prevContainerCPU in: - pruneStaleCPUSamples() - collectContainer() delete operation - calculateContainerCPUPercent() Related to #1063	2026-01-15 21:10:55 +00:00
rcourtman	80729408c1	docs: add RBAC endpoints, OIDC group mapping, and update Pro terminology - Add RBAC/role management endpoints to API.md - Document OIDC group-to-role mapping feature in OIDC.md - Add missing config files to CONFIGURATION.md (audit.db, AI files) - Add OIDC_GROUP_ROLE_MAPPINGS env var documentation - Fix "enterprise" -> "Pro" terminology in TROUBLESHOOTING.md - Refocus TEMPERATURE_MONITORING.md on agent method, collapse legacy proxy docs	2026-01-10 13:59:50 +00:00
rcourtman	695ced6273	docs: Add API token scopes and kiosk mode documentation Documents all available token scopes, UI presets, and step-by-step instructions for setting up kiosk mode with read-only dashboard tokens. Related to #1055	2026-01-08 10:27:15 +00:00
rcourtman	3f0808e9f9	docs: comprehensive core and Pro documentation overhaul - Major updates to README.md and docs/README.md for Pulse v5 - Added technical deep-dives for Pulse Pro (docs/PULSE_PRO.md) and AI Patrol (docs/AI.md) - Updated Prometheus metrics documentation and Helm schema for metrics separation - Refreshed security, installation, and deployment documentation for unified agent models - Cleaned up legacy summary files	2026-01-07 17:38:27 +00:00
rcourtman	9cfcdbb247	fix: Use per-node shared flag for storage deduplication The storage deduplication logic only checked cluster config's Shared flag, but this required the cluster config API call to succeed. When the per-node storage API already returns shared=1 (as the user verified), we should use that directly. Now we check three sources for shared storage detection: 1. Per-node API shared flag (storage.Shared) 2. Cluster config shared flag (if available) 3. Storage type heuristics (NFS, RBD, PBS, etc.) Related to #1049	2026-01-07 10:16:23 +00:00
rcourtman	dcdbee3c5c	feat: Add in-app help system with HelpIcon component Add contextual help icons throughout the UI to improve feature discoverability. Users can click (?) icons to see explanations with examples for settings they might not understand. - HelpIcon component with click-to-open popover - Centralized help content registry in /content/help/ - FeatureTip component for dismissible contextual tips - Help added to: alert delay, AI endpoints, update channel	2026-01-07 09:22:23 +00:00
rcourtman	d71754743c	docs: Add PULSE_DISABLE_DOCKER_UPDATE_ACTIONS documentation - Add to DOCKER.md configuration table and new 'Disabling Update Features' section - Add to CONFIGURATION.md monitoring overrides table - Clarify difference between disabling update detection vs hiding buttons	2026-01-02 10:35:04 +00:00
rcourtman	3b433b1336	fix(agent): support PULSE_AGENT_CONNECT_URL and improve detection	2025-12-19 17:01:58 +00:00
rcourtman	968e0a7b3d	fix: reduce syslog flooding by downgrading routine logs to debug level Addresses issue #861 - syslog flooded on docker host Many routine operational messages were being logged at INFO level, causing excessive log volume when monitoring multiple VMs/containers. These messages are now logged at DEBUG level: - Guest threshold checking (every guest, every poll cycle) - Storage threshold checking (every storage, every poll cycle) - Host agent linking messages - Filesystem inclusion in disk calculation - Guest agent disk usage replacement - Polling start/completion messages - Alert cleanup and save messages Users can set LOG_LEVEL=debug to see these messages if needed for troubleshooting. The default INFO level now produces significantly less log output. Also updated documentation in CONFIGURATION.md and DOCKER.md to: - Clarify what each log level includes - Add tip about using LOG_LEVEL=warn for minimal logging	2025-12-18 23:27:32 +00:00
rcourtman	65829983b5	v5: gate legacy sensor-proxy and prune dev docs	2025-12-18 21:51:25 +00:00
rcourtman	2b48b0a459	feat: add --kube-include-all-deployments flag for Kubernetes agent Adds IncludeAllDeployments option to show all deployments, not just problem ones (where replicas don't match desired). This provides parity with the existing --kube-include-all-pods flag. - Add IncludeAllDeployments to kubernetesagent.Config - Add --kube-include-all-deployments flag and PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS env var - Update collectDeployments to respect the new flag - Add test for IncludeAllDeployments functionality - Update UNIFIED_AGENT.md documentation Addresses feedback from PR #855	2025-12-18 20:58:30 +00:00
rcourtman	5d165fc055	docs: Fix CONFIGURATION.md - logFormat not in system.json The logFormat setting is only available via LOG_FORMAT environment variable, not in system.json. Updated the example and added a note clarifying this. Also added LOG_FORMAT to the environment variables table.	2025-12-02 23:43:45 +00:00
rcourtman	8a54156632	docs: Remove LXC references from CONFIGURATION.md	2025-12-02 23:37:11 +00:00
courtmanr@gmail.com	3c92c38b27	Update docs with missing config, API endpoints, and Docker Compose	2025-12-02 20:46:21 +00:00
courtmanr@gmail.com	f4c2bd7c35	Implement UI toggle for Hide Local Login (related to issue #750)	2025-11-25 08:14:19 +00:00
courtmanr@gmail.com	2e62dc15b3	Refactor core docs (INSTALL, CONFIGURATION, DOCKER) to be modern and concise	2025-11-25 00:13:07 +00:00
rcourtman	9c6c8cc0a0	Add OIDC CA bundle support	2025-11-22 09:44:03 +00:00
rcourtman	51b368ddc1	feat: make PVE polling interval configurable (related to #467)	2025-11-18 21:30:04 +00:00
rcourtman	bffc8f3f83	docs: add auto-update runbook	2025-11-14 01:05:06 +00:00
rcourtman	25ae527c95	Clarify sensor proxy HTTPS workflow in docs	2025-11-14 00:48:41 +00:00
rcourtman	f9dc2f6466	docs: Add comprehensive security audit documentation Adds complete documentation for 2025-11-07 security audit and hardening: - SECURITY_AUDIT_2025-11-07.md: Full professional audit report - 9 security issues identified and fixed (4 critical, 4 medium, 1 low) - Detailed findings, remediations, and testing - Security posture improved from B+ to A - 85%+ reduction in exploitable attack surface - SECURITY_CHANGELOG.md: Detailed changelog with migration guide - Complete implementation details for all fixes - Configuration examples - Backwards compatibility notes - New metrics and features - DEPLOYMENT_CHECKLIST.md: Step-by-step deployment guide - Pre-deployment backup procedures - Deployment steps for Docker and LXC - Verification procedures - Rollback procedures - Troubleshooting guide - Success criteria - README.md: Updated with security hardening highlights - Links to audit report - Key security features added Audit performed by Claude (Sonnet 4.5) + Codex collaboration. All implementations by Codex based on Claude specifications. 100% remediation rate (9/9 issues fixed). 17 new tests added, all passing. Related to security audit 2025-11-07.	2025-11-07 17:10:21 +00:00
rcourtman	a1dc451ed4	Document alert reliability features and DLQ API Add comprehensive documentation for new alert system reliability features: API Documentation (docs/API.md): - Dead Letter Queue (DLQ) API endpoints - GET /api/notifications/dlq - Retrieve failed notifications - GET /api/notifications/queue/stats - Queue statistics - POST /api/notifications/dlq/retry - Retry DLQ items - POST /api/notifications/dlq/delete - Delete DLQ items - Prometheus metrics endpoint documentation - 18 metrics covering alerts, notifications, and queue health - Example Prometheus configuration - Example PromQL queries for common monitoring scenarios Configuration Documentation (docs/CONFIGURATION.md): - Alert TTL configuration - maxAlertAgeDays, maxAcknowledgedAgeDays, autoAcknowledgeAfterHours - Flapping detection configuration - flappingEnabled, flappingWindowSeconds, flappingThreshold, flappingCooldownMinutes - Usage examples and common scenarios - Best practices for preventing notification storms All new features are fully documented with examples and default values.	2025-11-06 17:34:05 +00:00
rcourtman	88ad986877	Revert "Hide Settings tab when authentication is not configured" This reverts commit d5a1e3d07729bad61743e8645a636e2545e11038.	2025-11-05 23:21:34 +00:00
rcourtman	3d1c910daa	Hide Settings tab when authentication is not configured Related to #636 When authentication is not configured (hasAuth() returns false), the Settings tab is now automatically hidden from the web interface. This provides a cleaner monitoring-only view for unauthenticated deployments where users only need to check the health of their environment. The Settings icon beside the Alerts tab will only appear when authentication is properly configured via PULSE_AUTH_USER/PASS, API tokens, proxy auth, or OIDC. Changes: - Modified utilityTabs in App.tsx to conditionally include Settings based on hasAuth() signal - Updated CONFIGURATION.md to document this UI behavior	2025-11-05 23:10:20 +00:00
rcourtman	8ca31003a0	docs: document TLS certificate file permissions for HTTPS setup Add comprehensive documentation for HTTPS/TLS configuration including: - File ownership and permission requirements (pulse user) - Common troubleshooting steps for startup failures - Complete setup examples for systemd and Docker - Validation commands for certificate/key verification Related to discussion #634	2025-11-05 23:08:02 +00:00
rcourtman	efa1ec1cd9	docs: document per-metric alert delay configuration (addresses #433) Added comprehensive documentation for the per-metric alert delay feature that was requested in issue #433. This feature allows configuring different alert delays for different metrics (e.g., longer delays for CPU spikes, shorter delays for memory pressure). Key additions: - Detailed explanation of delay precedence hierarchy - JSON configuration examples for common use cases - Table of recommended delays by metric type with reasoning - UI access instructions for the Alert Delay row Also added example tests demonstrating the feature's functionality and common configuration patterns. The feature itself was already fully implemented in both backend (metricTimeThresholds support) and frontend (per-metric delay inputs in ResourceTable). This commit surfaces the feature through documentation so users know it exists and how to use it. Related to #433	2025-11-05 20:04:44 +00:00
rcourtman	d52ac6d8b5	Fix CSRF token validation and improve token management - Add Access-Control-Expose-Headers to allow frontend to read X-CSRF-Token response header - Implement proactive CSRF token issuance on GET requests when session exists but CSRF cookie is missing - Ensures frontend always has valid CSRF token before making POST requests - Fixes 403 Forbidden errors when toggling system settings This resolves CSRF validation failures that occurred when CSRF tokens expired or were missing while valid sessions existed.	2025-11-05 09:23:44 +00:00
rcourtman	6eb1a10d9b	Refactor: Code cleanup and localStorage consolidation This commit includes comprehensive codebase cleanup and refactoring: ## Code Cleanup - Remove dead TypeScript code (types/monitoring.ts - 194 lines duplicate) - Remove unused Go functions (GetClusterNodes, MigratePassword, GetClusterHealthInfo) - Clean up commented-out code blocks across multiple files - Remove unused TypeScript exports (helpTextClass, private tag color helpers) - Delete obsolete test files and components ## localStorage Consolidation - Centralize all storage keys into STORAGE_KEYS constant - Update 5 files to use centralized keys: * utils/apiClient.ts (AUTH, LEGACY_TOKEN) * components/Dashboard/Dashboard.tsx (GUEST_METADATA) * components/Docker/DockerHosts.tsx (DOCKER_METADATA) * App.tsx (PLATFORMS_SEEN) * stores/updates.ts (UPDATES) - Benefits: Single source of truth, prevents typos, better maintainability ## Previous Work Committed - Docker monitoring improvements and disk metrics - Security enhancements and setup fixes - API refactoring and cleanup - Documentation updates - Build system improvements ## Testing - All frontend tests pass (29 tests) - All Go tests pass (15 packages) - Production build successful - Zero breaking changes Total: 186 files changed, 5825 insertions(+), 11602 deletions(-)	2025-11-04 21:50:46 +00:00
rcourtman	fb22469eb0	Add disk usage threshold support for Docker containers Extends the Docker monitoring and alerting system to track writable layer usage as a percentage of the container's root filesystem. This helps identify containers with bloated copy-on-write layers before they consume excessive disk space. - Add disk threshold to DockerThresholdConfig (default: 85% trigger, 80% clear) - Evaluate disk alerts for running containers when RootFilesystemBytes > 0 - Include disk metadata (writable layer, total filesystem, block I/O stats) - Update frontend to display and configure disk thresholds - Add test coverage for disk usage alert hysteresis - Document disk monitoring in DOCKER_MONITORING.md Per-container and per-host overrides apply to disk thresholds the same way they do for CPU and memory.	2025-10-29 14:52:25 +00:00
rcourtman	b3285c05c8	Consolidate pending changes - Add Docker metadata test comment - Update alerts configuration and thresholds - Enhance config file watcher - Update documentation - Refine settings UI	2025-10-28 23:20:44 +00:00
rcourtman	e07336dd9f	refactor: remove legacy DISABLE_AUTH flag and enhance authentication UX Major authentication system improvements: - Remove deprecated DISABLE_AUTH environment variable support - Update all documentation to remove DISABLE_AUTH references - Add auth recovery instructions to docs (create .auth_recovery file) - Improve first-run setup and Quick Security wizard flows - Enhance login page with better error messaging and validation - Refactor Docker hosts view with new unified table and tree components - Add useDebouncedValue hook for better search performance - Improve Settings page with better security configuration UX - Update mock mode and development scripts for consistency - Add ScrollableTable persistence and improved responsive design Backend changes: - Remove DISABLE_AUTH flag detection and handling - Improve auth configuration validation and error messages - Enhance security status endpoint responses - Update router integration tests Frontend changes: - New Docker components: DockerUnifiedTable, DockerTree, DockerSummaryStats - Better connection status indicator positioning - Improved authentication state management - Enhanced CSRF and session handling - Better loading states and error recovery This completes the migration away from the insecure DISABLE_AUTH pattern toward proper authentication with recovery mechanisms.	2025-10-27 19:46:51 +00:00
rcourtman	68ce8e7520	feat: finalize swarm service monitoring (#598)	2025-10-26 09:35:49 +00:00
rcourtman	5c54685f04	Add API token scopes and standalone host agent Introduces granular permission scopes for API tokens (docker:report, docker:manage, host-agent:report, monitoring:read/write, settings:read/write) allowing tokens to be restricted to minimum required access. Legacy tokens default to full access until scopes are explicitly configured. Adds standalone host agent for monitoring Linux, macOS, and Windows servers outside Proxmox/Docker estates. New Servers workspace in UI displays uptime, OS metadata, and capacity metrics from enrolled agents. Includes comprehensive token management UI overhaul with scope presets, inline editing, and visual scope indicators.	2025-10-23 11:40:31 +00:00
rcourtman	ff4dc49ae4	Update Pulse install flow and related components	2025-10-21 19:58:53 +00:00
rcourtman	e0396c1362	docs: update documentation for diagnostics improvements Add comprehensive operator documentation for the new observability features introduced in the previous commit. New Documentation: - docs/monitoring/PROMETHEUS_METRICS.md - Complete reference for all 18 new Prometheus metrics with alert suggestions Updated Documentation: - docs/API.md - Document X-Request-ID and X-Diagnostics-Cached-At headers, explain diagnostics endpoint caching behavior - docs/TROUBLESHOOTING.md - Add section on correlating API calls with logs using request IDs - docs/operations/ADAPTIVE_POLLING_ROLLOUT.md - Update monitoring checklists with new per-node and scheduler metrics - docs/CONFIGURATION.md - Clarify LOG_FILE dual-output behavior and rotation defaults These updates ensure operators understand: - How to set up monitoring/alerting for new metrics - How to configure file logging with rotation - How to troubleshoot using request correlation - What metrics are available for dashboards Related to: `495e6c794` (feat: comprehensive diagnostics improvements)	2025-10-21 12:45:19 +00:00
rcourtman	ddc9a7a068	docs: comprehensive documentation for rate limit fix and configurability Document the pulse-sensor-proxy rate limiting bug fix and new configurability across all relevant documentation: TEMPERATURE_MONITORING.md: - Added 'Rate Limiting & Scaling' section with symptom diagnosis - Included sizing table for 1-3, 4-10, 10-20, and 30+ node deployments - Provided tuning formula: interval_ms = polling_interval / node_count TROUBLESHOOTING.md: - Added 'Temperature data flickers after adding nodes' section - Step-by-step diagnosis using limiter metrics and scheduler health - Quick fix with config example CONFIGURATION.md: - Added pulse-sensor-proxy/config.yaml reference section - Documented rate_limit.per_peer_interval_ms and per_peer_burst fields - Included defaults and example override pulse-sensor-proxy-runbook.md: - Updated quick reference with new defaults (1 req/sec, burst 5) - Added 'Rate Limit Tuning' procedure with 4 deployment profiles - Included validation steps and monitoring commands TEMPERATURE_MONITORING_SECURITY.md: - Updated rate limiting section with new defaults - Added configurable overrides guidance - Documented security considerations for production deployments Related commits: - `46b8b8d08`: Initial rate limit fix (hardcoded defaults) - `ca534e2b6`: Made rate limits configurable via YAML - `e244da837`: Added guidance for large deployments (30+ nodes)	2025-10-21 11:36:07 +00:00
rcourtman	fd0a4f2b0a	docs: update documentation for v4.24.0 features Updates documentation to reflect features implemented in recent commits: Security & API Enhancements: - Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After) - Audit logging for rollback actions and scheduler health - Runtime logging configuration tracking Scheduler Health API: - Document new v4.24.0 endpoint features - Per-instance circuit breaker status - Dead-letter queue tracking - Staleness metrics - Enhanced response format with backward compatibility Version & Health Endpoints: - Updated /api/version response fields - Optional health endpoint fields - Deployment type and update availability Configuration & Installation: - HTTP config fetch via PULSE_INIT_CONFIG_URL - Updated environment variable documentation - Enhanced FAQ entries Monitoring & Operations: - Adaptive polling architecture documentation - Rollback procedure references - Production deployment guidance All documentation changes align with implemented features from commits: - `656ae0d25` (PMG test fix) - `dec85a4ef` (PBS/PMG stubs + HTTP config) - Earlier commits: scheduler health API, rollback, rate limiting	2025-10-20 16:08:10 +00:00
Pulse Automation Bot	0b4e4f9c59	Add configurable backup polling interval	2025-10-18 13:06:41 +00:00
rcourtman	4838793677	feat: enhance alerts system with tests and improved thresholds - Add comprehensive test coverage for alerts package with 285+ new tests - Implement ThresholdsTable component with metric thresholds display - Enhance Alerts page UI with improved layout and metric filtering - Add frontend component tests for Alerts page and ThresholdsTable - Set up Vitest testing infrastructure for SolidJS components - Improve config persistence with better validation - Expand discovery tests with 333+ test cases - Update API, configuration, and Docker monitoring documentation	2025-10-15 22:25:04 +00:00
rcourtman	261bd7ac74	Adopt multi-token auth across docs, UI, and tooling	2025-10-14 15:47:49 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00

47 Commits