207 Commits

Author SHA1 Message Date
rcourtman
ad4acf1222 chore: add frontend utilities and metrics documentation
- Add useResizeObserver and useTooltip React hooks
- Add utility functions for anomaly colors, error extraction, text width, and threshold colors
- Add METRICS_DATA_FLOW.md documentation
- Ignore SQLite temp files (*.db-shm, *.db-wal)
2026-01-22 13:48:41 +00:00
rcourtman
f1c2d7c12c docs: add logging overrides to configuration reference
Document LOG_FILE, LOG_MAX_SIZE, LOG_MAX_AGE, and LOG_COMPRESS
environment variables for log file configuration.
2026-01-22 00:44:33 +00:00
rcourtman
c8b6cbfc6d feat(pro): long-term metrics history (30d/90d)
- Add FeatureLongTermMetrics license feature for Pro tier
- Implement tiered storage in metrics store (raw, minute, hourly, daily)
- Add covering index for unified history query performance
- Seed mock data for 90 days with appropriate aggregation tiers
- Update PULSE_PRO.md to document the feature
- 7-day history remains free, 30d/90d requires Pro license
2026-01-22 00:42:41 +00:00
rcourtman
0ca6001bad docs: update documentation after sensor proxy deprecation
Update docs to reflect the simplified temperature monitoring architecture:
- Remove references to pulse-sensor-proxy throughout
- Update TEMPERATURE_MONITORING.md to focus on unified agent approach
- Update CONFIGURATION.md, DEPLOYMENT_MODELS.md, FAQ.md
- Remove SECURITY_CHANGELOG.md (proxy-specific security notes)
- Clarify current recommended setup in various guides
2026-01-21 12:00:59 +00:00
rcourtman
ee63d438cc docs: standardize markdown syntax and remove deprecated sensor-proxy docs 2026-01-20 09:43:49 +00:00
rcourtman
035436ad6e fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking
The agent was crashing with 'fatal error: concurrent map writes' when
handleCheckUpdatesCommand spawned a goroutine that called collectOnce
concurrently with the main collection loop. Both code paths access
a.prevContainerCPU without synchronization.

Added a.cpuMu mutex to protect all accesses to prevContainerCPU in:
- pruneStaleCPUSamples()
- collectContainer() delete operation
- calculateContainerCPUPercent()

Related to #1063
2026-01-15 21:10:55 +00:00
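A minimal sketch of the locking pattern described in 035436ad6e above. Only the names cpuMu, prevContainerCPU, and pruneStaleCPUSamples come from the commit message; the agent struct, the map's value type, and the helpers recordCPUSample and sampleCount are hypothetical and exist only to make the example runnable.

```go
package main

import (
	"fmt"
	"sync"
)

// agent is a hypothetical stand-in for the Docker agent. Only the field
// names cpuMu and prevContainerCPU are taken from the commit message.
type agent struct {
	cpuMu            sync.Mutex
	prevContainerCPU map[string]uint64 // container ID -> last CPU sample (assumed shape)
}

// recordCPUSample (hypothetical helper) writes a sample while holding cpuMu.
func (a *agent) recordCPUSample(id string, total uint64) {
	a.cpuMu.Lock()
	defer a.cpuMu.Unlock()
	a.prevContainerCPU[id] = total
}

// pruneStaleCPUSamples removes samples for containers that are no longer live.
func (a *agent) pruneStaleCPUSamples(live map[string]bool) {
	a.cpuMu.Lock()
	defer a.cpuMu.Unlock()
	for id := range a.prevContainerCPU {
		if !live[id] {
			delete(a.prevContainerCPU, id)
		}
	}
}

// sampleCount (hypothetical helper) reads the map size under the same lock.
func (a *agent) sampleCount() int {
	a.cpuMu.Lock()
	defer a.cpuMu.Unlock()
	return len(a.prevContainerCPU)
}

func main() {
	a := &agent{prevContainerCPU: make(map[string]uint64)}
	var wg sync.WaitGroup
	// Several goroutines write concurrently, as the update-check goroutine
	// and the main collection loop did before the fix.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			a.recordCPUSample(fmt.Sprintf("c%d", n), uint64(n))
		}(i)
	}
	wg.Wait()
	a.pruneStaleCPUSamples(map[string]bool{"c0": true})
	fmt.Println(a.sampleCount()) // prints 1
}
```

Without the mutex, the concurrent writes in main can trigger the same 'fatal error: concurrent map writes' the commit describes.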
rcourtman
a7de907c35 chore: remove internal planning doc, add gitignore patterns
- Remove docs/AGENTS_AI_SCOPE_PLAN.md (internal dev doc)
- Add gitignore patterns for *_PLAN.md, *_ROADMAP.md, *IMPLEMENTATION*.md in docs/
2026-01-15 13:53:42 +00:00
rcourtman
8c7581d32c feat(profiles): add AI-assisted profile suggestions
Add ability for users to describe what kind of agent profile they need
in natural language, and have AI generate a suggestion with name,
description, config values, and rationale.

- Add ProfileSuggestionHandler with schema-aware prompting
- Add SuggestProfileModal component with example prompts
- Update AgentProfilesPanel with suggest button and description field
- Streamline ValidConfigKeys to only agent-supported settings
- Update profile validation tests for simplified schema
2026-01-15 13:24:18 +00:00
rcourtman
95b849e213 chore: remove internal dev roadmap doc 2026-01-12 15:28:05 +00:00
rcourtman
b5233466a3 docs: add Pro features roadmap and implementation status
Documents implementation status for:
- Advanced SSO (SAML & Multi-Provider) - 100%
- Advanced Reporting - 100%
- Audit Logging - 100%
- Agent Profiles - 100%
- RBAC - 100%
- AI Auto-Fix - 100%
- Kubernetes AI - 55%
- AI Alert Analysis - 70%
- AI Patrol - 85%
2026-01-12 15:21:46 +00:00
rcourtman
f527e6ebd0 docs: fix Kubernetes DaemonSet deployment guide
Fixes #1091 - addresses all three documentation issues reported:

1. Binary path: Changed from /usr/local/bin/pulse-agent (which doesn't
   exist in the main image) to /opt/pulse/bin/pulse-agent-linux-amd64

2. PULSE_AGENT_ID: Added to example and documented why it's required
   for DaemonSets (prevents token conflicts when all pods share one
   API token)

3. Resource visibility flags: Added PULSE_KUBE_INCLUDE_ALL_PODS and
   PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS to example, with explanation
   of the default behavior (show only problematic resources)

Also added tolerations, resource requests/limits, and ARM64 note.
2026-01-11 21:43:23 +00:00
rcourtman
80729408c1 docs: add RBAC endpoints, OIDC group mapping, and update Pro terminology
- Add RBAC/role management endpoints to API.md
- Document OIDC group-to-role mapping feature in OIDC.md
- Add missing config files to CONFIGURATION.md (audit.db, AI files)
- Add OIDC_GROUP_ROLE_MAPPINGS env var documentation
- Fix "enterprise" -> "Pro" terminology in TROUBLESHOOTING.md
- Refocus TEMPERATURE_MONITORING.md on agent method, collapse legacy proxy docs
2026-01-10 13:59:50 +00:00
rcourtman
2a8f55d719 feat(enterprise): add Advanced Reporting and Audit Webhooks integration
This commit adds enterprise-grade reporting and audit capabilities:

Reporting:
- Refactored metrics store from internal/ to pkg/ for enterprise access
- Added pkg/reporting with shared interfaces for report generation
- Created API endpoint: GET /api/admin/reports/generate
- New ReportingPanel.tsx for PDF/CSV report configuration

Audit Webhooks:
- Extended pkg/audit with webhook URL management interface
- Added API endpoint: GET/POST /api/admin/webhooks/audit
- New AuditWebhookPanel.tsx for webhook configuration
- Updated Settings.tsx with Reporting and Webhooks tabs

Server Hardening:
- Enterprise hooks now execute outside mutex with panic recovery
- Removed dbPath from metrics Stats API to prevent path disclosure
- Added storage metrics persistence to polling loop

Documentation:
- Updated README.md feature table
- Updated docs/API.md with new endpoints
- Updated docs/PULSE_PRO.md with feature descriptions
- Updated docs/WEBHOOKS.md with audit webhooks section
2026-01-09 21:31:49 +00:00
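The first Server Hardening item in 2a8f55d719 above (hooks executing outside the mutex with panic recovery) follows a common Go pattern, sketched here under assumptions. The server type, its fields, and the addHook and fire methods are all hypothetical names, not the project's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical type holding registered hooks behind a mutex.
type server struct {
	mu    sync.Mutex
	hooks []func(event string)
}

// addHook registers a hook while holding the lock.
func (s *server) addHook(h func(event string)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.hooks = append(s.hooks, h)
}

// fire snapshots the hooks under the lock, releases it, and then runs each
// hook with its own recover so one panicking hook cannot take down the
// caller or block the others. It returns how many hooks panicked.
func (s *server) fire(event string) (recovered int) {
	s.mu.Lock()
	hooks := append([]func(string){}, s.hooks...)
	s.mu.Unlock()

	for _, h := range hooks {
		func() {
			defer func() {
				if r := recover(); r != nil {
					recovered++
				}
			}()
			h(event)
		}()
	}
	return recovered
}

func main() {
	s := &server{}
	s.addHook(func(string) { panic("misbehaving hook") })
	s.addHook(func(e string) { fmt.Println("handled", e) })
	fmt.Println("recovered panics:", s.fire("report.generated"))
}
```

Because the lock is released before any hook runs, a hook that calls back into addHook does not deadlock.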
rcourtman
3e2824a7ff feat: remove Enterprise badges, simplify Pro upgrade prompts
- Replace barrel import in AuditLogPanel.tsx to fix ad-blocker crash
- Remove all Enterprise/Pro badges from nav and feature headers
- Simplify upgrade CTAs to clean 'Upgrade to Pro' links
- Update docs: PULSE_PRO.md, API.md, README.md, SECURITY.md
- Align terminology: single Pro tier, no separate Enterprise tier

Also includes prior refactoring:
- Move auth package to pkg/auth for enterprise reuse
- Export server functions for testability
- Stabilize CLI tests
2026-01-09 16:51:08 +00:00
rcourtman
33bb0a95bb docs: Fix formatting in API reference 2026-01-08 20:15:25 +00:00
rcourtman
73c5128a87 feat(audit): Add audit log API endpoints and UI with signature verification
- Add GET /api/audit endpoint for listing events with filters
- Add GET /api/audit/:id/verify endpoint for signature verification
- Add AuditLogPanel UI component with filtering and verification
- Update docs with audit API documentation
- Add localStorage utils for persisting UI state
- Update gitignore patterns
2026-01-08 19:19:57 +00:00
rcourtman
7342191075 docs: fix Helm chart install commands to use GitHub Pages repo
The GHCR OCI registry (ghcr.io/rcourtman/pulse-chart) is returning 403/404
errors for unauthenticated users. Updated all Helm references to use the
working GitHub Pages Helm repository at https://rcourtman.github.io/Pulse

Fixes install issues reported by customers trying to deploy via Helm.

Files updated:
- docs/KUBERNETES.md
- docs/INSTALL.md
- docs/DEPLOYMENT_MODELS.md
- docs/UPGRADE_v5.md
2026-01-08 14:27:45 +00:00
rcourtman
22e01e2244 feat: Add centralized agent configuration management (Pro)
Allows administrators to create configuration profiles and assign them
to agents for centralized fleet management.

- Configuration profiles with customizable settings (Docker, K8s,
  Proxmox monitoring, log level, reporting interval)
- Profile assignment to agents by ID
- Agent-side remote config client to fetch settings on startup
- Full CRUD API at /api/admin/profiles
- Settings UI panel in Settings → Agents → Agent Profiles
- Automatic cleanup of assignments when profiles are deleted
2026-01-08 12:06:36 +00:00
rcourtman
7db6b3e47d feat: Add AI chat session sync across devices
Implements server-side persistence for AI chat sessions, allowing users
to continue conversations across devices and browser sessions. Related
to #1059.

Backend:
- Add chat session CRUD API endpoints (GET/PUT/DELETE)
- Add persistence layer with per-user session storage
- Support session cleanup for old sessions (90 days)
- Multi-user support via auth context

Frontend:
- Rewrite aiChat store with server sync (debounced)
- Add session management UI (new conversation, switch, delete)
- Local storage as fallback/cache
- Initialize sync on app startup when AI is enabled
2026-01-08 10:47:45 +00:00
rcourtman
695ced6273 docs: Add API token scopes and kiosk mode documentation
Documents all available token scopes, UI presets, and step-by-step
instructions for setting up kiosk mode with read-only dashboard tokens.

Related to #1055
2026-01-08 10:27:15 +00:00
rcourtman
8c4bef27f0 docs: improve reverse proxy HTTPS detection and Swarm troubleshooting
- Add detailed HTTPS detection troubleshooting to REVERSE_PROXY.md
- Explain X-Forwarded-Proto header requirement for nginx/Caddy/Apache
- Add Docker Swarm troubleshooting section to UNIFIED_AGENT.md
- Document how to force Docker runtime if auto-detection fails

Based on customer feedback.
2026-01-07 18:23:48 +00:00
rcourtman
3f0808e9f9 docs: comprehensive core and Pro documentation overhaul
- Major updates to README.md and docs/README.md for Pulse v5
- Added technical deep-dives for Pulse Pro (docs/PULSE_PRO.md) and AI Patrol (docs/AI.md)
- Updated Prometheus metrics documentation and Helm schema for metrics separation
- Refreshed security, installation, and deployment documentation for unified agent models
- Cleaned up legacy summary files
2026-01-07 17:38:27 +00:00
rcourtman
9cfcdbb247 fix: Use per-node shared flag for storage deduplication
The storage deduplication logic only checked cluster config's Shared
flag, but this required the cluster config API call to succeed. When
the per-node storage API already returns shared=1 (as the user
verified), we should use that directly.

Now we check three sources for shared storage detection:
1. Per-node API shared flag (storage.Shared)
2. Cluster config shared flag (if available)
3. Storage type heuristics (NFS, RBD, PBS, etc.)

Related to #1049
2026-01-07 10:16:23 +00:00
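The three-source check listed in 9cfcdbb247 above could look roughly like the following sketch. The function name isSharedStorage, its signature, and the sharedByType table are assumptions for illustration; the commit names only NFS, RBD, and PBS and ends with "etc.", so the real type list may be longer.

```go
package main

import (
	"fmt"
	"strings"
)

// sharedByType holds the storage types the heuristic treats as shared.
// Only the three types named in the commit are listed here.
var sharedByType = map[string]bool{"nfs": true, "rbd": true, "pbs": true}

// isSharedStorage checks the sources in the order the commit lists them.
// clusterShared is nil when the cluster config call did not succeed.
func isSharedStorage(nodeShared bool, clusterShared *bool, storageType string) bool {
	if nodeShared { // 1. per-node API shared flag
		return true
	}
	if clusterShared != nil && *clusterShared { // 2. cluster config, if available
		return true
	}
	return sharedByType[strings.ToLower(storageType)] // 3. type heuristic
}

func main() {
	yes := true
	fmt.Println(isSharedStorage(true, nil, "dir"))   // true: per-node flag
	fmt.Println(isSharedStorage(false, &yes, "dir")) // true: cluster config
	fmt.Println(isSharedStorage(false, nil, "NFS"))  // true: type heuristic
	fmt.Println(isSharedStorage(false, nil, "dir"))  // false
}
```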
rcourtman
dcdbee3c5c feat: Add in-app help system with HelpIcon component
Add contextual help icons throughout the UI to improve feature
discoverability. Users can click (?) icons to see explanations
with examples for settings they might not understand.

- HelpIcon component with click-to-open popover
- Centralized help content registry in /content/help/
- FeatureTip component for dismissible contextual tips
- Help added to: alert delay, AI endpoints, update channel
2026-01-07 09:22:23 +00:00
rcourtman
773376fa5d docs: add deep dive summaries for notifications, discovery, and agent exec 2026-01-02 11:18:28 +00:00
rcourtman
d71754743c docs: Add PULSE_DISABLE_DOCKER_UPDATE_ACTIONS documentation
- Add to DOCKER.md configuration table and new 'Disabling Update Features' section
- Add to CONFIGURATION.md monitoring overrides table
- Clarify difference between disabling update detection vs hiding buttons
2026-01-02 10:35:04 +00:00
rcourtman
94717ba867 feat(agent): add --docker-runtime flag for podman/docker selection
On systems where the Docker compatibility layer obscures Podman (like CoreOS),
the auto-detection can fail. Users can now force the runtime:

  --docker-runtime podman
  PULSE_DOCKER_RUNTIME=podman

Valid values: auto (default), docker, podman

Related to Discussion #958
2026-01-01 00:24:37 +00:00
rcourtman
e3b3785582 feat(agent): add option to disable Docker update checks
Add PULSE_DISABLE_DOCKER_UPDATE_CHECKS environment variable and
--disable-docker-update-checks flag to disable Docker image update
detection. This is useful for:
- Avoiding Docker Hub rate limits
- Users who don't want update notifications in their dashboard

Related to Discussion #982
2026-01-01 00:20:49 +00:00
rcourtman
c1f4b8f40b feat: PULSE_DISK_EXCLUDE now applies to SMART monitoring. Related to #983
Previously, the PULSE_DISK_EXCLUDE environment variable and --disk-exclude
flag only filtered mount points in the hostmetrics collector. This change
extends the exclusion to SMART data collection.

Changes:
- Updated smartctl.CollectLocal() to accept diskExclude patterns
- Added matchesDeviceExclude() for block device pattern matching
- Patterns support: exact match (sda), prefix (nvme*), contains (*cache*)
- Updated hostagent to pass DiskExclude to SMART collector
- Added comprehensive tests for pattern matching
- Updated documentation
2025-12-31 23:07:01 +00:00
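The three pattern types listed in c1f4b8f40b above can be sketched as follows. The function name matchesDeviceExclude comes from the commit, but its signature and body here are assumptions, as is matching against bare device names such as sda rather than full /dev paths.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDeviceExclude reports whether device matches any exclude pattern.
// Supported forms, per the commit message:
//
//	sda      exact match
//	nvme*    prefix match
//	*cache*  contains match
func matchesDeviceExclude(device string, patterns []string) bool {
	for _, p := range patterns {
		switch {
		case len(p) > 2 && strings.HasPrefix(p, "*") && strings.HasSuffix(p, "*"):
			if strings.Contains(device, p[1:len(p)-1]) {
				return true
			}
		case len(p) > 1 && strings.HasSuffix(p, "*"):
			if strings.HasPrefix(device, p[:len(p)-1]) {
				return true
			}
		default:
			if device == p {
				return true
			}
		}
	}
	return false
}

func main() {
	patterns := []string{"sda", "nvme*", "*cache*"}
	for _, dev := range []string{"sda", "sdb", "nvme0n1", "bcache0"} {
		fmt.Printf("%s excluded: %v\n", dev, matchesDeviceExclude(dev, patterns))
	}
}
```

With these patterns, sda, nvme0n1, and bcache0 are excluded and sdb is not.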
rcourtman
d07b471e40 Refactor Docker agent: metrics collection, security checks, and batch updates
- Separated metrics collection into internal/dockeragent/collect.go
- Added agent self-update pre-flight check (--self-test)
- Implemented signed binary verification with key rotation for updates
- Added batch update support to frontend with parallel processing
- Cleaned up agent.go and added startup cleanup for backup containers
- Updated documentation for Docker features and agent security
2025-12-29 17:20:18 +00:00
rcourtman
3b506e3ecb docs: Add Docker container update documentation
- Document the new one-click container update feature
- Explain update detection, safety features, and requirements
- Bump VERSION to 5.0.6 for release
2025-12-29 10:06:34 +00:00
rcourtman
ca4c2383b6 docs(pbs): add Password Setup method for Docker PBS users 2025-12-26 10:12:49 +00:00
rcourtman
f891b6217e docs: add comprehensive PBS integration guide
- Created new PBS.md with setup instructions for direct PBS connection
- Explains difference between direct PBS API vs PVE passthrough
- Documents three setup methods: agent install, one-click script, manual
- Includes permissions reference and troubleshooting section
- Added link in docs/README.md under Monitoring & Agents
2025-12-26 09:45:42 +00:00
rcourtman
c3a112a2ec docs: add social preview image for GitHub 2025-12-25 20:24:45 +00:00
rcourtman
a9205e7d51 docs: update dashboard screenshot to match landing page 2025-12-25 20:17:39 +00:00
rcourtman
b638420b72 docs: Add disk exclusion and S.M.A.R.T. documentation
- Document --disk-exclude flag and PULSE_DISK_EXCLUDE env var
- Document pattern types (exact, prefix, contains)
- Document S.M.A.R.T. requirements (smartmontools package)
- Add --enable-commands to configuration table
2025-12-25 12:20:07 +00:00
rcourtman
2c76cdaa43 docs: add backup permissions fix for v4→v5 upgrades
Users upgrading from v4 may have tokens that lack PVEDatastoreAdmin
permission on /storage, causing backups to not appear.

Added section to UPGRADE_v5.md with quick fix command and alternative
approach (re-run agent setup).

Related to #883
2025-12-24 16:33:01 +00:00
rcourtman
9bbe0d6203 docs: expand AI Patrol documentation with full context explanation
- Add comprehensive explanation of what data Patrol receives
- Document the enriched context (trends, baselines, predictions)
- Explain operational memory (notes, dismissed alerts, incidents)
- Clarify why Patrol catches issues that static alerts miss
- Mark Patrol as Pro feature with link to pulserelay.pro
2025-12-23 23:21:36 +00:00
rcourtman
77db408114 feat(ai): add finding validation layer to reduce patrol noise
- Add validateAIFindings() that cross-checks AI findings against actual metrics
- Filter out low-confidence findings (CPU <50%, memory <60%, disk <70%)
- Always allow critical findings, backup issues, and reliability findings through
- Update AI system prompt with stricter thresholds and explicit noise examples
- Add 'before creating a finding' checklist for AI (the 3am test)
- Update AI.md docs with clear value proposition and expectations
- Add comprehensive tests for the validation layer

This ensures paying users get immediate value without noise.
2025-12-22 23:28:09 +00:00
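The filtering rules in 77db408114 above can be illustrated with a short sketch. The function name validateAIFindings and the three thresholds come from the commit; the Finding struct, its field names, and the category and severity labels are hypothetical, and the real function's signature is not shown in the commit.

```go
package main

import "fmt"

// Finding is a hypothetical shape for an AI patrol finding.
type Finding struct {
	Category string  // assumed labels: "cpu", "memory", "disk", "backup", "reliability"
	Severity string  // assumed labels: "critical", "warning", ...
	Usage    float64 // measured usage in percent, taken from actual metrics
}

// minUsage holds the thresholds from the commit: findings below these
// measured values are treated as noise.
var minUsage = map[string]float64{"cpu": 50, "memory": 60, "disk": 70}

// validateAIFindings drops findings whose measured usage is below the
// threshold for their category, while always keeping critical, backup,
// and reliability findings.
func validateAIFindings(findings []Finding) []Finding {
	kept := make([]Finding, 0, len(findings))
	for _, f := range findings {
		if f.Severity == "critical" || f.Category == "backup" || f.Category == "reliability" {
			kept = append(kept, f)
			continue
		}
		if floor, ok := minUsage[f.Category]; ok && f.Usage < floor {
			continue // actual metrics do not support the finding
		}
		kept = append(kept, f)
	}
	return kept
}

func main() {
	in := []Finding{
		{Category: "cpu", Severity: "warning", Usage: 35},    // dropped
		{Category: "disk", Severity: "warning", Usage: 82},   // kept
		{Category: "memory", Severity: "critical", Usage: 40}, // kept: critical
		{Category: "backup", Severity: "warning"},            // kept: backup
	}
	fmt.Println(len(validateAIFindings(in))) // prints 3
}
```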
rcourtman
8d0ca16124 docs: update AI.md with v5 patrol and finding management features
- Added finding management section (resolve, dismiss, suppress)
- Documented patrol service and severity levels
- Added AI-assisted remediation capabilities
- Added Ollama tool/function calling support note
- Added new troubleshooting tips for findings persistence
2025-12-22 14:43:56 +00:00
rcourtman
b6140cd6e8 feat(oidc): Add refresh token support for long-lived sessions
When offline_access scope is configured, Pulse now stores and uses
OIDC refresh tokens to automatically extend sessions. Sessions remain
valid as long as the IdP allows token refresh (typically 30-90 days).

Changes:
- Store OIDC tokens (refresh token, expiry, issuer) alongside sessions
- Automatically refresh tokens when access token nears expiry
- Invalidate session if IdP revokes access (forces re-login)
- Add background token refresh with concurrency protection
- Persist OIDC tokens across restarts

Related to #854
2025-12-20 10:45:46 +00:00
rcourtman
3b433b1336 fix(agent): support PULSE_AGENT_CONNECT_URL and improve detection 2025-12-19 17:01:58 +00:00
rcourtman
968e0a7b3d fix: reduce syslog flooding by downgrading routine logs to debug level
Addresses issue #861 - syslog flooded on Docker host

Many routine operational messages were being logged at INFO level,
causing excessive log volume when monitoring multiple VMs/containers.
These messages are now logged at DEBUG level:

- Guest threshold checking (every guest, every poll cycle)
- Storage threshold checking (every storage, every poll cycle)
- Host agent linking messages
- Filesystem inclusion in disk calculation
- Guest agent disk usage replacement
- Polling start/completion messages
- Alert cleanup and save messages

Users can set LOG_LEVEL=debug to see these messages if needed for
troubleshooting. The default INFO level now produces significantly
less log output.

Also updated documentation in CONFIGURATION.md and DOCKER.md to:
- Clarify what each log level includes
- Add tip about using LOG_LEVEL=warn for minimal logging
2025-12-18 23:27:32 +00:00
rcourtman
65829983b5 v5: gate legacy sensor-proxy and prune dev docs 2025-12-18 21:51:25 +00:00
rcourtman
2b48b0a459 feat: add --kube-include-all-deployments flag for Kubernetes agent
Adds IncludeAllDeployments option to show all deployments, not just
problem ones (where replicas don't match desired). This provides parity
with the existing --kube-include-all-pods flag.

- Add IncludeAllDeployments to kubernetesagent.Config
- Add --kube-include-all-deployments flag and PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS env var
- Update collectDeployments to respect the new flag
- Add test for IncludeAllDeployments functionality
- Update UNIFIED_AGENT.md documentation

Addresses feedback from PR #855
2025-12-18 20:58:30 +00:00
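The behaviour added in 2b48b0a459 above can be sketched as a simple filter. IncludeAllDeployments and collectDeployments are named in the commit; the Deployment struct, its fields, and the function signature are assumptions made for illustration.

```go
package main

import "fmt"

// Config mirrors only the one option named in the commit.
type Config struct {
	IncludeAllDeployments bool
}

// Deployment is a hypothetical, simplified view of a Kubernetes deployment.
type Deployment struct {
	Name    string
	Desired int32
	Ready   int32
}

// collectDeployments returns every deployment when IncludeAllDeployments
// is set, and otherwise only those whose replicas do not match desired.
func collectDeployments(cfg Config, all []Deployment) []Deployment {
	out := make([]Deployment, 0, len(all))
	for _, d := range all {
		if cfg.IncludeAllDeployments || d.Ready != d.Desired {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	all := []Deployment{
		{Name: "web", Desired: 3, Ready: 3},
		{Name: "worker", Desired: 2, Ready: 1},
	}
	fmt.Println(len(collectDeployments(Config{}, all)))                            // prints 1
	fmt.Println(len(collectDeployments(Config{IncludeAllDeployments: true}, all))) // prints 2
}
```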
Tomas Hruska
a419b6237a support wildcards in --kube-include-namespace/--kube-exclude-namespace 2025-12-18 00:00:30 +01:00
rcourtman
944911a4cd Add PNG version of logo for GitHub App 2025-12-16 19:35:43 +00:00
rcourtman
b15097331b docs: fix Docker socket mount path for standalone sensor proxy
The standalone installer creates the socket at /mnt/pulse-proxy on the host,
not /run/pulse-sensor-proxy. Updated documentation to show the correct mount:
  /mnt/pulse-proxy:/run/pulse-sensor-proxy:ro

Related to #822
2025-12-15 02:08:15 +00:00
rcourtman
2e06f6b966 feat: auto-detect platforms during agent install and allow multi-host tokens
- Install script now auto-detects Docker, Kubernetes, and Proxmox
- Platform monitoring is enabled automatically when detected
- Users can override with --disable-* or --enable-* flags
- Allow same token to register multiple hosts (one per hostname)
- Update tests to reflect new multi-host token behavior
- Improve CompleteStep and UnifiedAgents UI components
- Update UNIFIED_AGENT.md documentation
2025-12-14 16:21:59 +00:00
rcourtman
5e2311035b chore: Fix lint warnings in SetupWizard and add AI API docs
- Fixed unused variables in wizard components
- Fixed invalid aiEnabled field in FeaturesStep (AI uses separate API)
- Added AI endpoints section to API.md
2025-12-13 15:36:40 +00:00