Commit Graph

1847 Commits

Author SHA1 Message Date
rcourtman
13af682ce1 fix(config): add PULSE_AGENT_CONNECT_URL and improve Docker detection
- Add AgentConnectURL config option to override public URL for agents
- Improve install.sh to diagnose docker detection failures
- Update router to prioritize AgentConnectURL for agent install commands
2025-12-19 16:43:14 +00:00
rcourtman
ef3cf946e3 chore(e2e): reduce verbose logging in pretest health checks 2025-12-19 16:23:07 +00:00
rcourtman
6ef27d31ca fix(e2e): use http module instead of fetch for health checks
Exit code 13 in Node.js indicates 'Unfinished Top-Level Await'.
Replacing fetch with native http module to see if this resolves the issue.
2025-12-19 16:11:57 +00:00
rcourtman
d786e55f8f fix(e2e): add signal handlers and detailed tracing to diagnose exit code 13 2025-12-19 15:59:48 +00:00
rcourtman
98c4a08d64 fix(e2e): add debugging and container logging to diagnose CI failures
- Separate pretest (start containers) from test (run playwright) steps
- Add container log collection step that runs on failure
- Add verbose logging to pretest.mjs for better failure diagnosis
- Use PULSE_E2E_SKIP_DOCKER and PULSE_E2E_SKIP_PLAYWRIGHT_INSTALL flags
2025-12-19 15:48:35 +00:00
rcourtman
a93148105f fix: exclude WebSocket from rate limiting to prevent UI lockout
The /ws endpoint was rate limited to 30 connections/minute. After
prolonged use with WebSocket reconnections (network hiccups, browser
tab throttling, etc.), users with many Docker containers would hit
this limit and get stuck with a 'Connecting...' UI.

WebSocket connections are already authenticated via session/API token
and reconnections are normal behavior, so rate limiting is not needed.

Fixes #859 (second report about WebSocket rate limiting after hours of use).
2025-12-19 14:51:52 +00:00
rcourtman
16f143d925 fix: respect X-Forwarded-Proto header for hasHTTPS in /api/security/status
Fixes issue where /api/security/status reports hasHTTPS=false when accessed
via HTTPS through a reverse proxy like Caddy.

Resolves feedback from discussion #845 (clar2242).
2025-12-19 14:40:23 +00:00
rcourtman
968e0a7b3d fix: reduce syslog flooding by downgrading routine logs to debug level
Addresses issue #861 - syslog flooded on docker host

Many routine operational messages were being logged at INFO level,
causing excessive log volume when monitoring multiple VMs/containers.
These messages are now logged at DEBUG level:

- Guest threshold checking (every guest, every poll cycle)
- Storage threshold checking (every storage, every poll cycle)
- Host agent linking messages
- Filesystem inclusion in disk calculation
- Guest agent disk usage replacement
- Polling start/completion messages
- Alert cleanup and save messages

Users can set LOG_LEVEL=debug to see these messages if needed for
troubleshooting. The default INFO level now produces significantly
less log output.

Also updated documentation in CONFIGURATION.md and DOCKER.md to:
- Clarify what each log level includes
- Add tip about using LOG_LEVEL=warn for minimal logging
2025-12-18 23:27:32 +00:00
rcourtman
8400976e80 fix: wait for async save in guest metadata test
The TestGuestMetadataStore_GetWithLegacyMigration_ClusteredMatchesNodeFormat
test was flaky because it triggered an async save in GetWithLegacyMigration
but didn't wait for it to complete. When the test ended, t.TempDir() tried
to clean up while the goroutine was still writing, causing 'directory not
empty' errors on CI.

Added a 100 ms sleep (time.Sleep(100 * time.Millisecond)) to wait for the
async save, matching the pattern used in other similar tests in the same file.
2025-12-18 22:48:15 +00:00
rcourtman
0d11da74e2 refactor(ui): standardize URL editing with shared UrlEditPopover component
- Create reusable UrlEditPopover component with fixed positioning
- Add createUrlEditState hook for managing editing state
- Update DockerHostSummaryTable to use new popover
- Update DockerUnifiedTable (containers & services) to use new popover
- Update GuestRow (Proxmox VMs/containers) to use new popover
- Update HostsOverview (Proxmox hosts) to use new popover
- Add Docker host metadata API for custom URLs
- Consistent styling with save, delete, cancel buttons and keyboard shortcuts
2025-12-18 22:22:55 +00:00
rcourtman
65829983b5 v5: gate legacy sensor-proxy and prune dev docs 2025-12-18 21:51:25 +00:00
rcourtman
0d6aaff253 fix: AI Patrol frequency not obeying settings
Fixes #858

The patrol interval setting was not being properly applied due to:

1. ReconfigurePatrol() was setting the deprecated QuickCheckInterval field
   instead of the preferred Interval field

2. SetConfig() was comparing raw field values instead of using GetInterval()
   to compare effective intervals, causing change detection to fail

3. The API response was missing interval_ms, preventing the frontend from
   displaying the correct interval

Changes:
- Update StartPatrol() and ReconfigurePatrol() to use the Interval field
- Fix SetConfig() to use GetInterval() for interval comparison
- Add IntervalMs to PatrolStatusResponse and include it in the API response
2025-12-18 21:33:50 +00:00
rcourtman
c4bf77b9b6 fix(frontend): resolve UI rate limiting on Docker overview (#859)
Previously, each DockerContainerRow component made 2 API calls on mount:
- AIAPI.getSettings() for AI enabled status
- DockerMetadataAPI.getMetadata() for annotations

With 100+ containers, this resulted in 200+ API calls firing simultaneously,
exceeding the 500 requests/minute rate limit and causing 429 errors.

Fix:
- Lift AI settings check to DockerUnifiedTable parent component (1 call)
- Use pre-fetched dockerMetadata prop for annotations (already batch-fetched)
- Pass aiEnabled and initialNotes as props to child rows

This reduces API calls from 2n (two per container row) to a constant number when loading the Docker overview.

Fixes #859
2025-12-18 21:17:56 +00:00
rcourtman
2b48b0a459 feat: add --kube-include-all-deployments flag for Kubernetes agent
Adds IncludeAllDeployments option to show all deployments, not just
problem ones (where replicas don't match desired). This provides parity
with the existing --kube-include-all-pods flag.

- Add IncludeAllDeployments to kubernetesagent.Config
- Add --kube-include-all-deployments flag and PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS env var
- Update collectDeployments to respect the new flag
- Add test for IncludeAllDeployments functionality
- Update UNIFIED_AGENT.md documentation

Addresses feedback from PR #855
2025-12-18 20:58:30 +00:00
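A sketch of the filtering that 2b48b0a459 describes, assuming a "problem" deployment is one whose ready replica count differs from the desired count. The `deployment` struct and `filterDeployments` are hypothetical names for illustration; the real agent works on Kubernetes API objects.

```go
package main

import "fmt"

// deployment is a simplified stand-in for a Kubernetes Deployment.
type deployment struct {
	Name            string
	DesiredReplicas int32
	ReadyReplicas   int32
}

// isProblemDeployment reports whether ready replicas differ from desired.
func isProblemDeployment(d deployment) bool {
	return d.ReadyReplicas != d.DesiredReplicas
}

// filterDeployments keeps problem deployments, or everything when
// includeAll is set (the behaviour behind --kube-include-all-deployments).
func filterDeployments(all []deployment, includeAll bool) []deployment {
	kept := make([]deployment, 0, len(all))
	for _, d := range all {
		if includeAll || isProblemDeployment(d) {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	all := []deployment{
		{Name: "web", DesiredReplicas: 3, ReadyReplicas: 3},
		{Name: "worker", DesiredReplicas: 3, ReadyReplicas: 1},
	}
	fmt.Println(len(filterDeployments(all, false))) // 1: only "worker"
	fmt.Println(len(filterDeployments(all, true)))  // 2: everything
}
```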
rcourtman
9bc63441a1 fix: eliminate race conditions in release workflow chain
The promote-floating-tags and helm-pages workflows now trigger
automatically via workflow_run when publish-docker.yml completes,
instead of being dispatched immediately by create-release.yml.

This ensures Docker images are fully available before:
- Floating tags (rc, latest, major.minor) are promoted
- Helm chart smoke tests try to pull the image

Key changes:
- promote-floating-tags.yml: Add workflow_run trigger, extract tag
  from triggering workflow, wait for BOTH pulse and agent images
- helm-pages.yml: Add workflow_run trigger, extract version from
  triggering workflow
- create-release.yml: Remove manual dispatch for these workflows
2025-12-18 19:33:39 +00:00
rcourtman
e451f64331 Auto-update Helm chart documentation helm-chart-5.0.0-rc.4 2025-12-18 19:24:13 +00:00
rcourtman
fb6f4c7e9c Auto-update Helm chart version to 5.0.0-rc.4 2025-12-18 19:09:49 +00:00
rcourtman
cb9c4268e3 Auto-update Helm chart documentation 2025-12-18 19:09:48 +00:00
rcourtman
0a81b8090b fix: restore Hide Local Login functionality for OIDC/SSO (#857)
When 'Hide local login form' was enabled in Settings -> Authentication,
the local login form was still displayed instead of showing only the
SSO login. This regression occurred in Pulse 5.x.

Root cause: When App.tsx passed hasAuth to Login.tsx, the Login component
created a minimal SecurityStatus object with only hasAuthentication set,
missing the hideLocalLogin and other OIDC settings.

Changes:
- App.tsx: Store and pass full securityStatus to Login component
- Login.tsx: Accept securityStatus prop and initialize state from it
- Login.tsx: Initialize authStatus directly from props to respect
  hideLocalLogin on first render
- Added tests for hideLocalLogin behavior

Fixes #857
v5.0.0-rc.4
2025-12-18 18:33:34 +00:00
rcourtman
d19765e8bc fix: use 12+ char password for security setup test
Password validation requires minimum 12 characters.
2025-12-18 18:10:36 +00:00
rcourtman
98a6f44cbe fix: add apiToken to security quick-setup payload
The /api/security/quick-setup endpoint requires username, password, AND
apiToken fields. Added a dummy 64-char hex API token for the test.
2025-12-18 17:57:18 +00:00
rcourtman
3af584bb5c fix: perform security setup before login in update integration test
The test was failing with 401 because the Pulse container starts in
fresh state requiring bootstrap token authentication. Added a
setupCredentials step that calls /api/security/quick-setup with the
known bootstrap token to create the admin user before attempting login.
2025-12-18 17:48:04 +00:00
rcourtman
f09e427c18 fix: resolve all frontend lint errors for unused imports and type issues
- Remove unused type imports: SnapshotAlertConfig, PMGThresholdDefaults, RawOverrideConfig, BackupAlertConfig
- Remove unused imports: Settings, Power, JSX, onMount, createEffect
- Remove unused function _getUnitSuffix
- Fix GuestDefaults type to avoid index signature conflict
- Prefix unused catch variables with underscore
- Fix StickyNote title prop by wrapping in span element
2025-12-18 17:07:05 +00:00
rcourtman
90799f4771 fix: correct pod/deployment filtering logic and fix test helper calls
- Remove unused sets import from kubernetesagent
- Fix inverted filtering logic: keep problem pods/deployments, skip healthy ones
- Fix test helper calls: use slice literals instead of undefined makeNamespaceSet
2025-12-18 16:59:37 +00:00
rcourtman
b05791a3e5 fix: remove unused sets import in kubernetesagent 2025-12-18 16:42:51 +00:00
rcourtman
e778ba76d9 chore: bump version to 5.0.0-rc.4 2025-12-18 16:27:43 +00:00
rcourtman
fdb2a07f56 fix(agent): find zpool binary on TrueNAS SCALE (#718)
Enhanced zpool binary lookup to try common paths when exec.LookPath fails.
This fixes issue #718 where TrueNAS SCALE reports inflated storage because
the agent runs with a restricted PATH that doesn't include /usr/sbin.

Changes:
- Added findZpool() helper that tries common paths like /usr/sbin/zpool,
  /sbin/zpool, /usr/local/sbin/zpool for TrueNAS/FreeBSD/Linux systems
- Added commonZpoolPaths variable listing typical zpool locations
- Added tests for the new findZpool function

This ensures zpool list is used for accurate pool-level capacity instead
of falling back to dataset-level summation.
2025-12-18 16:23:56 +00:00
rcourtman
0182cc8310 feat(thresholds): add collapsible accordion sections and UX improvements
- Add CollapsibleSection component with animated expand/collapse
- Wrap all 6 resource sections (Nodes, VMs, PBS, Storage, Backups, Snapshots) with accordion UI
- Add section icons and resource counts in headers
- Add expand all / collapse all buttons for quick navigation
- Make help banner dismissible with localStorage persistence
- Add Ctrl/Cmd+F keyboard shortcut to focus search
- Add keyboard shortcut hint badge on search input
- Add icons to tab navigation for quick identification
- Improve mobile tab labels with shorter text on small screens
- Create reusable components: ThresholdBadge, ResourceCard, GlobalDefaultsRow
- Create useCollapsedSections hook with localStorage persistence
- Default less-used sections (Storage, Backups, Snapshots, PBS) to collapsed
2025-12-18 15:47:44 +00:00
rcourtman
c91307be94 fix: guest URL icon now appears/disappears immediately after AI sets/removes it
The issue was a SolidJS reactivity problem in the Dashboard component.
When guestMetadata signal was accessed inside a For loop callback and
assigned to a plain variable, SolidJS lost reactive tracking.

Changed from:
  const metadata = guestMetadata()[guestId] || ...
  customUrl={metadata?.customUrl}

To:
  const getMetadata = () => guestMetadata()[guestId] || ...
  customUrl={getMetadata()?.customUrl}

This ensures SolidJS properly tracks the signal dependency when the
getter function is called directly in JSX props.
2025-12-18 14:42:47 +00:00
rcourtman
5c9bbf33b6 Merge pull request #856 from BTLzdravtech/wildcard
Adds wildcard support for kube namespace filtering.
2025-12-18 11:30:18 +00:00
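One way wildcard namespace filtering such as that merged in 5c9bbf33b6 could work is with shell-style patterns from the standard library's `path.Match`. This is a sketch under assumptions: the merged PR may use a different matcher, and the rules that exclusions win and that an empty include list allows everything are guesses, not taken from the source.

```go
package main

import (
	"fmt"
	"path"
)

// matchesAny reports whether ns matches any shell-style pattern.
// Malformed patterns are ignored in this sketch.
func matchesAny(ns string, patterns []string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, ns); err == nil && ok {
			return true
		}
	}
	return false
}

// namespaceAllowed applies exclude patterns first, then include patterns.
// An empty include list allows every namespace (assumed semantics).
func namespaceAllowed(ns string, include, exclude []string) bool {
	if matchesAny(ns, exclude) {
		return false
	}
	return len(include) == 0 || matchesAny(ns, include)
}

func main() {
	include := []string{"prod-*"}
	exclude := []string{"*-test"}
	for _, ns := range []string{"prod-api", "prod-test", "dev"} {
		fmt.Println(ns, namespaceAllowed(ns, include, exclude))
	}
}
```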
rcourtman
cf57cfcb03 Merge pull request #855 from BTLzdravtech/main
Fixes inverted boolean logic in isProblemPod/isProblemDeployment checks and improves init container exit code handling.
2025-12-18 11:30:11 +00:00
rcourtman
f13e9eac69 fix(ui): add CPU tooltip to VM/LXC rows in Proxmox Overview. Related to #816
The previous fix (3b4c77de) only addressed PVE/PBS nodes but missed
guest rows. Now VMs and LXC containers also show CPU details tooltip
on hover using EnhancedCPUBar.
2025-12-18 11:28:09 +00:00
Tomas Hruska
a419b6237a Support wildcards in --kube-include-namespace/--kube-exclude-namespace 2025-12-18 00:00:30 +01:00
Tomas Hruska
69d693f346 Fix kubernetes logic and init containers detection 2025-12-17 23:47:03 +01:00
rcourtman
210a6f7cc0 monitoring: keep host IDs stable via token+hostname binding 2025-12-17 20:16:27 +00:00
rcourtman
ebc3474647 hostagent: avoid identity collisions with MAC fallback (Related to #836) 2025-12-17 20:09:55 +00:00
rcourtman
3623395549 test(api): allow printable alert IDs for acknowledge (Related to #852) 2025-12-17 20:09:51 +00:00
rcourtman
5338ab580c Stabilize core E2E tests
- Preserve alerts activation state when saving thresholds
- Use compliant default E2E password and deterministic bootstrap token seeding
- Harden Playwright selectors, waits, and diagnostics gating
2025-12-17 19:36:48 +00:00
rcourtman
54fc259221 fix(ai): improve AI settings UX with validation and smart fallbacks
Backend:
- Add smart provider fallback when selected model's provider isn't configured
- Automatically switch to a model from a configured provider instead of failing
- Log warning when fallback occurs for visibility

Frontend (AISettings.tsx):
- Add helper functions to check if model's provider is configured
- Group model dropdown: configured providers first, unconfigured marked with ⚠️
- Add inline warning when selecting model from unconfigured provider
- Validate on save that model's provider is configured (or being added)
- Warn before clearing last configured provider (would disable AI)
- Warn before clearing provider that current model uses
- Add patrol interval validation (must be 0 or >= 10 minutes)
- Show red border + inline error for invalid patrol intervals 1-9
- Update patrol interval hint: '(0=off, 10+ to enable)'

These changes prevent confusing '500 Internal Server Error' and
'AI is not enabled or configured' errors when model/provider mismatch.
2025-12-17 18:30:19 +00:00
rcourtman
c4b893e257 Fix agent download serving wrong architecture binary
When a specific architecture is requested (e.g., linux-arm64), don't fall
back to the generic pulse-agent binary if the requested arch isn't found.
This was causing ARM64 machines to receive x86-64 binaries that can't run.

Now returns 404 with helpful error message if requested architecture
binary is not available.
2025-12-17 17:22:51 +00:00
rcourtman
81c876b786 Simplify agent installation UI - remove Force Docker/K8s/Proxmox checkboxes
- Remove confusing Force Docker, Force Kubernetes, Force Proxmox checkboxes
- Auto-detection handles these platforms; checkboxes were redundant
- Keep Skip TLS verification checkbox (commonly needed for self-signed certs)
- Add Troubleshooting section with --enable-* and --disable-* flags for edge cases
- Update tests to reflect simplified UI
2025-12-17 17:14:41 +00:00
rcourtman
30f01771ac Add meaningful tests for host agent and exec websocket 2025-12-17 17:02:01 +00:00
rcourtman
ab480ca489 fix: Prevent orphaned encrypted data when encryption key is deleted
- crypto.go: Add runtime validation to Encrypt() that verifies the key file
  still exists on disk before encrypting. If the key was deleted while Pulse
  is running, encryption now fails with a clear error instead of creating
  orphaned data that can never be decrypted.

- hot-dev.sh: Auto-generate encryption key for production data directory
  (/etc/pulse) when HOT_DEV_USE_PROD_DATA=true and key is missing. This
  prevents startup failures and ensures encrypted data can be created.

- Added test TestEncryptRefusesAfterKeyDeleted to verify the protection works.
2025-12-17 17:00:53 +00:00
rcourtman
d663ba4342 hostagent: avoid host ID collisions and prefer LAN IP 2025-12-17 16:29:59 +00:00
rcourtman
e44a6fdadd test(envdetect): cover environment detection decisions 2025-12-17 16:08:10 +00:00
rcourtman
71e1b5dc86 test: expand AI provider test coverage with HTTP mocks 2025-12-17 15:53:56 +00:00
rcourtman
47dfa5d703 test: expand cmd and agent update coverage 2025-12-17 13:28:17 +00:00
rcourtman
0ee6e50c8b fix(config): avoid deadlock saving empty nodes config 2025-12-17 13:28:06 +00:00
rcourtman
969fa0e509 test: add unit tests for AI, Kubernetes agent, and clients 2025-12-17 12:47:36 +00:00
rcourtman
b562580764 fix: Skip deprecated pulse-sensor-proxy for v5+ installations
The unified agent now handles temperature monitoring in v5+, making
pulse-sensor-proxy unnecessary. This commit:

1. Adds INSTALLER_MAJOR_VERSION constant to declare bundled version
2. Skips 'Temperature Monitoring Setup' prompts for v5+ installs
3. Skips sensor proxy installation entirely for v5+
4. Updates help text to mark --proxy as deprecated for v5+
5. Removes outdated sensor proxy instructions from completion message

Fixes the 'pct pull TASK ERROR: failed to open /opt/pulse/bin/pulse-sensor-proxy-linux-amd64'
error reported by users installing v5.0.0-rc.3.

Reported-by: RLSinRFV (GitHub Discussion #845)
2025-12-17 12:20:03 +00:00