2263 Commits

Author SHA1 Message Date
rcourtman
5d4e911298 feat: improve test coverage for pulse-sensor-proxy 2026-01-03 21:42:19 +00:00
rcourtman
fd7e80ae17 fix: Add clear warning when Docker token is already in use
When a Docker agent tries to register with a token that's already bound
to another agent, the error was logged generically as "Failed to send
docker report". Users had to dig into logs to understand the issue.

Now logs a prominent error message:
"DOCKER REGISTRATION FAILED: This API token is already used by another
Docker agent. Each Docker host requires its own unique token. Generate
a new token in Pulse Settings > Agents and reinstall with the new token."

Related to #1027
2026-01-03 20:56:04 +00:00
rcourtman
22e1cc5613 test(agent): achieve 95% coverage for pulse-agent 2026-01-03 20:52:42 +00:00
rcourtman
fa43628cde fix: Alert acknowledge/unacknowledge fails with reverse proxies
Reverse proxies (Traefik, Caddy, nginx) often normalize or reject URLs
containing %2F (encoded slash). Alert IDs contain forward slashes
(e.g., "docker-container-state-docker:abc/def"), causing acknowledge
requests to fail with 400 errors when going through a reverse proxy.

Added new body-based endpoints that accept alert ID in JSON body:
- POST /api/alerts/acknowledge {"id": "..."}
- POST /api/alerts/unacknowledge {"id": "..."}
- POST /api/alerts/clear {"id": "..."}

Updated frontend to use the new endpoints. Legacy path-based endpoints
are preserved for backwards compatibility.

Related to #1026
2026-01-03 20:51:25 +00:00
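Note on fa43628cde: the body-based approach described in that commit can be sketched roughly as follows. This is a minimal illustration, not the project's actual handler. The names `alertIDRequest`, `decodeAlertID`, and `acknowledgeHandler` are hypothetical, and the status codes and 64 KiB body limit are assumptions. Only the `{"id": "..."}` body shape and the example alert ID come from the commit message.

```go
package main

import (
	"encoding/json"
	"errors"
	"io"
	"net/http"
)

// alertIDRequest mirrors the {"id": "..."} body named in the commit message.
type alertIDRequest struct {
	ID string `json:"id"`
}

// decodeAlertID extracts the alert ID from a JSON body. Because the ID
// travels in the body, slashes never need to be percent-encoded in the URL.
func decodeAlertID(raw []byte) (string, error) {
	var req alertIDRequest
	if err := json.Unmarshal(raw, &req); err != nil {
		return "", err
	}
	if req.ID == "" {
		return "", errors.New("missing alert id")
	}
	return req.ID, nil
}

// acknowledgeHandler (hypothetical) wires decodeAlertID into an HTTP handler.
// ack is whatever function actually acknowledges the alert.
func acknowledgeHandler(ack func(id string) error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		// Assumed limit: cap the body so a client cannot send unbounded data.
		raw, err := io.ReadAll(io.LimitReader(r.Body, 1<<16))
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		id, err := decodeAlertID(raw)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if err := ack(id); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}
}
```

A handler built this way could be registered with `http.Handle("/api/alerts/acknowledge", acknowledgeHandler(ackFunc))`, where `ackFunc` is a placeholder for the real acknowledge logic.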
rcourtman
adba448419 fix(pbs): correct API paths and achieve >95% test coverage 2026-01-03 20:45:36 +00:00
rcourtman
b039b79e4a fix: Physical disk temps showing 0°C when using host agent SMART data
The mergeNVMeTempsIntoDisks and mergeHostAgentSMARTIntoDisks functions
require nodes to have LinkedHostAgentID populated to match disks with
host agent SMART data. However, the code was passing the local modelNodes
variable, which does not have this field set: the linking happens inside
UpdateNodesForInstance, which modifies the state's copy, not the local variable.

Fixed by using currentState.Nodes (from GetSnapshot()) instead of
modelNodes/modelNodesCopy in both the skip-poll path and the background
goroutine. The state snapshot contains nodes with LinkedHostAgentID
already populated, allowing proper SMART data merging.

Related to #1014
2026-01-03 19:20:31 +00:00
rcourtman
abccbcafb6 fix: Container update command incorrectly removes Docker host and revokes token
When a container update command completed successfully, the server was
incorrectly returning shouldRemove=true, which caused the Docker host to
be removed and its API token revoked. This caused 401 Unauthorized errors
for subsequent agent reports.

The fix ensures shouldRemove is only true for "stop" commands, not for
"update_container" or "check_updates" commands.

Related to #1020
2026-01-03 19:05:18 +00:00
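Note on abccbcafb6: the rule stated in that commit fits in a single predicate. This is a sketch under assumptions, not the server's real code. The function name `shouldRemoveAfter` and the constant names are hypothetical. The three command strings are taken from the commit message.

```go
package main

// Command type strings as quoted in the commit message.
const (
	commandStop            = "stop"
	commandUpdateContainer = "update_container"
	commandCheckUpdates    = "check_updates"
)

// shouldRemoveAfter reports whether completing a command of the given type
// should remove the Docker host. Only "stop" qualifies. Every other command,
// including ones not listed above, leaves the host and its token in place.
func shouldRemoveAfter(commandType string) bool {
	return commandType == commandStop
}
```

Comparing against the single removing command, rather than listing the non-removing ones, means a command type added later defaults to keeping the host.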
rcourtman
233278a9d2 Add Docker Swarm frontend components 2026-01-03 18:52:38 +00:00
rcourtman
ed78509f92 Fix flaky tests and improve coverage across alerts, api, and config packages
- Fix deadlock and race conditions in internal/alerts
- Add comprehensive error path tests for internal/config
- Fix 401 handling in internal/api
- Fix Docker Swarm task filtering test logic
2026-01-03 18:36:17 +00:00
rcourtman
08661cca8e fix: Add anchor target for "Manage linked agents" link
The link in the agents list banner pointed to #linked-agents but no
element had that ID, so clicking it did nothing.

Related to #1021
2026-01-03 11:33:08 +00:00
rcourtman
a47c7803bb fix: Preserve configured runtime preference during report collection
When collecting reports, the runtime re-detection was passing RuntimeAuto
instead of the user's configured preference. This caused podman to switch
back to docker on systems like CoreOS, where podman provides a
docker-compatible socket at /var/run/docker.sock.

Now the current runtime (set at init from user's --docker-runtime flag)
is passed as the preference, preventing spurious runtime switching.

Related to #1022
2026-01-03 11:30:25 +00:00
rcourtman
9e339957c6 fix: Update runtime config when toggling Docker update actions setting
The DisableDockerUpdateActions setting was being saved to disk but not
updated in h.config, causing the UI toggle to appear to revert on page
refresh since the API returned the stale runtime value.

Related to #1023
2026-01-03 11:14:17 +00:00
rcourtman
fbbefa4546 Improve tests for internal/alerts package
- Fix TestSaveHistoryWithRetry_WriteError to be robust on root
- Add TestOnAlert to history_test.go
- Add pmg_anomaly_test.go for PMG anomaly detection coverage
- Add cleanup_test.go for tracking map cleanup coverage
- Extend filter_evaluation_test.go to cover all guest threshold logic
2026-01-02 23:47:16 +00:00
rcourtman
3b48c4acbb Auto-update Helm chart version to 5.0.10 (tag: helm-chart-5.0.10) 2026-01-02 21:30:25 +00:00
rcourtman
e19c202ff3 Auto-update Helm chart documentation 2026-01-02 21:30:23 +00:00
rcourtman
87ca7c92e0 docs: update example in dev-deploy-agent script 2026-01-02 21:08:42 +00:00
rcourtman
0b3cb71fd1 fix(alerts): use pbsDefaults instead of nodeDefaults for PBS instances (tag: v5.0.10) 2026-01-02 20:46:53 +00:00
rcourtman
4cd3e53c3e test: add regression tests for missing frontend fields
Ensures that LinkedHostAgentId, CommandsEnabled, IsLegacy, and LinkedNodeId
are correctly propagated to the frontend. This prevents regressions of the
bugs fixed for #952 and #971.
2026-01-02 20:45:35 +00:00
rcourtman
a0e5f22983 chore: bump version to 5.0.10 2026-01-02 20:17:09 +00:00
rcourtman
118574e491 fix: expose linkedHostAgentId and commandsEnabled to frontend
Related to #952 and #971

Both issues were caused by the backend not sending required fields to the
frontend in the ToFrontend() converters:

Issue #971 (Agent required badge):
- NodeFrontend was missing LinkedHostAgentId field
- Frontend couldn't identify linked host agents, so it fell back to showing
  'Agent required' instead of 'Via agent'

Issue #952 (AI Commands toggle stuck):
- HostFrontend was missing CommandsEnabled field
- Frontend couldn't see the actual commandsEnabled state from the backend,
  causing the optimistic UI to never receive confirmation that the state
  had actually changed

Also added IsLegacy and LinkedNodeId to HostFrontend for completeness.
2026-01-02 20:04:20 +00:00
rcourtman
31c704c7a7 refactor: fix lint issues in internal/ai package
- Remove redundant nil checks before len() calls
- Mark unused parameters with underscore
- Convert if/else chains to switch statements for cleaner code
- Add test assertions to resolve unused write warnings in patrol_test.go
2026-01-02 19:53:01 +00:00
rcourtman
7ec012a2e1 feat(pro): expose update_alerts feature and add AI-powered update risk assessment
- Expose FeatureUpdateAlerts in /api/license/features endpoint (was hidden)
- Add 'Update Alerts' label to frontend Pro License panel
- Add AI-powered update risk assessment for Docker container updates
  - Classifies containers by type (auth, web server, database, etc.)
  - Provides context-aware recommendations for update timing
  - Time-based urgency escalation (warning >7d, critical >14d)
- Handle edge cases: nil alerts, empty metadata, float64 pendingHours
- Fix switch case ordering to properly route docker-container-update alerts
- Add comprehensive tests for update analysis (15 new test functions)
2026-01-02 19:21:17 +00:00
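Note on 7ec012a2e1: the time-based escalation mentioned in that commit (warning after 7 days, critical after 14 days) might look like the sketch below. The function name `updateUrgency` and the baseline level "info" are assumptions. The thresholds and the float64 hours input come from the commit message.

```go
package main

// updateUrgency maps how long an update has been pending (in hours) to an
// urgency level. The cases run from the strictest threshold down, because a
// Go switch takes the first matching case: testing ">7d" first would shadow
// the ">14d" case.
func updateUrgency(pendingHours float64) string {
	const hoursPerDay = 24.0
	switch {
	case pendingHours > 14*hoursPerDay:
		return "critical"
	case pendingHours > 7*hoursPerDay:
		return "warning"
	default:
		return "info" // assumed baseline level, not named in the commit
	}
}
```

Both comparisons are strict, so an update pending exactly 7 days stays at the baseline level and one pending exactly 14 days stays at "warning".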
rcourtman
c577a7d142 chore: update logo to vector SVG 2026-01-02 18:06:27 +00:00
rcourtman
3637184c63 fix(alerts): decouple PBS custom threshold detection from Node defaults. Related to #1017 2026-01-02 17:46:01 +00:00
rcourtman
b94b6f89d4 Fix ThresholdsTable tests: correct mocking and assertions for resource rendering and filtering 2026-01-02 17:40:23 +00:00
rcourtman
9f0a5d54aa fix(alerts): prevent PBS thresholds from falling back to Node defaults. Related to #1017 2026-01-02 17:04:15 +00:00
rcourtman
9bdbf2616c chore(tests): remove unused test code and redundant test cases
- Remove unused findAlertByID helper and its min dependency from update_alerts_test.go
- Remove redundant negative zero test case from utility_test.go (-0.0 == 0.0 in Go)
2026-01-02 16:11:09 +00:00
rcourtman
0b0b503919 feat: Enable update checks for Docker environments. Related to #1016 2026-01-02 14:22:40 +00:00
rcourtman
180cddb55b refactor: use license package constants for Pro features in AI service 2026-01-02 14:11:56 +00:00
rcourtman
f9ea0fbb5a fix(pro): add error tracking to patrol history store
- Add lastSaveError, lastSaveTime, onSaveError fields to PatrolRunHistoryStore
- Add GetPersistenceStatus() and SetOnSaveError() methods
- Consistent with findings store and cost store error handling
2026-01-02 14:01:32 +00:00
rcourtman
f71c6a6cce fix(pro): add error tracking to cost store and fix race condition
- Add lastSaveError, lastSaveTime, onSaveError fields to cost.Store
- Add GetPersistenceStatus() method to check persistence health
- Add SetOnSaveError() callback for error notifications
- Rename scheduleSave to scheduleSaveLocked for clarity
- Document that scheduleSaveLocked must be called with lock held
- Add tests for new error tracking functionality
2026-01-02 13:59:26 +00:00
rcourtman
c2de1b256b fix(pro): add cleanup goroutine for alert analyzer memory leak
- Add Start/Stop lifecycle methods to AlertTriggeredAnalyzer
- Periodic cleanup of lastAnalyzed map every 30 minutes
- Prevents memory growth from stale cooldown entries
- Document that ai package feature constants are aliases of license constants
- Call Start() in StartPatrol and Stop() in StopPatrol
- Add tests for Start/Stop lifecycle
2026-01-02 13:12:24 +00:00
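Note on c2de1b256b: a Start/Stop lifecycle that periodically prunes a cooldown map could be structured as in this sketch. The type `cooldownTracker` and all of its methods are hypothetical stand-ins for AlertTriggeredAnalyzer and its lastAnalyzed map, and the pruning rule (drop entries at least `maxAge` old) is an assumption.

```go
package main

import (
	"sync"
	"time"
)

// cooldownTracker is a hypothetical stand-in for the analyzer's cooldown map.
type cooldownTracker struct {
	mu           sync.Mutex
	lastAnalyzed map[string]time.Time
	maxAge       time.Duration
	stop         chan struct{}
	done         chan struct{}
}

func newCooldownTracker(maxAge time.Duration) *cooldownTracker {
	return &cooldownTracker{lastAnalyzed: make(map[string]time.Time), maxAge: maxAge}
}

// Mark records that key was analyzed just now.
func (c *cooldownTracker) Mark(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.lastAnalyzed[key] = time.Now()
}

// Len returns the number of tracked keys.
func (c *cooldownTracker) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.lastAnalyzed)
}

// prune deletes entries that are at least maxAge old and returns the count.
func (c *cooldownTracker) prune() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := time.Now()
	removed := 0
	for key, at := range c.lastAnalyzed {
		if now.Sub(at) >= c.maxAge {
			delete(c.lastAnalyzed, key) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

// Start launches the cleanup goroutine. interval must be positive.
// Calling Start while already running is a no-op.
func (c *cooldownTracker) Start(interval time.Duration) {
	c.mu.Lock()
	if c.stop != nil {
		c.mu.Unlock()
		return
	}
	stop, done := make(chan struct{}), make(chan struct{})
	c.stop, c.done = stop, done
	c.mu.Unlock()

	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				c.prune()
			case <-stop:
				return
			}
		}
	}()
}

// Stop signals the goroutine and waits for it to exit. Safe to call twice.
func (c *cooldownTracker) Stop() {
	c.mu.Lock()
	stop, done := c.stop, c.done
	c.stop, c.done = nil, nil
	c.mu.Unlock()
	if stop == nil {
		return
	}
	close(stop)
	<-done
}
```

With the 30-minute period from the commit message, the call would be `tracker.Start(30 * time.Minute)`, paired with `tracker.Stop()` on shutdown. Stop releases the mutex before waiting on `done`, so a prune that is in progress can finish without deadlocking.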
rcourtman
3029cce172 fix(patrol): address multiple issues in patrol service
- Add missing KubernetesChecked field to persistence (data was being lost)
- Fix Duration field to properly convert between ms and nanoseconds
- Add automatic cleanup of stale stream subscribers (memory leak fix)
- Add error tracking for findings persistence with callback support
- Add GetPersistenceStatus() and SetOnSaveError() methods
- Add tests for new error tracking functionality
2026-01-02 12:45:00 +00:00
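Note on 3029cce172: Go's time.Duration counts nanoseconds, so persisting a duration as milliseconds needs an explicit conversion in each direction. The sketch below shows one way to do that. The struct `storedRun`, its field and JSON key, and both helper names are hypothetical, since the commit message does not show the actual persistence format.

```go
package main

import "time"

// storedRun is a hypothetical on-disk record that keeps the duration in
// milliseconds.
type storedRun struct {
	DurationMs int64 `json:"duration_ms"`
}

// toStored converts an in-memory duration (nanoseconds) to the stored form.
func toStored(d time.Duration) storedRun {
	return storedRun{DurationMs: d.Milliseconds()}
}

// fromStored converts the stored milliseconds back to a time.Duration.
// Writing time.Duration(ms) without the multiplier would treat the
// millisecond count as nanoseconds and shrink the value by a factor of
// one million.
func fromStored(r storedRun) time.Duration {
	return time.Duration(r.DurationMs) * time.Millisecond
}
```

A round trip through these helpers preserves any duration that is a whole number of milliseconds. Sub-millisecond remainders are truncated by `Milliseconds()`.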
rcourtman
3e6ebd593c fix(alerts): resolve mapping and formatting issues for disk temperature thresholds (#1013) 2026-01-02 11:27:48 +00:00
rcourtman
773376fa5d docs: add deep dive summaries for notifications, discovery, and agent exec 2026-01-02 11:18:28 +00:00
rcourtman
d71754743c docs: Add PULSE_DISABLE_DOCKER_UPDATE_ACTIONS documentation
- Add to DOCKER.md configuration table and new 'Disabling Update Features' section
- Add to CONFIGURATION.md monitoring overrides table
- Clarify difference between disabling update detection vs hiding buttons
2026-01-02 10:35:04 +00:00
rcourtman
60220ee161 feat: Add server-wide control to disable Docker update actions
Implements PULSE_DISABLE_DOCKER_UPDATE_ACTIONS environment variable and
Settings UI toggle to hide Docker container update buttons while still
allowing update detection. This addresses requests for a 'read-only' mode
in production environments.

Backend:
- Add DisableDockerUpdateActions to SystemSettings and Config structs
- Add environment variable parsing with EnvOverrides tracking
- Expose setting in GET/POST /api/config/system endpoints
- Block update API with 403 when disabled (defense-in-depth)

Frontend:
- Add disableDockerUpdateActions to SystemConfig type
- Create systemSettings store for reactive access to server config
- Add Docker Settings card in Settings → Agents tab with toggle
- Show env lock badge when set via environment variable

UpdateButton improvements:
- Properly handle loading state (disabled + visual indicator)
- Use Solid.js Show components for proper reactivity
- Show read-only UpdateBadge when updates disabled
- Show interactive button when updates enabled

Closes discussion #982
2026-01-02 10:29:43 +00:00
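Note on 60220ee161: the interaction between an environment variable and a stored setting, including the "env lock" shown in the UI, can be sketched as below. This is an illustration only. The type `boolSetting` and the function `resolveBoolSetting` are hypothetical, and both the use of strconv.ParseBool and the fallback on unparsable values are assumptions about behavior the commit message does not specify.

```go
package main

import "strconv"

// boolSetting is a hypothetical resolved setting: the effective value plus
// whether it is locked by the environment (and so not editable in the UI).
type boolSetting struct {
	Value     bool
	EnvLocked bool
}

// resolveBoolSetting gives the environment variable precedence over the
// stored value. envSet distinguishes "unset" from "set to empty string".
// Assumption: an unparsable env value is ignored and the stored value wins.
func resolveBoolSetting(envValue string, envSet bool, stored bool) boolSetting {
	if envSet {
		if parsed, err := strconv.ParseBool(envValue); err == nil {
			return boolSetting{Value: parsed, EnvLocked: true}
		}
	}
	return boolSetting{Value: stored}
}
```

The inputs could come from `os.LookupEnv("PULSE_DISABLE_DOCKER_UPDATE_ACTIONS")`, which returns both the value and whether the variable is set.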
rcourtman
0751e3ca94 Auto-update Helm chart version to 5.0.9 (tag: helm-chart-5.0.9) 2026-01-02 00:55:10 +00:00
rcourtman
06cd8c415f Auto-update Helm chart documentation 2026-01-02 00:55:10 +00:00
rcourtman
c654f1486d fix: Docker agent token conflict on reconnect. Related to #1008 (tag: v5.0.9) 2026-01-02 00:03:23 +00:00
rcourtman
6bb272d3dc fix: Ensure Env Var takes precedence over system settings for HideLocalLogin. Related to #857 2026-01-01 23:36:18 +00:00
rcourtman
1feff00cc5 chore: Bump version to 5.0.9. Related to #1009 2026-01-01 23:27:15 +00:00
rcourtman
4ed03f23c2 fix: use Instance field for backup/snapshot state sync instead of ID prefix
This resolves issues where snapshots/backups persist after deletion if the
Instance field didn't match the ID prefix (due to case changes, name changes, etc.).

Now consistent with how VMs, Containers, Storage, etc. are filtered.

Also adds Instance field to BackupTask model for completeness.

Addresses #1009 (refs #991)
2026-01-01 23:22:38 +00:00
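Note on 4ed03f23c2: the difference between filtering by ID prefix and filtering by the Instance field can be illustrated with the sketch below. The `snapshot` type, the function `replaceForInstance`, and the ID format in the usage example are all hypothetical. Only the idea of matching on the Instance field comes from the commit message.

```go
package main

// snapshot is a hypothetical, trimmed-down model with only the two fields
// relevant to state sync.
type snapshot struct {
	ID       string
	Instance string
}

// replaceForInstance drops every existing item that belongs to instance
// (matched on the Instance field, not on the ID prefix) and appends the
// freshly polled items. Items that were deleted upstream disappear because
// they are absent from fresh.
func replaceForInstance(existing []snapshot, instance string, fresh []snapshot) []snapshot {
	merged := make([]snapshot, 0, len(existing)+len(fresh))
	for _, s := range existing {
		if s.Instance != instance {
			merged = append(merged, s)
		}
	}
	return append(merged, fresh...)
}
```

For example, an item with ID "PVE1-100-snap" and Instance "pve1" is removed by `replaceForInstance(items, "pve1", nil)`, whereas a case-sensitive check for the prefix "pve1" on its ID would have left it behind as a stale entry.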
rcourtman
661645585a fix: cleanup completed docker commands to prevent re-execution. Address #1010 2026-01-01 23:14:54 +00:00
rcourtman
df1ff42280 fix: Add backup freshness thresholds to UI. Related to #839 2026-01-01 23:06:34 +00:00
rcourtman
83935fa871 feat(ai): enhance AI Patrol with baseline anomaly detection and correlation learning
This update integrates learned baselines into the heuristic analysis to detect abnormal behavior and records significant events (migrations, restarts, spikes) for correlation analysis. Also fixed syntax errors in Ollama integration tests.
2026-01-01 23:00:43 +00:00
rcourtman
8bab7c83ad feat(ai): enhance AI Patrol with baseline anomaly detection and correlation learning
This update integrates learned baselines into the heuristic analysis to detect abnormal behavior and records significant events (migrations, restarts, spikes) for correlation analysis.
2026-01-01 23:00:18 +00:00
rcourtman
002cf36ee0 fix(patrol): use title as fallback for finding key in LLM findings 2026-01-01 22:49:04 +00:00
rcourtman
b225d22395 fix(patrol): use normalizedKey in generateFindingID for stable finding IDs 2026-01-01 22:46:28 +00:00
rcourtman
3fdf753a5b Enhance devcontainer and CI workflows
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 22:29:15 +00:00