Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-02-18 00:17:39 +01:00

Author	SHA1	Message	Date
rcourtman	17208cbf9d	docs: update AI evaluation matrix and approval workflow documentation	2026-01-30 19:00:40 +00:00
rcourtman	10df3e4d95	chore: update gitignore to exclude dev artifacts	2026-01-30 19:00:02 +00:00
rcourtman	e85ec858fd	fix(ai): discovery transient error handling, agentic loop detection, and read-only classification - Discovery: classify transient errors (429, timeout, connection refused, etc.) and return IsError:true so models stop retrying rate-limited calls - Agentic loop: detect identical tool calls repeated >3 times and block with LOOP_DETECTED error, forcing the model to try a different approach - OpenAI provider: skip tool_choice for DeepSeek Reasoner which doesn't support it - Read-only classifier: fix curl -I case sensitivity (uppercase flags lowered), add iostat/vmstat/mpstat/sar/lxc-ls/lxc-info/nc -z to allowlist, fix 2>&1 false positive in input redirect detection	2026-01-29 18:29:54 +00:00
rcourtman	f0a356c016	fix: ZFS pool usage now includes zvols and all pool consumers The previous reconciliation logic (issue #1052) used per-dataset statfs values for Total and Used. On Proxmox systems, statfs on a mounted dataset (e.g. rpool/ROOT/pve-1) only reports that dataset's own usage, completely missing zvols (VM disk images) and other datasets. This caused storage bars to show ~0% usage (a few GB of OS files) when the pool actually had terabytes of VM data allocated. Fix: derive usable pool capacity from the ratio of dataset Free (usable pool-available from statfs) to zpool Free (raw pool-available from zpool list). This ratio converts raw zpool Size to usable total, and Used is computed as Total - Free. This captures all pool consumers including zvols, handles RAIDZ parity overhead and mirrors uniformly, and produces correct usage percentages. Verified with tests for RAIDZ, mirrors, and both with zvols present.	2026-01-29 12:08:38 +00:00
rcourtman	f1cddf047e	fix: settings sidebar navigation jumping to wrong item Fix URL path matching order in deriveTabFromPath() where clicking 'System Logs' would incorrectly navigate to 'General'. The generic /settings/system check was matching before /settings/system-logs because the latter contains the former as a substring. Moved specific system-* path checks before the generic fallback.	2026-01-29 11:25:01 +00:00
rcourtman	0e880f3c89	feat(eval): improve patrol eval with polling-based completion Refactor patrol eval runner to use a dual approach: 1. Poll GET /api/ai/patrol/status until Running=false (primary signal) 2. Best-effort SSE stream connection for tool event visibility Changes: - Add status polling loop with configurable timeout - Make SSE stream optional (may not connect in time) - Add Completed flag to PatrolRunResult - Improve assertion error messages - Add new scenarios and assertions This is more reliable than relying solely on SSE stream which may timeout waiting for headers during slow patrol initialization.	2026-01-29 08:20:39 +00:00
rcourtman	c457358c63	fix(api): flush SSE headers immediately on patrol stream connect Send an SSE comment immediately when a client connects to the patrol stream endpoint. This flushes HTTP headers so clients receive the 200 response right away, rather than blocking until the first event. This fixes eval tests where the stream connection would time out waiting for headers while patrol was still initializing.	2026-01-29 08:20:25 +00:00
rcourtman	a1fd2c4ddc	fix(ai): skip orphaned tool calls when pruning messages When pruning older messages to fit context limits, we may cut off a user message that preceded an assistant message with tool calls. This leaves an orphaned tool call sequence at the start. Extend pruneMessagesForModel to: - Skip leading assistant messages with tool calls - Also skip their following tool results - Ensures clean message sequence for all providers	2026-01-29 08:19:55 +00:00
rcourtman	fdc81525bb	fix(ai): sanitize Gemini message ordering for function calls Gemini requires that model messages with function calls must be immediately followed by user messages with function responses. When message pruning or errors leave orphaned function calls, Gemini rejects the request. Add sanitizeGeminiContents() to: - Strip orphaned function calls (keeping text content) - Remove orphaned function responses without preceding calls - Log when sanitization occurs for debugging	2026-01-29 08:19:41 +00:00
rcourtman	c409e7a05e	feat(eval): add patrol-specific eval scenarios and assertions Add comprehensive patrol evaluation framework: - patrol.go: Runner for patrol scenarios with streaming support - patrol_assertions.go: Assertions for tool usage, findings, timing - patrol_scenarios.go: Scenarios for basic, investigation, finding quality - eval_test.go: Unit tests for patrol eval runner Scenarios: - patrol-basic: Verifies patrol completes with tools and findings - patrol-investigation: Ensures investigation before reporting - patrol-finding-quality: Validates finding structure and evidence Run with: go run ./cmd/eval -scenario patrol	2026-01-28 23:19:11 +00:00
rcourtman	1b1c9bb2a3	refactor(ai): convert patrol to agentic tool-based execution - Replace output-parsing approach with tool-based finding creation - PatrolService now uses runAIAnalysis with proper scope handling - Add tool event streaming (tool_start, tool_end) to patrol events - Expose GetExecutor() on chat.Service for patrol integration - Remove regex-based finding extraction in favor of patrol tools The patrol now uses the same agentic loop as chat, with the LLM calling patrol_report_finding to create findings rather than outputting JSON that gets parsed. This is more reliable and consistent with the tool model.	2026-01-28 23:18:58 +00:00
rcourtman	f83356b430	feat(ai): add patrol-specific tools for agentic finding creation Add three new patrol tools that enable the LLM to create findings via tool calls instead of relying on output parsing: - patrol_report_finding: Create a structured finding with validation - patrol_resolve_finding: Mark a finding as resolved - patrol_get_findings: Query active findings for a resource These tools are only functional during a patrol run when PatrolFindingCreator is set on the executor. This approach is more reliable than parsing JSON from LLM output.	2026-01-28 23:18:42 +00:00
rcourtman	9c2f8a3284	refactor(ai): remove obsolete tool and chat files Remove files that were consolidated into other modules: - chat/patrol.go, patrol_test.go → moved to chat/service.go - tools_infrastructure.go → merged into tools_storage.go - tools_intelligence.go → merged into tools_metrics.go - tools_patrol.go → merged into tools_alerts.go - tools_profiles.go, tools_profiles_test.go → removed (unused) Update related test file references.	2026-01-28 21:30:24 +00:00
rcourtman	e227314d76	docs: update pulse-assistant architecture with current structure - Remove hardcoded line numbers from enforcement references - Update tool classification table with all current tools - Reflect consolidated tool structure	2026-01-28 21:24:45 +00:00
rcourtman	03b5586ac8	refactor(ai): update patrol and service to use chat service adapter - Update patrol.go to use chat service for AI execution - Update service.go with chat service provider integration - Add patrol streaming endpoint to router	2026-01-28 21:24:34 +00:00
rcourtman	2335099089	feat(ui): add Manage Subscription link for Pro users Add a link to pulserelay.pro/manage for paid subscribers to manage their subscription. Only shown for non-lifetime paid tiers.	2026-01-28 21:24:23 +00:00
rcourtman	44fecc37c0	feat(eval): enhance AI eval harness with retries and reporting - Add retry logic for transient failures (phantom, stream, empty response) - Add environment variable overrides for infrastructure naming - Add JSON report output per scenario - Expand assertions with new validation types - Add more comprehensive test scenarios - Add docs/EVAL.md with usage documentation The eval harness now better handles flaky AI responses and provides detailed reports for debugging.	2026-01-28 21:24:12 +00:00
rcourtman	badbad4464	refactor(ai): integrate patrol execution into chat service - Add ExecutePatrolStream method to chat.Service for patrol-specific execution - Create chat_service_adapter.go to bridge chat.Service to ai.ChatServiceProvider - Remove standalone patrol.go and patrol_test.go from chat package - Add PatrolRequest/PatrolResponse types to chat service - Add context injection for recent message context This allows patrol to use an isolated agentic loop with its own system prompt while leveraging the common chat infrastructure.	2026-01-28 21:21:41 +00:00
rcourtman	a75393d1c5	refactor(ai): consolidate tool implementations into domain-specific files - Merge tools_infrastructure.go, tools_intelligence.go, tools_patrol.go, tools_profiles.go into their respective domain tools - Expand tools_control.go with command execution logic - Expand tools_discovery.go with resource discovery handlers - Expand tools_storage.go with storage-related operations - Expand tools_metrics.go with metrics functionality - Update tests to match new structure This consolidation reduces file count and groups related functionality together.	2026-01-28 21:21:28 +00:00
rcourtman	23ff4d1337	chore: remove remaining gitignored files from tracking - analyze_coverage.py (local coverage analysis script) - coverage_summary.txt (coverage output) - mock.env (environment file)	2026-01-28 21:19:52 +00:00
rcourtman	a53aa387f3	chore: remove files that should be gitignored Remove development artifacts that were accidentally tracked: - analyze_coverage.py (local coverage analysis script) - coverage_summary.txt (coverage output) - mock.env (environment file) - pulse-check (binary) These files are already covered by .gitignore patterns.	2026-01-28 21:19:26 +00:00
rcourtman	824d65830c	Add debug logging to ZFS disk collection for diagnostics Adds zerolog debug statements throughout the ZFS collection pipeline (collector.go and zfs.go) to trace partition discovery, dataset collection, zpool stats fetching, and pool summarization. This will help diagnose issues like empty storage bars on mirror-vdev pools.	2026-01-28 17:30:53 +00:00
rcourtman	870528b4ba	Comment out unused patrol run detail variables These memos and helpers are prepared for the patrol run detail panel but not yet wired up. Commenting out to fix TypeScript strict unused variable checks.	2026-01-28 17:06:10 +00:00
rcourtman	fdde0c6b99	Fix unused variable warnings in AIIntelligence page	2026-01-28 17:04:14 +00:00
rcourtman	6184418704	Update frontend for AI assistant and discovery features AI Chat Improvements: - MentionAutocomplete for @-mentioning resources - Better tool execution display - Enhanced chat interface New Components: - FindingsPanel for AI findings display - DiscoveryTab for infrastructure discovery - PatrolActivitySection for patrol monitoring - StorageConfigPanel for storage management API Updates: - Discovery API integration - Enhanced AI chat API - Patrol API improvements - Monitoring API updates UI/UX: - Better AI status indicator - Improved investigation drawer - Enhanced settings page - Better guest drawer integration Types: - New discovery types - Enhanced AI types - API type improvements Removed deprecated UnifiedFindingsPanel in favor of new FindingsPanel.	2026-01-28 16:53:15 +00:00
rcourtman	13a6f7750c	Minor updates to main and proxmox client	2026-01-28 16:52:50 +00:00
rcourtman	19a67dd4f3	Update core infrastructure components Config: - AI configuration improvements - API tokens handling - Persistence layer updates Host Agent: - Command execution improvements - Better test coverage Infrastructure Discovery: - Service improvements - Enhanced test coverage Models: - State snapshot updates - Model improvements Monitoring: - Polling improvements - Guest config handling - Storage config support WebSocket: - Hub tenant test updates Service Discovery: - New service discovery module	2026-01-28 16:52:35 +00:00
rcourtman	c92811f3b2	Remove deprecated aidiscovery package The aidiscovery package has been superseded by the consolidated tools approach in internal/ai/tools/. Discovery functionality is now handled through: - pulse_query tool for resource search and discovery - pulse_discovery tool for infrastructure scanning - Better integration with the main AI chat pipeline Removing: - commands.go and related tests - deep_scanner.go and tests - formatters.go and tests - service.go and tests - store.go and tests - tools_adapter.go - types.go and tests	2026-01-28 16:52:17 +00:00
rcourtman	e194e17159	Update AI core services and adapters AI module improvements: Patrol System: - Better trigger handling - Improved history persistence - Enhanced coverage testing Knowledge Store: - Extended functionality - Better test coverage Adapters: - Discovery adapter updates - Investigation adapter improvements Unified Bridge: - Setup improvements - Better test coverage Alert handling and service updates.	2026-01-28 16:51:53 +00:00
rcourtman	9dcd859056	Update API handlers for AI and discovery endpoints API layer updates: AI Handlers: - Better streaming response handling - Improved error responses - Session management improvements Discovery Handlers: - New discovery endpoint handlers - Storage config handler - Better router organization Removed deprecated aidiscovery handlers in favor of unified approach.	2026-01-28 16:51:35 +00:00
rcourtman	641d29a16b	Update AI providers for tool call improvements Provider updates across all supported backends: - Anthropic: Better tool call handling - OpenAI: Improved response parsing - Gemini: Enhanced compatibility - Ollama: Local model support improvements Includes test updates for OpenAI provider.	2026-01-28 16:51:18 +00:00
rcourtman	7be3ab2c1a	Update investigation orchestrator for improved guardrails Investigation system updates: Guardrails improvements: - Better validation logic - Improved test coverage - Cleaner error messages Orchestrator updates: - Enhanced context handling - Better chat adapter integration - Improved type definitions Type refinements: - Simplified type hierarchy - Better serialization	2026-01-28 16:51:04 +00:00
rcourtman	279d4e7ec3	Add context prefetching and metrics to chat service Chat service improvements for better performance and observability: Context Prefetching: - Pre-load resource context when user mentions containers/nodes - Reduces latency for follow-up queries - Smart caching with TTL-based invalidation Metrics Collection: - Track tool execution counts and durations - FSM state transition metrics - Recovery success/failure rates - Telemetry for safety blocks Service Updates: - Better session management - Improved error handling - Cleaner test organization	2026-01-28 16:50:46 +00:00
rcourtman	0013d64c7b	Consolidate and extend AI tool suite Major tools refactoring for better organization and capabilities: New consolidated tools: - pulse_query: Unified resource search, get, config, topology operations - pulse_read: Safe read-only command execution with NonInteractiveOnly - pulse_control: Guest lifecycle control (start/stop/restart) - pulse_docker: Docker container operations - pulse_file: Safe file read/write operations - pulse_kubernetes: K8s resource management - pulse_metrics: Performance metrics retrieval - pulse_alerts: Alert management - pulse_storage: Storage pool operations - pulse_knowledge: Note-taking and recall - pulse_pmg: Proxmox Mail Gateway integration Executor improvements: - Cleaner tool registration pattern - Better error handling and recovery - Protocol layer for result formatting - Enhanced adapter interfaces Includes comprehensive tests for: - File and Docker operations - Kubernetes control operations - Command execution safety	2026-01-28 16:50:25 +00:00
rcourtman	94863a6750	Add comprehensive architecture documentation for Pulse Assistant Document the complete safety architecture: 1. High-Level Architecture - LLM as untrusted proposer pattern - FSM gating and tool execution flow - ResolvedContext for session truth 2. Safety Invariants (9 total) - Session-scoped tool registration - FSM state enforcement - Strict resolution requirements - ExecutionIntent classification - NonInteractiveOnly constraint - Read/Write tool separation - Phantom execution detection - Recovery loop protection - Telemetry for all safety blocks 3. Implementation Details - FSM states and transitions - Tool classification rules - Intent detection patterns - Error handling and recovery 4. Extension Guide - Adding new tools safely - Required validations - Testing requirements This serves as authoritative reference for contributors and security auditors.	2026-01-28 16:49:51 +00:00
rcourtman	a04d41ce2c	Add end-to-end evaluation framework for AI assistant testing Implement comprehensive eval framework for testing Pulse Assistant: Core components: - Runner: Executes scenarios against live API with SSE stream parsing - Assertions: Reusable checks (tool usage, content, duration, errors) - Scenarios: Multi-step test workflows with configurable assertions Basic scenarios: - QuickSmokeTest: Minimal functionality verification - ReadOnlyInfrastructure: List, logs, status operations - RoutingValidation: Command routing to correct targets - LogTailing: Bounded log commands complete properly - Discovery: Infrastructure discovery capabilities Advanced scenarios: - TroubleshootingScenario: Multi-step investigation workflow - DeepDiveScenario: Thorough single-service investigation - ConfigInspectionScenario: Reading configuration files - ResourceAnalysisScenario: Cross-container resource comparison - MultiNodeScenario: Operations across Proxmox nodes - DockerInDockerScenario: Docker containers inside LXCs - ContextChainScenario: Context retention across turns Usage: go test ./internal/ai/eval -live -run TestQuickSmokeTest	2026-01-28 16:49:24 +00:00
rcourtman	b2e0ae3fdb	Add ExecutionIntent classification and NonInteractiveOnly enforcement Implement safety layers for command execution: ExecutionIntent classifies commands as: - ObservationOnly: Pure read (status, logs, metrics) - SideEffects: May change state (restart, write, delete) NonInteractiveOnly enforces safe command forms: - Blocks interactive commands (vim, top without -b, etc) - Blocks unbounded streaming (tail -f without limit) - Suggests safe alternatives in error messages Add phantom execution detection: - Catches when model claims actions without using tools - Skips check when tools actually succeeded (fixes false positives) Includes comprehensive tests for: - Intent classification accuracy - Interactive command blocking - Strict resolution validation	2026-01-28 16:49:00 +00:00
rcourtman	6e739cea5c	Add resolved context and routing provenance tracking Implement ResolvedContext to track pinned resources during chat sessions: - ResolvedTarget captures resource ID, type, node, and provenance info - Provenance tracking records how targets were resolved (user mention, tool result, or implicit context) - Session maintains pinned targets that persist across conversation turns Add routing contract tests to verify: - Commands routed to correct container vs host targets - Provenance properly recorded for different resolution methods - Context maintained across multi-turn conversations This provides audit trail for which resources were accessed and how they were identified, supporting safety verification and debugging.	2026-01-28 16:48:25 +00:00
rcourtman	6a0ba8d1a4	Add FSM workflow guardrails for AI assistant safety Implement a state machine that enforces structural safety guarantees: - RESOLVING: Initial state, must discover resources before writing - READING: Read tools allowed after discovery - WRITING: Transitions to VERIFYING after any write operation - VERIFYING: Must perform read verification before next write This prevents: - Write operations without resource discovery - Consecutive writes without verification - Final answers without post-write verification The FSM is enforced at the tool execution layer, providing defense-in-depth that doesn't rely on prompt instructions alone.	2026-01-28 16:47:54 +00:00
rcourtman	70dbb495ad	fix: address triage issues #1149, #1153, #1162, #1163 - #1163: Add node badges to storage resources in threshold tables (ResourceTable.tsx, ResourceCard.tsx) - #1162: Fix PBS backup alerts showing datastore as node name (alerts.go - use "Unknown" for orphaned backups) - #1153: Fix memory leaks in tracking maps - Add max 48 sample limit for pmgQuarantineHistory - Add max 10 entry limit for flappingHistory - Add cleanup for dockerUpdateFirstSeen - Add cleanupTrackingMaps() for auth, polling, and circuit breaker maps Note: #1149 fix (chat sessions null check) is in AISettings.tsx which has other pending changes - will be committed separately.	2026-01-26 22:21:10 +00:00
rcourtman	6873913e64	fix: install script and docs improvements - Fixed --disable-docker not being passed to systemd service file. Related to #1151 - Added init: true requirement to HTTPS/TLS docs for Docker. Related to #1166	2026-01-26 20:48:57 +00:00
rcourtman	7f7edfceb4	test: expand backend coverage	2026-01-25 21:08:44 +00:00
rcourtman	3ea5f54d93	chore: fix outdated comment in migration.go RunMigrationIfNeeded IS called from pkg/server/server.go, so removed the misleading comment about it being dormant.	2026-01-24 23:27:09 +00:00
rcourtman	f2648891a9	chore: remove unused utility functions Remove createColumn helper from responsive types and getPulsePort from URL utilities. Both were exported but never used anywhere.	2026-01-24 23:23:27 +00:00
rcourtman	4e64ea22b2	chore: remove unused alertFormatters functions Remove isTemperatureMetric and isThroughputMetric functions that are exported but never imported anywhere in the codebase.	2026-01-24 23:19:41 +00:00
rcourtman	a4d5e786ff	chore: remove unused frontend API functions Remove API functions that are defined but never called: - ai.ts: OAuth flow, execute/executeStream, chat session sync - charts.ts: getStorageCharts, getMetricsStoreStats - notifications.ts: queue/DLQ management, health check - updates.ts: update history functions Also removes unused type definitions (MetricsStoreStats, UpdateHistoryEntry)	2026-01-24 23:15:30 +00:00
rcourtman	ff2841a5c6	Fix patrol scoping and config propagation	2026-01-24 23:07:55 +00:00
rcourtman	de2cb7a29b	chore: remove deprecated GetAvailableModels and ModelInfo - Remove deprecated config.ModelInfo type (use providers.ModelInfo) - Remove deprecated GetAvailableModels function (always returned nil) - Remove associated test - Update AISettingsResponse to use providers.ModelInfo	2026-01-24 23:00:16 +00:00
rcourtman	0e602518c8	chore: remove unused CSS and clean up code - Remove unused animations.css (all classes were unused) - Replace console.log with logger in UnifiedHistoryChart - Remove deprecated isEnterprise export from license store	2026-01-24 22:58:06 +00:00
rcourtman	7f759e6fd2	chore: remove unused frontend code Remove unused utility files: - utils/textWidth.ts - not imported anywhere - utils/extractErrorMessage.ts - not imported anywhere Remove unused exports: - utils/canvasRenderQueue.ts: getRenderQueueStats() - hooks/useBreakpoint.ts: getVisibilityClass(), getPriorityMinWidth() - api/ai.ts: getForecast(), getForecastOverview(), related types - api/patrol.ts: categoryLabels, autonomyLevelLabels (kept used exports)	2026-01-24 22:55:55 +00:00
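
The ZFS fix in commit f0a356c016 describes a ratio-based calculation: dataset Free (usable, from statfs) divided by zpool Free (raw, from zpool list) scales raw zpool Size down to a usable total, and Used is Total minus Free. A minimal sketch of that arithmetic follows. The function name, signature, and zero-free handling are assumptions made for illustration and are not taken from the Pulse source.

```go
package main

import (
	"fmt"
	"math"
)

// usablePoolUsage is a hypothetical helper, not code from the Pulse repository.
// zpoolSize and zpoolFree are the raw values reported by `zpool list`;
// datasetFree is the usable free space reported by statfs on a mounted dataset.
func usablePoolUsage(zpoolSize, zpoolFree, datasetFree uint64) (total, used uint64, ok bool) {
	if zpoolFree == 0 {
		// Assumption: with no raw free space the ratio is undefined,
		// so the caller has to fall back to some other source.
		return 0, 0, false
	}
	// total = zpoolSize * (datasetFree / zpoolFree), computed in floating
	// point so that byte counts in the terabyte range do not overflow.
	t := math.Round(float64(zpoolSize) * float64(datasetFree) / float64(zpoolFree))
	total = uint64(t)
	if total < datasetFree {
		// Guard against inconsistent inputs so the subtraction cannot wrap.
		total = datasetFree
	}
	used = total - datasetFree
	return total, used, true
}

func main() {
	// Illustrative numbers only: raw size 3000, raw free 1500, usable free 1000.
	total, used, ok := usablePoolUsage(3000, 1500, 1000)
	if !ok {
		fmt.Println("cannot derive usable capacity")
		return
	}
	fmt.Printf("total=%d used=%d (%.1f%%)\n", total, used, 100*float64(used)/float64(total))
}
```

With these illustrative inputs the usable total is 2000 and Used is 1000. Because Used is derived from pool-wide free space rather than from a single dataset's own usage, space taken by zvols and sibling datasets is counted, which is the behaviour the commit message describes.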


2608 Commits