mirror of
https://github.com/rcourtman/Pulse.git
synced 2026-02-18 00:17:39 +01:00
Implement comprehensive eval framework for testing Pulse Assistant

Core components:
- Runner: Executes scenarios against live API with SSE stream parsing
- Assertions: Reusable checks (tool usage, content, duration, errors)
- Scenarios: Multi-step test workflows with configurable assertions

Basic scenarios:
- QuickSmokeTest: Minimal functionality verification
- ReadOnlyInfrastructure: List, logs, status operations
- RoutingValidation: Command routing to correct targets
- LogTailing: Bounded log commands complete properly
- Discovery: Infrastructure discovery capabilities

Advanced scenarios:
- TroubleshootingScenario: Multi-step investigation workflow
- DeepDiveScenario: Thorough single-service investigation
- ConfigInspectionScenario: Reading configuration files
- ResourceAnalysisScenario: Cross-container resource comparison
- MultiNodeScenario: Operations across Proxmox nodes
- DockerInDockerScenario: Docker containers inside LXCs
- ContextChainScenario: Context retention across turns

Usage:
  go test ./internal/ai/eval -live -run TestQuickSmokeTest