Pulse

Mirror of https://github.com/rcourtman/Pulse.git, synced 2026-02-19 07:50:43 +01:00

Author	SHA1	Message	Date
rcourtman	d663ba4342	hostagent: avoid host ID collisions and prefer LAN IP	2025-12-17 16:29:59 +00:00
rcourtman	e44a6fdadd	test(envdetect): cover environment detection decisions	2025-12-17 16:08:10 +00:00
rcourtman	969fa0e509	test: add unit tests for AI, Kubernetes agent, and clients	2025-12-17 12:47:36 +00:00
rcourtman	a115af6906	feat: Improve cluster endpoint error messages for users - Add sanitizeEndpointError() to transform raw Go errors into user-friendly messages - Transform 'context deadline exceeded' into helpful messages mentioning possible causes - Storage timeout errors now suggest checking PBS/NFS/Ceph backend connectivity - Connection refused, certificate errors, and auth errors get actionable hints - Apply sanitization everywhere cluster endpoint lastError is stored - Add comprehensive tests for all error transformations	2025-12-16 21:50:02 +00:00
rcourtman	3a2a73f9d6	Merge main into ai-features: incorporate latest bugfixes Resolved conflicts: - pkg/fsfilters/filters.go: Keep both TrueNAS and EnhanceCP filter fixes - DockerUnifiedTable.tsx: Use main's resource column overlap fix	2025-12-13 15:18:51 +00:00
rcourtman	a259b67348	feat: add Kubernetes platform support	2025-12-12 21:31:11 +00:00
rcourtman	88d419dd5b	feat(ai): Add enriched context with historical trends and predictions Phase 1 of Pulse AI differentiation: - Create internal/ai/context package with types, trends, builder, formatter - Implement linear regression for trend computation (growing/declining/stable/volatile) - Add storage capacity predictions (predicts days until 90% and 100%) - Wire MetricsHistory from monitor to patrol service - Update patrol to use buildEnrichedContext instead of basic summary - Update patrol prompt to reference trend indicators and predictions This gives the AI awareness of historical patterns, enabling it to: - Identify resources with concerning growth rates - Predict capacity exhaustion before it happens - Distinguish between stable high usage vs growing problems - Provide more actionable, time-aware insights All tests passing. Falls back to basic summary if metrics history unavailable.	2025-12-12 09:45:57 +00:00
rcourtman	fa13919987	fix(ai-chat): Display messages chronologically in AI chatbot - Add 'content' type to StreamDisplayEvent for tracking text chunks - Track content events in streamEvents array for chronological display - Update render to use Switch/Match for cleaner conditional rendering - Interleave thinking, tool calls, and content as they stream in - Add fallback for old messages without streamEvents for backwards compat Previously, tool/command outputs stayed at top while AI text responses accumulated at the bottom. Now all events appear in order like a normal chatbot.	2025-12-11 23:02:59 +00:00
rcourtman	927ac76bad	feat: AI integration, Docker metrics, RAID display, and infrastructure improvements - Add Claude OAuth authentication support with hybrid API key/OAuth flow - Implement Docker container historical metrics in backend and charts API - Add CEPH cluster data collection and new Ceph page - Enhance RAID status display with detailed tooltips and visual indicators - Fix host deduplication logic with Docker bridge IP filtering - Fix NVMe temperature collection in host agent - Add comprehensive test coverage for new features - Improve frontend sparklines and metrics history handling - Fix navigation issues and frontend reload loops	2025-12-09 09:29:27 +00:00
rcourtman	8948e84fe5	feat: AI features, agent improvements, and host monitoring enhancements AI Chat Integration: - Multi-provider support (Anthropic, OpenAI, Ollama) - Streaming responses with markdown rendering - Agent command execution for remote troubleshooting - Context-aware conversations with host/container metadata Agent Updates: - Add --enable-proxmox flag for automatic PVE/PBS token setup - Improve auto-update with semver comparison (prevents downgrades) - Add updatedFrom tracking to report previous version after update - Reduce initial update check delay from 30s to 5s - Add agent version column to Hosts page table Host Metrics: - Add DiskIO stats collection (read/write bytes, ops, time) - Improve disk filtering to exclude Docker overlay mounts - Add RAID array monitoring via mdadm - Enhanced temperature sensor parsing Frontend: - New Agent Version column on Hosts overview table - Improved node modal with agent-first installation flow - Add DiskIO display in host drawer - Better responsive handling for metric bars	2025-12-05 10:37:02 +00:00
rcourtman	63038b5f30	fix: Filter EnhanceCP /var/container_tmp overlay mounts from disk stats EnhanceCP uses /var/container_tmp/{uuid}/merged for container overlays. These are ephemeral container layers, not user storage, and should be filtered from disk usage display. Related to #790	2025-12-04 20:11:10 +00:00
rcourtman	da51449392	fix: Exclude TrueNAS Docker overlay mounts from disk stats Host agent was including Docker overlay2 mounts from TrueNAS SCALE's .ix-apps directory in disk totals. These mounts inherit the ZFS pool's AVAIL space, causing massively inflated storage numbers (e.g., 173 TB per container overlay instead of actual usage). Changes: - Add /mnt/.ix-apps/docker/ to container overlay path exclusions - Use ShouldSkipFilesystem() in host agent disk collection (was only using ShouldIgnoreReadOnlyFilesystem() which missed container paths) - Add test cases for TrueNAS overlay paths Related to #718	2025-12-04 03:03:04 +00:00
rcourtman	4c98933175	fix: Filter container overlay mounts in non-standard locations Detect container overlay filesystem paths from various container runtimes (Docker, Podman, LXC, EnhanceCP, etc.) that may not be in standard /var/lib/docker or /var/lib/containers locations. Paths containing /containers/ with overlay patterns (/overlay2/, /overlay/, /diff/, /merged) are now filtered from disk usage aggregation. Related to #790	2025-12-03 14:06:15 +00:00
rcourtman	4f824ab148	style: Apply gofmt to 37 files Standardize code formatting across test files and monitor.go. No functional changes.	2025-12-02 17:21:48 +00:00
rcourtman	c05817f9de	docs: Add godoc comments to exported functions Add missing godoc comments to: - NewRateLimiter and Allow in ratelimit.go - SnapshotSyncStatus in temperature_proxy.go - NewClient and GetVersion in pkg/pmg/client.go	2025-12-02 15:58:59 +00:00
rcourtman	c812720f25	test: Add Disk UnmarshalJSON RPM and error path tests Cover RPM field handling (numeric, string, SSD, N/A, null, invalid), invalid JSON error path, and unexpected type fallbacks for both wearout and RPM fields. Coverage: 50% → 95.5%	2025-12-02 02:23:44 +00:00
rcourtman	618fc084f1	test: Add invalid user format tests for NewClient Test error handling for password authentication user format validation: - Missing realm separator (no @) - Empty user string - Multiple @ symbols Improves NewClient coverage from 74.2% to 83.9%.	2025-12-02 01:25:11 +00:00
rcourtman	de33653dc2	test: Add invalid value tests for VMFileSystem.UnmarshalJSON Test error handling for JSON parsing edge cases: - Invalid JSON syntax - Unsupported field types (bool, array) - Unparseable string values for total-bytes and used-bytes Improves coverage from 83.3% to 94.4%.	2025-12-02 01:22:42 +00:00
rcourtman	79afff8ba2	test: Add invalid value tests for MemoryStatus.UnmarshalJSON Test error handling for JSON parsing edge cases: - Invalid JSON syntax - Unsupported field types (bool, array, object) - Unparseable string values Improves coverage from 70.0% to 83.3%.	2025-12-02 01:20:15 +00:00
rcourtman	22d9e2795c	test: Add permanent failure test for ClusterClient.GetNodes Tests the error logging path when all endpoints fail with auth error (83.3% to 91.7% coverage).	2025-12-02 01:05:48 +00:00
rcourtman	5bbf7de1a3	test: Add JSON decode error test for Client.GetNodes Tests the error path when server returns invalid JSON (87.5% to 100%).	2025-12-02 01:03:30 +00:00
rcourtman	490fd9a810	test: Add edge cases for parseReplicationJob fields - Test jobid fallback when id field is missing - Test jobnum field takes precedence over ID parsing - Test last_sync_duration and duration fields - Test last-sync-duration fallback format - Test next_sync and next-sync fallback formats Coverage: 79.7% → 100%	2025-12-02 00:24:40 +00:00
rcourtman	29e01f8ff5	test: Add edge case for coerceUint64 ParseUint error branch String 'abc' without .eE characters triggers ParseUint error path. Coverage: 97.4% to 100%.	2025-12-01 23:44:04 +00:00
rcourtman	e2172b16de	test: Add edge case test for isNotImplementedError fallback branch Tab character triggers extractStatusCode fallback path (regex \s+ matches tab but ' 501' substring check doesn't). Coverage: 87.5% to 100%.	2025-12-01 23:18:45 +00:00
rcourtman	2afc7f0c41	test: Add edge case tests for parseWearoutValue function Add 4 new test cases covering previously untested branches: - Float zero exactly (0.0) - Float negative zero (-0.0) - Only escaped quotes becoming empty after trimming - Quoted whitespace becoming empty after trimming Coverage improved from 95.8% to 100%.	2025-12-01 23:02:18 +00:00
rcourtman	be892f5e07	fix: match storage timeout errors without trailing slash The error pattern `/storage/` only matched storage content endpoints (`/storage/{name}/content`) but not the main storage list endpoint (`/nodes/{node}/storage`). This caused storage timeout errors like: Get ".../nodes/pve-100-224/storage": context deadline exceeded to incorrectly mark cluster nodes as unhealthy, even though the timeout was due to a slow cross-node storage query, not actual node connectivity issues. Fixes #754	2025-12-01 22:48:01 +00:00
rcourtman	9097b507fd	test: Add edge case tests for parseReplicationTime function Add 13 new test cases covering previously untested branches: - float32 timestamp with valid value (using smaller value for precision) - float32/float64 zero and negative values - json.Number zero and negative values - int32 and uint32 timestamp handling - Invalid date format strings (no matching layout) - Partial date strings - Unsupported types (bool, slice) Coverage improved from 93.8% to 100%.	2025-12-01 22:44:23 +00:00
rcourtman	18472f1668	test: Add float32 NaN/Inf tests for intFromAny and floatFromAny Add 6 test cases covering float32 special values: - intFromAny: float32 NaN, +Inf, -Inf (all return 0, false) - floatFromAny: float32 NaN, +Inf, -Inf (all return 0, false) Coverage improved: - intFromAny: 96.7% -> 100% - floatFromAny: 95.0% -> 100%	2025-12-01 22:40:08 +00:00
rcourtman	1e9fbdfdcc	test: Add edge case tests for coerceUint64 function Add 6 new test cases covering previously untested branches: - float64 at MaxUint64 boundary (clamping behavior) - float64 exceeding MaxUint64 (overflow protection) - String with quoted "null" value - String with quoted empty value ("") - String with single quoted empty value ('') - Invalid float parsing in scientific notation Coverage improved from 92.3% to 97.4%.	2025-12-01 22:36:03 +00:00
rcourtman	05b9c3ab2d	test: Add tests for CPUInfo.GetMHzString method Add 11 test cases covering: - Nil MHz returns empty string - String MHz returned as-is - Empty string handling - Float64 formatted without decimals - Float64 zero handling - Float64 rounding for large values - Int formatting - Int zero handling - Default formatting for other types (int64, bool, slice) Coverage: GetMHzString 0% -> 100%	2025-12-01 22:29:30 +00:00
rcourtman	1f748e8670	fix: recover unhealthy cluster nodes even when some nodes are healthy Previously, recovery of unhealthy nodes only triggered when ALL nodes were unhealthy. This caused individual degraded nodes to stay degraded forever since operations would succeed on healthy nodes and never trigger the recovery path. Now recovery is attempted whenever any unhealthy nodes exist, allowing clusters to recover individual nodes over time. Also added: - Panic-safe unlock/lock pattern using anonymous function - Refresh of both healthy and cooling endpoints after recovery - Updated timestamp for accurate cooldown checks Related to #754	2025-12-01 21:47:26 +00:00
rcourtman	d9331570f5	test: Add tests for VMAgentField JSON unmarshaling Covers both Proxmox API formats: - Integer format (older versions): direct int value - Object format (Proxmox 8.3+): {enabled, available} fields - Preference order: available > enabled > 0 - Invalid input handling defaults to 0 - Integration with VMStatus struct	2025-12-01 21:40:47 +00:00
rcourtman	32333cdbbe	test: Add tests for authHTTPError.Error and shouldFallbackToForm Tests for Proxmox client authentication error handling: - authHTTPError.Error: message formatting based on status code (401/403 include status in message, others don't) - shouldFallbackToForm: determines when to retry with form encoding (triggers on 400/415, not on auth errors or server errors) 16 test cases covering all code paths.	2025-12-01 13:39:50 +00:00
rcourtman	42eec54d6e	Add unit tests for parseWearoutValue and clampWearoutConsumed functions 52 test cases covering: - Empty/whitespace input - Simple numeric strings and quoted values - Percentage symbols and N/A variants - Float values with truncation - Messy SMART data with digit extraction fallback - Clamping behavior for unknown, normal, and out-of-range values	2025-12-01 09:18:04 +00:00
rcourtman	f9122d736e	Add unit tests for parseUint64Flexible function 32 test cases covering all code paths: - nil, uint64, int, int64, float64 type handling - json.Number parsing (delegates to string branch) - String parsing: empty, decimal, hex (0x/0X), float notation, scientific - Negative value handling (returns 0 for numeric types) - Error cases: invalid strings, unsupported types	2025-12-01 09:11:02 +00:00
rcourtman	37550bff6d	Add unit tests for ZFS device conversion functions Tests added by ADA run #97 but commit was missed. Covers: RaidZ types, log/cache/spare devices, nested mirrors, ConvertToModelZFSPool, and struct field tests.	2025-12-01 09:03:48 +00:00
rcourtman	68c0e79b21	Add unit tests for cloneProfile and clonePhase functions in discovery Add comprehensive tests for the cloneProfile and clonePhase utility functions in pkg/discovery/discovery.go. Tests verify deep copying behavior for all fields including subnets, metadata, warnings, extra targets, and phases to ensure mutations don't affect original objects.	2025-12-01 01:51:54 +00:00
rcourtman	6c18849f79	Add unit tests for cluster_client utility functions Test coverage for error detection and retry logic: - extractStatusCode: 13 test cases for HTTP status code extraction - isTransientRateLimitError: 17 test cases for rate limit detection - isNotImplementedError: 14 test cases for 501 error detection - isVMSpecificError: 16 test cases for VM-scoped errors - calculateRateLimitBackoff: backoff timing verification - isAuthError: 12 test cases for authentication errors Coverage 35.5% → 37.3%	2025-12-01 00:24:21 +00:00
rcourtman	dc76294ce1	Add unit tests for discovery package utility functions Test coverage for pure utility functions: friendlyPhaseName, defaultProductsForPort, cloneHeader, copyMetadata, and ensurePolicyDefaults.	2025-11-30 16:05:11 +00:00
rcourtman	efafbe8e31	Add unit tests for PMG flexible JSON type parsers Tests for flexibleFloat and flexibleInt custom JSON unmarshalers that handle PMG API responses where numeric values may arrive as numbers, strings, or nulls. 64 test cases covering: - JSON numbers (integers, floats, scientific notation, negatives) - String values (numeric strings, empty, whitespace, "null") - JSON null values - Error cases (invalid strings, arrays, objects, booleans) - Boundary values (max/min float64) - Real PMG response patterns (mail stats, queue status) - Struct embedding behavior	2025-11-30 03:04:12 +00:00
rcourtman	92c2d198b1	Add unit tests for Proxmox replication utility functions Comprehensive test coverage for JSON parsing helpers used in replication job status parsing: stringFromAny, intFromAny, boolFromAny, floatFromAny, parseReplicationTime, parseDurationSeconds, parseHHMMSSToSeconds, and parseReplicationJob. Coverage increased from 22.6% to 35.5%.	2025-11-30 02:35:11 +00:00
rcourtman	316161f989	Add unit tests for coerceUint64 and FlexInt.UnmarshalJSON 45 test cases covering: - FlexInt: integer/float/string parsing, truncation behavior, error cases - coerceUint64: nil, float64 (including NaN/Inf), int/int32/int64, uint32/uint64, json.Number, string parsing (whitespace, null, quotes, commas, scientific notation), unsupported types Coverage: 20.5% -> 22.6%	2025-11-30 02:17:52 +00:00
rcourtman	69de7c25ce	Fix cluster degraded status not recovering after transient failures The previous fix (`6db4ee7a`) cleared stale error messages but didn't mark endpoints as healthy again after successful operations. This caused clusters to remain in "degraded" state permanently once any endpoint had a temporary issue, even if all endpoints were actually working. The fix now marks endpoints healthy in clearEndpointError() after successful operations, ensuring degraded clusters recover automatically. Related to #659	2025-11-29 19:04:11 +00:00
rcourtman	2eea0335a2	Extract filesystem filtering logic into pkg/fsfilters Move the inline filesystem skip logic from pollVMsAndContainersEfficient into a reusable ShouldSkipFilesystem function. This consolidates filtering for virtual filesystems (tmpfs, cgroup, etc.), network mounts (nfs, cifs, fuse), and special mountpoints (/dev, /proc, /snap, etc.) into one tested location. Reduces cyclomatic complexity of pollVMsAndContainersEfficient and adds 28 test cases covering virtual fs types, network mounts, special mounts, Windows paths, and edge cases.	2025-11-29 16:38:08 +00:00
rcourtman	1b5528356b	fix: clear stale errors after successful cluster operations Previously, errors stored in ClusterClient.lastError were only cleared during initial health checks or when recovering unhealthy nodes. This caused stale error messages to persist in the UI even after the underlying issues were resolved. The fix clears cached errors in two places: 1. After passing connectivity test in getHealthyClient() 2. After successful operation in executeWithFailover() This ensures that once an endpoint starts working again, any previous error messages are cleared from the UI without requiring a restart. Related to #659, #754	2025-11-27 16:22:16 +00:00
rcourtman	ad998a1e2f	style: fix staticcheck style warnings - Merge variable declaration with assignment (S1021) - Use unconditional strings.TrimPrefix (S1017) - Remove unnecessary nil checks around range (S1031) - Remove unnecessary fmt.Sprintf (S1039) - Use copy() instead of manual loop (S1001) - Use time.Until instead of t.Sub(time.Now()) (S1024) - Use buf.String() instead of string(buf.Bytes()) (S1030)	2025-11-27 09:19:33 +00:00
rcourtman	bc9e89696b	chore: fix staticcheck U1000 unused code warnings - Remove unused ipv6Regex from validation.go - Suppress unused recordAlertFired/recordAlertResolved hooks (kept for future use) - Remove unused apiLimiter rate limiter - Remove unused stopOnce fields from csrf_store.go and session_store.go - Remove unused lastBroadcast field from hub.go - Remove unused lastUsedIndex field from cluster_client.go	2025-11-27 09:12:17 +00:00
rcourtman	8276ae837e	chore: cleanup proxmox IsAuthError and remove stray comment - Make IsAuthError unexported (isAuthError) since it's only used internally - Remove stray '// test comment' from docker_metadata.go	2025-11-27 08:59:01 +00:00
rcourtman	3045aa16fb	chore: remove unused phaseError type from discovery	2025-11-27 08:47:13 +00:00
rcourtman	c439a83fba	chore: remove additional dead code Remove 241 lines of unreachable code across internal and pkg: - internal/crypto/crypto.go: unused NewCryptoManager wrapper - internal/monitoring/scheduler.go: unused fixedIntervalSelector type - internal/ssh/knownhosts/manager.go: unused hostKeyExists function - internal/updates/manager.go: unused getLatestRelease wrapper - internal/updates/updater.go: unused GetAll method - pkg/discovery/discovery.go: unused scanWorker and runPhase (legacy compat) - pkg/proxmox/client.go: unused post, getTaskStatus, waitForTaskCompletion, getTaskLog - pkg/proxmox/cluster_client.go: unused markUnhealthy wrapper	2025-11-27 05:13:26 +00:00
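
The overlay-mount filtering described in commits 4c98933175, 63038b5f30 and da51449392 can be sketched roughly as follows. This is a minimal illustration, not the code in pkg/fsfilters: the helper name `isContainerOverlayMount` and both variable names are hypothetical, and the only inputs taken from the commit messages are the path prefixes and the overlay patterns (`/overlay2/`, `/overlay/`, `/diff/`, `/merged`).

```go
package main

import (
	"fmt"
	"strings"
)

// containerOverlayPrefixes holds the mount path prefixes that the commit
// messages name as container overlay locations. The variable is hypothetical.
var containerOverlayPrefixes = []string{
	"/var/lib/docker/",
	"/var/lib/containers/",
	"/mnt/.ix-apps/docker/", // TrueNAS SCALE apps (da51449392)
	"/var/container_tmp/",   // EnhanceCP (63038b5f30)
}

// overlayMarkers are the overlay patterns listed in 4c98933175.
var overlayMarkers = []string{"/overlay2/", "/overlay/", "/diff/", "/merged"}

// isContainerOverlayMount is a hypothetical helper, not the project's
// ShouldSkipFilesystem. It reports whether a mountpoint looks like an
// ephemeral container layer that should be left out of disk totals.
func isContainerOverlayMount(mountpoint string) bool {
	// Known locations are skipped outright.
	for _, prefix := range containerOverlayPrefixes {
		if strings.HasPrefix(mountpoint, prefix) {
			return true
		}
	}
	// Non-standard locations: require a /containers/ path segment
	// combined with one of the overlay patterns.
	if !strings.Contains(mountpoint, "/containers/") {
		return false
	}
	for _, marker := range overlayMarkers {
		if strings.Contains(mountpoint, marker) {
			return true
		}
	}
	return false
}

func main() {
	for _, m := range []string{
		"/mnt/.ix-apps/docker/overlay2/abc/merged",
		"/var/container_tmp/1234/merged",
		"/data/containers/storage/overlay/abc/merged",
		"/mnt/tank/media",
	} {
		fmt.Printf("%-45s skip=%v\n", m, isContainerOverlayMount(m))
	}
}
```

Checking the fixed prefixes first keeps the common cases cheap. The substring check for non-standard locations is deliberately loose, so in this sketch a path that merely contains `/containers/` together with `/merged` is also skipped.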

106 Commits
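
The pattern change described in commit be892f5e07 can be illustrated with a small sketch. Both function names are hypothetical and this is not the project's classifier. It only contrasts a `/storage/` substring check, which misses the storage list endpoint, with a `/storage` check that matches both endpoint shapes named in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

const deadlineMsg = "context deadline exceeded"

// oldIsStorageTimeout mimics the behaviour described as broken: it only
// matches URLs where "/storage" is followed by a slash, such as
// /storage/{name}/content. Hypothetical name.
func oldIsStorageTimeout(errMsg string) bool {
	return strings.Contains(errMsg, deadlineMsg) &&
		strings.Contains(errMsg, "/storage/")
}

// isStorageTimeout drops the trailing slash so that the list endpoint
// /nodes/{node}/storage is matched as well. Hypothetical name.
func isStorageTimeout(errMsg string) bool {
	return strings.Contains(errMsg, deadlineMsg) &&
		strings.Contains(errMsg, "/storage")
}

func main() {
	// Error text taken from the commit message, including its elided host.
	listErr := `Get ".../nodes/pve-100-224/storage": context deadline exceeded`

	fmt.Println("old pattern matches:", oldIsStorageTimeout(listErr))
	fmt.Println("new pattern matches:", isStorageTimeout(listErr))
}
```

With the old check, the list-endpoint timeout is not recognised as a storage timeout, which is how a slow cross-node storage query could end up marking a node unhealthy.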