docs: add Mermaid diagrams to improve visual documentation

Enhance documentation with six Mermaid diagrams to better explain
complex system implementations:

- Adaptive polling lifecycle flowchart showing enqueue→execute→feedback
  cycle with scheduler, priority queue, and worker interactions
- Circuit breaker state machine diagram illustrating Closed↔Open↔Half-open
  transitions with triggers and recovery paths
- Temperature proxy architecture diagram highlighting trust boundaries,
  security controls, and data flow between host/container/cluster
- Sensor proxy request flow sequence diagram showing auth, rate limiting,
  validation, and SSH execution pipeline
- Alert webhook pipeline flowchart detailing template resolution, URL
  rendering, HTTP dispatch, and retry logic
- Script library workflow diagram illustrating dev→test→bundle→distribute
  lifecycle emphasizing modular design

These diagrams make it easier for operators and contributors to
follow how these Pulse subsystems work and interact.
rcourtman
2025-10-21 10:40:33 +00:00
parent f9cb96ceb8
commit 85ffe10aed
13 changed files with 875 additions and 39 deletions

View File

@@ -17,27 +17,39 @@ This document describes the security architecture of Pulse's temperature monitor
## Architecture Overview
```
┌─────────────────────────────────────────┐
│ Proxmox Host (delly)                    │
│  ┌──────────────────────────────────┐   │
│  │ pulse-sensor-proxy (UID 999)     │   │
│  │ - SSH keys (host-only)           │   │
│  │ - Unix socket exposed            │   │
│  │ - Method-level authorization     │   │
│  │ - Rate limiting enforced         │   │
│  └──────────────────────────────────┘   │
│                  │                      │
│        Unix Socket (read-only)          │
│                  ↓                      │
│  ┌──────────────────────────────────┐   │
│  │ LXC Container (ID-mapped)        │   │
│  │ - No SSH keys                    │   │
│  │ - Socket at /mnt/pulse-proxy     │   │
│  │ - Can't call privileged RPCs     │   │
│  └──────────────────────────────────┘   │
└─────────────────────────────────────────┘
```mermaid
graph TD
subgraph Host["Proxmox Host (delly)\nTrust Boundary"]
Proxy["pulse-sensor-proxy service\nUID 999\nSO_PEERCRED auth\nMethod ACL + per-UID rate limit\nPer-node concurrency = 1"]
Socket["Unix socket\n/run/pulse-sensor-proxy.sock\n(0600 bind mount)"]
Audit["Audit & Metrics\n/var/log/pulse/... & :9127/metrics"]
PrivOps["Privileged RPCs\nensure_cluster_keys | register_nodes | request_cleanup\nHost UID only"]
end
subgraph Container["Pulse Container (ID-mapped root)"]
Backend["Pulse Backend"]
Poller["Temperature Poller worker"]
end
subgraph Cluster["Cluster Nodes"]
SensorCmd["Forced SSH command\n`sensors -j` only\nRestricted authorized_keys entry"]
end
Poller -->|poll request| Backend
Backend -->|RPC via bind-mounted socket| Socket
Socket --> Proxy
Proxy -->|temperature JSON response| Backend
Proxy -->|rate-limit reject + 2 s penalty| Reject["429 response"]
Reject --> Backend
Proxy -->|"SSH (ed25519 key)\nforced command"| SensorCmd
SensorCmd -->|temperature JSON| Proxy
Proxy -->|audit entry + metrics| Audit
Audit -->|Prometheus scrape| Metrics["Telemetry Consumers\n(Grafana, watchdog)"]
PrivOps --> Proxy
Backend -. "blocked (ID-mapped root)" .-> PrivOps
```
**Key Principle**: SSH keys never enter containers. All SSH operations are performed by the host-side proxy.
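
The method-level authorization in the diagram above can be pictured with a minimal TypeScript sketch. The `PeerCredentials` shape, the `idMappedRoot` flag, and `isMethodAllowed` are hypothetical names used only for illustration; the real proxy derives credentials from `SO_PEERCRED` and its checks may differ. Only the RPC method names come from the documentation.

```typescript
// Hypothetical peer identity, standing in for what SO_PEERCRED would report.
interface PeerCredentials {
  uid: number;
  // Assumption: the proxy can tell that the caller is container root
  // mapped onto an unprivileged host UID.
  idMappedRoot: boolean;
}

// Privileged RPC names taken from the diagram above.
const PRIVILEGED_METHODS: string[] = [
  "ensure_cluster_keys",
  "register_nodes",
  "request_cleanup",
];

// Returns true when the caller may invoke the given RPC method.
function isMethodAllowed(method: string, peer: PeerCredentials): boolean {
  const privileged = PRIVILEGED_METHODS.indexOf(method) !== -1;
  if (!privileged) {
    return true; // e.g. get_temperature stays available to the container
  }
  // Privileged RPCs are reserved for host-side callers.
  return !peer.idMappedRoot;
}

// Illustrative peers; the UID values are placeholders, not Pulse defaults.
const containerPeer: PeerCredentials = { uid: 100000, idMappedRoot: true };
const hostPeer: PeerCredentials = { uid: 0, idMappedRoot: false };
```

With these placeholder peers, the container can still request temperatures but is refused `register_nodes`, which matches the dashed "blocked" edge in the diagram.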

View File

@@ -113,6 +113,34 @@ For webhooks that require authentication or custom headers:
## Custom Payload Templates
```mermaid
flowchart TD
AlertEvent["Alert Event Triggered"]
GatherData["Gather Alert Data\n(Level, Type, Resource, Node, etc.)"]
ResolveURL["Resolve URL Template\n({{urlpath}}, {{urlquery}})"]
ResolvePayload["Resolve Payload Template\n(variable substitution)"]
ApplyFunctions["Apply Template Functions\n(title, upper, lower, printf)"]
Dispatch["HTTP POST Request"]
CheckResponse{"Response\nStatus?"}
Success["200-299: Success\nLog delivery"]
Retry["429/5xx: Retry\n(exponential backoff)"]
Failure["4xx: Failure\nLog error"]
TrackDelivery["Update Delivery Metrics\npulse_webhook_deliveries_total"]
AlertEvent --> GatherData
GatherData --> ResolveURL
ResolveURL --> ResolvePayload
ResolvePayload --> ApplyFunctions
ApplyFunctions --> Dispatch
Dispatch --> CheckResponse
CheckResponse -->|Success| Success
CheckResponse -->|Transient Error| Retry
CheckResponse -->|Permanent Error| Failure
Success --> TrackDelivery
Retry --> TrackDelivery
Failure --> TrackDelivery
```
For generic webhooks, you can define custom JSON payloads using Go template syntax.
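
The response handling in the flowchart above can be sketched roughly as follows. This is an illustrative TypeScript model, not Pulse's Go implementation: `classifyWebhookStatus`, `recordDelivery`, and `deliveryCounts` are hypothetical names, and treating every status outside the 2xx, 429, and 5xx ranges as a permanent failure is an assumption.

```typescript
// Outcome categories mirroring the three branches in the flowchart.
type DeliveryOutcome = "success" | "retry" | "failure";

function classifyWebhookStatus(statusCode: number): DeliveryOutcome {
  if (statusCode >= 200 && statusCode <= 299) {
    return "success";
  }
  if (statusCode === 429 || (statusCode >= 500 && statusCode <= 599)) {
    return "retry"; // transient: retried with exponential backoff
  }
  // Assumption: anything else (including other 4xx codes) is permanent.
  return "failure";
}

// Hypothetical in-memory counter standing in for
// pulse_webhook_deliveries_total; every outcome is recorded.
const deliveryCounts: { [outcome: string]: number } = {
  success: 0,
  retry: 0,
  failure: 0,
};

function recordDelivery(statusCode: number): DeliveryOutcome {
  const outcome = classifyWebhookStatus(statusCode);
  deliveryCounts[outcome] += 1;
  return outcome;
}
```

Keeping classification separate from metric recording mirrors the diagram, where all three branches converge on the same delivery-metrics step.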
### Available Variables

View File

@@ -3,12 +3,46 @@
## Overview
Phase2 introduces a scheduler that adapts poll cadence based on freshness, errors, and workload. The goal is to prioritize stale or changing instances while backing off on healthy, idle targets.
```
┌──────────┐      ┌──────────────┐      ┌──────────────┐      ┌─────────────┐
│ PollLoop │─────▶│ Scheduler    │─────▶│ Priority Q   │─────▶│ TaskWorkers │
└──────────┘      └──────────────┘      └──────────────┘      └─────────────┘
     ▲                   │                     │                     │
     │                   └─────► Staleness, metrics, circuit breaker feedback ────┘
```mermaid
flowchart TD
PollLoop["PollLoop\n(ticker & config updates)"]
Scheduler["Scheduler\ncomputes ScheduledTask"]
Staleness["Staleness Tracker\n(last success, freshness score)"]
CircuitBreaker["Circuit Breaker\ntracks failure streaks"]
Backoff["Backoff Policy\nexponential w/ jitter"]
PriorityQ["Priority Queue\nmin-heap by NextRun"]
WorkerPool["TaskWorkers\nN concurrent workers"]
Metrics["Metrics & History\nPrometheus + retention"]
Success["Poll Success"]
Failure{"Poll Failure?"}
Reschedule["Reschedule\n(next interval)"]
BackoffPath["Backoff / Breaker Open"]
DeadLetter["Dead-Letter Queue\noperator review"]
PollLoop --> Scheduler
Staleness --> Scheduler
CircuitBreaker --> Scheduler
Scheduler --> PriorityQ
PriorityQ -->|due task| WorkerPool
WorkerPool --> Failure
WorkerPool -->|result| Metrics
WorkerPool -->|freshness| Staleness
Failure -->|No| Success
Success --> CircuitBreaker
Success --> Reschedule
Success --> Metrics
Reschedule --> Scheduler
Failure -->|Yes| BackoffPath
BackoffPath --> CircuitBreaker
BackoffPath --> Backoff
Backoff --> Scheduler
Backoff --> DeadLetter
DeadLetter -. periodic retry .-> Scheduler
CircuitBreaker -. state change .-> Scheduler
Metrics --> Scheduler
```
- **Scheduler** computes `ScheduledTask` entries using adaptive intervals.
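
The "min-heap by NextRun" queue from the diagram above can be illustrated with a small TypeScript sketch. `ScheduledTaskSketch` and `TaskQueue` are hypothetical names, and the fields are assumptions; Pulse's actual `ScheduledTask` type lives in the Go backend and is not reproduced here.

```typescript
// Assumed minimal task shape: which instance to poll and when it is due.
interface ScheduledTaskSketch {
  instance: string;
  nextRun: number; // epoch milliseconds
}

// Binary min-heap ordered by nextRun, so the earliest task is at index 0.
class TaskQueue {
  private heap: ScheduledTaskSketch[] = [];

  size(): number {
    return this.heap.length;
  }

  push(task: ScheduledTaskSketch): void {
    this.heap.push(task);
    let i = this.heap.length - 1;
    // Sift the new task up until its parent is not later than it.
    while (i > 0) {
      const parentIndex = Math.floor((i - 1) / 2);
      if (this.heap[parentIndex].nextRun <= this.heap[i].nextRun) break;
      this.swap(i, parentIndex);
      i = parentIndex;
    }
  }

  // Removes and returns the earliest task if it is due at `now`.
  popDue(now: number): ScheduledTaskSketch | undefined {
    if (this.heap.length === 0 || this.heap[0].nextRun > now) {
      return undefined;
    }
    const earliest = this.heap[0];
    const last = this.heap.pop() as ScheduledTaskSketch;
    if (this.heap.length > 0) {
      this.heap[0] = last;
      this.siftDown(0);
    }
    return earliest;
  }

  private siftDown(start: number): void {
    let i = start;
    const n = this.heap.length;
    while (true) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < n && this.heap[left].nextRun < this.heap[smallest].nextRun) smallest = left;
      if (right < n && this.heap[right].nextRun < this.heap[smallest].nextRun) smallest = right;
      if (smallest === i) return;
      this.swap(i, smallest);
      i = smallest;
    }
  }

  private swap(a: number, b: number): void {
    const tmp = this.heap[a];
    this.heap[a] = this.heap[b];
    this.heap[b] = tmp;
  }
}
```

A worker loop would call `popDue(Date.now())` repeatedly and hand each returned task to a worker; rescheduling is just another `push` with a later `nextRun`.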
@@ -74,6 +108,18 @@ Exposed via Prometheus (`:9091/metrics`):
| **Open** | ≥3 consecutive failures. Poll suppressed. | Exponential delay (max 5min). |
| **Half-open**| Retry window elapsed. Limited re-attempt. | Success ⇒ closed. Failure ⇒ open. |
```mermaid
stateDiagram-v2
[*] --> Closed: Startup / reset
Closed: Default state\nPolling active\nFailure counter increments
Closed --> Open: ≥3 consecutive failures
Open: Polls suppressed\nScheduler schedules backoff (max 5m)
Open --> HalfOpen: Retry window elapsed
HalfOpen: Single probe allowed\nBreaker watches probe result
HalfOpen --> Closed: Probe success\nReset failure streak & delay
HalfOpen --> Open: Probe failure\nIncrease streak & backoff
```
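
The state machine above can be modelled in a few lines of TypeScript. This is a sketch under stated assumptions, not the Go implementation: `BreakerSketch` and its members are hypothetical names, and doubling the delay on each additional failure is assumed. The threshold of 3 failures, the 5s initial delay, and the 5 minute cap come from the table and the backoff settings in this document.

```typescript
type BreakerState = "closed" | "open" | "half-open";

const FAILURE_THRESHOLD = 3;                 // from the table: >=3 consecutive failures
const BREAKER_INITIAL_DELAY_MS = 5000;       // initial delay: 5s
const BREAKER_MAX_DELAY_MS = 5 * 60 * 1000;  // max 5min

class BreakerSketch {
  state: BreakerState = "closed";
  failureStreak = 0;
  retryAtMs = 0; // time at which a probe becomes allowed

  // Returns true if a poll may run at nowMs.
  // Simplification: does not track whether a probe is already in flight.
  allow(nowMs: number): boolean {
    if (this.state === "open" && nowMs >= this.retryAtMs) {
      this.state = "half-open"; // retry window elapsed
      return true;
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.state = "closed";
    this.failureStreak = 0;
  }

  recordFailure(nowMs: number): void {
    this.failureStreak += 1;
    if (this.state === "half-open" || this.failureStreak >= FAILURE_THRESHOLD) {
      this.state = "open";
      // Assumed policy: delay doubles per failure beyond the threshold.
      const exponent = Math.max(0, this.failureStreak - FAILURE_THRESHOLD);
      const delay = Math.min(
        BREAKER_MAX_DELAY_MS,
        BREAKER_INITIAL_DELAY_MS * Math.pow(2, exponent),
      );
      this.retryAtMs = nowMs + delay;
    }
  }
}
```

Passing the clock in as `nowMs` rather than reading it inside the class keeps the transitions deterministic, which makes the Closed, Open, and Half-open paths easy to exercise in tests.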
Backoff configuration:
- Initial delay: 5s
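
As a rough illustration of "exponential w/ jitter" with these settings, here is a minimal TypeScript sketch. The 5s initial delay and the 5 minute cap are taken from this document; the multiplier of 2, the 20% jitter fraction, and the names `BackoffOptions`, `DEFAULT_BACKOFF`, and `backoffDelayMs` are assumptions made for the example.

```typescript
interface BackoffOptions {
  initialMs: number;
  maxMs: number;
  multiplier: number;     // assumed growth factor
  jitterFraction: number; // assumed: +/- this share of the base delay
}

const DEFAULT_BACKOFF: BackoffOptions = {
  initialMs: 5000,       // initial delay: 5s
  maxMs: 5 * 60 * 1000,  // cap: 5 minutes
  multiplier: 2,
  jitterFraction: 0.2,
};

// attempt is 0-based. `random` is injectable so tests can be deterministic.
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  random: () => number = Math.random,
): number {
  const base = Math.min(
    opts.maxMs,
    opts.initialMs * Math.pow(opts.multiplier, attempt),
  );
  // Map random() from [0, 1) onto a symmetric offset around the base delay.
  const jitter = base * opts.jitterFraction * (2 * random() - 1);
  return Math.min(opts.maxMs, Math.max(0, Math.round(base + jitter)));
}
```

Jitter spreads retries from many failing instances over time so they do not all hit the same node at the same moment; the final clamp keeps the jittered value within the configured maximum.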

View File

@@ -9,6 +9,41 @@
- Limiters: ~12 requests/minute per UID (burst 2), per-UID concurrency 2, global concurrency 8, 2s penalty on validation failures
## Monitoring Alerts & Response
```mermaid
sequenceDiagram
participant Backend as Pulse Backend
participant Proxy as Sensor Proxy RPC Server
participant Limiter as Limiter (per UID & global)
participant Validator as Payload Validator
participant SSH as Cluster Node (forced `sensors -j`)
participant Metrics as Metrics & Audit Log
Backend->>Proxy: RPC request (get_temperature)
Proxy->>Proxy: Extract SO_PEERCRED (UID/GID/PID)
Proxy->>Limiter: Check per-UID rate & concurrency
alt Rate limit exceeded
Limiter-->>Proxy: reject
Proxy-->>Backend: 429 Too Many Requests (2 s penalty)
Proxy->>Metrics: increment limiter_rejections_total
else Allowed
Limiter-->>Proxy: permit
Proxy->>Validator: Validate method & payload
alt Validation failure
Validator-->>Proxy: error
Proxy-->>Backend: 400 validation error
Proxy->>Metrics: penalty + audit log entry
else Valid request
Validator-->>Proxy: ok
Proxy->>SSH: run `sensors -j` via forced command
SSH-->>Proxy: temperature JSON
Proxy-->>Backend: telemetry payload
Proxy->>Metrics: record success, latency histogram
Proxy->>Metrics: append audit trail entry
end
end
```
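
The per-UID rate check in the sequence above behaves like a token bucket. The TypeScript sketch below models only the "~12 requests/minute per UID (burst 2)" limit; concurrency caps and the 2s penalty are left out, and `PerUidLimiter` and `BucketState` are hypothetical names rather than the proxy's actual types.

```typescript
interface BucketState {
  tokens: number;
  lastRefillMs: number;
}

class PerUidLimiter {
  private buckets: { [uid: string]: BucketState } = {};
  private ratePerMs: number;
  private burst: number;

  constructor(requestsPerMinute: number, burst: number) {
    this.ratePerMs = requestsPerMinute / 60000;
    this.burst = burst;
  }

  // Returns true if the request from `uid` is permitted at nowMs.
  allow(uid: number, nowMs: number): boolean {
    const key = String(uid);
    let bucket = this.buckets[key];
    if (bucket === undefined) {
      // A new caller starts with a full bucket.
      bucket = { tokens: this.burst, lastRefillMs: nowMs };
      this.buckets[key] = bucket;
    }
    // Refill in proportion to elapsed time, never beyond the burst size.
    const elapsed = Math.max(0, nowMs - bucket.lastRefillMs);
    bucket.tokens = Math.min(this.burst, bucket.tokens + elapsed * this.ratePerMs);
    bucket.lastRefillMs = nowMs;
    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return true;
    }
    return false; // caller would receive a 429 in the flow above
  }
}
```

With 12 requests per minute, one token is restored roughly every 5 seconds, so a caller that exhausts its burst of 2 has to wait about that long before the next request is admitted. Each UID has its own bucket, so one noisy caller does not consume another caller's allowance.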
### Rate Limit Hits (`pulse_proxy_limiter_rejections_total`)
1. Check audit log entries tagged `limiter.rejection` for offending UID.
2. Confirm workload legitimacy; if expected, consider increasing limits via config override.

View File

@@ -31,6 +31,38 @@ dist/ # Generated bundled scripts
└── install-*.sh # Ready for distribution
```
### Development & Bundling Workflow
```mermaid
flowchart TD
Author["Author Code\nscripts/lib/*.sh\nscripts/install-*.sh"]
WriteTests["Write Tests\nscripts/tests/test-*.sh\nscripts/tests/integration/"]
UpdateManifest["Update Bundle Manifest\nscripts/bundle.manifest"]
RunTests["Run Tests\nmake test-scripts\nscripts/tests/run.sh"]
TestPass{"Tests Pass?"}
FixCode["Fix Issues"]
Bundle["Bundle Scripts\nmake bundle-scripts\nbash scripts/bundle.sh"]
ValidateBundled["Validate Bundled Output\nbash -n dist/*.sh\ndist/*.sh --dry-run"]
ValidatePass{"Validation\nPass?"}
Distribute["Distribute\ndist/*.sh ready"]
UpdateDocs["Update Documentation\nscripts/lib/README.md"]
Author --> WriteTests
WriteTests --> UpdateManifest
UpdateManifest --> RunTests
RunTests --> TestPass
TestPass -->|No| FixCode
FixCode --> Author
TestPass -->|Yes| Bundle
Bundle --> ValidateBundled
ValidateBundled --> ValidatePass
ValidatePass -->|No| FixCode
ValidatePass -->|Yes| UpdateDocs
UpdateDocs --> Distribute
```
This workflow emphasizes the library's modular design: develop reusable modules in `scripts/lib`, test thoroughly, bundle for distribution, and validate bundled artifacts before release.
## 3. Using the Library in Your Script
```bash

View File

@@ -44,6 +44,8 @@ import { DockerIcon } from '@/components/icons/DockerIcon';
import { AlertsIcon } from '@/components/icons/AlertsIcon';
import { SettingsGearIcon } from '@/components/icons/SettingsGearIcon';
import { TokenRevealDialog } from './components/TokenRevealDialog';
import { ActivationBanner } from './components/Alerts/ActivationBanner';
import { useAlertsActivation } from './stores/alertsActivation';
// Enhanced store type with proper typing
type EnhancedStore = ReturnType<typeof getGlobalWebSocketStore>;
@@ -88,6 +90,7 @@ function App() {
: getGlobalWebSocketStore();
return store || getGlobalWebSocketStore();
};
const alertsActivation = useAlertsActivation();
const fallbackState: State = {
nodes: [],
@@ -183,6 +186,11 @@ function App() {
}
});
onMount(() => {
void alertsActivation.refreshConfig();
void alertsActivation.refreshActiveAlerts();
});
// No longer need tab state management - using router now
// Version info
@@ -559,6 +567,15 @@ function App() {
<DarkModeContext.Provider value={darkMode}>
<SecurityWarning />
<DemoBanner />
<ActivationBanner
activationState={alertsActivation.activationState}
activeAlerts={alertsActivation.activeAlerts}
config={alertsActivation.config}
isPastObservationWindow={alertsActivation.isPastObservationWindow}
isLoading={alertsActivation.isLoading}
refreshActiveAlerts={alertsActivation.refreshActiveAlerts}
activate={alertsActivation.activate}
/>
<UpdateBanner />
<LegacySSHBanner />
<div class="min-h-screen bg-gray-100 dark:bg-gray-900 text-gray-800 dark:text-gray-200 font-sans py-4 sm:py-6">

View File

@@ -58,6 +58,12 @@ export class AlertsAPI {
});
}
static async activate(): Promise<{ success: boolean; state: string; activationTime?: string }> {
return apiFetchJSON(`${this.baseUrl}/activate`, {
method: 'POST',
});
}
static async clearAlert(alertId: string): Promise<{ success: boolean }> {
return apiFetchJSON(`${this.baseUrl}/${encodeURIComponent(alertId)}/clear`, {
method: 'POST',

View File

@@ -0,0 +1,117 @@
import { Show, createEffect, createMemo, createSignal } from 'solid-js';
import type { JSX } from 'solid-js';
import type { Alert } from '@/types/api';
import type { ActivationState, AlertConfig } from '@/types/alerts';
import { ActivationModal } from './ActivationModal';
interface ActivationBannerProps {
activationState: () => ActivationState | null;
activeAlerts: () => Alert[] | undefined;
config: () => AlertConfig | null;
isPastObservationWindow: () => boolean;
isLoading: () => boolean;
refreshActiveAlerts: () => Promise<void>;
activate: () => Promise<boolean>;
}
export function ActivationBanner(props: ActivationBannerProps): JSX.Element {
const [isModalOpen, setIsModalOpen] = createSignal(false);
const shouldShow = createMemo(() => {
const state = props.activationState();
return state === 'pending_review' || state === 'snoozed';
});
createEffect(() => {
// Close the modal automatically if activation becomes active while it is open
if (!shouldShow() && isModalOpen()) {
setIsModalOpen(false);
}
});
const violationCount = createMemo(() => props.activeAlerts()?.length ?? 0);
const observationSummary = createMemo(() => {
const count = violationCount();
if (count <= 0) {
return 'No alert violations detected during observation yet.';
}
const label = count === 1 ? 'issue' : 'issues';
return `${count} ${label} detected during observation.`;
});
const handleReview = async () => {
await props.refreshActiveAlerts();
setIsModalOpen(true);
};
const handleActivated = async () => {
await props.refreshActiveAlerts();
};
return (
<>
<Show when={shouldShow()}>
<div class="bg-blue-50 dark:bg-blue-900/30 border-b border-blue-200 dark:border-blue-800 text-blue-900 dark:text-blue-100 relative animate-slideDown">
<div class="px-4 py-2">
<div class="flex flex-col gap-3 sm:flex-row sm:items-center sm:justify-between">
<div class="flex items-start gap-3">
<svg
class="w-5 h-5 flex-shrink-0 text-blue-600 dark:text-blue-300 mt-0.5"
viewBox="0 0 24 24"
fill="none"
stroke="currentColor"
stroke-width="2"
>
<path
d="M12 22c1.1 0 2-.9 2-2h-4c0 1.1.9 2 2 2z"
stroke-linecap="round"
stroke-linejoin="round"
/>
<path
d="M18 16v-5a6 6 0 1 0-12 0v5l-2 2h16l-2-2z"
stroke-linecap="round"
stroke-linejoin="round"
/>
</svg>
<div class="space-y-1">
<p class="text-sm font-medium">
Monitoring is live; notifications will start after you review settings.
</p>
<p class="text-xs text-blue-700 dark:text-blue-200">{observationSummary()}</p>
<Show when={props.isPastObservationWindow()}>
<p class="text-xs font-semibold text-blue-800 dark:text-blue-100">
24h observation ending; activate to start notifications.
</p>
</Show>
</div>
</div>
<div class="flex items-center gap-2">
<button
type="button"
class="inline-flex items-center justify-center px-3 py-1.5 text-sm font-medium rounded-md bg-blue-600 hover:bg-blue-700 text-white transition-colors disabled:opacity-60 disabled:cursor-not-allowed"
onClick={handleReview}
disabled={props.isLoading()}
>
{props.isLoading() ? 'Loading…' : 'Review & Activate'}
</button>
</div>
</div>
</div>
</div>
</Show>
<ActivationModal
isOpen={isModalOpen()}
onClose={() => setIsModalOpen(false)}
onActivated={handleActivated}
config={props.config}
activeAlerts={props.activeAlerts}
isLoading={props.isLoading}
activate={props.activate}
refreshActiveAlerts={props.refreshActiveAlerts}
/>
</>
);
}

View File

@@ -0,0 +1,354 @@
import { For, Show, createMemo, createSignal } from 'solid-js';
import { Portal } from 'solid-js/web';
import { useNavigate } from '@solidjs/router';
import type { JSX } from 'solid-js';
import type { Alert } from '@/types/api';
import type { AlertConfig, AlertThresholds, HysteresisThreshold } from '@/types/alerts';
import { showError, showSuccess } from '@/utils/toast';
interface ActivationModalProps {
isOpen: boolean;
onClose: () => void;
onActivated?: () => Promise<void> | void;
config: () => AlertConfig | null;
activeAlerts: () => Alert[] | undefined;
isLoading: () => boolean;
activate: () => Promise<boolean>;
refreshActiveAlerts: () => Promise<void>;
}
interface ThresholdSummary {
heading: string;
items: Array<{ label: string; value: string }>;
}
const extractTrigger = (
threshold?: HysteresisThreshold | number,
legacy?: number,
): number | undefined => {
if (typeof threshold === 'number') {
return threshold;
}
if (threshold && typeof threshold === 'object' && typeof threshold.trigger === 'number') {
return threshold.trigger;
}
if (typeof legacy === 'number') {
return legacy;
}
return undefined;
};
const formatThreshold = (value: number | undefined): string => {
if (value === undefined || Number.isNaN(value)) {
return 'Not configured';
}
if (value <= 0) {
return 'Disabled';
}
return `${value}%`;
};
const summarizeThresholds = (config: AlertConfig | null): ThresholdSummary[] => {
if (!config) {
return [];
}
const summarize = (thresholds?: AlertThresholds): Array<{ label: string; value: string }> => {
if (!thresholds) return [];
return [
{
label: 'CPU',
value: formatThreshold(extractTrigger(thresholds.cpu, thresholds.cpuLegacy)),
},
{
label: 'Memory',
value: formatThreshold(extractTrigger(thresholds.memory, thresholds.memoryLegacy)),
},
{
label: 'Disk',
value: formatThreshold(extractTrigger(thresholds.disk, thresholds.diskLegacy)),
},
];
};
const guestItems = summarize(config.guestDefaults);
const nodeItems = summarize(config.nodeDefaults);
const storageValue = formatThreshold(extractTrigger(config.storageDefault));
const summaries: ThresholdSummary[] = [];
if (guestItems.length > 0) {
summaries.push({ heading: 'Guest thresholds', items: guestItems });
}
if (nodeItems.length > 0) {
const nodeWithTemperature = [
...nodeItems,
{
label: 'Temperature',
value: formatThreshold(extractTrigger(config.nodeDefaults?.temperature)),
},
];
summaries.push({ heading: 'Node thresholds', items: nodeWithTemperature });
}
summaries.push({
heading: 'Storage',
items: [
{
label: 'Usage',
value: storageValue,
},
],
});
return summaries;
};
const getChannelSummary = (config: AlertConfig | null): { status: 'configured' | 'missing'; message: string } => {
if (!config || !config.notifications) {
return {
status: 'missing',
message: 'Notification channels are not configured yet. Configure email or webhook destinations before activation.',
};
}
const emailConfigured = Boolean(config.notifications.email?.server);
const webhookConfigured = Boolean(config.notifications.webhooks?.some((hook) => hook.enabled));
if (!emailConfigured && !webhookConfigured) {
return {
status: 'missing',
message: 'Notification channels are not configured yet. Configure email or webhook destinations before activation.',
};
}
if (emailConfigured && webhookConfigured) {
return {
status: 'configured',
message: 'Email and webhook destinations are ready. You can fine-tune them under Notification Destinations.',
};
}
if (emailConfigured) {
return {
status: 'configured',
message: 'Email notifications are configured. Add additional webhook destinations if needed.',
};
}
return {
status: 'configured',
message: 'Webhook notifications are configured. Add email fallbacks if needed.',
};
};
export function ActivationModal(props: ActivationModalProps): JSX.Element {
const navigate = useNavigate();
const [isSubmitting, setIsSubmitting] = createSignal(false);
const thresholdSummaries = createMemo(() => summarizeThresholds(props.config()));
const violations = createMemo(() => props.activeAlerts() ?? []);
const violationCount = createMemo(() => violations().length);
const channelSummary = createMemo(() => getChannelSummary(props.config()));
const observationHours = createMemo(() => props.config()?.observationWindowHours ?? 24);
const handleActivate = async () => {
if (isSubmitting()) {
return;
}
setIsSubmitting(true);
try {
const success = await props.activate();
if (success) {
await props.refreshActiveAlerts();
showSuccess('Alert notifications activated. Notifications will now dispatch to configured destinations.');
if (props.onActivated) {
await props.onActivated();
}
props.onClose();
} else {
showError('Failed to activate alert notifications. Please try again.');
}
} finally {
// Always release the submit lock, even if a callback above throws
setIsSubmitting(false);
}
};
const handleNavigateDestinations = () => {
props.onClose();
navigate('/alerts/destinations');
};
return (
<Show when={props.isOpen}>
<Portal>
<div class="fixed inset-0 z-50 flex items-center justify-center p-4">
<div class="absolute inset-0 bg-black/50 dark:bg-black/60" onClick={props.onClose} />
<div class="relative bg-white dark:bg-gray-800 rounded-lg shadow-xl max-w-3xl w-full max-h-[90vh] overflow-hidden border border-gray-200 dark:border-gray-700">
<div class="px-6 py-4 border-b border-gray-200 dark:border-gray-700 flex items-center justify-between">
<div>
<h2 class="text-lg font-semibold text-gray-900 dark:text-gray-100">Review alerts before activating</h2>
<p class="text-sm text-gray-600 dark:text-gray-400">
Monitoring is already running. Confirm thresholds and destinations before enabling notifications.
</p>
</div>
<button
type="button"
class="p-1.5 rounded-md hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-500 dark:text-gray-400 transition-colors"
onClick={props.onClose}
aria-label="Close activation review"
>
<svg class="w-4 h-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<line x1="18" y1="6" x2="6" y2="18" />
<line x1="6" y1="6" x2="18" y2="18" />
</svg>
</button>
</div>
<div class="px-6 py-5 space-y-6 overflow-y-auto">
<section>
<h3 class="text-sm font-semibold text-gray-800 dark:text-gray-200 uppercase tracking-wide">
Current thresholds
</h3>
<p class="text-xs text-gray-500 dark:text-gray-400 mt-1">
Thresholds determine when alerts fire. Adjust them under Alert Thresholds if needed before activating.
</p>
<div class="mt-4 grid gap-4 sm:grid-cols-2">
<For each={thresholdSummaries()}>
{(section) => (
<div class="rounded-md border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800/60 p-3">
<h4 class="text-xs font-semibold text-gray-700 dark:text-gray-300 uppercase">
{section.heading}
</h4>
<ul class="mt-2 space-y-1">
<For each={section.items}>
{(item) => (
<li class="flex items-center justify-between text-sm text-gray-700 dark:text-gray-300">
<span>{item.label}</span>
<span class="font-medium text-gray-900 dark:text-gray-100">{item.value}</span>
</li>
)}
</For>
</ul>
</div>
)}
</For>
</div>
</section>
<section>
<div class="flex items-center justify-between">
<h3 class="text-sm font-semibold text-gray-800 dark:text-gray-200 uppercase tracking-wide">
Issues detected
</h3>
<span class="text-xs text-gray-500 dark:text-gray-400">
Observation window: {observationHours()}h
</span>
</div>
<p class="text-xs text-gray-500 dark:text-gray-400 mt-1">
{violationCount() > 0
? 'These alerts are currently open. Activating notifications will send them to configured channels.'
: 'No alerts have breached thresholds yet. Activation will notify you immediately when new issues appear.'}
</p>
<Show
when={violationCount() > 0}
fallback={
<div class="mt-4 rounded-md border border-dashed border-gray-300 dark:border-gray-600 bg-gray-50 dark:bg-gray-900/30 p-4 text-sm text-gray-600 dark:text-gray-400">
No active violations detected during the observation window.
</div>
}
>
<div class="mt-4 space-y-3">
<For each={violations()}>
{(alert) => (
<div
class={`border rounded-md p-3 text-sm transition-colors ${
alert.level === 'critical'
? 'border-red-300 dark:border-red-700 bg-red-50 dark:bg-red-900/20'
: 'border-yellow-300 dark:border-yellow-700 bg-yellow-50 dark:bg-yellow-900/20'
}`}
>
<div class="flex items-center justify-between">
<div class="flex items-center gap-2">
<span
class={`px-2 py-0.5 rounded-full text-xs font-semibold uppercase ${
alert.level === 'critical'
? 'bg-red-600 text-white'
: 'bg-yellow-500 text-gray-900'
}`}
>
{alert.level}
</span>
<span class="font-medium text-gray-800 dark:text-gray-100">
{alert.resourceName || alert.resourceId}
</span>
</div>
<span class="text-xs text-gray-600 dark:text-gray-300">{alert.type}</span>
</div>
<p class="mt-2 text-xs text-gray-600 dark:text-gray-300">{alert.message}</p>
<p class="mt-1 text-xs text-gray-500 dark:text-gray-400">
Threshold {alert.threshold}% Current {alert.value}% Since{' '}
{new Date(alert.startTime).toLocaleString()}
</p>
</div>
)}
</For>
</div>
</Show>
</section>
<section>
<h3 class="text-sm font-semibold text-gray-800 dark:text-gray-200 uppercase tracking-wide">
Notification channels
</h3>
<div
class={`mt-3 rounded-md border p-4 ${
channelSummary().status === 'configured'
? 'border-green-200 dark:border-green-700 bg-green-50 dark:bg-green-900/20'
: 'border-blue-200 dark:border-blue-700 bg-blue-50 dark:bg-blue-900/20'
}`}
>
<p class="text-sm text-gray-800 dark:text-gray-100">{channelSummary().message}</p>
<button
type="button"
class="mt-3 inline-flex items-center gap-1 text-sm font-medium text-blue-600 dark:text-blue-300 hover:text-blue-700 dark:hover:text-blue-200 transition-colors"
onClick={handleNavigateDestinations}
>
Open Notification Destinations
<svg class="w-3.5 h-3.5" viewBox="0 0 20 20" fill="currentColor">
<path d="M12.293 2.293a1 1 0 011.414 0l4 4a1 1 0 010 1.414l-8 8a1 1 0 01-.497.263l-4 1a1 1 0 01-1.213-1.213l1-4a1 1 0 01.263-.497l8-8z" />
</svg>
</button>
</div>
</section>
</div>
<div class="px-6 py-4 border-t border-gray-200 dark:border-gray-700 bg-gray-50 dark:bg-gray-900/40 flex flex-col gap-3 sm:flex-row sm:items-center sm:justify-between">
<p class="text-xs text-gray-600 dark:text-gray-400">
You can snooze alerts later if you need a quiet period.
</p>
<div class="flex items-center gap-2">
<button
type="button"
class="px-4 py-2 text-sm font-medium text-gray-700 dark:text-gray-200 bg-white dark:bg-gray-800 border border-gray-300 dark:border-gray-600 rounded-md hover:bg-gray-100 dark:hover:bg-gray-700 transition-colors"
onClick={props.onClose}
>
Not now
</button>
<button
type="button"
class="inline-flex items-center justify-center px-4 py-2 text-sm font-semibold rounded-md bg-blue-600 hover:bg-blue-700 text-white transition-colors disabled:opacity-60 disabled:cursor-not-allowed"
onClick={handleActivate}
disabled={isSubmitting() || props.isLoading()}
>
{isSubmitting() || props.isLoading() ? 'Activating…' : 'Activate Notifications'}
</button>
</div>
</div>
</div>
</div>
</Portal>
</Show>
);
}

View File

@@ -0,0 +1,96 @@
import { createSignal } from 'solid-js';
import { AlertsAPI } from '@/api/alerts';
import type { AlertConfig, ActivationState as ActivationStateType } from '@/types/alerts';
import type { Alert } from '@/types/api';
// Create signals for activation state
const [config, setConfig] = createSignal<AlertConfig | null>(null);
const [activationState, setActivationState] = createSignal<ActivationStateType | null>(null);
const [isLoading, setIsLoading] = createSignal(false);
const [activeAlerts, setActiveAlerts] = createSignal<Alert[]>([]);
const [lastError, setLastError] = createSignal<string | null>(null);
// Refresh config from API
const refreshConfig = async (): Promise<void> => {
try {
setIsLoading(true);
setLastError(null);
const alertConfig = await AlertsAPI.getConfig();
setConfig(alertConfig);
setActivationState(alertConfig.activationState || 'active');
} catch (error) {
console.error('Failed to fetch alert config:', error);
setLastError(error instanceof Error ? error.message : 'Unknown error');
} finally {
setIsLoading(false);
}
};
// Fetch active alerts (for violation count)
const refreshActiveAlerts = async (): Promise<void> => {
try {
const alerts = await AlertsAPI.getActive();
setActiveAlerts(alerts);
} catch (error) {
console.error('Failed to fetch active alerts:', error);
// Don't set error state for this - it's not critical
}
};
// Activate alert notifications
const activate = async (): Promise<boolean> => {
try {
setIsLoading(true);
setLastError(null);
const result = await AlertsAPI.activate();
if (result.success) {
// Refresh config to get updated state
await refreshConfig();
return true;
}
return false;
} catch (error) {
console.error('Failed to activate alerts:', error);
setLastError(error instanceof Error ? error.message : 'Unknown error');
return false;
} finally {
setIsLoading(false);
}
};
// Check if past observation window
const isPastObservationWindow = (): boolean => {
const cfg = config();
if (!cfg || !cfg.activationTime || !cfg.observationWindowHours) {
return false;
}
const activationTime = new Date(cfg.activationTime);
const windowMs = cfg.observationWindowHours * 60 * 60 * 1000;
const expiryTime = activationTime.getTime() + windowMs;
return Date.now() > expiryTime;
};
// Export the store
export const useAlertsActivation = () => ({
// Signals
config,
activationState,
isLoading,
activeAlerts,
lastError,
// Computed
isPastObservationWindow,
// Actions
refreshConfig,
refreshActiveAlerts,
activate,
});
// Initialize on module load
refreshConfig();
refreshActiveAlerts();

View File

@@ -97,8 +97,13 @@ export interface BackupAlertConfig {
criticalDays: number;
}
export type ActivationState = 'pending_review' | 'active' | 'snoozed';
export interface AlertConfig {
enabled: boolean;
activationState?: ActivationState;
observationWindowHours?: number;
activationTime?: string;
guestDefaults: AlertThresholds;
nodeDefaults: AlertThresholds;
storageDefault: HysteresisThreshold;

View File

@@ -24,6 +24,15 @@ const (
AlertLevelCritical AlertLevel = "critical"
)
// ActivationState represents the alert notification activation state
type ActivationState string
const (
ActivationPending ActivationState = "pending_review"
ActivationActive ActivationState = "active"
ActivationSnoozed ActivationState = "snoozed"
)
func normalizePoweredOffSeverity(level AlertLevel) AlertLevel {
switch strings.ToLower(string(level)) {
case string(AlertLevelCritical):
@@ -309,6 +318,9 @@ type GuestLookup struct {
// AlertConfig represents the complete alert configuration
type AlertConfig struct {
Enabled bool `json:"enabled"`
ActivationState ActivationState `json:"activationState,omitempty"`
ObservationWindowHours int `json:"observationWindowHours,omitempty"`
ActivationTime *time.Time `json:"activationTime,omitempty"`
GuestDefaults ThresholdConfig `json:"guestDefaults"`
NodeDefaults ThresholdConfig `json:"nodeDefaults"`
StorageDefault HysteresisThreshold `json:"storageDefault"`
@@ -455,7 +467,9 @@ func NewManager() *Manager {
pmgAnomalyTrackers: make(map[string]*pmgAnomalyTracker),
ackState: make(map[string]ackRecord),
config: AlertConfig{
Enabled: true,
Enabled: true,
ActivationState: ActivationPending,
ObservationWindowHours: 24,
GuestDefaults: ThresholdConfig{
PoweredOffSeverity: AlertLevelWarning,
CPU: &HysteresisThreshold{Trigger: 80, Clear: 75},
@@ -615,6 +629,15 @@ func (m *Manager) dispatchAlert(alert *Alert, async bool) bool {
return false
}
// Check activation state - only dispatch notifications if active
if m.config.ActivationState != ActivationActive {
log.Debug().
Str("alertID", alert.ID).
Str("activationState", string(m.config.ActivationState)).
Msg("Alert notification suppressed - not activated")
return false
}
if suppressed, reason := m.shouldSuppressNotification(alert); suppressed {
log.Debug().
Str("alertID", alert.ID).
@@ -783,6 +806,27 @@ func (m *Manager) UpdateConfig(config AlertConfig) {
config.GuestDefaults.PoweredOffSeverity = normalizePoweredOffSeverity(config.GuestDefaults.PoweredOffSeverity)
config.NodeDefaults.PoweredOffSeverity = normalizePoweredOffSeverity(config.NodeDefaults.PoweredOffSeverity)
// Migration logic for activation state (backward compatibility)
if config.ObservationWindowHours <= 0 {
config.ObservationWindowHours = 24
}
if config.ActivationState == "" {
// Determine if this is an existing installation or new
// Existing installations have active alerts already
isExistingInstall := len(m.activeAlerts) > 0 || len(config.Overrides) > 0
if isExistingInstall {
// Existing install: auto-activate to preserve behavior
config.ActivationState = ActivationActive
now := time.Now()
config.ActivationTime = &now
log.Info().Msg("Migrating existing installation to active alert state")
} else {
// New install: start in pending review
config.ActivationState = ActivationPending
log.Info().Msg("New installation: alerts pending activation")
}
}
m.config = config
for id, override := range m.config.Overrides {
override.PoweredOffSeverity = normalizePoweredOffSeverity(override.PoweredOffSeverity)
@@ -6548,17 +6592,15 @@ func (m *Manager) LoadActiveAlerts() error {
// Only notify for alerts that started recently (within last 2 hours) to avoid spam
if alert.Level == AlertLevelCritical && now.Sub(alert.StartTime) < 2*time.Hour {
// Use a goroutine and add a small delay to avoid notification spam on startup
if m.onAlert != nil {
alertCopy := alert.Clone()
go func(a *Alert) {
time.Sleep(10 * time.Second) // Wait for system to stabilize after restart
log.Info().
Str("alertID", a.ID).
Str("resource", a.ResourceName).
Msg("Sending notification for restored critical alert")
m.onAlert(a)
}(alertCopy)
}
alertCopy := alert.Clone()
go func(a *Alert) {
time.Sleep(10 * time.Second) // Wait for system to stabilize after restart
log.Info().
Str("alertID", a.ID).
Str("resource", a.ResourceName).
Msg("Attempting to send notification for restored critical alert")
m.dispatchAlert(a, false) // Use dispatchAlert to respect activation state and quiet hours
}(alertCopy)
}
}

View File

@@ -103,6 +103,50 @@ func (h *AlertHandlers) UpdateAlertConfig(w http.ResponseWriter, r *http.Request
}
}
// ActivateAlerts activates alert notifications
func (h *AlertHandlers) ActivateAlerts(w http.ResponseWriter, r *http.Request) {
// Get current config
config := h.monitor.GetAlertManager().GetConfig()
// Check if already active
if config.ActivationState == alerts.ActivationActive {
if err := utils.WriteJSONResponse(w, map[string]interface{}{
"success": true, // the frontend's AlertsAPI.activate() result is checked via result.success
"status": "success",
"message": "Alerts already activated",
"state": string(config.ActivationState),
}); err != nil {
log.Error().Err(err).Msg("Failed to write activate response")
}
return
}
// Activate notifications
now := time.Now()
config.ActivationState = alerts.ActivationActive
config.ActivationTime = &now
// Update config
h.monitor.GetAlertManager().UpdateConfig(config)
// Save to persistent storage
if err := h.monitor.GetConfigPersistence().SaveAlertConfig(config); err != nil {
log.Error().Err(err).Msg("Failed to save alert configuration after activation")
http.Error(w, "Failed to save configuration", http.StatusInternalServerError)
return
}
log.Info().Msg("Alert notifications activated")
if err := utils.WriteJSONResponse(w, map[string]interface{}{
"success": true, // the frontend's AlertsAPI.activate() result is checked via result.success
"status": "success",
"message": "Alert notifications activated",
"state": string(config.ActivationState),
"activationTime": config.ActivationTime,
}); err != nil {
log.Error().Err(err).Msg("Failed to write activate response")
}
}
// GetActiveAlerts returns all active alerts
func (h *AlertHandlers) GetActiveAlerts(w http.ResponseWriter, r *http.Request) {
alerts := h.monitor.GetAlertManager().GetActiveAlerts()
@@ -619,6 +663,8 @@ func (h *AlertHandlers) HandleAlerts(w http.ResponseWriter, r *http.Request) {
h.GetAlertConfig(w, r)
case path == "config" && r.Method == http.MethodPut:
h.UpdateAlertConfig(w, r)
case path == "activate" && r.Method == http.MethodPost:
h.ActivateAlerts(w, r)
case path == "active" && r.Method == http.MethodGet:
h.GetActiveAlerts(w, r)
case path == "history" && r.Method == http.MethodGet: