mirror of https://github.com/rcourtman/Pulse.git synced 2026-02-18 00:17:39 +01:00

Files

rcourtman bb7ca93c18 feat: Add mdadm RAID monitoring support for host agents

Implements comprehensive mdadm RAID array monitoring for Linux hosts
via pulse-host-agent. Arrays are automatically detected and monitored
with real-time status updates, rebuild progress tracking, and automatic
alerting for degraded or failed arrays.

Key changes:

**Backend:**
- Add mdadm package for parsing mdadm --detail output
- Extend host agent report structure with RAID array data
- Integrate mdadm collection into host agent (Linux-only, best-effort)
- Add RAID array processing in monitoring system
- Implement automatic alerting:
  - Critical alerts for degraded arrays or arrays with failed devices
  - Warning alerts for rebuilding/resyncing arrays with progress tracking
  - Auto-clear alerts when arrays return to healthy state

**Frontend:**
- Add TypeScript types for RAID arrays and devices
- Display RAID arrays in host details drawer with:
  - Array status (clean/degraded/recovering) with color-coded indicators
  - Device counts (active/total/failed/spare)
  - Rebuild progress percentage and speed when applicable
  - Green for healthy, amber for rebuilding, red for degraded

**Documentation:**
- Document mdadm monitoring feature in HOST_AGENT.md
- Explain requirements (Linux, mdadm installed, root access)
- Clarify scope (software RAID only, hardware RAID not supported)

**Testing:**
- Add comprehensive tests for mdadm output parsing
- Test parsing of healthy, degraded, and rebuilding arrays
- Verify proper extraction of device states and rebuild progress

All builds pass successfully. RAID monitoring is automatic and best-effort
- if mdadm is not installed or no arrays exist, host agent continues
reporting other metrics normally.

Related to #676

2025-11-09 16:36:33 +00:00

9.7 KiB

Raw Blame History

Pulse Host Agent

The Pulse host agent extends monitoring to standalone servers that do not expose Proxmox or Docker APIs. With it you can surface uptime, OS metadata, CPU load, memory/disk utilisation, temperature sensors, and connection health for any Linux, macOS, or Windows machine alongside the rest of your infrastructure. Starting in v4.26.0 the installer handshakes with Pulse in real time so you can confirm registration from the UI and receive host-agent alerts alongside your existing Docker/Proxmox notifications.

Temperature Monitoring

The host agent automatically collects temperature data on Linux systems with lm-sensors installed:

CPU Package Temperature: Overall CPU temperature
Per-Core Temperatures: Individual CPU core readings
NVMe Drive Temperatures: SSD thermal data
GPU Temperatures: AMD and NVIDIA GPU sensors

Temperature data appears in the Hosts tab alongside other host metrics. This is particularly useful for monitoring Proxmox hosts when running Pulse in a VM (where the sensor proxy socket cannot cross VM boundaries).

Requirements:

Linux operating system
lm-sensors package installed (apt-get install lm-sensors)
Sensors configured (sensors-detect --auto)

Temperature collection is automatic and best-effort. If lm-sensors is not installed or sensors are unavailable, the agent continues reporting other metrics normally.

RAID Monitoring (mdadm)

The host agent automatically collects mdadm RAID array information on Linux systems:

Array Status: Displays array state (clean, degraded, recovering, resyncing)
Device Health: Shows active, failed, and spare device counts
Rebuild Progress: Real-time rebuild or resync percentage and speed
Automatic Alerting: Critical alerts for degraded arrays, warnings during rebuilds

RAID data appears in the Hosts tab when you expand a host's details. Each array shows its RAID level, current state, and device status. Color-coded indicators highlight:

Green: Healthy arrays (clean state, no failed devices)
Amber: Rebuilding or resyncing arrays
Red: Degraded arrays or arrays with failed devices

Requirements:

Linux operating system
mdadm installed and configured
Root or sudo access for the host agent

RAID monitoring is automatic and best-effort. If mdadm is not installed or no arrays are present, the agent continues reporting other metrics normally. Only Linux software RAID (mdadm) is supported; hardware RAID controllers are not monitored.

Prerequisites

Pulse v4.26.0 or newer (host agent reporting shipped with /api/agents/host/report)
An API token with the host-agent:report scope (create under Settings → Security)
Outbound HTTP/HTTPS connectivity from the host back to Pulse

ℹ️ The agent only initiates outbound connections; no inbound firewall rules are required.

If your Pulse instance does not require API tokens (e.g. during an on-premises lab install) you can still generate commands without embedding a credential. Confirm the warning in Settings → Agents → Host agents and the script will prompt for a token instead of hard-coding one.

Quick Start

Replace <api-token> with a Pulse API token limited to the host-agent:report scope. Tokens generated from Settings → Agents → Host agents already apply this scope.

Linux

The hosted installer handles systemd, rc.local environments, and Unraid automatically.

curl -fsSL http://pulse.example.local:7655/install-host-agent.sh | \
  bash -s -- --url http://pulse.example.local:7655 --token <api-token>

On systemd machines the script installs the binary, wires up /etc/systemd/system/pulse-host-agent.service, enables it, and tails the registration status.
On Unraid hosts it starts the agent under nohup, creates /var/log/pulse, and (optionally) inserts the auto-start line into /boot/config/go.
On minimalist distros without systemd (e.g. Alpine) it creates/updates /etc/rc.local, adds the background runner, and verifies it launches.

Use --force to skip interactive prompts or --interval 1m to change the polling cadence.

macOS

curl -fsSL http://pulse.example.local:7655/install-host-agent.sh | \
  bash -s -- --url http://pulse.example.local:7655 --token <api-token>

On macOS the installer stores the token in the Keychain when possible, generates a launchd plist inside ~/Library/LaunchAgents, and restarts the job so the agent survives logouts and reboots.

Windows

Run the PowerShell bootstrapper as an administrator:

irm http://pulse.example.local:7655/install-host-agent.ps1 | iex

Set PULSE_URL and PULSE_TOKEN in the environment first for a non-interactive flow:

$env:PULSE_URL    = "http://pulse.example.local:7655"
$env:PULSE_TOKEN  = "<api-token>"
irm http://pulse.example.local:7655/install-host-agent.ps1 | iex

The script installs the service under PulseHostAgent, registers Windows Event Log messages, configures automatic recovery on failure, and waits for Pulse to acknowledge the new host.

Manual installation (advanced)

Prefer to take full control or working in air-gapped environments? You can still download the static binaries and wire them up manually. The commands below mirror what the installer scripts perform for their respective platforms.

Linux (systemd)

# Download the binary for your architecture
sudo curl -fsSL https://github.com/rcourtman/Pulse/releases/latest/download/pulse-host-agent-linux-amd64 \
  -o /usr/local/bin/pulse-host-agent
sudo chmod +x /usr/local/bin/pulse-host-agent

# Run the agent
sudo /usr/local/bin/pulse-host-agent \
  --url http://pulse.example.local:7655 \
  --token <api-token> \
  --interval 30s

Available Linux architectures: pulse-host-agent-linux-amd64, pulse-host-agent-linux-arm64, pulse-host-agent-linux-armv7

For persistence, drop a systemd unit (e.g. /etc/systemd/system/pulse-host-agent.service) referencing the same command and enable it with systemctl enable --now pulse-host-agent.

macOS (launchd)

sudo curl -fsSL https://github.com/rcourtman/Pulse/releases/latest/download/pulse-host-agent-darwin-arm64 \
  -o /usr/local/bin/pulse-host-agent
sudo chmod +x /usr/local/bin/pulse-host-agent
sudo /usr/local/bin/pulse-host-agent \
  --url http://pulse.example.local:7655 \
  --token <api-token> \
  --interval 30s

Create ~/Library/LaunchAgents/com.pulse.host-agent.plist to keep the agent running between logins:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>com.pulse.host-agent</string>
    <key>ProgramArguments</key>
    <array>
      <string>/usr/local/bin/pulse-host-agent</string>
      <string>--url</string>
      <string>http://pulse.example.local:7655</string>
      <string>--token</string>
      <string>&lt;api-token&gt;</string>
      <string>--interval</string>
      <string>30s</string>
    </array>
    <key>RunAtLoad</key><true/>
    <key>KeepAlive</key><true/>
    <key>StandardOutPath</key><string>/Users/your-user/Library/Logs/pulse-host-agent.log</string>
    <key>StandardErrorPath</key><string>/Users/your-user/Library/Logs/pulse-host-agent.log</string>
  </dict>
</plist>

Load it with launchctl load ~/Library/LaunchAgents/com.pulse.host-agent.plist.

Windows (manual)

Compile from source (GOOS=windows GOARCH=amd64) or download the latest release, then install the Windows service yourself:

New-Service -Name PulseHostAgent `
  -BinaryPathName '"C:\Program Files\Pulse\pulse-host-agent.exe" --url http://pulse.example.local:7655 --token <api-token> --interval 30s' `
  -DisplayName "Pulse Host Agent" `
  -Description "Monitors system metrics and reports to Pulse monitoring server" `
  -StartupType Automatic
Start-Service -Name PulseHostAgent

Command Flags

Flag	Description
`--url`	Pulse base URL (defaults to `http://localhost:7655`)
`--token`	API token with the `host-agent:report` scope
`--interval`	Polling interval (`30s` default)
`--hostname`	Override reported hostname
`--agent-id`	Override agent identifier (used as dedupe key)
`--tag`	Optional tag(s) to annotate the host (repeatable)
`--insecure`	Skip TLS verification (development/testing only)
`--once`	Send a single report and exit

Run pulse-host-agent --help for the full list.

Viewing Hosts

Settings → Agents → Host agents lists every reporting host and provides ready-made install commands.
The Hosts tab surfaces host telemetry alongside Proxmox/Docker data in the main dashboard.

Checking installation status

Click Check status under Settings → Agents → Host agents and enter the host ID or hostname you just installed.
Pulse hits /api/agents/host/lookup, highlights the matching row for 10 seconds, and refreshes the connection badge, last-seen timestamp, and agent version in-line.
If the host has not checked in yet, the UI returns a friendly "Host has not registered" message so you can retry without re-running the script.

Alerts and notifications

Host agents now participate in the main alert engine. Offline detection, metric thresholds, and override scopes (global or per-host) live in Settings → Alerts → Thresholds beside your Docker and Proxmox rules.
Alert notifications, webhooks, and quiet-hours behaviour reuse the existing pipelines—no extra setup is required once you enable host-agent monitoring.

Updating

Since the agent is a single static binary, updates are as simple as replacing the file and restarting your launchd/systemd unit. The Settings pane always links to the latest release artefacts.

9.7 KiB Raw Blame History Unescape Escape