mirror of https://github.com/rcourtman/Pulse.git synced 2026-02-18 00:17:39 +01:00

rcourtman 6eb1a10d9b Refactor: Code cleanup and localStorage consolidation

2025-11-04 21:50:46 +00:00

Docker & Podman Monitoring Agent

Pulse is focused on Proxmox VE and PBS, but many homelabs also run application stacks in container runtimes such as Docker and Podman. The optional Pulse container agent turns runtime health and resource usage into first-class metrics that show up alongside your hypervisor data. The recommended deployment is the bundled, least-privilege systemd service that runs the static pulse-docker-agent binary directly on the host. That path lets the installer lock down permissions, manage upgrades automatically, and integrate with the native init system. Running the agent in a container is still available for orchestrated environments, but it trades away some of those controls and still needs the runtime socket, so treat that option as advanced.

What the agent reports

Every check interval (30s by default) the agent collects:

Host metadata (hostname, Docker version, CPU count, total memory, uptime)
Container status (running, exited, paused) and health probe state
Restart counters and exit codes
CPU usage, memory consumption and limits
Images, port mappings, network addresses, and start times
Writable layer size, root filesystem size, block I/O totals, and mount metadata (shown in the Containers table drawer)
Read/write throughput derived from Docker block I/O counters so you can spot noisy workloads at a glance
Health-check failures, restart-loop windows, and recent exit codes (displayed in the UI under each container drawer)

Data is pushed to Pulse over HTTPS using your existing API token – no inbound firewall rules required.

Prerequisites

Pulse v4.22.0 or newer with an API token enabled (Settings → Security)
API token with the docker:report scope (add docker:manage if you use remote lifecycle commands)
Docker 20.10+ or Podman 4.7+ on Linux (the agent talks to the runtime API socket)
Access to the runtime socket (/var/run/docker.sock, /run/podman/podman.sock, or a unix:// URI)
Go 1.24+ if you plan to build the binary from source

Installation

Grab the pulse-docker-agent binary from the release assets (or build it yourself):

# Build from source
cd /opt/pulse
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o pulse-docker-agent ./cmd/pulse-docker-agent

Copy the binary to your Docker host (e.g. /usr/local/bin/pulse-docker-agent) and make it executable.

Why CGO_ENABLED=0? Building a fully static binary ensures the agent runs on hosts still using older glibc releases (for example Debian 11 with glibc 2.31).

Quick install from your Pulse server (recommended)

Use the bundled installation script (ships with Pulse v4.22.0+) to deploy and manage the agent. Replace the token placeholder with an API token generated in Settings → Security. Create a dedicated token for each Docker host so you can revoke individual credentials without touching others—sharing one token across many hosts makes incident response much harder. Tokens used here should include the docker:report scope so the agent can submit telemetry (add docker:manage only if you plan to issue lifecycle commands remotely).

curl -fSL http://pulse.example.com/install-docker-agent.sh -o /tmp/pulse-install-docker-agent.sh && \
  sudo bash /tmp/pulse-install-docker-agent.sh --url http://pulse.example.com --token <api-token> && \
  rm -f /tmp/pulse-install-docker-agent.sh

Why sudo? The installer needs to drop binaries under /usr/local/bin, create a systemd service, and start it, all of which require root privileges. Running the downloaded script with sudo bash … saves you from retrying if you started as an unprivileged user.

The script stores credentials in /etc/pulse/pulse-docker-agent.env (mode 600) and creates a locked-down pulse-docker service account that only needs access to the Docker socket. Rotate tokens by editing that env file and running sudo systemctl restart pulse-docker-agent.

To keep remote stop/remove commands working from Pulse, the installer also drops a small polkit rule that lets the pulse-docker service account run systemctl stop/disable pulse-docker-agent without password prompts. If you remove that rule, expect to acknowledge stop requests manually with sudo systemctl disable --now pulse-docker-agent.

Running the one-liner again from another Pulse server (with its own URL/token) will merge that server into the same agent automatically—no extra flags required.

To report to more than one Pulse instance from the same Docker host, repeat the --target flag (format: https://pulse.example.com|<api-token>) or export PULSE_TARGETS before running the script:

curl -fSL http://pulse.example.com/install-docker-agent.sh -o /tmp/pulse-install-docker-agent.sh && \
  sudo bash /tmp/pulse-install-docker-agent.sh -- \
    --target 'https://pulse.example.com|<primary-token>' \
    --target 'https://pulse-dr.example.com|<dr-token>' && \
  rm -f /tmp/pulse-install-docker-agent.sh

Quick install for Podman (system service)

Use the multi-runtime installer when you want the agent to run against Podman as a systemd service. The script takes care of enabling podman.socket, creating a dedicated service account, and wiring the correct runtime socket automatically:

curl -fSL http://pulse.example.com/install-container-agent.sh -o /tmp/pulse-install-container-agent.sh && \
  sudo bash /tmp/pulse-install-container-agent.sh --runtime podman --url http://pulse.example.com --token <api-token> && \
  rm -f /tmp/pulse-install-container-agent.sh

The environment file lives at /etc/pulse/pulse-docker-agent.env and the unit is still named pulse-docker-agent.service for backwards compatibility. The agent exports PULSE_RUNTIME=podman and points both CONTAINER_HOST and DOCKER_HOST at the Podman socket (/run/podman/podman.sock by default). Restart the service after editing the env file with sudo systemctl restart pulse-docker-agent.

What's new for Podman? The agent now sends pod- and compose-aware metadata for Podman hosts. Pulse surfaces pod names, infra-container markers, compose project/service identifiers, auto-update policies, and user namespace hints so you can see how containers relate without leaving the UI.

Quick install for Podman (rootless user service)

Podman’s rootless mode works too. Run the installer as the target user and add the --rootless flag — no sudo required:

curl -fSL http://pulse.example.com/install-container-agent.sh -o /tmp/pulse-install-container-agent.sh && \
  bash /tmp/pulse-install-container-agent.sh --runtime podman --rootless --url http://pulse.example.com --token <api-token> && \
  rm -f /tmp/pulse-install-container-agent.sh

The agent binary is dropped into ~/.local/bin, configuration lives under ~/.config/pulse, and a user-level service (~/.config/systemd/user/pulse-docker-agent.service) is created. Enable lingering so the agent keeps running after you log out:

sudo loginctl enable-linger "$USER"

If systemctl --user is unavailable, the installer will print the exact command you can place in a cron job or another init system.

Running the agent

The agent needs to know where Pulse lives and which API token to use.

Single instance:

export PULSE_URL="http://pulse.lan:7655"
export PULSE_TOKEN="<your-api-token>"

# sudo resets the environment by default; -E keeps the exported PULSE_* variables
sudo -E /usr/local/bin/pulse-docker-agent --interval 30s

Multiple instances (one agent fan-out):

export PULSE_TARGETS="https://pulse-primary.lan:7655|<token-primary>;https://pulse-dr.lan:7655|<token-dr>"

# sudo resets the environment by default; -E keeps the exported PULSE_TARGETS
sudo -E /usr/local/bin/pulse-docker-agent --interval 30s

You can also repeat --target 'https://pulse.example.com|<token>' on the command line instead of using PULSE_TARGETS; the agent will broadcast each heartbeat to every configured URL. Quote each value so the shell does not interpret | as a pipe.
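
The PULSE_TARGETS format shown above (entries separated by ;, URL and token separated by |) can be illustrated with a small parser. This is a sketch under stated assumptions rather than the agent's real parser: the target type and parseTargets function are hypothetical, and only the url and token fields described in this guide are read.

```go
package main

import (
	"fmt"
	"strings"
)

// target is one Pulse backend the agent reports to (illustrative type).
type target struct {
	URL   string
	Token string
}

// parseTargets splits a PULSE_TARGETS-style value ("url|token;url|token")
// into targets. Any further |-separated fields in an entry are ignored here.
func parseTargets(raw string) ([]target, error) {
	var out []target
	for _, entry := range strings.Split(raw, ";") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue // tolerate trailing or doubled separators
		}
		parts := strings.Split(entry, "|")
		if len(parts) < 2 {
			return nil, fmt.Errorf("invalid target %q: want url|token", entry)
		}
		url, token := strings.TrimSpace(parts[0]), strings.TrimSpace(parts[1])
		if url == "" || token == "" {
			return nil, fmt.Errorf("invalid target %q: empty url or token", entry)
		}
		out = append(out, target{URL: url, Token: token})
	}
	return out, nil
}

func main() {
	raw := "https://pulse-primary.lan:7655|token-primary;https://pulse-dr.lan:7655|token-dr"
	targets, err := parseTargets(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range targets {
		// Print only the URL; tokens are secrets and should stay out of logs.
		fmt.Println("reporting to", t.URL)
	}
}
```

Rejecting an entry without a token up front gives a clearer failure than an authentication error on the first heartbeat.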

The binary reads standard Docker environment variables. If you already use TLS-secured remote sockets, set DOCKER_HOST, DOCKER_TLS_VERIFY, etc. as normal. To skip TLS verification for Pulse (not recommended), add --insecure or set PULSE_INSECURE_SKIP_VERIFY=true.

Filtering container states

High churn environments can flood Pulse with noise from short-lived tasks. Restrict the agent to the container states you care about by repeating --container-state (for example, --container-state running --container-state paused) or by exporting PULSE_CONTAINER_STATES=running,paused. Allowed values match Docker’s status filter: created, running, restarting, removing, paused, exited, and dead. If no values are provided the agent reports every container, mirroring the previous behaviour.
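
The filtering rule described above can be sketched as follows. The parseStates and keep helpers are hypothetical names; splitting on commas and semicolons follows the configuration reference, while lower-casing and de-duplicating the input are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedStates mirrors the values of Docker's status filter.
var allowedStates = map[string]bool{
	"created": true, "running": true, "restarting": true, "removing": true,
	"paused": true, "exited": true, "dead": true,
}

// parseStates normalises a PULSE_CONTAINER_STATES-style list. An empty
// result means "no filter", so every container is reported.
func parseStates(raw string) ([]string, error) {
	fields := strings.FieldsFunc(raw, func(r rune) bool { return r == ',' || r == ';' })
	seen := make(map[string]bool)
	var states []string
	for _, f := range fields {
		s := strings.ToLower(strings.TrimSpace(f))
		if s == "" || seen[s] {
			continue
		}
		if !allowedStates[s] {
			return nil, fmt.Errorf("unknown container state %q", s)
		}
		seen[s] = true
		states = append(states, s)
	}
	return states, nil
}

// keep reports whether a container in the given state passes the filter.
func keep(filter []string, state string) bool {
	if len(filter) == 0 {
		return true
	}
	for _, s := range filter {
		if s == state {
			return true
		}
	}
	return false
}

func main() {
	filter, err := parseStates("running,paused")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, state := range []string{"running", "exited", "paused"} {
		fmt.Printf("%-8s reported: %v\n", state, keep(filter, state))
	}
}
```

Validating against the allowed list at startup catches a typo such as "runing" immediately, instead of silently reporting nothing.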

Swarm-aware reporting

The agent now recognises Docker Swarm roles. Managers query the Swarm control plane for service and task metadata, while workers fall back to the labels present on local containers. The Settings → Docker Agents view surfaces role, scope, service counts, and updates per host so you can spot noisy stacks or unhealthy rollouts at a glance.

Use the new flags to tune the payload:

--swarm-scope / PULSE_SWARM_SCOPE chooses between node-only and cluster-wide aggregation (auto switches based on the node’s role).
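--swarm-services and --swarm-tasks (PULSE_SWARM_SERVICES, PULSE_SWARM_TASKS) default to true; set either to false to drop the service or task blocks if you only need a subset of data.
--include-containers (PULSE_INCLUDE_CONTAINERS) also defaults to true; set it to false to remove per-container metrics when service-level reporting is sufficient (note that workers need container data to derive task info).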

If a manager cannot reach the Swarm API the agent automatically falls back to node scope so updates keep flowing.
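
The scope selection and fallback described above can be summarised in a few lines. This is a sketch of the documented behaviour, not the agent's implementation: effectiveScope is a hypothetical name, and treating an explicit cluster setting on a worker as node scope is an assumption.

```go
package main

import "fmt"

// effectiveScope resolves the configured scope ("node", "cluster" or "auto")
// to the scope actually used for a report.
func effectiveScope(configured string, isManager, swarmAPIReachable bool) string {
	// auto picks cluster on managers and node on workers.
	wantCluster := configured == "cluster" || (configured == "auto" && isManager)

	// Cluster-wide data comes from the Swarm control plane, which only a
	// manager with a reachable Swarm API can query.
	if wantCluster && isManager && swarmAPIReachable {
		return "cluster"
	}
	return "node"
}

func main() {
	fmt.Println(effectiveScope("auto", true, true))   // manager, Swarm API reachable
	fmt.Println(effectiveScope("auto", true, false))  // manager, Swarm API unreachable
	fmt.Println(effectiveScope("auto", false, false)) // worker
}
```

The three calls print cluster, node and node respectively, matching the fallback that keeps updates flowing when the control plane is unavailable.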

Adjust warning and critical replica gaps (or disable service alerts entirely) under Alerts → Thresholds → Containers in the Pulse UI.

Multiple Pulse instances

A single pulse-docker-agent process can now serve any number of Pulse backends. Each target entry keeps its own API token and TLS preference, and Pulse de-duplicates reports using the shared agent ID / machine ID. This avoids running duplicate agents on busy Docker hosts.

Systemd unit example

[Unit]
Description=Pulse Docker Agent
After=network-online.target docker.socket docker.service
Wants=network-online.target docker.socket
# Start rate limiting is a [Unit] setting; systemd ignores it under [Service]
StartLimitIntervalSec=120
StartLimitBurst=5

[Service]
Type=simple
EnvironmentFile=-/etc/pulse/pulse-docker-agent.env
ExecStart=/usr/local/bin/pulse-docker-agent --interval 30s
Restart=on-failure
RestartSec=5s
User=pulse-docker
Group=pulse-docker
SupplementaryGroups=docker
UMask=0077
NoNewPrivileges=yes
RestrictSUIDSGID=yes
RestrictRealtime=yes
PrivateTmp=yes
ProtectSystem=full
ProtectHome=read-only
ProtectControlGroups=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectKernelLogs=yes
LockPersonality=yes
MemoryDenyWriteExecute=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
ReadWritePaths=/var/run/docker.sock
ProtectHostname=yes
ProtectClock=yes

[Install]
WantedBy=multi-user.target

Rotate credentials or add additional Pulse targets by editing /etc/pulse/pulse-docker-agent.env and restarting the service with sudo systemctl restart pulse-docker-agent.

Containerised agent (advanced / optional)

If you prefer to run the agent inside a container, mount the Docker socket and supply the same environment variables:

docker run -d \
  --name pulse-docker-agent \
  --pid=host \
  --uts=host \
  -e PULSE_URL="https://pulse.example.com" \
  -e PULSE_TOKEN="<token>" \
  -e PULSE_TARGETS="https://pulse.example.com|<token>;https://pulse-dr.example.com|<token-dr>" \
  -e PULSE_NO_AUTO_UPDATE=true \
  -v /etc/machine-id:/etc/machine-id:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --restart unless-stopped \
  ghcr.io/rcourtman/pulse-docker-agent:latest

Note: Official images for linux/amd64 and linux/arm64 are published to ghcr.io/rcourtman/pulse-docker-agent. To test local changes, run docker build --target agent_runtime -t pulse-docker-agent:test . from the repository root.

--pid=host, --uts=host, and the /etc/machine-id bind keep host metadata stable so Pulse doesn’t think the container itself is the Docker host. Auto-update is disabled in the image by default; rebuild or override PULSE_NO_AUTO_UPDATE=false only if you manage upgrades outside of your orchestrator. Expect to grant the container the same level of Docker socket access as the systemd service—running inside Docker doesn’t sandbox the agent from the host.

Configuration reference

Flag / Env var	Description	Default
`--url`, `PULSE_URL`	Pulse base URL (http/https).	`http://localhost:7655`
`--token`, `PULSE_TOKEN`	Pulse API token with `docker:report` scope (required).	—
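`--target`, `PULSE_TARGETS`	One or more `url|token[…]` entries, one per Pulse server. Separate entries with `;` in `PULSE_TARGETS` or repeat the flag.	—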
`--interval`, `PULSE_INTERVAL`	Reporting cadence (supports `30s`, `1m`, etc.).	`30s`
`--runtime`, `PULSE_RUNTIME`	Container runtime to target (`docker`, `podman`, `auto`).	`docker`
`--container-socket`, `PULSE_CONTAINER_SOCKET` / `CONTAINER_HOST`	Explicit runtime socket path or `unix://` URI.	Runtime default
`--rootless`, `PULSE_RUNTIME_ROOTLESS`	Install/manage the agent as a user service (Podman).	Auto (rootful)
`--container-state`, `PULSE_CONTAINER_STATES`	Limit reports to specific Docker statuses (`created`, `running`, `restarting`, `removing`, `paused`, `exited`, `dead`). Separate multiple values with commas/semicolons or repeat the flag.	—
`--swarm-scope`, `PULSE_SWARM_SCOPE`	Swarm data scope: `node`, `cluster`, or `auto` (auto picks cluster on managers, node on workers).	`node`
`--swarm-services`, `PULSE_SWARM_SERVICES`	Include Swarm service summaries in reports.	`true`
`--swarm-tasks`, `PULSE_SWARM_TASKS`	Include individual Swarm tasks in reports.	`true`
`--include-containers`, `PULSE_INCLUDE_CONTAINERS`	Include per-container metrics (disable when only Swarm data is needed).	`true`
`--collect-disk`, `PULSE_COLLECT_DISK`	Collect per-container disk usage, block I/O, and mount metadata. Disable to skip Docker size queries on extremely large fleets.	`true`
`--hostname`, `PULSE_HOSTNAME`	Override host name reported to Pulse.	Docker info / OS hostname
`--agent-id`, `PULSE_AGENT_ID`	Stable ID for the agent (useful for clustering).	Docker engine ID / machine-id
`--insecure`, `PULSE_INSECURE_SKIP_VERIFY`	Skip TLS cert validation (unsafe).	`false`

The agent automatically discovers the Docker socket via the usual environment variables. To use SSH tunnels or TCP sockets, export DOCKER_HOST as you would for the Docker CLI.

Disk usage monitoring & alerts

When --collect-disk is enabled (the default), Pulse records each container’s writable layer and root filesystem sizes. The Alerts engine treats the proportion of writable data to total filesystem as the disk usage percentage for that container. A fleet-wide threshold lives under Alerts → Thresholds → Containers and defaults to 85% trigger / 80% clear; adjust or disable it per host/container when your workload makes heavy use of copy-on-write layers. Containers that stop reporting disk metrics (for example when size queries are disabled) automatically skip the disk alert evaluation.
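
The percentage and the trigger/clear pair described above can be sketched as follows. The function names are hypothetical, and the exact boundary handling (alert at or above the trigger, clear at or below the clear value) is an assumption of this sketch rather than documented Pulse behaviour.

```go
package main

import "fmt"

// diskUsagePercent returns the writable layer as a percentage of the root
// filesystem size. ok is false when disk metrics are missing, in which case
// the disk alert evaluation is skipped.
func diskUsagePercent(writableBytes, rootFsBytes uint64) (pct float64, ok bool) {
	if rootFsBytes == 0 {
		return 0, false
	}
	return float64(writableBytes) * 100 / float64(rootFsBytes), true
}

// nextAlertState applies trigger/clear hysteresis: an inactive alert fires at
// or above triggerAt, and an active alert stays raised until usage drops to
// clearAt or below.
func nextAlertState(active bool, pct, triggerAt, clearAt float64) bool {
	if active {
		return pct > clearAt
	}
	return pct >= triggerAt
}

func main() {
	const triggerAt, clearAt = 85.0, 80.0
	const rootFs = 10_000_000_000 // example root filesystem size in bytes

	active := false
	for _, writable := range []uint64{7_000_000_000, 9_000_000_000, 8_200_000_000, 7_900_000_000} {
		pct, ok := diskUsagePercent(writable, rootFs)
		if !ok {
			continue
		}
		active = nextAlertState(active, pct, triggerAt, clearAt)
		fmt.Printf("%.0f%% used, alert active: %v\n", pct, active)
	}
}
```

In this run the alert fires at 90%, stays raised at 82% (between the two thresholds), and clears at 79%. The gap between trigger and clear stops an alert from flapping when usage hovers around a single threshold.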

Suppressing ephemeral containers

CI runners and short-lived build containers can generate noisy state alerts when they exit on schedule. In Pulse v4.24.0 and later you can provide a list of prefixes to ignore under Alerts → Thresholds → Containers → Ignored container prefixes. Any container whose name or ID begins with a configured prefix is skipped for state, health, metric, restart-loop, and OOM alerts. Matching is case-insensitive and the list is saved as dockerIgnoredContainerPrefixes inside alerts.json. Use one entry per family of ephemeral containers (for example, runner- or gitlab-job-).
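
The matching rule above (case-insensitive prefix match against a container's name or ID) can be sketched like this. The isIgnored function is a hypothetical illustration; skipping blank prefixes is an assumption made so that an empty entry cannot mute every container.

```go
package main

import (
	"fmt"
	"strings"
)

// isIgnored reports whether a container's name or ID begins with any of the
// configured prefixes, compared case-insensitively.
func isIgnored(name, id string, prefixes []string) bool {
	name, id = strings.ToLower(name), strings.ToLower(id)
	for _, p := range prefixes {
		p = strings.ToLower(strings.TrimSpace(p))
		if p == "" {
			continue // an empty prefix would otherwise match everything
		}
		if strings.HasPrefix(name, p) || strings.HasPrefix(id, p) {
			return true
		}
	}
	return false
}

func main() {
	prefixes := []string{"runner-", "gitlab-job-"}
	fmt.Println(isIgnored("Runner-42", "a1b2c3", prefixes)) // matches runner-
	fmt.Println(isIgnored("postgres", "d4e5f6", prefixes))  // no match
}
```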

Need the alerts but at a different tone? The same Containers tab exposes global controls for the container state detector. Flip Disable container state alerts (stateDisableConnectivity) to mute powered-off/offline warnings across the fleet, or change Default severity (statePoweredOffSeverity) to critical so unexpected exits page immediately. Individual host/container overrides still win when you need exceptions.

Testing and troubleshooting

Run with --interval 15s --insecure in a terminal to see log output while testing.
Ensure the Pulse API token has not expired or been regenerated.
If pulse-docker-agent reports Cannot connect to the Docker daemon, verify the socket path and permissions.
Check Pulse (/containers tab) for the latest heartbeat time. Hosts are marked offline if they stop reporting for >4× the configured interval.
Use the search box above the host grid to filter by host name, stack label, or container name. Restart loops surface in the “Issues” column and display the last five exit codes.

Removing the agent

Stop the systemd service or container and remove the binary. Pulse retains the last reported state until it ages out after a few minutes of inactivity.
