Security Audit, 2026-04-21

Auditor: Claude Opus 4.7 (sec-audit session)

Scope: Full fleet: Finn, UDev, 7 LXCs/VMs, Cloudflare tunnels, SSH, secrets, backups

Fleet state at audit: Healthy post-DNS-crisis. Load avg 1.18, all services 200, Gluetun killswitch verified (egress 159.26.100.51, not home IP).

TL;DR: Top 5 Urgent

#	Finding	Severity	Why
1	HA long-lived JWT committed to git (`.claude/settings.json`, ~10 lines)	CRITICAL	Token valid until 2036. If GitHub repo access leaks, HA fully controllable via public tunnel.
2	Nightly workspace backup broken since 2026-03-20 (32+ days)	CRITICAL	Cron points at stale `/mnt/pve/fast-storage` path. Workspace NVMe has zero fresh backups.
3	12 Cloudflare-tunneled subdomains without Access auth	HIGH	Anyone who knows the URL hits Plex/HA/Overseerr/audiobooks/books/requests directly. No bot wall.
4	Frigate UI + API fully open on LAN (no auth)	HIGH	Anyone on LAN/Tailscale can view all cameras, recordings, and events.
5	Security monitor spamming 2177 false-positive tasks (port 8100 nginx)	MEDIUM (ops)	Filter missing; dispatcher dead anyway; will bury real alerts when restarted.

1. Secrets in Git (`/home/justinwieb/forge`, commit `d681a5a`)

Repo is private on GitHub (github.com/JustinWieb/forge → 404 unauth). Blast radius limited to Justin's account + collaborators, but the secrets must still be treated as burned.

Secret	Where	Status	Action
HA long-lived JWT (`eyJhbGci…ON4`, exp 2036-03-26)	`.claude/settings.json` lines 111, 112, 124, 128–134	Tracked in initial commit	Revoke in HA UI → Profile → Security, reissue, store only in `.env`
Discord bot token (OpenClaw era, `MTQ3ODA3…SDZY97Vq_9gI`)	`.claude/settings.json` line 9	Tracked in initial commit	Bot already deleted; still scrub from history
n8n API key (`n8n_api_98szbI8281snxfQzORSg6`)	`memory/daily/2026-04-14.md`, `memory/handoffs/n8n-setup.md`	Untracked. `memory/` is not gitignored (other files under `memory/daily/` are tracked), but these 2 files have not been committed yet.	Rotate anyway (it sits in plaintext filesystem notes); sanitize before any commit
HA_TOKEN in `.env`; VPN WireGuard keys in `/opt/stacks/media/docker-compose.yml`	Local-only on each LXC	Not in git	Acceptable; consider moving the WireGuard keys from the compose file into `.env`
Cloudflare tunnel token (UDev)	`cloudflared.service` systemd unit, literal token in `ExecStart`	Not in git	Acceptable; the token grants tunnel-connect only, not dashboard access

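Remediation for git-tracked secrets:
- Option A (clean slate, since there is only 1 commit): scrub `.claude/settings.json`, run `git commit --amend`, then force-push. A plain `git reset --soft` has no earlier commit to reset to when the repo holds a single commit. Rewriting history does not un-leak the tokens, so rotate them regardless.
- Option B: `git filter-repo --replace-text` to redact all commits; rotate tokens regardless.
- Rotate first, scrub second. Rotation invalidates the leaked values; scrubbing removes them from history and prevents reuse of the pattern.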
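To confirm that a scrub actually removed the tokens, the working tree can be searched for JWT-shaped strings before recommitting. This is a minimal sketch, assuming GNU grep; the function name `scan_jwt_like` is hypothetical and not part of the forge repo, and the pattern only catches three-segment `eyJ` tokens, not the Discord token or the n8n key.

```shell
# Hypothetical helper: list files under a directory that contain a JWT-shaped string.
# Exit status 0 means at least one match was found, i.e. the tree is NOT clean.
scan_jwt_like() {
    # A JWT is three base64url segments joined by dots; the header segment starts with "eyJ".
    # -r recurse, -I skip binary files, -l print file names only, -E extended regex.
    grep -rIlE --exclude-dir=.git \
        'eyJ[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}' "$1"
}

# Example use from the repo root:
#   scan_jwt_like . && echo "JWT-like strings still present"
```

A clean result only covers the current checkout; history still needs Option A or B, and the tokens still need rotating.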

2. Attack Surfaces: Ranked

Rank	Surface	Exposure	Auth	Risk
1	`plex.justinkrystal.com` (tunnel → .73:32400)	Internet	Plex built-in (account required)	LOW-MED. Plex auth OK, but no Access layer means brute force on Plex accounts + 0-day Plex CVEs hit directly.
2	`homeassistant.justinkrystal.com` (tunnel → .70:8123)	Internet	HA login page	MED-HIGH. HA login is the wall; no MFA visible in config; combined with leaked JWT = full smart-home control from anywhere.
3	`audiobooks/books/requests.justinkrystal.com` (tunnel → :13378/:8083/:5055)	Internet	App-level only	LOW-MED. Overseerr has auth; audiobookshelf/calibre-web auth varies. No Access gate = fingerprinting + bot traffic.
4	`justinkrystal.com` root + `jk-landing` on UDev:8085	Internet	None (static site)	LOW. Public landing page, intended.
5	Frigate :5000 UI+API on 192.168.86.84	LAN + Tailscale	NONE (HTTP 200 on /api/stats and / with no header)	MED. Anyone on LAN or tailnet can watch cameras, pull recordings, enumerate events. Reolink camera RTSP password `&hVd9pZ74^YrRusPN` in config.yml.
6	Finn offers exit node on Tailscale (`finn 100.112.22.2 offers exit node`)	Tailnet	Tailscale auth	LOW today, MED if tailnet key compromised. Anyone Justin invites to the tailnet could route traffic through his ISP. Verify this is intentional.
7	UDev SSH: `PasswordAuthentication yes`	LAN + Tailscale	Key OR password	MED. Key auth works; password path is a brute-force surface. LAN-only limits it.
8	media-server SSH: `PasswordAuthentication yes`	LAN + Tailscale	Key OR password	MED. Same as above.
9	FileBrowser on 192.168.86.67:8080 tunneled at `filebrowser.justinkrystal.com`	Internet	Access (✓) + FileBrowser login	LOW. Access policy gates it.
10	NFS on Finn exports w/ `no_root_squash` to UDev (.50 only)	LAN	IP-based only	LOW. Correctly IP-locked; but `no_root_squash` means UDev root = Finn root over NFS. Acceptable for single-admin setup.

Tunnel Access Policy Audit

From docs/fleet-docs/06-services-and-networking.md + Cloudflare dashboard list:

NO Cloudflare Access (public with URL):
- justinkrystal.com, plex.justinkrystal.com, audiobooks.justinkrystal.com, books.justinkrystal.com, requests.justinkrystal.com, homeassistant.justinkrystal.com

WITH Cloudflare Access (email-gated):
- finn.* (Proxmox), qtorrent.*, prowlarr.*, sonarr.*, radarr.*, portainer.*, portainer-http.*, filebrowser.*, shelfarr.*, invoiceninja.*

Recommendation: Add Access policies (email OTP) to at least homeassistant.* (smart home = physical world) and requests.* (Overseerr can trigger downloads). Plex uses its own account auth, so Access is optional there. Audiobooks/books are low-risk as long as they stay family-only.
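Whether a hostname is actually gated can be checked from outside by looking at the response headers. The sketch below assumes that Access answers unauthenticated requests with a redirect whose `Location` points at a `cloudflareaccess.com` login URL; the function name `looks_access_gated` is hypothetical.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed if the
# response looks like a Cloudflare Access login redirect.
looks_access_gated() {
    # Strip carriage returns, then look for a Location header on a
    # *.cloudflareaccess.com host (assumed Access behaviour, see lead-in).
    tr -d '\r' | grep -iqE '^location:[[:space:]]*https://[^/]*\.cloudflareaccess\.com/'
}

# Example use against a live host (assumes curl is installed):
#   curl -sI https://homeassistant.justinkrystal.com | looks_access_gated \
#       && echo "gated" || echo "NOT gated"
```

Running this for each tunneled hostname after adding a policy gives a quick regression check that the gate is still in place.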

3. Secrets Handling Review

Vector	Posture	Gap
`.env` (UDev)	gitignored in `.gitignore` line 9	OK
n8n encrypted DB	At-rest encrypted by n8n	OK
NordPass	Not audited, user-managed	OK
`.claude/settings.json`	Tracked in git + contains JWT	VIOLATES hard rule
Media compose VPN keys	Plaintext in `/opt/stacks/media/docker-compose.yml`	Move to `.env` file (file perms 600)
HA token in `.env` vs hardcoded	Both exist (line 27 of QUICKSTART.md is placeholder)	After rotation, centralize on `.env` only

4. SSH / Sudo Posture

Host	Root login	Password auth	Pubkey	Finding
UDev	without-password	yes	yes	Disable password auth
Finn	without-password	no	yes	✓ Good
media-server	without-password	yes	yes	Disable password auth
others	(not audited this pass)	n/a	n/a	Verify next pass

/etc/sudoers.d/justinwieb: justinwieb ALL=(ALL) NOPASSWD:ALL. Intentional per user; acceptable for single-admin.

MaxAuthTries is at the default of 6 on UDev. No fail2ban detected. LAN-only exposure limits impact, but enabling fail2ban costs ~5 min and also covers SSH reached over Tailscale.
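The password-auth finding can be re-verified after the change by inspecting the effective sshd configuration rather than the config file, since drop-in files can override it. A minimal sketch; the function name `password_auth_disabled` is hypothetical.

```shell
# Hypothetical helper: read `sshd -T` output on stdin and succeed only if
# password authentication is effectively disabled.
password_auth_disabled() {
    # `sshd -T` prints the effective config as lowercase "keyword value" pairs.
    awk '$1 == "passwordauthentication" { found = 1; ok = ($2 == "no") }
         END { exit !(found && ok) }'
}

# Example use on a host (sshd -T needs root to read the host keys):
#   sudo sshd -T | password_auth_disabled || echo "password auth still enabled"
```

The check fails closed: if the keyword is missing from the output, the function reports failure rather than assuming the setting is off.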

5. CVE / Version Sanity Check

Component	Version	Latest (as of 2026-04-21)	Notes
Proxmox VE	9.1.1 (kernel 6.17.2-1-pve)	9.1.x stream	Current, no known unpatched high-severity
Ubuntu (UDev)	24.04.1 LTS, kernel 6.8.0-106	24.04.x, 6.8.0-11x	Minor kernel updates available; run `apt upgrade`
Plex Media Server	1.43.0.10492	1.43.x	Current
n8n	2.11.4	2.11.x / 2.12.x	Near-current. Running `n8n:latest`, will auto-update on pull. Prefer pinned tag for reproducibility.
Frigate	`stable` (sha 1724960349d…)	0.14/0.15 stable line	Using `stable` tag, auto-bumped on pull. Pin to specific version for stability.
Docker (UDev)	29.2.1	29.x	Current
cloudflared	2026.2.0	2026.2.x	Current
nginx (UDev)	1.24.0 Ubuntu	1.26+ upstream	nginx has no upstream LTS, but Ubuntu 24.04's 1.24 package still receives distro security updates; fine.
Claude Code CLI	2.1.116	2.1.x	Current

Version hygiene note: 7 of ~12 Docker services use :latest tag. Each docker pull is a potential unpinned supply-chain ingestion. Pinning to SHA digests or specific versions gives reproducibility and blocks accidental breaking changes.
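To find which services still float, the compose files can be scanned for image references without a fixed version. A rough sketch, assuming one `image:` key per line as in a typical compose file; the function name `unpinned_images` is hypothetical, and treating `:stable` as floating follows the Frigate note above.

```shell
# Hypothetical helper: print image references in a compose file that are not
# pinned to a specific version or digest (":latest", ":stable", or no tag).
unpinned_images() {
    awk '$1 == "image:" {
        ref = $2
        gsub(/["\047]/, "", ref)               # drop surrounding quotes (\047 is a single quote)
        if (ref ~ /@sha256:/) next             # digest-pinned
        if (ref ~ /:(latest|stable)$/) { print ref; next }   # floating tag
        if (ref !~ ":[^/]+$") print ref        # no tag at all, so Docker pulls latest
    }' "$1"
}

# Example use:
#   unpinned_images /opt/stacks/media/docker-compose.yml
```

The last test looks for a colon with no slash after it, so a registry port such as `host:5000/app` is not mistaken for a tag.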

6. Monitor / Task Queue State (Operational Security)

Issue	Evidence	Impact
`forge-dispatcher.service` failed 2026-04-14 02:39 (exit 143)	`systemctl status forge-dispatcher`	No tasks being executed at all. All monitor alerts queue but never dispatch.
2482 pending task files (2177 security + 305 infra)	`ls tasks/pending/ | wc -l`	When dispatcher restarts, it will try to run them all. Sort+dedup+delete first.
Security monitor false positives on port 8100	`logs/monitor-security.log`, every 5 min	`KNOWN_PORTS_FILTER` in `scripts/monitors/security-check.sh:54` missing `8100` (documented nginx port).
Security monitor port filter is fragile regex	Grep-based, no structured match	False positives burn cycles and bury real alerts.

Quick fixes:
1. Add :8100 to the filter in scripts/monitors/security-check.sh line 54.
2. Delete the backlog: `find tasks/pending -type f \( -name "infra-alert-*" -o -name "security-alert-*" \) -delete` (grouping the two patterns and using `-delete` avoids the `xargs rm` pipe, which breaks on unusual file names).
3. Re-enable + start dispatcher: `sudo systemctl enable --now forge-dispatcher`.
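The grep-based port filter noted in the table could be replaced with an exact-match allowlist. This is a sketch only: the real `KNOWN_PORTS_FILTER` format in security-check.sh is not shown in this audit, so the variable `KNOWN_PORTS`, its contents, and the function name `is_known_port` are all assumptions.

```shell
# Assumption: a space-separated allowlist. The port numbers here are illustrative;
# only 8100 (the documented nginx port) comes from the audit.
KNOWN_PORTS="22 80 443 8100"

# Succeeds only when the argument equals one allowlist entry exactly,
# so allowing 8100 never also allows 81000 or 18100.
is_known_port() {
    for known in $KNOWN_PORTS; do
        [ "$1" = "$known" ] && return 0
    done
    return 1
}

# Example use inside a monitor loop:
#   is_known_port "$port" || echo "unexpected listener on port $port"
```

String equality per entry removes the substring and anchoring mistakes that a single regex invites.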

7. Backup Posture (Critical Data-Loss Risk)

Backup	Source	Destination	Status
Nightly workspace rsync	`/mnt/pve/fast-storage/` (STALE path)	`/mnt/storage/workspace-backup/`	BROKEN since 2026-03-20. Latest versioned dir: 2026-03-20; workspace-backup dir mtime 2026-03-16.
Media HDD backup	n/a	n/a	No backup exists. 14TB of Plex/photos/audiobooks has no secondary copy.
Google Drive	Google cloud	n/a	Google handles redundancy; acceptable.
VM/LXC config	n/a	n/a	No Proxmox Backup Server. LVM-thin container disks lost entirely on Finn failure.
Forge repo	git @ GitHub	origin	✓ Off-site via GitHub

Fix (immediate):

# Rewrite the path in Finn's root crontab: /mnt/pve/fast-storage/ -> /mnt/pve/workspace/
# (assumes `ssh finn` logs in as root; reinstalls via `crontab -` rather than editing the spool file directly)
ssh finn 'crontab -l | sed "s|/mnt/pve/fast-storage/|/mnt/pve/workspace/|g" | crontab -'
# Trigger a catch-up run manually; nohup and </dev/null let it keep running after the SSH session closes
ssh finn 'nohup rsync -a --delete --backup --backup-dir=/mnt/storage/workspace-backup-versions/$(date +%Y-%m-%d) /mnt/pve/workspace/ /mnt/storage/workspace-backup/ >> /var/log/workspace-backup.log 2>&1 < /dev/null &'
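The backup was broken for 32 days before anyone noticed, so a freshness check is worth adding alongside the fix. A minimal sketch, assuming GNU find; the function name `backup_is_fresh` and the 2-day default threshold are hypothetical choices.

```shell
# Hypothetical helper: succeed if at least one entry directly under the given
# directory was modified within the last N days (default 2).
backup_is_fresh() {
    dir="$1"
    max_days="${2:-2}"
    # -mtime -N matches entries modified less than N*24h ago;
    # -quit stops at the first hit, so large directories stay cheap to check.
    [ -n "$(find "$dir" -mindepth 1 -maxdepth 1 -mtime "-$max_days" -print -quit)" ]
}

# Example use from a monitor script on Finn:
#   backup_is_fresh /mnt/storage/workspace-backup-versions || echo "workspace backup is stale"
```

Checking the versioned directory rather than the cron exit code catches the failure mode seen here, where the job ran but pointed at a stale path.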

8. Firewall / Network

No ufw/nftables audit performed on UDev (low priority: LAN-only posture, no internet-facing open ports).
Router: Google Fiber, no port-forwarding detected (all external access via Cloudflare Tunnels).
Tailscale: confirmed Finn is subnet router + offering exit node. Exit node advertisement is a security surface; verify it is intentional, otherwise run `tailscale set --advertise-exit-node=false`.
AdGuard upstream DNS: Google 8.8.8.8 per docs, but Tailscale config uses Cloudflare 1.1.1.1/1.0.0.1 as fallback. Verify AdGuard's actual upstream; mixing resolvers can leak DNS queries to multiple parties.

9. Recommended Remediation Order (Effort vs Impact)

Order	Action	Effort	Impact
1	Revoke HA long-lived token in HA UI, reissue, update `.env` on UDev	5 min	Eliminates CRITICAL-severity leaked JWT
2	Fix workspace backup cron path on Finn	2 min	Restores nightly backup (32 days broken)
3	Scrub `.claude/settings.json`, replace JWTs with `$HA_TOKEN` vars, force-push	10 min	Removes secret from git
4	Rotate n8n API key (regardless of git status, it's in plaintext handoff notes)	5 min	Clean hygiene
5	Add Cloudflare Access to `homeassistant.*` and `requests.*`	10 min	Real wall on highest-risk external endpoints
6	Fix security monitor port 8100 false positive (line 54 of security-check.sh)	1 min	Stops task spam
7	Purge 2482 pending task backlog	1 min	Prevents dispatcher thundering herd
8	Restart + enable forge-dispatcher	1 min	Restores task execution
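9	Enable Frigate auth (`auth.enabled: true` in config.yml + set password); on Frigate 0.14+ also restrict port 5000, which stays unauthenticated, and use the authenticated port 8971	10 min	Cam access behind credential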
10	Disable SSH PasswordAuth on UDev + media-server	5 min	Kills brute-force surface
11	Audit/disable Tailscale exit-node advertisement on Finn if not intentional	2 min	Removes ISP-routing attack via tailnet
12	Pin Docker image tags to specific versions (away from `:latest`)	30 min	Reproducibility, supply-chain stability
13	Install Proxmox Backup Server on Seagate `/mnt/seagate` (5.5T free)	2 hrs	Automated VM/LXC snapshots
14	Add offsite backup (rclone to Backblaze B2 for `/mnt/storage/workspace-backup` nightly)	1 hr	True disaster recovery

10. What's Working Well

Gluetun killswitch is solid (qBittorrent egress = VPN IP, confirmed).
Git ignores `.env`, `*.token`, and `secrets/`: the right patterns.
DNS topology post-fix is clean: Tailscale → AdGuard → Cloudflare DoH upstream.
AdGuard boot order (=1) ensures DNS works before dependent containers start.
NFS exports locked to UDev IP only.
Cloudflare tunnels eliminate open router ports entirely.
SSH on Finn is key-only (password auth disabled).
Task queue architecture is sound (separation of monitor → queue → dispatcher → worker).

[Claude Code], sec-audit (Opus47) session