NELSON HOME OPS CONSOLE

Sprint Board

Immediate Tasks

Set up alerting system (Telegram) for lab failures [Claude Code, 2026-03-08 09:40]
Done: Grafana (4 threshold alerts) + Kuma (13 service monitors) → Telegram nelson-home bot.
Next: Add ntfy/Apprise channels if additional platforms are needed.
Deploy Monitoring Stack (Prometheus, Grafana, cAdvisor) to nelson-manager [Gemini]
Deploy Uptime Kuma to nelson-manager [Gemini]
Push monitoring fixes to GitHub and redeploy via Semaphore [Claude Code, 2026-03-08 09:05]
Configure Uptime Kuma monitors for all core services and nodes [Claude Code, 2026-03-08 09:15]
Ensure Tailscale is set up and running correctly on all nodes [Gemini]

Phase 2.3: Manager Node Live

Provision nelson-manager LXC (ID 300, 192.168.1.30)
Migrate Semaphore & Postgres to nelson-manager
Migrate Vaultwarden to nelson-manager
Deploy Uptime Kuma to nelson-manager
Deploy Monitoring (Prometheus/Grafana) to nelson-manager [Claude Code, 2026-03-08 09:05]
Configure Uptime Kuma monitors (12 monitors: 8 HTTP + 4 ping) [Claude Code, 2026-03-08 09:15]

Phase 2.4: Networking (Edge)

Deploy NPM to nelson-edge (Pi 5) [Gemini]
Determine public exposure strategy for owned domains (tudhopenelson.com, palladiumresearch.com, tanzolabs.com) [Gemini]
Update internal DNS (AdGuard) for new service locations [Gemini]
Deploy AdGuard Home to the new Edge LXC [Gemini]
Decommission nelson-identity LXC (.20) [Gemini]

Phase 2.1: Pre-Migration Cleanup

Migrate all plaintext secrets to Vaultwarden and Ansible Vault [Gemini]
Next: Use Vaultwarden as the source of truth; template compose files to remove plaintext.
Checklist:
- [ ] UniFi MongoDB root password (`example`)
- [ ] Semaphore admin/postgres passwords
- [ ] NPM admin credentials
- [ ] Pulse dashboard password
- [ ] Proxmox API / SSH credentials
- [ ] Ansible Vault passphrase
- [ ] DuckDNS token
Audit bolt-claw (VM 105)
Next: SSH in, check running processes and purpose, decide keep/kill
Decommission unused VMs (ubuntu-desktop 101, netbox 104)
Next: Verify no dependencies, then stop and remove
Remove 3 dead NPM rules targeting 192.168.1.157
Next: Use NPM API to identify and delete rules
Template compose credentials into Ansible Vault
Next: Identify 4 files with plaintext passwords (UniFi MongoDB, Pulse, Semaphore, NPM)
Audit and store all lab passwords in Vaultwarden
Next: Verify each item below exists as a Vaultwarden entry, add any missing ones
Checklist:
- [x] UniFi admin 'ansible' (user: ansible) — Added to controller manually
- [ ] USG SSH password (192.168.1.1, user: btnelson) — needed for force re-adopt
- [ ] UniFi MongoDB root password (docker-compose/unifi — currently plaintext: `example`)
- [ ] UniFi MongoDB unifi user password (currently plaintext: `unifi111`)
- [ ] NPM admin credentials
- [ ] Semaphore admin credentials
- [ ] Proxmox root / btnelson passwords (nelson-pve)
- [ ] Ansible Vault passphrase
- [ ] DuckDNS token
- [ ] Vaultwarden admin token
Reconcile unmanaged containers into IaC
Next: Add duckdns, plex, nextcloud, glances, qdirstat, mongo-express to containers.yml or decommission
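The reconcile step above amounts to a set difference between what is running and what the IaC inventory declares. A minimal sketch, assuming the running container names and the names declared in containers.yml have already been collected into plain Python collections; the function name `find_unmanaged` and the sample `managed` set are hypothetical, since the real contents of containers.yml are not shown here:

```python
def find_unmanaged(running, managed):
    """Return running container names that the managed inventory does not declare."""
    return sorted(set(running) - set(managed))


# The first six names come from the task above; "grafana" and the managed set
# are hypothetical stand-ins for whatever containers.yml actually declares.
running = ["duckdns", "plex", "nextcloud", "glances", "qdirstat", "mongo-express", "grafana"]
managed = {"grafana"}

for name in find_unmanaged(running, managed):
    print(f"unmanaged: {name}")
```

Each name it reports is a candidate either for a new containers.yml entry or for decommissioning.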

Ops & Observability

Deploy secondary AdGuard Home to nelson-manager (resilience against nelson-edge LXC failure) [PLANNED]
Define event runbook: New Docker Service Added [Claude Code]
Define event runbook: New Node Provisioned [Claude Code]
Define event runbook: Service Decommissioned [Claude Code]
Define event runbook: Node Decommissioned [Claude Code]
Define event runbook: DNS / Routing Changed [Claude Code]
All 5 event runbooks written from real incidents. See .ops/RUNBOOKS.md.
Choose notification stack — Grafana + Kuma dual alerting via Telegram [Claude Code, 2026-03-08 09:40]
Deploy Unpoller for UniFi network monitoring in Grafana [Claude Code, 2026-03-08 10:20]
Done: Unpoller v2.34.0 on nelson-manager, 6 Grafana dashboards imported, UniFi summary row on overview.
Expand monitoring to all nodes (nelson-pve, ubuntu-server cAdvisor) [Claude Code, 2026-03-08 10:00]
Done: cAdvisor v0.51.0 on ubuntu-server, node-exporter on nelson-pve (systemd binary).
Add Home Assistant to monitoring (Uptime Kuma) [Claude Code, 2026-03-08 11:10]
Wire scheduled runbook results into notification system
Next: Depends on the notification service being live; audit playbooks should then push their results to the alert endpoint.
Modernize audit playbooks for multi-node (manager, edge, monolith) [Gemini]
Fix audit_master.yml broken imports (removed sync_gemini_knowledge) [Gemini]
Verify Template 4 (RUN ALL AUDITS) execution via Semaphore API [Gemini]
Set Semaphore cron to weekly Sunday evening for Template 4
Next: User to finalize schedule in Semaphore UI (nelson-manager:3010)
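For the task of wiring scheduled runbook results into the notification system, one possible shape is a small helper that turns an audit result into a Telegram Bot API `sendMessage` request. This is a sketch under assumptions, not the board's actual integration: the function names, the message layout, and the idea of passing a list of failed hosts are all hypothetical, and the bot token and chat ID would come from Vaultwarden or Ansible Vault rather than source code.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_audit_alert(bot_token, chat_id, playbook, failed_hosts):
    """Return (url, payload) for a Telegram sendMessage call summarizing one audit run."""
    status = "FAILED" if failed_hosts else "OK"
    lines = [f"Audit {playbook}: {status}"]
    # List failed hosts in a stable order so repeated alerts are easy to compare.
    lines += [f"- {host}" for host in sorted(failed_hosts)]
    url = f"{TELEGRAM_API}/bot{bot_token}/sendMessage"
    payload = {"chat_id": chat_id, "text": "\n".join(lines)}
    return url, payload


def send_audit_alert(url, payload, timeout=10):
    """POST the payload as JSON; returns True on HTTP 200."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

Keeping the message builder separate from the network call lets the formatting be checked without contacting Telegram.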

Future Projects

Design web-based interface for `.ops` directory management [Claude Code, 2026-03-08 18:00]
Create interactive sprint board with checkbox toggle + git sync [Claude Code, 2026-03-08 18:30]
Create standup, roadmap, crew activity, and reports views [Claude Code, 2026-03-08 18:30]
LCARS Star Trek theme with collapsible sections and font size controls [Claude Code, 2026-03-08 22:45]
Deploy to nelson-manager with nodemon hot-reload dev loop [Claude Code, 2026-03-08 22:00]
Archive viewer for sprints, retrospectives, and reports [Claude Code, 2026-03-08 22:45]
Document viewer for Knowledge Base, Runbooks, and Changelog [Claude Code, 2026-03-08 18:30]
Home dashboard with sprint/roadmap/audit stats, recent activity, quick links [Claude Code, 2026-03-08 23:15]
About page rendering README.md [Claude Code, 2026-03-08 23:15]
BRIDGE nav group with Home, About, and Vaultwarden link [Claude Code, 2026-03-08 23:15]
Implement Runbook automation integration with Semaphore API
Develop interactive architecture visualization [Claude Code, 2026-03-08 23:45]
Done: vis.js network graph with audit-driven nodes, Prometheus live metrics, config drift detection, LCARS theme.