Changelog
Nelson Home — Changelog
Standard: Nelson Home | Internal Domain: nelson.home | Tailnet: tadpole-dory.ts.net
2026-03-08 (Doc Audit — Node Type Corrections)
- FIX [Claude Code]: Corrected nelson-edge node type across 6 files — docs said "Raspberry Pi 5" but live Proxmox audit confirms it is LXC 301 on nelson-pve (192.168.1.2, 2 GB RAM). A physical Pi (MAC dc:a6:32, .247/.90) remains on the network as a legacy device but is not the active edge node.
- FIX [Claude Code]: Corrected nelson-manager RAM in README.md resource budget (2 GB → 4 GB, confirmed by live audit).
- FIX [Claude Code]: Updated ROADMAP.md resource budget — nelson-manager 3 GB → 4 GB, nelson-edge Pi 5/8 GB/4-core → LXC 301/2 GB/1-core.
- FIX [Claude Code]: Updated GEMINI.md — NPM local URL was pointing to ubuntu-server (192.168.1.11:81), corrected to nelson-edge (192.168.1.2:81); AdGuard Tailscale IP was stale nelson-pi address (100.75.196.25), corrected to nelson-edge LXC (100.77.163.93); Homepage port corrected (3000 → 3030).
- FIX [Claude Code]: Updated KNOWLEDGE.md hostvars comment — "(Pi)" → "(LXC)".
- FIX [Claude Code]: Updated SPRINT.md DNS resilience task — "Pi 5 failure" → "nelson-edge LXC failure".
2026-03-08 (Late Night — Architecture Visualization)
- FEAT [Claude Code]: Built interactive architecture visualization at
/architecture— vis.js network graph showing all hosts, containers, proxies, VMs, LXCs, and network devices pulled from audit reports. - FEAT [Claude Code]: Added config drift detection — compares intended state (common.yml, containers.yml, hosts.ini) against discovered state (audit reports) and flags missing, unmanaged, and mismatched items.
- FEAT [Claude Code]: Added Prometheus live metrics overlay — CPU, RAM, disk usage displayed on node tooltips with health-based border colors (green/amber/red).
- FEAT [Claude Code]: Created 5 audit report parsers (Docker, NPM, Proxmox, Network, Resilience) and 3 config parsers (common.yml, containers.yml, hosts.ini) with 57 unit tests.
- FEAT [Claude Code]: Added Uptime Kuma and Home Assistant links to ops dashboard sidebar.
- DOCS [Claude Code]: Created architecture visualization design doc and implementation plan.
- FIX [Claude Code]: Served vis-network.js locally — CDN (unpkg.com) was unreachable on local network.
- FIX [Claude Code]: Deduplicated architecture device node IDs — two UAP-AC-Lite APs with same name caused vis.js crash.
- FIX [Claude Code]: Roadmap parser now counts plain bullet items as tasks; COMPLETE phases show all items done.
2026-03-08 (Evening — Monitoring Expansion)
- FEAT [Claude Code]: Deployed Unpoller v2.34.0 to monitoring stack — full UniFi network observability (44 clients, 2 APs, 1 USG, 1 switch). 6 community Grafana dashboards imported + UniFi summary row on Nelson Home Overview.
- FIX [Claude Code]: Upgraded cAdvisor v0.49.1 → v0.51.0 on ubuntu-server — Docker API version mismatch broke container name labels.
- FIX [Claude Code]: Added explicit
docker pullto cAdvisor deploy playbook to prevent cached old image on redeploy. - FIX [Claude Code]: Restored btnelson UniFi admin role via MongoDB
db.privilege.updateOne()— role had been changed to readonly. - FIX [Claude Code]: Fixed Home Assistant crash loop — updated stale image (simplejson ImportError) and corrected volume mount path to
/opt/docker-data/homeassistant. - FIX [Claude Code]: Added
http.trusted_proxiesto HA config for NPM reverse proxy (was returning 400 Bad Request). - FEAT [Claude Code]: Added
homeassistant.nelson.homeNPM proxy host + AdGuard DNS rewrite. - FEAT [Claude Code]: Added Home Assistant monitor to Uptime Kuma.
- FEAT [Claude Code]: Added
unpoller_passwordto Semaphore Default environment. - DOCS [Claude Code]: Created Unpoller design doc and implementation plan.
2026-03-08
- FEAT [Claude Code]: Added home dashboard to Nelson Ops — sprint stats, roadmap progress, audit report status, recent crew activity, and quick access links to all services.
- FEAT [Claude Code]: Added About page rendering README.md via markdown-it.
- FEAT [Claude Code]: Added BRIDGE nav group to sidebar with Home, About, and prominent Vaultwarden link.
- FIX [Claude Code]: Diagnosed and fixed
.nelson.homeDNS resolution failure — stale Tailscale global nameserver (nelson-pi) was overriding AdGuard. Added split DNS rule in Tailscale admin routing.nelson.hometo nelson-edge's AdGuard. - FEAT [Claude Code]: Built Nelson Ops dashboard — Node.js/Express web app at
ops.nelson.homewith LCARS Star Trek theme. Views: Sprint board (interactive checkboxes), Standup, Roadmap (progress bars), Audit Reports, Crew Activity, Knowledge/Runbooks/Changelog docs, Archive. - FEAT [Claude Code]: Sprint board checkbox toggle commits and pushes changes via git automatically.
- FEAT [Claude Code]: LCARS design system — amber/lavender/periwinkle/peach/ice-blue palette, collapsible sections, SVG favicon, stardate display, GitHub link.
- FEAT [Claude Code]: Font size controls (A-/A+) with localStorage persistence,
--font-scaleCSS variable (0.75x to 1.6x). - FEAT [Claude Code]: Archive viewer reads
.ops/archive/{sprints,retrospectives,reports}/and renders as collapsible markdown cards. - FEAT [Claude Code]: Deployed to nelson-manager with nodemon hot-reload —
git pullauto-restarts app, no Semaphore needed. - DOCS [Claude Code]: Documented nelson-ops dev workflow in KNOWLEDGE.md, PROTOCOL.md, CLAUDE.md — SSH deploy permitted for app dev, distinct from IaC Semaphore workflow.
- FIX [Claude Code]: Fixed deploy_stack rsync permission issues on nelson-manager — rsync with
--delete-afterandbecome: truedeletes app files and changes ownership. Documented workaround (git checkout -- docker-compose/nelson-ops/). - FIX [Gemini CLI]: Corrected Prometheus scrape targets in
monitoringstack (localhost -> node-exporter:9100). - FIX [Gemini CLI]: Fixed Grafana dashboard datasource linking by defining static "Prometheus" UID.
- FIX [Gemini CLI]: Added missing dependencies (rsync, python3-docker) to
deploy_monitoring.ymlplaybook. - FEAT [Gemini CLI]: Verified Uptime Kuma and Monitoring stacks are ready for active service checks.
- FIX [Claude Code]: Resolved Grafana datasource UID mismatch — added
deleteDatasourcesdirective to force re-provision with correct UIDPrometheus. All dashboard panels now resolve correctly. - FIX [Claude Code]: Fixed cAdvisor Prometheus scrape target (port 8082 → 8080 internal).
- FIX [Claude Code]: Added
recreate: alwaystodeploy_monitoring.ymlso bind-mounted config changes take effect on Semaphore redeploy. - REMOVED [Claude Code]: Dropped moonraker Prometheus scrape target — Moonraker v0.10.0 (Bullseye) lacks
[prometheus]component support. - FEAT [Claude Code]: Created 12 Uptime Kuma monitors via API — 8 HTTP service checks (Semaphore, Grafana, Prometheus, Vaultwarden, AdGuard, NPM, UniFi, Moonraker) + 4 node ping checks (manager, edge, ubuntu-server, pve).
- FEAT [Claude Code]: Created Uptime Kuma API key (
semaphore-automation, expires 2027-03-08) and stored in Semaphore Default variable group asuptime_kuma_api_key. - FEAT [Claude Code]: Created custom "Nelson Home Overview" Grafana dashboard — node status, CPU/RAM/disk gauges, container metrics, network traffic. Set as home dashboard.
- FIX [Claude Code]: Fixed node-exporter on nelson-manager — switched to
network_mode: host+pid: host+hostname: nelson-managerso Grafana dashboards show correct host labels instead of container IDs. - FEAT [Claude Code]: Configured Grafana unified alerting with Telegram contact point (
nelson-homebot). Created 4 alert rules: Node Down, High CPU, High Memory, Disk Critical. - FEAT [Claude Code]: Configured Uptime Kuma Telegram notifications — applied to all 13 monitors as default.
- DOCS [Claude Code]: Updated PROTOCOL.md architecture with full observability stack details and alerting strategy. Updated ROADMAP.md Phase 2.3 to COMPLETE. Added comprehensive observability architecture section to KNOWLEDGE.md.
2026-03-07
- FIX [Gemini]: Resolved
audit_master.ymlfailure by removing the archivedsync_gemini_knowledge.ymlimport. - REFACTOR [Gemini]: Redesigned
audit_docker.ymlto target all active nodes (manager_nodes,edge_nodes,ubuntu-server) and aggregate reports in a non-destructive manner. - REFACTOR [Gemini]: Updated
audit_npm.ymlto correctly targetedge_nodes(nelson-edge) for proxy host audits. - VERIFIED [Gemini]: Successfully ran the full
audit_master.ymlsuite via Semaphore API. - FIX [Gemini]: Implemented a shell-based fallback (
docker inspect) inaudit_docker.ymlfor environments without therequestslibrary (e.g., nelson-edge). - SUCCESS [Gemini]: Validated the Proxmox audit using native
pveshonnelson-pve. - TASK [Gemini]: Updated
SPRINT.mdand ready for the user to set the final Semaphore cron schedule.
2026-02-20
- FIX [Claude Code]: Redesigned MEMORY.md as thin pointer — removed drifting PROTOCOL.md summary, now contains only mandatory session-start instruction + critical safety rules. Solves stale-copy problem at root.
- FIX [Claude Code]: Corrected stale Semaphore IP in GEMINI.md (
.20→.30). - CLEANUP [Claude Code]: Archived
GEM_SYSTEM_PROMPT.mdandansible/sync_gemini_knowledge.yml(both v9 Google Drive era) to.ops/archive/. - FEAT [Claude Code]: Added MEMORY.md as item 9 in KNOWLEDGE.md atomic documentation update checklist to prevent future drift.
- FEAT [Claude Code]: Added PreToolUse hook blocking direct
secrets.ymledits and Stop hook reminding about post-work checklist when infra files are modified (.claude/hooks/). - FEAT [Claude Code]: Added
/post-workand/standupskills to.claude/skills/.
2026-02-19
- RETRO [Claude Code]: First repository retrospective — covered full history (Apr 2025 — Feb 2026). 7 lessons extracted, all actioned. Stored in
.ops/archive/retrospectives/. - FEAT [Claude Code]: Filled all 5 event runbooks in
.ops/RUNBOOKS.mdfrom real incidents (were stubs). Added retrospective to scheduled runbooks. - FIX [Claude Code]: Updated stale references across
.ops/— PROTOCOL.md architecture section + inventory groups + Semaphore URL, ROADMAP.md phases, KNOWLEDGE.md IPs. - FEAT [Claude Code]: Added 3 new KNOWLEDGE.md standing rules — version pinning, atomic documentation updates, credential hygiene.
- FEAT [Claude Code]: Added retrospective cadence, commit quality convention, and runbook-consultation step to PROTOCOL.md.
- SUCCESS [Gemini]: Configured public proxy host for Proxmox Web UI (
proxmox.tudhopenelson.duckdns.org) with Let's Encrypt SSL onnelson-edge. - DOCS [Gemini]: Integrated owned domains (
tudhopenelson.com,palladiumresearch.com,tanzolabs.com) into architecture documentation and created a task to define the public exposure strategy. - SUCCESS [Gemini]: Updated UniFi Port Forwarding rules to point HTTP (80) and HTTPS (443) to the new
nelson-edgenode (.2), completing the proxy migration. - FIX [Gemini]: Resolved Docker-in-LXC startup failure (
sysctl net.ipv4.ip_unprivileged_port_startpermission denied) by settinglxc.apparmor.profile: unconfinedin Proxmox LXC configuration and addingkeyctl=1feature. - DECOMMISSIONED [Gemini]: Stoped and destroyed
nelson-identityLXC (ID 200, .20) after a full snapshot backup to Proxmox storage (NelsonBackups). - SUCCESS [Gemini]: Deployed and configured Nginx Proxy Manager and AdGuard Home on new
nelson-edgenode (.2). - FIX [Gemini]: Resolved AppArmor access denied issues for Docker containers in LXC by setting
security_opt: [apparmor:unconfined]. - REFACTOR [Gemini]: Resolved port 80 conflict on Edge by moving AdGuard dashboard to port 3000.
- SUCCESS [Gemini]: Automated configuration of 13 Proxy Hosts and 17 DNS rewrites via Semaphore.
- FIX [Gemini]: Fixed UniFi Network Audit playbook — updated login endpoint to
/api/loginand corrected cookie property usage (cookies_string). - REFACTOR [Gemini]: Updated
common.ymlanddashboard.ymlto reflect new service locations onnelson-managerandnelson-edge. - DOCS [Gemini]: Added "Semaphore Template Configuration" guidelines to
KNOWLEDGE.mdregarding Variable Groups and Vault requirements. - FIX [Claude Code]: Diagnosed and resolved UniFi outage —
linuxserver/unifi-network-application:latestpulled 2026-02-19 requiresunifi_auditMongoDB permission not provisioned by original init script. Granted role live (no data loss), restored service. - FIX [Claude Code]: Pinned UniFi image to
lscr.io/linuxserver/unifi-network-application:9.0.114after new image version showed setup wizard despite intact data (schema incompatibility). Controller fully restored. - PATCH [Claude Code]: Updated
docker-compose/unifi/initdb/init-mongo.shto includeunifi_auditdbOwner role for future fresh installs. - DOCS [Claude Code]: Added UniFi + MongoDB section to
KNOWLEDGE.md— breaking change, live fix command, and patch details. - DOCS [Claude Code]: Added UniFi force re-adopt runbook to
KNOWLEDGE.md(mca-cli / set-inform procedure). - TASK [Claude Code]: Added Vaultwarden password audit checklist to
SPRINT.md— USG SSH, MongoDB credentials, NPM, Semaphore, Proxmox, Vault passphrase, DuckDNS token. - DECISION [Claude Code]: SSL strategy — defer to Caddy on nelson-gateway. Vaultwarden accessible once NPM moves to nelson-edge.
2026-02-18
- MIGRATED [Gemini]: Moved Semaphore + Postgres from monolith to
nelson-identity(192.168.1.20). Clean database restore. Old monolith instance stopped. - REFACTOR [Gemini]: Updated
common.ymland NPM proxy hosts to routesemaphore.nelson.hometo identity node. - FIX [Gemini]: Recreated
nelson-identityas a Privileged LXC to resolve Docker sysctl/AppArmor issues for Postgres. - FEAT [Claude Code]: Created
configure_adguard_dns.ymlandconfigure_npm_hosts.ymlfor automated DNS/NPM routing management. - FEAT [Claude Code]: Full Homepage dashboard redesign with improved grouping and widgets.
- REFACTOR [Claude Code]: Hardened 8 Ansible playbooks — fixed host group bugs, undefined variables, credential leaks, non-idempotent regex.
- REFACTOR [Claude Code]: Created
group_vars/all/common.ymlas single source for shared variables. - FEAT [Claude Code]: Created
CLAUDE.mdoperating protocol for cross-session context. - FEAT [Claude Code]: Architecture review — identified ghost infra (bolt-claw VM, dead NPM rules), expanded roadmap through Phase 4.
- REFACTOR [Claude Code]: Restructured project management into
.ops/directory (ROADMAP, SPRINT, KNOWLEDGE, STANDUP) replacing flat .gemini/ files. - VERIFIED [Gemini]: Restored and ran full audit suite via Semaphore — 100% pass.
- SUCCESS [Gemini]: Provisioned and bootstrapped
nelson-identityLXC on Proxmox. - SUCCESS [Gemini]: Automated Vaultwarden and UniFi backups to Proxmox storage.
- SUCCESS [Gemini]: Configured AdGuard Home with DNS resilience (Quad9/Cloudflare).
- SUCCESS [Gemini]: Formatted and mounted 5TB WD HDD at
/mnt/nelson-backupson Proxmox. - SUCCESS [Gemini]: Established Nelson Home naming and documentation standard.
- REFACTOR [Gemini]: Centralized Proxmox API identifiers in Vault.
2026-02-17
- INIT [Gemini]: Initialized Gemini memory system — created
GEMINI.md,.gemini/TASKS.md,.gemini/MEMORY.md. - DOCS [Gemini]: Added operator station setup (MacBook Pro SSH config) to README.