
Web Builder + Asset Grabber: Handoff (2026-04-22)

Session: web-asset (Opus47). Two missions: site-build scaffolding and browser/asset-grab tooling. Both landed; end-to-end verified with a real page + real grabbed image.


1. What's installed on UDev

System packages (apt)

Package        Version      Purpose
webp           1.3.2        cwebp encoder
libavif-bin    1.0.4        avifenc encoder
yt-dlp         2024.04.09   Video downloads
python3-venv   3.12.3       venvs
xvfb           21.1.12      Virtual display fallback (not needed: Chromium headless works)

Pre-existing and confirmed: ffmpeg, imagemagick (convert), exiftool.

Python venv (/home/justinwieb/.forge-venvs/assets/)

Outside the git repo so pip installs don't pollute the workspace.

Package      Version
playwright   1.58.0
gallery-dl   1.31.10
Pillow       12.2.0
requests     2.33.1

Playwright browser

Chrome Headless Shell 145.0.7632.6 at ~/.cache/ms-playwright/chromium_headless_shell-1208/. System deps (fonts, X libs) installed via playwright install-deps chromium.

Headless smoke test (passed): loaded https://example.com, read title Example Domain.


2. Site scaffold

Directory layout added

sites/_shared/
  tokens.css           Design tokens (CSS variables for color/typography/spacing/motion)
  base.css             Reset + .container/.card/.btn/.eyebrow/.forge-hdr/.forge-ftr
  header.html          Shared header with slots: brand, nav
  footer.html          Shared footer with slots: tagline, links
  forge-include.js     40-line include loader w/ per-page slot overrides
  README.md
sites/_templates/
  single-page.html     PressYourLuck-style starter — self-contained, pulls tokens only
  content-page.html    AustinGuide-style starter — uses include loader for header/footer
scripts/sites/
  new-page.sh          Scaffolds <site>/<slug>/index.html from a template with macro subs
  preview.sh           Unions site + _shared + _templates, serves via python http.server on first free port 8090–8097. Prints Tailscale URL.
  screenshot.sh        Playwright desktop + mobile screenshots → logs/site-screenshots/
  deploy.sh            Generic tar-and-push to CT 102. Delegates if site has its own deploy.sh.
docs/sites-playbook.md

SSG choice: vanilla + tiny include loader

Evaluated: 11ty, Astro, Vite static, vanilla. Picked vanilla HTML + forge-include.js + tokens.css:

  • Zero build step, zero node_modules/.
  • Existing sites/gizmo/node_modules/ sprawl was the counter-example.
  • PressYourLuck-style single-file pages stay supported identically (they only depend on tokens.css).
  • Content pages get shared header/footer via <div data-forge-include="/_shared/header.html"> + <template data-forge-slot="nav">.

Design tokens strategy

One tokens.css at sites/_shared/. Tokens cover: dark-default palette (--bg, --text, --muted), semantic (--good, --danger, --warn), accent (--accent, --accent-bright, --accent-dim, --accent-border), typography (--font-body, --font-display, --font-mono, --fs-* clamps), spacing (--s-1..--s-10), radius, motion (--dur, --ease), elevation (--shadow-*), layout (--container). Light-mode opt-in via html.light. Brand overrides = a brand-tokens.css per site loaded after tokens.css.

Asset references

New pages have an assets/ dir. The grab, search, and optimize scripts accept --out <dir>; point them straight at the page's assets/ so deploy-ready files land there.

Deploy

  • justinkrystal.com: has its own deploy.sh; the generic deploy auto-delegates to it.
  • justinsforge.com: served by nginx on UDev port 8100 directly from forge/sites/justinsforge.com/. No remote sync needed; edits are instant.
  • New sites: the generic flow tars sites/<site>/ + _shared/ + _templates/ and pushes to CT 102 under /mnt/storage/appdata/<target>/landing. Add a Cloudflare tunnel ingress entry on CT 102 when adding a hostname.
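
The bundling half of the generic flow can be sketched in Python as below. This is an illustration only, not the contents of deploy.sh: the function name build_site_bundle and the gzip format are assumptions, and the push to CT 102 is left out.

```python
import tarfile
from pathlib import Path


def build_site_bundle(sites_root: Path, site: str, out_path: Path) -> list[str]:
    """Tar sites/<site>/ plus _shared/ and _templates/ into one gzip archive.

    Returns the top-level directory names that were added.
    (Hypothetical helper; deploy.sh itself may bundle differently.)
    """
    added = []
    with tarfile.open(out_path, "w:gz") as tar:
        for name in (site, "_shared", "_templates"):
            src = sites_root / name
            if not src.is_dir():
                raise FileNotFoundError(f"missing directory: {src}")
            # arcname keeps paths relative, so the archive unpacks cleanly
            # under whatever landing directory the target uses.
            tar.add(src, arcname=name)
            added.append(name)
    return added
```

Failing fast on a missing _shared/ or _templates/ directory avoids shipping a page whose header, footer, or tokens would 404 on the target.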

Previews from Venus

preview.sh symlinks the site root + _shared/ + _templates/ into a temp dir, serves it on a free port, and prints http://100.97.43.104:<port>/, which works directly in Safari on Venus.
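
The "first free port in 8090–8097" selection could look roughly like this in Python. It is a sketch of the idea, not the logic inside preview.sh; the function name first_free_port and the bind-probe approach are assumptions.

```python
import socket


def first_free_port(start: int = 8090, end: int = 8097, host: str = "0.0.0.0") -> int:
    """Return the first port in [start, end] that can be bound on host.

    Hypothetical helper: probes by binding, without SO_REUSEADDR, so a port
    that another process already holds is reported as busy.
    """
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # in use (or not bindable); try the next port
            return port
    raise RuntimeError(f"no free port in {start}-{end}")
```

There is a small window between the probe closing and the server binding, so a real script would still want to handle a bind failure from the server itself.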


3. Asset grabber

Commands

scripts/assets/run grab      <url>    [--out DIR] [--min-width N] [--max N] [--media images|video|all] [--mobile] [--include-svg]
scripts/assets/run search    <query>  [--engine google|bing|ddg|unsplash|pexels] [--count N] [--out DIR]
scripts/assets/run optimize  <path>   [--out DIR] [--widths 320 640 1280 1920] [--skip-webp] [--skip-avif] [--keep-metadata]
scripts/assets/run catalog            [--roots …] [--out forge/data/assets-catalog.json] [--no-hash]

Where assets land

Flow             Default path
grab (URL)       /mnt/workspace/Assets/Web-Grabs/<date>_<host>/
search (query)   /mnt/workspace/Assets/Web-Grabs/<date>_<query>/
Site-specific    forge/sites/<site>/<page>/assets/ (pass --out)
Catalog          forge/data/assets-catalog.json

Provenance

Every download emits/merges provenance.json next to the file(s). Fields: filename, source_url, page_url, engine, query, referer, content_type, bytes, sha256, saved_at, extra. Purely a trace, not a license check. Pull anything.
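
A minimal sketch of the emit/merge step, using the field names listed above. The on-disk shape (one JSON object keyed by filename), the function name record_provenance, and the UTC ISO timestamp are assumptions; lib/provenance.py may differ on all three.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_provenance(file_path: Path, source_url: str, page_url: str, **fields) -> dict:
    """Add or update one entry in provenance.json beside file_path."""
    data = file_path.read_bytes()
    entry = {
        "filename": file_path.name,
        "source_url": source_url,
        "page_url": page_url,
        "engine": fields.get("engine"),
        "query": fields.get("query"),
        "referer": fields.get("referer", page_url),
        "content_type": fields.get("content_type"),
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "extra": fields.get("extra", {}),
    }
    manifest = file_path.parent / "provenance.json"
    # Assumed shape: a JSON object keyed by filename, so re-grabbing the same
    # file replaces its entry instead of appending a duplicate.
    existing = json.loads(manifest.read_text()) if manifest.exists() else {}
    existing[entry["filename"]] = entry
    manifest.write_text(json.dumps(existing, indent=2))
    return entry
```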

Optimization pipeline

  1. Read raster image (jpg/png/webp/gif/avif/bmp/tiff).
  2. Strip EXIF unless --keep-metadata.
  3. Resize with Pillow LANCZOS to each target width smaller than the source width; always include a source-width variant.
  4. Encode with cwebp (quality 80) and avifenc (quality 55, speed 6).
  5. Emit optimize.json with the manifest of produced variants and byte sizes.
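
The width-selection rule in step 3 can be illustrated without Pillow or the encoders. The function name plan_variants is hypothetical; the example mirrors the verified run in this handoff (960px source, --widths 640 1280 1920).

```python
def plan_variants(source_width: int, target_widths: list[int],
                  formats: tuple[str, ...] = ("webp", "avif")) -> list[tuple[int, str]]:
    """Return (width, format) pairs: targets below the source, plus the source width."""
    # Drop targets that would upscale, then always add the source width itself.
    widths = sorted({w for w in target_widths if w < source_width} | {source_width})
    return [(w, fmt) for w in widths for fmt in formats]


# A 960px source with --widths 640 1280 1920 yields 2 widths x 2 formats.
print(plan_variants(960, [640, 1280, 1920]))
# [(640, 'webp'), (640, 'avif'), (960, 'webp'), (960, 'avif')]
```

Passing a single-element formats tuple would model --skip-webp or --skip-avif.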

Security posture

  • Nothing binds to a public interface. All browser automation is subprocess-local.
  • Session cookies (if you enable a persistent Chromium profile): stored at ~/.forge-venvs/assets/browser-profile/, outside the git repo. Never committed.
  • Downloads go through the browser's request context with Referer set to the originating page.
  • Default 250ms delay between downloads inside a single grab run.
  • No OAuth was touched. No personal Gmail involved.

4. Verified end-to-end

Actual run performed in this session:

$ scripts/sites/new-page.sh justinsforge.com demo-scaffold \
    --template content --title "Forge Scaffold Demo" \
    --desc "End-to-end demo..." --eyebrow "DEMO"
  created: sites/justinsforge.com/demo-scaffold/index.html

$ scripts/assets/run grab "https://en.wikipedia.org/wiki/Austin,_Texas" \
    --out sites/justinsforge.com/demo-scaffold/assets/_raw --min-width 400 --max 3
  fetching: https://en.wikipedia.org/wiki/Austin,_Texas
  candidates: 1
  ok   1: 000_960px-downtown-austin-2c-texas-from-the-colorado.jpg
  saved: 1 files   provenance.json written

$ scripts/assets/run optimize sites/justinsforge.com/demo-scaffold/assets/_raw \
    --out sites/justinsforge.com/demo-scaffold/assets --widths 640 1280 1920
  optimizing: 1 files -> ...
  ok: 000_downtown-austin.jpg -> 4 variants  (640w + 960w × webp + avif)
  variants: 4

$ scripts/sites/preview.sh justinsforge.com --port 8094
  serving: .../justinsforge.com
  tailscale: http://100.97.43.104:8094/   ← works from Venus

$ curl -sI http://127.0.0.1:8094/demo-scaffold/    → 200 OK
$ curl -s  http://127.0.0.1:8094/_shared/tokens.css → tokens served

$ scripts/sites/screenshot.sh http://127.0.0.1:8094/demo-scaffold/ --label demo-scaffold
  wrote: logs/site-screenshots/demo-scaffold-desktop.png
  wrote: logs/site-screenshots/demo-scaffold-mobile.png

Desktop screenshot shows: header with justinsforge.com brand + Home/This Page nav, DEMO eyebrow + Playfair "Forge Scaffold Demo" headline, Austin skyline hero image, "Overview" + "Highlights" sections with 3-card grid, footer. Mobile screenshot shows the same content stacked responsively. Dark theme via tokens, no layout breakage. The entire flow took < 3 minutes end-to-end.


5. Bugs found + fixed during verification

  1. grab.py: Playwright request timeout unit. Passed args.timeout/1000.0 (seconds) when Playwright's APIRequestContext.get wants milliseconds. Symptom: Timeout 30ms exceeded. Fixed: pass args.timeout directly.
  2. grab.py: too-loose anchor match. The a[href] matcher for image extensions was pulling Wikipedia's /wiki/File:Foo.jpg HTML pages. Fixed: added !/\/(wiki|file|File:)\//.test(a.pathname) guard.
  3. optimize.py: width logic. The set comprehension filter dropped widths > source, leaving only one target. Fixed to: keep targets < source, always emit a source-width variant. 4 variants now produced correctly (640 + 960 × webp + avif).
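
For reference, the anchor guard from bug 2 behaves as follows when transcribed to Python. The real check runs as JavaScript inside the page; the extension list and the function name is_direct_image_link are assumptions made for this sketch.

```python
import re
from urllib.parse import urlparse

# Assumed extension list; the matcher in grab.py may accept more or fewer.
IMAGE_EXT = re.compile(r"\.(jpe?g|png|webp|gif|avif)$", re.IGNORECASE)
# Same pattern as the guard quoted in bug 2, applied to the URL path.
HTML_WRAPPER = re.compile(r"/(wiki|file|File:)/")


def is_direct_image_link(href: str) -> bool:
    """True if href looks like an image file rather than an HTML wrapper page."""
    path = urlparse(href).path
    return bool(IMAGE_EXT.search(path)) and not HTML_WRAPPER.search(path)


# Wikipedia file page: image extension, but served as HTML under /wiki/.
print(is_direct_image_link("https://en.wikipedia.org/wiki/File:Foo.jpg"))
# False
```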

6. Tool registry updates

Registered per Justin's feedback_register_tools.md rule:

  • memory/reference_site_scaffold.md: points at docs/sites-playbook.md, summarizes scripts/sites + sites/_shared
  • memory/reference_asset_grabber.md: points at docs/asset-grabber-playbook.md, summarizes scripts/assets + venv location
  • MEMORY.md: two new Tools & Pipelines entries (Site Scaffold, Asset Grabber)

7. Open decisions (numbered, each with a recommendation)

  1. SSG: resolved. Vanilla + forge-include.js. Revisit only if we start shipping >100 pages under one site or need prerendered data-from-API pages.
  2. Cloudflare cache invalidation: not wired. Recommendation: add a --purge HOST flag to scripts/sites/deploy.sh that hits POST /zones/:id/purge_cache with an API token scoped to Zone.Cache Purge only, stored at ~/.forge-secrets/cloudflare.env. Don't do it yet: Cloudflare's default cache is short and no page needs aggressive caching.
  3. Screenshot diff: not wired. Recommendation: a --compare baseline.png flag on screenshot.sh that uses PIL.ImageChops.difference and fails deploy if pixel delta > 2%. Low priority until we're deploying 10+ sites.
  4. Persistent browser profile for auth'd grabs: not wired. Recommendation: when Justin actually needs Shopify/Adobe Stock assets, spin up a one-time headful Chromium over ssh -Y (or VNC if X forwarding is a pain), log in, then all subsequent grab.py runs reuse the profile. Keep under ~/.forge-venvs/assets/browser-profile/.
  5. gallery-dl config: installed but no config yet. Recommendation: when Justin first tries to bulk-pull a social profile, drop a ~/.config/gallery-dl/config.json that routes outputs to /mnt/workspace/Assets/Social-Grabs/{category}/{user}/ and sets a polite 1-req/s rate limit.
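  6. optimize.py AVIF quality: currently 55. Note on scale: if this is passed via avifenc -q, the range is 0-100 and higher = better; lower = better applies only to the quantizer flags (--min/--max, 0-63). Recommendation: leave it; on the test image it produces a ~17% smaller file than WebP at visually identical quality (25KB AVIF vs 30KB WebP at 640w).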
  7. Catalog scope: currently scans /mnt/workspace/Assets + forge/assets. Recommendation: add /mnt/workspace/JustinWieb-VR, /mnt/workspace/Nova-Design, /mnt/workspace/Gus-Outdoor-Co, /mnt/workspace/Sip-N-Serve when Justin wants brand assets indexed, but run with --no-hash to keep it fast (these are big dirs).
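
For decision 2, the purge call could be built along these lines. This is a sketch only: nothing like it exists in deploy.sh yet, the function name build_purge_request is hypothetical, and whether purge-by-hostname is available depends on the Cloudflare plan.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id: str, api_token: str, host: str) -> urllib.request.Request:
    """Build (but do not send) a purge_cache request for one hostname."""
    body = json.dumps({"hosts": [host]}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=body,
        method="POST",
        headers={
            # Token should be scoped to Cache Purge only, as recommended above.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it would be a matter of urllib.request.urlopen(req, timeout=30), with the zone ID and token read from ~/.forge-secrets/cloudflare.env rather than hard-coded.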

8. Cost

Everything runs on UDev. No API keys required. Optional:

Key                   Service    Free tier     When to bother
UNSPLASH_ACCESS_KEY   Unsplash   50 req/hr     If Google Images results get rate-limited or you want cleaner metadata
PEXELS_API_KEY        Pexels     ~200 req/hr   Same rationale

Neither needed today. Google/Bing/DDG via Playwright works out of the box and can pull anything from anywhere.


9. Files created this session

sites/_shared/{tokens.css, base.css, header.html, footer.html, forge-include.js, README.md}
sites/_templates/{single-page.html, content-page.html}
sites/justinsforge.com/demo-scaffold/{index.html, assets/{hero.jpg, 4 optimized variants, _raw/, optimize.json}}
scripts/sites/{new-page.sh, preview.sh, screenshot.sh, deploy.sh}
scripts/assets/{grab.py, search.py, optimize.py, catalog.py, run, lib/{__init__.py, provenance.py}}
docs/{sites-playbook.md, asset-grabber-playbook.md}
logs/site-screenshots/{demo-scaffold-desktop.png, demo-scaffold-mobile.png}
memory/{reference_site_scaffold.md, reference_asset_grabber.md}  (in ~/.claude/.../memory/)
MEMORY.md updated with two new Tools & Pipelines entries

Next step: Justin reviews. Nothing deployed beyond local preview. The demo page lives at sites/justinsforge.com/demo-scaffold/ and is safe to delete if not wanted (rm -rf sites/justinsforge.com/demo-scaffold).

[Claude Code] Opus 4.7, web-asset (Opus47) session