Web Builder + Asset Grabber: Handoff (2026-04-22)
Session: web-asset (Opus47). Two missions: site-build scaffolding and browser/asset-grab tooling. Both landed; end-to-end verified with a real page + real grabbed image.
1. What's installed on UDev
System packages (apt)
| Package | Version | Purpose |
|---|---|---|
| webp | 1.3.2 | cwebp encoder |
| libavif-bin | 1.0.4 | avifenc encoder |
| yt-dlp | 2024.04.09 | Video downloads |
| python3-venv | 3.12.3 | venvs |
| xvfb | 21.1.12 | Virtual display fallback (not needed: Chromium headless works) |
Pre-existing and confirmed: ffmpeg, imagemagick (convert), exiftool.
Python venv (/home/justinwieb/.forge-venvs/assets/)
Outside the git repo so pip installs don't pollute the workspace.
| Package | Version |
|---|---|
| playwright | 1.58.0 |
| gallery-dl | 1.31.10 |
| Pillow | 12.2.0 |
| requests | 2.33.1 |
Playwright browser
Chrome Headless Shell 145.0.7632.6 at ~/.cache/ms-playwright/chromium_headless_shell-1208/. System deps (fonts, X libs) installed via playwright install-deps chromium.
Headless smoke test (passed): loaded https://example.com, read title Example Domain.
2. Site scaffold
Directory layout added
sites/_shared/
tokens.css Design tokens (CSS variables for color/typography/spacing/motion)
base.css Reset + .container/.card/.btn/.eyebrow/.forge-hdr/.forge-ftr
header.html Shared header with slots: brand, nav
footer.html Shared footer with slots: tagline, links
forge-include.js 40-line include loader w/ per-page slot overrides
README.md
sites/_templates/
single-page.html PressYourLuck-style starter — self-contained, pulls tokens only
content-page.html AustinGuide-style starter — uses include loader for header/footer
scripts/sites/
new-page.sh Scaffolds <site>/<slug>/index.html from a template with macro subs
preview.sh Unions site + _shared + _templates, serves via python http.server on first free port 8090–8097. Prints Tailscale URL.
screenshot.sh Playwright desktop + mobile screenshots → logs/site-screenshots/
deploy.sh Generic tar-and-push to CT 102. Delegates if site has its own deploy.sh.
docs/sites-playbook.md
SSG choice: vanilla + tiny include loader
Evaluated: 11ty, Astro, Vite static, vanilla. Picked vanilla HTML + forge-include.js + tokens.css:
- Zero build step, zero node_modules/.
- Existing sites/gizmo/node_modules/ sprawl was the counter-example.
- PressYourLuck-style single-file pages stay supported identically (they only depend on tokens.css).
- Content pages get shared header/footer via <div data-forge-include="/_shared/header.html"> + <template data-forge-slot="nav">.
Design tokens strategy
One tokens.css at sites/_shared/. Tokens cover: dark-default palette (--bg, --text, --muted), semantic (--good, --danger, --warn), accent (--accent, --accent-bright, --accent-dim, --accent-border), typography (--font-body, --font-display, --font-mono, --fs-* clamps), spacing (--s-1..--s-10), radius, motion (--dur, --ease), elevation (--shadow-*), layout (--container). Light-mode opt-in via html.light. Brand overrides = a brand-tokens.css per site loaded after tokens.css.
Asset references
New pages have an assets/ dir. The grab, search, and optimize scripts all accept --out <dir>; point them straight at the page's assets/ dir so files land deploy-ready.
Deploy
- justinkrystal.com: has its own deploy.sh; the generic deploy auto-delegates to it.
- justinsforge.com: served by nginx on UDev port 8100 directly from forge/sites/justinsforge.com/. No remote sync needed; edits are instant.
- New sites: the generic flow tars sites/<site>/ + _shared/ + _templates/ and pushes to CT 102 under /mnt/storage/appdata/<target>/landing. Add a Cloudflare tunnel ingress entry on CT 102 when adding a hostname.
Previews from Venus
preview.sh symlinks the site root + _shared/ + _templates/ into a temp dir, serves it on a free port, and prints http://100.97.43.104:<port>/. That URL works directly in Safari on Venus.
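The "first free port" selection that preview.sh performs can be sketched in Python as below. This is a minimal illustration, not the script's actual code: the function name `first_free_port` is hypothetical, and probing on 127.0.0.1 is an assumption (the real script may bind a different interface so the Tailscale URL is reachable).

```python
import socket
from typing import Optional


def first_free_port(start: int = 8090, end: int = 8097,
                    host: str = "127.0.0.1") -> Optional[int]:
    """Return the first port in [start, end] that can be bound on host, else None.

    Hypothetical helper: the 8090-8097 range comes from the handoff notes,
    the probing strategy is an assumption.
    """
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken or not bindable; try the next one
            return port
    return None


if __name__ == "__main__":
    port = first_free_port()
    print(f"free port: {port}" if port else "no free port in 8090-8097")
```

There is a small race between probing and actually starting the server, so a real script would still need to handle a bind failure at serve time.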
3. Asset grabber
Commands
scripts/assets/run grab <url> [--out DIR] [--min-width N] [--max N] [--media images|video|all] [--mobile] [--include-svg]
scripts/assets/run search <query> [--engine google|bing|ddg|unsplash|pexels] [--count N] [--out DIR]
scripts/assets/run optimize <path> [--out DIR] [--widths 320 640 1280 1920] [--skip-webp] [--skip-avif] [--keep-metadata]
scripts/assets/run catalog [--roots …] [--out forge/data/assets-catalog.json] [--no-hash]
Where assets land
| Flow | Default path |
|---|---|
| grab (URL) | /mnt/workspace/Assets/Web-Grabs/<date>_<host>/ |
| search (query) | /mnt/workspace/Assets/Web-Grabs/<date>_<query>/ |
| Site-specific | forge/sites/<site>/<page>/assets/ (pass --out) |
| Catalog | forge/data/assets-catalog.json |
Provenance
Every download emits/merges provenance.json next to the file(s). Fields: filename, source_url, page_url, engine, query, referer, content_type, bytes, sha256, saved_at, extra. Purely a trace, not a license check. Pull anything.
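A minimal sketch of the emit-or-merge behaviour follows. The field names are the ones listed above; everything else is an assumption for illustration: that provenance.json holds a JSON list, that records are keyed by filename, that the referer equals the page URL, and the helper names `build_record` and `merge_provenance`.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def build_record(path: Path, source_url: str, page_url: str, content_type: str,
                 engine=None, query=None, extra=None) -> dict:
    """Build one provenance record for an already-downloaded file."""
    data = path.read_bytes()
    return {
        "filename": path.name,
        "source_url": source_url,
        "page_url": page_url,
        "engine": engine,
        "query": query,
        "referer": page_url,  # assumption: Referer is the originating page
        "content_type": content_type,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "extra": extra or {},
    }


def merge_provenance(directory: Path, record: dict) -> list:
    """Insert or replace a record in directory/provenance.json, keyed by filename."""
    manifest = directory / "provenance.json"
    entries = {}
    if manifest.exists():
        for entry in json.loads(manifest.read_text(encoding="utf-8")):
            entries[entry["filename"]] = entry
    entries[record["filename"]] = record  # a re-download overwrites the old entry
    merged = sorted(entries.values(), key=lambda e: e["filename"])
    manifest.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.jpg"
        sample.write_bytes(b"not a real image")
        rec = build_record(sample, "https://example.com/sample.jpg",
                           "https://example.com/", "image/jpeg")
        print(len(merge_provenance(Path(tmp), rec)), "record(s) written")
```

Keying on filename keeps repeat grabs into the same directory idempotent: the manifest grows only when a new file appears.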
Optimization pipeline
- Read raster image (jpg/png/webp/gif/avif/bmp/tiff).
- Strip EXIF unless --keep-metadata.
- Resize with Pillow LANCZOS to each target width ≤ source width; always include a source-width variant.
- Encode with cwebp (quality 80) and avifenc (quality 55, speed 6).
- Emit optimize.json with the manifest of produced variants and byte sizes.
Security posture
- Nothing binds to a public interface. All browser automation is subprocess-local.
- Session cookies (if you enable a persistent Chromium profile): stored at ~/.forge-venvs/assets/browser-profile/, outside the git repo. Never committed.
- Downloads go through the browser's request context with Referer set to the originating page.
- Default 250ms delay between downloads inside a single grab run.
- No OAuth was touched. No personal Gmail involved.
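The 250ms inter-download delay mentioned in the list above could be implemented along these lines. A minimal sketch: the generator name `paced` is hypothetical, and whether grab.py sleeps before or after each download is an assumption (here it sleeps between consecutive items, never before the first).

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def paced(items: Iterable[T], delay_s: float = 0.25,
          sleep: Callable[[float], None] = time.sleep) -> Iterator[T]:
    """Yield items one by one, sleeping delay_s between consecutive items."""
    for index, item in enumerate(items):
        if index:
            sleep(delay_s)  # pause only between downloads, not before the first
        yield item


if __name__ == "__main__":
    for url in paced(["https://example.com/a.jpg", "https://example.com/b.jpg"]):
        print("would download:", url)
```

Injecting `sleep` as a parameter keeps the pacing logic testable without real waits.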
4. Verified end-to-end
Actual run performed in this session:
$ scripts/sites/new-page.sh justinsforge.com demo-scaffold \
--template content --title "Forge Scaffold Demo" \
--desc "End-to-end demo..." --eyebrow "DEMO"
created: sites/justinsforge.com/demo-scaffold/index.html
$ scripts/assets/run grab "https://en.wikipedia.org/wiki/Austin,_Texas" \
--out sites/justinsforge.com/demo-scaffold/assets/_raw --min-width 400 --max 3
fetching: https://en.wikipedia.org/wiki/Austin,_Texas
candidates: 1
ok 1: 000_960px-downtown-austin-2c-texas-from-the-colorado.jpg
saved: 1 files provenance.json written
$ scripts/assets/run optimize sites/justinsforge.com/demo-scaffold/assets/_raw \
--out sites/justinsforge.com/demo-scaffold/assets --widths 640 1280 1920
optimizing: 1 files -> ...
ok: 000_downtown-austin.jpg -> 4 variants (640w + 960w × webp + avif)
variants: 4
$ scripts/sites/preview.sh justinsforge.com --port 8094
serving: .../justinsforge.com
tailscale: http://100.97.43.104:8094/ ← works from Venus
$ curl -sI http://127.0.0.1:8094/demo-scaffold/ → 200 OK
$ curl -s http://127.0.0.1:8094/_shared/tokens.css → tokens served
$ scripts/sites/screenshot.sh http://127.0.0.1:8094/demo-scaffold/ --label demo-scaffold
wrote: logs/site-screenshots/demo-scaffold-desktop.png
wrote: logs/site-screenshots/demo-scaffold-mobile.png
Desktop screenshot shows: header with justinsforge.com brand + Home/This Page nav, DEMO eyebrow + Playfair "Forge Scaffold Demo" headline, Austin skyline hero image, "Overview" + "Highlights" sections with 3-card grid, footer. Mobile screenshot shows the same content stacked responsively. Dark theme via tokens, no layout breakage. The entire flow took < 3 minutes end-to-end.
5. Bugs found + fixed during verification
- grab.py: Playwright request timeout unit. Passed args.timeout/1000.0 (seconds) when Playwright's APIRequestContext.get wants milliseconds. Symptom: Timeout 30ms exceeded. Fixed: pass args.timeout directly.
- grab.py: too-loose anchor match. The a[href] matcher for image extensions was pulling Wikipedia's /wiki/File:Foo.jpg HTML pages. Fixed: added !/\/(wiki|file|File:)\//.test(a.pathname) guard.
- optimize.py: width logic. The set comprehension filter dropped widths > source, leaving only one target. Fixed to: keep targets < source, always emit a source-width variant. 4 variants now produced correctly (640 + 960 × webp + avif).
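The anchor-match guard from the second fix runs as JavaScript inside the page; the same check can be sketched in Python to show which links it rejects. The path regex mirrors the guard quoted above, while the extension list and the function name `is_direct_image_link` are assumptions for illustration.

```python
import re
from urllib.parse import urlparse

# Assumed set of image extensions; the real matcher's list is not shown in the notes.
IMAGE_EXT = re.compile(r"\.(jpe?g|png|webp|gif|avif)$", re.IGNORECASE)
# Same pattern as the guard: a /wiki/, /file/ or /File:/ path segment means an HTML page.
HTML_FILE_PAGE = re.compile(r"/(wiki|file|File:)/")


def is_direct_image_link(href: str) -> bool:
    """True if href looks like an image file and not an HTML 'file page'."""
    path = urlparse(href).path
    return bool(IMAGE_EXT.search(path)) and not HTML_FILE_PAGE.search(path)


if __name__ == "__main__":
    for link in (
        "https://en.wikipedia.org/wiki/File:Foo.jpg",
        "https://upload.wikimedia.org/wikipedia/commons/a/a1/Foo.jpg",
    ):
        print(link, "->", is_direct_image_link(link))
```

Note that /wikipedia/ does not trip the guard, because the pattern requires a slash immediately after "wiki".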
6. Tool registry updates
Registered per Justin's feedback_register_tools.md rule:
- memory/reference_site_scaffold.md: points at docs/sites-playbook.md, summarizes scripts/sites + sites/_shared
- memory/reference_asset_grabber.md: points at docs/asset-grabber-playbook.md, summarizes scripts/assets + venv location
- MEMORY.md: two new Tools & Pipelines entries (Site Scaffold, Asset Grabber)
7. Open decisions (numbered, each with a recommendation)
1. SSG: resolved. Vanilla + forge-include.js. Revisit only if we start shipping >100 pages under one site or need prerendered data-from-API pages.
2. Cloudflare cache invalidation: not wired. Recommendation: add a --purge HOST flag to scripts/sites/deploy.sh that hits POST /zones/:id/purge_cache with an API token scoped to Zone.Cache Purge only, stored at ~/.forge-secrets/cloudflare.env. Don't do it yet: Cloudflare's default cache is short and no page needs aggressive caching.
3. Screenshot diff: not wired. Recommendation: a --compare baseline.png flag on screenshot.sh that uses PIL.ImageChops.difference and fails deploy if pixel delta > 2%. Low priority until we're deploying 10+ sites.
4. Persistent browser profile for auth'd grabs: not wired. Recommendation: when Justin actually needs Shopify/Adobe Stock assets, spin up a one-time headful Chromium over ssh -Y (or VNC if X forwarding is a pain), log in, then all subsequent grab.py runs reuse the profile. Keep it under ~/.forge-venvs/assets/browser-profile/.
5. gallery-dl config: installed but no config yet. Recommendation: when Justin first tries to bulk-pull a social profile, drop a ~/.config/gallery-dl/config.json that routes outputs to /mnt/workspace/Assets/Social-Grabs/{category}/{user}/ and sets a polite 1-req/s rate limit.
6. optimize.py AVIF quality: currently 55 (libavif scale where lower=better). Recommendation: leave it; on the test image it produced a file about 17% smaller than WebP at visually identical quality (25KB AVIF vs 30KB WebP at 640w).
7. Catalog scope: currently scans /mnt/workspace/Assets + forge/assets. Recommendation: add /mnt/workspace/JustinWieb-VR, /mnt/workspace/Nova-Design, /mnt/workspace/Gus-Outdoor-Co, /mnt/workspace/Sip-N-Serve when Justin wants brand assets indexed, but run with --no-hash to keep it fast (these are big dirs).
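To make the --no-hash trade-off in the catalog item concrete, here is a minimal sketch of a catalog scan with optional hashing. It is not catalog.py: the entry fields (path, bytes, sha256) and the function names are assumptions, since the real catalog schema is not described in these notes.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large assets are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_catalog(roots, with_hash: bool = True) -> list:
    """Walk each root and list its files; skip hashing when with_hash is False."""
    entries = []
    for root in map(Path, roots):
        if not root.is_dir():
            continue  # missing roots are skipped rather than treated as errors
        for path in sorted(p for p in root.rglob("*") if p.is_file()):
            entry = {"path": str(path), "bytes": path.stat().st_size}
            if with_hash:
                entry["sha256"] = sha256_of(path)  # reads every byte: the slow part
            entries.append(entry)
    return entries


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "logo.png").write_bytes(b"demo")
        print(json.dumps(build_catalog([tmp], with_hash=False), indent=2))
```

Without hashing the scan only needs directory metadata, which is why it stays fast on large brand-asset directories; the cost is losing content-based duplicate detection.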
8. Cost
Everything runs on UDev. No API keys required. Optional:
| Key | Service | Free tier | When to bother |
|---|---|---|---|
| UNSPLASH_ACCESS_KEY | Unsplash | 50 req/hr | If Google Images results get rate-limited or you want cleaner metadata |
| PEXELS_API_KEY | Pexels | ~200 req/hr | Same rationale |
Neither needed today. Google/Bing/DDG via Playwright works out of the box and can pull anything from anywhere.
9. Files created this session
sites/_shared/{tokens.css, base.css, header.html, footer.html, forge-include.js, README.md}
sites/_templates/{single-page.html, content-page.html}
sites/justinsforge.com/demo-scaffold/{index.html, assets/{hero.jpg, 4 optimized variants, _raw/, optimize.json}}
scripts/sites/{new-page.sh, preview.sh, screenshot.sh, deploy.sh}
scripts/assets/{grab.py, search.py, optimize.py, catalog.py, run, lib/{__init__.py, provenance.py}}
docs/{sites-playbook.md, asset-grabber-playbook.md}
logs/site-screenshots/{demo-scaffold-desktop.png, demo-scaffold-mobile.png}
memory/{reference_site_scaffold.md, reference_asset_grabber.md} (in ~/.claude/.../memory/)
MEMORY.md updated with two new Tools & Pipelines entries
Justin reviews. Nothing deployed beyond local preview. Demo page lives at sites/justinsforge.com/demo-scaffold/, safe to delete if not wanted (rm -rf sites/justinsforge.com/demo-scaffold).
[Claude Code] Opus 4.7, web-asset (Opus47) session