Internal Research · Cost & Feasibility

Replacing urlbox with a self-hosted screenshot worker

Can we extend the brand-kit-extractor we shipped last night into a general-purpose screenshot service? The infrastructure already exists. The question is what we save and what it costs to build.

Status: Research only · Volume baseline: 140,000 renders / mo · Current spend: $650–$924 / mo · Plan: urlbox Ultra · Generated: 2026-05-20
Verdict: strong yes. Backend audit confirms a single migration chokepoint and ~$2.8–10K/yr in recurring savings (before the one-time build cost).

Backend audit found 21 urlbox callsites, all funneling through one wrapper (UrlboxGenerateSignedUrlAction.php), so migration is one file swap plus a DB backfill. The bulk of the 140k/mo volume is 3× shots per page from GeneratePageScreenshotsAction (desktop + tablet + mobile), not the highlight-overlay path. CSS injection is used, but only for the red-dashed-border highlight overlay (confirmed at 3 callsites). No stealth, no JS injection, no webhooks. The S3 layer is already R2/Spaces-compatible. Cloudflare Browser Rendering at our volume: $43–$130/mo all-in, plus $0–$285/mo for Webshare proxy if we need a stealth fallback (we don't pay for stealth today). Total: $80–$415/mo (realistic Cloudflare cost with no proxy, up to worst-case Cloudflare plus 30% stealth traffic) vs. urlbox Ultra's $924/mo list price at 140k ($99 base + 125k × $6.60/1k overage).

urlbox Ultra at our volume: $924 /mo
$99 base + 125k × $6.60/1k overage. Reported actual spend is $650–800, likely a volume discount.

Projected self-hosted cost: $80–415 /mo
CF Browser Rendering + optional Webshare stealth

Annual savings (vs. observed spend): $2,800–10,100
After ~$8–12K build · payback ≈ 13–20 months at ~$600/mo savings
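The headline figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the 15,000 included renders are inferred from the "125k overage" figure rather than taken from a price sheet, and the function names are hypothetical.

```typescript
// urlbox Ultra list price as described above: $99 base plus $6.60 per extra 1,000 renders.
const ULTRA_BASE_USD = 99;
const ULTRA_INCLUDED_RENDERS = 15000; // assumption: 140k volume minus the 125k overage quoted above
const ULTRA_OVERAGE_PER_1K_USD = 6.6;

function urlboxListPriceUsd(rendersPerMonth: number): number {
  const overageRenders = Math.max(0, rendersPerMonth - ULTRA_INCLUDED_RENDERS);
  return ULTRA_BASE_USD + (overageRenders / 1000) * ULTRA_OVERAGE_PER_1K_USD;
}

// Annual savings range: lowest current spend against highest self-hosted cost, and vice versa.
function annualSavingsRangeUsd(
  currentMonthly: [number, number],
  selfHostedMonthly: [number, number],
): [number, number] {
  const worstCase = (currentMonthly[0] - selfHostedMonthly[1]) * 12;
  const bestCase = (currentMonthly[1] - selfHostedMonthly[0]) * 12;
  return [worstCase, bestCase];
}

console.log(Math.round(urlboxListPriceUsd(140000))); // 924
console.log(annualSavingsRangeUsd([650, 924], [80, 415])); // [ 2820, 10128 ]
```

The savings range uses the $650–$924 observed-spend band from the status line, which is where the $2,800–10,100 figure comes from.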

Where the money goes today vs. tomorrow

urlbox bills a flat per-render fee that bundles infrastructure, proxies, and managed ops. Self-hosting unbundles them — we pay CF for compute, Webshare for stealth bandwidth, and ourselves once for the build.

[Chart: Monthly cost comparison]

[Chart: 5-year cumulative cost]

Cost breakdown — Cloudflare side

Three scenarios based on wall-clock time per screenshot (optimistic 5s, realistic 10s, pessimistic 15s). Wall time is the dominant cost driver because CF charges $0.09 per browser-hour beyond the 10 hours included each month.

Per-component monthly cost (140,000 screenshots)

| Component | Optimistic (5s) | Realistic (10s) | Pessimistic (15s) | Notes |
| --- | --- | --- | --- | --- |
| Workers Paid plan (base) | $5 | $5 | $5 | Required to use Browser Rendering |
| Browser-hour overage | $17 | $34 | $52 | $0.09/hr after 10hr/mo included |
| Concurrency overage | $20 | $40 | $60 | $2/extra browser, averaged over daily peaks |
| R2 storage (optional cache) | $1 | $1 | $1 | ~28GB/mo at $0.015/GB · egress free |
| Total Cloudflare | $43 | $80 | $118 | Add 20–30% margin for safety |
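The table can be recomputed from its own inputs, which makes it easy to plug in a measured wall-clock time later. This is a sketch, not a billing model: the extra-browser counts (10, 20, 30) are back-derived from the $20/$40/$60 row at $2 each, and the R2 line uses the table's rounded $1.

```typescript
const RENDERS_PER_MONTH = 140000;
const WORKERS_PAID_BASE_USD = 5;
const INCLUDED_BROWSER_HOURS = 10;
const BROWSER_HOUR_USD = 0.09;
const EXTRA_BROWSER_USD = 2;
const R2_CACHE_USD = 1; // rounded figure from the table

interface CfScenario {
  wallSeconds: number;   // wall-clock seconds per screenshot
  extraBrowsers: number; // assumption: averaged browsers above the included quota
}

// Browser-hours beyond the included allowance, billed per hour.
function browserHourOverageUsd(wallSeconds: number): number {
  const browserHours = (RENDERS_PER_MONTH * wallSeconds) / 3600;
  return Math.max(0, browserHours - INCLUDED_BROWSER_HOURS) * BROWSER_HOUR_USD;
}

function cloudflareMonthlyUsd(scenario: CfScenario): number {
  return (
    WORKERS_PAID_BASE_USD +
    browserHourOverageUsd(scenario.wallSeconds) +
    scenario.extraBrowsers * EXTRA_BROWSER_USD +
    R2_CACHE_USD
  );
}

const scenarios: CfScenario[] = [
  { wallSeconds: 5, extraBrowsers: 10 },
  { wallSeconds: 10, extraBrowsers: 20 },
  { wallSeconds: 15, extraBrowsers: 30 },
];
for (const scenario of scenarios) {
  console.log(scenario.wallSeconds, Math.round(cloudflareMonthlyUsd(scenario)));
}
```

Rounded to whole dollars, the three scenarios come out at the table's $43, $80, and $118.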

Residential proxy cost — the X-factor

We don't currently use urlbox's stealth feature, but a self-hosted replacement will likely need a stealth path sooner or later. Webshare is already wired in via brand-kit-extractor's smart-fallback path. Cost depends entirely on what share of traffic needs it. The estimates assume ~3 MB per page load (HTTP Archive 2025 median is 2.65 MB).

[Chart: Residential proxy spend by stealth %]

Stealth tiering strategy

Default: no proxy · $0
Datacenter: Webshare 1k IPs · $27
Residential: Webshare PAYG fallback · $2.25/GB

Most shots use CF's default egress (free). Datacenter pool catches sites that flag CF IPs but don't run bot management. Residential is the last-resort fallback for Cloudflare Bot Management / Akamai / DataDome — exactly where we already invoke Webshare in brand-kit-extractor today.
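The tiering above amounts to a small escalation loop: try the cheapest egress first and move up one tier only when the site blocks the attempt. A minimal sketch, assuming a hypothetical `attempt` callback that performs the capture and reports whether it was blocked (the real detection logic lives in brand-kit-extractor and is not reproduced here):

```typescript
type Tier = "default" | "datacenter" | "residential";

// Cheapest first: CF egress, then Webshare datacenter pool, then residential PAYG.
const TIER_ORDER: readonly Tier[] = ["default", "datacenter", "residential"];

function nextTier(current: Tier): Tier | null {
  const index = TIER_ORDER.indexOf(current);
  return TIER_ORDER[index + 1] ?? null;
}

// Hypothetical capture callback: resolves "ok" on success, "blocked" on a bot challenge.
type Attempt = (url: string, tier: Tier) => Promise<"ok" | "blocked">;

// Returns the tier that succeeded, or null if every tier was blocked.
async function captureWithFallback(url: string, attempt: Attempt): Promise<Tier | null> {
  let tier: Tier | null = "default";
  while (tier !== null) {
    if ((await attempt(url, tier)) === "ok") return tier;
    tier = nextTier(tier);
  }
  return null;
}
```

Logging the returned tier per render would also give us the real stealth percentage that the proxy cost scenarios depend on.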

| Scenario | Stealth shots/mo | Bandwidth | Webshare plan | Monthly cost |
| --- | --- | --- | --- | --- |
| Optimistic | 7,000 (5%) | ~21 GB | 25 GB pack | ~$65 |
| Realistic | 21,000 (15%) | ~63 GB | 100 GB plan | ~$165 |
| Conservative | 42,000 (30%) | ~126 GB | 100 GB + overage | ~$285 |
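The bandwidth column follows directly from the stealth percentage and the ~3 MB per page assumption. The sketch below reproduces it and prices the result at the $2.25/GB pay-as-you-go rate. It does not model the 25 GB and 100 GB plan prices, which are not broken down in this document.

```typescript
const MB_PER_PAGE_LOAD = 3;          // assumption stated above
const RESIDENTIAL_USD_PER_GB = 2.25; // Webshare PAYG rate quoted above

// Bandwidth in decimal GB for the share of renders routed through the proxy.
function stealthBandwidthGb(rendersPerMonth: number, stealthPercent: number): number {
  const stealthShots = (rendersPerMonth * stealthPercent) / 100;
  return (stealthShots * MB_PER_PAGE_LOAD) / 1000;
}

function paygCostUsd(bandwidthGb: number): number {
  return bandwidthGb * RESIDENTIAL_USD_PER_GB;
}

for (const percent of [5, 15, 30]) {
  const gb = stealthBandwidthGb(140000, percent);
  console.log(`${percent}% stealth -> ${gb} GB, PAYG $${paygCostUsd(gb)}`);
}
```

At 30% stealth the pure PAYG figure is $283.50 for 126 GB, in line with the ~$285 conservative row.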

Architecture — how much we already have

The brand-kit-extractor worker built last night already contains the hard parts: stealth fingerprinting, residential proxy fallback, cookie banner dismissal, bot challenge detection, full-page stitching at 4000px. Extract the shared bits into a library, build a thin new screenshot-service worker on top.

flowchart LR
  subgraph BE["api_wpfeedback (Laravel)"]
    direction TB
    CS1["GeneratePageScreenshotsAction<br/>3× per page"]
    CS2["uploadTaskScreenshotViaUrlBox<br/>highlight overlay"]
    CS3["Site thumb / AI viewport /<br/>16 other callsites"]
    WRAP["UrlboxGenerateSignedUrlAction<br/>single chokepoint"]
    CS1 --> WRAP
    CS2 --> WRAP
    CS3 --> WRAP
  end
  subgraph FE["app_wpfeedback (frontend)"]
    BD["clarity-feedback<br/>brand discovery"]
    UI["LoginModal /<br/>RegisterModal thumbs"]
  end
  subgraph CF["Cloudflare Workers"]
    direction TB
    SS["screenshot-service<br/>NEW · ~600 LOC"]
    BK["brand-kit-extractor<br/>existing"]
    SHARED["shared/browser-rendering<br/>~1500 LOC extracted"]
    SS --> SHARED
    BK --> SHARED
  end
  subgraph Infra["CF + Existing S3"]
    BR["Browser Rendering<br/>$0.09/browser-hr"]
    S3["Existing S3/R2 bucket<br/>(AWS_ENDPOINT compat)"]
    KV["KV cache<br/>dedup & TTL"]
  end
  subgraph FB["Stealth fallback (Webshare)"]
    DC["Datacenter pool<br/>$27/mo"]
    RES["Residential PAYG<br/>$2.25/GB"]
  end
  WRAP -->|HMAC POST| SS
  BD -->|service binding| SS
  UI -->|service binding| SS
  SHARED --> BR
  SS --> S3
  SS --> KV
  SHARED -.bot challenge.-> DC
  SHARED -.last resort.-> RES
  classDef reuse fill:#10b981,stroke:#059669,color:#fff
  classDef new fill:#3b82f6,stroke:#2563eb,color:#fff
  classDef infra fill:#8b5cf6,stroke:#7c3aed,color:#fff
  classDef hot fill:#f97316,stroke:#ea580c,color:#fff
  class BK,SHARED reuse
  class SS new
  class BR,S3,KV,DC,RES infra
  class WRAP hot
Reusable from brand-kit: ~1,500 LOC (stealth · proxy · cookies · stitching)
New code to write: ~600 LOC (capture · highlight overlay · R2 · cache)
Existing infra we drop: 3 files (urlbox URL gen · UI constants · fixtures)

Feature parity matrix

Audit of every urlbox callsite in the monorepo. Most usage is dead simple: a signed URL with width/height and a handful of capture options. CSS injection appears only in the highlight-overlay path (3 callsites). No JS injection. No selector clip. No async webhooks. The replacement surface is small.

| urlbox feature | Used today? | Replacement effort | Notes |
| --- | --- | --- | --- |
| Signed screenshot URL | Yes | Trivial | HMAC pattern from brand-kit-extractor reused as-is |
| Custom viewport (width/height) | Yes, 3 sizes | Already done | Desktop 1280, tablet 768, mobile 375 in GeneratePageScreenshotsAction |
| CSS injection (highlight overlay) | Yes, 3 callsites | ~½ day | page.addStyleTag() with border: 2px dashed red on selector. SVG-path truncation hack already in PHP; port verbatim. |
| S3 upload (use_s3=true) | Yes | Trivial | Backend already uses S3-compatible endpoint (likely R2). Worker writes to same bucket. |
| scroll_to (Y position) | Yes | Trivial | page.evaluate(y => window.scrollTo(0, y), y) |
| delay (ms before capture) | Yes, 2000ms | Trivial | await page.waitForTimeout(delay), or a plain setTimeout-based sleep on Puppeteer versions where waitForTimeout has been removed |
| wait_until (domloaded or requestsfinished) | Both | Already done | Maps to waitUntil in page.goto. Brand-kit already does both. |
| Cookie banner hiding | Yes | Already done | uBO + autoconsent shipped in brand-kit-extractor |
| Custom user_agent (atarim-worker) | Yes | Trivial | Must keep: sites whitelist this UA. page.setUserAgent(). |
| Custom HTTP header (Proxied-For: Atarim) | Yes | Trivial | Must keep: Atarim proxy worker matches on this header. page.setExtraHTTPHeaders(). |
| Full-page stitching | Some | Already done | Lift the 4000px stitch loop from brand-kit-extractor:396 |
| clickAll (click selector) | Yes | Trivial | page.$$(selector), then click each handle in a loop (page.click(selector) alone only clicks the first match) |
| Thumb resizing (thumb_width) | Page model only | ~½ day | CF Images transforms on R2 delivery OR post-render canvas resize |
| Quality (jpeg q=40, png lossless) | Yes | Trivial | page.screenshot({quality}); quality applies to JPEG only, PNG ignores it |
| img_fit=cover (crop to viewport) | Yes | Trivial | Default puppeteer behavior (no full-page flag) |
| JS injection (js=) | No | Skip v1 | Not used anywhere. page.evaluate() if ever needed. |
| Stealth mode | No | Skip v1 | Not paid for, not in any URL. Webshare fallback available if needed. |
| Async webhook callbacks | No | Skip v1 | Backend uses fire-and-forget GET pattern instead |
| Multi-format (PDF/MP4) | No | Skip | Not used. PNG + JPEG only. |
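The only non-trivial row is the highlight overlay. A minimal sketch of that path follows, assuming the overlay can be expressed as one CSS rule on the task's selector. `StyleTagPage` is a hypothetical slice of the puppeteer `Page` type so the example runs without a browser, and the brace check is an added safeguard rather than something the current PHP code is known to do.

```typescript
// Minimal slice of the puppeteer Page API used here (page.addStyleTag accepts { content }).
interface StyleTagPage {
  addStyleTag(options: { content: string }): Promise<unknown>;
}

// Builds the red dashed border rule described in the table.
function buildHighlightCss(selector: string): string {
  if (/[{}]/.test(selector)) {
    // Assumption: selectors come from user-selected elements, so refuse anything
    // that could close the rule and smuggle in extra CSS.
    throw new Error("selector must not contain braces");
  }
  return `${selector} { border: 2px dashed red; }`;
}

// Injects the rule before capture; call after navigation and before page.screenshot().
async function injectHighlight(page: StyleTagPage, selector: string): Promise<void> {
  await page.addStyleTag({ content: buildHighlightCss(selector) });
}

console.log(buildHighlightCss("#hero > h1")); // #hero > h1 { border: 2px dashed red; }
```

Whether this renders identically to urlbox's css= output is exactly what the spike in the next steps is meant to prove.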

Where the 140k volume comes from

Backend audit complete. 21 urlbox callsites across api_wpfeedback, all funneling through UrlboxGenerateSignedUrlAction (the single chokepoint). The dominant volume driver is GeneratePageScreenshotsAction — fires 3 screenshots per page (desktop 1280 + tablet 768 + mobile 375) on new pages, AI reviews, and metadata syncs.

[Chart: Estimated traffic split (post-audit)]

Top callsites by volume

  • GeneratePageScreenshotsAction (50–60%) — 3× per page (D/T/M). Triggered by every SiteController::createSitePage, CollectSiteMetadata listener, and AI review action.
  • uploadTaskScreenshotViaUrlBox (25–35%) — the highlight-overlay path. Every new task with a selected element. This is the CSS-injection use case.
  • Site thumbnails (~10%) — fires on plugin /site/activate, /sitedata/sync, and Rocket onboarding listener (2-min delay job).
  • AI viewport screenshots (~5%) — AiReviewViewportAction, growing with AI feature adoption.
  • ImageController /generate-image — open passthrough endpoint. Volume unknown. Security smell: forwards $request->all() straight to urlbox.

Persistence pattern

Backend stores the signed urlbox URL itself in DB columns (tasks.wpf_task_screenshot, sites.image, pages.screenshot). Async listener does a fire-and-forget GET to warm urlbox and push to S3. The S3 URL is sometimes also stored (atarim_task_screenshot, thumbnail_s3_url).

🔵 Audit confirmed — feature surface is narrow

Used: width/height/full_page/img_fit, scroll_to, css injection, format (png/jpeg), quality, max_height, wait_until (domloaded | requestsfinished), use_s3, s3_path, hide_cookie_banners, skip_scroll, delay, clickAll, custom user_agent, custom header (Proxied-For: Atarim).

NOT used anywhere: stealth, js injection, selector clip, block_ads, cookie= injection, multi-format (PDF/MP4), webhooks, polling. The replacement surface is small.

The migration chokepoint

The backend has 21 callsites, but they all go through one wrapper, apart from two callers that use the urlbox facade directly (listed under Refactor below). Replacing urlbox means replacing one file, plus a DB backfill to rewrite persisted urlbox.com URLs to the new CDN.

Primary swap (one file)

  • app/Actions/Support/UrlboxGenerateSignedUrlAction.php — replace $urlbox->generateSignedUrl() with an HMAC POST to the new worker. Same return contract.
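For reference, a generic shape for the HMAC check on the worker side is sketched below. This is not the existing brand-kit-extractor pattern, which is not reproduced in this document. The timestamp-plus-body signing string, the 300-second skew window, and the function names are all assumptions. It uses node:crypto, which on Workers would need Node compatibility enabled.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Signs "<timestamp>.<body>" so a captured request cannot be replayed indefinitely.
function signBody(secret: string, timestampSeconds: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestampSeconds}.${body}`).digest("hex");
}

// Verifies the signature in constant time and rejects stale timestamps.
function verifySignature(
  secret: string,
  timestampSeconds: number,
  body: string,
  signatureHex: string,
  nowSeconds: number,
  maxSkewSeconds = 300, // assumption: five-minute replay window
): boolean {
  if (Math.abs(nowSeconds - timestampSeconds) > maxSkewSeconds) return false;
  const expected = Buffer.from(signBody(secret, timestampSeconds, body), "hex");
  const given = Buffer.from(signatureHex, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}

const exampleBody = JSON.stringify({ url: "https://example.com", width: 1280 });
const exampleSignature = signBody("shared-secret", 1000, exampleBody);
console.log(verifySignature("shared-secret", 1000, exampleBody, exampleSignature, 1010)); // true
```

The PHP wrapper would compute the same signature before the POST, so the "same return contract" can be kept while the transport changes underneath.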

Cleanup (cosmetic)

  • composer.json:99 — remove urlbox/screenshots
  • config/app.php:176 — remove UrlboxProvider
  • config/services.php:54-57 — swap key/secret for worker URL/HMAC
  • UrlboxGenerateSignedUrlDTO — rename to ScreenshotRequestDTO

Refactor (2 bypass callsites)

  • RegenerateThumbnail.php & RegenerateAutoScreenShot.php — use the wrapper instead of the urlbox facade directly

DB backfill (the real work)

DB columns currently store https://api.urlbox.com/v1/... URLs. After migration these need to point at the new CDN (or the existing S3 URL where available).

  • tasks.wpf_task_screenshot
  • tasks.atarim_task_screenshot (already S3 — no change)
  • sites.image
  • sites.favicon
  • sites.thumbnail_s3_url (already S3)
  • sites.tablet_screenshot_url
  • sites.mobile_screenshot_url
  • pages.screenshot
  • pages.tablet_screenshot_url
  • pages.mobile_screenshot_url

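Easiest path: write a Laravel migration that rewrites each affected column, replacing the api.urlbox.com/v1/.../{token}/{fmt} prefix with screenshots.atarim.io. The token differs per row, so a literal REPLACE() cannot match it; the rewrite needs a token-aware regex (for example REGEXP_REPLACE where the database supports it, or a regex applied in PHP per chunk). Chunked, idempotent, runnable in production.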
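The token-aware rewrite is easiest to reason about as a pure function. It is sketched here in TypeScript to match the worker code, though the same pattern would port to the PHP migration. The target layout (format kept as a path segment, query string passed through unchanged) is an assumption, since the new URL scheme is not defined yet.

```typescript
// Matches the urlbox prefix: /v1/<key>/<token>/<format>, followed by "?" or end of string.
const URLBOX_PREFIX = /^https:\/\/api\.urlbox\.com\/v1\/[^/?]+\/[^/?]+\/([a-z0-9]+)(?=\?|$)/;
const NEW_CDN_ORIGIN = "https://screenshots.atarim.io";

// Idempotent: URLs that are already rewritten (or already on S3) do not match and pass through.
function rewriteUrlboxUrl(stored: string): string {
  return stored.replace(URLBOX_PREFIX, (_match: string, format: string) => `${NEW_CDN_ORIGIN}/${format}`);
}

const sample = "https://api.urlbox.com/v1/KEY/TOKEN/png?url=https%3A%2F%2Fexample.com&width=1280";
console.log(rewriteUrlboxUrl(sample));
// https://screenshots.atarim.io/png?url=https%3A%2F%2Fexample.com&width=1280
```

Because the function is a no-op on non-matching input, the migration can be re-run safely after a partial failure.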

🟢 Good news — S3 is already R2-ready

Backend uses AWS_ENDPOINT + AWS_USE_PATH_STYLE_ENDPOINT + AWS_PUBLIC_URL env vars (config/filesystems.php:60-90). This is an S3-compatible interface, almost certainly pointed at R2 or DO Spaces already. The new worker can write to the same bucket — zero storage migration.

Implementation effort & payback

MVP (80% replacement): 6–8 days, ~$6–8K dev cost
  • Extract shared lib (2d)
  • capture + highlight overlay (1d)
  • R2 + cache + signed URLs (1d)
  • PHP integration shim (1d)
  • Staging soak (1–2d)

Full feature parity: 10–14 days, ~$10–14K total
  • Full-page stitching from brand-kit
  • Thumb resizing via CF Images
  • JS/CSS injection params
  • WebP + quality controls

Production hardening: 15–21 days, ~$15–21K total
  • Concurrency limiter (Durable Object)
  • Retry-with-proxy on failure
  • Sentry + observability
  • R2 lifecycle rules
  • Integration test corpus

[Chart: Payback timeline, cumulative cost vs. urlbox]

Risks & blockers

🔴 HIGH — CF Browser Rendering concurrency quota

Workers Paid plan ships with 10 concurrent browsers by default. 140k/mo averages ~3 shots/min (140,000 ÷ 43,200 minutes ≈ 3.2, or ~0.05/sec), with bursts estimated at 30–60/min during EU/US business hours. We want 50–100 concurrent for safe headroom. The hard account ceiling is 120 (raise via support ticket). Block on confirming our quota before MVP.
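A quick way to sanity-check the quota request is Little's law: browsers in use ≈ arrival rate × wall-clock time per render. The sketch below converts the monthly volume into an average rate and sizes concurrency for a given peak. The 30-day month and the peak rates are inputs to be replaced with measured values.

```typescript
const SECONDS_PER_MONTH = 30 * 24 * 3600; // assumption: 30-day month

function averageRatePerSecond(rendersPerMonth: number): number {
  return rendersPerMonth / SECONDS_PER_MONTH;
}

// Little's law: concurrent browsers = arrivals per second × seconds each browser is busy.
function browsersNeeded(arrivalsPerSecond: number, wallSeconds: number): number {
  return Math.ceil(arrivalsPerSecond * wallSeconds);
}

const average = averageRatePerSecond(140000);
console.log((average * 60).toFixed(1));      // average renders per minute
console.log(browsersNeeded(average, 10));    // browsers busy at the average rate, 10s renders
console.log(browsersNeeded(1, 10));          // browsers busy at a 1 render/sec peak, 10s renders
```

At 140,000 renders/mo the average is about 0.05 renders/sec (roughly 3.2 per minute), so the quota question is driven almost entirely by how sharp the peaks are, which is worth measuring before filing the support ticket.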

🟡 MEDIUM — Session reuse required to avoid cold-start tax

Every puppeteer.launch() is multi-second and counts against the 1-launch-per-second rate limit. Without session reuse (via Durable Object pinning or browser.disconnect() / puppeteer.connect()), we'll cap throughput and bleed cost on cold starts. Adds ~1d to implementation but unblocks scale.

🟡 MEDIUM — DB backfill: ~10 columns store api.urlbox.com URLs

Backend persists the signed urlbox URL itself in tasks.wpf_task_screenshot, sites.image, pages.screenshot, and the other columns listed in the backfill section (two of the ten listed, tasks.atarim_task_screenshot and sites.thumbnail_s3_url, already hold S3 URLs). After cutover the urlbox URLs would 404. Need a chunked, idempotent Laravel migration that rewrites them to the new CDN. ~1 day of work; can run in production with no downtime.

🟡 MEDIUM — 3× per page volume amplification

Backend's GeneratePageScreenshotsAction fires 3 screenshots per page (desktop 1280 + tablet 768 + mobile 375) and is the single largest volume driver (~55%). An easy optimization in the new worker: do all 3 viewport captures in one browser session by changing the viewport between captures. This saves two of the three session launches per page, cuts CF browser-hours by ~30%, and reduces concurrency pressure.
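The single-session optimization is a short loop over viewports. In this sketch `ViewportPage` is a hypothetical slice of the puppeteer `Page` type (setViewport and screenshot are real puppeteer methods), and the viewport heights are assumptions, since only the widths are given above.

```typescript
interface Viewport {
  label: string;
  width: number;
  height: number;
}

// Minimal slice of the puppeteer Page API used here, so the sketch runs without a browser.
interface ViewportPage {
  setViewport(viewport: { width: number; height: number }): Promise<void>;
  screenshot(options: { type: "png" | "jpeg" }): Promise<Uint8Array>;
}

const VIEWPORTS: Viewport[] = [
  { label: "desktop", width: 1280, height: 800 }, // heights are assumptions
  { label: "tablet", width: 768, height: 1024 },
  { label: "mobile", width: 375, height: 667 },
];

// One page load, three captures: resize between shots instead of launching three sessions.
async function captureAllViewports(page: ViewportPage): Promise<Map<string, Uint8Array>> {
  const shots = new Map<string, Uint8Array>();
  for (const viewport of VIEWPORTS) {
    await page.setViewport({ width: viewport.width, height: viewport.height });
    shots.set(viewport.label, await page.screenshot({ type: "png" }));
  }
  return shots;
}
```

Sites that pick their layout in JavaScript at load time, rather than through CSS media queries, may need a reload after the resize. That case would have to be checked against real customer pages.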

🟡 MEDIUM — Sync path blocks PHP workers up to 5 minutes

Several backend callsites use the sync path (Http::timeout(300)->get($signedUrl)) — blocks the PHP-FPM worker for up to 5 minutes per screenshot. With the new worker we can shorten this aggressively (target 5–15s p95) and free up PHP capacity. Latent throughput win not captured in the cost numbers.

🔴 HIGH — ImageController::generateImage is an open passthrough

app/Http/Controllers/ImageController.php:17-25 forwards $request->all() straight to the urlbox SDK. Any client of /generate-image can smuggle arbitrary urlbox params (cookie=, js=, etc.). Lock this down during migration — tightly type the DTO and reject unknown fields.
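The lockdown boils down to an allowlist: accept only known fields and fail loudly on anything else. The real fix belongs in the PHP DTO. The sketch below shows the same idea in TypeScript for the worker side, with a hypothetical field list that would need to be replaced by whatever /generate-image actually requires.

```typescript
// Hypothetical allowlist; the real set should come from the callsite audit.
const ALLOWED_FIELDS = ["url", "width", "height", "format", "quality"] as const;

type ScreenshotRequest = Record<string, unknown> & { url: string };

// Rejects unknown fields instead of silently dropping them, so smuggled params surface as errors.
function parseScreenshotRequest(input: Record<string, unknown>): ScreenshotRequest {
  const allowed: readonly string[] = ALLOWED_FIELDS;
  const rejected = Object.keys(input).filter((key) => !allowed.includes(key));
  if (rejected.length > 0) {
    throw new Error(`Unknown fields rejected: ${rejected.join(", ")}`);
  }
  const url = input["url"];
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    throw new Error("url must be an http(s) URL");
  }
  return { ...input, url };
}

console.log(parseScreenshotRequest({ url: "https://example.com", width: 1280 }));
```

Rejecting rather than stripping unknown fields makes it visible in logs if any existing client depends on the passthrough behavior.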

🔵 LOW — R2 egress & custom domain

R2 egress is free via Cloudflare's network. Binding the bucket to a custom domain (e.g. screenshots.atarim.io) gives public CDN delivery without Workers invocation on read. Verify the current pricing page before launch.

🔵 LOW — No built-in retries (urlbox does this)

urlbox auto-retries failed renders. We'd need a thin retry layer (Cloudflare Queues + DLQ) — covered in the "production hardening" tier above.

Recommendation

Proceed. The economics are strong and the migration path is short.

~$10K one-time build, ~$600/mo recurring savings (vs. $725 observed), break-even at around month 17 ($10K ÷ $600/mo), 5-year recurring savings ≈ $37K before the build cost. Backend audit confirmed: single chokepoint (one file), narrow feature surface (CSS injection + standard puppeteer ops, no exotic urlbox features), S3 layer already compatible. The only real unknowns left are (a) CF concurrency quota and (b) whether burst traffic warrants session-reuse architecture from day one.
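The payback arithmetic is simple enough to keep as a helper and re-run as the build estimate or the measured savings change. A sketch using the figures from this section:

```typescript
// Months until cumulative savings cover the one-time build cost.
function breakEvenMonths(buildCostUsd: number, monthlySavingsUsd: number): number {
  return Math.ceil(buildCostUsd / monthlySavingsUsd);
}

// Cumulative savings after a number of months, net of the build cost.
function netSavingsUsd(buildCostUsd: number, monthlySavingsUsd: number, months: number): number {
  return monthlySavingsUsd * months - buildCostUsd;
}

console.log(breakEvenMonths(10000, 600));   // 17
console.log(netSavingsUsd(10000, 600, 60)); // 26000
```

At $600/mo, recurring savings over 60 months total $36,000, close to the ≈$37K quoted above; net of a $10K build that is about $26,000.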

Suggested next steps:

  • ½ day — File concurrency quota increase request with Cloudflare (need ~50–100 concurrent).
  • ½ day — Instrument brand-kit-extractor for wall-clock distribution (refines the 5–15s estimate).
  • 1 day — Spike: extract shared browser-rendering lib from brand-kit-extractor and prove the highlight-overlay page.addStyleTag() path produces visually equivalent output to urlbox's css= param on 5 representative real-customer URLs.
  • Decision point — commit to MVP (~$8K) or shelve.
  • If green-lit: 2 weeks MVP → 1 week staging soak → gradual rollout starting with low-risk callsites (RegenerateThumbnail artisan command, then uploadTaskScreenshotViaUrlBox, then GeneratePageScreenshotsAction) → DB backfill → urlbox shutoff.
