Short answer
Browser automation collapses past the first few dozen runs when concurrency, queue depth, and proxy rotation fight each other. Steel Local tops out around one live session, and even the Hobby and Starter cloud plans cap concurrent sessions at 5 and 10 and requests at 1 and 2 per second, so bursty workers back up immediately.
Instead of chasing flaky VMs, run Steel Cloud like a session factory: pick a plan that covers the number of simultaneous sessions you need, size your work queue to 70 to 80 percent of that cap, pre-warm proxies for each region, and release sessions aggressively so the next job can reuse the slot. Pro plans give you 100 concurrent sessions, 10 requests per second, 24-hour runtimes, managed residential proxies, and releaseAll controls so one engineer can keep hundreds of flows moving.
Where scaling collapses first
| Symptom | Why it breaks | Steel move |
|---|---|---|
| Local runner saturates after a handful of jobs | Steel Browser (self-hosted) is designed for single-session work and lacks managed stealth, proxies, and concurrency above ~1 | Shift high-volume queues to Steel Cloud, which allows up to 100 concurrent sessions on Pro (more on Enterprise) and manages the infrastructure, so scaling is configuration, not hardware babysitting. |
| Queue depth outruns the plan cap | Hobby through Pro plans hard-limit concurrent sessions (5, 10, 20, 100) and requests per second (1, 2, 5, 10). Extra work just waits and burns SLA time | Track active sessions against the plan cap and throttle your dispatcher so it never exceeds it. Upgrade before queue depth stays above 80 percent of the cap for more than a few minutes. |
| Long sessions block new work | Lower tiers stop sessions after 15 to 60 minutes; Pro tops out at 24 hours. Jobs that overrun block concurrency even though they no longer do useful work | Pass per-job timeout values when you call sessions.create and reuse session IDs only when state is required. Release or releaseAll when a job stalls. |
| IP bans spike as you add seats | More workers share the same datacenter IP by default; anti-bot systems notice | Flip useProxy: true so every session gets a fresh managed residential IP, or attach Bring Your Own proxies with your own rotation logic. |
Know your concurrency budget before you queue
| Plan | Concurrent sessions | Requests/sec cap | Max session time | When it fits |
|---|---|---|---|---|
| Hobby | 5 | 1 | 15 minutes | Local prototyping and CI smoke tests |
| Starter | 10 | 2 | 30 minutes | Small agent pilots where you batch traffic |
| Developer | 20 | 5 | 1 hour | Production-ish runs that still fit inside short flows |
| Pro | 100 | 10 | 24 hours | Real fleets, long tail retries, human handoff loops |
| Enterprise | Custom | Custom | Custom | Regulated workloads or anything above 100 concurrency |
Pick the lowest tier whose concurrency, requests-per-second cap, and session length fit your median job. Steel Cloud plans let you burst up to 3x your prepaid browser-hour credits, so you can absorb short spikes without an upgrade mid-incident. When multi-region routing matters, add the region flag on session creation so your queue can split capacity by geography instead of fighting over one cluster.
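As a rough illustration of that sizing rule, the sketch below encodes the limits from the table and returns the lowest tier that fits a workload. The `Plan` type, the `pick_plan` helper, and the 0.8 headroom factor (taken from the 70 to 80 percent queue guidance) are illustrative assumptions, not part of any Steel SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    max_sessions: int
    max_rps: int
    max_minutes: int


# Limits copied from the table above. Enterprise is omitted because its
# limits are custom.
PLANS = [
    Plan("Hobby", 5, 1, 15),
    Plan("Starter", 10, 2, 30),
    Plan("Developer", 20, 5, 60),
    Plan("Pro", 100, 10, 24 * 60),
]


def pick_plan(peak_sessions: int, peak_rps: float, job_minutes: float,
              headroom: float = 0.8) -> Optional[Plan]:
    """Return the lowest tier that fits the workload.

    Peak sessions must stay at or under headroom * cap (an assumed buffer).
    None means no self-serve tier fits, so Enterprise is the conversation.
    """
    for plan in PLANS:
        if (peak_sessions <= plan.max_sessions * headroom
                and peak_rps <= plan.max_rps
                and job_minutes <= plan.max_minutes):
            return plan
    return None


if __name__ == "__main__":
    choice = pick_plan(peak_sessions=12, peak_rps=3, job_minutes=20)
    print(choice.name if choice else "Enterprise")
```

With these inputs Starter is rejected because 12 sessions exceed 80 percent of its cap of 10, so the helper moves up one tier.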
Run a session factory, not ad hoc launches
- Model every task as a job in a durable queue. Each job carries required resources (profile ID, proxy need, region). The dispatcher app pulls from the queue only when the live session count is below the plan cap minus a buffer.
- Pre-warm sessions when workloads are steady. Use sessions.create ahead of peak hours, store the IDs, and keep them alive with realistic timeout windows so the automation can attach instantly instead of paying startup time mid-SLA.
- Attach the right surface per job. When connecting via Playwright, Puppeteer, or CDP, tag log lines with session.id, proxy choice, and queue job ID so operators can trace a failure back to the precise resource combination.
- Release aggressively. Call sessions.release as soon as the job hands off evidence or files. If you suspect a leak, sessions.releaseAll resets the fleet and frees browser-hour credits before you hit the cap.
- Segment proxies and regions. Create lightweight proxy pools (managed, BYO, or default) per queue partition and rotate them deliberately so retries do not hammer the same IP and trigger bans.
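The dispatcher and release rules above can be sketched as a small wrapper that refuses to start work at the cap and always frees the slot afterwards. This is a minimal single-threaded sketch: `SessionFactory` is a hypothetical helper, and the `create` and `release` callables are stand-ins you would wire to sessions.create and sessions.release in your own client code.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Set


class SessionFactory:
    """Caps live sessions at plan_cap - buffer and releases on exit."""

    def __init__(self, create: Callable[[dict], str],
                 release: Callable[[str], None],
                 plan_cap: int, buffer: int = 2) -> None:
        # create/release are injected stand-ins for the real session calls.
        self._create = create
        self._release = release
        self._limit = plan_cap - buffer
        self._live: Set[str] = set()

    @property
    def live(self) -> int:
        return len(self._live)

    def has_capacity(self) -> bool:
        return self.live < self._limit

    @contextmanager
    def session(self, job: dict) -> Iterator[str]:
        if not self.has_capacity():
            # The dispatcher should leave the job in the queue and retry later.
            raise RuntimeError("at capacity")
        session_id = self._create(job)
        self._live.add(session_id)
        try:
            yield session_id
        finally:
            # Runs on success and on failure, so a crashed job frees its slot.
            self._release(session_id)
            self._live.discard(session_id)


if __name__ == "__main__":
    ids = iter(range(1000))
    factory = SessionFactory(
        create=lambda job: f"sess-{next(ids)}",      # fake session IDs
        release=lambda sid: print("released", sid),  # fake release call
        plan_cap=5,
    )
    with factory.session({"job_id": "job-1", "region": "any"}) as sid:
        print("job-1 running on", sid, "live:", factory.live)
```

A multi-threaded or async dispatcher would need a lock or semaphore around the capacity check; the sketch leaves that out to keep the slot accounting visible.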
Metrics and guardrails to watch
| Signal | Healthy range | Action when it drifts |
|---|---|---|
| Queue depth vs concurrency | Depth stays under 0.8 x concurrent-session cap | If depth stays high, either add more Pro capacity or shed load by pausing ingest. |
| Requests per second | 70 to 90 percent of plan RPS | When you hit the wall, add jitter between job launches or upgrade to avoid server-side throttling. |
| Session age distribution | Most sessions finish before 50 percent of the plan's max duration | Audit jobs that run long, set tighter timeout values, or split workflows into multiple sessions. |
| Proxy bandwidth draw | Within the GB allocation implied by your credits | Monitor proxyBandwidth / credits so you are not surprised by residential overages; BYO if a single customer needs unusual geo spread. |
| Browser hours spend | Tracks completed jobs within your credit pool | Spikes without higher throughput mean hung sessions; trigger releaseAll or auto-scaling logic to recycle them. |
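Three of these signals reduce to simple threshold checks. The sketch below assumes you already collect the numbers yourself; `FleetSnapshot` and `check_guardrails` are hypothetical names, and the thresholds are the ones from the table (0.8 x cap for queue depth, 90 percent of plan RPS, half the plan's maximum session time).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FleetSnapshot:
    queue_depth: int
    session_cap: int
    observed_rps: float
    plan_rps: float
    session_ages_min: List[float]  # ages of live sessions, in minutes
    plan_max_minutes: float


def check_guardrails(s: FleetSnapshot) -> List[str]:
    """Return the names of the signals that have drifted out of range."""
    alerts = []
    if s.queue_depth > 0.8 * s.session_cap:
        alerts.append("queue_depth")
    if s.observed_rps > 0.9 * s.plan_rps:
        alerts.append("rps")
    # "Most sessions" is read here as a strict majority under half the max.
    short = [a for a in s.session_ages_min if a < 0.5 * s.plan_max_minutes]
    if s.session_ages_min and len(short) * 2 <= len(s.session_ages_min):
        alerts.append("session_age")
    return alerts


if __name__ == "__main__":
    snapshot = FleetSnapshot(
        queue_depth=85, session_cap=100,
        observed_rps=9.5, plan_rps=10,
        session_ages_min=[10, 20, 900, 1000, 1100],
        plan_max_minutes=24 * 60,
    )
    print(check_guardrails(snapshot))
```

Proxy bandwidth and browser-hour spend are left out because they depend on billing data whose shape the article does not describe.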
Trade-offs and limits
- Steel Cloud enforces hard concurrency and RPS caps by plan. If the fleet must hold steady at 150+ live sessions, align with Enterprise early so the cap change is contractual, not a scramble.
- Managed proxies are billed by the GB. Keep large downloads on default datacenter IPs when stealth is not needed, and reserve residential IPs for the forms, checkouts, or OTP flows that gate your automation.
- Sessions still top out at 24 hours even on Pro. For multi-day automations, store state in profiles or files between sessions and spin up a fresh browser rather than fighting the ceiling.
- Long queues hide flaky logic. If a job retries more than twice, push it into a dead letter topic with the session ID so operators can inspect the replay while fresh work keeps moving.
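The retry limit in the last point can be sketched as a small failure handler. `Job`, `handle_failure`, and the `requeue` and `dead_letter` callables are hypothetical stand-ins for whatever queue you run; the only rule taken from the text is that a job gets two retries and then moves to the dead-letter destination along with its session ID.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_RETRIES = 2  # a third retry is never attempted


@dataclass
class Job:
    job_id: str
    retries: int = 0
    last_session_id: Optional[str] = None


def handle_failure(job: Job, session_id: str,
                   requeue: Callable[[Job], None],
                   dead_letter: Callable[[Job], None]) -> str:
    """Requeue a failed job, or dead-letter it once its retries are spent."""
    # Keep the session ID so operators can inspect the failing run later.
    job.last_session_id = session_id
    if job.retries >= MAX_RETRIES:
        dead_letter(job)
        return "dead_letter"
    job.retries += 1
    requeue(job)
    return "requeued"


if __name__ == "__main__":
    main_queue, dlq = [], []
    job = Job("job-42")
    for attempt_session in ["sess-a", "sess-b", "sess-c"]:
        outcome = handle_failure(job, attempt_session, main_queue.append, dlq.append)
        print(attempt_session, outcome)
```

Because the dead-lettered job never re-enters the main queue, fresh work keeps its place in line while the failure waits for inspection.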
Next steps
- Map your workload against the Pricing and Limits table and resize your plan before the queue becomes the bottleneck.
- Review the Sessions API overview and session lifecycle docs so your automation actually releases slots when it finishes.
- Decide whether Steel-managed or Bring Your Own proxies fit each queue by reading the proxy guide, then bake that routing into your dispatcher.
Humans use Chrome. Agents use Steel.