Browser Use With Steel

Keep your Browser Use agents exactly as they are. Point their `BrowserSession` at a Steel session over CDP and you get sub-second startup, 24-hour browser lifetimes, live viewer links, and reliable cleanup, all without changing a single task prompt or tool definition.

Steel-managed browsers handle the painful parts Browser Use leaves to you: managed proxies, CAPTCHA solving with human fallbacks, replay-ready evidence, and concurrency that scales from Steel Local's single session up to the hundreds available on Steel Cloud plans. You still run Browser Use inside your Python stack; Steel just owns the browser runtime and infrastructure contract.

What stays the same

| Browser Use concern | What you keep | Notes |
| --- | --- | --- |
| Tasks and prompts | Agent(task=..., llm=...) definitions stay untouched | Keep LangChain or custom tools as-is |
| LLM + reasoning | Same GPT-4o, Claude 3, Gemini, or DeepSeek models | Steel never touches your model keys |
| Tool orchestration | Custom tools, structured outputs, retries | Agent.run() keeps managing substeps |
| Dev workflow | Python 3.11 virtualenv, dotenv, logging | Run locally or in your orchestrator the way you already do |
| Debug habits | Terminal logs, Browser Use traces | Add Steel's viewer URL beside them for richer evidence |

What Steel adds

| Steel surface | Why it matters for Browser Use | How to turn it on |
| --- | --- | --- |
| Session lifecycle | Sub-second startup and up to 24 h runs keep long tasks alive even if your host restarts | session = client.sessions.create() then release with client.sessions.release(session.id) |
| Observability | Live WebRTC viewer plus replay URL means you can watch and share any run | Log session.session_viewer_url and keep it with task metadata |
| Anti-bot + proxies | Managed residential proxies and CAPTCHA solving reduce false positives | Pass use_proxy / solve_captcha flags when creating the session |
| State + profiles | Persist logins when Browser Use needs to resume work without reauth | Set persist_profile: true or supply profile_id as you would in other Steel sessions |
| Scale + cleanliness | Steel Cloud exposes hundreds of concurrent sessions; releasing finished sessions frees plan caps immediately | Instrument client.sessions.release() in every success and failure path |

Minimal integration path

  1. Install steel-sdk, browser-use, and python-dotenv, then create a .env with STEEL_API_KEY, OPENAI_API_KEY, and a TASK string.
  2. Create a Steel client and session:
    client = Steel(steel_api_key=os.getenv("STEEL_API_KEY"))
    session = client.sessions.create()
    print(session.session_viewer_url)
  3. Build the CDP URL Browser Use expects: cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}".
  4. Instantiate BrowserSession(cdp_url=cdp_url) and pass it to Agent(task=TASK, llm=model, browser_session=browser_session).
  5. Run await agent.run(); keep your existing tool definitions and retries.
  6. Always release the Steel session in finally so replays finish uploading and concurrency slots reopen.
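
Step 3 can be sketched as a small helper. This is a minimal sketch, not part of either SDK: `build_cdp_url` and `STEEL_CONNECT_URL` are hypothetical names, and the only facts taken from the steps above are the `wss://connect.steel.dev` endpoint and the `apiKey` / `sessionId` query parameters. It percent-encodes both values and fails fast when either one is missing, which tends to be easier to debug than a rejected WebSocket handshake.

```python
from typing import Optional
from urllib.parse import urlencode

STEEL_CONNECT_URL = "wss://connect.steel.dev"  # endpoint quoted in step 3


def build_cdp_url(api_key: Optional[str], session_id: Optional[str]) -> str:
    """Return the CDP WebSocket URL for a Steel session (hypothetical helper)."""
    if not api_key:
        raise ValueError("STEEL_API_KEY is not set")
    if not session_id:
        raise ValueError("session_id is empty")
    # urlencode escapes reserved characters such as '&' or '=' in either value.
    query = urlencode({"apiKey": api_key, "sessionId": session_id})
    return f"{STEEL_CONNECT_URL}?{query}"
```

The result can be passed straight to `BrowserSession(cdp_url=...)` as in step 4.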

Python example

import asyncio
import os
import time
from dotenv import load_dotenv
from steel import Steel
from browser_use import Agent, BrowserSession
from browser_use.llm import ChatOpenAI

load_dotenv()
STEEL_KEY = os.getenv("STEEL_API_KEY")
OPENAI_KEY = os.getenv("OPENAI_API_KEY")
TASK = os.getenv("TASK") or "Go to Wikipedia and summarize the latest AI article"

async def main():
    client = Steel(steel_api_key=STEEL_KEY)
    # Start the remote browser; everything after this line must release it.
    session = client.sessions.create()

    try:
        # Browser Use attaches to the Steel-hosted browser over CDP.
        cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_KEY}&sessionId={session.id}"
        model = ChatOpenAI(model="gpt-4o", temperature=0.3, api_key=OPENAI_KEY)
        agent = Agent(task=TASK, llm=model, browser_session=BrowserSession(cdp_url=cdp_url))

        start = time.time()
        result = await agent.run()
        print(f"Result: {result}")
        print(f"Replay: {session.session_viewer_url}")
        print(f"Elapsed: {time.time() - start:.1f}s")
    finally:
        # Runs on success and on failure, including errors during agent setup.
        client.sessions.release(session.id)
        print("Steel session released")

if __name__ == "__main__":
    asyncio.run(main())

This is the quickstart pattern: Browser Use owns reasoning, Steel owns the Chromium instance.
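
If several agents share the same create-then-release pattern, one option is to wrap it in a context manager so that the release cannot be forgotten. The sketch below assumes only that `client` exposes `sessions.create()` and `sessions.release(id)` as in the example above; `steel_session` is a hypothetical helper name, not something the Steel SDK provides.

```python
from contextlib import contextmanager


@contextmanager
def steel_session(client, **create_kwargs):
    """Create a Steel session and release it on exit, even if the body raises.

    Assumes `client.sessions.create(**kwargs)` returns an object with an `id`
    attribute and `client.sessions.release(id)` ends the session.
    """
    session = client.sessions.create(**create_kwargs)
    try:
        yield session
    finally:
        # Always runs, so the concurrency slot is freed on any exit path.
        client.sessions.release(session.id)
```

Inside `main()` this would read `with steel_session(client) as session:` followed by the agent setup and `await agent.run()`. A plain `with` is enough here because the create and release calls in the example are synchronous.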

Solve CAPTCHAs and anti-bot pressure

  • Create a Tools() collection and add a wait_for_captcha_solution action that polls client.sessions.captchas.status(session_id) every second until isSolvingCaptcha clears. Steel routes the CAPTCHA to a human solver and you resume the agent after the tool returns.
  • Keep session state in SESSION_CACHE so your tool can read the active session ID without sharing globals through the LLM.
  • Turn on managed proxies and region selection when calling client.sessions.create(use_proxy=True, region="us-east") so Browser Use inherits the right fingerprint and IP without extra code.
  • Pair every CAPTCHA tool call with a Steel session viewer link so operators can watch the human intervention live if needed.
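
The polling loop behind such a tool can be sketched independently of both SDKs. The version below is an illustration under stated assumptions: `fetch_states` is a hypothetical callable that you would back with `client.sessions.captchas.status(session_id)`, and the response is assumed to be an iterable of dict-like page states carrying an `isSolvingCaptcha` key. If the SDK returns objects rather than dicts, adapt the callable accordingly. The timeout is an addition so that a stuck CAPTCHA does not block the agent forever. Registering the function as a Browser Use `Tools()` action is left out.

```python
import asyncio
import time
from typing import Any, Callable, Iterable, Mapping


async def wait_for_captcha_solution(
    fetch_states: Callable[[], Iterable[Mapping[str, Any]]],
    timeout_s: float = 120.0,
    poll_interval_s: float = 1.0,
) -> bool:
    """Poll until no page reports isSolvingCaptcha, or until the timeout.

    Returns True once solving has cleared, False if the timeout elapsed first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Run the (assumed blocking) status call off the event loop.
        states = list(await asyncio.to_thread(fetch_states))
        if not any(state.get("isSolvingCaptcha") for state in states):
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(poll_interval_s)
```

Returning a boolean rather than raising lets the calling tool decide whether to tell the agent to retry, skip the page, or stop.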

Fit and trade-offs

Works best for

  • Teams already on Browser Use who want Steel's evidence, proxies, and 24 h lifetimes without refactoring prompts.
  • Python stacks on 3.11+ that can bundle steel-sdk, Playwright, and LangChain dependencies.
  • Agent workflows that require human takeover or audit-ready replays.

Not yet ideal when

  • You need non-Python runtimes; Browser Use is Python-first today.
  • Runs must exceed Steel's 24-hour session ceiling or need offline browsers without CDP.
  • You cannot run with modern vision-capable LLMs; Browser Use relies on them for reasoning.

Go-live checklist

  • .env values stored in a secrets manager (not committed to source control), with valid Steel and LLM API keys.
  • Session logs capture session.session_viewer_url, task ID, and whether client.sessions.release() succeeded.
  • Proxy, region, and CAPTCHA flags set per target site to avoid false positives.
  • Optional: wire the CAPTCHA tool described above before touching login-heavy sites.
  • Run the official quickstart at docs.steel.dev/integrations/browser-use/quickstart once end to end.
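
The logging item in the checklist can be covered with one structured record per run. The following is a rough sketch, assuming the `client.sessions.release()` call and the `session.session_viewer_url` attribute used earlier on this page; `release_and_log`, the `steel_runs` logger name, and the record's field names are hypothetical choices, not a Steel convention.

```python
import json
import logging

logger = logging.getLogger("steel_runs")  # hypothetical logger name


def release_and_log(client, session, task_id: str) -> dict:
    """Release a Steel session and emit one JSON log record describing it."""
    record = {
        "task_id": task_id,
        "session_id": session.id,
        "viewer_url": getattr(session, "session_viewer_url", None),
        "released": False,
    }
    try:
        client.sessions.release(session.id)
        record["released"] = True
    except Exception as exc:
        # Record the failure instead of raising, so a release error in a
        # finally block does not mask the task's own exception.
        record["release_error"] = repr(exc)
    logger.info(json.dumps(record))
    return record
```

Calling this from the `finally` block in place of a bare release keeps the viewer URL, task ID, and release outcome together in one searchable line.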

Next step: drop this integration into one Browser Use agent, watch the first run in the Steel viewer, then add CAPTCHA tooling before scaling past a handful of sessions. Humans use Chrome. Agents use Steel.