Browser Agents are AI-powered systems that can autonomously navigate websites, complete tasks, and extract information using natural language instructions.

What is a Browser Agent?

A Browser Agent combines:
  • Large Language Models (LLMs) for reasoning and decision-making
  • Browser Sessions for executing actions
  • Vision capabilities to understand web pages
  • Autonomous planning to complete multi-step tasks
Unlike scripted automation, agents can adapt to changes, handle unexpected scenarios, and complete tasks without predefined workflows.

Quick Start

Create and run an agent in a few lines:
agent_quickstart.py

from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(
        session=session,
        reasoning_model="gemini/gemini-2.0-flash",
        max_steps=10
    )

    result = agent.run(
        task="Go to example.com and find the contact email"
    )

    print(result.answer)
Agents run within browser sessions. Use context managers to ensure sessions are automatically stopped when done. This prevents orphaned sessions and unexpected costs.
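To illustrate why the context manager matters, here is a minimal sketch using a hypothetical `FakeSession` class (a stand-in written for this example, not the SDK's session class). It shows that the cleanup in `__exit__` runs even when the code inside the `with` block raises.

```python
class FakeSession:
    """Hypothetical stand-in for a browser session; not the SDK class."""

    def __init__(self):
        self.running = False

    def __enter__(self):
        # Entering the `with` block starts the session.
        self.running = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block always stops it, whether or not an error occurred.
        self.running = False
        return False  # do not swallow the exception


session = FakeSession()
try:
    with session:
        raise RuntimeError("task crashed")
except RuntimeError:
    pass

print(session.running)
```

Without the `with` block, the same exception would leave the session running until it is stopped some other way.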

How Agents Work

1. Observation

The agent observes the current page state:
  • Visible elements and their properties
  • Interactive components (buttons, forms, links)
  • Text content and structure
  • Current URL and page metadata

2. Reasoning

Using the LLM, the agent:
  • Understands the current page
  • Plans the next action to complete the task
  • Decides which element to interact with
  • Determines when the task is complete

3. Action

The agent executes browser actions:
  • Navigate to URLs
  • Click buttons and links
  • Fill forms
  • Extract data
  • Scroll and interact with dynamic content

4. Iteration

This cycle repeats until:
  • The task is successfully completed
  • Maximum steps are reached
  • An error occurs that can’t be resolved
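The four phases above can be sketched as a plain loop. This is a toy model under stated assumptions: `FakePage`, `observe`, `reason`, and `run_loop` are hypothetical names invented for illustration, and a one-line keyword rule stands in for the LLM. It is not how the SDK is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for a web page; not part of any SDK."""
    url: str
    text: str
    links: dict[str, "FakePage"] = field(default_factory=dict)


def observe(page):
    # 1. Observation: collect what is visible on the current page.
    return {"url": page.url, "text": page.text, "links": sorted(page.links)}


def reason(observation, task_keyword):
    # 2. Reasoning: a trivial rule in place of an LLM call.
    if task_keyword in observation["text"]:
        return ("done", observation["text"])
    if observation["links"]:
        return ("click", observation["links"][0])
    return ("fail", "no way forward")


def run_loop(start, task_keyword, max_steps=10):
    page = start
    # 4. Iteration: repeat until done, failed, or out of steps.
    for step in range(1, max_steps + 1):
        action, arg = reason(observe(page), task_keyword)
        if action == "done":
            return {"success": True, "answer": arg, "steps": step}
        if action == "fail":
            return {"success": False, "answer": arg, "steps": step}
        # 3. Action: follow the chosen link.
        page = page.links[arg]
    return {"success": False, "answer": "max steps reached", "steps": max_steps}


contact = FakePage("https://example.com/contact", "Email: hello@example.com")
home = FakePage("https://example.com", "Welcome", {"Contact": contact})
result = run_loop(home, "Email")
print(result)
```

The three return paths correspond to the three stopping conditions listed above: task completed, maximum steps reached, and an error that cannot be resolved.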

Agents vs Scripted Automation

Both agents and scripted automation run on browser sessions, the cloud browser infrastructure. The difference is how you control what happens in that session.
Aspect      | Scripted Automation      | Agent
Control     | You write the code       | AI decides each step
Flexibility | Fixed workflow           | Adapts to changes
Speed       | Fast (direct execution)  | Slower (LLM reasoning per step)
Cost        | Browser minutes only     | Browser minutes + LLM calls
Reliability | Deterministic            | Can vary based on page state
Use Case    | Known, stable workflows  | Unknown or dynamic workflows
Use scripted automation when:
  • You know the exact steps to take
  • Speed and cost are critical
  • The target pages rarely change
Use agents when:
  • You don’t know the exact steps
  • Pages change frequently
  • You need intelligent decision-making
You can combine both approaches: use an agent to figure out a workflow, then convert it to a function for faster, cheaper repeated execution.
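As a rough illustration of that combined approach, the sketch below replays a recorded list of steps as an ordinary function, with no LLM involved at replay time. Everything here is an assumption made for the example: the `(action, argument)` step format, the `LoggingBrowser` stub, and `make_workflow` are hypothetical and do not reflect the SDK's actual step log or function-conversion feature.

```python
# Hypothetical step log, as an agent run might produce it. The real SDK's
# step format is not shown in this document.
recorded_steps = [
    ("goto", "https://example.com/products"),
    ("click", "Pricing"),
    ("extract", "price table"),
]


class LoggingBrowser:
    """Stand-in browser that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def goto(self, url):
        self.calls.append(f"goto {url}")

    def click(self, label):
        self.calls.append(f"click {label}")

    def extract(self, what):
        self.calls.append(f"extract {what}")
        return f"<{what}>"


def make_workflow(steps):
    """Turn a recorded step list into a plain function with no LLM calls."""

    def workflow(browser):
        last = None
        for action, arg in steps:
            # Dispatch each recorded action to the matching browser method.
            last = getattr(browser, action)(arg)
        return last

    return workflow


pricing_workflow = make_workflow(recorded_steps)
browser = LoggingBrowser()
output = pricing_workflow(browser)
print(browser.calls, output)
```

The replayed function is fast and deterministic, but it inherits the usual weakness of scripted automation: it breaks if the page changes, at which point the agent can be run again to rediscover the steps.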

Agent Capabilities

Agents come with several built-in capabilities.

Key Concepts

Natural Language Tasks

Give instructions in plain English:
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session)
    agent.run(task="Find the cheapest laptop under $1000 and add it to cart")

Structured Output

Get responses in a specific format:
from pydantic import BaseModel
from notte_sdk import NotteClient

client = NotteClient()

class ContactInfo(BaseModel):
    email: str
    phone: str | None


with client.Session() as session:
    agent = client.Agent(session=session)
    result = agent.run(
        task="Extract contact information",
        response_format=ContactInfo
    )

Starting URL

Begin at a specific page:
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    # Create the agent before running it
    agent = client.Agent(session=session)
    agent.run(
        task="Find pricing information",
        url="https://example.com/products"
    )

Step Limits

Control maximum actions:
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(
        session=session,
        max_steps=20  # Limit to 20 actions
    )

Error Handling

Agents can fail for various reasons. Always check the result:
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    # Create the agent before running it
    agent = client.Agent(session=session)
    result = agent.run(task="Complete task")

    if result.success:
        print(result.answer)
    else:
        print(f"Agent failed: {result.answer}")
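When failures are transient, one option is to wrap the run in a bounded retry. The sketch below is self-contained and uses hypothetical stand-ins (`FakeResult`, `FlakyAgent`, `run_with_retries`) rather than the SDK. It assumes only what the example above shows: that `run(task=...)` returns an object with `success` and `answer` attributes.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical result object mirroring the two fields used above."""
    success: bool
    answer: str


class FlakyAgent:
    """Hypothetical agent stub: fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def run(self, task):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            return FakeResult(False, "page did not load")
        return FakeResult(True, f"done: {task}")


def run_with_retries(agent, task, max_attempts=3):
    """Run the task until it succeeds or the attempts are used up."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = None
    for _ in range(max_attempts):
        result = agent.run(task=task)
        if result.success:
            break
    # On exhaustion this is the last failed result, so callers still
    # need to check result.success.
    return result


agent = FlakyAgent(failures_before_success=2)
result = run_with_retries(agent, "Complete task")
print(result.success, result.answer, agent.calls)
```

Each retry costs additional browser time and LLM calls, so a small `max_attempts` is usually the safer default.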

Next Steps