Configure agents with parameters that control their reasoning model, step limits, vision capabilities, and more.

Creating an Agent

Create an agent with configuration parameters:
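from notte_sdk import NotteClient

client = NotteClient()

# Optional resources; omit both if the task needs no credentials or identity
vault = client.Vault(vault_id="vault_123")
persona = client.Persona(persona_id="persona_456")

with client.Session() as session:
    agent = client.Agent(
        session=session,
        reasoning_model="gemini/gemini-2.0-flash",
        use_vision=True,
        max_steps=15,
        vault=vault,  # Optional
        persona=persona,  # Optional
    )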

Agent Creation Parameters

Parameters set when creating the agent instance.

session

Type: RemoteSession. Required.
The browser session the agent will use to execute actions. Must be a Notte session instance.
with client.Session(headless=False) as session:
    agent = client.Agent(session=session)

reasoning_model

Type: str. Default: "gemini/gemini-2.0-flash".
The large language model used for agent reasoning and decision-making. Supported models include gemini/gemini-2.0-flash, anthropic/claude-3.5-sonnet, anthropic/claude-3.5-haiku, openai/gpt-4o, and openai/gpt-4o-mini.
agent = client.Agent(
    session=session,
    reasoning_model="anthropic/claude-3.5-sonnet"
)
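
Because the model is passed as a plain string, a typo may only surface once the agent is created or run. A minimal sketch of a pre-flight check, assuming the list of supported models above is complete; `SUPPORTED_MODELS`, `DEFAULT_MODEL`, and `check_reasoning_model` are hypothetical names for illustration, not part of the Notte SDK:

```python
# Hypothetical helper, not part of notte_sdk. The set below is copied from
# the list of supported models in this section and may be incomplete.
SUPPORTED_MODELS = frozenset({
    "gemini/gemini-2.0-flash",
    "anthropic/claude-3.5-sonnet",
    "anthropic/claude-3.5-haiku",
    "openai/gpt-4o",
    "openai/gpt-4o-mini",
})

DEFAULT_MODEL = "gemini/gemini-2.0-flash"


def check_reasoning_model(name: str = DEFAULT_MODEL) -> str:
    """Return `name` unchanged if it is a listed model, else raise ValueError."""
    if name not in SUPPORTED_MODELS:
        options = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(
            f"Unsupported reasoning model {name!r}; expected one of: {options}"
        )
    return name
```

The checked value can then be passed straight through, for example `reasoning_model=check_reasoning_model("openai/gpt-4o")`.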

use_vision

Type: boolean. Default: true.
Whether to enable vision capabilities for the agent. Vision allows the agent to analyze images, screenshots, and visual page elements. Not all models support vision.
agent = client.Agent(
    session=session,
    use_vision=True  # Agent can understand images
)

max_steps

Type: int. Default: varies.
Maximum number of actions the agent can take before stopping. Must be between 1 and 50. Higher values allow more complex tasks but increase cost and execution time.
agent = client.Agent(
    session=session,
    max_steps=20  # Allow up to 20 actions
)
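
When the step limit comes from user input or a config file, the documented range can be checked before the agent is created. A minimal sketch, assuming only the 1 to 50 bounds stated above; `check_max_steps` and its constants are hypothetical and not part of the Notte SDK:

```python
# Hypothetical helper, not part of notte_sdk. The 1-50 bounds come from the
# max_steps description in this section.
MIN_STEPS = 1
MAX_STEPS = 50


def check_max_steps(value: int) -> int:
    """Return `value` if it lies in the documented range, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"max_steps must be an int, got {type(value).__name__}")
    if not MIN_STEPS <= value <= MAX_STEPS:
        raise ValueError(
            f"max_steps must be between {MIN_STEPS} and {MAX_STEPS}, got {value}"
        )
    return value
```

For example, `max_steps=check_max_steps(20)` passes the value through, while `check_max_steps(51)` raises `ValueError` locally.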

vault

Type: NotteVault. Optional.
Optional vault instance containing credentials the agent can use for authentication. See Vaults for details.
vault = client.Vault(vault_id="vault_123")

agent = client.Agent(
    session=session,
    vault=vault  # Agent can access vault credentials
)

persona

Type: NottePersona. Optional.
Optional persona providing the agent with phone numbers, email addresses, and other identity information. See Personas for details.
persona = client.Persona(persona_id="persona_456")

agent = client.Agent(
    session=session,
    persona=persona  # Agent can use persona information
)

notifier

Type: BaseNotifier. Optional.
Optional notifier that sends notifications when the agent completes or fails. Useful for long-running tasks.
from notte_core.common.notifier import EmailNotifier

notifier = EmailNotifier(email="user@example.com")

agent = client.Agent(
    session=session,
    notifier=notifier  # Get email when agent finishes
)

Agent Runtime Parameters

Parameters provided when running the agent.

task

Type: str. Required.
Natural language description of what the agent should accomplish. Be specific and clear for best results.
result = agent.run(
    task="Find the cheapest laptop under $1000 and add it to cart"
)

url

Type: str. Optional.
Optional starting URL for the agent. If not provided, the agent starts from the current page in the session.
result = agent.run(
    task="Extract pricing information",
    url="https://example.com/products"
)
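
Since `url` is optional, calling code that sometimes has a starting URL and sometimes does not can build the arguments conditionally. A minimal sketch; `build_run_kwargs` is a hypothetical helper, not part of the Notte SDK, and it assumes that leaving `url` out of the call is how to start from the session's current page:

```python
from typing import Any, Optional


def build_run_kwargs(task: str, url: Optional[str] = None) -> dict[str, Any]:
    """Assemble keyword arguments for agent.run(), omitting url when unset."""
    if not task.strip():
        raise ValueError("task must be a non-empty description")
    kwargs: dict[str, Any] = {"task": task}
    if url is not None:
        # Only pass url when a starting page is actually known.
        kwargs["url"] = url
    return kwargs
```

The result can be unpacked into the call, for example `agent.run(**build_run_kwargs("Extract pricing information", url="https://example.com/products"))`.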

response_format

Type: type[BaseModel]. Optional.
Optional Pydantic model defining the structure of the agent’s response. Use this to get type-safe, structured output. See Structured Output for details.
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

result = agent.run(
    task="Extract product information",
    response_format=Product
)

session_offset

Type: int. Optional.
Experimental. The step number in the session history from which the agent starts gathering context. If omitted, the agent starts with fresh memory. Set it to make the agent aware of actions already executed in the session.
# Execute some actions first
session.execute(type="goto", url="https://example.com")
session.execute(type="click", selector="button.search")

# Agent remembers actions from step 0
result = agent.run(
    task="Continue from where we left off",
    session_offset=0
)

Configuration Examples

Simple Agent

Minimal configuration for basic tasks:
with client.Session() as session:
    agent = client.Agent(session=session)
    result = agent.run(task="Find contact email")

Production Agent

Full configuration for production use:
production_agent.py
from notte_sdk import NotteClient

client = NotteClient()

vault = client.Vault(vault_id="prod_vault")
persona = client.Persona(persona_id="prod_persona")

with client.Session(headless=True, proxies=True) as session:
    agent = client.Agent(
        session=session,
        reasoning_model="anthropic/claude-3.5-sonnet",
        use_vision=True,
        max_steps=30,
        vault=vault,
        persona=persona
    )

    result = agent.run(
        task="Complete checkout process",
        url="https://store.example.com/cart"
    )

    if result.success:
        print(f"Order completed: {result.answer}")
    else:
        print(f"Failed: {result.answer}")

Structured Data Extraction

Agent configured for data extraction:
structured_extraction.py
from notte_sdk import NotteClient
from pydantic import BaseModel

class CompanyInfo(BaseModel):
    name: str
    email: str
    # Default to None so validation still passes when a field is missing
    phone: str | None = None
    address: str | None = None

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(
        session=session,
        reasoning_model="gemini/gemini-2.0-flash",
        max_steps=10
    )

    result = agent.run(
        task="Extract company contact information",
        url="https://example.com/contact",
        response_format=CompanyInfo
    )

    if result.success:
        company = CompanyInfo.model_validate_json(result.answer)
        print(f"Company: {company.name}")
        print(f"Email: {company.email}")
        print(f"Phone: {company.phone}")

Best Practices

1. Choose Appropriate Step Limits

Match max_steps to task complexity:
# Simple task (3-5 actions)
max_steps=5

# Medium complexity (5-15 actions)
max_steps=15

# Complex multi-page task (15-30 actions)
max_steps=30
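
The three tiers above can be kept in one place so that call sites name a tier instead of repeating numbers. A minimal sketch; `STEP_BUDGETS` and `step_budget` are hypothetical names, not part of the Notte SDK, and the values simply mirror the examples above:

```python
# Hypothetical lookup table, not part of notte_sdk. Tiers and values mirror
# the three step-limit examples in this section.
STEP_BUDGETS = {
    "simple": 5,    # about 3-5 actions
    "medium": 15,   # about 5-15 actions
    "complex": 30,  # about 15-30 actions, often multi-page
}


def step_budget(complexity: str) -> int:
    """Return the max_steps value for a named complexity tier."""
    try:
        return STEP_BUDGETS[complexity]
    except KeyError:
        raise ValueError(
            f"Unknown complexity {complexity!r}; expected one of {sorted(STEP_BUDGETS)}"
        ) from None
```

An agent for a multi-page task would then be created with `max_steps=step_budget("complex")`.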

2. Balance Cost and Capability

Use cheaper models for simple tasks:
# Simple navigation and extraction
reasoning_model="gemini/gemini-2.0-flash"

# Complex reasoning and decision-making
reasoning_model="anthropic/claude-3.5-sonnet"

3. Use Vision Selectively

Disable vision when not needed to reduce costs:
# Text-only site
agent = client.Agent(session=session, use_vision=False)

# Image-heavy site
agent = client.Agent(session=session, use_vision=True)
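
Practices 2 and 3 can be combined into a single decision made before the agent is created. A minimal sketch, assuming the two model recommendations above; `agent_options` is a hypothetical helper, not part of the Notte SDK, and reducing a task to two boolean traits is a deliberate simplification:

```python
from typing import Any


def agent_options(complex_reasoning: bool, image_heavy: bool) -> dict[str, Any]:
    """Build keyword arguments for client.Agent() from two task traits."""
    model = (
        "anthropic/claude-3.5-sonnet"  # complex reasoning and decision-making
        if complex_reasoning
        else "gemini/gemini-2.0-flash"  # simple navigation and extraction
    )
    # Vision is enabled only for image-heavy sites to keep costs down.
    return {"reasoning_model": model, "use_vision": image_heavy}
```

The dictionary can be unpacked alongside the session, for example `client.Agent(session=session, **agent_options(complex_reasoning=False, image_heavy=False))`.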

4. Provide Context via URL

Start agents at the right page:
# Good - start where needed
agent.run(
    task="Extract product details",
    url="https://example.com/product/123"
)

# Less efficient - agent must navigate first
agent.run(
    task="Go to product page and extract details",
    url="https://example.com"
)

Next Steps