Environment simulation for evaluations

Supported in ADK Python v1.24.0

When evaluating agents that rely on external dependencies (such as APIs, databases, or third-party services), running those tools live during testing can be slow, costly, or unreliable. The Environment Simulator lets you safely intercept these tool calls during agent execution and replace them with controlled, deterministic responses, without modifying the agent itself. This fills a gap in the agent improvement loop: you can create hermetic, offline test runs that isolate your agent logic for reliable scoring.

Overall, this feature lets you:

  • Test how an agent handles API errors or edge-case responses.
  • Run evaluations offline, without access to live backends.
  • Generate realistic mock responses automatically using an LLM.
  • Produce reproducible test runs by seeding probabilistic injections.

The Environment Simulation integrates with ADK's tool execution pipeline via the before_tool_callback hook or the plugin system, so no changes to your agent code are required.

The Environment Simulation is an experimental feature. Its API may change in future releases.

How it works

While User Simulation drives the conversation forward, Environment Simulation provides the stable backend. At a high level, the Environment Simulator sits between your agent and its tools. When the agent calls a tool, the simulator intercepts the call and decides whether to return a synthetic response — either a predefined injection or an LLM-generated mock — or to let the real tool execute.

The decision logic follows this order for each configured tool:

  1. Injection configs are checked first, in order. If a matching injection is found (based on argument matching and probability), its error or response is returned immediately.
  2. Mock strategy is used as a fallback if no injection config applies. The simulator calls an LLM to generate a realistic response based on the tool's schema and any stateful context.
  3. No-op is returned (None) if the tool is not in the simulator config, allowing the real tool to execute normally.
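The decision order above can be sketched in plain Python. This is only an illustration of the three steps, not the simulator's actual code: the function name simulate_tool_call, the dictionary-based config layout, and the mock_fn stand-in for the LLM-backed mock strategy are all assumptions made for this example.

```python
import random
from typing import Any, Callable, Optional


def simulate_tool_call(
    tool_name: str,
    args: dict[str, Any],
    configs: dict[str, dict[str, Any]],
    mock_fn: Callable[[str, dict[str, Any]], dict[str, Any]],
    rng: random.Random,
) -> Optional[dict[str, Any]]:
    """Returns a synthetic response, or None to let the real tool run."""
    tool_config = configs.get(tool_name)
    if tool_config is None:
        # Step 3: the tool is not configured, so the real tool executes.
        return None

    # Step 1: injections are checked first, in list order.
    for injection in tool_config.get("injections", []):
        match_args = injection.get("match_args") or {}
        matches = all(k in args and args[k] == v for k, v in match_args.items())
        if not matches:
            continue
        if rng.random() < injection.get("probability", 1.0):
            return injection["result"]

    # Step 2: no injection applied, so fall back to the mock strategy.
    return mock_fn(tool_name, args)
```

Returning None for an unconfigured tool mirrors the no-op case: the caller treats None as "run the real tool".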

Integration

The EnvironmentSimulationFactory class provides two integration points:

  • create_callback() — Returns an async callable suitable for use as a before_tool_callback on any LlmAgent.
  • create_plugin() — Returns an EnvironmentSimulationPlugin instance that integrates with the ADK plugin system.

Using as a callback

The following example shows how to register environment simulation as the before_tool_callback of an ADK agent.

from google.adk.agents import LlmAgent
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="get_user_profile",
            injection_configs=[
                InjectionConfig(
                    injected_error=InjectedError(
                        injected_http_error_code=503,
                        error_message="Service temporarily unavailable.",
                    )
                )
            ],
        )
    ]
)

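# Placeholder tool for illustration. Replace it with your real implementation.
# The injection above fires on every call, so this body never runs under simulation.
def get_user_profile(user_id: str) -> dict:
    """Returns the profile for the given user."""
    raise NotImplementedError("Replace with a real backend call.")


agent = LlmAgent(
    name="my_agent",
    model="gemini-2.5-flash",
    tools=[get_user_profile],
    before_tool_callback=EnvironmentSimulationFactory.create_callback(config),
)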

Using as a plugin

The following example shows how to register environment simulation as an ADK plugin.

from google.adk.apps import App
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        )
    ]
)

# my_agent is assumed to be an LlmAgent defined elsewhere.
app = App(
    name="my_app",
    root_agent=my_agent,
    plugins=[EnvironmentSimulationFactory.create_plugin(config)],
)

Configuration reference

You can configure the Environment Simulator with a set of dataclasses. The following sections provide a detailed reference for each configuration object.

EnvironmentSimulationConfig

The top-level configuration object.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| tool_simulation_configs | List[ToolSimulationConfig] | required | One entry per tool to simulate. Must not be empty, and tool names must be unique. |
| simulation_model | str | "gemini-2.5-flash" | The LLM used for tool connection analysis and mock response generation. |
| simulation_model_configuration | GenerateContentConfig | thinking enabled | LLM generation config for internal simulator calls. |
| environment_data | str \| None | None | Optional environment context (e.g., a JSON database snapshot) passed to mock strategies to generate more realistic responses. |
| tracing | str \| None | None | Tracing data (e.g., a prior agent run trace in JSON string format) to provide historical context. |

ToolSimulationConfig

Defines how a single named tool should be simulated.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| tool_name | str | required | Must match the tool's registered name exactly. |
| injection_configs | List[InjectionConfig] | [] | Zero or more injection configs, checked in order before the mock strategy. |
| mock_strategy_type | MockStrategy | MOCK_STRATEGY_UNSPECIFIED | Fallback strategy when no injection is triggered. |

InjectionConfig

Controls a single synthetic response that can be injected into a tool call. Exactly one of injected_error or injected_response must be set.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| injected_error | InjectedError \| None | None | Error to return (mutually exclusive with injected_response). |
| injected_response | Dict[str, Any] \| None | None | Fixed response dict to return (mutually exclusive with injected_error). |
| injection_probability | float | 1.0 | Probability in [0.0, 1.0] that this injection fires. |
| match_args | Dict[str, Any] \| None | None | If set, the injection only fires when the tool's arguments contain all key-value pairs in match_args. |
| injected_latency_seconds | float | 0.0 | Artificial delay (≤ 120 s) added before returning the injection result. |
| random_seed | int \| None | None | Seed for the probability check, enabling deterministic injection behavior. |
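The constraints in this table can be expressed as a small pre-flight check. The sketch below uses a hypothetical InjectionSpec dataclass and is_valid helper that are not part of ADK. It only mirrors the documented rules (exactly one payload field, probability in [0.0, 1.0], latency of at most 120 seconds). Whether the library enforces them in the same way, and the lower bound of 0 seconds for latency, are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class InjectionSpec:
    """Hypothetical stand-in that mirrors the documented InjectionConfig rules."""

    injected_error: Optional[dict[str, Any]] = None
    injected_response: Optional[dict[str, Any]] = None
    injection_probability: float = 1.0
    injected_latency_seconds: float = 0.0

    def __post_init__(self) -> None:
        # Exactly one of the two payload fields must be set.
        if (self.injected_error is None) == (self.injected_response is None):
            raise ValueError("Set exactly one of injected_error or injected_response.")
        if not 0.0 <= self.injection_probability <= 1.0:
            raise ValueError("injection_probability must be in [0.0, 1.0].")
        # The 120 s cap comes from the table; the 0 s floor is an assumption.
        if not 0.0 <= self.injected_latency_seconds <= 120.0:
            raise ValueError("injected_latency_seconds must be in [0, 120].")


def is_valid(**fields: Any) -> bool:
    """Returns True when the fields satisfy all of the rules above."""
    try:
        InjectionSpec(**fields)
    except ValueError:
        return False
    return True
```

For example, is_valid(injected_response={"status": "ok"}) holds, while setting both payload fields, or neither, does not.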

InjectedError

Defines an HTTP-style error response.

| Field | Type | Description |
| --- | --- | --- |
| injected_http_error_code | int | HTTP status code to surface as "error_code" in the tool response. |
| error_message | str | Human-readable message surfaced as "error_message" in the tool response. |

MockStrategy

Enum controlling how the simulator generates responses when no injection fires.

| Value | Description |
| --- | --- |
| MOCK_STRATEGY_TOOL_SPEC | Uses the tool's schema and stateful context to prompt an LLM to generate a realistic response. |
| MOCK_STRATEGY_TRACING | (Deprecated) Please use MOCK_STRATEGY_TOOL_SPEC with tracing input. |

Injection mode

Use injection configs to test specific failure or edge-case scenarios. Injections are evaluated in list order; the first one whose match_args criteria are met (and whose probability check passes) is applied.

Injecting errors

The following example shows how to inject an error with a specific error code and error message into a tool call.

from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

ToolSimulationConfig(
    tool_name="charge_payment",
    injection_configs=[
        InjectionConfig(
            injected_error=InjectedError(
                injected_http_error_code=402,
                error_message="Payment declined.",
            )
        )
    ],
)

The agent will receive {"error_code": 402, "error_message": "Payment declined."} instead of a real tool result, allowing you to evaluate how the agent handles payment failures.

Injecting fixed responses

Use the following InjectionConfig to return a successful response with a fixed payload.

InjectionConfig(
    injected_response={"status": "ok", "order_id": "ORD-9999"}
)

Conditional injection with argument matching

Use match_args to inject only when specific arguments are passed.

InjectionConfig(
    match_args={"item_id": "ITEM-404"},
    injected_error=InjectedError(
        injected_http_error_code=404,
        error_message="Item not found.",
    ),
)

Here, the error is injected only when the tool is called with item_id="ITEM-404". All other calls pass through to the next injection config or to the mock strategy.
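The matching rule described above (every key-value pair in match_args must appear in the tool's arguments) amounts to a subset check. The helper below is a minimal sketch of that rule. The name args_match and the use of plain equality for comparing values are assumptions, not the library's implementation.

```python
from typing import Any, Optional


def args_match(match_args: Optional[dict[str, Any]], tool_args: dict[str, Any]) -> bool:
    """True when every key-value pair in match_args appears in tool_args."""
    if not match_args:
        # No constraint: the injection is eligible for every call.
        return True
    return all(
        key in tool_args and tool_args[key] == value
        for key, value in match_args.items()
    )
```

Extra arguments in the call do not prevent a match: a call with item_id="ITEM-404" and quantity=2 still matches {"item_id": "ITEM-404"}.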

Probabilistic injection

Set injection_probability to a value between 0.0 and 1.0 to simulate flaky behavior. For reproducible test runs, pin the random outcome with random_seed.

InjectionConfig(
    injection_probability=0.3,
    random_seed=42,
    injected_error=InjectedError(
        injected_http_error_code=500,
        error_message="Internal server error.",
    ),
)
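To see why a seed makes the outcome reproducible, consider the sketch below. It assumes the probability check draws from a dedicated random number generator initialized with the seed. That is a plausible model of the behavior, but the simulator's real mechanism may differ, and injection_outcomes is a hypothetical helper.

```python
import random


def injection_outcomes(probability: float, seed: int, calls: int) -> list[bool]:
    """Simulates `calls` probability checks using a dedicated seeded RNG."""
    rng = random.Random(seed)
    return [rng.random() < probability for _ in range(calls)]


first_run = injection_outcomes(0.3, seed=42, calls=10)
second_run = injection_outcomes(0.3, seed=42, calls=10)
print(first_run == second_run)  # True: the same seed yields the same outcomes
```

Without a seed, each run would draw a different sequence, so an evaluation that failed once might not fail again on a rerun.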

Injecting latency

Use injected_latency_seconds to simulate slow backend responses, useful for testing timeout handling or user experience under degraded conditions.

InjectionConfig(
    injected_latency_seconds=5.0,
    injected_response={"result": "slow but successful"},
)
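As an illustration of the timeout handling that injected latency helps you exercise, the sketch below wraps a slow call in asyncio.wait_for. It does not use the simulator: slow_tool is a hypothetical stand-in whose delay plays the role of injected_latency_seconds, and the shape of the timeout error dict is an assumption.

```python
import asyncio
from typing import Any


async def slow_tool(latency_seconds: float) -> dict[str, Any]:
    """Hypothetical tool call that responds after an artificial delay."""
    await asyncio.sleep(latency_seconds)
    return {"result": "slow but successful"}


async def call_with_timeout(
    latency_seconds: float, timeout_seconds: float
) -> dict[str, Any]:
    """Returns the tool result, or an error dict if the call takes too long."""
    try:
        return await asyncio.wait_for(
            slow_tool(latency_seconds), timeout=timeout_seconds
        )
    except asyncio.TimeoutError:
        return {"error_message": "Tool call timed out."}


fast = asyncio.run(call_with_timeout(latency_seconds=0.01, timeout_seconds=1.0))
slow = asyncio.run(call_with_timeout(latency_seconds=0.5, timeout_seconds=0.05))
```

Here fast holds the normal response, while slow holds the timeout error, which is the branch an evaluation with injected latency would let you score.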

Combining multiple injection configs

Multiple injection configs on a single tool are checked in order. You can combine them to test multiple scenarios:

ToolSimulationConfig(
    tool_name="get_inventory",
    injection_configs=[
        # Always report zero stock for a specific out-of-stock item
        InjectionConfig(
            match_args={"sku": "OOS-001"},
            injected_response={"quantity": 0, "available": False},
        ),
        # Randomly fail 20% of the time for all other items
        InjectionConfig(
            injection_probability=0.2,
            random_seed=7,
            injected_error=InjectedError(
                injected_http_error_code=503,
                error_message="Inventory service unavailable.",
            ),
        ),
    ],
)

Mock strategy mode

When you want the simulator to generate plausible responses automatically — rather than returning hand-crafted values — use MOCK_STRATEGY_TOOL_SPEC.

The simulator uses an LLM to:

  1. Analyze the schemas of all tools the agent has access to, and identify stateful dependencies between them (e.g., a create_order tool produces an order_id that get_order consumes).
  2. Track a state store of IDs and resources created during the session.
  3. Generate a response that is consistent with the tool's schema and the current state, returning a 404-style error if a consuming tool requests a resource that was never created.

The following config enables this strategy for three related order tools:

from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="create_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
        ToolSimulationConfig(
            tool_name="get_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
        ToolSimulationConfig(
            tool_name="cancel_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ]
)

With this config, the simulator will automatically generate an order_id when create_order is mocked, and use it to return consistent results (or a not-found error) when get_order or cancel_order are subsequently called.
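The stateful behavior can be pictured with a toy state store. In the simulator, an LLM infers the dependencies and generates the responses. The class below hard-codes them purely for illustration, so OrderStateStore, its methods, and the ID format are hypothetical.

```python
import itertools
from typing import Any


class OrderStateStore:
    """Toy stand-in for the session state tracked during a simulated run."""

    def __init__(self) -> None:
        self._orders: dict[str, dict[str, Any]] = {}
        self._ids = itertools.count(1)

    def create_order(self, item: str) -> dict[str, Any]:
        # A producing tool creates a new ID and records it in the store.
        order_id = f"ORD-{next(self._ids):04d}"
        self._orders[order_id] = {"order_id": order_id, "item": item, "status": "open"}
        return self._orders[order_id]

    def get_order(self, order_id: str) -> dict[str, Any]:
        # A consuming tool gets a 404-style error for IDs that were never created.
        order = self._orders.get(order_id)
        if order is None:
            return {"error_code": 404, "error_message": f"Order {order_id} not found."}
        return order

    def cancel_order(self, order_id: str) -> dict[str, Any]:
        order = self.get_order(order_id)
        if "error_code" not in order:
            order["status"] = "cancelled"
        return order
```

Calling get_order with an ID returned by create_order yields the stored order, while any other ID yields the not-found error.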

Providing environment data

Pass domain-specific context through environment_data to make mock responses more realistic. This can be a JSON string representing a snapshot of your database or any structured context the LLM should use when generating responses.

import json

db_snapshot = {
    "products": [
        {"id": "P-001", "name": "Wireless Headphones", "price": 79.99, "stock": 12},
        {"id": "P-002", "name": "USB-C Hub", "price": 34.99, "stock": 0},
    ],
    "warehouse_location": "US-WEST-2",
}

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ],
    environment_data=json.dumps(db_snapshot),
)

The LLM will use this data to return product names, prices, and stock levels that match your domain, rather than generating arbitrary placeholder values.

Providing tracing data

Pass traces from prior runs of the agent through tracing to give the simulator historical context and make mock responses more realistic.

import json

agent_traces = [
    {
        "invocation_id": "inv-001",
        "user_content": {"role": "user", "parts": [{"text": "Search for high-end headphones"}]},
        "intermediate_data": {
            "tool_uses": [
                {
                    "name": "search_products",
                    "args": {"query": "high-end headphones"},
                    "response": {"products": [{"id": "P-123", "name": "Premium Wireless ANC Headphones"}]}
                }
            ]
        }
    }
]

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ],
    tracing=json.dumps(agent_traces),
)

The LLM can use these traces as historical context, so mock responses for search_products stay consistent with what the tool returned in prior runs, rather than being arbitrary placeholder values.

Mixing injections and mock strategy

Injection configs and a mock strategy can be combined on the same tool. Injections are always checked first; the mock strategy fires only when no injection applies.

ToolSimulationConfig(
    tool_name="send_notification",
    injection_configs=[
        # Always fail for a known-bad recipient
        InjectionConfig(
            match_args={"recipient_id": "INVALID"},
            injected_error=InjectedError(
                injected_http_error_code=400,
                error_message="Invalid recipient.",
            ),
        ),
    ],
    # For all other recipients, generate a plausible success response
    mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
)