Runtime Configuration¶
RunConfig controls how agents behave at runtime, including streaming mode,
speech settings, LLM call limits, and live agent options. Pass a RunConfig
to runner.run_async() or runner.run_live() to override default behavior.
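As a minimal sketch (assuming an existing runner, user_id, and session_id from your application), constructing a RunConfig and passing it to run_async() looks like this:

from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

async def run_once(runner, user_id: str, session_id: str) -> None:
    # Unset RunConfig fields keep their defaults.
    config = RunConfig(streaming_mode=StreamingMode.NONE, max_llm_calls=500)
    async for event in runner.run_async(
        user_id=user_id,
        session_id=session_id,
        new_message=types.Content(role="user", parts=[types.Part(text="Hi")]),
        run_config=config,
    ):
        if event.is_final_response() and event.content and event.content.parts:
            print(event.content.parts[0].text)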
Manage sessions and context¶
For long-running sessions, you can control how much history is loaded and whether the context window is compressed:
get_session_config: Limits which events are fetched when loading a session. Use num_recent_events or after_timestamp to avoid loading the full event history on every invocation.
context_window_compression: Enables context window compression for LLM input, useful when sessions approach model context limits.
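A minimal sketch of both options, assuming GetSessionConfig is importable from google.adk.sessions.base_session_service and that the google.genai compression types below match your installed version; the token thresholds are illustrative:

from google.adk.agents.run_config import RunConfig
from google.adk.sessions.base_session_service import GetSessionConfig
from google.genai import types

config = RunConfig(
    # Only load the 50 most recent events when resuming the session.
    get_session_config=GetSessionConfig(num_recent_events=50),
    # Compress the LLM input once it grows past ~32k tokens.
    context_window_compression=types.ContextWindowCompressionConfig(
        trigger_tokens=32_000,
        sliding_window=types.SlidingWindow(target_tokens=16_000),
    ),
)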
Enable streaming¶
To control how the agent delivers responses, set the streaming_mode parameter:
StreamingMode.NONE (default): The runner returns one complete response per turn. Suitable for CLI tools, batch processing, and synchronous workflows.
StreamingMode.SSE: Server-Sent Events streaming. The runner yields partial events as the LLM generates, enabling typewriter-style UIs and real-time chat displays.
StreamingMode.BIDI: Reserved for bidirectional streaming, but not used in the standard run_async() path. For bidirectional streaming, use runner.run_live() instead.
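With SSE enabled, the same run_async() loop receives partial events as the model generates. A minimal sketch, assuming an existing runner and session and that incremental text arrives on event.content as described above:

from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

async def stream_reply(runner, user_id: str, session_id: str, text: str) -> None:
    config = RunConfig(streaming_mode=StreamingMode.SSE)
    async for event in runner.run_async(
        user_id=user_id,
        session_id=session_id,
        new_message=types.Content(role="user", parts=[types.Part(text=text)]),
        run_config=config,
    ):
        # Partial events carry incremental text; print chunks as they arrive.
        if event.partial and event.content and event.content.parts:
            print(event.content.parts[0].text or "", end="", flush=True)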
Set support_cfc=True alongside StreamingMode.SSE to enable Compositional
Function Calling (CFC), which allows the model to dynamically compose and
execute function calls. CFC uses the Live API under the hood.
Experimental
CFC support is experimental and its API or behavior may change in future releases.
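A minimal sketch of opting in (the exact behavior may change while CFC remains experimental):

from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.SSE,
    support_cfc=True,  # experimental; served via the Live API under the hood
)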
Configure audio and speech¶
For voice-enabled agents, configure speech synthesis, audio transcription, and response modalities.
speech_config: Sets the voice and language for speech output (e.g., the "Kore" voice with en-US).
response_modalities: Controls output formats. Set to ["AUDIO", "TEXT"] for agents that both speak and return text.
output_audio_transcription / input_audio_transcription: Enable transcription of audio output from the model and audio input from the user. Both default to AudioTranscriptionConfig() in Python.
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

config = RunConfig(
    speech_config=types.SpeechConfig(
        language_code="en-US",
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(
                voice_name="Kore"
            )
        ),
    ),
    response_modalities=["AUDIO", "TEXT"],
    streaming_mode=StreamingMode.SSE,
    max_llm_calls=1000,
)
import { RunConfig, StreamingMode } from '@google/adk';
import { Modality } from '@google/genai';

const config: RunConfig = {
  speechConfig: {
    languageCode: "en-US",
    voiceConfig: {
      prebuiltVoiceConfig: {
        voiceName: "Kore"
      }
    },
  },
  responseModalities: [Modality.AUDIO, Modality.TEXT],
  streamingMode: StreamingMode.SSE,
  maxLlmCalls: 1000,
};
import com.google.adk.agents.RunConfig;
import com.google.adk.agents.RunConfig.StreamingMode;
import com.google.common.collect.ImmutableList;
import com.google.genai.types.Modality;
import com.google.genai.types.PrebuiltVoiceConfig;
import com.google.genai.types.SpeechConfig;
import com.google.genai.types.VoiceConfig;

RunConfig runConfig =
    RunConfig.builder()
        .streamingMode(StreamingMode.SSE)
        .maxLlmCalls(1000)
        .responseModalities(
            ImmutableList.of(new Modality(Modality.Known.AUDIO), new Modality(Modality.Known.TEXT)))
        .speechConfig(
            SpeechConfig.builder()
                .voiceConfig(
                    VoiceConfig.builder()
                        .prebuiltVoiceConfig(
                            PrebuiltVoiceConfig.builder().voiceName("Kore").build())
                        .build())
                .languageCode("en-US")
                .build())
        .build();
Configure live agents¶
When using runner.run_live(), configure real-time behavior with these
additional parameters:
realtime_input_config: Configures how audio input is received from users.
proactivity: Allows the model to respond proactively and ignore irrelevant input.
enable_affective_dialog: When True, the model detects user emotions and adapts its tone accordingly.
avatar_config: Configures an avatar for live agents.
session_resumption: Enables transparent session resumption across disconnects.
save_live_blob: When True, saves live audio and video data to the session and artifact service.
tool_thread_pool_config: Runs tool executions in a background thread pool to keep the event loop responsive to user interruptions.
Not all parameters are available in every language. See the API reference for language-specific details.
from google.adk.agents.run_config import RunConfig, ToolThreadPoolConfig

config = RunConfig(
    save_live_blob=True,
    tool_thread_pool_config=ToolThreadPoolConfig(max_workers=8),
)
Thread pool and the GIL
Thread pools help with blocking I/O and C extensions that release the
GIL (e.g. time.sleep(), network calls, numpy). They do not help
with pure Python CPU-bound code since the GIL prevents true parallel
execution of Python bytecode.
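For the remaining live-only options, a minimal sketch, assuming the google.genai Live API types used below (ProactivityConfig, SessionResumptionConfig, RealtimeInputConfig, AutomaticActivityDetection) match your installed version:

from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

live_config = RunConfig(
    streaming_mode=StreamingMode.BIDI,
    # Let the model ignore irrelevant audio and respond proactively.
    proactivity=types.ProactivityConfig(proactive_audio=True),
    # Adapt tone to detected user emotion.
    enable_affective_dialog=True,
    # Resume transparently after a dropped connection.
    session_resumption=types.SessionResumptionConfig(transparent=True),
    # Keep automatic voice activity detection enabled for user audio.
    realtime_input_config=types.RealtimeInputConfig(
        automatic_activity_detection=types.AutomaticActivityDetection(disabled=False)
    ),
)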
Configure runtime limits and debugging¶
Use these parameters to control runtime guardrails and debugging:
max_llm_calls: Caps the total number of LLM calls per run (default: 500). Set to 0 or negative for unlimited calls, though this is not recommended for production. Values at or above sys.maxsize raise an error.
save_input_blobs_as_artifacts: When True, saves input blobs (e.g., uploaded files) as run artifacts for debugging and auditing.
custom_metadata: A dict[str, Any] of arbitrary metadata attached to the invocation, useful for tracing or logging.
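A minimal sketch of these guardrails; the values shown are illustrative:

from google.adk.agents.run_config import RunConfig

config = RunConfig(
    max_llm_calls=200,                   # cap LLM calls for this run
    save_input_blobs_as_artifacts=True,  # keep uploaded files for debugging
    custom_metadata={"experiment": "baseline", "trace_id": "abc123"},
)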
API reference¶
For the complete list of fields, types, and defaults, see the API reference for your language: