LiteRT-LM model host for ADK agents¶
You can use the LiteRT-LM library to efficiently run language models locally on various compute devices without requiring specialized processors such as graphics processing units (GPUs) or tensor processing units (TPUs). LiteRT-LM supports many models, including Google Gemma models as well as third-party models. This guide provides instructions for setting up LiteRT-LM with ADK for the following languages:
Python¶
These instructions describe how to use the LiteRT-LM server with ADK in Python
with a Gemma open-weights model, including how to run LiteRT-LM's local model
server with the lit CLI tool.
Install resources¶
You need to download a model to use with LiteRT-LM, and the lit CLI tool
to help you find and download a model.
Install lit CLI tool¶
Download and install the lit CLI tool by following the
instructions in the LiteRT-LM GitHub repository.
Download a model¶
Before you start the server, you need to download a model. You need a
Hugging Face user access token to download a LiteRT-LM model using lit. You
can create a token in your Hugging Face account settings.
To see a list of models available for download, use the lit list command:
Download a model using the lit pull command:
Configure your agent¶
Configure your agent to connect to LiteRT-LM and a hosted model.
When running Gemma models with LiteRT-LM, you configure a Gemini
model class with the model identifier and local network address.
To use LiteRT-LM with ADK and a Gemma model:
- Set base_url to the LiteRT-LM server URL, for example: http://localhost:8001.
- Set model to the LiteRT-LM model name, for example: gemma3n-e2b.
The following example code shows how to configure an agent to connect to the locally hosted LiteRT-LM instance serving the Gemma model configuration described above:
from google.adk.agents import Agent
from google.adk.models import Gemini

# roll_die and check_prime are tool functions that must be defined
# (or imported) before this point.
root_agent = Agent(
    # Point the Gemini model class at the local LiteRT-LM server.
    model=Gemini(
        model="gemma3n-e2b",
        base_url="http://localhost:8001",
    ),
    name="dice_agent",
    description=(
        "hello world agent that can roll a die of 8 sides and check prime"
        " numbers."
    ),
    instruction="""
      You roll dice and answer questions about the outcome of the dice rolls.
    """,
    tools=[
        roll_die,
        check_prime,
    ],
)
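The agent configuration above lists two tools, roll_die and check_prime, without showing their definitions. The following is a minimal sketch of what such tools could look like. The signatures, return values, and message wording are assumptions made for illustration and are not taken from the original sample; in a real agent module they would need to appear before root_agent is created.

```python
import math
import random


def roll_die(sides: int) -> int:
    """Roll a die with the given number of sides and return the result.

    Hypothetical tool: the signature is an assumption for illustration.
    """
    if sides < 1:
        raise ValueError("sides must be at least 1")
    return random.randint(1, sides)


def check_prime(nums: list[int]) -> str:
    """Report which of the given numbers are prime.

    Hypothetical tool: the return format is an assumption for illustration.
    """
    primes = []
    for n in nums:
        if n < 2:
            continue  # 0, 1, and negative numbers are not prime
        # n is prime if no integer in [2, sqrt(n)] divides it evenly.
        if all(n % d != 0 for d in range(2, math.isqrt(n) + 1)):
            primes.append(n)
    if not primes:
        return "No prime numbers found."
    return ", ".join(str(p) for p in primes) + " are prime numbers."
```

Plain Python functions with type hints and docstrings like these can be passed directly in the tools list, as the configuration above does.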
Then run the agent as usual:
Running the LiteRT-LM server¶
The LiteRT-LM server is a separate process that serves LiteRT-LM models. It is
started by the LiteRT-LM CLI tool lit.
Run the server¶
After downloading a model, start the LiteRT-LM server locally by running the following command:
Local Server Port Number
You may choose any port number for the LiteRT-LM server as long as it matches the base_url you set in the Gemini class in your agent code.
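Because the port in base_url must match the port the server listens on, it can help to define the port in one place and confirm that something is listening there before you start the agent. The sketch below uses only the Python standard library. The environment variable name LITERT_LM_PORT and both helper functions are hypothetical and are not part of ADK or LiteRT-LM.

```python
import os
import socket


def build_base_url(port: int, host: str = "localhost") -> str:
    """Build the URL to pass as base_url (hypothetical helper)."""
    return f"http://{host}:{port}"


def is_server_listening(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    This only checks that some process accepts connections on the port;
    it does not verify that the process is a LiteRT-LM server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # LITERT_LM_PORT is an assumed variable name; 8001 matches the example above.
    port = int(os.environ.get("LITERT_LM_PORT", "8001"))
    base_url = build_base_url(port)
    if is_server_listening(port):
        print(f"Server reachable, use base_url={base_url}")
    else:
        print(f"Nothing is listening at {base_url}; start the LiteRT-LM server first.")
```

You could then pass the value returned by build_base_url as the base_url argument of the Gemini class, so the agent and the server share a single port setting.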
Debugging¶
To see incoming requests to the LiteRT-LM server and the exact input sent to the
model, use the --verbose flag:
Kotlin¶
These instructions describe how to use LiteRT-LM with ADK in Kotlin using
the com.google.adk.kt.litertlm package.
Install resources¶
You need to download a model to use with LiteRT-LM, and the litert-lm CLI tool
to help you find and download a model.
Install LiteRT-LM CLI¶
Prerequisites: Python 3.10 or higher
To install the CLI, run:
For additional installation methods, such as using uv, see LiteRT-LM CLI Installation Guide.
Download a model¶
Use the litert-lm CLI tool to download a model compatible with LiteRT-LM.
You can use litert-lm to download models directly from Hugging Face:
litert-lm import \
  --from-huggingface-repo litert-community/gemma-4-E2B-it-litert-lm \
  gemma-4-E2B-it.litertlm
Once downloaded, the model is stored locally at:
For more details about litert-lm, refer to the
LiteRT-LM CLI Usage Guide.
Add dependencies¶
ADK Kotlin works with LiteRT-LM through an adapter package,
com.google.adk:google-adk-kotlin-litertlm.
In your build.gradle.kts, add com.google.adk:google-adk-kotlin-litertlm and
com.google.ai.edge.litertlm:litertlm-jvm to your dependencies:
repositories {
    mavenCentral()
    google()
}

dependencies {
    implementation("com.google.adk:google-adk-kotlin-core:0.4.0")
    implementation("com.google.adk:google-adk-kotlin-litertlm:0.4.0")
    implementation("com.google.ai.edge.litertlm:litertlm-jvm:0.13.1")
    // other dependencies...
}
Configure agent model¶
Run a local model for your agent with LiteRT-LM by configuring a
LiteRtLmModel object as part of your LlmAgent object. If you do not
already have an ADK Kotlin project, follow the
Kotlin Quickstart for ADK
guide. The following code example shows how to
configure an LlmAgent and set its model parameter to a LiteRtLmModel:
object HelloTimeAgent {
    // Get model path from environment variable.
    private val modelPath: String by lazy {
        System.getenv("LITERT_LM_MODEL_PATH")
            ?: throw IllegalStateException(
                "LITERT_LM_MODEL_PATH environment variable must be set pointing to a .litertlm file."
            )
    }

    @JvmField
    val rootAgent =
        LlmAgent(
            name = "hello_time_agent",
            description = "Tells the current time in a specified city.",
            model =
                LiteRtLmModel.create(
                    EngineConfig(modelPath = modelPath, backend = Backend.CPU())
                ),
            instruction =
                Instruction(
                    "You are a helpful assistant that tells the current time in a city. " +
                        "Use the 'getCurrentTime' tool for this purpose."
                ),
            tools = TimeService().generatedTools(),
        )
}
In this example, the path to the LiteRT-LM model file is read from the
environment variable LITERT_LM_MODEL_PATH, and the model runs on the CPU.
To run the model on a GPU instead, set backend = Backend.GPU().
When you run the agent, set LITERT_LM_MODEL_PATH to the location of the model
file, for example: ~/.litert-lm/models/gemma-4-E2B-it.litertlm/model.litertlm.
Run your agent¶
If you followed the Kotlin Quickstart for ADK
with the above modifications, you can run your ADK agent using the command-line
REPL with the environment variable LITERT_LM_MODEL_PATH set to the path of
the model file:
Example interaction: