Chapter 4 - Crafting Your First LlmAgent
This article is part of my web book series. All of the chapters can be found here, and the code is available on GitHub. For any issues with this book, contact me on LinkedIn.
Previously, we surveyed the fundamental components of the Agent Development Kit. Now it's time to put that knowledge into practice by building and dissecting our first `LlmAgent`. The `LlmAgent` (often aliased as `Agent` for convenience via `from google.adk.agents import Agent`) is the workhorse for creating agents that leverage the power of Large Language Models. This chapter will guide you through its anatomy, show how to provide instructions, explain how it interacts with LLMs, and demonstrate how to customize its behavior using configurations and callbacks.
Anatomy of an LlmAgent
Let's start by revisiting the basic structure of an `LlmAgent` definition. At its simplest, an `LlmAgent` requires a `name` and a `model`.
```python
from google.adk.agents import Agent  # LlmAgent is aliased as Agent

# A very minimal LlmAgent
basic_responder_agent = Agent(
    name="basic_responder",
    model="gemini-2.0-flash"  # Specifies the LLM to use
)
```
While this agent will function, it lacks specific guidance. Let’s look at the key parameters you’ll typically use:
- `name: str`: (Required) A unique string identifier for the agent. This name is used in logs, traces, and when one agent needs to refer to another (e.g., for transfer). It must be a valid Python identifier and cannot be "user".
- `model: Union[str, BaseLlm]`: (Required) Specifies the Large Language Model the agent will use. This can be:
  - A string identifier for a model known to the `LLMRegistry` (e.g., `"gemini-2.0-flash"`).
  - An instance of a `BaseLlm` subclass (e.g., `LiteLlm(model="openai/gpt-4")`).
- `instruction: Union[str, InstructionProvider]`: (Optional, but highly recommended) The system prompt or primary set of instructions given to the LLM to guide its behavior, personality, and task execution. It can be a static string or a callable (`InstructionProvider`) that dynamically generates the instruction based on the current context.
- `description: str`: (Optional, but important for multi-agent systems) A natural language description of what this agent does. This helps other agents (or an orchestrating LLM) decide whether this agent is suitable for a given task.
```python
from google.adk.agents import Agent

polite_translator_agent = Agent(
    name="polite_translator",
    model="gemini-2.0-flash",
    instruction="You are a polite translator. Translate the user's text into French. If the text is already in French, politely inform the user.",
    description="Translates English text to French politely."
)
```
Best Practice: Meaningful Agent Names and Descriptions
Choose a name that is a good programmatic identifier. The description is crucial for the LLM (and potentially other agents or developers) to understand the agent’s purpose. Make it concise but comprehensive. For example, instead of “Agent that does translations,” use “Translates user input from English to French, handling polite phrasings.”
Working with Instructions: Static vs. Dynamic
The `instruction` parameter is fundamental to shaping your agent's behavior.
Static Instructions:
The simplest way is to provide a static string, as shown in the `polite_translator_agent` example. This instruction is sent to the LLM as part of the system prompt with every request.
Dynamic Instructions with `InstructionProvider`:
Sometimes you need the agent's guiding instructions to change based on the current state of the conversation or external factors. For this, ADK allows you to pass a callable (a function or a method) as the `instruction`. This callable is an `InstructionProvider`.
An `InstructionProvider` is a function that takes a `ReadonlyContext` object as input and returns a string (the instruction) or an awaitable that resolves to a string. The `ReadonlyContext` gives you access to the current invocation ID, agent name, and session state (read-only).
Dynamic Instructions for Adaptive Behavior
InstructionProvider functions are powerful for making agents adapt to changing contexts (e.g., user roles, time of day, specific data in the session state). Use them when an agent’s core directive needs to be flexible rather than static.
Complexity in Dynamic Instructions
While powerful, overly complex logic within an `InstructionProvider` can make the agent's behavior harder to predict and debug. Aim for clarity and test these functions thoroughly. Remember, the instruction ultimately guides the LLM, so ensure it's coherent and unambiguous.
Following is example code in which the agent greets you differently depending on the time of day, using an `InstructionProvider`.
```python
from google.adk.agents import Agent
from google.adk.agents.readonly_context import ReadonlyContext
from datetime import datetime

from building_intelligent_agents.utils import load_environment_variables, DEFAULT_LLM

load_environment_variables()

def get_time_based_greeting_instruction(context: ReadonlyContext) -> str:
    current_hour = datetime.now().hour
    user_name = context.state.get("user:user_name", "there")
    if 5 <= current_hour < 12:
        greeting_time = "morning"
    elif 12 <= current_hour < 18:
        greeting_time = "afternoon"
    else:
        greeting_time = "evening"
    return (
        f"You are a cheerful assistant. Greet the user '{user_name}' and wish them "
        f"a good {greeting_time}. Then, ask how you can help."
    )

dynamic_greeter_agent = Agent(
    name="dynamic_greeter",
    model=DEFAULT_LLM,
    instruction=get_time_based_greeting_instruction,
    description="Greets the user dynamically based on the time of day and their name.",
)

if __name__ == "__main__":
    from google.adk.runners import InMemoryRunner
    from google.genai.types import Content, Part

    runner = InMemoryRunner(agent=dynamic_greeter_agent, app_name="DynamicApp")
    user_id = "jane_doe"
    session_id = "session_greet_jane"
    initial_state = {"user:user_name": "Jane"}
    runner.session_service._create_session_impl(
        app_name="DynamicApp", user_id=user_id, session_id=session_id, state=initial_state
    )

    user_message = Content(parts=[Part(text="Hello")])
    print(f"Running dynamic_greeter_agent for user: {user_id}...")
    for event in runner.run(user_id=user_id, session_id=session_id, new_message=user_message):
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="")
    print()
```
In this example, `get_time_based_greeting_instruction` accesses the session state (e.g., `user:user_name`) via the `ReadonlyContext` to personalize the instruction. The instruction sent to the LLM will change depending on when the agent is run and what's in the session state.
A Note on State Injection in Instructions
By default, ADK attempts to inject values from the session state into your static string instructions if they contain placeholders like `{my_variable}` or `{user:user_name}`. However, when you use an `InstructionProvider` function, this automatic state injection is bypassed for the instruction string returned by your provider. This is because your provider function already has access to the `ReadonlyContext` and can explicitly fetch and format any state variables it needs, offering more control.
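For example, a static instruction can lean on this injection directly. The following is a minimal sketch (the agent name and the `user:user_name` state key are illustrative, and the key must actually be present in the session state for the placeholder to resolve):

```python
from google.adk.agents import Agent

# Static instruction with a state placeholder: ADK substitutes the value of the
# session-state key "user:user_name" into the prompt before calling the LLM.
templated_greeter_agent = Agent(
    name="templated_greeter",
    model="gemini-2.0-flash",
    instruction="You are a friendly assistant. Address the user as {user:user_name} in every reply.",
)
```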
Understanding LLM Flows: The Default SingleFlow
When an `LlmAgent` is invoked, its interaction with the LLM and tools is managed by an LLM Flow (`google.adk.flows.llm_flows.BaseLlmFlow`). The default flow for a standalone `LlmAgent` (one not explicitly configured for complex multi-agent transfers) is the `SingleFlow`.
The `SingleFlow` essentially does the following in a loop:

- Prepares the `LlmRequest`: gathers history, instructions, and tool declarations.
- Calls the LLM: sends the request to the LLM.
- Processes the `LlmResponse`:
  - If the LLM returns text, it's yielded as an `Event`. This might be the final answer.
  - If the LLM requests a tool call, the `SingleFlow` executes the tool. The tool's response is then packaged and sent back to the LLM in the next iteration of the loop (often for summarization or to inform the next step).
- The loop continues until the LLM provides a final text response without requesting further tool calls, an error occurs, or a callback/tool signals to end the invocation.
We will explore more complex flows like `AutoFlow` (which enables agent-to-agent transfers) in the multi-agent systems part of the book. For now, it is sufficient to understand that `SingleFlow` handles the turn-by-turn conversation and tool use, as sketched below.
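To make that loop concrete, here is a toy, self-contained sketch of the control flow. It is illustrative only, not ADK source code; every name in it is a stand-in:

```python
from dataclasses import dataclass

@dataclass
class FakeLlmResponse:
    text: str | None = None
    tool_call: str | None = None  # name of a tool the "LLM" wants to call

def fake_llm(history: list[str]) -> FakeLlmResponse:
    # Pretend the model asks for a tool once, then answers in plain text.
    if not any(msg.startswith("tool:") for msg in history):
        return FakeLlmResponse(tool_call="lookup_weather")
    return FakeLlmResponse(text="It is sunny today.")

def run_turn(user_message: str) -> str:
    """Mimics the SingleFlow-style loop for one conversation turn."""
    history = [f"user: {user_message}"]
    while True:
        response = fake_llm(history)              # prepare request + call the LLM
        if response.tool_call:                    # the LLM requested a tool
            result = {"lookup_weather": "sunny"}[response.tool_call]
            history.append(f"tool: {result}")     # feed the tool result back next iteration
            continue
        return response.text                      # final text response ends the loop

print(run_turn("What's the weather like?"))
```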
Flows Abstract LLM Interaction Patterns
LLM Flows like SingleFlow (and AutoFlow later) encapsulate common patterns of interacting with an LLM, including preparing requests, handling tool calls, and processing responses. Understanding that a flow manages this loop helps you focus on the agent’s specific instructions and tools.
LLM Interaction: `LlmRequest` and `LlmResponse`
These two Pydantic models are central to how ADK agents communicate with LLMs:
- `google.adk.models.LlmRequest`: Encapsulates everything an agent sends to the LLM.
  - `model: Optional[str]`: The target model name.
  - `contents: list[types.Content]`: The conversation history, including previous user messages, agent responses, and tool call/response pairs.
  - `config: Optional[types.GenerateContentConfig]`: Contains:
    - `system_instruction: Optional[str]`: The compiled system prompt.
    - `tools: Optional[list[types.Tool]]`: Declarations of available tools.
    - `temperature`, `top_p`, `max_output_tokens`, `safety_settings`, etc.
    - `response_schema`: Set if you expect structured JSON output from the LLM.
  - `tools_dict: dict[str, BaseTool]`: A mapping of tool names to their `BaseTool` instances (used internally by ADK).
- `google.adk.models.LlmResponse`: Encapsulates what the agent receives from the LLM (see the short sketch after this list).
  - `content: Optional[types.Content]`: The primary output from the LLM. This `Content` object can contain:
    - `Part(text="...")`: A plain text response.
    - `Part(function_call=FunctionCall(...))`: A request by the LLM to call a specific tool.
    - `Part(inline_data=Blob(...))`: A multimodal response (e.g., image generation, though less common for agent text responses).
  - `partial: Optional[bool]`: True if this is part of a streaming text response.
  - `usage_metadata: Optional[types.GenerateContentResponseUsageMetadata]`: Information about token counts.
  - `error_code: Optional[str]`, `error_message: Optional[str]`: Set if an error occurred during the LLM call.
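As a quick illustration of how you might read these fields, here is a small sketch. It assumes you already hold an `LlmResponse` instance (for example, inside an `after_model_callback`, covered below) and is not a complete program:

```python
from google.adk.models.llm_response import LlmResponse

def summarize_llm_response(llm_response: LlmResponse) -> str:
    # Surface an error if the LLM call failed.
    if llm_response.error_message:
        return f"LLM error {llm_response.error_code}: {llm_response.error_message}"
    pieces = []
    for part in (llm_response.content.parts if llm_response.content else []):
        if part.text:
            pieces.append(f"text: {part.text[:40]!r}")               # plain text output
        elif part.function_call:
            pieces.append(f"tool call: {part.function_call.name}")   # the LLM asked for a tool
    if llm_response.usage_metadata:
        pieces.append(f"total tokens: {llm_response.usage_metadata.total_token_count}")
    return "; ".join(pieces) or "empty response"
```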
The `Runner` and the LLM Flow handle the construction of the `LlmRequest` and the conversion of the raw LLM output into an `LlmResponse` and, subsequently, into `Event` objects.
Customizing Model Behavior: generate_content_config
The `LlmAgent` allows you to pass a `generate_content_config` argument, which should be an instance of `google.genai.types.GenerateContentConfig`. This object lets you control various aspects of the LLM's generation process.
Following is an agent that helps with creative writing, with safety settings relaxed via the `generate_content_config` argument.
```python
from google.adk.agents import Agent
from google.genai.types import GenerateContentConfig, SafetySetting, HarmCategory, HarmBlockThreshold

from building_intelligent_agents.utils import load_environment_variables, create_session, DEFAULT_LLM, DEFAULT_REASONING_LLM

load_environment_variables()

custom_safety_settings = [
    SafetySetting(category=HarmCategory.HARM_CATEGORY_HARASSMENT, threshold=HarmBlockThreshold.BLOCK_NONE),
    SafetySetting(category=HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold=HarmBlockThreshold.BLOCK_NONE),
]

creative_writer_agent = Agent(
    name="creative_writer",
    model=DEFAULT_REASONING_LLM,
    instruction="You are a creative writer. Write a short, imaginative story based on the user's prompt.",
    description="Generates short creative stories.",
    generate_content_config=GenerateContentConfig(
        temperature=0.9,
        top_p=0.95,
        top_k=40,
        max_output_tokens=1024,
        safety_settings=custom_safety_settings,
    ),
)

if __name__ == "__main__":
    from google.adk.runners import InMemoryRunner
    from google.genai.types import Content, Part

    current_session_id = "creative_story_session"
    current_user_id = "creative_writer_user"
    runner = InMemoryRunner(agent=creative_writer_agent, app_name="CreativeApp")
    create_session(runner, current_session_id, current_user_id)

    user_prompt = Content(
        parts=[Part(text="A brave squirrel on a quest to find the legendary golden acorn.")],
        role="user",
    )
    print("Creative Writer Story:")
    for event in runner.run(user_id=current_user_id, session_id=current_session_id, new_message=user_prompt):
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="")
    print()
```
Key fields in `GenerateContentConfig`:

- `temperature: float`: Controls randomness. Lower values are more deterministic; higher values are more creative/random.
- `top_p: float`, `top_k: int`: Control token sampling strategies.
- `max_output_tokens: int`: Maximum number of tokens to generate.
- `stop_sequences: list[str]`: Strings that, if generated, cause the LLM to stop.
- `safety_settings: list[SafetySetting]`: Configure content safety filters (e.g., for harassment, hate speech).
- `response_mime_type` & `response_schema`: Used when you expect structured JSON output from the LLM (covered in detail later).
`GenerateContentConfig` within ADK

- Don't set `system_instruction` here directly. Use the `LlmAgent.instruction` parameter.
- Don't set `tools` here directly. Use the `LlmAgent.tools` parameter.
- Don't set `thinking_config` here directly. Use the `LlmAgent.planner` parameter with a `BuiltInPlanner`.

ADK manages these specific fields through its own dedicated agent parameters to ensure proper integration with its flows and tool-handling mechanisms.
Best Practice: Tune temperature for Desired Output
The temperature setting in GenerateContentConfig is one of the most impactful for controlling LLM output.
- Low temperature (e.g., 0.0-0.3): More focused, deterministic, good for factual recall or consistent formatting.
- High temperature (e.g., 0.7-1.0): More creative, diverse, good for brainstorming or story generation.
Experiment to find the right balance for your agent’s task.
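For instance, an agent doing factual Q&A or consistent formatting might use a configuration like the following sketch (the specific values are illustrative, not prescriptive):

```python
from google.genai.types import GenerateContentConfig

# Low-temperature configuration biased toward focused, repeatable output.
factual_config = GenerateContentConfig(
    temperature=0.2,               # low randomness for consistent answers
    top_p=0.8,                     # restrict sampling to high-probability tokens
    max_output_tokens=512,         # keep responses short
    stop_sequences=["\n\nUser:"],  # stop if the model starts writing a new turn
)
```

You would then pass this as `generate_content_config=factual_config` when constructing the agent.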
Callbacks for Fine-Grained Control
ADK provides several callback points within the `LlmAgent` lifecycle, allowing you to inject custom logic before or after key operations. These callbacks receive a `CallbackContext` object.
- `before_agent_callback: Optional[BeforeAgentCallback]`
  - Called before the agent's main `_run_async_impl` or `_run_live_impl` logic begins.
  - Receives: `CallbackContext`.
  - Can return: `Optional[types.Content]`. If content is returned, the agent's normal run is skipped, and this content is yielded as the agent's response. Useful for pre-emptive handling or validation.
- `after_agent_callback: Optional[AfterAgentCallback]` (see the sketch after this list)
  - Called after the agent's main run logic completes but before the final event is solidified.
  - Receives: `CallbackContext`.
  - Can return: `Optional[types.Content]`. If content is returned, it overrides any response the agent might have generated internally and is yielded as the final agent response. Useful for post-processing or adding boilerplate.
- `before_model_callback: Optional[BeforeModelCallback]`
  - Called just before the `LlmRequest` is sent to the LLM.
  - Receives: `CallbackContext`, `LlmRequest` (mutable).
  - Can return: `Optional[LlmResponse]`. If an `LlmResponse` is returned, the actual call to the LLM is skipped, and this response is used instead. Useful for caching, request modification, or mocking LLM calls.
- `after_model_callback: Optional[AfterModelCallback]`
  - Called immediately after receiving the `LlmResponse` from the LLM.
  - Receives: `CallbackContext`, `LlmResponse` (mutable).
  - Can return: `Optional[LlmResponse]`. If an `LlmResponse` is returned, it replaces the original LLM response. Useful for response modification, validation, or logging.
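Since the larger demo below does not use `after_agent_callback`, here is a small sketch of what one might look like. The `review_mode` state key and the replacement wording are hypothetical; the signature simply follows the pattern described above (receive a `CallbackContext`, return `Optional[Content]` to override the agent's response, or `None` to keep it):

```python
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.genai.types import Content, Part

def redact_when_in_review_cb(callback_context: CallbackContext) -> Optional[Content]:
    # Hypothetical post-processing: when the session is flagged for review,
    # replace whatever the agent produced with a standard holding message.
    if callback_context.state.get("review_mode"):
        return Content(parts=[Part(text="Your request has been queued for human review.")])
    return None  # keep the agent's own response

# Attached via: Agent(..., after_agent_callback=redact_when_in_review_cb)
```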
Callbacks for Monitoring and Modification
- Use before_model_callback to inspect or modify the exact prompt being sent to the LLM, or to implement caching.
- Use after_model_callback to inspect or modify the raw LLM response before ADK processes it further (e.g., for tool calls). This is also a good place to log token usage from response.usage_metadata.
- before_agent_callback is great for input validation or pre-emptive responses based on session state.
All callback types can be a single callable or a list of callables. If a list, they are executed in order until one returns a non-None value (which then short-circuits further callbacks in that list).
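Here is a brief sketch of the list form, assuming the `before_model_callback` signature used in the demo below; both callback functions are hypothetical placeholders:

```python
from typing import Optional

from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models.llm_request import LlmRequest
from google.adk.models.llm_response import LlmResponse

def log_request_cb(callback_context: CallbackContext, llm_request: LlmRequest) -> Optional[LlmResponse]:
    # Observational only: log and let processing continue.
    print(f"[{callback_context.agent_name}] sending request to model {llm_request.model}")
    return None

def canned_response_cb(callback_context: CallbackContext, llm_request: LlmRequest) -> Optional[LlmResponse]:
    # Returning an LlmResponse here would skip the real LLM call (e.g., on a cache hit).
    return None

chained_callbacks_agent = Agent(
    name="chained_callbacks_agent",
    model="gemini-2.0-flash",
    instruction="Answer briefly.",
    # Executed in order; the first non-None return short-circuits the rest.
    before_model_callback=[log_request_cb, canned_response_cb],
)
```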
The following agent uses callbacks for logging and even blocks certain users via a custom `before_agent_callback`.
```python
from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models.llm_request import LlmRequest
from google.adk.models.llm_response import LlmResponse
from google.genai.types import Content, Part
from typing import Optional
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

from building_intelligent_agents.utils import load_environment_variables, create_session, DEFAULT_LLM

load_environment_variables()

def my_before_agent_cb(callback_context: CallbackContext) -> Optional[Content]:
    logger.info(f"[{callback_context.agent_name}] In before_agent_callback. Invocation ID: {callback_context.invocation_id}")
    if "block_user" in callback_context.state.get("user:flags", []):
        logger.warning(f"[{callback_context.agent_name}] User is blocked. Skipping agent run.")
        return Content(parts=[Part(text="I'm sorry, I cannot process your request at this time.")])
    return None

async def my_before_model_cb(callback_context: CallbackContext, llm_request: LlmRequest) -> Optional[LlmResponse]:
    logger.info(f"[{callback_context.agent_name}] In before_model_callback. Modifying request.")
    if llm_request.contents and llm_request.contents[-1].role == "user":
        llm_request.contents[-1].parts[0].text = f"Consider this: {llm_request.contents[-1].parts[0].text}"
    return None

def my_after_model_cb(callback_context: CallbackContext, llm_response: LlmResponse) -> Optional[LlmResponse]:
    logger.info(f"[{callback_context.agent_name}] In after_model_callback.")
    if llm_response.content and llm_response.content.parts and llm_response.content.parts[0].text:
        llm_response.content.parts[0].text += " (Processed by after_model_callback)"
        llm_response.custom_metadata = {"source": "after_model_cb_modification"}
    return llm_response

callback_demo_agent = Agent(
    name="callback_demo_agent",
    model=DEFAULT_LLM,
    instruction="You are an echo agent. Repeat the user's message.",
    description="Demonstrates agent and model callbacks.",
    before_agent_callback=my_before_agent_cb,
    before_model_callback=my_before_model_cb,
    after_model_callback=my_after_model_cb,
)

if __name__ == "__main__":
    from google.adk.runners import InMemoryRunner

    runner = InMemoryRunner(agent=callback_demo_agent, app_name="CallbackApp")
    user_id = "cb_user"
    session_id = "s_normal"
    create_session(runner, session_id, user_id)

    print("\n--- Scenario 1: Normal Run ---")
    user_message1 = Content(parts=[Part(text="Hello ADK!")])
    for event in runner.run(user_id=user_id, session_id=session_id, new_message=user_message1):
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="")
            if event.custom_metadata:
                print(f" [Metadata: {event.custom_metadata}]", end="")
    print()

    print("\n--- Scenario 2: User Blocked ---")
    user_message2 = Content(parts=[Part(text="Another message.")])
    blocked_user_session_id = "s_blocked"
    runner.session_service._create_session_impl(
        app_name="CallbackApp", user_id="cb_user_blocked",
        session_id=blocked_user_session_id, state={"user:flags": ["block_user"]},
    )
    for event in runner.run(user_id="cb_user_blocked", session_id=blocked_user_session_id, new_message=user_message2):
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="")
    print()
```
Running this script will show log messages from the callbacks and will block a user whose session state contains the flag `"user:flags": ["block_user"]`.
```
Creating session: s_normal for user: cb_user on app: CallbackApp
Session created successfully.

--- Scenario 1: Normal Run ---
INFO:__main__:[callback_demo_agent] In before_agent_callback. Invocation ID: e-ce7ed1b6-979a-4a25-bc96-049ae6e20240
INFO:__main__:[callback_demo_agent] In before_model_callback. Modifying request.
INFO:google_adk.google.adk.models.google_llm:Sending out request, model: gemini-2.0-flash, backend: GoogleLLMVariant.GEMINI_API, stream: False
INFO:google_adk.google.adk.models.google_llm:
LLM Request:
-----------------------------------------------------------
System Instruction:
You are an echo agent. Repeat the user's message.
You are an agent. Your internal name is "callback_demo_agent".
The description about you is "Demonstrates agent and model callbacks."
-----------------------------------------------------------
Contents:
{"parts":[{"text":"Handle the requests as specified in the System Instruction."}],"role":"user"}
-----------------------------------------------------------
Functions:
-----------------------------------------------------------
INFO:google_genai.models:AFC is enabled with max remote calls: 10.
INFO:httpx:HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
INFO:google_adk.google.adk.models.google_llm:
LLM Response:
-----------------------------------------------------------
Text:
Handle the requests as specified in the System Instruction.
-----------------------------------------------------------
Function calls:
-----------------------------------------------------------
Raw response:
{"candidates":[{"content":{"parts":[{"text":"Handle the requests as specified in the System Instruction.\n"}],"role":"model"},"finish_reason":"STOP","avg_logprobs":5.087303262288597e-7}],"model_version":"gemini-2.0-flash","usage_metadata":{"candidates_token_count":11,"candidates_tokens_details":[{"modality":"TEXT","token_count":11}],"prompt_token_count":54,"prompt_tokens_details":[{"modality":"TEXT","token_count":54}],"total_token_count":65},"automatic_function_calling_history":[]}
-----------------------------------------------------------
INFO:__main__:[callback_demo_agent] In after_model_callback.
Handle the requests as specified in the System Instruction.
 (Processed by after_model_callback) [Metadata: {'source': 'after_model_cb_modification'}]

--- Scenario 2: User Blocked ---
INFO:__main__:[callback_demo_agent] In before_agent_callback. Invocation ID: e-77011598-d1b1-4dc8-98c8-95de79e87738
WARNING:__main__:[callback_demo_agent] User is blocked. Skipping agent run.
I'm sorry, I cannot process your request at this time.
```
Here is a sequence diagram for how the request/response flow is modified.
Best Practice: Keep Callbacks Focused
Callbacks should ideally perform a single, well-defined task (e.g., logging, a specific modification, a validation check). This keeps them maintainable and easier to understand within the overall agent flow. Avoid putting overly complex business logic directly into callbacks if it can be part of the agent’s primary logic or a tool.
Mutable Objects in Callbacks
Be mindful when modifying objects like LlmRequest or LlmResponse within callbacks. Changes made will affect the subsequent processing. This is powerful but requires care to avoid unintended side effects. Always log what you’re changing for easier debugging.
What’s Next?
We've now covered the essentials of creating and configuring a single `LlmAgent`. While this agent can respond based on its instructions and LLM, its true power is unlocked when it can interact with the outside world. Next, we'll learn how to give our agents the ability to perform actions by defining and using custom Python tools.