Chapter 15 - ADK in Action - Building the "ADK Expert Agent"
This article is part of my web book series. All of the chapters can be found here, and the code is available on GitHub. For any issues with this book, contact me on LinkedIn.
Until now we’ve explored the individual components and capabilities of the Agent Development Kit. Now, let’s bring many of these concepts together in a sophisticated example: the “ADK Expert Agent.” This agent system is designed to be knowledgeable about ADK itself, assist with GitHub issues, generate documents, and create architecture diagrams.
By dissecting its structure and key functionalities, we can see a practical application of multi-agent design, specialized tooling, dynamic instructions, state management, and more, all orchestrated by ADK. The code for this can be found here.
Overview of the ADK Expert Agent System
The ADK Expert Agent is not a single monolithic agent but rather a multi-agent system orchestrated by a root agent. Its primary goal is to assist users with queries related to Google’s Agent Development Kit.
Core Capabilities:
- General ADK Q&A: Answers questions based on a loaded ADK knowledge base (available in expert-agents/data/google-adk-python-1.2.0.txt).
- GitHub Issue Processing: Fetches details for specified google/adk-python GitHub issues and provides ADK-specific guidance.
- Document Generation: Creates PDF, HTML slides, or PPTX slides from Marp content generated by an LLM based on user requests and ADK knowledge.
- Mermaid Diagram Generation: Generates Mermaid diagram syntax for user-described architectures and converts it into a PNG image.
High-Level Architecture: (Please Zoom-In)
The system’s entry point is the root_agent defined in expert-agents/agent.py.
# expert-agents/agent.py (Simplified Snippet - Top Level)
from google.adk.agents import Agent as ADKAgent
from google.adk.models import Gemini
from google.adk.tools.agent_tool import AgentTool
# ... other imports ...
from .agents.github_issue_processing_agent import github_issue_processing_agent
from .agents.document_generator_agent import document_generator_agent
from .agents.mermaid_diagram_orchestrator_agent import mermaid_diagram_orchestrator_agent
from .tools.prepare_document_content_tool import PrepareDocumentContentTool
from .config import PRO_MODEL_NAME # Using PRO_MODEL_NAME as in actual code
# ... root_agent_instruction_provider and root_agent_after_tool_callback defined elsewhere ...
root_agent_tools = [
AgentTool(agent=github_issue_processing_agent),
AgentTool(agent=document_generator_agent),
AgentTool(agent=mermaid_diagram_orchestrator_agent),
PrepareDocumentContentTool(),
]
root_agent = ADKAgent(
name="adk_expert_orchestrator",
model=Gemini(model=PRO_MODEL_NAME), # Using PRO_MODEL_NAME
instruction=root_agent_instruction_provider,
tools=root_agent_tools,
# ... callbacks and config ...
)
Orchestrator Pattern
The adk_expert_orchestrator exemplifies the orchestrator pattern. It doesn’t perform all tasks itself but intelligently delegates to specialized sub-agents (wrapped as AgentTools) based on the nature of the user’s query. This promotes modularity and separation of concerns.
The Root Orchestrator: adk_expert_orchestrator
Defined in expert-agents/agent.py, this is the primary LlmAgent that users interact with.
Key Features Demonstrated:
Dynamic Instructions (root_agent_instruction_provider): The logic within root_agent_instruction_provider in expert-agents/agent.py is key to the orchestrator’s routing capabilities.
# expert-agents/agent.py (Snippet of root_agent_instruction_provider)
def root_agent_instruction_provider(context: ReadonlyContext) -> str:
    # ... (imports and helper get_text_from_content) ...
    adk_context_for_llm = get_escaped_adk_context_for_llm()
    invocation_ctx = getattr(context, '_invocation_context', None)
    user_query_text = get_text_from_content(invocation_ctx.user_content) if invocation_ctx and invocation_ctx.user_content else ""
    # ... (logic for detecting previous tool calls) ...

    # --- Logic for routing based on user query ---
    diagram_keywords = ["diagram", "architecture", "visualize", "mermaid", "graph"]
    is_diagram_request = any(kw in user_query_text.lower() for kw in diagram_keywords)

    if is_diagram_request:
        logger.info(f"RootAgent (instruction_provider): Detected architecture diagram request: '{user_query_text}'")
        diagram_agent_input_payload = DiagramGeneratorAgentToolInput(diagram_query=user_query_text).model_dump_json()
        system_instruction = f"""
You are an expert orchestrator for Google's Agent Development Kit (ADK).
The user is asking for an architecture diagram. Their query is: "{user_query_text}"
Your task is to call the '{mermaid_diagram_orchestrator_agent.name}' tool.
The tool expects its input as a JSON string. The value for the "request" argument MUST be the following JSON string:
{diagram_agent_input_payload}
This is your only action for this turn. Output only the tool call.
"""
    # ... (elif blocks for document generation, GitHub issues, and general Q&A) ...
    else:  # General ADK Question
        logger.info(f"RootAgent (instruction_provider): General ADK query: '{user_query_text}'")
        system_instruction = f"""
You are an expert on Google's Agent Development Kit (ADK) version 1.0.0.
Your primary role is to answer general questions about ADK.
ADK Knowledge Context (for general ADK questions):
--- START OF ADK CONTEXT ---
{adk_context_for_llm}
--- END OF ADK CONTEXT ---
Use your ADK knowledge to answer the user's query: "{user_query_text}" directly. This is your final answer.
"""
    return system_instruction
Callbacks for Control and Logging (root_agent_after_tool_callback): This callback is vital for processing the structured (often JSON-string) output from the specialized AgentTools and presenting it correctly to the orchestrator’s LLM or directly to the user.
# expert-agents/agent.py (Snippet of root_agent_after_tool_callback)
async def root_agent_after_tool_callback(
    tool: BaseTool, args: dict, tool_context: ToolContext, tool_response: Any
) -> Optional[Any]:
    if tool.name == github_issue_processing_agent.name:
        logger.info(f"RootAgent: Processing response from '{github_issue_processing_agent.name}'.")
        tool_context.actions.skip_summarization = True  # Don't let orchestrator LLM summarize this
        response_text = "Error: Could not process response from GitHub issue agent."
        try:
            # The github_issue_processing_agent (a SequentialAgent ending with FormatOutputAgent)
            # is expected to return a JSON string that validates against SequentialProcessorFinalOutput
            if isinstance(tool_response, str):
                response_dict = json.loads(tool_response)
            elif isinstance(tool_response, dict):  # AgentTool might already give a dict if sub-agent output JSON
                response_dict = tool_response.get("result", tool_response)  # ADK wraps AgentTool output
            else:
                raise ValueError(f"Unexpected tool_response type: {type(tool_response)}")
            validated_output = SequentialProcessorFinalOutput.model_validate(response_dict)
            response_text = validated_output.guidance
        except Exception as e:
            # ... (error logging) ...
            response_text = f"Error processing GitHub agent output: {str(tool_response)[:200]}"
        return genai_types.Content(parts=[genai_types.Part(text=response_text)])

    elif tool.name == mermaid_diagram_orchestrator_agent.name:
        logger.info(f"RootAgent: Received response from '{mermaid_diagram_orchestrator_agent.name}'.")
        tool_context.actions.skip_summarization = True
        # This agent's after_agent_callback ensures its output is a string (URL or error)
        response_text = str(tool_response.get("result") if isinstance(tool_response, dict) else tool_response)
        return genai_types.Content(parts=[genai_types.Part(text=response_text)])

    # ... (similar handling for document_generator_agent) ...

    elif tool.name == PrepareDocumentContentTool().name:
        logger.info(f"RootAgent: 'prepare_document_content_tool' completed.")
        # Let the LLM summarize/use this structured output to call the next agent
        tool_context.actions.skip_summarization = False
        return tool_response  # This is a dict

    return None  # Default: let LLM summarize
Callbacks for Inter-Agent Data Transformation
The after_tool_callback in the root_agent is crucial for transforming the output of sub-agents (which may arrive as JSON strings or nested dicts when called via AgentTool) into a format (such as genai_types.Content) that the orchestrator’s LLM can readily consume for its next reasoning step, or that can be returned directly as the final user-facing response, especially when the default summarization is skipped (skip_summarization = True).
Specialized Agents: Divide and Conquer
The adk-expert-agent
effectively uses specialized agents for distinct, complex tasks.
1. github_issue_processing_agent:
This is a SequentialAgent, demonstrating nested orchestration. It executes a fixed pipeline of wrapper agents.
# Snippet from expert-agents/agents/github_issue_processing_agent.py
get_issue_description_wrapper_agent = LlmAgent(...)
adk_guidance_wrapper_agent = LlmAgent(...)
# ...
class FormatOutputAgent(BaseAgent): # Custom non-LLM agent for final formatting
# ...
async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
# ... logic to find last event and format output ...
yield Event(...)
# The Sequential Agent definition
github_issue_processing_agent = SequentialAgent(
name="github_issue_processing_sequential_agent",
description="Processes a GitHub issue by fetching, cleaning, and providing ADK-specific guidance.",
sub_agents=[
get_issue_description_wrapper_agent,
# A cleaning agent could go here if needed
adk_guidance_wrapper_agent,
FormatOutputAgent()
]
)
Custom BaseAgent for Deterministic Steps
The FormatOutputAgent within the github_issue_processing_agent is a great example of a custom BaseAgent. Its job is purely deterministic: find the output of the previous step and format it into the final JSON structure. This doesn’t require an LLM, making it faster, cheaper, and more reliable than prompting an LLM to do the formatting.
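To make this concrete, here is a minimal sketch of a deterministic formatting agent. It is hypothetical and simplified rather than the project’s actual FormatOutputAgent; it assumes the previous step’s text can be read from the session’s event history and that the final structure only needs a "guidance" field.
# Hypothetical, simplified sketch of a deterministic formatting agent
import json
from typing import AsyncGenerator
from google.adk.agents import BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event
from google.genai import types as genai_types

class SimpleFormatOutputAgent(BaseAgent):
    async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
        # Walk the session history backwards to find the last text produced by a previous step.
        previous_text = ""
        for event in reversed(ctx.session.events):
            if event.content and event.content.parts and event.content.parts[0].text:
                previous_text = event.content.parts[0].text
                break
        # Deterministically wrap it in the final JSON structure; no LLM call needed.
        final_json = json.dumps({"guidance": previous_text})
        yield Event(
            author=self.name,
            content=genai_types.Content(parts=[genai_types.Part(text=final_json)]),
        )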
2. mermaid_diagram_orchestrator_agent:
This LlmAgent demonstrates a two-step internal process using a sub-agent and a tool.
# expert-agents/agents/mermaid_diagram_orchestrator_agent.py (Definition Snippet)
# ... (Pydantic models and tool imports) ...
from .mermaid_syntax_generator_agent import mermaid_syntax_generator_agent
from ..tools.mermaid_to_png_and_upload_tool import mermaid_gcs_tool_instance
mermaid_diagram_orchestrator_agent = ADKAgent(
name="mermaid_diagram_orchestrator_agent",
model=Gemini(model=DEFAULT_MODEL_NAME), # Can use a simpler model for orchestration
instruction=diagram_orchestrator_instruction_provider, # Key logic here
tools=[
AgentTool(agent=mermaid_syntax_generator_agent), # Calls another agent as a tool
mermaid_gcs_tool_instance, # Calls a direct tool
],
input_schema=DiagramGeneratorAgentToolInput,
after_agent_callback=diagram_orchestrator_after_agent_cb, # Returns GCS link
# ... other configs ...
)
Its instruction_provider checks whether Mermaid syntax has already been generated in a previous turn. If not, it instructs the LLM to call the mermaid_syntax_generator_agent. If syntax is present, it instructs the LLM to call the mermaid_to_png_and_gcs_upload tool with that syntax.
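A minimal sketch of that two-branch provider is shown below. It is hypothetical and simplified: the real provider inspects the invocation context and conversation history, and the generated_mermaid_syntax state key here is invented purely for illustration.
# Hypothetical, simplified two-branch instruction provider
from google.adk.agents.readonly_context import ReadonlyContext

def diagram_orchestrator_instruction_provider_sketch(context: ReadonlyContext) -> str:
    # The "generated_mermaid_syntax" state key is illustrative, not the project's actual key.
    mermaid_syntax = context.state.get("generated_mermaid_syntax")
    if not mermaid_syntax:
        return (
            "Step 1: Call the 'mermaid_syntax_generator_agent' tool with the user's "
            "diagram request. Output only that tool call."
        )
    return (
        "Step 2: Call the 'mermaid_to_png_and_gcs_upload' tool, passing the previously "
        f"generated Mermaid syntax:\n{mermaid_syntax}\nOutput only that tool call."
    )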
Chaining AgentTool Calls
When one LlmAgent (the Orchestrator) calls another LlmAgent (the Specialist) via AgentTool, the Specialist’s input_schema and after_agent_callback are crucial. The Orchestrator’s LLM needs to provide input matching the Specialist’s input_schema (often as a JSON string). The Specialist’s after_agent_callback should ensure its final output (which the AgentTool returns) is in a format the Orchestrator’s after_tool_callback can parse and relay effectively.
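Below is a compressed, hypothetical sketch of the pieces that make such a chain work. The names SpecialistInput, diagram_specialist, and the temp:result_link state key are illustrative, not the project’s actual identifiers (which use DiagramGeneratorAgentToolInput and their own callbacks).
# Hypothetical sketch of wiring a Specialist behind an AgentTool
from pydantic import BaseModel
from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext
from google.adk.tools.agent_tool import AgentTool
from google.genai import types as genai_types

class SpecialistInput(BaseModel):
    # The Orchestrator's LLM must emit JSON matching this schema when it calls the tool.
    diagram_query: str

async def specialist_after_agent_cb(callback_context: CallbackContext) -> genai_types.Content | None:
    # Normalize the Specialist's final output into a plain-text Content that the
    # Orchestrator's after_tool_callback can parse and relay.
    link = callback_context.state.get("temp:result_link")
    if link:
        return genai_types.Content(parts=[genai_types.Part(text=str(link))])
    return None  # Fall back to the Specialist's own final response

specialist_agent = Agent(
    name="diagram_specialist",
    model="gemini-2.0-flash",
    instruction="Generate Mermaid syntax for the architecture described in the input.",
    input_schema=SpecialistInput,
    after_agent_callback=specialist_after_agent_cb,
)

# The Orchestrator simply lists the wrapped Specialist among its tools.
specialist_as_tool = AgentTool(agent=specialist_agent)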
Crafting Specialized Tools
The expert-agents/tools/ directory is rich with examples of custom tools.
Interacting with External APIs (github_issue_tool.py): This tool shows how to use requests to call the GitHub API, handle authentication with a PAT fetched from Secret Manager, and parse the JSON response.
# Snippet from expert-agents/tools/github_issue_tool.py
class GetGithubIssueDescriptionTool(BaseTool):
    # ...
    async def run_async(self, *, args: Dict[str, Any], tool_context: ToolContext) -> Dict[str, Any]:
        # ... (argument validation) ...
        api_url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"
        GITHUB_PERSONAL_ACCESS_TOKEN = get_github_pat_from_secret_manager()
        headers = {
            "Authorization": f"token {GITHUB_PERSONAL_ACCESS_TOKEN}",
            # ... other headers
        }
        try:
            response = requests.get(api_url, headers=headers, timeout=10)
            response.raise_for_status()
            # ... return description or error from response.json() ...
        except requests.exceptions.HTTPError as e:
            # ... handle specific HTTP errors ...
Executing Command-Line Tools (mermaid_to_png_and_upload_tool.py): This tool demonstrates how to safely execute an external command-line utility (mmdc) using Python’s asyncio.create_subprocess_exec for non-blocking execution.
# Snippet from expert-agents/tools/mermaid_to_png_and_upload_tool.py
async def run_async(self, args: Dict[str, str], tool_context: ToolContext) -> str:
    # ... (setup temporary files for input and output) ...
    try:
        logger.info(f"Running mmdc: {MERMAID_CLI_PATH} ...")
        process = await asyncio.create_subprocess_exec(
            MERMAID_CLI_PATH,
            "-p", PUPPETEER_CONFIG_PATH,
            "-i", mmd_file_path,
            "-o", png_file_path_local_temp,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE
        )
        stdout, stderr = await process.communicate()
        if process.returncode != 0:
            # ... handle and return error ...
        with open(png_file_path_local_temp, "rb") as f:
            png_data = f.read()
        # ... proceed to upload png_data to GCS ...
    finally:
        # ... clean up temporary files ...
The same pattern is used in marp_document_tools.py to call marp-cli.
Best Practice: Tools for IO and External Interactions
Encapsulate all interactions with external systems (APIs, CLIs, the file system) within tools. This keeps the agent logic (LLM prompts and reasoning) focused on what to do, while tools handle how to do it.
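For simple cases, ADK’s FunctionTool offers a lightweight way to follow this practice: wrap an ordinary function that performs the external interaction and hand it to the agent. The sketch below is illustrative only; the fetch_github_issue_title helper is hypothetical and not part of the project.
# Hypothetical sketch: encapsulating an external API call behind a FunctionTool
import requests
from google.adk.tools import FunctionTool

def fetch_github_issue_title(owner: str, repo: str, issue_number: int) -> dict:
    """Fetches the title of a GitHub issue; the agent only sees this docstring and signature."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return {"title": response.json().get("title", "")}

# The agent decides *when* to call the tool; the function decides *how* the API is called.
fetch_issue_title_tool = FunctionTool(func=fetch_github_issue_title)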
State Management and Data Passing (ToolContext)
This agent system demonstrates how state is used to pass results from one tool to another, or from a tool to a callback.
In mermaid_to_png_and_upload_tool.py and marp_document_tools.py:
# from ..config import DOC_LINK_STATE_KEY
# ... inside run_async after getting the signed URL ...
output = f"Document generated ... Download: {signed_url_str}"
if tool_context:
    tool_context.state[DOC_LINK_STATE_KEY] = output
    tool_context.actions.skip_summarization = True
return output
In document_generator_agent.py, the after_agent_callback then retrieves this value to ensure it becomes the final output:
# from ..tools import DOC_LINK_STATE_KEY
async def document_generator_after_agent_cb(callback_context: CallbackContext) -> genai_types.Content | None:
    gcs_link = callback_context.state.get(DOC_LINK_STATE_KEY)
    if gcs_link:
        return genai_types.Content(parts=[genai_types.Part(text=str(gcs_link))])
    return genai_types.Content(parts=[genai_types.Part(text="Error: Could not find document link.")])
Use ToolContext.state for Intermediate Data
Using the temp: state scope (e.g., State.TEMP_PREFIX + "gcs_link_for_diagram") is a good way to pass data between a tool and a callback within a single invocation turn without cluttering the persistent session state.
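A minimal sketch of that hand-off, assuming the TEMP_PREFIX constant from google.adk.sessions.state.State and an illustrative key name:
# Hypothetical sketch: passing a value from a tool to a callback via temp-scoped state
from google.adk.sessions.state import State

DIAGRAM_LINK_STATE_KEY = State.TEMP_PREFIX + "gcs_link_for_diagram"  # i.e., "temp:gcs_link_for_diagram"

# Inside the tool's run_async, after producing the signed URL:
#     tool_context.state[DIAGRAM_LINK_STATE_KEY] = signed_url
#
# Inside the agent's after_agent_callback:
#     gcs_link = callback_context.state.get(DIAGRAM_LINK_STATE_KEY)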
Configuration, Secrets, and Context Loading
- expert-agents/config.py: Centralizes model names, GCS configuration, and the functions for retrieving secrets from Secret Manager (see the Secret Manager sketch after the context_loader snippet below).
- expert-agents/context_loader.py: The get_escaped_adk_context_for_llm function shows a practical solution for a common problem: safely including text that contains special characters (like { and } in code snippets) inside an LLM prompt’s instruction string, where they might otherwise be misinterpreted as placeholders for state injection.
# Snippet from expert-agents/context_loader.py
def get_escaped_adk_context_for_llm() -> str:
    raw_adk_context = load_raw_adk_context_file_once()
    # Replace characters to avoid being parsed as state injection placeholders
    adk_context_textually_escaped = raw_adk_context.replace('{', '<curly_brace_open>') \
                                                   .replace('}', '<curly_brace_close>')
    # ... add interpretation note ...
    return interpretation_note + adk_context_textually_escaped
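The Secret Manager retrieval centralized in config.py typically looks something like the following sketch, a hypothetical simplification using the google-cloud-secret-manager client with placeholder project and secret IDs:
# Hypothetical sketch of a Secret Manager helper like those centralized in config.py
from google.cloud import secretmanager

def get_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(name=name)
    # The payload is raw bytes; decode it to a string (e.g., a GitHub PAT).
    return response.payload.data.decode("utf-8")

# Example usage (placeholder IDs):
# github_pat = get_secret("my-gcp-project", "github-pat")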
Deployment (Dockerfile) and Web UI
- The expert-agents/Dockerfile is a comprehensive example showing how to set up a container with a specific Python version, system dependencies (curl, nodejs, Puppeteer dependencies), and global npm packages (mermaid-cli, marp-cli). This demonstrates creating a self-contained, reproducible environment for the backend agent.
- The webui/ directory, containing a full Angular application, illustrates how an ADK backend can power a rich, interactive frontend. The chat.component.ts logic for calling /chat/invoke and handling the SSE stream is a practical example of a client-side implementation.
Conclusion: A Symphony of ADK Features
The ADK Expert Agent is a powerful case study that moves beyond simple examples to demonstrate how ADK’s features are designed to work in concert:
- A root orchestrator agent uses dynamic instructions to route tasks.
- Specialized sub-agents, wrapped as AgentTools, handle complex, multi-step sub-problems.
- These agents use a variety of custom FunctionTools and BaseTools that interact with external CLIs and APIs.
- Pydantic models ensure structured data exchange between components.
- Callbacks and state management provide fine-grained control over the execution flow and data passing.
- Secure configuration and secret management are employed for production readiness.
- The system is structured for evaluation and containerized deployment with a separate web frontend.
By studying its architecture and code, you can gain deep insights into applying ADK principles to your own complex agent development projects.
What’s Next?
This detailed look at the ADK Expert Agent concludes our exploration of ADK’s primary features and how they manifest in a substantial example. We’ve covered the journey from basic agent definition to advanced multi-agent systems, tooling, operationalization, and security.
We now transition to “Part 4: Running, Managing, and Operationalizing Agents.” Next, we’ll take a much deeper look at the ADK Runner and the RunConfig object, understanding how to manage the runtime execution of our agents, including handling streaming, speech, and other advanced configurations.