Chapter 15 - ADK in Action - Building the "ADK Expert Agent"

This article is part of my web book series. All of the chapters can be found here and the code is available on GitHub. For any issues with this book, contact me on LinkedIn.

Until now we’ve explored the individual components and capabilities of the Agent Development Kit. Now, let’s bring many of these concepts together by creating a sophisticated example: the “ADK Expert Agent.” This agent system is designed to be knowledgeable about ADK itself, assist with GitHub issues, generate documents, and create architecture diagrams.

By dissecting its structure and key functionalities, we can see a practical application of multi-agent design, specialized tooling, dynamic instructions, state management, and more, all orchestrated by ADK. The code for this can be found here.

Overview of the ADK Expert Agent System

The ADK Expert Agent is not a single monolithic agent but rather a multi-agent system orchestrated by a root agent. Its primary goal is to assist users with queries related to Google’s Agent Development Kit.

Core Capabilities:

  1. General ADK Q&A: Answers questions based on a loaded ADK knowledge base (available in expert-agents/data/google-adk-python-1.2.0.txt).
  2. GitHub Issue Processing: Fetches details for specified google/adk-python GitHub issues and provides ADK-specific guidance.
  3. Document Generation: Creates PDF, HTML slides, or PPTX slides from mermaid content generated by an LLM based on user requests and ADK knowledge.
  4. Mermaid Diagram Generation: Generates Mermaid diagram syntax for user-described architectures and converts it into a PNG image.

High-Level Architecture:

*Diagram: High-Level Architecture of the ADK Expert Agent system.*

The system’s entry point is the root_agent defined in expert-agents/agent.py.

# expert-agents/agent.py (Simplified Snippet - Top Level)
from google.adk.agents import Agent as ADKAgent
from google.adk.models import Gemini
from google.adk.tools.agent_tool import AgentTool
# ... other imports ...
from .agents.github_issue_processing_agent import github_issue_processing_agent
from .agents.document_generator_agent import document_generator_agent
from .agents.mermaid_diagram_orchestrator_agent import mermaid_diagram_orchestrator_agent
from .tools.prepare_document_content_tool import PrepareDocumentContentTool
from .config import PRO_MODEL_NAME # Using PRO_MODEL_NAME as in actual code

# ... root_agent_instruction_provider and root_agent_after_tool_callback defined elsewhere ...

root_agent_tools = [
    AgentTool(agent=github_issue_processing_agent),
    AgentTool(agent=document_generator_agent),
    AgentTool(agent=mermaid_diagram_orchestrator_agent),
    PrepareDocumentContentTool(),
]

root_agent = ADKAgent(
    name="adk_expert_orchestrator",
    model=Gemini(model=PRO_MODEL_NAME), # Using PRO_MODEL_NAME
    instruction=root_agent_instruction_provider,
    tools=root_agent_tools,
    # ... callbacks and config ...
)

Orchestrator Pattern

The adk_expert_orchestrator exemplifies the orchestrator pattern. It doesn’t perform all tasks itself but intelligently delegates to specialized sub-agents (wrapped as AgentTools) based on the nature of the user’s query. This promotes modularity and separation of concerns.
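The routing decision at the heart of this pattern can be sketched in plain Python. This is a hypothetical stand-in for the keyword check inside root_agent_instruction_provider (the diagram keywords match the real snippet later in this chapter; the document and GitHub keywords are assumed here, since those branches are elided in the source):

```python
# Diagram keywords are taken from the real instruction provider; the other
# keyword lists are illustrative assumptions.
DIAGRAM_KEYWORDS = ["diagram", "architecture", "visualize", "mermaid", "graph"]
DOCUMENT_KEYWORDS = ["pdf", "slides", "pptx", "document"]

def route_query(user_query_text: str) -> str:
    """Return the name of the tool/sub-agent the orchestrator should delegate to."""
    q = user_query_text.lower()
    if any(kw in q for kw in DIAGRAM_KEYWORDS):
        return "mermaid_diagram_orchestrator_agent"
    if any(kw in q for kw in DOCUMENT_KEYWORDS):
        return "document_generator_agent"
    if "github" in q and "issue" in q:
        return "github_issue_processing_agent"
    return "answer_directly"  # general ADK Q&A, no delegation
```

In the real agent this decision is not returned as a string; it is baked into the system instruction so the LLM emits the corresponding tool call.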

The Root Orchestrator: adk_expert_orchestrator

Defined in expert-agents/agent.py, this is the primary LlmAgent that users interact with.

Key Features Demonstrated:

  • Dynamic Instructions (root_agent_instruction_provider): The logic within root_agent_instruction_provider in expert-agents/agent.py is key to its routing capabilities.

      # expert-agents/agent.py (Snippet of root_agent_instruction_provider)
      def root_agent_instruction_provider(context: ReadonlyContext) -> str:
          # ... (imports and helper get_text_from_content)
          adk_context_for_llm = get_escaped_adk_context_for_llm()
          invocation_ctx = getattr(context, '_invocation_context', None)
          user_query_text = get_text_from_content(invocation_ctx.user_content) if invocation_ctx and invocation_ctx.user_content else ""
            
          # ... (logic for detecting previous tool calls) ...
    
          # --- Logic for routing based on user query ---
          diagram_keywords = ["diagram", "architecture", "visualize", "mermaid", "graph"]
          is_diagram_request = any(kw in user_query_text.lower() for kw in diagram_keywords)
    
          if is_diagram_request:
              logger.info(f"RootAgent (instruction_provider): Detected architecture diagram request: '{user_query_text}'")
              diagram_agent_input_payload = DiagramGeneratorAgentToolInput(diagram_query=user_query_text).model_dump_json()
              system_instruction = f"""
    You are an expert orchestrator for Google's Agent Development Kit (ADK).
    The user is asking for an architecture diagram. Their query is: "{user_query_text}"
    Your task is to call the '{mermaid_diagram_orchestrator_agent.name}' tool.
    The tool expects its input as a JSON string. The value for the "request" argument MUST be the following JSON string:
    {diagram_agent_input_payload}
    This is your only action for this turn. Output only the tool call.
    """
          # ... (elif blocks for document generation, GitHub issues, and general Q&A) ...
          else: # General ADK Question
              logger.info(f"RootAgent (instruction_provider): General ADK query: '{user_query_text}'")
              system_instruction = f"""
    You are an expert on Google's Agent Development Kit (ADK) version 1.0.0.
    Your primary role is to answer general questions about ADK.
    ADK Knowledge Context (for general ADK questions):
    --- START OF ADK CONTEXT ---
    {adk_context_for_llm}
    --- END OF ADK CONTEXT ---
    Use your ADK knowledge to answer the user's query: "{user_query_text}" directly. This is your final answer.
    """
          return system_instruction
    
  • Callbacks for Control and Logging (root_agent_after_tool_callback): This callback is vital for processing the structured (often JSON string) output from the specialized AgentTools and presenting it correctly to the orchestrator’s LLM or directly to the user.

      # expert-agents/agent.py (Snippet of root_agent_after_tool_callback)
      async def root_agent_after_tool_callback(
          tool: BaseTool, args: dict, tool_context: ToolContext, tool_response: Any
      ) -> Optional[Any]:
          if tool.name == github_issue_processing_agent.name:
              logger.info(f"RootAgent: Processing response from '{github_issue_processing_agent.name}'.")
              tool_context.actions.skip_summarization = True # Don't let orchestrator LLM summarize this
              response_text = "Error: Could not process response from GitHub issue agent."
              try:
                  # The github_issue_processing_agent (a SequentialAgent ending with FormatOutputAgent)
                  # is expected to return a JSON string that validates against SequentialProcessorFinalOutput
                  if isinstance(tool_response, str):
                       response_dict = json.loads(tool_response)
                  elif isinstance(tool_response, dict): # AgentTool might already give a dict if sub-agent output JSON
                       response_dict = tool_response.get("result", tool_response) # ADK wraps AgentTool output
                  else:
                      raise ValueError(f"Unexpected tool_response type: {type(tool_response)}")
                    
                  validated_output = SequentialProcessorFinalOutput.model_validate(response_dict)
                  response_text = validated_output.guidance
              except Exception as e:
                  # ... (error logging) ...
                  response_text = f"Error processing GitHub agent output: {str(tool_response)[:200]}"
              return genai_types.Content(parts=[genai_types.Part(text=response_text)])
    
          elif tool.name == mermaid_diagram_orchestrator_agent.name:
              logger.info(f"RootAgent: Received response from '{mermaid_diagram_orchestrator_agent.name}'.")
              tool_context.actions.skip_summarization = True
              # This agent's after_agent_callback ensures its output is a string (URL or error)
              response_text = str(tool_response.get("result") if isinstance(tool_response, dict) else tool_response)
              return genai_types.Content(parts=[genai_types.Part(text=response_text)])
            
          # ... (similar handling for document_generator_agent) ...
    
          elif tool.name == PrepareDocumentContentTool().name:
              logger.info(f"RootAgent: 'prepare_document_content_tool' completed.")
              # Let the LLM summarize/use this structured output to call the next agent
              tool_context.actions.skip_summarization = False 
              return tool_response # This is a dict
            
          return None # Default: let LLM summarize
    

Callbacks for Inter-Agent Data Transformation

The after_tool_callback in the root_agent is crucial for transforming the output of sub-agents (which may be JSON strings or complex dicts when called via AgentTool) into a format, such as genai_types.Content, that the orchestrator’s LLM can readily consume for its next reasoning step or for generating the final user-facing response, especially when the default summarization is skipped (skip_summarization = True).
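The normalization step can be isolated as a small helper. This is a hypothetical, dependency-free sketch of the type handling shown in root_agent_after_tool_callback above:

```python
import json
from typing import Any

def normalize_agent_tool_response(tool_response: Any) -> dict:
    """Coerce an AgentTool response into a plain dict.

    Mirrors the branches in root_agent_after_tool_callback: the response may be
    a raw JSON string, or a dict that ADK sometimes wraps under a "result" key.
    """
    if isinstance(tool_response, str):
        return json.loads(tool_response)
    if isinstance(tool_response, dict):
        # Unwrap ADK's AgentTool "result" envelope if present
        return tool_response.get("result", tool_response)
    raise ValueError(f"Unexpected tool_response type: {type(tool_response)}")
```

The normalized dict can then be validated against a Pydantic model (SequentialProcessorFinalOutput in the real code) before its fields are used.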

Specialized Agents: Divide and Conquer

The adk-expert-agent effectively uses specialized agents for distinct, complex tasks.

1. github_issue_processing_agent:

This is a SequentialAgent, demonstrating nested orchestration. It executes a fixed pipeline of wrapper agents.

# Snippet from expert-agents/agents/github_issue_processing_agent.py
get_issue_description_wrapper_agent = LlmAgent(...)
adk_guidance_wrapper_agent = LlmAgent(...)
# ...
class FormatOutputAgent(BaseAgent): # Custom non-LLM agent for final formatting
    # ...
    async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
        # ... logic to find last event and format output ...
        yield Event(...)

# The Sequential Agent definition
github_issue_processing_agent = SequentialAgent(
    name="github_issue_processing_sequential_agent",
    description="Processes a GitHub issue by fetching, cleaning, and providing ADK-specific guidance.",
    sub_agents=[
        get_issue_description_wrapper_agent,
        # A cleaning agent could go here if needed
        adk_guidance_wrapper_agent,
        FormatOutputAgent() 
    ]
)

Custom BaseAgent for Deterministic Steps

The FormatOutputAgent within the github_issue_processing_agent is a great example of a custom BaseAgent. Its job is purely deterministic: find the output of the previous step and format it into the final JSON structure. This doesn’t require an LLM, making it faster, cheaper, and more reliable than prompting an LLM to do the formatting.
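The deterministic core of such a formatting step is just a pure function. A minimal sketch, assuming the final payload uses a "guidance" field (the field name is inferred from SequentialProcessorFinalOutput earlier in this chapter):

```python
import json

def format_final_output(guidance_text: str) -> str:
    """Deterministically wrap the previous step's text in the final JSON payload.

    This is the kind of work FormatOutputAgent does without an LLM: no
    prompting, no variability, just stable string-to-JSON formatting.
    """
    payload = {"guidance": guidance_text.strip()}
    return json.dumps(payload)
```

Inside a custom BaseAgent, a function like this would run in _run_async_impl and its result would be yielded as the final Event of the sequence.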

2. mermaid_diagram_orchestrator_agent:

This LlmAgent demonstrates a two-step internal process using a sub-agent and a tool.

# expert-agents/agents/mermaid_diagram_orchestrator_agent.py (Definition Snippet)
# ... (Pydantic models and tool imports) ...
from .mermaid_syntax_generator_agent import mermaid_syntax_generator_agent
from ..tools.mermaid_to_png_and_upload_tool import mermaid_gcs_tool_instance

mermaid_diagram_orchestrator_agent = ADKAgent(
    name="mermaid_diagram_orchestrator_agent",
    model=Gemini(model=DEFAULT_MODEL_NAME), # Can use a simpler model for orchestration
    instruction=diagram_orchestrator_instruction_provider, # Key logic here
    tools=[
        AgentTool(agent=mermaid_syntax_generator_agent), # Calls another agent as a tool
        mermaid_gcs_tool_instance,                   # Calls a direct tool
    ],
    input_schema=DiagramGeneratorAgentToolInput,
    after_agent_callback=diagram_orchestrator_after_agent_cb, # Returns GCS link
    # ... other configs ...
)

Its instruction_provider checks if Mermaid syntax has been generated in a previous turn. If not, it instructs the LLM to call the mermaid_syntax_generator_agent. If syntax is present, it instructs the LLM to call the mermaid_to_png_and_gcs_upload tool with that syntax.

Chaining AgentTool Calls

When one LlmAgent (Orchestrator) calls another LlmAgent (Specialist) via AgentTool, the input_schema of the Specialist and the after_agent_callback of the Specialist are crucial. The Orchestrator’s LLM needs to provide input matching the Specialist’s input_schema (often as a JSON string). The Specialist’s after_agent_callback should ensure its final output (which the AgentTool returns) is in a format the Orchestrator’s after_tool_callback can parse and relay effectively.
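The contract between the two sides can be sketched with a dataclass standing in for the real Pydantic model (the actual code uses Pydantic's DiagramGeneratorAgentToolInput; the dataclass here just keeps the example dependency-free):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DiagramGeneratorAgentToolInput:
    """Stand-in for the specialist's Pydantic input_schema."""
    diagram_query: str

    def model_dump_json(self) -> str:
        # Equivalent of Pydantic's model_dump_json for this sketch
        return json.dumps(asdict(self))

# Orchestrator side: serialize the payload the specialist's input_schema expects.
payload = DiagramGeneratorAgentToolInput(
    diagram_query="Show a two-agent pipeline"
).model_dump_json()

# Specialist side: parse the incoming JSON string back into the schema.
parsed = DiagramGeneratorAgentToolInput(**json.loads(payload))
```

With Pydantic, the specialist side would use model_validate instead of unpacking a dict, gaining type coercion and validation errors for free.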

Crafting Specialized Tools

The expert-agents/tools/ directory is rich with examples of custom tools.

  • Interacting with External APIs (github_issue_tool.py):

    This tool shows how to use requests to call the GitHub API, handle authentication with a PAT fetched from Secret Manager, and parse the JSON response.

      # Snippet from expert-agents/tools/github_issue_tool.py
      class GetGithubIssueDescriptionTool(BaseTool):
      # ...
      async def run_async(self, *, args: Dict[str, Any], tool_context: ToolContext) -> Dict[str, Any]:
          # ... (argument validation) ...
          api_url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"
          GITHUB_PERSONAL_ACCESS_TOKEN = get_github_pat_from_secret_manager()
          headers = {
              "Authorization": f"token {GITHUB_PERSONAL_ACCESS_TOKEN}",
              # ... other headers
          }
          try:
              response = requests.get(api_url, headers=headers, timeout=10)
              response.raise_for_status()
              # ... return description or error from response.json() ...
          except requests.exceptions.HTTPError as e:
              # ... handle specific HTTP errors ...
    
  • Executing Command-Line Tools (mermaid_to_png_and_upload_tool.py):

    This tool demonstrates how to safely execute an external command-line utility (mmdc) using Python’s asyncio.create_subprocess_exec for non-blocking execution.

      # Snippet from expert-agents/tools/mermaid_to_png_and_upload_tool.py
      async def run_async(self, args: Dict[str, str], tool_context: ToolContext) -> str:
          # ... (setup temporary files for input and output) ...
          try:
              logger.info(f"Running mmdc: {MERMAID_CLI_PATH} ...")
              process = await asyncio.create_subprocess_exec(
                  MERMAID_CLI_PATH,
                  "-p", PUPPETEER_CONFIG_PATH,
                  "-i", mmd_file_path,
                  "-o", png_file_path_local_temp,
                  stdout=subprocess.PIPE, stderr=subprocess.PIPE
              )
              stdout, stderr = await process.communicate()
                
              if process.returncode != 0:
                  # ... handle and return error ...
    
              with open(png_file_path_local_temp, "rb") as f:
                  png_data = f.read()
              # ... proceed to upload png_data to GCS ...
          finally:
              # ... clean up temporary files ...
    

    The same pattern is used in marp_document_tools.py to call marp-cli.
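The request-construction part of the GitHub tool above can be sketched with the standard library only. In the real tool the token comes from get_github_pat_from_secret_manager(); here it is passed in directly so the example stays self-contained, and the Accept header is an assumption based on GitHub's documented REST conventions:

```python
def build_issue_request(owner: str, repo: str, issue_number: int, token: str):
    """Build the URL and headers for fetching a single GitHub issue."""
    api_url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"
    headers = {
        "Authorization": f"token {token}",
        "Accept": "application/vnd.github+json",
    }
    return api_url, headers
```

Keeping this construction separate from the network call makes the tool easy to unit-test without hitting the GitHub API.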

Best Practice: Tools for IO and External Interactions

Encapsulate all interactions with external systems (APIs, CLIs, the file system) within tools. This keeps the agent logic (LLM prompts and reasoning) focused on what to do, while tools handle how to do it.

State Management and Data Passing (ToolContext)

This agent system demonstrates how state is used to pass results from one tool to another, or from a tool to a callback.

  • In mermaid_to_png_and_upload_tool.py and marp_document_tools.py:

      # from ..config import DOC_LINK_STATE_KEY
      # ... inside run_async after getting the signed URL ...
      output = f"Document generated ... Download: {signed_url_str}"
      if tool_context:
          tool_context.state[DOC_LINK_STATE_KEY] = output
          tool_context.actions.skip_summarization = True
      return output
    
  • In document_generator_agent.py, the after_agent_callback then retrieves this value to ensure it’s the final output:

      # from ..tools import DOC_LINK_STATE_KEY
      async def document_generator_after_agent_cb(callback_context: CallbackContext) -> genai_types.Content | None:
          gcs_link = callback_context.state.get(DOC_LINK_STATE_KEY)
          if gcs_link:
              return genai_types.Content(parts=[genai_types.Part(text=str(gcs_link))])
          return genai_types.Content(parts=[genai_types.Part(text="Error: Could not find document link.")])
    

Use ToolContext.state for Intermediate Data

Using the temp: state scope (e.g., State.TEMP_PREFIX + "gcs_link_for_diagram") is a good way to pass data between a tool and a callback within a single invocation turn without cluttering the persistent session state.
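A minimal sketch of this convention, with a plain dict standing in for tool_context.state (the "temp:" value of State.TEMP_PREFIX is my understanding of ADK's convention for invocation-scoped keys):

```python
# Assumed value of ADK's State.TEMP_PREFIX; temp-scoped keys live only for
# the current invocation and are not persisted with the session.
TEMP_PREFIX = "temp:"

DOC_LINK_STATE_KEY = TEMP_PREFIX + "gcs_link_for_diagram"

state: dict = {}  # stands in for tool_context.state / callback_context.state

# The tool writes the intermediate result...
state[DOC_LINK_STATE_KEY] = "https://storage.googleapis.com/bucket/diagram.png"

# ...and the after_agent_callback reads it back within the same invocation.
gcs_link = state.get(DOC_LINK_STATE_KEY)
```

Because the key is temp-scoped, the link never accumulates in persistent session state across turns.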

Configuration, Secrets, and Context Loading

  • expert-agents/config.py: Centralizes model names, GCS configuration, and the functions for retrieving secrets from Secret Manager.

  • expert-agents/context_loader.py: The get_escaped_adk_context_for_llm function shows a practical solution for a common problem: safely including text that contains special characters (like { and } in code snippets) inside an LLM prompt’s instruction string, where they might otherwise be misinterpreted as placeholders for state injection.

      # Snippet from expert-agents/context_loader.py
      def get_escaped_adk_context_for_llm() -> str:
          raw_adk_context = load_raw_adk_context_file_once()
          # Replace characters to avoid being parsed as state injection placeholders
          adk_context_textually_escaped = raw_adk_context.replace('{', '<curly_brace_open>') \
                                                         .replace('}', '<curly_brace_close>')
          # ... add interpretation note ...
          return interpretation_note + adk_context_textually_escaped
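The escaping step on its own is a one-liner worth seeing run. This is a runnable version of the replacement shown above (without the interpretation note the real function prepends):

```python
def escape_braces_for_instruction(raw_text: str) -> str:
    """Replace braces so code snippets aren't parsed as {state_key} placeholders."""
    return raw_text.replace('{', '<curly_brace_open>').replace('}', '<curly_brace_close>')
```

The corresponding interpretation note then tells the LLM to read the sentinel tokens back as literal braces when quoting code in its answers.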
    

Deployment (Dockerfile) and Web UI

  • The expert-agents/Dockerfile is a comprehensive example showing how to set up a container with a specific Python version, system dependencies (curl, nodejs, Puppeteer deps), and global npm packages (mermaid-cli, marp-cli). This demonstrates creating a self-contained, reproducible environment for the backend agent.
  • The webui/ directory, containing a full Angular application, illustrates how an ADK backend can power a rich, interactive frontend. The chat.component.ts logic for calling /chat/invoke and handling the SSE stream is a practical example of a client-side implementation.

Conclusion: A Symphony of ADK Features

The ADK Expert Agent is a powerful case study that moves beyond simple examples to demonstrate how ADK’s features are designed to work in concert:

  • A root orchestrator agent uses dynamic instructions to route tasks.
  • Specialized sub-agents, wrapped as AgentTools, handle complex, multi-step sub-problems.
  • These agents use a variety of custom FunctionTools and BaseTools that interact with external CLIs and APIs.
  • Pydantic models ensure structured data exchange between components.
  • Callbacks and state management provide fine-grained control over the execution flow and data passing.
  • Secure configuration and secret management are employed for production readiness.
  • The system is structured for evaluation and containerized deployment with a separate web frontend.

By studying its architecture and code, you can gain deep insights into applying ADK principles to your own complex agent development projects.

What’s Next?

This detailed look at the ADK Expert Agent concludes our exploration of ADK’s primary features and how they manifest in a substantial example. We’ve covered the journey from basic agent definition to advanced multi-agent systems, tooling, operationalization, and security.

We now transition to “Part 4: Running, Managing, and Operationalizing Agents.” Next, we’ll take a much deeper look at the ADK Runner and the RunConfig object, understanding how to manage the runtime execution of our agents, including handling streaming, speech, and other advanced configurations.

This post is licensed under CC BY 4.0 by the author.