Chapter 9 - Enabling Agents with Code Execution
This article is part of my web book series. All of the chapters can be found here, and the code is available on GitHub. For any issues with this book, contact me on LinkedIn.
So far, our agents have learned to use pre-defined tools and APIs. But what if a task requires dynamic computation, data manipulation beyond the scope of existing tools, or the generation and execution of a script to achieve a goal? This is where Code Execution comes in. ADK provides a robust framework for allowing agents, particularly those powered by advanced LLMs like Gemini, to generate code (typically Python) and have it executed in a controlled environment, with the results fed back to the LLM to inform its next steps.
This capability transforms agents from simple tool users into more versatile problem solvers capable of scripting solutions on the fly.
The BaseCodeExecutor and its Importance
The foundation of code execution in ADK is the `google.adk.code_executors.BaseCodeExecutor` abstract class. Any component that executes code generated by the LLM must implement this interface.

Key aspects managed or defined by a `BaseCodeExecutor`:

- `execute_code(invocation_context, code_execution_input) -> CodeExecutionResult`: The core abstract method that takes a `CodeExecutionInput` (containing the code string and any input files) and returns a `CodeExecutionResult` (containing stdout, stderr, and any output files).
- Environment: The executor defines the environment in which the code runs (e.g., local Python interpreter, Docker container, managed cloud service).
- Statefulness (`stateful: bool`): Some executors can maintain state between code executions within the same session (e.g., variables defined in one code block are available in the next).
- File Handling (`optimize_data_file: bool`): Advanced executors can automatically manage data files, making them available to the executed code.
- Error Handling & Retries (`error_retry_attempts: int`): Defines how many times to retry if code execution fails.
- Delimiters: `code_block_delimiters` specifies how the LLM should format code blocks in its output (e.g., fenced `python` blocks); ADK uses these to extract the code. `execution_result_delimiters` specifies how ADK should format code execution results when feeding them back to the LLM (e.g., fenced `tool_output` blocks).
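To make this contract concrete, here is a minimal custom executor sketch. It assumes `CodeExecutionInput` and `CodeExecutionResult` can be imported from `google.adk.code_executors.code_execution_utils` (verify the import path against your ADK version), and it is illustrative only, not production-ready:

```python
import contextlib
import io

from google.adk.code_executors import BaseCodeExecutor
# Import path is an assumption; check where these types live in your ADK version.
from google.adk.code_executors.code_execution_utils import (
    CodeExecutionInput,
    CodeExecutionResult,
)


class TinyLocalExecutor(BaseCodeExecutor):
    """Minimal BaseCodeExecutor: runs code via exec() and captures stdout."""

    def execute_code(
        self, invocation_context, code_execution_input: CodeExecutionInput
    ) -> CodeExecutionResult:
        stdout_buf = io.StringIO()
        try:
            # Same risks as UnsafeLocalCodeExecutor below: demo purposes only.
            with contextlib.redirect_stdout(stdout_buf):
                exec(code_execution_input.code, {})
            return CodeExecutionResult(stdout=stdout_buf.getvalue(), stderr="")
        except Exception as exc:
            return CodeExecutionResult(stdout=stdout_buf.getvalue(), stderr=str(exc))
```

An `LlmAgent` could then be wired up with `code_executor=TinyLocalExecutor()`, exactly as with the built-in executors shown throughout this chapter.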
ADK provides several concrete implementations of `BaseCodeExecutor`, each suited for different use cases and security considerations. An `LlmAgent` is configured to use a code executor by setting its `code_executor` attribute to an instance of one of these classes.
BuiltInCodeExecutor: Leveraging the Model's Native Capabilities
For certain models, code execution is a native, built-in capability. The model itself can generate, execute (in a sandboxed environment), and reason about Python code.
The `google.adk.code_executors.BuiltInCodeExecutor` is a special executor that signals to ADK that the model itself will handle code execution.
```python
from google.adk.agents import Agent
from google.adk.code_executors import BuiltInCodeExecutor  # Key import
from google.adk.runners import InMemoryRunner
from google.genai.types import Content, Part

from building_intelligent_agents.utils import load_environment_variables, create_session, DEFAULT_LLM

load_environment_variables()

# This agent will use the model's internal code interpreter.
# Ensure you are using a model that supports this.
# For Gemini API, this is often enabled by default.
# For Vertex AI, you might need to ensure the model version supports it.
code_savvy_agent_builtin = Agent(
    name="code_savvy_agent_builtin",
    model=DEFAULT_LLM,  # A model with built-in code execution
    instruction="You are a helpful assistant that can write and execute Python code to answer questions, especially for calculations or data analysis. When you write code, it will be automatically executed.",
    code_executor=BuiltInCodeExecutor()  # Assign the executor
)

if __name__ == "__main__":
    runner = InMemoryRunner(agent=code_savvy_agent_builtin, app_name="BuiltInCodeApp")
    session_id = "s_builtin_code_test"
    user_id = "builtin_user"
    create_session(runner, session_id, user_id)

    prompts = [
        "What is the factorial of 7?",
        "Calculate the square root of 12345.",
        "Generate a list of the first 10 prime numbers."
    ]

    async def main():
        for prompt_text in prompts:
            print(f"\nYOU: {prompt_text}")
            user_message = Content(parts=[Part(text=prompt_text)], role="user")
            print("ASSISTANT: ", end="", flush=True)
            async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=user_message):
                # The trace for this would show the model generating code,
                # and then a `code_execution_result` part directly from the model,
                # followed by the model's textual interpretation.
                if event.content and event.content.parts:
                    for part in event.content.parts:
                        if part.text:
                            print(part.text, end="", flush=True)
                        elif part.executable_code:  # Code generated by the LLM
                            print(f"\n CODE BLOCK:\n{part.executable_code.code.strip()}\n END CODE BLOCK", end="")
                        elif part.code_execution_result:  # Result from the model's interpreter
                            print(f"\n EXECUTION RESULT: {part.code_execution_result.outcome}\n OUTPUT:\n{part.code_execution_result.output.strip()}\n END EXECUTION RESULT", end="")
            print()

    import asyncio
    asyncio.run(main())
```
How `BuiltInCodeExecutor` Works with ADK:

1. When `BuiltInCodeExecutor` is assigned to an agent, its `process_llm_request` method (called by the LLM Flow) modifies the `LlmRequest` to enable the model's code interpreter tool (e.g., by adding `types.Tool(code_execution=types.ToolCodeExecution())` to `request.config.tools`), roughly as sketched below.
2. The LLM, when it deems necessary, generates a `Part` containing `executable_code`.
3. The model internally executes this code.
4. The LLM then includes another `Part` in its response containing the `code_execution_result`.
5. ADK receives these parts within the `LlmResponse` and yields corresponding `Event` objects.
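Step 1 is easy to picture in code. The sketch below shows roughly what enabling the native interpreter on an outgoing request looks like; the real `process_llm_request` may include extra guards, so treat this as an illustration rather than the ADK source:

```python
from google.genai import types

def enable_native_code_execution(llm_request) -> None:
    """Roughly what BuiltInCodeExecutor does to the outgoing LlmRequest."""
    # Advertise the model's own code interpreter as an available tool.
    llm_request.config.tools = llm_request.config.tools or []
    llm_request.config.tools.append(
        types.Tool(code_execution=types.ToolCodeExecution())
    )
```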
Seamless and Secure Code Execution
`BuiltInCodeExecutor` is the most seamless way to enable code execution if your chosen LLM supports it. The execution happens in a sandboxed environment, offering a high degree of security and abstracting away the complexities of setting up an execution environment.
UnsafeLocalCodeExecutor: For Development and Trusted Environments

The `google.adk.code_executors.UnsafeLocalCodeExecutor` executes Python code directly in the same Python process where your ADK application is running, using Python's `exec()` function.
```python
from google.adk.agents import Agent
from google.adk.code_executors import UnsafeLocalCodeExecutor  # Key import
from google.adk.runners import InMemoryRunner
from google.genai.types import Content, Part

from building_intelligent_agents.utils import load_environment_variables, create_session, DEFAULT_LLM

load_environment_variables()

# ⚠️ DANGER ⚠️: UnsafeLocalCodeExecutor executes arbitrary code from the LLM
# in your local Python environment. ONLY use this in trusted development
# environments and NEVER in production or with untrusted LLM outputs.
unsafe_code_agent = Agent(
    name="unsafe_code_agent",
    model=DEFAULT_LLM,  # Can be any model that generates code
    instruction="You are an assistant that can write Python code to solve problems. I will execute the code you provide in my local environment. Focus on simple calculations that don't require external libraries beyond standard Python.",
    code_executor=UnsafeLocalCodeExecutor()  # Assign the executor
)

if __name__ == "__main__":
    print("⚠️ WARNING: Running UnsafeLocalCodeExecutor. This is not recommended for production. ⚠️")
    runner = InMemoryRunner(agent=unsafe_code_agent, app_name="UnsafeCodeApp")
    session_id = "s_unsafe_code_test"
    user_id = "unsafe_user"
    create_session(runner, session_id, user_id)

    prompts = [
        "Define a variable x as 10 and y as 20, then print their sum.",
        "What is 2 to the power of 10?",
    ]

    async def main():
        for prompt_text in prompts:
            print(f"\nYOU: {prompt_text}")
            user_message = Content(parts=[Part(text=prompt_text)], role="user")
            print("ASSISTANT (via UnsafeLocalCodeExecutor): ", end="", flush=True)
            async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=user_message):
                # Trace: LLM -> code -> UnsafeLocalCodeExecutor.execute_code() -> result -> LLM -> final text
                if event.content and event.content.parts:
                    for part in event.content.parts:
                        if part.text:
                            print(part.text, end="", flush=True)
                        # We might not see executable_code/code_execution_result directly in the
                        # final agent output if the LLM summarizes it, but they'll be in the Trace.
            print()

    import asyncio
    asyncio.run(main())
```
How `UnsafeLocalCodeExecutor` Works with ADK:

1. The LLM generates code, typically formatted with delimiters such as a fenced `python` block.
2. ADK's LLM Flow (specifically the `_code_execution.response_processor`) extracts this code from the `LlmResponse` (a simplified stand-in for this step is sketched below).
3. It creates a `CodeExecutionInput` with the extracted code.
4. It calls `unsafe_local_executor.execute_code(..., code_input)`.
5. `UnsafeLocalCodeExecutor` uses `exec(code_input.code, globals_dict, locals_dict)` to run the code. Standard output is captured.
6. A `CodeExecutionResult` (with stdout/stderr) is returned.
7. This result is formatted (e.g., as a fenced `tool_output` block containing the stdout content) and sent back to the LLM in the next turn.
8. The LLM uses this execution result to formulate its final response or decide the next step.
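Step 2, code extraction, is conceptually just pattern matching on the configured delimiters. A simplified stand-in (not ADK's actual `response_processor`) might look like this:

```python
import re

# Matches a fenced block such as ```python ... ``` or ```tool_code ... ```.
CODE_BLOCK_PATTERN = re.compile(r"```(?:python|tool_code)?\n(.*?)\n```", re.DOTALL)

def extract_code_block(llm_text: str) -> str | None:
    """Return the first fenced code block in the LLM's text, or None."""
    match = CODE_BLOCK_PATTERN.search(llm_text)
    return match.group(1) if match else None

response_text = "Let me compute that.\n```python\nprint(2 ** 10)\n```"
print(extract_code_block(response_text))  # -> print(2 ** 10)
```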
Extreme Security Risk with UnsafeLocalCodeExecutor
The name “Unsafe” is there for a critical reason. This executor runs LLM-generated code directly in your application’s Python environment. A malicious or poorly written piece of code from the LLM could:
- Access/delete local files.
- Make arbitrary network calls.
- Consume excessive resources.
- Introduce security vulnerabilities.
NEVER use `UnsafeLocalCodeExecutor` in production environments or with untrusted models/users. It is strictly for isolated, trusted local development and experimentation.
ContainerCodeExecutor: Secure, Isolated Execution via Docker

For a more secure way to execute arbitrary code, ADK provides the `ContainerCodeExecutor`. This executor runs the LLM-generated Python code inside a Docker container, providing strong isolation from your host system.

Prerequisites:

- Docker installed and running on your system.
- The `docker` Python library installed (`pip install docker`).
You can either use a pre-built Python image or provide a path to a Dockerfile to build a custom image.
```python
from google.adk.agents import Agent
# Ensure 'docker' is installed: pip install google-adk[extensions] or pip install docker
try:
    from google.adk.code_executors import ContainerCodeExecutor  # Key import
    DOCKER_AVAILABLE = True
except ImportError:
    print("Docker SDK not found. Please install it ('pip install docker') to run this example.")
    DOCKER_AVAILABLE = False
from google.adk.runners import InMemoryRunner
from google.genai.types import Content, Part
import os
import atexit  # Only needed if you add a manual cleanup hook (see note below)

from building_intelligent_agents.utils import load_environment_variables, create_session, DEFAULT_LLM

load_environment_variables()

container_agent = None
container_executor_instance = None  # To manage its lifecycle

if DOCKER_AVAILABLE:
    # Option 1: Use a pre-existing Python image from Docker Hub
    # container_executor_instance = ContainerCodeExecutor(image="python:3.10-slim")

    # Option 2: Build a custom image from a Dockerfile.
    # Create a simple Dockerfile in the same directory (e.g., my_python_env/Dockerfile)
    dockerfile_dir = "my_python_env"
    os.makedirs(dockerfile_dir, exist_ok=True)
    with open(os.path.join(dockerfile_dir, "Dockerfile"), "w") as df:
        df.write("FROM python:3.10-slim\n")
        df.write("RUN pip install numpy pandas\n")  # Example: add libraries
        df.write("WORKDIR /app\n")
        df.write("COPY . /app\n")  # Not strictly needed if only executing ephemeral code

    try:
        print("Initializing ContainerCodeExecutor (may take a moment to build/pull image)...")
        container_executor_instance = ContainerCodeExecutor(
            docker_path=dockerfile_dir  # Path to the directory containing the Dockerfile
            # image="my-custom-adk-executor:latest"  # If you build and tag it manually first
        )
        print("ContainerCodeExecutor initialized.")
        container_agent = Agent(
            name="container_code_agent",
            model=DEFAULT_LLM,
            instruction="You are an assistant that writes Python code. I will execute your code in a sandboxed Docker container. You can use numpy and pandas.",
            code_executor=container_executor_instance
        )
        # Container cleanup: ADK handles this on its own. If you ever need a manual
        # atexit hook, it would have to stop and remove the executor's internal
        # container, which means poking at private attributes. Long-lived resources
        # like this are better managed via an async context manager (see the
        # AsyncExitStack pattern used by ADK's MCPToolset).
    except Exception as e:
        print(f"Failed to initialize ContainerCodeExecutor. Is Docker running and configured? Error: {e}")
        container_agent = None  # Fallback
else:
    print("Skipping ContainerCodeExecutor example as Docker SDK is not available.")

if __name__ == "__main__":
    if not container_agent:
        print("Container Agent not initialized. Exiting.")
    else:
        runner = InMemoryRunner(agent=container_agent, app_name="ContainerCodeApp")
        session_id = "s_container_code_test"
        user_id = "container_user"
        create_session(runner, session_id, user_id)

        prompts = [
            "Import numpy and create a 3x3 matrix of zeros, then print it.",
            "Use pandas to create a DataFrame with two columns, 'Name' and 'Age', and add one row of data. Print the DataFrame."
        ]

        # The interaction flow is the same as in the UnsafeLocalCodeExecutor
        # example, but execution happens inside Docker.
        print("Container agent ready. Note: First execution might be slower due to Docker image layers.")

        async def main():
            for prompt_text in prompts:
                print(f"\nYOU: {prompt_text}")
                user_message = Content(parts=[Part(text=prompt_text)], role="user")
                print("ASSISTANT (via ContainerCodeExecutor): ", end="", flush=True)
                async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=user_message):
                    if event.content and event.content.parts:
                        for part in event.content.parts:
                            if part.text:
                                print(part.text, end="", flush=True)
                print()

        import asyncio
        asyncio.run(main())
```
How `ContainerCodeExecutor` Works with ADK: The flow is similar to `UnsafeLocalCodeExecutor`, but step 5 is different: `ContainerCodeExecutor.execute_code(...)` starts a Docker container from the specified image (reusing a running container for stateful sessions, though this example is non-stateful by default). It then uses `docker exec` (or the equivalent Docker SDK call) to run the Python code inside the container, capturing stdout and stderr from the container's execution.
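For intuition, that mechanism maps onto a few Docker SDK calls. This standalone snippet (illustrative only, not ADK's implementation) starts a container, execs a Python one-liner inside it, and captures the output:

```python
import docker

client = docker.from_env()
# Keep a container alive so code snippets can be exec'd into it.
container = client.containers.run(
    "python:3.10-slim", command="sleep infinity", detach=True
)
try:
    exit_code, output = container.exec_run(["python", "-c", "print(2 ** 10)"])
    print(exit_code, output.decode())  # 0 1024
finally:
    container.stop()
    container.remove()
```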
ContainerCodeExecutor for Enhanced Security
For most use cases involving LLM-generated code, `ContainerCodeExecutor` offers a much better security posture than `UnsafeLocalCodeExecutor` due to Docker's isolation. Define a minimal Docker image with only the Python libraries your agent actually needs.
Docker Overhead and Configuration
- Running Docker containers introduces some overhead (image pulling/building, container startup time), which might make initial code executions slower.
- Requires Docker to be properly installed and running on the host machine where the ADK application executes.
- Managing Docker images and ensuring they have the correct dependencies can add complexity.
VertexAiCodeExecutor: Cloud-Native, Managed Code Execution

For a fully managed and scalable code execution environment in the cloud, ADK integrates with the Vertex AI Code Interpreter Extension. The `google.adk.code_executors.VertexAiCodeExecutor` uses this Google-managed service.

Prerequisites:

- A Google Cloud project with the Vertex AI API enabled.
- Authentication configured (e.g., `gcloud auth application-default login` or a service account).
- The `google-cloud-aiplatform` library with the necessary `[preview]` extras, or a version recent enough to include code interpreter extensions (`pip install "google-cloud-aiplatform>=1.47.0"`, or as specified by ADK).
```python
from google.adk.agents import Agent
# Ensure google-cloud-aiplatform is installed
try:
    from google.adk.code_executors import VertexAiCodeExecutor  # Key import
    VERTEX_SDK_AVAILABLE = True
except ImportError:
    print("Vertex AI SDK (with preview features for extensions) not found. Please ensure 'google-cloud-aiplatform' is installed and up to date.")
    VERTEX_SDK_AVAILABLE = False
from google.adk.runners import InMemoryRunner
from google.genai.types import Content, Part
import os

from building_intelligent_agents.utils import load_environment_variables, create_session, DEFAULT_LLM

load_environment_variables()

vertex_agent = None
if VERTEX_SDK_AVAILABLE:
    # Ensure GOOGLE_CLOUD_PROJECT is set in your environment
    if not os.getenv("GOOGLE_CLOUD_PROJECT"):
        print("Error: GOOGLE_CLOUD_PROJECT environment variable must be set for VertexAiCodeExecutor.")
    else:
        try:
            print("Initializing VertexAiCodeExecutor...")
            # You can optionally provide a resource_name for an existing Code Interpreter Extension instance:
            # vertex_executor = VertexAiCodeExecutor(resource_name="projects/.../locations/.../extensions/...")
            vertex_executor = VertexAiCodeExecutor()  # Will create or use an existing one based on env var or default
            print(f"VertexAiCodeExecutor initialized. Using extension: {vertex_executor._code_interpreter_extension.gca_resource.name}")
            vertex_agent = Agent(
                name="vertex_code_agent",
                model=DEFAULT_LLM,
                instruction="You are an advanced AI assistant. Write Python code to perform calculations or data tasks. Your code will be executed in a secure Vertex AI environment. Default libraries like pandas, numpy, matplotlib are available.",
                code_executor=vertex_executor
            )
        except Exception as e:
            print(f"Failed to initialize VertexAiCodeExecutor. Ensure Vertex AI API is enabled and auth is correct. Error: {e}")
else:
    print("Skipping VertexAiCodeExecutor example as Vertex AI SDK is not available/configured.")

if __name__ == "__main__":
    if not vertex_agent:
        print("Vertex Agent not initialized. Exiting.")
    else:
        runner = InMemoryRunner(agent=vertex_agent, app_name="VertexCodeApp")
        session_id = "s_vertex"
        user_id = "vertex_user"
        create_session(runner, user_id=user_id, session_id=session_id)

        prompts = [
            "Plot a simple sine wave using matplotlib and save it as 'sine_wave.png'. Describe the plot.",
            "Create a pandas DataFrame with columns 'City' and 'Population' for three cities, then print the average population."
        ]

        # The Vertex AI Code Interpreter handles file outputs (like 'sine_wave.png')
        # and makes them available in the CodeExecutionResult.
        # ADK can then save these as artifacts.
        print("Vertex AI Code Interpreter agent ready.")

        async def main():
            for prompt_text in prompts:
                print(f"\nYOU: {prompt_text}")
                user_message = Content(parts=[Part(text=prompt_text)], role="user")
                print("ASSISTANT (via VertexAiCodeExecutor): ", end="", flush=True)
                # Note: The actual plot image won't be printed to the console here.
                # In the Dev UI or a proper app, you'd handle the output_files
                # from the CodeExecutionResult (which are then put into Event.actions.artifact_delta).
                async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=user_message):
                    if event.content and event.content.parts:
                        for part in event.content.parts:
                            if part.text:
                                print(part.text, end="", flush=True)
                print()
                # To see artifacts:
                if runner.artifact_service:
                    artifacts = await runner.artifact_service.list_artifact_keys(
                        app_name="VertexCodeApp", user_id="vertex_user", session_id="s_vertex"
                    )
                    if artifacts:
                        print(f" (Artifacts created: {artifacts})")

        import asyncio
        asyncio.run(main())
```
Managed, Scalable, and Feature-Rich Execution with Vertex AI
`VertexAiCodeExecutor` is the recommended choice for production cloud deployments.

- Managed Environment: No need to manage Docker or Python environments.
- Security: Runs in a Google-managed sandbox.
- Pre-installed Libraries: Common data science libraries (pandas, numpy, matplotlib, scipy) are typically available.
- File I/O: Supports generating and returning files (e.g., plots, data files), which ADK can then handle as artifacts.
- Stateful Execution: The Vertex AI Code Interpreter can be stateful by default (using a `session_id` in `execute_code`), meaning variables and imports persist across code blocks within the same agent session. ADK's `VertexAiCodeExecutor` is also marked as `stateful=True` by default, as illustrated below.
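In practice, statefulness means consecutive prompts can build on each other. A quick sketch (assuming the `stateful` field can be passed explicitly, since it already defaults to `True` for this executor):

```python
from google.adk.code_executors import VertexAiCodeExecutor

# stateful=True is the default for this executor; shown explicitly for emphasis.
stateful_executor = VertexAiCodeExecutor(stateful=True)

# Because state persists within the session, the second prompt can reuse
# the `df` variable defined while answering the first prompt.
prompts = [
    "Create a pandas DataFrame named df with a column 'x' holding 1 through 5.",
    "Add a column 'x_squared' to df and print df.",
]
```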
The Code Execution Cycle
Regardless of the executor used (except `BuiltInCodeExecutor`, which is more integrated), the general cycle facilitated by ADK's LLM Flow (specifically the `_code_execution.py` processors) is:

1. LLM Generates Code: The `LlmAgent`, guided by its instruction, generates a code snippet in response to a user query or as part of a plan. This code is typically embedded in its text response, marked by delimiters (e.g., a fenced `python` block).
2. ADK Extracts Code: The `_code_execution.response_processor` in the LLM Flow detects and extracts this code block from the `LlmResponse`. The original `LlmResponse` content (up to the code block) is yielded as a partial `Event`.
3. ADK Invokes Executor: The processor creates a `CodeExecutionInput` and calls the `agent.code_executor.execute_code()` method.
4. Executor Runs Code: The chosen `BaseCodeExecutor` implementation runs the code in its specific environment.
5. Executor Returns Result: A `CodeExecutionResult` (containing stdout, stderr, and any output files) is returned to ADK. Output files from `VertexAiCodeExecutor` or a custom executor can be automatically saved as artifacts if an `ArtifactService` is configured.
6. ADK Formats Result: The `CodeExecutionResult` is formatted into a string (e.g., a fenced `tool_output` block) and packaged as a `user`-role `Content` object within a new `Event` (see the sketch after this list).
7. Result Fed Back to LLM: This new `Event` (containing the execution result) is appended to the conversation history. The LLM Flow then constructs a new `LlmRequest` (including this result) and calls the LLM again.
8. LLM Interprets Result: The LLM uses the code's output to formulate a final natural language response, generate more code, or decide on its next action.
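Step 6 is worth seeing concretely. A simplified stand-in for the formatting step (the real delimiter strings come from `execution_result_delimiters`, so treat these as placeholders) could be:

```python
from google.genai.types import Content, Part

def format_execution_result(stdout: str, stderr: str) -> Content:
    """Wrap an execution result so it can be fed back to the LLM as a user turn."""
    body = stdout if not stderr else f"{stdout}\n[stderr]\n{stderr}"
    return Content(role="user", parts=[Part(text=f"```tool_output\n{body}\n```")])

feedback = format_execution_result("1024\n", "")
print(feedback.parts[0].text)
```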
Managing Code Execution Context (CodeExecutorContext)

For stateful executors, or when optimizing data file inputs, ADK uses `CodeExecutorContext`. This object, typically managed internally by the code execution flow processors, stores information relevant to the code execution process within the session state. This can include:

- The `execution_id` for stateful sessions (e.g., for `VertexAiCodeExecutor`).
- A list of processed input files to avoid redundant processing (`optimize_data_file`).
- Error counts for retry logic.

You generally won't interact with `CodeExecutorContext` directly unless you are building a custom code executor or deeply customizing the code execution flow.
Best Practice: Iterative Prompting for Code Generation
Getting an LLM to generate correct and useful code often requires iterative prompting.
- Be Specific: Clearly state the desired inputs, outputs, and any constraints.
- Provide Examples: If possible, include examples of desired code snippets in the agent’s instruction or few-shot examples.
- Error Handling: Instruct the agent on how to interpret error messages from code execution and how to attempt to fix its code. ADK's `error_retry_attempts` setting on code executors helps with this.
- Start Simple: For complex tasks, ask the LLM to generate code in smaller, verifiable chunks. A sample instruction that bakes in these practices is sketched below.
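Here is one way such an instruction might read; treat it as a hypothetical starting point, not a prescribed template:

```python
# Sample instruction incorporating the practices above (hypothetical wording).
CODE_AGENT_INSTRUCTION = """
You write Python code to solve the user's problem.
- State your plan in one sentence, then emit a single fenced python code block.
- Use only the Python standard library unless told otherwise.
- Always print() the final result so it appears in stdout.
- If an execution result contains an error, briefly explain the cause and
  emit a corrected code block.
"""
```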
What’s Next?
You’ve now unlocked a very advanced capability for your agents: the ability to write and execute code. This opens doors to solving a much wider range of problems. Next, we’ll return to the core of agent intelligence: the LLMs themselves. We’ll explore how ADK interfaces with different models, how to configure requests, and how to interpret their diverse responses, including streaming and multimodal content.