How to Build an AI Agent with Claude and the Anthropic SDK

You can build an AI agent that reads files, runs commands, searches the web, and edits code — all autonomously — in under 50 lines of Python. The Claude Agent SDK makes this possible by giving you the same infrastructure that powers Claude Code as a programmable library.

Definition

The Claude Agent SDK is a Python and TypeScript library from Anthropic that lets you build autonomous AI agents with built-in tools for file operations, command execution, web search, and code editing — the same capabilities that power Claude Code.

TL;DR

The Claude Agent SDK provides built-in tools (Read, Write, Edit, Bash, Glob, Grep, WebSearch) so your agent works immediately without custom tool implementation
Available in both Python (pip install claude-agent-sdk) and TypeScript (npm install @anthropic-ai/claude-agent-sdk)
The SDK handles the entire agent loop — context gathering, action execution, verification, and iteration — automatically
Supports subagents for parallel task delegation, MCP for external integrations, hooks for custom lifecycle logic, and sessions for multi-turn context
Works with Anthropic's API directly, plus Amazon Bedrock, Google Vertex AI, and Microsoft Azure as alternative providers

What Makes the Claude Agent SDK Different

If you've built AI agents before, you know the pain. You define tools, write execution handlers, build a loop that passes results back to the model, manage context windows, handle errors, implement retry logic — and that's before your agent does anything useful.

The Claude Agent SDK eliminates that entire layer. When you call query(), Claude receives your prompt, decides which tools to use, executes them directly, observes the results, and decides what to do next. You don't implement tool execution. You don't manage the loop. The SDK does it.

This isn't a wrapper around the Anthropic Messages API. It's the actual engine behind Claude Code — the same agent loop, the same context management, the same tool execution pipeline. Anthropic extracted it into a library so you can point it at your own problems.

The practical difference is significant. With the standard Anthropic Client SDK, you write something like this: send a message, check if the model wants to call a tool, execute the tool yourself, send the result back, and repeat. With the Agent SDK, you write one query() call and stream the results. Claude handles everything in between.

Step 1: Set Up Your Environment

You need Python 3.10 or higher (or Node.js 18+ for TypeScript) and an Anthropic API key from the Claude Console.

Create a project directory and install the SDK:

mkdir my-agent && cd my-agent
pip install claude-agent-sdk

For TypeScript:

mkdir my-agent && cd my-agent
npm install @anthropic-ai/claude-agent-sdk

Set your API key as an environment variable. Create a .env file in your project directory:

ANTHROPIC_API_KEY=your-api-key-here

The SDK also supports alternative providers. Set CLAUDE_CODE_USE_BEDROCK=1 for Amazon Bedrock, CLAUDE_CODE_USE_VERTEX=1 for Google Vertex AI, or CLAUDE_CODE_USE_FOUNDRY=1 for Microsoft Azure. Each requires its own credential configuration.

Tip

Use uv (the fast Python package manager from Astral) instead of pip for a cleaner setup. Run uv init && uv add claude-agent-sdk — it handles virtual environments automatically and installs packages significantly faster.

Step 2: Build Your First Agent

Here's a complete agent that finds and fixes bugs in a Python file. This is the SDK's quickstart example, and it demonstrates the core pattern you'll use for everything:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ResultMessage


async def main():
    async for message in query(
        prompt="Review utils.py for bugs that would cause crashes. Fix any issues you find.",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Glob"],
            permission_mode="acceptEdits",
        ),
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
                elif hasattr(block, "name"):
                    print(f"Tool: {block.name}")
        elif isinstance(message, ResultMessage):
            print(f"Done: {message.subtype}")


asyncio.run(main())

Three things to understand here. The query() function is the entry point — it creates the agent loop and returns an async iterator that streams messages as Claude works. The allowed_tools parameter controls exactly which built-in tools Claude can access. And permission_mode="acceptEdits" auto-approves file changes so the agent runs without interactive prompts.

When you run this, Claude will read the target file, analyze the code, identify bugs, and edit the file to fix them — all autonomously. Each step streams back as a message you can inspect, log, or display.

Step 3: Understand the Built-in Tools

The SDK ships with a complete toolkit. You don't need to implement any of these — they work out of the box:

Tool	What It Does	Common Use Cases
Read	Read any file in the working directory	Code analysis, config inspection, data loading
Write	Create new files	Generating reports, creating configs, scaffolding
Edit	Make precise edits to existing files	Bug fixing, refactoring, updating values
Bash	Run terminal commands and scripts	Testing, git operations, installs, data processing
Glob	Find files by pattern	Locating files across projects, filtering by extension
Grep	Search file contents with regex	Finding usages, tracking TODOs, locating definitions
WebSearch	Search the web for current information	Research, fact-checking, documentation lookup
WebFetch	Fetch and parse web page content	Scraping docs, reading APIs, pulling data

You control tool access per agent. A read-only analysis agent might only get Read, Glob, and Grep. A full automation agent gets everything. This isn't just about convenience — it's a security boundary.

Step 4: Add Subagents for Complex Tasks

For anything beyond simple tasks, you'll want subagents. These are isolated agent instances that handle focused subtasks. The parent agent delegates, and each subagent reports back with results.

Think of it like managing a team: instead of one person doing everything, you assign specialists to specific parts of the work. Each subagent gets its own context window, so it can focus deeply on its task without being distracted by the parent agent's broader context.

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition


async def main():
    async for message in query(
        prompt="Review this codebase for quality issues and security vulnerabilities",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep", "Agent"],
            agents={
                "code-reviewer": AgentDefinition(
                    description="Expert code reviewer for quality and best practices.",
                    prompt="Analyze code quality and suggest improvements.",
                    tools=["Read", "Glob", "Grep"],
                ),
                "security-auditor": AgentDefinition(
                    description="Security specialist for vulnerability analysis.",
                    prompt="Find security vulnerabilities and suggest fixes.",
                    tools=["Read", "Glob", "Grep"],
                ),
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())

Include Agent in allowed_tools — subagents are invoked through the Agent tool. Messages from subagents include a parent_tool_use_id field so you can track which results came from which agent.

Step 5: Connect External Systems with MCP

The Model Context Protocol (MCP) lets your agent interact with external services — databases, browsers, APIs, SaaS tools — without you writing custom integration code. MCP servers handle authentication and API calls automatically.

Here's an agent with browser automation through Playwright:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    async for message in query(
        prompt="Open example.com and describe what you see",
        options=ClaudeAgentOptions(
            mcp_servers={
                "playwright": {
                    "command": "npx",
                    "args": ["@playwright/mcp@latest"],
                }
            }
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())

There are hundreds of MCP servers available for services like Slack, GitHub, Google Drive, Asana, and databases. You define the server in your config, and Claude gets access to its tools automatically. No OAuth flows, no API client code, no token management.

Step 6: Add Lifecycle Hooks

Hooks let you run custom code at specific points in the agent's lifecycle — before a tool runs, after a tool runs, when the agent finishes, and more. This is how you add logging, validation, cost tracking, or custom approval flows.

import asyncio
from datetime import datetime
from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher


async def log_file_change(input_data, tool_use_id, context):
    file_path = input_data.get("tool_input", {}).get("file_path", "unknown")
    with open("./audit.log", "a") as f:
        f.write(f"{datetime.now()}: modified {file_path}\n")
    return {}


async def main():
    async for message in query(
        prompt="Refactor utils.py to improve readability",
        options=ClaudeAgentOptions(
            permission_mode="acceptEdits",
            hooks={
                "PostToolUse": [
                    HookMatcher(matcher="Edit|Write", hooks=[log_file_change])
                ]
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())

Available hooks include PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd, and UserPromptSubmit. The HookMatcher uses a regex pattern to target specific tools — in the example above, the audit log hook only fires when Claude uses Edit or Write.

Step 7: Manage Sessions for Multi-Turn Agents

Sessions let your agent maintain context across multiple exchanges. Claude remembers files it read, analysis it performed, and the full conversation history. You can resume sessions later or fork them to explore different approaches.

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    session_id = None

    # First query — capture the session ID
    async for message in query(
        prompt="Read the authentication module and summarize how it works",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"]),
    ):
        if hasattr(message, "subtype") and message.subtype == "init":
            session_id = message.session_id

    # Second query — resumes with full context
    async for message in query(
        prompt="Now find all places that call it and check for security issues",
        options=ClaudeAgentOptions(resume=session_id),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())

When you pass resume=session_id, Claude picks up exactly where it left off. The context from the first query — every file read, every analysis performed — carries forward into the second query. This is essential for building agents that handle complex, multi-step workflows.

Step 8: Control Permissions

The SDK gives you granular control over what your agent can and cannot do. This isn't just a safety feature — it's how you build agents that are appropriate for different deployment contexts.

Four permission modes are available. acceptEdits auto-approves file edits but asks for other actions — good for trusted development workflows. bypassPermissions runs every tool without prompts — only appropriate for fully sandboxed environments like CI pipelines. dontAsk (TypeScript only) denies anything not explicitly in allowed_tools. And default requires you to provide a canUseTool callback that implements your own approval logic.

For production deployments, use the default mode with a custom callback that implements whatever approval policy your use case requires. This might mean logging all tool calls, requiring human approval for destructive operations, or blocking certain file paths entirely.

Practical Agent Ideas to Build

Now that you understand the SDK, here are agents that solve real problems:

A codebase documentation agent that scans your entire project, reads every file, and generates comprehensive documentation — README files, API docs, architecture diagrams, and inline comments. Give it Read, Write, Glob, Grep, and Bash tools.

A research agent that takes a topic, searches the web for sources, reads the full content of top results, cross-references claims, and produces a structured research report with citations. Use WebSearch, WebFetch, Read, and Write tools.

An email assistant that reads incoming messages, categorizes them by priority, drafts responses following your communication style, and queues them for your review. Connect an email MCP server and give it Read and Write tools.

A CI/CD debugging agent that monitors build failures, reads error logs, traces the failure to specific code changes, and either fixes the issue automatically or produces a detailed diagnostic report. This one needs Read, Edit, Bash, Glob, and Grep.

Info

Check out Anthropic's official example agents at github.com/anthropics/claude-agent-sdk-demos for complete working implementations of email assistants, research agents, and more. These are excellent starting points for your own projects.

Agent SDK vs. Client SDK: When to Use Which

The Anthropic Client SDK (anthropic package) gives you direct API access to Claude. You send messages and implement tool execution yourself. The Agent SDK gives you Claude with built-in tool execution and an autonomous agent loop.

Use the Client SDK when you need fine-grained control over every API call, when you're building a simple chatbot without tool use, or when you're integrating Claude into an existing application framework that has its own tool execution pipeline.

Use the Agent SDK when you want autonomous task execution, when your agent needs to interact with the filesystem or run commands, when you're building for CI/CD or production automation, or when you want to leverage built-in tools without implementing execution handlers.

Many teams use both: the CLI (Claude Code) for interactive daily development, and the SDK for production automation pipelines. The capabilities translate directly between them — a workflow you test interactively in Claude Code can be deployed as an SDK agent with minimal changes.

How much does it cost to use the Claude Agent SDK?

The SDK itself is free and open-source. You pay for API usage based on your Claude model consumption. Claude Sonnet 4 pricing is $3 per million input tokens and $15 per million output tokens. A typical agent task (file analysis and editing) might use 10,000–50,000 tokens, costing roughly $0.03–0.15 per task. Costs scale with task complexity and the number of tool calls your agent makes.

Can I use the Claude Agent SDK with Amazon Bedrock or Google Vertex AI?

Yes. Set the CLAUDE_CODE_USE_BEDROCK=1 environment variable for Amazon Bedrock, CLAUDE_CODE_USE_VERTEX=1 for Google Vertex AI, or CLAUDE_CODE_USE_FOUNDRY=1 for Microsoft Azure. Each provider requires its own credential configuration, but the SDK code itself stays the same — you don't change your agent logic based on which provider you use.

What is the difference between the Claude Agent SDK and Claude Code?

They share the same engine. Claude Code is an interactive CLI tool for developers. The Claude Agent SDK is the library version of that same engine, designed for programmatic use in applications, CI/CD pipelines, and custom automation. Workflows you build interactively in Claude Code translate directly to Agent SDK implementations with minimal changes.

Do I need to implement tool execution myself with the Claude Agent SDK?

No. That's the key difference from the Anthropic Client SDK. The Agent SDK includes built-in execution for all core tools — Read, Write, Edit, Bash, Glob, Grep, WebSearch, and WebFetch. Claude calls these tools autonomously during its agent loop, and the SDK handles execution. You just stream the results. For external services, you connect MCP servers that handle their own execution.

Can the Claude Agent SDK run in production environments?

Yes. The SDK supports Docker containerization, cloud deployment, and CI/CD integration. Use bypassPermissions mode in fully sandboxed environments (like Docker containers) where human approval isn't feasible. For production with human oversight, use the default permission mode with a custom canUseTool callback. The SDK also supports session persistence and resumption for long-running workflows.