Key Takeaways:
The Claude Agent SDK is Anthropic's library for building autonomous AI agents in Python and TypeScript. It exposes the same agent loop, built-in tools, and context management that power Claude Code, so your agent can read files, run commands, search the web, and edit code out of the box, without you implementing tool execution.
One history note explains most broken tutorials. The SDK launched as claude-code-sdk and was renamed claude-agent-sdk in late 2025. The options class changed from ClaudeCodeOptions to ClaudeAgentOptions at the same time.
Check any guide against the official Agent SDK documentation before copying its imports.
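One way to audit a copied tutorial for the pre-rename names is a small scanner like the sketch below. It is only an illustration: the package and class names come from the rename described above, while `claude_code_sdk` as the legacy Python module name and the `find_legacy_names` helper are assumptions made for this example.

```python
import re

# Old name -> new name, per the rename described above.
# `claude_code_sdk` as the legacy Python module name is an assumption.
RENAMES = {
    "claude_code_sdk": "claude_agent_sdk",
    "claude-code-sdk": "claude-agent-sdk",
    "ClaudeCodeOptions": "ClaudeAgentOptions",
}

_PATTERN = re.compile("|".join(re.escape(old) for old in RENAMES))


def find_legacy_names(source: str) -> list[tuple[int, str, str]]:
    """Return (line_number, old_name, new_name) for each legacy name found."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(0)
            findings.append((line_no, old, RENAMES[old]))
    return findings
```

Running it over a pasted snippet gives a list of lines to fix before the first import error does.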
Built-in tools your agent gets immediately:
Anthropic's own guidance in Building Effective Agents draws the line clearly. Agents fit open-ended problems where you cannot predict the number of steps or hardcode a path.
If the steps are known, a linear workflow is faster, cheaper, and easier to debug. Do not reach for an agent because agents are fashionable.
If an agent is the right call, the next question is which Anthropic surface to build on:
| | Agent SDK | Client SDK (Messages API) | Managed Agents |
|---|---|---|---|
| What it is | Agent loop as a Python/TS library | Raw API; you build the loop | Anthropic-hosted REST agent |
| Runs in | Your process, your infra | Your process | Anthropic infrastructure |
| Tools | Built-in + MCP, executed for you | You implement execution | Claude triggers; you execute |
| Best for | Custom apps, pipelines, automation | Full control, thin integrations | Production without sandbox infra |
Claude Code itself stays in the picture for interactive work and one-off tasks. Many teams use the CLI for daily development and the SDK for production automation. The two share workflows directly.
If you are committed to Claude and want batteries included, use the Agent SDK: the loop, tools, and context management ship working.
If you need multi-model flexibility, graph-style orchestration, or LangSmith observability, LangChain's LangGraph and Deep Agents are model-agnostic and run against any provider.
The OpenAI Agents SDK plays the same primitive-level role as Claude's, built around handoff chains in OpenAI's ecosystem.
Prerequisites are light:
Install (pick your language)
# Python
pip install claude-agent-sdk
# TypeScript
npm install @anthropic-ai/claude-agent-sdk
# Both
export ANTHROPIC_API_KEY=your-api-key
The TypeScript SDK bundles a native Claude Code binary, so there is no separate CLI install. Python users who hit CLINotFoundError need the Claude Code CLI on PATH. If pip reports no matching distribution, your interpreter is most likely older than Python 3.10, the minimum the SDK supports.
Enterprise stacks can authenticate through a cloud provider instead of a direct key: set CLAUDE_CODE_USE_BEDROCK=1 for Amazon Bedrock, CLAUDE_CODE_USE_VERTEX=1 for Google Vertex AI, or CLAUDE_CODE_USE_FOUNDRY=1 for Azure AI Foundry, each with that platform's credentials.
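A startup check can make it obvious which credential path a deployment is about to use. The sketch below reads only the environment variables named above; the `detect_auth_backend` helper and the order in which it checks the flags are assumptions for illustration, not documented SDK behavior.

```python
import os
from typing import Mapping, Optional

# Provider switches named in the text above, checked in a fixed order.
# The precedence is an assumption made for this sketch.
PROVIDER_FLAGS = {
    "CLAUDE_CODE_USE_BEDROCK": "bedrock",
    "CLAUDE_CODE_USE_VERTEX": "vertex",
    "CLAUDE_CODE_USE_FOUNDRY": "foundry",
}


def detect_auth_backend(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Guess which credential path is configured; None means nothing is set."""
    for flag, backend in PROVIDER_FLAGS.items():
        if env.get(flag) == "1":
            return backend
    if env.get("ANTHROPIC_API_KEY"):
        return "api_key"
    return None
```

Logging the result once at startup, and failing fast on `None`, tends to be cheaper than debugging an authentication error mid-run.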
One policy note: Anthropic does not allow third-party products to offer claude.ai login to their own end users, so ship on API keys.
The query() function is the whole entry point. It returns an async stream of messages while Claude works.
Python - first_agent.py
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    # Stream messages as the agent works; only read-only tools are pre-approved.
    async for message in query(
        prompt="Find all TODO comments in this project and summarize them",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"]),
    ):
        # Only the closing result message carries a `result` attribute.
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())
Three message types matter in the stream.
SystemMessage with subtype init carries the session_id and the tool list.
AssistantMessage carries Claude's text and tool-use blocks as it reasons.
ResultMessage closes the run with the final result, token usage, and total_cost_usd. Handle each explicitly and your logs stay readable.
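Handling each type explicitly could look like the sketch below. To keep it runnable without the SDK installed, it defines small stand-in classes that mirror only the attributes this article uses (`subtype`, `data`, `content`, `text`, `result`, `total_cost_usd`); the real classes come from claude_agent_sdk and carry more fields, and `describe` is a hypothetical helper.

```python
from dataclasses import dataclass, field


# Stand-in classes so the sketch runs on its own. In a real agent, import
# SystemMessage, AssistantMessage, and ResultMessage from claude_agent_sdk.
@dataclass
class SystemMessage:
    subtype: str
    data: dict = field(default_factory=dict)


@dataclass
class TextBlock:
    text: str


@dataclass
class AssistantMessage:
    content: list = field(default_factory=list)


@dataclass
class ResultMessage:
    result: str
    total_cost_usd: float


def describe(message) -> str:
    """Turn one streamed message into a single log line."""
    if isinstance(message, SystemMessage) and message.subtype == "init":
        return f"[init] session={message.data.get('session_id')}"
    if isinstance(message, AssistantMessage):
        # Tool-use blocks have no `text` attribute, so they are skipped here.
        texts = [b.text for b in message.content if hasattr(b, "text")]
        return "[assistant] " + " ".join(texts)
    if isinstance(message, ResultMessage):
        return f"[result] {message.result} (cost ${message.total_cost_usd:.4f})"
    return f"[other] {type(message).__name__}"
```

Inside the `async for` loop, `print(describe(message))` would then replace the `hasattr` check from the first example.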
allowed_tools pre-approves the tools the agent may use without prompting, and disallowed_tools blocks the tools you list outright.
The SDK ships six permission modes: default (unmatched tools route to your canUseTool callback), dontAsk (deny anything not pre-approved), acceptEdits (auto-approve file edits), bypassPermissions (approve everything, trusted environments only), plan (explore and plan without editing), and auto (TypeScript only, a model classifier approves each call).
For fine-grained control, a canUseTool handler decides per call:
TypeScript - custom permission handler
options: {
  canUseTool: async (toolName, input) => {
    if (["Read", "Glob", "Grep"].includes(toolName)) {
      return { behavior: "allow", updatedInput: input };
    }
    if (toolName === "Write" && input.file_path?.includes(".env")) {
      return { behavior: "deny", message: "Cannot modify .env files" };
    }
    return { behavior: "allow", updatedInput: input };
  }
}
Start from least privilege. A read-only agent with Read, Glob, and Grep can analyze anything and damage nothing.
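The same decision logic can be written as a plain Python function and unit-tested before it goes anywhere near an agent. The sketch below is a deny-by-default variant of the handler above: it returns dicts shaped like the TypeScript result, matches `.env` files by filename instead of by substring, and leaves the wiring into the Python SDK's permission callback out. The `decide` name and the tool groupings are assumptions for illustration.

```python
from pathlib import PurePosixPath

READ_ONLY_TOOLS = {"Read", "Glob", "Grep"}
WRITE_TOOLS = {"Write", "Edit"}


def decide(tool_name: str, tool_input: dict) -> dict:
    """Return an allow/deny decision shaped like the TypeScript handler's result."""
    if tool_name in READ_ONLY_TOOLS:
        return {"behavior": "allow", "updatedInput": tool_input}
    if tool_name in WRITE_TOOLS:
        name = PurePosixPath(tool_input.get("file_path", "")).name
        # Match .env, .env.local, .env.production, but not environment.py.
        if name == ".env" or name.startswith(".env."):
            return {"behavior": "deny", "message": "Cannot modify .env files"}
        return {"behavior": "allow", "updatedInput": tool_input}
    # Least privilege: anything not explicitly handled is denied.
    return {"behavior": "deny", "message": f"{tool_name} is not pre-approved"}
```

Checking the filename avoids blocking a path such as `src/environment.py`, which a substring test on ".env" would reject.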
Built-in tools cover the filesystem and the web. Your domain lives behind your own APIs, and that is what custom tools are for.
The SDK serves them through in-process Model Context Protocol (MCP) servers: no network hop, just functions registered with the agent.
Python - a custom tool
from claude_agent_sdk import tool, create_sdk_mcp_server, ClaudeAgentOptions

@tool("lookup_account", "Fetch account plan and renewal date by email", {"email": str})
async def lookup_account(args):
    # `crm` stands in for your own API client; it is not part of the SDK.
    account = await crm.get(args["email"])
    return {"content": [{"type": "text",
                         "text": f"plan={account.plan}; renewal={account.renewal_date}"}]}

server = create_sdk_mcp_server(name="crm", version="1.0.0", tools=[lookup_account])

options = ClaudeAgentOptions(
    mcp_servers={"crm": server},
    # MCP tools are listed by their fully qualified name.
    allowed_tools=["Read", "mcp__crm__lookup_account"],
)
TypeScript - same tool with Zod
import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const crmServer = createSdkMcpServer({
  name: "crm",
  version: "1.0.0",
  tools: [
    tool("lookup_account", "Fetch account plan and renewal date by email",
      { email: z.string().describe("Customer email") },
      async (args) => ({
        // lookupAccount is your own function; it is not part of the SDK.
        content: [{ type: "text", text: await lookupAccount(args.email) }]
      })
    )
  ]
});
Two conventions trip people up. Tool names are fully qualified as mcp__<server>__<tool>, and that full string goes in allowed_tools. And an in-process server is not a network MCP server; for external servers (Playwright, databases, hundreds more), pass a command or HTTP URL in mcp_servers and the SDK connects as a client.
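Because a mistyped qualified name fails quietly (the tool simply is not approved), it can help to build the strings instead of typing them. A minimal sketch of the naming convention above; both helper names are hypothetical.

```python
def qualified_tool_name(server: str, tool: str) -> str:
    """Build the mcp__<server>__<tool> string that allowed_tools expects."""
    return f"mcp__{server}__{tool}"


def allowed_tools_for(builtin: list[str], mcp_tools: dict[str, list[str]]) -> list[str]:
    """Combine built-in tool names with fully qualified MCP tool names."""
    qualified = [
        qualified_tool_name(server, tool)
        for server, tools in mcp_tools.items()
        for tool in tools
    ]
    return [*builtin, *qualified]
```

For the CRM example, `allowed_tools_for(["Read"], {"crm": ["lookup_account"]})` produces the same list that was written out by hand in the options above.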
Why bother with custom tools when WebSearch exists? Grounding and speed. One published benchmark on this exact SDK found the native web search took around 80 seconds on a flight-data query and returned vague results, while a purpose-built API tool answered in about 26 seconds with real data, per SerpApi's Agent SDK tutorial.
Domain tools beat generic ones.
Subagents are focused child agents with their own context windows, tools, and even their own models. The main agent delegates, subagents report back. Define them programmatically in options:
Python - a security-review subagent
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

options = ClaudeAgentOptions(
    allowed_tools=["Read", "Glob", "Grep", "Agent"],
    agents={
        "security-reviewer": AgentDefinition(
            description="Expert reviewer for security issues.",
            prompt="Scan for injection flaws, exposed secrets, insecure data handling.",
            tools=["Read", "Grep", "Glob"],
        )
    },
)
Include Agent in allowed_tools or the delegation itself gets blocked. A .claude/agents/ folder with markdown definitions works too and is shared with Claude Code.
Use a cheaper model for simple subagent jobs and keep the count deliberate.
Subagent sprawl is the community's loudest complaint: every extra agent multiplies memory, context, and cost, and production hosts report OOM crashes from exactly this.
Hooks run your code at fixed points in the agent lifecycle: PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd, UserPromptSubmit, and more.
They are the difference between a demo and something you can run unattended.
Python - audit log + dangerous-command blocker
from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher
from datetime import datetime

async def log_file_change(input_data, tool_use_id, context):
    path = input_data.get("tool_input", {}).get("file_path", "unknown")
    with open("audit.log", "a") as f:
        f.write(f"{datetime.now()}: modified {path}\n")
    return {}

async def block_dangerous(input_data, tool_use_id, context):
    cmd = input_data.get("tool_input", {}).get("command", "")
    if "rm -rf" in cmd or "sudo" in cmd:
        return {"hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": "Dangerous command blocked"}}
    return {}

options = ClaudeAgentOptions(
    permission_mode="acceptEdits",
    hooks={
        "PostToolUse": [HookMatcher(matcher="Edit|Write", hooks=[log_file_change])],
        "PreToolUse": [HookMatcher(matcher="Bash", hooks=[block_dangerous])],
    },
)
The matcher is a regex on tool names. No matcher means the hook fires for every tool call, which is exactly what you want for audit logging.
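The substring test in block_dangerous is easy to read but coarse: it also fires on `echo sudoku`, or on a commit message containing "perform -rf cleanup", and it misses `rm -r -f`. A token-based check is one possible refinement. The sketch below is an assumption-laden illustration, not a complete shell parser: it ignores quoting, only understands short flags, and both helper names are hypothetical.

```python
import re

# Split on whitespace and common shell separators; quoting is ignored on purpose.
_TOKEN = re.compile(r"[^\s;&|()]+")


def is_dangerous(command: str) -> bool:
    """Flag sudo anywhere, and rm combined with recursive and force short flags."""
    tokens = _TOKEN.findall(command)
    if "sudo" in tokens:
        return True
    for i, token in enumerate(tokens):
        if token != "rm":
            continue
        # Collect every short flag after rm. This can over-block when a later
        # command in the same line has -r/-f flags, which errs on the safe side.
        flags = ""
        for arg in tokens[i + 1:]:
            if arg.startswith("-") and not arg.startswith("--"):
                flags += arg[1:]
        if "f" in flags and ("r" in flags or "R" in flags):
            return True
    return False


def pre_tool_use_decision(input_data: dict) -> dict:
    """Return the deny payload used in the hook above, or {} to allow."""
    command = input_data.get("tool_input", {}).get("command", "")
    if is_dangerous(command):
        return {"hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": "Dangerous command blocked"}}
    return {}
```

The async hook then reduces to `return pre_tool_use_decision(input_data)`, and the decision logic can be tested without starting an agent. A pattern list like this is a guardrail, not a sandbox; a restrictive allowed_tools list is still the stronger control.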
Each query() call starts fresh unless you resume. Capture the session_id from the init message, then pass resume to continue with full context.
Claude remembers the files it read and the conclusions it drew.
Python - resume a session
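import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage, ResultMessage

async def main():
    session_id = None
    # First run: capture the session id from the init message.
    async for message in query(
        prompt="Read the authentication module",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"]),
    ):
        if isinstance(message, SystemMessage) and message.subtype == "init":
            session_id = message.data["session_id"]

    # Second run: resume the same session so the earlier context carries over.
    async for message in query(
        prompt="Now find all places that call it",
        options=ClaudeAgentOptions(resume=session_id),
    ):
        if isinstance(message, ResultMessage):
            print(message.result)

asyncio.run(main())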
Wrap that pattern in a while loop reading stdin and you have a terminal chat agent in 30 lines. For persistent memory across sessions, the SDK reads CLAUDE.md from the working directory, the same project-context file Claude Code uses.
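The session-threading part of that chat loop can be sketched separately from the SDK call. In the example below, `ask` is a hypothetical async function that sends one prompt, optionally resuming a session, and returns the reply text with the session id; in a real agent it would wrap query() and the resume option. `fake_ask` is a stand-in so the sketch runs on its own.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Optional

# Hypothetical signature: (prompt, session_id or None) -> (reply_text, session_id).
Ask = Callable[[str, Optional[str]], Awaitable[tuple[str, str]]]


async def chat(ask: Ask, prompts: Iterable[str]) -> list[str]:
    """Run prompts in order, threading the session id from turn to turn."""
    session_id: Optional[str] = None
    replies = []
    for prompt in prompts:
        if prompt.strip().lower() in {"exit", "quit"}:
            break
        # The first turn starts a session; later turns resume it.
        reply, session_id = await ask(prompt, session_id)
        replies.append(reply)
    return replies


async def fake_ask(prompt: str, session_id: Optional[str]) -> tuple[str, str]:
    """Stand-in backend that reports whether the turn resumed a session."""
    mode = "resumed" if session_id else "new"
    return f"{mode}: {prompt}", session_id or "session-1"
```

With the stand-in, `asyncio.run(chat(fake_ask, ["read auth", "find callers"]))` shows the first turn opening a session and the second one resuming it.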
Long-running agents stay inside the context window through automatic compaction, which summarizes older messages as the limit approaches.
The richer and better organized that working context is, the better the agent performs; it is a miniature version of the context layer production support platforms maintain across every ticket.
Programmatic consumers need JSON, not prose. Pass a JSON Schema as outputFormat and read structured_output from the result:
TypeScript - structured output
for await (const message of query({
  prompt: "Review this codebase and list all issues",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
    outputFormat: { type: "json_schema", schema: reviewSchema }
  }
})) {
  if (message.type === "result" && message.subtype === "success") {
    const review = message.structured_output; // typed, parseable
  }
}
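The listing above references reviewSchema without showing it. As a rough idea of what such a schema and its consumer might look like, here is a sketch written as a Python dict; every field name in it (issues, file, severity, summary) is hypothetical, as is the `high_severity_files` helper.

```python
# Hypothetical schema: the article's reviewSchema is not shown, so these
# field names are illustrative only.
REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "issues": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "file": {"type": "string"},
                    "severity": {"type": "string", "enum": ["low", "medium", "high"]},
                    "summary": {"type": "string"},
                },
                "required": ["file", "severity", "summary"],
            },
        }
    },
    "required": ["issues"],
}


def high_severity_files(review: dict) -> list[str]:
    """Pick out files with high-severity issues from a schema-shaped result."""
    return sorted({i["file"] for i in review["issues"] if i["severity"] == "high"})
```

A downstream script can then work with the structured result directly, for example failing a CI job when `high_severity_files` returns anything.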
Cost visibility is built in. Every ResultMessage carries total_cost_usd and token usage, with a per-model breakdown when subagents run. Watch the orchestrator.
Every call carries the full conversation history plus all subagent descriptions, so long sessions get progressively more expensive.
Cap damage with max_turns, cheaper models for subagents, and session resets when context stops paying rent.
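On top of max_turns, a pipeline that runs many agent jobs can keep its own running total and stop scheduling work once a cap is reached. The `CostBudget` class below is a hypothetical helper sketched for that purpose; it assumes you feed it the total_cost_usd value from each ResultMessage.

```python
from typing import Optional


class CostBudget:
    """Accumulate per-run costs and report when a spending cap is crossed."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def record(self, run_cost_usd: Optional[float]) -> None:
        # Treat a missing cost as zero rather than failing the run.
        self.spent_usd += run_cost_usd or 0.0

    @property
    def exhausted(self) -> bool:
        return self.spent_usd >= self.cap_usd
```

In a batch loop, call `budget.record(message.total_cost_usd)` when each result arrives and check `budget.exhausted` before starting the next query().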
Everything above assembles into something real. This agent reads an incoming ticket, searches a local knowledge-base folder, drafts a sourced reply, and flags churn-risk language through a custom tool.
It is a Claude Agent SDK example you can run as-is, then point at real data.
Python - triage_agent.py
import asyncio, re
from claude_agent_sdk import (
    query, ClaudeAgentOptions, tool, create_sdk_mcp_server,
    AssistantMessage, ResultMessage,
)

# Case-insensitive patterns that suggest the customer may leave.
RISK = [r"cancel", r"churn", r"switch(ing)? to", r"competitor",
        r"refund", r"frustrat", r"disappoint", r"renewal"]

@tool("flag_churn_risk", "Scan ticket text for churn-risk language",
      {"ticket_text": str})
async def flag_churn_risk(args):
    hits = [p for p in RISK if re.search(p, args["ticket_text"], re.I)]
    # Two or more signals is HIGH, one is MEDIUM, none is LOW.
    level = "HIGH" if len(hits) >= 2 else ("MEDIUM" if hits else "LOW")
    return {"content": [{"type": "text",
                         "text": f"churn_risk={level}; signals={hits}"}]}

signals = create_sdk_mcp_server(name="signals", version="1.0.0",
                                tools=[flag_churn_risk])

PROMPT = """Triage the support ticket in ticket.txt:
1. Read the ticket.
2. Search ./kb for articles relevant to the customer's problem.
3. Write reply.md: a draft reply citing the KB articles you used.
4. Call flag_churn_risk on the ticket text and append the result
to reply.md as a routing note for the CSM."""

async def main():
    options = ClaudeAgentOptions(
        system_prompt=("You are a B2B support triage agent. "
                       "Cite sources. Never invent product facts."),
        allowed_tools=["Read", "Grep", "Glob", "Write",
                       "mcp__signals__flag_churn_risk"],
        permission_mode="acceptEdits",
        mcp_servers={"signals": signals},
        max_turns=15,
    )
    async for message in query(prompt=PROMPT, options=options):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
        elif isinstance(message, ResultMessage):
            # The cost can be missing; fall back to zero instead of crashing.
            print(f"Done. Cost: ${message.total_cost_usd or 0.0:.4f}")

asyncio.run(main())
Drop a ticket.txt and a few markdown files in ./kb, run it, and read reply.md. The agent gathers context, drafts with sources, and surfaces a revenue signal, in 80 lines.
The TypeScript translation is mechanical: tool() with a Zod schema, the same options object in camelCase.
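The churn scoring inside flag_churn_risk is ordinary Python, so it can be pulled out and tested without running the agent. The sketch below reuses the RISK patterns and thresholds from triage_agent.py; the `score_churn` name is an assumption for illustration.

```python
import re

# Same patterns as triage_agent.py.
RISK = [r"cancel", r"churn", r"switch(ing)? to", r"competitor",
        r"refund", r"frustrat", r"disappoint", r"renewal"]


def score_churn(ticket_text: str) -> tuple[str, list[str]]:
    """Return (level, matched_patterns) using the thresholds from the agent."""
    hits = [p for p in RISK if re.search(p, ticket_text, re.I)]
    level = "HIGH" if len(hits) >= 2 else ("MEDIUM" if hits else "LOW")
    return level, hits
```

A few sample tickets make the behavior concrete: a message asking to cancel and get a refund scores HIGH, while a neutral question about the renewal date already scores MEDIUM, which suggests the pattern list needs tuning against real tickets before anyone routes on it.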
Recognize the architecture. Context gathering, sourced drafting, and signal detection is the loop a production B2B support platform runs on every ticket, with evals, channels, and account data attached.
Helply's AI assistant drafts every reply with full account context, and its AI support agent resolves the high-confidence tickets end to end, with churn and upsell signals routed to the right owner.
This toy version teaches the pattern; production is a data problem more than a code problem.
The SDK library is free and open source. You pay for what the agent consumes: input, output, and cache tokens at standard Claude API rates, under Anthropic's Commercial Terms.
The June 15, 2026 change matters if you build on a subscription. Agent SDK and claude -p usage on Claude plans now draws from a separate monthly Agent SDK credit instead of your interactive limits, per Anthropic's official support article.
Past the credit, additional usage flows to usage credits at standard API rates, but only if you have usage credits enabled; otherwise Agent SDK requests stop until the credit refreshes.
The practical pattern: prototype against a Max plan's credit, then ship production traffic on API keys with cost tracking from Step 8 wired into your logs.
You can now create an AI agent with the Claude Agent SDK: stand it up with query(), scope it with permissions, extend it with MCP tools, guard it with hooks, and know what it costs before the bill arrives.
The SDK is moving fast, with a TypeScript V2 in preview, so keep the changelog bookmarked and re-verify examples against current versions.
And if the agent you actually need is a support agent for a B2B software company, the build-vs-buy math deserves a look before you scale the toy version.
Helply runs this architecture productized: AI-drafted replies with full account context, autonomous resolution of high-confidence tickets, and churn, upsell, and competitor signals routed to the right owner.
The SDK library is free and open source; you pay only for Claude API tokens, or usage draws from your plan's monthly Agent SDK credit ($20 to $200) introduced June 15, 2026.
They share the same engine: Claude Code is the interactive CLI for your terminal, while the Agent SDK exposes that engine as a Python and TypeScript library for your own applications.
Yes. Install claude-agent-sdk via pip (Python 3.10+) or @anthropic-ai/claude-agent-sdk via npm (Node 18+).
No. It is Claude-only, though you can authenticate through Amazon Bedrock, Google Vertex AI, or Azure AI Foundry instead of a direct Anthropic API key.
Install the Claude Code CLI and make sure it is on your PATH (Python SDK), or upgrade; the TypeScript SDK bundles the binary so no separate install is needed.
The Agent SDK runs the agent loop inside your own process and infrastructure, while Managed Agents is Anthropic's hosted REST API that runs the agent and its sandbox for you.