How to Build an AI Agent with CrewAI
If you've been building single-prompt workflows and wondering why they keep breaking on edge cases, multi-agent architecture is the answer — and CrewAI is the fastest way to get there.
CrewAI is an open-source Python framework for orchestrating multiple AI agents that collaborate on complex tasks, where each agent has a defined role, set of goals, and access to specific tools — much like assigning a real team to a project.
TL;DR
- CrewAI lets you build teams of specialized AI agents instead of relying on a single overloaded prompt
- Over 100,000 developers are certified through CrewAI's training — it has become the standard for role-based multi-agent systems
- You can build and run your first crew in under 30 minutes with Python 3.10+ and a free API key
- CrewAI's two main primitives are Crews (autonomous collaboration) and Flows (deterministic pipelines) — most production systems use both
- Real use cases in production: content generation, lead scoring, automated customer support, code review pipelines
What Makes CrewAI Different from Other Agent Frameworks
Before you write any code, it's worth understanding why CrewAI specifically — there are other frameworks competing for this space.
CrewAI's core bet is that role-based specialization produces better results than a single all-knowing agent. Instead of asking one LLM to research a topic, analyze it, and write a report, you assign those tasks to three separate agents — a Researcher, an Analyst, and a Writer — each with focused context and appropriate tools.
The practical difference: specialized agents hallucinate less within their domain, produce more coherent outputs, and are easier to debug when something goes wrong.
How CrewAI compares to the alternatives:
| Framework | Architecture | Best For | Learning Curve | Production Readiness |
|---|---|---|---|---|
| CrewAI | Role-based multi-agent | Collaborative task pipelines | Low | High |
| LangGraph | Graph-based state machine | Complex conditional workflows | High | Very High |
| AutoGen / MS Agent Framework | Conversational multi-agent | Research and analysis agents | Medium | High (post-1.0 GA) |
| Single LLM (direct API) | None | Simple, contained tasks | Very Low | Medium |
Note: AutoGen is effectively in maintenance mode — Microsoft merged it with Semantic Kernel into the Microsoft Agent Framework with GA targeted for Q1 2026. If you're starting fresh today, CrewAI and LangGraph are the two serious options.
CrewAI wins on developer ergonomics and speed to first working prototype. LangGraph wins for highly complex state machines where you need explicit control over every transition. For most automation use cases, start with CrewAI.
Step 1: Set Up Your Environment
You need Python 3.10–3.13. Check your version first:
```bash
python --version
```
If you're below 3.10, install the latest Python from python.org before proceeding.
Create a project directory and a virtual environment:
```bash
mkdir my-crewai-project
cd my-crewai-project
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
Install CrewAI:
```bash
pip install crewai crewai-tools
```
Set your LLM API key as an environment variable. CrewAI defaults to OpenAI, but you can swap in Claude or any other provider:
```bash
export OPENAI_API_KEY="your-key-here"
# Or for Claude:
export ANTHROPIC_API_KEY="your-key-here"
```
Use a .env file with the python-dotenv package to manage API keys locally instead of exporting them as environment variables each session. This keeps credentials out of your shell history and makes it easy to switch between keys.
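In practice you would install python-dotenv and call load_dotenv() at the top of your script. As a rough stdlib-only sketch of what that call does under the hood (the file-writing demo and the DEMO_API_KEY name are illustrative, not part of any library):

```python
# Rough stdlib-only sketch of what python-dotenv's load_dotenv() does:
# read KEY=VALUE lines from a .env file into os.environ.
import os

def load_env_file(path: str = ".env") -> None:
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: a real environment variable wins over the .env file
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

# Demo with a throwaway .env file and an illustrative key name
with open(".env", "w") as fh:
    fh.write('DEMO_API_KEY="your-key-here"\n')
load_env_file()
print(os.environ["DEMO_API_KEY"])  # → your-key-here
```

The real package handles more edge cases (quoting, multiline values, variable expansion), so use it rather than rolling your own; the sketch just shows why a .env file keeps keys out of your shell history.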
Step 2: Understand the Core Primitives
Before writing your crew, you need to understand the three building blocks. Everything in CrewAI is composed of these:
Agents — The individual team members. Each agent has a role, a goal, a backstory, and optionally a list of tools. The backstory sounds like flavor text but it matters — it shapes the agent's decision-making by giving the LLM anchoring context about how this role thinks.
Tasks — The actual work units. Each task has a description (what to do), an expected_output (what a good result looks like), and is assigned to a specific agent. You can also route a task's output into a file.
Crews — The container that wraps agents and tasks together and defines how they collaborate. You choose a process: sequential (tasks run one after another, each feeding into the next) or hierarchical (a manager agent orchestrates the others).
Step 3: Build Your First Crew
Here's a minimal working example — a research and reporting crew that takes a topic, researches it, and outputs a structured report:
```python
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool

# Optional: web search tool (requires a SERPER_API_KEY environment variable)
search_tool = SerperDevTool()

# Define agents
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover cutting-edge developments and data on {topic}",
    backstory="""You're a seasoned researcher with a talent for finding
    non-obvious insights. You dig past surface-level summaries and
    find the specific data points that matter.""",
    tools=[search_tool],
    verbose=True
)

writer = Agent(
    role="Content Strategist",
    goal="Write a clear, actionable report on {topic} for a business audience",
    backstory="""You transform complex research into direct, useful reports.
    You avoid fluff and always lead with the most important finding.""",
    verbose=True
)

# Define tasks
research_task = Task(
    description="""Research {topic} thoroughly. Find:
    - 3 recent statistics with sources
    - Key trends in the last 12 months
    - Practical implications for businesses""",
    expected_output="A bullet-point research brief with sources cited",
    agent=researcher
)

writing_task = Task(
    description="""Using the research brief, write a 500-word executive summary
    on {topic}. Lead with the most important finding. Use clear headers.""",
    expected_output="A formatted executive summary in markdown",
    agent=writer,
    output_file="report.md"
)

# Assemble the crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    verbose=True
)

# Run it
result = crew.kickoff(inputs={"topic": "AI automation adoption in small businesses"})
print(result)
```
Save this as main.py and run python main.py — you'll see both agents working in sequence, with the writer receiving the researcher's output automatically.
Step 4: Add Tools to Extend Agent Capabilities
Agents without tools are just prompted LLMs. Tools are what make agents useful in production — they let agents take real actions: search the web, read files, run code, query databases.
CrewAI's crewai-tools package includes ready-made tools:
```python
from crewai_tools import (
    SerperDevTool,        # Web search via Serper API
    FileReadTool,         # Read local files
    WebsiteSearchTool,    # Scrape and search a specific URL
    CodeInterpreterTool,  # Execute Python code
)
```
You can also write custom tools using the @tool decorator:
```python
from crewai.tools import tool

@tool("Get company info from CRM")
def get_crm_data(company_name: str) -> str:
    """Retrieves contact and deal data for a company from our CRM."""
    # Your API call here
    return f"Company: {company_name}, Status: Active, Revenue: $2.3M"
```
Assign tools to agents when you instantiate them, and give each agent only the tools it actually needs — an agent with too many tools has too many options to choose from and makes more mistakes.
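Mechanically, a tool decorator like the one above mostly attaches metadata — a name plus the docstring as a description — to a plain function so the framework can present it to the LLM and dispatch calls to it. A stdlib-only sketch of the idea (this is illustrative, not CrewAI's actual implementation):

```python
# Stdlib-only sketch of what a @tool decorator does: tag a function with a
# name and description the framework can show to the LLM. Not CrewAI's code.

def tool(name: str):
    def wrap(fn):
        fn.tool_name = name
        fn.tool_description = (fn.__doc__ or "").strip()
        return fn
    return wrap

@tool("Get company info from CRM")
def get_crm_data(company_name: str) -> str:
    """Retrieves contact and deal data for a company from our CRM."""
    return f"Company: {company_name}, Status: Active"

# The framework reads the metadata when deciding which tool to call,
# then invokes the function like any other Python callable:
print(get_crm_data.tool_name)   # → Get company info from CRM
print(get_crm_data("Acme"))     # → Company: Acme, Status: Active
```

The docstring doubles as the tool description the LLM sees, which is why clear, specific docstrings on custom tools matter in practice.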
Step 5: Use Flows for Production-Ready Control
Crews are great for autonomous collaboration but can be unpredictable in production because agents make their own routing decisions. When you need deterministic, reliable pipelines, use Flows.
A Flow is an event-driven state machine where you define exactly what runs, in what order, with branching logic:
```python
from crewai.flow.flow import Flow, listen, start
from pydantic import BaseModel

class ContentPipelineState(BaseModel):
    topic: str = ""
    research: str = ""
    draft: str = ""

class ContentFlow(Flow[ContentPipelineState]):
    @start()
    def get_topic(self):
        self.state.topic = "AI agent frameworks in 2026"
        return self.state.topic

    @listen(get_topic)
    def run_research(self, topic):
        # research_crew is a Crew instance defined elsewhere; you can run a
        # crew here, or call an LLM directly. .raw gives the plain-text result.
        self.state.research = research_crew.kickoff({"topic": topic}).raw
        return self.state.research

    @listen(run_research)
    def write_draft(self, research):
        # writing_crew is likewise a Crew instance defined elsewhere
        self.state.draft = writing_crew.kickoff({"research": research}).raw
        return self.state.draft

flow = ContentFlow()
result = flow.kickoff()
```
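CrewAI's Flow machinery does much more (typed state, persistence, branching), but the core @start/@listen pattern above reduces to: run the entry method, then fire each method whose declared trigger just finished, passing results along. A stdlib-only sketch of that event chain (MiniFlow and these decorators are illustrative stand-ins, not CrewAI's API):

```python
# Stdlib-only sketch of the @start/@listen event chain. MiniFlow, start,
# and listen are illustrative stand-ins, NOT CrewAI's actual classes.

def start():
    def wrap(fn):
        fn._flow_role = ("start", None)
        return fn
    return wrap

def listen(trigger):
    def wrap(fn):
        fn._flow_role = ("listen", trigger.__name__)
        return fn
    return wrap

class MiniFlow:
    def kickoff(self):
        # Collect all decorated methods
        methods = [getattr(self, n) for n in dir(self)
                   if callable(getattr(self, n))
                   and hasattr(getattr(self, n), "_flow_role")]
        entry = next(m for m in methods if m._flow_role[0] == "start")
        result = entry()
        current = entry.__name__
        # Fire each listener whose trigger just completed, chaining results
        fired = True
        while fired:
            fired = False
            for m in methods:
                role, trig = m._flow_role
                if role == "listen" and trig == current:
                    result = m(result)
                    current = m.__name__
                    fired = True
                    break
        return result

class Pipeline(MiniFlow):
    @start()
    def get_topic(self):
        return "AI agents"

    @listen(get_topic)
    def research(self, topic):
        return f"research on {topic}"

    @listen(research)
    def draft(self, research):
        return f"draft from {research}"

print(Pipeline().kickoff())  # → draft from research on AI agents
```

Seeing the pattern laid bare makes Flow debugging easier: every step is just a method, and the only routing is "which method listens for which".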
The real power of CrewAI is combining Crews (for autonomous reasoning steps) inside Flows (for reliable orchestration). Use a Crew when you want agents to figure out how to solve a problem. Use a Flow when you need to control exactly what happens.
Step 6: Real-World Use Cases Worth Building
CrewAI is already running at scale in production. Here are the patterns companies are actually using:
Content generation pipeline — A Researcher + SEO Analyst + Writer crew that takes a target keyword, researches the topic, analyzes competing content, and writes a full-length article. The workflow outputs a markdown file ready for editorial review.
Lead scoring and outreach — A crew that pulls a new lead from a CRM, researches the company, scores fit against ideal customer profile criteria, and drafts a personalized outreach email. With a Flow wrapping it, this runs on a schedule for every new inbound lead.
Customer support triage — Three agents: one to classify the inquiry type, one to pull relevant knowledge base content, one to draft the response. Integrated with a ticketing system so the draft lands in the agent's queue for one-click approval.
Code review pipeline — A Writer (generates code), a Reviewer (checks for bugs and best practices), and a Tester (writes and validates test cases). Companies using this pattern report significant reduction in basic code review cycle time.
Don't deploy a CrewAI system to production without adding guardrails on agent output. LLMs can produce confidently wrong responses. For anything customer-facing, add a review step — either human-in-the-loop or a validation agent that checks the output against a rubric before it's sent.
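A validation step doesn't have to be another agent — even a deterministic rubric check in front of the send action catches a surprising number of failures. A minimal sketch, where the rubric rules (word count, required sections) are made-up examples, not a CrewAI feature:

```python
# Illustrative pre-send validation gate. The rubric rules here (minimum
# length, required section names) are made-up examples for this sketch.

def validate_output(text: str, min_words: int = 100,
                    required_sections: tuple = ("Summary",)) -> list:
    """Return a list of rubric violations; an empty list means it passes."""
    problems = []
    if len(text.split()) < min_words:
        problems.append(f"too short: fewer than {min_words} words")
    for section in required_sections:
        if section.lower() not in text.lower():
            problems.append(f"missing required section: {section}")
    return problems

draft = "Summary\n" + "word " * 120  # stand-in for an agent's output
issues = validate_output(draft)
if issues:
    print("HOLD FOR HUMAN REVIEW:", issues)
else:
    print("OK to send")  # → OK to send
```

For higher-stakes outputs, layer this under a validation agent or a human approval queue; the cheap deterministic check just filters the obvious failures first.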
Step 7: Debug and Optimize Your Crews
When something goes wrong in a CrewAI system (and it will), the debugging workflow matters.
Enable verbose mode — Set verbose=True on both agents and the crew during development. This logs every decision, tool call, and thought the agents produce. It's noisy but invaluable.
Check task expected_output first — Most failures trace back to a vague expected_output. If you tell an agent to produce "a good analysis," it'll produce whatever it thinks that means. Specify format, length, and structure explicitly.
Isolate agents — Test each agent individually against a mock task before combining them into a crew. If the researcher agent can't produce a reliable brief on its own, the writer agent's output will be garbage too.
Monitor token usage — Crews can burn through tokens faster than you expect, especially with web search tools. Add logging for API costs in development and set hard limits before production.
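Cost logging can start as simple arithmetic over token counts. A back-of-the-envelope sketch using the ~$3/M input and ~$15/M output Claude 3.5 Sonnet figures cited in the FAQ below (swap in your own model's prices):

```python
# Back-of-the-envelope cost estimate for one crew run, using the ~$3/M
# input and ~$15/M output token prices cited in this article's FAQ.

def run_cost(input_tokens: int, output_tokens: int,
             in_per_m: float = 3.0, out_per_m: float = 15.0) -> float:
    """Estimate USD cost of a run from token counts and per-million prices."""
    return (input_tokens / 1_000_000 * in_per_m
            + output_tokens / 1_000_000 * out_per_m)

# A typical 2-3 agent run: roughly 15k input + 5k output tokens
print(f"${run_cost(15_000, 5_000):.3f}")  # → $0.120
```

Logging this per run during development makes it obvious when a web-search tool or a retry loop starts multiplying your token burn, before it shows up on an invoice.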
Do I need to know Python to use CrewAI?
Yes — CrewAI is a Python framework and there's no visual interface. You need to be comfortable writing Python classes and functions, installing packages with pip, and managing virtual environments. You don't need advanced Python skills, but basic Python fluency is required. Most people get to a working crew within a few hours of their first session.
How much does it cost to run a CrewAI agent?
The main cost is LLM API calls. A crew running 2-3 agents on a single task typically consumes 5,000–20,000 tokens depending on task complexity. At Claude 3.5 Sonnet pricing (~$3/million input tokens, $15/million output tokens), a single crew run costs roughly $0.05–$0.30. High-volume production systems should use cost tracking from day one to avoid surprises.
Can CrewAI work with models other than GPT?
Yes. CrewAI supports any LiteLLM-compatible model, which includes Anthropic Claude, Google Gemini, Mistral, Groq, Ollama (local models), and dozens of others. CrewAI ships with LiteLLM, so to use Claude you set the model at the agent level using LiteLLM's provider-prefixed name (e.g., llm="anthropic/claude-3-5-sonnet-20241022") and provide an ANTHROPIC_API_KEY. Many developers prefer Claude for its instruction-following consistency in complex multi-step agent workflows.
What is the difference between CrewAI Crews and Flows?
Crews enable autonomous, role-based collaboration where agents decide how to approach and complete tasks. Flows are event-driven pipelines where you define the exact execution sequence with explicit state management. In production, Flows are more reliable because they're deterministic — you know exactly what runs. Use Crews inside Flows for the steps that need genuine reasoning, and Flows for the overall orchestration.
Is CrewAI production-ready in 2026?
Yes. CrewAI has over 100,000 developers trained through its community certification program, and enterprises are moving use cases to production in 30–60 day timelines. Larger enterprises report 23% more production deployments compared to smaller teams. The framework is actively maintained, and the Flows feature was specifically added to address production reliability needs that Crews alone couldn't guarantee.
