How to Build AI Agents with JavaScript and Node.js

If you can write a Node.js route handler, you can build an AI agent. The hard part was never the language — it was understanding the loop, the tool interface, and the safeguards that separate a demo from something you trust in production. This guide walks through the entire stack in JavaScript, using the patterns that actually ship in 2026.

Definition

An AI agent is a program that uses a large language model to reason about a goal, call external tools to act on the world, observe the results, and loop until the goal is complete or a stopping condition is hit.

TL;DR

The modern JavaScript stack for agents is Node.js 20+ with either the OpenAI Agents SDK or a direct SDK call in a ReAct loop
Agents differ from chatbots because they can act — every agent needs tools (functions, APIs, MCP servers) and a loop that calls them
Production agents require iteration caps (10 for simple tasks, 25 for complex), exponential backoff on rate limits, and per-tool error isolation
One uncapped agent can burn through hundreds of iterations on a malformed request, so guardrails are not optional
You can ship a working v1 agent in under 150 lines of JavaScript

Why JavaScript for AI Agents in 2026

For years, Python was the default for anything LLM-related. That gap has closed. The OpenAI Agents SDK now ships a first-class TypeScript implementation, the Vercel AI SDK makes streaming agent responses into a web UI trivial, and the Model Context Protocol has JavaScript server and client libraries that mirror the Python versions feature-for-feature.

The practical upshot: if your stack is already Node.js, React, or Next.js, there is no reason to stand up a separate Python service just to run agents. You can keep everything in one runtime, one deployment pipeline, and one set of dependencies.

The other advantage is streaming. Agents produce intermediate reasoning, partial tool calls, and staged outputs. Handing those to a browser over Server-Sent Events or WebSockets is native territory for Node. Python works here too, but JavaScript removes a translation layer.

The Core Concept: ReAct and the Agent Loop

Nearly every production agent in 2026 runs some variant of ReAct, short for Reasoning and Acting. The pattern is almost comically simple once you see it:

1. Send the conversation history plus the tool list to the model
2. If the model returns a plain message, you're done
3. If the model returns one or more tool calls, execute each tool
4. Append the tool results to the conversation history
5. Go back to step 1

That is the entire loop. Everything else — memory, planning, multi-agent handoffs — is a variation on this structure. Claude, GPT, and Gemini all support this pattern through their native tool calling APIs, so the code stays nearly identical regardless of which model you pick.
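The five steps above can be tried without an API key by swapping the model for a scripted stand-in. The sketch below is illustrative only: fakeModel, demoTools, and runLoop are hypothetical names, and the message shape is simplified compared with what a real provider returns.

```javascript
// Scripted stand-in for an LLM (hypothetical): it asks for one tool call,
// then answers once a tool result is present in the history.
async function fakeModel(messages) {
  const toolResult = messages.find((m) => m.role === "tool");
  if (!toolResult) {
    return {
      role: "assistant",
      content: null,
      tool_calls: [{ id: "call_1", name: "add", args: { a: 2, b: 3 } }]
    };
  }
  return { role: "assistant", content: `The sum is ${toolResult.content}.` };
}

// Tool registry keyed by tool name.
const demoTools = {
  add: async ({ a, b }) => String(a + b)
};

async function runLoop(model, tools, userMessage, maxIterations = 5) {
  const messages = [{ role: "user", content: userMessage }];

  for (let i = 0; i < maxIterations; i++) {
    // Step 1: send the full history to the model.
    const reply = await model(messages);
    messages.push(reply);

    // Step 2: a plain message means the agent is done.
    if (!reply.tool_calls?.length) {
      return reply.content;
    }

    // Steps 3 and 4: run each requested tool and append its result.
    for (const call of reply.tool_calls) {
      const result = await tools[call.name](call.args);
      messages.push({ role: "tool", tool_call_id: call.id, content: result });
    }
    // Step 5: the for loop sends the updated history back to the model.
  }

  throw new Error("Loop exceeded max iterations");
}
```

Calling runLoop(fakeModel, demoTools, "What is 2 + 3?") takes two passes through the loop: one that runs the add tool, and one that returns the final sentence.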

Info

If you've built a chatbot that just calls chat.completions.create and returns the message, you are 80% of the way to an agent. The missing 20% is a while loop and a function dispatcher.

Your JavaScript Agent Stack

You have three realistic choices for framework in 2026. Pick based on how much abstraction you want.

Framework	Best For	Abstraction Level	Streaming UI
OpenAI Agents SDK (JS/TS)	Multi-agent workflows, handoffs, tracing	Medium	Built-in
Vercel AI SDK	Web apps with streaming React UI	Low	Native
Direct SDK + custom loop	Full control, minimal deps, edge runtimes	None	DIY

For teams just getting started, the direct approach is often the right call. You learn the loop once, and every framework after that makes sense.

Step 1: Set Up a Node.js Project

Start with Node.js 20 or newer, which ships native fetch and stable ES module support. Initialize the project and install the OpenAI SDK, along with Zod and dotenv:

mkdir my-agent && cd my-agent
npm init -y
npm install openai zod dotenv

Add "type": "module" to your package.json so you can use import statements. Store your API key in a .env file and load it with dotenv — never hardcode keys in your source.
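A missing key is easier to debug at startup than in the middle of an agent run. One way to fail fast is a small guard like the sketch below; requireEnv is a hypothetical helper, not part of dotenv or the OpenAI SDK.

```javascript
// Hypothetical helper: read a required setting or fail fast with a clear message.
// The `env` parameter defaults to process.env and exists mainly to ease testing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

After the dotenv import has run, calling requireEnv("OPENAI_API_KEY") surfaces a missing key immediately. OPENAI_API_KEY is the variable the OpenAI SDK reads by default when you construct the client without arguments.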

Step 2: Define Your Tools

A tool is just a JavaScript function plus a JSON schema that describes its parameters. The schema is what the LLM sees when it decides which tool to call. Zod makes this less painful by letting you define the schema and TypeScript types in one shot.

Here's a minimal two-tool setup, one to get the weather and one to calculate. The schemas are written by hand here so you can see exactly what the model receives:

const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" }
        },
        required: ["city"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "calculate",
      description: "Evaluate a math expression",
      parameters: {
        type: "object",
        properties: {
          expression: { type: "string" }
        },
        required: ["expression"]
      }
    }
  }
];

const toolFunctions = {
  get_weather: async ({ city }) => {
    // Replace with real API call
    return `Weather in ${city}: 72°F, sunny`;
  },
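  calculate: async ({ expression }) => {
    // The model supplies this string, so allow only digits, whitespace, and
    // arithmetic operators. Without identifiers it cannot call functions.
    if (!/^[\d\s+\-*/%().]+$/.test(expression)) {
      return "Error: expression may contain only numbers and + - * / % ( )";
    }
    try {
      return String(Function(`"use strict"; return (${expression})`)());
    } catch (err) {
      return `Error: ${err.message}`;
    }
  }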
};

Notice the calculate tool catches its own errors and returns a structured error string. This is critical. If a tool throws, the LLM cannot reason about what went wrong. If it returns an error message, the LLM can course-correct.
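Rather than repeating try/catch in every tool, you can apply the same rule once with a wrapper. This is a minimal sketch under assumptions: isolateTool and the divide tool are hypothetical, and the error shape is just one reasonable choice.

```javascript
// Wrap any tool so failures come back as a JSON string instead of an exception.
function isolateTool(name, fn) {
  return async (args) => {
    try {
      const result = await fn(args);
      // Tool message content must be a string, so serialize anything else.
      return typeof result === "string" ? result : JSON.stringify(result ?? null);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return JSON.stringify({ ok: false, tool: name, error: message });
    }
  };
}

// Hypothetical tool that throws on bad input.
const safeDivide = isolateTool("divide", async ({ a, b }) => {
  if (b === 0) throw new Error("Division by zero");
  return { quotient: a / b };
});
```

With this in place, a call such as safeDivide({ a: 1, b: 0 }) resolves to an error payload the model can read and react to, instead of rejecting inside the loop.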

Step 3: Write the Agent Loop

Here is the whole loop in roughly 40 lines:

import OpenAI from "openai";
import "dotenv/config";

const client = new OpenAI();
const MAX_ITERATIONS = 10;

async function runAgent(userMessage) {
  const messages = [
    { role: "system", content: "You are a helpful assistant with access to tools." },
    { role: "user", content: userMessage }
  ];

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await client.chat.completions.create({
      model: "gpt-4.1",
      messages,
      tools
    });

    const msg = response.choices[0].message;
    messages.push(msg);

    if (!msg.tool_calls) {
      return msg.content;
    }

    for (const call of msg.tool_calls) {
      const fn = toolFunctions[call.function.name];
      const args = JSON.parse(call.function.arguments);
      const result = await fn(args);

      messages.push({
        role: "tool",
        tool_call_id: call.id,
        content: String(result)
      });
    }
  }

  throw new Error("Agent exceeded max iterations");
}

console.log(await runAgent("What's the weather in Austin, and what's 450 times 12?"));

That's it. You have a working agent. Given this prompt, the model should call get_weather for Austin and calculate for the math, then answer the user in plain English once it has both results.

Step 4: Add Production Safeguards

The 40-line version works for demos. For anything touching real users or real money, you need four additional patterns.

Iteration cap. Already in the example above as MAX_ITERATIONS. Use 10 for simple workflows, 25 for complex multi-step tasks. An uncapped agent can burn hundreds of iterations on a malformed request before anyone notices.

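Exponential backoff on rate limits. Catch rate-limit and overload responses (HTTP 429, plus 529 on Anthropic's API) and retry with delays of 1s, 2s, 4s, 8s. Most LLM SDKs have this built in (the OpenAI Node SDK retries twice by default and accepts a maxRetries option), but confirm the setting matches what you need.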

Per-tool error isolation. Every tool function should wrap its body in try/catch and return a structured error. Never let a tool throw into the loop.

Timeouts. Wrap tool calls in Promise.race against a timeout. A stuck HTTP call to a slow API can hang your entire agent.
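The backoff and timeout patterns above can each be written as a small wrapper. The sketch below is one possible shape, not a prescribed implementation: withTimeout and withBackoff are hypothetical helpers, and withBackoff assumes the thrown error carries a numeric status property, which you should verify for the SDK you use.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Reject if `promise` has not settled within `ms` milliseconds.
// Note: this stops waiting, but it does not cancel the underlying work.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Retry `fn` on retryable statuses, doubling the delay after each failure.
async function withBackoff(fn, { retries = 4, baseDelayMs = 1000, retryOn = [429, 529] } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const retryable = retryOn.includes(err?.status);
      if (!retryable || attempt >= retries) throw err;
      // With the defaults this waits 1s, 2s, 4s, then 8s.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

In the loop from Step 3, the model request could be passed through withBackoff and each tool call through withTimeout. Pairing the timeout with an AbortController passed to fetch would also cancel the stuck request rather than just abandoning it.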

Warning

Always set hard spend limits at the provider level. Dashboard limits are your last line of defense when application-level caps fail. A single runaway agent can run up a triple-digit bill in token charges in under an hour.

Step 5: Level Up With the OpenAI Agents SDK

Once you've built the loop from scratch, the OpenAI Agents SDK for JavaScript becomes more useful. It gives you agents-as-tools for handoffs between specialized agents, built-in tracing that visualizes every iteration in a dashboard, session management for multi-turn memory, and guardrails for input and output validation.

Install it with npm install @openai/agents and rewrite the same weather-and-math agent in about 15 lines. The tradeoff: less code, slightly less control over the loop internals.

Connecting Tools With MCP

The Model Context Protocol has taken over as the standard way to expose tools to agents in 2026. Instead of defining tools inline in your code, you point your agent at an MCP server — local or remote — and it discovers the available tools automatically.

For JavaScript, the @modelcontextprotocol/sdk package gives you both server and client. A typical pattern: your agent runs in Node.js and connects to an MCP server that wraps your internal APIs, database queries, or third-party integrations. This keeps the tool layer separate from the agent layer and makes the same tools reusable across agents built in Python, JavaScript, or direct Claude/GPT integrations.
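MCP servers describe each tool with a name, a description, and an inputSchema in JSON Schema form, which maps almost directly onto the tools array from Step 2. The sketch below shows that mapping on a hand-written descriptor; mcpToolsToOpenAI and the search_orders tool are hypothetical, and fetching the real list from a server is left to your MCP client.

```javascript
// Convert MCP tool descriptors into the `tools` array shape used in Step 2.
function mcpToolsToOpenAI(mcpTools) {
  return mcpTools.map((tool) => ({
    type: "function",
    function: {
      name: tool.name,
      description: tool.description ?? "",
      // Fall back to an empty object schema if the server omitted one.
      parameters: tool.inputSchema ?? { type: "object", properties: {} }
    }
  }));
}

// Hand-written descriptor standing in for a server's tool listing (hypothetical).
const listedTools = [
  {
    name: "search_orders",
    description: "Search orders by customer email",
    inputSchema: {
      type: "object",
      properties: { email: { type: "string" } },
      required: ["email"]
    }
  }
];

const openAiTools = mcpToolsToOpenAI(listedTools);
```

The resulting openAiTools array can be passed as the tools option in the loop. When the model requests one of these tools, the dispatcher forwards the call to the MCP server instead of a local function.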

Common Pitfalls to Avoid

The three mistakes I see most often when teams ship their first JavaScript agent:

First, forgetting to pass the full message history back on each iteration. The LLM is stateless. Every call needs the complete conversation, including all tool results, or the agent will keep requesting the same tool call until it hits the iteration cap.

Second, returning objects instead of strings from tools. The content field in a tool message must be a string. If you return an object, serialize it with JSON.stringify first.

Third, assuming the LLM will always produce valid JSON in tool_calls.function.arguments. It usually does, but defensive parsing with a try/catch around JSON.parse prevents a single bad call from crashing the loop.
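The second and third pitfalls can be handled in one place by routing every tool call through a defensive dispatcher. This is a sketch under assumptions: dispatchToolCall is a hypothetical helper, and it expects the call shape used in Step 3 (id, function.name, function.arguments).

```javascript
// Turn one tool call into a `tool` message without ever throwing.
async function dispatchToolCall(call, registry) {
  // Always produce string content, serializing objects when needed.
  const reply = (content) => ({
    role: "tool",
    tool_call_id: call.id,
    content: typeof content === "string" ? content : JSON.stringify(content ?? null)
  });

  const name = call.function.name;
  // Object.hasOwn avoids matching inherited properties such as "toString".
  if (!Object.hasOwn(registry, name) || typeof registry[name] !== "function") {
    return reply({ error: `Unknown tool: ${name}` });
  }

  let args;
  try {
    args = JSON.parse(call.function.arguments || "{}");
  } catch {
    return reply({ error: "Arguments were not valid JSON" });
  }

  try {
    return reply(await registry[name](args));
  } catch (err) {
    return reply({ error: err instanceof Error ? err.message : String(err) });
  }
}
```

Inside the loop, the body of the inner for statement then reduces to pushing the result of dispatchToolCall(call, toolFunctions) onto messages.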

Do I need TypeScript to build an AI agent in Node.js?

No. Every example in this guide works in plain JavaScript with ES modules. TypeScript helps catch tool schema errors at compile time and integrates more cleanly with the OpenAI Agents SDK, but it's a preference, not a requirement. Start in JavaScript if that's what you know, and migrate later.

Which model should I use for a production agent?

For most production agents in 2026, GPT-4.1 or Claude Sonnet hit the best balance of cost, reasoning, and tool-calling reliability. GPT-4.1-mini works well for simpler agents where each tool call is straightforward. Save GPT-5 or Claude Opus for agents that require deep multi-step reasoning or complex planning, where the extra cost is justified by fewer iterations.

How do I stream agent responses to a browser in Node.js?

Use Server-Sent Events or the Vercel AI SDK. The major LLM providers support streaming responses, and you can forward chunks to the browser as they arrive. The Vercel AI SDK handles this with a single useChat hook on the React side, which also renders partial tool calls so users see the agent's progress rather than waiting for the final answer.
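If you take the Server-Sent Events route by hand, the server side is mostly a matter of writing correctly framed text. The sketch below assumes your agent loop exposes its progress as an async iterable of events; sseFrame, streamAgentEvents, demoEvents, and the event names are all hypothetical.

```javascript
// Format one Server-Sent Events frame: an event name plus a JSON payload.
// JSON.stringify never emits raw newlines, so the payload fits on one data line.
function sseFrame(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Write agent progress events to an HTTP response as they arrive.
// `res` is anything with writeHead/write/end, such as a Node http.ServerResponse.
async function streamAgentEvents(res, events) {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive"
  });
  for await (const { type, payload } of events) {
    res.write(sseFrame(type, payload));
  }
  res.write(sseFrame("done", {}));
  res.end();
}

// Hypothetical event source: a real agent loop would yield these as it runs.
async function* demoEvents() {
  yield { type: "tool_call", payload: { name: "get_weather", args: { city: "Austin" } } };
  yield { type: "message", payload: { text: "It is sunny in Austin." } };
}
```

On the browser side, an EventSource pointed at this endpoint can register a listener per event name (tool_call, message, done) and update the UI as each frame arrives.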

Can I run a JavaScript agent on serverless or edge functions?

Yes, with caveats. Edge runtimes like Vercel Edge Functions and Cloudflare Workers support the OpenAI SDK and fetch natively, but they enforce execution limits that vary by platform and plan (often on the order of 30-60 seconds), so check your provider's current numbers. For longer-running agent loops, use a traditional Node.js deployment or break the agent into shorter steps with durable state storage between iterations.
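Breaking the agent into shorter steps means each invocation runs one iteration, saves the conversation, and exits. The sketch below illustrates the idea with an in-memory Map standing in for durable storage and a scripted function standing in for the model; every name in it (runStore, startRun, stepRun, scriptedModel) is hypothetical.

```javascript
// In-memory stand-in for durable storage (hypothetical; use a database or KV store in practice).
const runStore = new Map();
const saveRun = async (id, state) => { runStore.set(id, JSON.stringify(state)); };
const loadRun = async (id) => JSON.parse(runStore.get(id) ?? "null");

async function startRun(id, userMessage) {
  await saveRun(id, { messages: [{ role: "user", content: userMessage }], iterations: 0, done: false });
}

// Execute a single loop iteration, persist the result, and return it.
async function stepRun(id, model, tools, maxIterations = 10) {
  const state = await loadRun(id);
  if (!state) throw new Error(`Unknown run: ${id}`);
  if (state.done) return state;
  if (state.iterations >= maxIterations) throw new Error("Run exceeded max iterations");

  const reply = await model(state.messages);
  state.messages.push(reply);
  state.iterations += 1;

  if (!reply.tool_calls?.length) {
    state.done = true;
    state.answer = reply.content;
  } else {
    for (const call of reply.tool_calls) {
      const result = await tools[call.name](call.args);
      state.messages.push({ role: "tool", tool_call_id: call.id, content: String(result) });
    }
  }
  await saveRun(id, state);
  return state;
}

// Scripted stand-in for the model (hypothetical): one tool call, then an answer.
async function scriptedModel(messages) {
  const last = messages[messages.length - 1];
  return last.role === "tool"
    ? { role: "assistant", content: `Result: ${last.content}` }
    : { role: "assistant", content: null, tool_calls: [{ id: "call_1", name: "double", args: { n: 21 } }] };
}
```

Each call to stepRun fits comfortably inside a short-lived function invocation. A queue, cron trigger, or client poll can keep calling it until the returned state has done set to true.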

What's the difference between an AI agent and a chatbot?

A chatbot takes a message and returns a message. An agent takes a goal, reasons about how to achieve it, calls tools to take action, observes results, and loops until done. The structural difference is the loop and the tools. The practical difference is capability: a chatbot can answer questions about weather, an agent can check the forecast, book a flight, and email you the confirmation.
