Use Fleet agents in code - Docs by LangChain

There are two main ways to use Fleet agents programmatically:

Call from code: Invoke your agent remotely via the LangGraph SDK or REST API, without downloading anything.
Export to code: Download your agent’s configuration and run it locally as a self-contained Python project using the fleet-deepagents-export package.

Call from code

You can invoke LangSmith Fleet agents from your applications using the LangGraph SDK or the REST API. Fleet agents run on Agent Server, so you can use the same API methods as any other LangSmith deployment. The REST API lets you call your agent from any language or platform that supports HTTP requests.

Prerequisites

A LangSmith account with a Fleet agent
A Personal Access Token (PAT) for authentication
(SDK only) The LangGraph SDK installed:

pip install langgraph-sdk python-dotenv

Authentication

To authenticate with your agent’s Fleet deployment, provide a LangSmith Personal Access Token (PAT) to the api_key argument when instantiating the LangGraph SDK client, or via the X-API-Key header. If using X-API-Key, you must also set the X-Auth-Scheme header to langsmith-api-key. If the PAT you pass is not tied to the owner of the agent, your request will be rejected with a 404 Not Found error. If the agent you’re trying to invoke is a and you’re not the owner, you can perform all the same operations as you would in the UI (read-only).
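
For callers that use the REST API directly, the header requirements above can be sketched as a small helper. The function name `build_auth_headers` is hypothetical and not part of the LangGraph SDK; only the header names and the `langsmith-api-key` scheme value come from this page.

```python
def build_auth_headers(pat: str) -> dict[str, str]:
    """Return the headers a raw REST request needs (hypothetical helper)."""
    if not pat:
        raise ValueError("A LangSmith Personal Access Token (PAT) is required")
    return {
        "Content-Type": "application/json",
        "X-Api-Key": pat,  # the PAT itself
        "X-Auth-Scheme": "langsmith-api-key",  # required alongside X-Api-Key
    }
```

The SDK clients take the PAT through `api_key` instead, so a helper like this is only relevant when you issue HTTP requests yourself.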

1. Get the agent ID and URL

To get your agent’s agent_id and api_url:

In the LangSmith UI, navigate to your agent’s inbox.
Next to the agent name, click the Edit Agent icon.
Click the Settings icon in the top right corner.
Click View code snippets to see pre-populated values for your agent.

Copy the code below and replace agent_id and api_url with the values from your agent’s code snippets. Create a .env file in your project root with your Personal Access Token:

.env

LANGGRAPH_API_KEY=your-personal-access-token
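
If the variable is missing, the SDK examples on this page would pass `None` as the API key. One possible guard, as a sketch only: the helper name `require_api_key` is an assumption, and it expects `load_dotenv()` to have been called beforehand so that the `.env` values are in the environment.

```python
import os


def require_api_key(env_var: str = "LANGGRAPH_API_KEY") -> str:
    """Return the token stored in `env_var`, or fail fast if it is unset."""
    value = os.getenv(env_var, "").strip()
    if not value:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file")
    return value
```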

2. Fetch agent configuration

Verify your connection by fetching your agent’s configuration:

Python
TypeScript
cURL

import os
from dotenv import load_dotenv
from langgraph_sdk.client import get_client

load_dotenv()

agent_id = "your-agent-id"

api_key = os.getenv("LANGGRAPH_API_KEY")
api_url = "https://<AGENT-BUILDER-URL>.us.langgraph.app"

client = get_client(
    url=api_url,
    api_key=api_key,
    headers={
        "X-Auth-Scheme": "langsmith-api-key",
    },
)

async def get_assistant(agent_id: str):
    agent = await client.assistants.get(agent_id)
    print(agent)

if __name__ == "__main__":
    import asyncio
    asyncio.run(get_assistant(agent_id))

import "dotenv/config";
import { Client } from "@langchain/langgraph-sdk";

const agentId = "your-agent-id";

const apiKey = process.env.LANGGRAPH_API_KEY;
const apiUrl = "https://<AGENT-BUILDER-URL>.us.langgraph.app";

const client = new Client({
  apiUrl,
  apiKey,
  defaultHeaders: {
    "X-Auth-Scheme": "langsmith-api-key",
  },
});

async function main(agentId: string) {
  const agent = await client.assistants.get(agentId);
  console.log(agent);
}

main(agentId).catch(console.error);

curl --request GET \
    --url "https://<AGENT-BUILDER-URL>.us.langgraph.app/assistants/your-agent-id" \
    --header 'Content-Type: application/json' \
    --header 'X-Api-Key: your-personal-access-token' \
    --header 'X-Auth-Scheme: langsmith-api-key'

Use a Personal Access Token (PAT) tied to your LangSmith account. Set the X-Auth-Scheme header to langsmith-api-key for authentication.

3. Invoke agent

The examples below show how to send a message to your agent and receive a response. You can use either a stateless run (no thread, no conversation history) or a stateful run (with a thread to maintain conversation history across multiple turns).

Stateless run

A stateless run sends a single request and returns the full response. No conversation history is persisted. This is the simplest way to call your agent:

Python
TypeScript
cURL

import os
from dotenv import load_dotenv
from langgraph_sdk.client import get_client

load_dotenv()

agent_id = "your-agent-id"

api_key = os.getenv("LANGGRAPH_API_KEY")
api_url = "https://<AGENT-BUILDER-URL>.us.langgraph.app"

client = get_client(
    url=api_url,
    api_key=api_key,
    headers={
        "X-Auth-Scheme": "langsmith-api-key",
    },
)

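async def main():
    # Passing None as the thread ID makes this a stateless run.
    result = await client.runs.wait(
        None,
        agent_id,
        input={
            "messages": [
                {"role": "user", "content": "What can you help me with?"}
            ]
        },
    )
    print(result)

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())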

import "dotenv/config";
import { Client } from "@langchain/langgraph-sdk";

const agentId = "your-agent-id";

const apiKey = process.env.LANGGRAPH_API_KEY;
const apiUrl = "https://<AGENT-BUILDER-URL>.us.langgraph.app";

const client = new Client({
  apiUrl,
  apiKey,
  defaultHeaders: {
    "X-Auth-Scheme": "langsmith-api-key",
  },
});

const result = await client.runs.wait(
  null,
  agentId,
  {
    input: {
      messages: [
        { role: "user", content: "What can you help me with?" }
      ]
    }
  }
);
console.log(result);

curl --request POST \
    --url "https://<AGENT-BUILDER-URL>.us.langgraph.app/runs/wait" \
    --header 'Content-Type: application/json' \
    --header 'X-Api-Key: your-personal-access-token' \
    --header 'X-Auth-Scheme: langsmith-api-key' \
    --data '{
        "assistant_id": "your-agent-id",
        "input": {
            "messages": [
                {
                    "role": "user",
                    "content": "What can you help me with?"
                }
            ]
        }
    }'

Stateless streaming run

To stream the response as it is generated rather than waiting for the full result, use the streaming endpoint:

Python
TypeScript
cURL

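# Reuses `client` and `agent_id` from the stateless run example; run this inside an async function.
async for chunk in client.runs.stream(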
    None,
    agent_id,
    input={
        "messages": [
            {"role": "user", "content": "What can you help me with?"}
        ]
    },
    stream_mode="updates",
):
    if chunk.data and "run_id" not in chunk.data:
        print(chunk.data)

const streamResponse = client.runs.stream(
  null,
  agentId,
  {
    input: {
      messages: [
        { role: "user", content: "What can you help me with?" }
      ]
    },
    streamMode: "updates"
  }
);
for await (const chunk of streamResponse) {
  if (chunk.data && !("run_id" in chunk.data)) {
    console.log(chunk.data);
  }
}

curl --request POST \
    --url "https://<AGENT-BUILDER-URL>.us.langgraph.app/runs/stream" \
    --header 'Content-Type: application/json' \
    --header 'X-Api-Key: your-personal-access-token' \
    --header 'X-Auth-Scheme: langsmith-api-key' \
    --data '{
        "assistant_id": "your-agent-id",
        "input": {
            "messages": [
                {
                    "role": "user",
                    "content": "What can you help me with?"
                }
            ]
        },
        "stream_mode": [
            "updates"
        ]
    }'
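
The filter used in the streaming examples can be isolated and tried without a live deployment. The sketch below assumes only that each chunk exposes `event` and `data` attributes, as the examples above do; `FakeChunk`, `keep_updates`, and the sample payloads are hypothetical stand-ins, not SDK types or real agent output.

```python
from typing import Any, Iterable, NamedTuple


class FakeChunk(NamedTuple):
    """Stand-in for an SDK stream chunk; only `event` and `data` are modeled."""

    event: str
    data: Any


def keep_updates(chunks: Iterable[Any]) -> list[dict]:
    """Drop empty chunks and any chunk whose data carries a `run_id` key."""
    return [c.data for c in chunks if c.data and "run_id" not in c.data]


sample = [
    FakeChunk("metadata", {"run_id": "hypothetical-run-id"}),
    FakeChunk("updates", {"model": {"messages": ["hi"]}}),
    FakeChunk("updates", None),
]
print(keep_updates(sample))
```

Only the second chunk survives the filter, which mirrors the `if chunk.data and "run_id" not in chunk.data` condition in the Python example.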

Stateful run with a thread

To maintain conversation history across multiple interactions, first create a thread and then run your agent on it. Each subsequent run on the same thread has access to the full message history:

Python
TypeScript
cURL

import os
from dotenv import load_dotenv
from langgraph_sdk.client import get_client

load_dotenv()

agent_id = "your-agent-id"

api_key = os.getenv("LANGGRAPH_API_KEY")
api_url = "https://<AGENT-BUILDER-URL>.us.langgraph.app"

client = get_client(
    url=api_url,
    api_key=api_key,
    headers={
        "X-Auth-Scheme": "langsmith-api-key",
    },
)

async def main():
    # Create a thread so both runs share the same conversation history.
    thread = await client.threads.create()

    async for chunk in client.runs.stream(
        thread["thread_id"],
        agent_id,
        input={
            "messages": [
                {"role": "user", "content": "Hi, my name is Alice."}
            ]
        },
        stream_mode="updates",
    ):
        if chunk.data and "run_id" not in chunk.data:
            print(chunk.data)

    # Second turn on the same thread; the agent can see the first message.
    async for chunk in client.runs.stream(
        thread["thread_id"],
        agent_id,
        input={
            "messages": [
                {"role": "user", "content": "What is my name?"}
            ]
        },
        stream_mode="updates",
    ):
        if chunk.data and "run_id" not in chunk.data:
            print(chunk.data)

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

import "dotenv/config";
import { Client } from "@langchain/langgraph-sdk";

const agentId = "your-agent-id";

const apiKey = process.env.LANGGRAPH_API_KEY;
const apiUrl = "https://<AGENT-BUILDER-URL>.us.langgraph.app";

const client = new Client({
  apiUrl,
  apiKey,
  defaultHeaders: {
    "X-Auth-Scheme": "langsmith-api-key",
  },
});

const thread = await client.threads.create();

let streamResponse = client.runs.stream(
  thread["thread_id"],
  agentId,
  {
    input: {
      messages: [
        { role: "user", content: "Hi, my name is Alice." }
      ]
    },
    streamMode: "updates"
  }
);
for await (const chunk of streamResponse) {
  if (chunk.data && !("run_id" in chunk.data)) {
    console.log(chunk.data);
  }
}

streamResponse = client.runs.stream(
  thread["thread_id"],
  agentId,
  {
    input: {
      messages: [
        { role: "user", content: "What is my name?" }
      ]
    },
    streamMode: "updates"
  }
);
for await (const chunk of streamResponse) {
  if (chunk.data && !("run_id" in chunk.data)) {
    console.log(chunk.data);
  }
}

First, create a thread:

curl --request POST \
    --url "https://<AGENT-BUILDER-URL>.us.langgraph.app/threads" \
    --header 'Content-Type: application/json' \
    --header 'X-Api-Key: your-personal-access-token' \
    --header 'X-Auth-Scheme: langsmith-api-key' \
    --data '{}'

Use the thread_id from the response to send messages on the thread:

curl --request POST \
    --url "https://<AGENT-BUILDER-URL>.us.langgraph.app/threads/<THREAD_ID>/runs/stream" \
    --header 'Content-Type: application/json' \
    --header 'X-Api-Key: your-personal-access-token' \
    --header 'X-Auth-Scheme: langsmith-api-key' \
    --data '{
        "assistant_id": "your-agent-id",
        "input": {
            "messages": [
                {
                    "role": "user",
                    "content": "Hi, my name is Alice."
                }
            ]
        },
        "stream_mode": [
            "updates"
        ]
    }'

Send a follow-up message on the same thread:

curl --request POST \
    --url "https://<AGENT-BUILDER-URL>.us.langgraph.app/threads/<THREAD_ID>/runs/stream" \
    --header 'Content-Type: application/json' \
    --header 'X-Api-Key: your-personal-access-token' \
    --header 'X-Auth-Scheme: langsmith-api-key' \
    --data '{
        "assistant_id": "your-agent-id",
        "input": {
            "messages": [
                {
                    "role": "user",
                    "content": "What is my name?"
                }
            ]
        },
        "stream_mode": [
            "updates"
        ]
    }'

REST API reference

The table below summarizes the key endpoints. Replace <API_URL> with your agent’s deployment URL.

Operation	Method	Endpoint
Get agent info	`GET`	`<API_URL>/assistants/<AGENT_ID>`
Create a thread	`POST`	`<API_URL>/threads`
Run (wait for result)	`POST`	`<API_URL>/runs/wait`
Run (streaming)	`POST`	`<API_URL>/runs/stream`
Run on thread (wait)	`POST`	`<API_URL>/threads/<THREAD_ID>/runs/wait`
Run on thread (streaming)	`POST`	`<API_URL>/threads/<THREAD_ID>/runs/stream`

All endpoints require the following headers:

Content-Type: application/json
X-Api-Key: your Personal Access Token
X-Auth-Scheme: langsmith-api-key

For the full API specification, see the Agent Server API reference.
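
For environments without the SDK, the endpoints and headers in this section can be combined using only the Python standard library. This is a minimal sketch, not an official client: `build_run_request` is a hypothetical helper, and it builds the request without sending it so the URL and payload can be inspected first.

```python
import json
import urllib.request


def build_run_request(api_url, agent_id, pat, message, thread_id=None):
    """Build a POST request for /runs/wait, or the thread variant if given."""
    path = f"/threads/{thread_id}/runs/wait" if thread_id else "/runs/wait"
    body = {
        "assistant_id": agent_id,
        "input": {"messages": [{"role": "user", "content": message}]},
    }
    return urllib.request.Request(
        url=api_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": pat,
            "X-Auth-Scheme": "langsmith-api-key",
        },
    )


# To send the request (requires a real deployment URL and PAT):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Passing a `thread_id` switches the path to the stateful endpoint from the table; omitting it produces a stateless run.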

Export to code

The Export to code feature lets you download your Fleet agent as a self-contained Python project and run it locally. This is useful when you want to:

Run your agent in your own infrastructure without calling the Fleet API
Extend or customize the agent beyond what the Fleet UI supports (add custom tools, middleware, or skills)
Inspect or version-control the full agent implementation
Use LangGraph Studio for local development and graph inspection

The fleet-deepagents-export package (GitHub) handles reading the exported configuration and wiring up your agent with MCP tools, subagents, and skills.

Prerequisites

Python 3.11+
uv (recommended) for dependency management
A LangSmith Fleet agent to export

1. Copy the starter project

The starter project at examples/template-agent/ is the recommended starting point. Clone the repo and copy the starter:

git clone https://github.com/langchain-ai/fleet-deepagents-export.git
cp -R fleet-deepagents-export/examples/template-agent my-agent
cd my-agent

2. Export your agent from Fleet

In the LangSmith UI, open your agent and export it as a .zip file.

Then drop the contents into the fleet/ directory of your starter project:

unzip path/to/my-export.zip -d fleet/

The fleet/ directory contains everything your agent needs:

AGENTS.md — system prompt
config.json — model configuration and workspace metadata
tools.json — MCP server connections
subagents/ (optional) — subagent definitions
skills/ (optional) — skill instructions
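
Before running the agent, it can help to confirm that the unzip produced the expected layout. The check below is a sketch based only on the file list above; `missing_fleet_files` is a hypothetical helper and is not part of the fleet-deepagents-export package.

```python
from pathlib import Path

# Files listed above without an "(optional)" marker.
REQUIRED_FILES = ("AGENTS.md", "config.json", "tools.json")


def missing_fleet_files(fleet_dir="fleet") -> list[str]:
    """Return the required export files that are absent from `fleet_dir`."""
    root = Path(fleet_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means all three non-optional files are present; `subagents/` and `skills/` are not checked because they are optional.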

3. Configure your environment

Copy the example env file and fill in the required values:

cp .env.example .env

The three LANGSMITH_*_ID values are in fleet/config.json under metadata. Open that file and copy tenant_id, organization_id, and ls_user_id into your .env:

.env

# Model provider — set the key for whichever provider your agent uses
ANTHROPIC_API_KEY=your-anthropic-api-key

# LangSmith credentials — copy IDs from fleet/config.json → metadata
LANGSMITH_API_KEY=your-langsmith-pat
LANGSMITH_TENANT_ID=your-tenant-id
LANGSMITH_ORGANIZATION_ID=your-organization-id
LANGSMITH_USER_ID=your-user-id       # required if your agent uses OAuth tools

# Built-in MCP tools (Gmail, Calendar, GitHub)
BUILTIN_MCP_URL=https://tools.langchain.com/mcp
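
Copying the three IDs by hand is easy to get wrong, so the mapping described above can be scripted. This is a sketch under the stated assumption that `fleet/config.json` holds `tenant_id`, `organization_id`, and `ls_user_id` under a top-level `metadata` key; the function names are hypothetical.

```python
import json
from pathlib import Path

# Metadata key in fleet/config.json -> variable name in .env
METADATA_TO_ENV = {
    "tenant_id": "LANGSMITH_TENANT_ID",
    "organization_id": "LANGSMITH_ORGANIZATION_ID",
    "ls_user_id": "LANGSMITH_USER_ID",
}


def env_lines_from_config(config: dict) -> list[str]:
    """Turn the `metadata` IDs into KEY=value lines, skipping absent keys."""
    metadata = config.get("metadata", {})
    return [
        f"{env_name}={metadata[key]}"
        for key, env_name in METADATA_TO_ENV.items()
        if key in metadata
    ]


def env_lines_from_file(path="fleet/config.json") -> list[str]:
    """Read the exported config from disk and build the .env lines."""
    return env_lines_from_config(json.loads(Path(path).read_text()))
```

Printing the result of `env_lines_from_file()` gives lines that can be pasted into `.env` next to the API keys.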

4. Install dependencies and run

make setup    # installs dependencies via uv sync

Then choose how to interact with your agent:

make dev    # LangGraph Studio — browser UI for chat and graph inspection
make run    # terminal REPL via cli.py — text-only chat

5. Customize the agent

The starter separates Fleet-owned files from files you own and can freely edit:

File / Directory	Owner	Purpose
`fleet/`	Fleet	Drop export contents here. Re-unzip to update; nothing else is touched.
`agent.py`	You	Graph wiring. Override the model by replacing the `model = components.pop("model")` line.
`custom_tools.py`	You	Add code-defined tools; merged with Fleet MCP tools at runtime.
`custom_middleware.py`	You	Add `AgentMiddleware` instances for logging, filters, pre/post hooks, etc.
`custom_skills/`	You	Drop `<skill-name>/SKILL.md` files; layered on top of `fleet/skills/`.
`cli.py`	You	Terminal REPL; edit freely.

Here is the full agent.py from the starter:

"""Standalone deepagent exported from LangSmith Fleet.

LangGraph Studio / dev server:  make dev
Terminal:                        make run  (see cli.py)

Extension points (edit these, not this file):
- ``custom_tools.py``      — add code-defined tools
- ``custom_middleware.py`` — wrap the agent loop with logging, filters, etc.
- ``custom_skills/``       — drop ``<skill-name>/SKILL.md`` files
"""

from __future__ import annotations

from pathlib import Path
from typing import Any

from dotenv import load_dotenv

load_dotenv()

from custom_middleware import custom_middleware
from custom_tools import custom_tools
from deepagents import create_deep_agent
from fleet_deepagents_export import StaticSkillsLoader, load_agent_components

PROJECT_DIR = Path(__file__).parent
FLEET_DIR = PROJECT_DIR / "fleet"
CUSTOM_SKILLS_DIR = PROJECT_DIR / "custom_skills"

# Read SKILL.md from disk once; middleware injects into state on first turn.
_SKILL_LOADER = StaticSkillsLoader(
    [
        (FLEET_DIR / "skills", "/skills/fleet"),
        (CUSTOM_SKILLS_DIR, "/skills/custom"),
    ]
)


async def graph(runtime: Any):
    """Build and return the agent graph."""
    components = await load_agent_components(FLEET_DIR)
    model = components.pop("model")  # from fleet/config.json; replace to override
    components["tools"] = list(components["tools"]) + list(custom_tools)

    if _SKILL_LOADER.files:
        components["skills"] = _SKILL_LOADER.skill_paths

    return create_deep_agent(
        model=model,
        middleware=[_SKILL_LOADER, *custom_middleware],
        **components,
    ).with_config({"recursion_limit": 1000})
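
Because `agent.py` appends `list(custom_tools)` to the Fleet tools, `custom_tools.py` only needs to expose an iterable named `custom_tools`. The sketch below assumes that a plain Python function with type hints and a docstring is accepted as a tool; the starter's actual `custom_tools.py` may use a different form, and `word_count` is a hypothetical example tool.

```python
"""custom_tools.py: code-defined tools merged with Fleet MCP tools (sketch)."""


def word_count(text: str) -> int:
    """Count the whitespace-separated words in `text`."""
    return len(text.split())


# agent.py appends this iterable to the tools loaded from fleet/.
custom_tools = [word_count]
```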

Re-exporting

When you export a new version of your agent from Fleet, simply wipe and re-unzip — your customizations are untouched:

rm -rf fleet && unzip path/to/my-new-export.zip -d fleet/
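
On platforms without `rm` and `unzip`, the same wipe-and-extract step can be sketched with the Python standard library. `refresh_fleet_dir` is a hypothetical helper name, and like the shell command it deletes everything under the target directory before extracting.

```python
import shutil
import zipfile
from pathlib import Path


def refresh_fleet_dir(export_zip, fleet_dir="fleet") -> list[str]:
    """Replace `fleet_dir` with the contents of `export_zip`."""
    target = Path(fleet_dir)
    if target.exists():
        shutil.rmtree(target)  # wipe the previous export
    target.mkdir(parents=True)
    with zipfile.ZipFile(export_zip) as archive:
        archive.extractall(target)
    # Return the top-level entries so the caller can confirm the result.
    return sorted(p.name for p in target.iterdir())
```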

Supported model providers

The starter ships with langchain-anthropic, langchain-openai, and langchain-google-genai. For any other provider (e.g. bedrock, fireworks), add the matching langchain-<provider> package to pyproject.toml.

MCP authentication

At startup, each tool’s mcp_server_url is resolved against LangSmith’s MCP server registry:

Built-in LangSmith tools (Gmail, Calendar, GitHub) — authenticated via your LANGSMITH_API_KEY.
Static-credential servers (auth_type: "headers") — credentials come from the registry record. Requires mcp-servers:invoke permission.
OAuth servers (auth_type: "oauth") — bearer token fetched from LangSmith’s OAuth broker. A browser window opens on first run for any per-user server that hasn’t been authorized yet.
