How to add billing to a LangChain agent


Adding billing to a LangChain agent takes two calls: one before the chain runs, one after. No middleware, no monkey-patching. Works with any LangChain component — LCEL chains, AgentExecutor, RetrievalQA, or custom runnables.

Install

pip install agentbill-sdk langchain-openai langchain-core

Pattern 1 — Manual preflight + record

The explicit pattern. Check budget before the chain runs, record units after it completes.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from agentbill import AgentBillClient

# ceiling=50: block any run estimated at more than 50 units
client = AgentBillClient(api_key="agb_your_key", ceiling=50)

def run_research_agent(customer_id: str, topic: str) -> str:
    # 1. Preflight — block before any tokens are consumed
    check = client.preflight(
        agent_id="research_chain",
        estimated_units=10,
        customer_id=customer_id
    )
    if not check.approved:
        raise RuntimeError(f"Blocked for {customer_id}: {check.reason}")

    # 2. Run the LangChain chain normally (LCEL syntax)
    llm = ChatOpenAI(model="gpt-4o")
    prompt = ChatPromptTemplate.from_template("Research this topic in depth: {topic}")
    chain = prompt | llm
    result = chain.invoke({"topic": topic})

    # 3. Record the units actually consumed (here, matching the estimate)
    client.record(agent_id="research_chain", units=10, customer_id=customer_id)
    return result.content
      

Pattern 2 — @gate decorator (cleanest)

The @client.gate() decorator handles preflight and record automatically. Zero boilerplate inside the function.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from agentbill import AgentBillClient

client = AgentBillClient(api_key="agb_your_key", ceiling=50)

@client.gate(agent_id="research_chain", estimated_units=10, customer_id="user_123")
def run_research_agent(topic: str) -> str:
    llm = ChatOpenAI(model="gpt-4o")
    prompt = ChatPromptTemplate.from_template("Research: {topic}")
    chain = prompt | llm
    return chain.invoke({"topic": topic}).content

# preflight runs before, record runs after — automatically
result = run_research_agent("quantum computing")
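Under the hood, a gate-style decorator is just preflight before the call and record after it. A minimal sketch of that mechanism, assuming only the call shapes shown in this guide; the stub client and this reimplementation of gate are illustrative, not the SDK's source:

```python
import functools

class StubClient:
    """Illustrative stand-in for AgentBillClient; logs calls so the order is visible."""
    def __init__(self):
        self.calls = []

    def preflight(self, **kwargs):
        self.calls.append("preflight")
        return type("Result", (), {"approved": True})()

    def record(self, **kwargs):
        self.calls.append("record")

    def gate(self, **billing_kwargs):
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                check = self.preflight(**billing_kwargs)
                if not check.approved:
                    raise RuntimeError("blocked by preflight")
                result = fn(*args, **kwargs)   # the wrapped chain runs here
                self.record(**billing_kwargs)  # recorded only after success
                return result
            return wrapper
        return decorator

client = StubClient()

@client.gate(agent_id="research_chain", estimated_units=10, customer_id="user_123")
def run_research_agent(topic: str) -> str:
    client.calls.append("chain")  # stands in for the LangChain invocation
    return f"report on {topic}"

run_research_agent("quantum computing")
print(client.calls)  # ['preflight', 'chain', 'record']
```

The useful property to notice: if the wrapped function raises, record never runs, so a failed chain invocation is not billed.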
      

Pattern 3 — Mid-run checkpoint for long chains

For agents that run many steps, use checkpoint() to enforce a ceiling mid-run. The agent is blocked if it has already consumed too many units.

from agentbill import AgentBillClient

client = AgentBillClient(api_key="agb_your_key")

def run_multi_step_agent(customer_id: str, tasks: list) -> list:
    check = client.preflight(agent_id="multi_step", estimated_units=len(tasks), customer_id=customer_id)
    if not check.approved:
        raise RuntimeError(f"Blocked for {customer_id}: {check.reason}")

    results = []
    for i, task in enumerate(tasks):
        result = run_single_task(task)  # placeholder for your per-task work
        results.append(result)

        # Check mid-run — stop if ceiling is hit
        cp = client.checkpoint(
            agent_id="multi_step",
            units_so_far=i + 1,
            ceiling=20,
            customer_id=customer_id
        )
        if not cp.approved:
            break  # stopped early — no runaway cost

    client.record(agent_id="multi_step", units=len(results), customer_id=customer_id)
    return results
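The early-stop behavior can be simulated end to end. A minimal sketch, assuming only the call shapes used above; StubClient and its fixed ceiling of 3 are illustrative stand-ins for the real AgentBillClient:

```python
from dataclasses import dataclass

@dataclass
class CheckpointResult:
    approved: bool

class StubClient:
    """Illustrative stand-in for AgentBillClient so the sketch runs without the SDK."""
    def preflight(self, **kwargs):
        return CheckpointResult(approved=True)

    def checkpoint(self, units_so_far, ceiling, **kwargs):
        # Approve only while cumulative units stay under the ceiling
        return CheckpointResult(approved=units_so_far < ceiling)

    def record(self, **kwargs):
        pass

client = StubClient()

def run_multi_step_agent(customer_id, tasks):
    client.preflight(agent_id="multi_step", estimated_units=len(tasks), customer_id=customer_id)
    results = []
    for i, task in enumerate(tasks):
        results.append(f"done: {task}")  # stands in for run_single_task(task)
        cp = client.checkpoint(agent_id="multi_step", units_so_far=i + 1,
                               ceiling=3, customer_id=customer_id)
        if not cp.approved:
            break  # ceiling hit, stop before the next task runs
    client.record(agent_id="multi_step", units=len(results), customer_id=customer_id)
    return results

results = run_multi_step_agent("user_123", [f"task-{n}" for n in range(10)])
print(len(results))  # 3, not 10: the run stopped at the ceiling
```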
      

Error handling

from agentbill import AgentBillClient, BudgetExhaustedError, CeilingExceededError, FreeTierExceededError

def handle_request(customer_id: str, topic: str) -> dict:
    try:
        return {"result": run_research_agent(customer_id, topic)}
    except CeilingExceededError:
        return {"error": "run exceeds your per-request ceiling"}
    except BudgetExhaustedError:
        return {"error": "customer budget exhausted — top up to continue"}
    except FreeTierExceededError as e:
        return {"error": "free tier limit reached", "upgrade_url": e.upgrade_url}
      

Works with any LangChain component

AgentBill wraps at the invocation level — it doesn't care what's inside the chain. Use it with:

LLMChain, AgentExecutor, RetrievalQA, ConversationChain, and LangGraph.
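Because the calls sit at the invocation boundary, one helper can wrap anything that exposes invoke(). A sketch under the same assumptions as above; billed_invoke and EchoChain are illustrative names, and the stub stands in for AgentBillClient:

```python
from dataclasses import dataclass

@dataclass
class PreflightResult:
    approved: bool
    reason: str = ""

class StubClient:
    """Illustrative stand-in for AgentBillClient so the sketch runs without the SDK."""
    def preflight(self, **kwargs):
        return PreflightResult(approved=True)

    def record(self, **kwargs):
        pass

def billed_invoke(client, runnable, payload, *, agent_id, estimated_units, customer_id):
    """Wrap any object with .invoke(), be it a chain, an AgentExecutor, or a graph."""
    check = client.preflight(agent_id=agent_id, estimated_units=estimated_units,
                             customer_id=customer_id)
    if not check.approved:
        raise RuntimeError(f"Blocked for {customer_id}: {check.reason}")
    result = runnable.invoke(payload)
    client.record(agent_id=agent_id, units=estimated_units, customer_id=customer_id)
    return result

# Any object with .invoke() works; here a trivial stand-in "chain"
class EchoChain:
    def invoke(self, payload):
        return f"researched: {payload['topic']}"

out = billed_invoke(StubClient(), EchoChain(), {"topic": "quantum computing"},
                    agent_id="research", estimated_units=10, customer_id="alice")
print(out)  # researched: quantum computing
```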

Per-customer billing

Pass customer_id to enforce separate budgets per user. Each customer has their own usage counters and free tier allowance.

# Different customers — isolated budgets
check_alice = client.preflight(agent_id="research", estimated_units=10, customer_id="alice")
check_bob   = client.preflight(agent_id="research", estimated_units=10, customer_id="bob")
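The isolation can be pictured with a stub that keeps a per-customer counter. This is illustrative only; the budget figure and the stub's internals are assumptions, not the SDK's implementation:

```python
class StubClient:
    """Illustrative stand-in for AgentBillClient with per-customer unit counters."""
    def __init__(self, budget_per_customer):
        self.budget = budget_per_customer
        self.used = {}  # customer_id -> units consumed so far

    def preflight(self, agent_id, estimated_units, customer_id):
        ok = self.used.get(customer_id, 0) + estimated_units <= self.budget
        return type("Result", (), {"approved": ok})()

    def record(self, agent_id, units, customer_id):
        self.used[customer_id] = self.used.get(customer_id, 0) + units

client = StubClient(budget_per_customer=15)
client.record(agent_id="research", units=10, customer_id="alice")  # alice has used 10

# alice's next 10-unit run would exceed her 15-unit budget; bob is unaffected
print(client.preflight("research", 10, "alice").approved)  # False
print(client.preflight("research", 10, "bob").approved)    # True
```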
      

LangGraph support

For LangGraph workflows, call preflight() before entering the graph and record() after the final node completes. Use checkpoint() inside nodes to enforce ceilings mid-graph.
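A sketch of that placement, with a plain Python loop standing in for a compiled LangGraph graph; everything here except the preflight/checkpoint/record call shapes is an assumption:

```python
class StubClient:
    """Illustrative stand-in for AgentBillClient so the placement sketch runs standalone."""
    def preflight(self, **kwargs):
        return type("Result", (), {"approved": True})()

    def checkpoint(self, units_so_far, ceiling, **kwargs):
        return type("Result", (), {"approved": units_so_far < ceiling})()

    def record(self, **kwargs):
        pass

client = StubClient()

def research_node(state: dict) -> dict:
    # Inside a node: enforce the ceiling mid-graph
    cp = client.checkpoint(agent_id="graph_agent", units_so_far=state["units"] + 1,
                           ceiling=20, customer_id=state["customer_id"])
    if not cp.approved:
        raise RuntimeError("ceiling hit mid-graph")
    return {**state, "units": state["units"] + 1, "steps": state["steps"] + ["research"]}

# 1. Preflight before entering the graph
client.preflight(agent_id="graph_agent", estimated_units=5, customer_id="user_123")

# 2. Run the graph (stand-in loop; a real graph would be graph.invoke(state))
state = {"customer_id": "user_123", "units": 0, "steps": []}
for _ in range(3):
    state = research_node(state)

# 3. Record after the final node completes
client.record(agent_id="graph_agent", units=state["units"], customer_id="user_123")
print(state["units"])  # 3
```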



Related guides

How to add a spend ceiling to an OpenAI agent
How to limit cost per agent run