
LangChain Integration

Overview

The LangChainCallback tracks costs across all LLM calls in a LangChain chain or agent. The budget is checked before each LLM call, and usage is logged after the call completes.

Setup

from langchain_openai import ChatOpenAI
from llm_toll import LangChainCallback

handler = LangChainCallback(project="my-chain", max_budget=10.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

# All calls through this LLM instance are tracked
result = llm.invoke("Hello")

Parameters

Parameter   Type                 Default    Description
project     str                  "default"  Project name for grouping usage
max_budget  float | None         None       Hard budget cap in USD
store       BaseStore | None     None       Custom store instance (defaults to shared store)
reporter    CostReporter | None  None       Custom reporter instance

Callback Methods

on_llm_start(serialized, prompts, **kwargs)

Called before the LLM executes. If max_budget is set, checks the accumulated cost for the project and raises BudgetExceededError if the budget is exceeded.
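
The pre-call check can be pictured with a minimal sketch. This is not llm_toll's implementation: `BudgetExceeded` and `check_budget` are hypothetical names, and the `>=` comparison is an assumption suggested by the error message format shown under Budget Enforcement.

```python
from typing import Optional


class BudgetExceeded(Exception):
    """Hypothetical stand-in for llm_toll's BudgetExceededError."""

    def __init__(self, current_cost: float, max_budget: float) -> None:
        super().__init__(f"{current_cost:.4f} >= {max_budget:.4f}")
        self.current_cost = current_cost
        self.max_budget = max_budget


def check_budget(current_cost: float, max_budget: Optional[float]) -> None:
    """Raise before the call if the accumulated cost has reached the cap."""
    if max_budget is None:
        return  # no cap configured, nothing to enforce
    if current_cost >= max_budget:  # assumed comparison
        raise BudgetExceeded(current_cost, max_budget)


check_budget(current_cost=3.20, max_budget=None)  # no cap: passes
check_budget(current_cost=3.20, max_budget=10.0)  # under the cap: passes
try:
    check_budget(current_cost=10.0, max_budget=10.0)  # cap reached: raises
except BudgetExceeded as exc:
    blocked = (exc.current_cost, exc.max_budget)
```

In this sketch a missing `max_budget` disables enforcement entirely, which matches the parameter's default of None.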

on_llm_end(response, **kwargs)

Called after a successful LLM completion. Extracts token usage from LangChain's LLMResult.llm_output:

  • Model name from llm_output["model_name"]
  • Input tokens from llm_output["token_usage"]["prompt_tokens"]
  • Output tokens from llm_output["token_usage"]["completion_tokens"]

Calculates cost and logs usage to the store.
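
As an illustration of the extraction step, the sketch below reads the three fields listed above from a plain dict shaped like `llm_output` and turns them into a cost. The price table, the `"example-model"` name, and the helper functions are hypothetical; llm_toll's own pricing data and internals may differ.

```python
from typing import Any, Dict, Tuple

# Hypothetical prices in USD per million tokens: (input, output).
PRICES_PER_MILLION: Dict[str, Tuple[float, float]] = {
    "example-model": (2.00, 8.00),
}


def extract_usage(llm_output: Dict[str, Any]) -> Tuple[str, int, int]:
    """Return (model name, input tokens, output tokens) from llm_output."""
    usage = llm_output.get("token_usage") or {}
    return (
        llm_output.get("model_name", "unknown"),
        int(usage.get("prompt_tokens", 0)),
        int(usage.get("completion_tokens", 0)),
    )


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Price the call using the hypothetical table above."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


llm_output = {
    "model_name": "example-model",
    "token_usage": {"prompt_tokens": 1200, "completion_tokens": 300},
}
model, tokens_in, tokens_out = extract_usage(llm_output)
cost = cost_usd(model, tokens_in, tokens_out)
```

The defensive `.get(...)` calls are a design choice of the sketch: some providers may omit `token_usage`, in which case the token counts fall back to zero rather than raising.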

on_llm_error(error, **kwargs)

Called after a failed LLM call. No-op -- failed calls are not tracked.

Budget Enforcement

Unlike the LiteLLM callback, the LangChain callback checks budget before each LLM call:

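from langchain_openai import ChatOpenAI
# Assumes BudgetExceededError is exported from the llm_toll package root
from llm_toll import BudgetExceededError, LangChainCallback

handler = LangChainCallback(project="my-chain", max_budget=5.0)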
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

try:
    result = llm.invoke("Hello")
except BudgetExceededError as e:
    print(f"Budget exceeded: ${e.current_cost:.4f} >= ${e.max_budget:.4f}")
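
Because the check runs before a call and the cost is recorded only after it, a call that starts under the cap can still push the total past it; the next call is the one that gets rejected. The toy simulation below illustrates that ordering. `ToyTracker` is a hypothetical stand-in and does not use llm_toll or LangChain.

```python
from typing import List


class ToyTracker:
    """Hypothetical tracker that mimics check-before, log-after ordering."""

    def __init__(self, max_budget: float) -> None:
        self.max_budget = max_budget
        self.spent = 0.0

    def call(self, cost: float) -> bool:
        """Return True if the call ran, False if it was blocked."""
        if self.spent >= self.max_budget:  # pre-call check (assumed comparison)
            return False
        self.spent += cost  # usage is logged only after the call completes
        return True


tracker = ToyTracker(max_budget=5.0)
# Four calls costing 2.0 each: the third starts at 4.0 (under the cap),
# runs, and brings the total to 6.0; the fourth is blocked.
results: List[bool] = [tracker.call(2.0) for _ in range(4)]
```

Under these assumptions the cap bounds when new calls may start, not the final total, so it can help to leave headroom for the most expensive single call.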

With LangChain Chains

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from llm_toll import LangChainCallback

handler = LangChainCallback(project="summarizer", max_budget=10.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

prompt = ChatPromptTemplate.from_template("Summarize: {text}")
chain = prompt | llm

result = chain.invoke({"text": "Long document here..."})

With LangChain Agents

from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent
from llm_toll import LangChainCallback

handler = LangChainCallback(project="agent", max_budget=20.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

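# `tools` (a list of tools) and `prompt` (a ReAct prompt template) are
# assumed to be defined elsewhere.
# All LLM calls made by the agent are tracked
agent = create_react_agent(llm, tools, prompt)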

Every LLM invocation within the agent's reasoning loop is tracked and counts toward the budget.