# LangChain Integration

## Overview

The `LangChainCallback` tracks costs across all LLM calls in a LangChain chain or agent. The budget is checked before each LLM call, and usage is logged after the call completes.
## Setup

```python
from langchain_openai import ChatOpenAI
from llm_toll import LangChainCallback

handler = LangChainCallback(project="my-chain", max_budget=10.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

# All calls through this LLM instance are tracked
result = llm.invoke("Hello")
```
## Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `project` | `str` | `"default"` | Project name for grouping usage |
| `max_budget` | `float \| None` | `None` | Hard budget cap in USD |
| `store` | `BaseStore \| None` | `None` | Custom store instance (defaults to shared store) |
| `reporter` | `CostReporter \| None` | `None` | Custom reporter instance |
## Callback Methods

### `on_llm_start(serialized, prompts, **kwargs)`

Called before the LLM executes. If `max_budget` is set, checks the accumulated cost for the project and raises `BudgetExceededError` if the budget has been exceeded.
### `on_llm_end(response, **kwargs)`

Called after a successful LLM completion. Extracts token usage from LangChain's `LLMResult.llm_output`:

- Model name from `llm_output["model_name"]`
- Input tokens from `llm_output["token_usage"]["prompt_tokens"]`
- Output tokens from `llm_output["token_usage"]["completion_tokens"]`

Calculates the cost and logs usage to the store.
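A minimal sketch of that extraction, assuming `llm_output` is the plain dict LangChain populates; the `extract_usage` helper name is hypothetical:

```python
# Illustrative helper pulling the three fields listed above out of
# LLMResult.llm_output. The function is hypothetical, for clarity only.

def extract_usage(llm_output: dict) -> tuple[str, int, int]:
    model_name = llm_output["model_name"]
    usage = llm_output["token_usage"]
    return model_name, usage["prompt_tokens"], usage["completion_tokens"]
```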
### `on_llm_error(error, **kwargs)`

Called after a failed LLM call. This is a no-op: failed calls are not tracked.
## Budget Enforcement

Unlike the LiteLLM callback, the LangChain callback checks the budget before each LLM call:

```python
from langchain_openai import ChatOpenAI
from llm_toll import BudgetExceededError, LangChainCallback

handler = LangChainCallback(project="my-chain", max_budget=5.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

try:
    result = llm.invoke("Hello")
except BudgetExceededError as e:
    print(f"Budget exceeded: ${e.current_cost:.4f} >= ${e.max_budget:.4f}")
```
## With LangChain Chains

```python
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from llm_toll import LangChainCallback

handler = LangChainCallback(project="summarizer", max_budget=10.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

prompt = ChatPromptTemplate.from_template("Summarize: {text}")
chain = prompt | llm

result = chain.invoke({"text": "Long document here..."})
```
## With LangChain Agents

```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent
from llm_toll import LangChainCallback

handler = LangChainCallback(project="agent", max_budget=20.0)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

# All LLM calls made by the agent are tracked
# (tools and prompt are defined elsewhere)
agent = create_react_agent(llm, tools, prompt)
```

Every LLM invocation within the agent's reasoning loop is tracked and counts toward the budget.