Budget Enforcement
Overview
Budget enforcement prevents runaway LLM costs by halting execution when cumulative spend exceeds a configured threshold.
Setting a Budget
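    from openai import OpenAI
    from llm_toll import track_costs

    client = OpenAI()  # assumes the OpenAI SDK; reads OPENAI_API_KEY from the environment

    # Halt once cumulative spend for "my_app" reaches $5.00.
    @track_costs(project="my_app", max_budget=5.00)
    def chat(prompt):
        return client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
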
The max_budget parameter is in USD. When the accumulated cost for the project reaches or exceeds this value, BudgetExceededError is raised before the next API call is made.
How It Works
Budget checks happen at two points:
- Pre-call check -- Before the wrapped function executes, the decorator queries the store for the project's total cost. If current_cost >= max_budget, the function is never called and BudgetExceededError is raised immediately.
- Atomic post-call check -- After calculating the cost of a call, the decorator uses log_usage_if_within_budget(), which atomically checks the budget and logs usage in a single transaction. If the new total would exceed the budget, the usage is not logged and BudgetExceededError is raised.
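This two-phase approach prevents calls from starting once the budget is spent, and it avoids time-of-check to time-of-use (TOCTOU) races when several threads or processes log usage for the same project concurrently.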
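The two checks described above can be sketched with an in-memory store. This is an illustration only, not llm_toll's implementation: `InMemoryStore`, `total_cost`, `guarded_call`, and `cost_of` are hypothetical names, `BudgetExceededError` is a local stand-in, and a `threading.Lock` stands in for the single database transaction.

```python
import threading


class BudgetExceededError(Exception):
    """Local stand-in carrying the documented attributes."""

    def __init__(self, project, current_cost, max_budget):
        super().__init__(
            f"{project}: ${current_cost:.4f} spent, budget ${max_budget:.4f}"
        )
        self.project = project
        self.current_cost = current_cost
        self.max_budget = max_budget


class InMemoryStore:
    """Hypothetical store; the lock plays the role of a transaction."""

    def __init__(self):
        self._totals = {}
        self._lock = threading.Lock()

    def total_cost(self, project):
        with self._lock:
            return self._totals.get(project, 0.0)

    def log_usage_if_within_budget(self, project, cost, max_budget):
        # Compare and write under one lock so no other thread can log
        # usage between the check and the update.
        with self._lock:
            new_total = self._totals.get(project, 0.0) + cost
            if new_total > max_budget:
                return False  # usage is not logged
            self._totals[project] = new_total
            return True


def guarded_call(store, project, max_budget, fn, cost_of):
    # Phase 1: pre-call check; fn is never invoked once the cap is reached.
    current = store.total_cost(project)
    if current >= max_budget:
        raise BudgetExceededError(project, current, max_budget)

    result = fn()

    # Phase 2: atomic check-and-log once the call's cost is known.
    if not store.log_usage_if_within_budget(project, cost_of(result), max_budget):
        raise BudgetExceededError(project, store.total_cost(project), max_budget)
    return result
```

Without the lock, two threads could both read a total just under the cap and both log usage, pushing the total past the budget. That is the race the single-transaction check is meant to close.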
BudgetExceededError
    from llm_toll.exceptions import BudgetExceededError

    try:
        result = chat("Hello")
    except BudgetExceededError as e:
        # The attributes identify the project, its spend, and its cap.
        print(f"Project: {e.project}")
        print(f"Current cost: ${e.current_cost:.4f}")
        print(f"Budget limit: ${e.max_budget:.4f}")

The exception carries structured attributes:
| Attribute | Type | Description |
|---|---|---|
| project | str \| None | Project name that exceeded its budget |
| current_cost | float \| None | Accumulated cost at the time of the error |
| max_budget | float \| None | Configured budget cap |
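Because each attribute listed above may be None, formatting one with `:.4f` raises a TypeError when the value is missing. The sketch below shows one defensive way to build a message. The `BudgetExceededError` class here is a local stand-in with an assumed constructor signature, and `describe` is a hypothetical helper, not part of llm_toll.

```python
class BudgetExceededError(Exception):
    """Local stand-in; the constructor signature is an assumption."""

    def __init__(self, message, project=None, current_cost=None, max_budget=None):
        super().__init__(message)
        self.project = project
        self.current_cost = current_cost
        self.max_budget = max_budget


def describe(err):
    """Build a log-friendly message that tolerates missing attributes."""
    project = err.project or "<unknown project>"
    if err.current_cost is None or err.max_budget is None:
        return f"{project}: budget exceeded"
    return f"{project}: ${err.current_cost:.4f} of ${err.max_budget:.4f} spent"


try:
    raise BudgetExceededError(
        "over budget", project="my_app", current_cost=5.0123, max_budget=5.0
    )
except BudgetExceededError as e:
    print(describe(e))
```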
Budget Reset
Reset a project's accumulated cost via the CLI:
Or programmatically:
Streaming Budget Behavior
When a streaming response is consumed, the cost is real regardless of budget state. If a streaming call pushes the total over budget:
- The usage is still logged (via log_usage) to keep the running total accurate
- A warning is emitted explaining the budget was exceeded during streaming
- The next call to the decorated function will raise BudgetExceededError in the pre-call check
This design ensures the budget total stays accurate while acknowledging that a stream that has already been consumed cannot be "un-consumed."
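The streaming behavior can be illustrated with a generator wrapper. This is a rough sketch under stated assumptions: `UsageLog`, `track_stream`, and the flat `cost_per_chunk` are hypothetical, and how llm_toll actually prices or wraps a stream is not shown here.

```python
import warnings


class UsageLog:
    """Hypothetical store whose log_usage records cost unconditionally."""

    def __init__(self):
        self.total = 0.0

    def log_usage(self, cost):
        self.total += cost


def track_stream(chunks, store, max_budget, cost_per_chunk):
    """Yield chunks unchanged, then log the cost once the stream is exhausted."""
    cost = 0.0
    for chunk in chunks:
        cost += cost_per_chunk
        yield chunk
    # The stream has been consumed, so the spend is recorded even if it
    # pushes the total past the cap.
    store.log_usage(cost)
    if store.total > max_budget:
        warnings.warn(
            f"budget exceeded during streaming: "
            f"${store.total:.4f} > ${max_budget:.4f}"
        )


store = UsageLog()
for piece in track_stream(["Hel", "lo", "!"], store, max_budget=0.5, cost_per_chunk=0.25):
    print(piece, end="")
print()
```

Here the logged total ends above the cap and only a warning is issued. A later pre-call check against that total is what would block the next call.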
Budget Periods
The reset parameter controls automatic budget resets:
Note
Budget reset periods are tracked in the budgets table via the last_reset_at column.