Spend caps, rate limits & alarms
Put hard ceilings on spend and traffic, and get notified before a usage meter runs out.
The LIMITS group gives you three controls over cost and load. Spend caps and rate limits enforce: they block when a ceiling is reached. Usage alarms notify: they alert configured channels without blocking. All three are enforced server-side.
Spend caps
Spend caps (/spend-caps) set a hard spending ceiling per scope and period. Alerts fire at the percentages you configure; the hard cap blocks further requests once it is reached.
- Scope and period: cap by workspace, user, or conversation, over a daily, monthly, or per-conversation window.
- Hard cap: a ceiling in USD; requests are blocked when spend reaches it.
- Alert thresholds: a ladder of percentages, each firing an alert as spend climbs toward the cap.
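The relationship between the alert ladder and the hard cap can be sketched in a few lines of Python. This is an illustration only, not the service's implementation: the `SpendCap` class, its field names, and the `evaluate` function are hypothetical, and the threshold values in the usage note are example numbers.

```python
from dataclasses import dataclass


@dataclass
class SpendCap:
    # Hypothetical field names; the real configuration schema is not shown here.
    scope: str                  # "workspace", "user", or "conversation"
    period: str                 # "daily", "monthly", or "per-conversation"
    hard_cap_usd: float         # ceiling in USD
    alert_percents: list[float] # ladder of alert thresholds, in percent of the cap


def evaluate(cap: SpendCap, spend_usd: float) -> tuple[list[float], bool]:
    """Return (alert thresholds already crossed, whether requests are blocked)."""
    percent_used = 100 * spend_usd / cap.hard_cap_usd
    # Every rung of the ladder at or below the current percentage has fired.
    fired = [p for p in sorted(cap.alert_percents) if percent_used >= p]
    # The hard cap blocks once spend reaches the ceiling.
    blocked = spend_usd >= cap.hard_cap_usd
    return fired, blocked
```

With a 200 USD monthly workspace cap and example thresholds of 50, 80, and 95, `evaluate(cap, 170.0)` returns `([50, 80], False)`: two alerts have fired and requests still pass. At 200 USD of spend it returns `([50, 80, 95], True)`.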
Rate limits
Rate limits (/rate-limits) throttle traffic per scope. A soft limit warns; a hard limit blocks.
- Scope: limit by workspace, tenant, tool, user, or IP address, optionally targeting a specific value.
- Soft and hard limits: the soft limit warns, the hard limit blocks further requests.
- Window: the number of requests allowed per rolling time window, with the window length set in seconds.
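As a rough model of how a soft limit, a hard limit, and a rolling window interact, the sketch below keeps a log of request timestamps and classifies each new request. It is one common way to build a rolling-window limiter, not necessarily how the server does it. The `RollingLimiter` class and its parameter names are hypothetical, and the exact point at which the soft limit starts warning is an assumption.

```python
from collections import deque


class RollingLimiter:
    """Hypothetical rolling-window limiter with a soft (warn) and hard (block) limit."""

    def __init__(self, soft_limit: int, hard_limit: int, window_seconds: float):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.window_seconds = window_seconds
        self._stamps = deque()  # timestamps of accepted requests, oldest first

    def check(self, now: float) -> str:
        """Classify a request arriving at time `now` as allow, warn, or block."""
        # Drop timestamps that have aged out of the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        # Hard limit: the window is already full, so the request is rejected
        # and not recorded.
        if len(self._stamps) >= self.hard_limit:
            return "block"
        self._stamps.append(now)
        # Soft limit (assumed semantics): warn once the count exceeds it.
        if len(self._stamps) > self.soft_limit:
            return "warn"
        return "allow"
```

With `RollingLimiter(soft_limit=2, hard_limit=3, window_seconds=10)`, requests at seconds 0, 1, 2, and 3 are classified as allow, allow, warn, and block. A request at second 12.5 is allowed again because the earlier timestamps have left the window.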
Usage alarms
Usage alarms (/usage-alarms) notify you before a meter runs out. An alarm fires when this month's usage crosses a threshold percentage of your plan allowance; it alerts, it does not block. Every workspace starts with alarms at 85% and 100%.
- Threshold: the percentage of your monthly compute-unit allowance at which the alarm fires.
- Channels: notify by email, Slack, or webhook.
- Defaults: the seeded 85% and 100% alarms can be edited, disabled, or deleted, and you can add your own.
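The threshold check behind an alarm can be sketched as a comparison between two usage readings. The names below (`UsageAlarm`, `seeded_alarms`, `newly_crossed`) are hypothetical, and the channel assigned to the seeded alarms is an assumption, since the defaults' channels are not stated here. The sketch only decides which alarms should notify; it never blocks anything.

```python
from dataclasses import dataclass


@dataclass
class UsageAlarm:
    # Hypothetical field names for illustration.
    threshold_percent: float     # percent of the monthly compute-unit allowance
    channels: tuple[str, ...]    # any of "email", "slack", "webhook"
    enabled: bool = True


def seeded_alarms() -> list[UsageAlarm]:
    """The two default alarms; the email channel is an assumed placeholder."""
    return [UsageAlarm(85, ("email",)), UsageAlarm(100, ("email",))]


def newly_crossed(alarms: list[UsageAlarm], allowance_units: float,
                  usage_before: float, usage_after: float) -> list[UsageAlarm]:
    """Return enabled alarms whose threshold was crossed between two readings."""
    before = 100 * usage_before / allowance_units
    after = 100 * usage_after / allowance_units
    return [a for a in alarms
            if a.enabled and before < a.threshold_percent <= after]
```

For an allowance of 1000 compute units, moving from 800 to 900 units crosses only the 85% alarm, and moving from 900 to 1000 crosses only the 100% alarm. Comparing against the previous reading keeps an alarm from notifying again on every later reading in the same month.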