
Spend caps, rate limits & alarms

Put hard ceilings on spend and traffic, and get notified before a usage meter runs out.

The LIMITS group gives you three controls over cost and load. Spend caps and rate limits enforce: they block when a ceiling is reached. Usage alarms notify: they alert configured channels without blocking. All three are enforced server-side.

Spend caps

Spend caps (/spend-caps) set a hard spending ceiling per scope and period. Alerts fire at the percentages you configure; the hard cap blocks further requests once it is reached.

  • Scope and period: cap by workspace, user, or conversation, over a daily, monthly, or per-conversation window.
  • Hard cap: a ceiling in USD; requests are blocked when spend reaches it.
  • Alert thresholds: a ladder of percentages, each firing an alert as spend climbs toward the cap.
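
The interaction between the alert ladder and the hard cap can be sketched as below. This is a minimal illustration, not the service's implementation: the `SpendCap` class, its field names, and the `evaluate_spend` function are hypothetical, and the sketch assumes that each alert threshold is a percentage of the hard cap.

```python
from dataclasses import dataclass


@dataclass
class SpendCap:
    """Hypothetical model of one spend cap; field names are illustrative."""
    hard_cap_usd: float          # ceiling in USD
    alert_percents: list[float]  # ladder of alert thresholds, as percentages


def evaluate_spend(cap: SpendCap, spent_usd: float) -> tuple[bool, list[float]]:
    """Return (blocked, thresholds crossed so far) for the current spend."""
    percent_used = 100.0 * spent_usd / cap.hard_cap_usd
    # Every rung of the ladder at or below the current usage has been crossed.
    crossed = [p for p in sorted(cap.alert_percents) if percent_used >= p]
    # Requests are blocked once spend reaches the hard cap.
    blocked = spent_usd >= cap.hard_cap_usd
    return blocked, crossed
```

With a cap of 100 USD and thresholds of 50% and 80%, a spend of 60 USD crosses only the 50% rung and is not blocked; a spend of 100 USD crosses both rungs and is blocked.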

Rate limits

Rate limits (/rate-limits) throttle traffic per scope. A soft limit warns; a hard limit blocks.

  • Scope: limit by workspace, tenant, tool, user, or IP address, optionally targeting a specific value.
  • Soft and hard limits: the soft limit warns, the hard limit blocks further requests.
  • Window: the rolling time window, in seconds, over which requests are counted against the limits.
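
A rolling-window limiter with a soft and a hard limit can be sketched as follows. The `RollingWindowLimiter` class and its return values are hypothetical, and two behaviors are assumptions rather than documented facts: a warning is issued once the request count exceeds the soft limit, and blocked requests are not counted toward the window. The caller passes the current time explicitly so the behavior is easy to test; in practice a value such as `time.monotonic()` would do.

```python
from collections import deque


class RollingWindowLimiter:
    """Hypothetical limiter for a single scope value (e.g. one user or IP)."""

    def __init__(self, soft_limit: int, hard_limit: int, window_seconds: float):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of requests still inside the window

    def check(self, now: float) -> str:
        """Record a request at time `now`; return 'allow', 'warn', or 'block'."""
        # Drop requests that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        # Hard limit reached: block without recording the request (assumption).
        if len(self._timestamps) >= self.hard_limit:
            return "block"
        self._timestamps.append(now)
        # Past the soft limit but under the hard limit: allow with a warning.
        if len(self._timestamps) > self.soft_limit:
            return "warn"
        return "allow"
```

With a soft limit of 2, a hard limit of 3, and a 10-second window, the first two requests are allowed, the third is allowed with a warning, and a fourth inside the same window is blocked until older requests age out.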

Usage alarms

Usage alarms (/usage-alarms) notify you before a meter runs out. An alarm fires when this month’s usage crosses a threshold percentage of your plan allowance. It alerts; it does not block. Every workspace starts with alarms at 85% and 100%.

  • Threshold: the percentage of your monthly compute-unit allowance at which the alarm fires.
  • Channels: notify by email, Slack, or webhook.
  • Defaults: the seeded 85% and 100% alarms can be edited, disabled, or deleted, and you can add your own.
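
The threshold check behind an alarm can be sketched like this. The function name `alarms_to_fire` and its parameters are hypothetical, and the sketch assumes each alarm fires at most once per month, tracked in `already_fired`. Nothing here blocks a request; the result is only the list of thresholds to notify about.

```python
def alarms_to_fire(used_units, allowance_units, thresholds, already_fired):
    """Return the thresholds (in percent) that should notify now.

    used_units / allowance_units: this month's compute-unit usage and allowance.
    thresholds: configured alarm percentages, e.g. the seeded 85 and 100.
    already_fired: thresholds that have already notified this month (assumption).
    """
    percent_used = 100.0 * used_units / allowance_units
    return [
        t for t in sorted(thresholds)
        if percent_used >= t and t not in already_fired
    ]
```

With the seeded 85% and 100% alarms and an allowance of 1000 units, usage of 850 units fires the 85% alarm; reaching 1000 units later fires only the 100% alarm, because 85% has already notified.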

Next steps