One endpoint. Per-tenant virtual keys. Hard budget caps. Edge prompt caching. Audit log. PII redaction. Model failover. Built on LiteLLM + Cloudflare AI Gateway. Run by Manny, the Intelligent IT assistant.
One LiteLLM team per tenant, one virtual key, one budget cap, one model allowlist. Tenants never see each other's traffic, keys, or spend. The master key only ever lives in GSM.
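A rough sketch of what per-tenant provisioning looks like against LiteLLM's proxy admin API. The endpoint paths and field names here are assumptions based on LiteLLM's documented admin routes (`/team/new`, `/key/generate`); verify them against your LiteLLM version before relying on them.

```python
def tenant_provisioning_payloads(tenant_id: str, monthly_budget_usd: float,
                                 allowed_models: list[str]) -> dict:
    """Build the two admin-API payloads for a new tenant:
    one team (budget + allowlist) and one virtual key bound to it."""
    team = {                                 # POST /team/new
        "team_alias": tenant_id,
        "max_budget": monthly_budget_usd,    # hard cap, enforced at the gateway
        "budget_duration": "30d",
        "models": allowed_models,            # per-tenant model allowlist
    }
    key = {                                  # POST /key/generate
        "team_id": tenant_id,
        "max_budget": monthly_budget_usd,    # a key can never outspend its team
        "models": allowed_models,
    }
    return {"team": team, "key": key}

payloads = tenant_provisioning_payloads(
    "acme-corp", 500.0, ["claude-sonnet-4", "gpt-4o"])
```

The tenant only ever holds the generated virtual key; the master key stays server-side.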
Caps and content blocks are enforced at the gateway, before any provider call. PII, PHI, and PCI data never reach a model. "Ignore previous instructions" never moves a budget.
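A minimal sketch of the gateway-side pre-call check: pattern-match PII, redact what's allowed through, and block outright on a hard policy hit, all before any provider is called. The patterns and the `block_on` policy are illustrative, not production-grade detection.

```python
import re

# Illustrative detectors only; real deployments use proper PII/PCI scanners.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def pre_call_check(prompt: str,
                   block_on: frozenset = frozenset({"card"})) -> dict:
    """Return the redacted prompt plus a decision the audit log can record."""
    hits = [name for name, pat in PII_PATTERNS.items() if pat.search(prompt)]
    if any(h in block_on for h in hits):
        # Hard block: the request never leaves the gateway.
        return {"decision": "block", "policy_hits": hits, "prompt": None}
    redacted = prompt
    for name, pat in PII_PATTERNS.items():
        redacted = pat.sub(f"<{name}-redacted>", redacted)
    return {"decision": "allow", "policy_hits": hits, "prompt": redacted}
```

The decision and policy hits feed straight into the audit row for the request.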
Every request gets an immutable audit row: tenant, user, model, tokens, cost, latency, decision, policy hit. Useful for SOC 2, HIPAA, and the "why did our bill spike" conversation.
Cloudflare AI Gateway sits in front for edge prompt caching (~30% hit rate on repeat workloads) and provider failover. When Anthropic stutters, traffic shifts to OpenAI inside the same policy.
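The failover behavior reduces to an ordered try-next loop inside one policy envelope. In this stack the real mechanism lives in Cloudflare AI Gateway / LiteLLM router configuration, so the provider names and call interface below are purely illustrative.

```python
from collections.abc import Callable

def call_with_failover(providers: list[tuple[str, Callable[[str], str]]],
                       prompt: str) -> tuple[str, str]:
    """Try providers in order; return (provider_name, response).

    The same budget caps, allowlists, and content policy have already
    been applied upstream, so failover never widens what a tenant can do.
    """
    last_err: Exception | None = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as err:  # production code would retry only transient errors
            last_err = err
    raise RuntimeError("all providers failed") from last_err
```

Because the policy check runs before this loop, a request blocked for one provider is blocked for all of them.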