A local AI proxy that routes OpenAI-compatible requests to Ollama, Claude Code, Codex CLI, and other LLM backends.
Single API endpoint. Multiple providers. Automatic fallback.
AI coding tools speak OpenAI's API format, but you might want to route to Ollama locally, Claude remotely, or both with fallback. Superproxy gives you a single `http://localhost:4141/v1` endpoint that handles routing, circuit breaking, and streaming normalization.
```bash
# Install
bun install

# Start the proxy
bun run dev

# Or build and run
bun run build
./dist/cli.js serve
```

Point any OpenAI-compatible client at `http://localhost:4141/v1`:
```bash
curl http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'
```

Create `superproxy.config.json` in your project root:
```json
{
  "server": {
    "host": "127.0.0.1",
    "port": 4141
  },
  "providers": {
    "ollama": {
      "enabled": true,
      "baseUrl": "http://localhost:11434"
    }
  },
  "routing": {
    "modelMap": {
      "gpt-4*": "ollama",
      "claude-*": "claude"
    },
    "fallbackChain": ["ollama"]
  }
}
```

All commands accept `--config <path>` to specify a config file.
```bash
# Override host/port
superproxy serve --host 0.0.0.0 --port 8080

# Check provider health
superproxy doctor

# Test a model
superproxy test llama3

# Inspect routing table
superproxy routes inspect

# Validate config
superproxy config validate
```

```
Request → Validation → Router → Provider Adapter → Upstream
                          ↓
            Circuit Breaker + Fallback Chain
```
- Router resolves models to providers via exact match, glob patterns, or fallback chain
- Circuit Breaker tracks failures per provider and skips unhealthy ones
- Provider Adapters transform OpenAI-format requests to backend-specific formats (Ollama, Anthropic, etc.)
- SSE Streaming normalizes backend-specific streaming to OpenAI SSE format
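The resolution order described above (exact match, then glob pattern, then fallback chain) can be sketched roughly like this. The real router's implementation will differ; this is a minimal illustration:

```typescript
type Routing = { modelMap: Record<string, string>; fallbackChain: string[] };

// Convert a glob like "gpt-4*" into an anchored RegExp ("*" matches anything).
function globToRegex(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Resolve a model name to a provider id, or null if nothing matches.
export function resolveProvider(model: string, routing: Routing): string | null {
  // 1. Exact match wins.
  if (routing.modelMap[model]) return routing.modelMap[model];
  // 2. First glob pattern that matches.
  for (const [pattern, provider] of Object.entries(routing.modelMap)) {
    if (pattern.includes("*") && globToRegex(pattern).test(model)) return provider;
  }
  // 3. Otherwise, head of the fallback chain (if any).
  return routing.fallbackChain[0] ?? null;
}
```

With the example config above, `gpt-4o` resolves to `ollama` via the `gpt-4*` glob, and an unmapped model like `llama3` falls through to the fallback chain.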
| Provider | Status | Notes |
|---|---|---|
| Ollama | Supported | Local inference via `/api/chat` |
| Claude | Planned | Anthropic Messages API |
| Codex CLI | Planned | OpenAI-compatible |
| OpenAI-compat | Planned | Any OpenAI-compatible endpoint |
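As a sketch of what a provider adapter's request translation looks like for Ollama: the `/api/chat` endpoint accepts `model`, `messages`, `stream`, and an `options` object for tuning knobs like `temperature`. The field mapping below is a simplified assumption, not Superproxy's actual adapter:

```typescript
type ChatMessage = { role: string; content: string };

// OpenAI-style request body (subset of fields).
type OpenAIChatRequest = {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
  temperature?: number;
};

// Ollama /api/chat request body (subset): tuning knobs live under "options".
type OllamaChatRequest = {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  options?: { temperature?: number };
};

// Simplified sketch of an adapter's request translation.
export function toOllamaRequest(req: OpenAIChatRequest, localModel: string): OllamaChatRequest {
  return {
    model: localModel, // the router may remap e.g. "gpt-4" to a local model name
    messages: req.messages,
    stream: req.stream ?? false,
    ...(req.temperature !== undefined ? { options: { temperature: req.temperature } } : {}),
  };
}
```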
| Method | Path | Description |
|---|---|---|
| POST | `/v1/chat/completions` | OpenAI-compatible chat completions (streaming + non-streaming) |
| GET | `/v1/models` | List available models |
| GET | `/health` | Server health (always returns 200) |
| GET | `/ready` | Readiness check (verifies providers) |
| GET | `/version` | Server version |
| GET | `/metrics` | Prometheus-compatible metrics |
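For streaming on `/v1/chat/completions`, backend chunks have to be re-emitted as OpenAI-style `chat.completion.chunk` SSE events. A simplified sketch for a single Ollama NDJSON chunk (the wiring and field selection here are assumptions):

```typescript
// One NDJSON line from Ollama's /api/chat stream (subset of fields).
type OllamaChunk = {
  model: string;
  message?: { role: string; content: string };
  done: boolean;
};

// Re-emit an Ollama chunk as one OpenAI-style SSE event.
export function toOpenAISSE(chunk: OllamaChunk, id: string, created: number): string {
  const body = {
    id,
    object: "chat.completion.chunk",
    created,
    model: chunk.model,
    choices: [
      {
        index: 0,
        // Final chunk carries an empty delta and a finish_reason.
        delta: chunk.done ? {} : { content: chunk.message?.content ?? "" },
        finish_reason: chunk.done ? "stop" : null,
      },
    ],
  };
  return `data: ${JSON.stringify(body)}\n\n`;
}
```

A real normalizer would also emit the terminating `data: [DONE]` sentinel after the final chunk.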
| Command | Description |
|---|---|
| `serve` | Start the proxy server |
| `doctor` | Check all providers: credentials, reachability, circuit state |
| `providers list` | List registered providers |
| `providers check <id>` | Check a single provider |
| `routes inspect` | Print the routing table |
| `config validate` | Validate configuration |
| `test <model>` | Send a test request through the proxy |
Add `--json` to any status command for programmatic output.
```bash
bun install        # Install dependencies
bun test           # Run tests
bun run typecheck  # Type checking
bun run lint       # Lint
bun run format     # Format code
bun run build      # Build for production
```

- Bun — runtime, test runner, bundler
- TypeScript — type safety
- Zod — config and request validation
- Pino — structured logging
- Commander — CLI framework