Whitepaper v0.12 now public — read the design

Redefining how compute runs.

Orchestration infrastructure for a new generation of workloads.

~50 MB node image · ~30 MB single-mode control plane · < 10 s node join · FSL-1.1-ALv2
Platform Engineering

Get out from under the stack you maintain.

Stop running etcd, cert-manager, an ingress controller, a service mesh, and a CNI plugin as five independent failure domains. Overdrive ships them as one binary, with three-node HA that fits in 80 MB of memory.

See the architecture
SRE & On-Call

Incidents that investigate themselves.

Every eBPF event carries cryptographic workload identity. The native SRE agent correlates across alerts via SQL joins, attaches signed BPF probes to verify hypotheses, and proposes typed remediations through a graduated approval gate.

See the SRE agent
AI Engineering

Agents that can't exfiltrate what they don't have.

Prompt injection becomes structurally inert. The credential proxy holds the real keys. Domain allowlists run in-kernel via TC eBPF. BPF LSM blocks raw sockets. Security is enforced by infrastructure, not by the model's judgment.

See agent isolation
Use cases

Built for workloads Kubernetes was never shaped for.

Persistent microVMs, structural credential isolation, per-workload WASM sandboxes, and global-gossip service catalogs compose into operational shapes other orchestrators have to bolt together from third-party pieces.

AI coding agents (microvm · persistent)

Claude Code. Cursor. Devin-style sessions.

State accumulates across turns. 100 GB persistent rootfs per agent. Fast resume from snapshot between turns. The credential proxy holds the real API keys; the content inspector scans every tool response for injection payloads before the model sees them.

  • Persistent rootfs via overdrive-fs
  • Checkpoint/restore with userfaultfd lazy paging
  • SPIFFE-bound credential proxy + domain allowlist
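Composed in a job spec, a persistent agent session might look like the sketch below. The `[job]` and `[[job.sidecars]]` shapes mirror the documented job spec further down this page; the `[job.storage]` section and its keys are hypothetical illustrations, not documented fields.

```toml
# Sketch of a persistent coding-agent job. Only [job] and
# [[job.sidecars]] follow the documented spec shape; the
# [job.storage] keys are hypothetical illustrations.
[job]
name   = "coding-agent"
driver = "microvm"

[job.storage]              # hypothetical section
rootfs     = "overdrive-fs"
size       = "100GB"
persistent = true          # state accumulates across turns

[[job.sidecars]]
name   = "credential-proxy"
module = "builtin:credential-proxy"
hooks  = ["egress"]
```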
CI runners (microvm · persistent)

Self-hosted Buildkite, GitLab, GitHub Actions.

Warm layer caches survive across builds. Artifact working sets persist. Scale-to-zero between jobs without paying cold-start on the next one. Per-repo kernel isolation and no shared node state between tenants.

  • Sub-second resume from snapshot
  • Per-workload SPIFFE identity, no shared secrets
  • BPF LSM blocks arbitrary binary execution
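A scale-to-zero runner could be declared along these lines. The `[job.lifecycle]` section and its keys are assumptions sketched for illustration; only the `[job]` block follows the documented spec shape.

```toml
# Sketch of a CI-runner job that scales to zero between
# builds. [job.lifecycle] keys are hypothetical.
[job]
name   = "buildkite-runner"
driver = "microvm"

[job.lifecycle]            # hypothetical section
scale_to_zero = true
idle_timeout  = "5m"
resume_from   = "snapshot" # warm layer caches survive
```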
Remote dev environments (microvm · persistent)

Codespaces. Remote Jupyter. Interactive notebooks.

Per-user persistent rootfs. User filesystem survives across restarts, migrations, and idle eviction. Instant resume when the developer reconnects. Cross-workload volume sharing via virtiofs for team-scale data sets.

  • Gateway auto-route — one URL per workload
  • virtiofs volumes shared across workload types
  • Idle eviction with transparent resume on request
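The virtiofs sharing described above might be expressed like this. The `[[job.volumes]]` section is a hypothetical illustration; no volume syntax is documented on this page.

```toml
# Sketch of a per-user dev environment with a shared team
# volume. [[job.volumes]] keys are hypothetical.
[job]
name   = "jupyter-alice"
driver = "microvm"

[[job.volumes]]            # hypothetical section
name   = "team-datasets"
type   = "virtiofs"
shared = true              # mounted across workload types
```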
Customer-code sandboxes (wasm · microvm)

SaaS that runs arbitrary tenant code.

Per-tenant kernel-level isolation. Defense in depth across four independent layers: WASM/VM boundary, BPF LSM, kTLS + SPIFFE, XDP network policy. Compromise one, the other three still hold.

  • ~1 ms WASM cold start; warm pool per function
  • WASI capabilities explicitly granted per job spec
  • Wasmtime fuel prevents infinite-loop starvation
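The page states that WASI capabilities are granted per job spec; a sketch of that idea, with hypothetical key names, could read:

```toml
# Sketch of a tenant-code WASM job. [job.wasi] and its keys
# are hypothetical illustrations of per-spec capability grants.
[job]
name   = "tenant-fn-7741"
driver = "wasm"

[job.wasi]                 # hypothetical section
filesystem = []            # no host filesystem access
env        = ["TENANT_ID"]
fuel       = 50_000_000    # Wasmtime fuel bounds CPU time
```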
Multi-region services (all drivers)

Regional autonomy without a federation plane.

Each region runs its own Raft. Global observation converges via CRDT gossip in seconds. Under partition, every region keeps submitting jobs and serving traffic. The dataplane never reads remote state. Replay across regions via a single response header.

  • Per-region Raft quorum — no WAN latency on writes
  • Global service catalog via Corrosion / CR-SQLite
  • overdrive-replay header for app-driven routing
Serverless APIs (wasm)

HTTP endpoints with AI-agent-grade isolation.

~1 ms cold start. Scale-to-zero. The credential proxy and content inspector ship by default — so functions processing untrusted content hit the same structural walls as full AI agents. TypeScript, Rust, Go, Python via the WASM Component Model.

  • Route config via BPF map — no proxy restart
  • Per-invocation SPIFFE identity on every call
  • Instance pool sized by LLM-predicted demand
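A serverless endpoint might be declared as below. The `[job.ingress]` section is a hypothetical illustration; the `[job.security]` line mirrors the documented job spec.

```toml
# Sketch of a serverless HTTP endpoint. [job.ingress] keys
# are hypothetical; [job.security] mirrors the documented spec.
[job]
name   = "webhook-handler"
driver = "wasm"

[job.ingress]              # hypothetical section
route = "hooks.example.com/stripe"

[job.security]
egress.mode = "intercepted" # credential proxy on by default
```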
By the numbers

Architecture decisions, measured at fleet scale.

Not micro-optimizations. These are direct consequences of the design — the kind of differences that turn three racks back into one.

~10× less control-plane RAM · ~100 MB vs ~1 GB on Kubernetes
~100× less mTLS CPU overhead · kTLS in-kernel vs Envoy sidecar
~50× faster scheduling · < 100 ms vs 1–10 s on Kubernetes
2.3× workload density · ~70% utilization vs ~30% baseline
< 10 s node join · vs 2–5 minutes on Kubernetes
~1 ms WASM cold start · Wasmtime warm pool, no Firecracker tax
Compare

Coherent by construction, not by configuration.

When the dataplane, the identity model, the telemetry pipeline, and the service mesh all emerge from the same kernel primitive with the same workload identity attached, you stop gluing together products that were never designed to know about each other.

Component         | Kubernetes                          | Overdrive
Service routing   | iptables · O(n) per packet          | XDP BPF · O(1) in-kernel
mTLS              | Envoy sidecar · ~0.5 vCPU each      | kTLS · NIC offload, ~0 overhead
Control-plane RAM | ~1 GB                               | ~30–80 MB
Network policy    | Per-packet iptables walk            | BPF map lookup
Workload types    | Containers                          | Process · microVM · VM · unikernel · WASM
Observability     | Scraped logs & Prometheus           | Kernel-native · identity-tagged
Multi-region      | Stretched Raft or federation plane  | Per-region Raft + global CRDT gossip
Extension model   | Go operators with cluster-admin     | WASM · sandboxed · hot-reloadable
Node join         | 2–5 minutes                         | < 10 seconds
Cloud platform

Run your own. Or let us run it.

The source-available core is FSL-1.1-ALv2 and ships in one binary. Every release converts to Apache 2.0 two years after publication. The cloud platform takes on the operational complexity we already absorbed for ourselves, metered exactly by kernel telemetry. No estimation. No sampling.

Tier 1

Managed Overdrive

A full Overdrive cluster as a service. Control plane, worker pool, and CLI — the way you'd run it yourself, only without running it.

per vCPU-hour + GB-hour · metered at allocation level
Tier 3

Serverless WASM

Sub-10 ms cold start. Scale-to-zero. The credential proxy and content inspector ship by default — built for the AI agent workloads no one else has a story for.

per invocation + GB-second · minimum unit: one invocation
Tier 4

Bare Metal Dedicated

Dedicated physical nodes inside the platform. Full hardware performance. Full Overdrive operational stack. No VM overhead between you and the silicon.

per node-hour · reserved capacity discounts

Enterprise self-hosted licensing — FIPS crypto, HSM integration, air-gap tooling, DORA / NIS2 / SOC2 / HIPAA policy packs — available alongside the source-available release.

Why now

Kubernetes was right for 2014.
It is not right for 2026.

Stable eBPF APIs, kernel TLS offload, production Rust systems libraries, and embeddable WASM only matured in the last two years. Overdrive is the orchestrator that becomes possible when all four exist at once.

01 / Own your primitives

Every dependency is a future incident.

No etcd. No Envoy. No SPIRE. No CNI. Every critical subsystem is built into the platform or is a standard Rust library. External processes you didn't write are operational liabilities — they get cut.

02 / The kernel is the dataplane

Userspace proxies become unnecessary.

Service routing, network policy, load balancing, mTLS, and telemetry happen at line rate in the kernel via aya-rs. No sidecar tax. No proxy reconfigurations. No tail-latency spikes from a userspace hop.

03 / Security is structural

mTLS isn't an option you remember to enable.

Every connection is wrapped in kTLS with a SPIFFE identity. Policy is enforced in-kernel by BPF LSM. A compromised workload, a misconfigured pod, and a malicious dependency all hit the same walls.

Architecture

One binary. Any topology.

Control plane and node agent ship in a single Rust binary. Role is declared at bootstrap — single node, three-node HA, dedicated ingress tier, multi-region. No second installer. No second upgrade path.

┌────────────────────────────────────────────────────────────────────┐
                           CLI / API  (gRPC + REST, tonic)          
├────────────────────────────────────────────────────────────────────┤
     CONTROL PLANE  — co-located with node agent or dedicated       
                                                                    
   IntentStore        Reconcilers        Built-in CA  (SPIFFE)      
   single: redb       Rust traits        Scheduler                  
   ha: openraft+redb  WASM extensions    Regorus + WASM policy      
   per region                            DuckLake telemetry         
├────────────────────────────────────────────────────────────────────┤
     NODE AGENT                                                     
                                                                    
   ▸ aya-rs eBPF dataplane                                          
     XDP (routing/LB) · TC (egress) · sockops (mTLS)                
     BPF LSM (MAC) · kprobes (telemetry)                            
                                                                    
   ▸ Drivers:  process · microvm · vm · unikernel · wasm            
   ▸ Gateway:  hyper · rustls · ACMEv2 · in-process route table     
├────────────────────────────────────────────────────────────────────┤
   ObservationStore  (Corrosion — CR-SQLite + SWIM/QUIC gossip)     
   alloc status · service backends · node health · regional peers   
├────────────────────────────────────────────────────────────────────┤
             Object Storage  (Garage, S3-compatible)                
└────────────────────────────────────────────────────────────────────┘
Workload model

Five drivers. One identity. One policy.

VMs, processes, unikernels, containers, and WASM functions share the same SPIFFE identity, the same eBPF dataplane, the same policy engine. Match the primitive to the workload — a VM for hard isolation, WASM for cold-start, a process for native speed.

process

Native binaries

Daemons under cgroups v2, kernel-enforced isolation, zero VM overhead.

tokio::process
microvm

Fast-boot VMs

~200 ms cold start. Hardware isolation. Optional persistent rootfs.

cloud-hypervisor
vm

Full virtualization

Live CPU and memory hotplug. virtiofs sharing. AArch64 first-class.

cloud-hypervisor
unikernel

Extreme density

Single-purpose images on a hypervisor. Minimal kernel surface.

unikraft + CH
wasm

Serverless functions

~1 ms cold start. Scale-to-zero. A sandbox the model can't talk its way out of.

wasmtime
Job spec

Declare what. The platform handles how.

A single TOML composes drivers, sidecars, security profiles, and ingress. The same spec deploys to a single node and to a multi-region fleet — the platform absorbs the difference.

job.toml · ai-research-agent
# An AI agent with structural credential isolation
# and prompt-injection scanning on ingress.

[job]
name   = "ai-research-agent"
driver = "wasm"

[[job.sidecars]]
name    = "credential-proxy"
module  = "builtin:credential-proxy"
hooks   = ["egress"]

  [job.sidecars.config]
  allowed_domains = ["api.anthropic.com"]
  credentials     = { ANTHROPIC_KEY = { secret = "prod" } }

[[job.sidecars]]
name   = "content-inspector"
module = "builtin:content-inspector"
hooks  = ["ingress"]

[job.security]
no_raw_sockets          = true
no_privilege_escalation = true
egress.mode             = "intercepted"
~/$ overdrive
$ overdrive job submit job.toml
→ ai-research-agent · scheduled · alloc a1b2c3

$ overdrive alloc status a1b2c3
  state          : running
  driver         : wasm
  node           : node-04 (eu-west-1)
  identity       : spiffe://overdrive.local/
                 :   job/ai-research-agent
                 :   alloc/a1b2c3
  cert ttl       : 58m 12s
  sidecars       : 2 attached, healthy

$ overdrive cluster upgrade \
    --mode ha \
    --peers node-2,node-3
→ snapshot exported (LocalStore)
→ RaftStore bootstrapped on 3 peers
→ leader: node-1 · zero downtime
Native SRE Agent

AI that can reason about your cluster.

Every event in Overdrive carries cryptographic SPIFFE identity. Correlation isn't a label-matching heuristic — it's a SQL join. Investigations are first-class resources with a budget, a transcript, and a typed proposal at the end.

  • Investigations as a resource. Lifecycle, budget, transcript — compressed into incident memory on conclusion.
  • Typed remediation. Tier 0 reads auto-execute. Tier 2 writes wait for human ratification.
  • Hypothesis verification. Attach signed BPF probes for one investigation turn. No instrumentation rollout.
  • Deterministic replay. Investigation transcripts re-run in CI under the simulation harness.
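The tiered approval model above could be configured along these lines. This is a hypothetical sketch: none of the key names are documented, and they exist only to illustrate the graduated gate and the per-investigation budget.

```toml
# Hypothetical sketch of a graduated approval gate for the
# SRE agent. No key here is a documented field; it only
# illustrates the tiered model described above.
[sre.approvals]
tier0 = "auto"     # reads execute immediately
tier1 = "auto"     # e.g. ScaleJob within declared bounds
tier2 = "ratify"   # writes wait for human approval

[sre.budget]
tokens = 50_000    # hard cap per investigation
```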
Read § 12 of the whitepaper
investigation_id   inv-7f3a92
trigger            alert · payments.p99 > 800ms
scope              job/payments · eu-west-1
tools_called       7
llm_turns          4
tokens_spent       12,840 / 50,000
probe_attached     tcp_retransmit_trace
diagnosis          backend pool exhausted
proposed           ScaleJob 3→6 · Tier 1 · auto
status             concluded · 47s
Multi-region by default

See your fleet as one cluster.

Per-region Raft for intent. Global CRDT gossip for observation. Each region keeps writing through a partition. The dataplane never reads remote state.

us-east-1 healthy
nodes 42 / 42
allocations 2,184
raft leader node-04
gossip lag p99 214 ms
egress mTLS 100%
eu-west-1 healthy
nodes 38 / 38
allocations 1,902
raft leader node-12
gossip lag p99 187 ms
egress mTLS 100%
ap-southeast-1 healthy
nodes 21 / 21
allocations 964
raft leader node-03
gossip lag p99 412 ms
egress mTLS 100%
3 regions · 101 nodes · 5,050 allocations · last gossip tick 2.1 s ago

Standing on production-grade Rust primitives

aya-rs · openraft · wasmtime · cloud-hypervisor · rustls + kTLS · regorus · corrosion / cr-sqlite · redb
Source-available · FSL-1.1-ALv2 · Apache 2.0 after 2 years

Reimagining the foundation of modern infrastructure.

One binary. Every workload type. Built on the primitives Kubernetes never had.