Whitepaper v0.12 now public — read the design

Redefining how compute runs.

Orchestration infrastructure for a new generation of workloads.

~50 MB node image · ~30 MB single-mode control plane · < 10 s node join · FSL-1.1-ALv2
Platform Engineering

Get out from under the stack you maintain.

Stop running etcd, cert-manager, an ingress controller, a service mesh, and a CNI plugin as five independent failure domains. Overdrive ships them as one binary, with three-node HA that fits in 80 MB of memory.

See the architecture
SRE & On-Call

Incidents that investigate themselves.

Every eBPF event carries cryptographic workload identity. The native SRE agent correlates across alerts via SQL joins, attaches signed BPF probes to verify hypotheses, and proposes typed remediations through a graduated approval gate.

See the SRE agent
AI Engineering

Agents that can't exfiltrate what they don't have.

Prompt injection becomes structurally inert. The credential proxy holds the real keys. Domain allowlists run in-kernel via TC eBPF. BPF LSM blocks raw sockets. Security is enforced by infrastructure, not by the model's judgment.

See agent isolation
Use cases

Built for workloads Kubernetes was never shaped for.

Persistent microVMs, structural credential isolation, per-workload WASM sandboxes, and global-gossip service catalogs compose into operational shapes other orchestrators have to bolt together from third-party pieces.

AI coding agents (microvm · persistent)

Claude Code. Cursor. Devin-style sessions.

State accumulates across turns. 100 GB persistent rootfs per agent. Fast resume from snapshot between turns. The credential proxy holds the real API keys; the content inspector scans every tool response for injection payloads before the model sees them.

  • Persistent rootfs via overdrive-fs
  • Checkpoint/restore with userfaultfd lazy paging
  • SPIFFE-bound credential proxy + domain allowlist
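Composed in a job spec, a persistent agent session might look like the sketch below. The `[job]` and `[[job.sidecars]]` shapes mirror the documented job spec further down this page; the `[job.storage]` section and its keys are hypothetical illustrations, not documented fields.

```toml
# Sketch of a persistent coding-agent job. Only [job] and
# [[job.sidecars]] follow the documented spec shape; the
# [job.storage] keys are hypothetical illustrations.
[job]
name   = "coding-agent"
driver = "microvm"

[job.storage]              # hypothetical section
rootfs     = "overdrive-fs"
size       = "100GB"
persistent = true          # state accumulates across turns

[[job.sidecars]]
name   = "credential-proxy"
module = "builtin:credential-proxy"
hooks  = ["egress"]
```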
CI runners (microvm · persistent)

Self-hosted Buildkite, GitLab, GitHub Actions.

Warm layer caches survive across builds. Artifact working sets persist. Scale-to-zero between jobs without paying cold-start on the next one. Per-repo kernel isolation and no shared node state between tenants.

  • Sub-second resume from snapshot
  • Per-workload SPIFFE identity, no shared secrets
  • BPF LSM blocks arbitrary binary execution
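A scale-to-zero runner could be declared along these lines. The `[job.lifecycle]` section and its keys are assumptions sketched for illustration; only the `[job]` block follows the documented spec shape.

```toml
# Sketch of a CI-runner job that scales to zero between
# builds. [job.lifecycle] keys are hypothetical.
[job]
name   = "buildkite-runner"
driver = "microvm"

[job.lifecycle]            # hypothetical section
scale_to_zero = true
idle_timeout  = "5m"
resume_from   = "snapshot" # warm layer caches survive
```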
Remote dev environments (microvm · persistent)

Codespaces. Remote Jupyter. Interactive notebooks.

Per-user persistent rootfs. User filesystem survives across restarts, migrations, and idle eviction. Instant resume when the developer reconnects. Cross-workload volume sharing via virtiofs for team-scale data sets.

  • Gateway auto-route — one URL per workload
  • virtiofs volumes shared across workload types
  • Idle eviction with transparent resume on request
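The virtiofs sharing described above might be expressed like this. The `[[job.volumes]]` section is a hypothetical illustration; no volume syntax is documented on this page.

```toml
# Sketch of a per-user dev environment with a shared team
# volume. [[job.volumes]] keys are hypothetical.
[job]
name   = "jupyter-alice"
driver = "microvm"

[[job.volumes]]            # hypothetical section
name   = "team-datasets"
type   = "virtiofs"
shared = true              # mounted across workload types
```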
Customer-code sandboxes (wasm · microvm)

SaaS that runs arbitrary tenant code.

Per-tenant kernel-level isolation. Defense in depth across four independent layers: WASM/VM boundary, BPF LSM, kTLS + SPIFFE, XDP network policy. Compromise one, the other three still hold.

  • ~1 ms WASM cold start; warm pool per function
  • WASI capabilities explicitly granted per job spec
  • Wasmtime fuel prevents infinite-loop starvation
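The page states that WASI capabilities are granted per job spec; a sketch of that idea, with hypothetical key names, could read:

```toml
# Sketch of a tenant-code WASM job. [job.wasi] and its keys
# are hypothetical illustrations of per-spec capability grants.
[job]
name   = "tenant-fn-7741"
driver = "wasm"

[job.wasi]                 # hypothetical section
filesystem = []            # no host filesystem access
env        = ["TENANT_ID"]
fuel       = 50_000_000    # Wasmtime fuel bounds CPU time
```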
Multi-region services (all drivers)

Regional autonomy without a federation plane.

Each region runs its own Raft. Global observation converges via CRDT gossip in seconds. Under partition, every region keeps submitting jobs and serving traffic. The dataplane never reads remote state. Replay across regions via a single response header.

  • Per-region Raft quorum — no WAN latency on writes
  • Global service catalog via Corrosion / CR-SQLite
  • overdrive-replay header for app-driven routing
Serverless APIs (wasm)

HTTP endpoints with AI-agent-grade isolation.

~1 ms cold start. Scale-to-zero. The credential proxy and content inspector ship by default — so functions processing untrusted content hit the same structural walls as full AI agents. TypeScript, Rust, Go, Python via the WASM Component Model.

  • Route config via BPF map — no proxy restart
  • Per-invocation SPIFFE identity on every call
  • Instance pool sized by LLM-predicted demand
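A serverless endpoint might be declared as below. The `[job.ingress]` section is a hypothetical illustration; the `[job.security]` line mirrors the documented job spec.

```toml
# Sketch of a serverless HTTP endpoint. [job.ingress] keys
# are hypothetical; [job.security] mirrors the documented spec.
[job]
name   = "webhook-handler"
driver = "wasm"

[job.ingress]              # hypothetical section
route = "hooks.example.com/stripe"

[job.security]
egress.mode = "intercepted" # credential proxy on by default
```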
By the numbers

Architecture decisions, measured at fleet scale.

Not micro-optimizations. These are direct consequences of the design — the kind of differences that turn three racks back into one.

~10× less control-plane RAM · ~100 MB vs ~1 GB on Kubernetes
~100× less mTLS CPU overhead · kTLS in-kernel vs Envoy sidecar
~50× faster scheduling · < 100 ms vs 1–10 s on Kubernetes
2.3× workload density · ~70% utilization vs ~30% baseline
< 10 s node join · vs 2–5 minutes on Kubernetes
~1 ms WASM cold start · Wasmtime warm pool, no Firecracker tax
Compare

Coherent by construction, not by configuration.

When the dataplane, the identity model, the telemetry pipeline, and the service mesh all emerge from the same kernel primitive with the same workload identity attached, you stop gluing together products that were never designed to know about each other.

Component         | Kubernetes                          | Overdrive
Service routing   | iptables · O(n) per packet          | XDP BPF · O(1) in-kernel
mTLS              | Envoy sidecar · ~0.5 vCPU each      | kTLS · NIC offload, ~0 overhead
Control-plane RAM | ~1 GB                               | ~30–80 MB
Network policy    | Per-packet iptables walk            | BPF map lookup
Workload types    | Containers                          | Process · microVM · VM · unikernel · WASM
Observability     | Scraped logs & Prometheus           | Kernel-native · identity-tagged
Multi-region      | Stretched Raft or federation plane  | Per-region Raft + global CRDT gossip
Extension model   | Go operators with cluster-admin     | WASM · sandboxed · hot-reloadable
Node join         | 2–5 minutes                         | < 10 seconds
Cloud platform

Run your own. Or let us run it.

The source-available core is FSL-1.1-ALv2 and ships in one binary. Every release converts to Apache 2.0 two years after publication. The cloud platform takes on the operational complexity we already absorbed for ourselves, metered exactly by kernel telemetry. No estimation. No sampling.

Tier 1

Managed Overdrive

A full Overdrive cluster as a service. Control plane, worker pool, and CLI — the way you'd run it yourself, only without running it.

per vCPU-hour + GB-hour · metered at allocation level
Tier 3

Serverless WASM

Sub-10 ms cold start. Scale-to-zero. The credential proxy and content inspector ship by default — built for the AI agent workloads no one else has a story for.

per invocation + GB-second · minimum unit: one invocation
Tier 4

Bare Metal Dedicated

Dedicated physical nodes inside the platform. Full hardware performance. Full Overdrive operational stack. No VM overhead between you and the silicon.

per node-hour · reserved capacity discounts

Enterprise self-hosted licensing — FIPS crypto, HSM integration, air-gap tooling, DORA / NIS2 / SOC2 / HIPAA policy packs — available alongside the source-available release.

Why now

Kubernetes was right for 2014.
It is not right for 2026.

Stable eBPF APIs, kernel TLS offload, production Rust systems libraries, and embeddable WASM only matured in the last two years. Overdrive is the orchestrator that becomes possible when all four exist at once.

01 / Own your primitives

Every dependency is a future incident.

No etcd. No Envoy. No SPIRE. No CNI. Every critical subsystem is built into the platform or is a standard Rust library. External processes you didn't write are operational liabilities — they get cut.

02 / The kernel is the dataplane

Userspace proxies become unnecessary.

Service routing, network policy, load balancing, mTLS, and telemetry happen at line rate in the kernel via aya-rs. No sidecar tax. No proxy reconfigurations. No tail-latency spikes from a userspace hop.

03 / Security is structural

mTLS isn't an option you remember to enable.

Every connection is wrapped in kTLS with a SPIFFE identity. Policy is enforced in-kernel by BPF LSM. A compromised workload, a misconfigured pod, and a malicious dependency all hit the same walls.

Architecture

One binary. Any topology.

Control plane and node agent ship in a single Rust binary. Role is declared at bootstrap — single node, three-node HA, dedicated ingress tier, multi-region. No second installer. No second upgrade path.

┌────────────────────────────────────────────────────────────────────┐
                           CLI / API  (gRPC + REST, tonic)          
├────────────────────────────────────────────────────────────────────┤
     CONTROL PLANE  — co-located with node agent or dedicated       
                                                                    
   IntentStore        Reconcilers        Built-in CA  (SPIFFE)      
   single: redb       Rust traits        Scheduler                  
   ha: openraft+redb  WASM extensions    Regorus + WASM policy      
   per region                            DuckLake telemetry         
├────────────────────────────────────────────────────────────────────┤
     NODE AGENT                                                     
                                                                    
   ▸ aya-rs eBPF dataplane                                          
     XDP (routing/LB) · TC (egress) · sockops (mTLS)                
     BPF LSM (MAC) · kprobes (telemetry)                            
                                                                    
   ▸ Drivers:  process · microvm · vm · unikernel · wasm            
   ▸ Gateway:  hyper · rustls · ACMEv2 · in-process route table     
├────────────────────────────────────────────────────────────────────┤
   ObservationStore  (Corrosion — CR-SQLite + SWIM/QUIC gossip)     
   alloc status · service backends · node health · regional peers   
├────────────────────────────────────────────────────────────────────┤
             Object Storage  (Garage, S3-compatible)                
└────────────────────────────────────────────────────────────────────┘
Workload model

Five drivers. One identity. One policy.

VMs, processes, unikernels, containers, and WASM functions share the same SPIFFE identity, the same eBPF dataplane, the same policy engine. Match the primitive to the workload — a VM for hard isolation, WASM for cold-start, a process for native speed.

process

Native binaries

Daemons under cgroups v2, kernel-enforced isolation, zero VM overhead.

tokio::process
microvm

Fast-boot VMs

~200 ms cold start. Hardware isolation. Optional persistent rootfs.

cloud-hypervisor
vm

Full virtualization

Live CPU and memory hotplug. virtiofs sharing. AArch64 first-class.

cloud-hypervisor
unikernel

Extreme density

Single-purpose images on a hypervisor. Minimal kernel surface.

unikraft + CH
wasm

Serverless functions

~1 ms cold start. Scale-to-zero. A sandbox the model can't talk its way out of.

wasmtime
Job spec

Declare what. The platform handles how.

A single TOML composes drivers, sidecars, security profiles, and ingress. The same spec deploys to a single node and to a multi-region fleet — the platform absorbs the difference.

job.toml · ai-research-agent
# An AI agent with structural credential isolation
# and prompt-injection scanning on ingress.

[job]
name   = "ai-research-agent"
driver = "wasm"

[[job.sidecars]]
name    = "credential-proxy"
module  = "builtin:credential-proxy"
hooks   = ["egress"]

  [job.sidecars.config]
  allowed_domains = ["api.anthropic.com"]
  credentials     = { ANTHROPIC_KEY = { secret = "prod" } }

[[job.sidecars]]
name   = "content-inspector"
module = "builtin:content-inspector"
hooks  = ["ingress"]

[job.security]
no_raw_sockets          = true
no_privilege_escalation = true
egress.mode             = "intercepted"
~/$ overdrive
$ overdrive job submit job.toml
→ ai-research-agent · scheduled · alloc a1b2c3

$ overdrive alloc status a1b2c3
  state          : running
  driver         : wasm
  node           : node-04 (eu-west-1)
  identity       : spiffe://overdrive.local/
                 :   job/ai-research-agent
                 :   alloc/a1b2c3
  cert ttl       : 58m 12s
  sidecars       : 2 attached, healthy

$ overdrive cluster upgrade \
    --mode ha \
    --peers node-2,node-3
→ snapshot exported (LocalStore)
→ RaftStore bootstrapped on 3 peers
→ leader: node-1 · zero downtime
Native SRE Agent

AI that can reason about your cluster.

Every event in Overdrive carries cryptographic SPIFFE identity. Correlation isn't a label-matching heuristic — it's a SQL join. Investigations are first-class resources with a budget, a transcript, and a typed proposal at the end.

  • Investigations as a resource. Lifecycle, budget, transcript — compressed into incident memory on conclusion.
  • Typed remediation. Tier 0 reads auto-execute. Tier 2 writes wait for human ratification.
  • Hypothesis verification. Attach signed BPF probes for one investigation turn. No instrumentation rollout.
  • Deterministic replay. Investigation transcripts re-run in CI under the simulation harness.
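The tiered approval model above could be configured along these lines. This is a hypothetical sketch: none of the key names are documented, and they exist only to illustrate the graduated gate and the per-investigation budget.

```toml
# Hypothetical sketch of a graduated approval gate for the
# SRE agent. No key here is a documented field; it only
# illustrates the tiered model described above.
[sre.approvals]
tier0 = "auto"     # reads execute immediately
tier1 = "auto"     # e.g. ScaleJob within declared bounds
tier2 = "ratify"   # writes wait for human approval

[sre.budget]
tokens = 50_000    # hard cap per investigation
```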
Read § 12 of the whitepaper
investigation_id   inv-7f3a92
trigger            alert · payments.p99 > 800ms
scope              job/payments · eu-west-1
tools_called       7
llm_turns          4
tokens_spent       12,840 / 50,000
probe_attached     tcp_retransmit_trace
diagnosis          backend pool exhausted
proposed           ScaleJob 3→6 · Tier 1 · auto
status             concluded · 47s
Multi-region by default

See your fleet as one cluster.

Per-region Raft for intent. Global CRDT gossip for observation. Each region keeps writing through a partition. The dataplane never reads remote state.

us-east-1 healthy
nodes 42 / 42
allocations 2,184
raft leader node-04
gossip lag p99 214 ms
egress mTLS 100%
eu-west-1 healthy
nodes 38 / 38
allocations 1,902
raft leader node-12
gossip lag p99 187 ms
egress mTLS 100%
ap-southeast-1 healthy
nodes 21 / 21
allocations 964
raft leader node-03
gossip lag p99 412 ms
egress mTLS 100%
3 regions · 101 nodes · 5,050 allocations · last gossip tick 2.1 s ago

Standing on production-grade Rust primitives

aya-rs · openraft · wasmtime · cloud-hypervisor · rustls + kTLS · regorus · corrosion / cr-sqlite · redb
Source-available · FSL-1.1-ALv2 · Apache 2.0 after 2 years

Reimagining the foundation of modern infrastructure.

One binary. Every workload type. Built on the primitives Kubernetes never had.