Multi-engine SQL query routing proxy

4: front protocols
5+: backend engines
~0.35 ms: p50 proxy overhead
up to 56%: cost reduction

Built for multi-engine SQL estates

The same patterns you expect from a serious proxy (explicit routing, capacity limits, and observability) without locking teams into a single database.

Many front doors

Trino HTTP, PostgreSQL wire, MySQL wire, and Arrow Flight SQL — route with rules instead of one engine per deployment.

Heterogeneous backends

Trino, DuckDB, StarRocks, and more — with sqlglot-backed dialect translation when the client and engine disagree on SQL.

Operator ready

Per-group concurrency and queuing, Prometheus metrics, optional PostgreSQL-backed state, and an admin API.

Example use cases

Routing, capacity limits, and load balancing—one proxy, every client, no separate stack per engine.

Multi-engine data platform

QueryFlux is one front door with routing rules: BI tools connect over MySQL wire and land on StarRocks, scheduled jobs stay on Trino, and ad-hoc SELECT *-style exploration can be matched by regex and sent to Athena. Each engine gets the workload it is built for—no new connection strings or drivers for your teams.

Cost-aware workload dispatch

A Python router in QueryFlux inspects every query and steers CPU-heavy joins and windows to compute-priced Trino while scan-heavy Iceberg reads go to scan-priced Athena. You encode the cost model once; every client inherits the same dispatch automatically.
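As a rough illustration of that kind of dispatch, here is a minimal sketch of a text-based heuristic. The function name `route`, its signature, and the group names `"trino"` and `"athena"` are assumptions made for this example, not QueryFlux's actual router interface; a real router would more likely work from a parsed query than from a regex.

```python
import re

# Crude, assumed signal for CPU-heavy work: a JOIN or a window function.
_CPU_HEAVY = re.compile(r"\bJOIN\b|\bOVER\s*\(", re.IGNORECASE)


def route(sql: str) -> str:
    """Return a hypothetical cluster-group name for the given SQL text."""
    if _CPU_HEAVY.search(sql):
        return "trino"   # compute-priced: joins and windows
    return "athena"      # scan-priced: plain reads


join_target = route("SELECT a.id FROM a JOIN b ON a.id = b.id")
window_target = route("SELECT sum(x) OVER (PARTITION BY y) FROM t")
scan_target = route("SELECT * FROM iceberg.events WHERE day = DATE '2024-01-01'")
```

Because the decision lives in one function at the proxy, changing the heuristic changes dispatch for every client at once.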

Dashboard SLA protection

Put a maxRunningQueries cap on the StarRocks group so dashboard traffic always has headroom: when the group is full, ad-hoc queries queue at the proxy or spill to a Trino fallback. Grafana can show queue depth in real time—dashboards keep a fast path while analysts wait transparently.
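The cap, queue, and fallback behavior can be sketched in a few lines. This is a single-threaded model for illustration only: the `ClusterGroup` class, its method names, and the "try the fallback first, then queue" ordering are assumptions, not QueryFlux internals.

```python
from collections import deque


class ClusterGroup:
    """Hypothetical model of one cluster group with a running-query cap."""

    def __init__(self, name, max_running_queries, fallback=None):
        self.name = name
        self.max_running_queries = max_running_queries
        self.fallback = fallback   # another ClusterGroup, or None
        self.running = 0
        self.queue = deque()       # query ids waiting at the proxy

    @property
    def queue_depth(self):
        return len(self.queue)

    def submit(self, query_id):
        """Place a query; return (group name, "running" or "queued")."""
        if self.running < self.max_running_queries:
            self.running += 1
            return (self.name, "running")
        if self.fallback is not None:
            return self.fallback.submit(query_id)   # spill to the fallback
        self.queue.append(query_id)                 # no fallback: wait here
        return (self.name, "queued")

    def finish(self):
        """Free one slot and promote the oldest queued query, if any."""
        self.running -= 1
        if self.queue:
            self.running += 1
            return self.queue.popleft()
        return None


trino = ClusterGroup("trino", max_running_queries=1)
starrocks = ClusterGroup("starrocks", max_running_queries=2, fallback=trino)
placements = [starrocks.submit(q) for q in ("q1", "q2", "q3", "q4")]
```

In this model the first two queries run on StarRocks, the third spills to Trino, and the fourth waits in the Trino queue, which is the number a queue-depth metric would report.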

Transparent engine migration

Weighted load balancing across a cluster group lets Trino and StarRocks run side by side: ramp StarRocks from 10% to 100% of traffic with zero client changes. Query history in QueryFlux Studio shows per-engine latency side by side, so you can shift weights gradually instead of scheduling a flag-day cutover.
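Weighted selection itself is simple to picture. The sketch below uses Python's standard `random.choices`; the `pick_backend` helper and the weight dictionaries are hypothetical and say nothing about how QueryFlux implements or configures its balancer.

```python
import random


def pick_backend(weights, rng=random):
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Early in a migration: roughly 10% of queries go to StarRocks.
rng = random.Random(0)  # seeded only so the sketch is repeatable
early = {"trino": 90, "starrocks": 10}
sample = [pick_backend(early, rng) for _ in range(10_000)]
starrocks_share = sample.count("starrocks") / len(sample)

# End of the migration: all traffic on StarRocks.
final = {"trino": 0, "starrocks": 100}
final_pick = pick_backend(final, rng)
```

Moving from `early` to `final` is a change to the weights only; nothing on the client side needs to know which engine answered.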

Benefits

Cost control, SLA protection, and operational consistency across engines.

Cut query costs by routing to the right engine

Cloud engines charge in fundamentally different ways. Compute-priced backends (Trino, StarRocks) charge for cluster uptime or CPU-seconds. Scan-priced backends (Athena, BigQuery) charge for bytes read. Without a routing layer, every query goes to the same engine regardless of its shape — CPU-heavy joins land on Athena, cold selective filters land on StarRocks, and you pay the wrong model each time.

In our own benchmarking, workload-aware routing — steering CPU-heavy work to compute-priced engines and selective cold-data queries to scan-priced ones — reduced total workload cost by up to 56%, with individual queries sometimes dropping by up to 90% compared with always using a single default.
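To make the two pricing models concrete, here is a small worked comparison. Every number in it is a made-up assumption chosen for illustration (the unit prices and both query profiles); none of it comes from the benchmark above or from any vendor's price list.

```python
from dataclasses import dataclass

# Hypothetical unit prices, for illustration only.
USD_PER_CPU_SECOND = 0.0001   # compute-priced engine
USD_PER_TB_SCANNED = 5.00     # scan-priced engine


@dataclass
class QueryProfile:
    cpu_seconds: float   # estimated CPU time the query needs
    bytes_scanned: int   # estimated bytes the query reads


def compute_priced_cost(q):
    return q.cpu_seconds * USD_PER_CPU_SECOND


def scan_priced_cost(q):
    return q.bytes_scanned / 1e12 * USD_PER_TB_SCANNED


def cheaper_engine(q):
    """Name the pricing model under which this query costs less."""
    if compute_priced_cost(q) <= scan_priced_cost(q):
        return "compute-priced"
    return "scan-priced"


# Assumed profiles: a large join and a selective filter over cold data.
heavy_join = QueryProfile(cpu_seconds=3600, bytes_scanned=500 * 10**9)
selective_filter = QueryProfile(cpu_seconds=20, bytes_scanned=50 * 10**6)
```

Under these assumed prices the join is cheaper on the compute-priced engine and the selective filter is cheaper on the scan-priced one, which is the mismatch a single default engine cannot avoid.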

Enforce latency SLAs without touching clients

A batch ETL job competing with an interactive dashboard on the same Trino cluster degrades both. QueryFlux lets you encode performance intent in routing rules and apply it to all clients uniformly — no application changes, no conventions that drift:

Route all PostgreSQL wire connections (typically interactive tooling) to a low-latency StarRocks pool

Route queries tagged workload:etl to the Trino cluster reserved for batch

Route queries matching SELECT.*LIMIT \d+ to DuckDB for sub-10 ms response
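The three rules above can be modeled as an ordered list of predicates. This is a sketch, not QueryFlux's rule syntax: the `Query` fields, the group names, the default group, first-match-wins ordering, and the case-insensitive regex flags are all assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Query:
    sql: str
    protocol: str                             # e.g. "postgres", "mysql"
    tags: dict = field(default_factory=dict)  # e.g. {"workload": "etl"}


# The pattern from the third rule; the flags are an assumption.
LIMIT_RE = re.compile(r"SELECT.*LIMIT \d+", re.IGNORECASE | re.DOTALL)

# (hypothetical group name, predicate), evaluated top to bottom.
RULES = [
    ("starrocks-interactive", lambda q: q.protocol == "postgres"),
    ("trino-batch", lambda q: q.tags.get("workload") == "etl"),
    ("duckdb", lambda q: LIMIT_RE.search(q.sql) is not None),
]


def route(q, default="trino-default"):
    """Return the group of the first matching rule, else the default."""
    for group, matches in RULES:
        if matches(q):
            return group
    return default


interactive = route(Query("SELECT 1", "postgres"))
batch = route(Query("INSERT INTO t SELECT * FROM s", "trino-http", {"workload": "etl"}))
small = route(Query("select * from t limit 10", "mysql"))
other = route(Query("SELECT * FROM t", "mysql"))
```

Keeping the rules in one ordered list is what makes them apply to every client uniformly: a client never sees the list, only the endpoint.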

Absorb burst pressure with proxy-side queuing

When a cluster is saturated, each engine reacts in its own way, and there is no common view of the backlog across engines. QueryFlux adds a controlled throttle per cluster group: queries queue at the proxy rather than hammering the backend, overflow spills to a secondary group via fallback routing, and queue depth is a first-class Prometheus metric. The result is one pane of glass across all engines instead of fragmented per-engine UIs.

Eliminate the N×M integration problem

One endpoint replaces N×M driver configurations. Clients connect to QueryFlux once; the backend topology — which engines exist, how they are grouped, how load is balanced — is config, not code. Add an engine, change a routing rule, swap a backend: no client changes, no deploys, no coordination.

~0.35 ms proxy overhead

QueryFlux is written in Rust. The measured p50 proxy overhead (routing + dialect translation, from the queryflux-bench suite) is approximately 0.35 ms. For typical analytical workloads, that overhead is negligible next to query execution time.

Documentation

Quick guides and reference — same layout as the sidebar.

Overview

What QueryFlux does

Read the documentation home for quick guides, how routing works, and links into the reference manual — then open Getting started when you are ready to run a stack.

