## Idea
The InfluxDB Line Protocol HTTP handler (`POST /api/v1/ts/{db}/write`) currently calls `engine.appendSamples()` once per sample in the incoming batch. Each call creates its own internal nested transaction, so a batch of N samples produces N transaction begin/commit cycles executed sequentially.
## Problem
For a typical batch of 5,000 samples this means 5,000 sequential TX cycles on the server side, even though the data could be grouped by target shard and written far more efficiently.
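The transaction-count arithmetic can be stated as a toy model (5,000 and 32 are the figures used throughout this proposal; the two functions are illustrative, not engine code):

```java
public class TxCycles {
    // Today: the handler opens one nested transaction per sample.
    static int currentTxCycles(int samples) {
        return samples;
    }

    // After grouping: at most one transaction per active shard.
    static int batchedTxCycles(int samples, int shards) {
        return Math.min(samples, shards);
    }
}
```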
## Proposed optimisation
- Group by shard up front — before writing anything, assign each sample to its target shard via the existing round-robin counter and collect the sample indices per shard.
- One transaction per shard — instead of N transactions, open one nested transaction per shard and write all of that shard's samples in a single commit. For 32 shards and 5,000 samples this reduces TX cycles from 5,000 to at most 32 (one per active shard).
- Parallel shard writes — dispatch each shard's write to the existing `shardExecutor` thread pool so all active shards write concurrently. The existing per-shard `appendLock` already guarantees that concurrent HTTP requests to the same shard are serialised without MVCC conflicts.
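The three steps above can be sketched as follows. This is a simplified model, not the actual engine: the shard count, the `writeShard` stand-in (which would wrap one nested transaction), and the return value are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BatchWriteSketch {
    static final int SHARD_COUNT = 32;                       // assumed shard count
    static final ExecutorService shardExecutor =
            Executors.newFixedThreadPool(SHARD_COUNT);       // stand-in for the pool
    static final AtomicInteger roundRobin = new AtomicInteger();

    // Stand-in for writing all of a shard's samples inside ONE nested
    // transaction: begin TX, append every sample in sampleIndices, commit.
    static void writeShard(int shard, List<Integer> sampleIndices, long[] ts) {
    }

    // Returns the number of per-shard transactions issued (illustrative).
    static int appendBatch(long[] allTimestamps) {
        // 1. Group sample indices by target shard via the round-robin counter.
        List<List<Integer>> perShard = new ArrayList<>();
        for (int s = 0; s < SHARD_COUNT; s++) perShard.add(new ArrayList<>());
        for (int i = 0; i < allTimestamps.length; i++) {
            int shard = Math.floorMod(roundRobin.getAndIncrement(), SHARD_COUNT);
            perShard.get(shard).add(i);
        }
        // 2+3. One task (and one transaction) per active shard, run in parallel.
        List<Future<?>> pending = new ArrayList<>();
        for (int s = 0; s < SHARD_COUNT; s++) {
            if (perShard.get(s).isEmpty()) continue;
            final int shard = s;
            pending.add(shardExecutor.submit(
                    () -> writeShard(shard, perShard.get(shard), allTimestamps)));
        }
        for (Future<?> f : pending) {
            try {
                f.get();                                     // surface write errors
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
        return pending.size();
    }
}
```

With 5,000 samples round-robined over 32 shards, every shard is active, so exactly 32 transactions are issued instead of 5,000.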
The change lives in two places:
- A new `appendBatch(long[] allTimestamps, Object[][] allColumnValues)` method on `TimeSeriesEngine` that implements the grouping and parallel dispatch.
- The HTTP handler groups samples by measurement type and calls `appendBatch` once per measurement instead of once per sample.
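The handler-side change could look like the sketch below. The `Sample` type, its fields, and the `Engine` interface are hypothetical stand-ins for the parsed line-protocol data; only the `appendBatch` signature comes from the proposal.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HandlerSketch {
    // Minimal stand-in for one parsed line-protocol sample.
    static final class Sample {
        final String measurement;
        final long timestamp;
        final Object[] columnValues;
        Sample(String measurement, long timestamp, Object[] columnValues) {
            this.measurement = measurement;
            this.timestamp = timestamp;
            this.columnValues = columnValues;
        }
    }

    // Hypothetical engine hook matching the proposed signature.
    interface Engine {
        void appendBatch(long[] allTimestamps, Object[][] allColumnValues);
    }

    // Group the incoming batch by measurement and issue one appendBatch
    // call per measurement group. Returns the number of calls issued.
    static int write(Engine engine, List<Sample> batch) {
        Map<String, List<Sample>> byMeasurement = new LinkedHashMap<>();
        for (Sample s : batch) {
            byMeasurement.computeIfAbsent(s.measurement, k -> new ArrayList<>()).add(s);
        }
        for (List<Sample> group : byMeasurement.values()) {
            long[] ts = new long[group.size()];
            Object[][] cols = new Object[group.size()][];
            for (int i = 0; i < group.size(); i++) {
                ts[i] = group.get(i).timestamp;
                cols[i] = group.get(i).columnValues;
            }
            engine.appendBatch(ts, cols);    // one call per measurement, not per sample
        }
        return byMeasurement.size();
    }
}
```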
## Expected impact
Benchmarks on a MacBook M5 Pro (8 GB JVM heap, 8 Python workers, batch size 5,000, localhost):
| Scenario | Avg throughput |
| --- | --- |
| Current (1 TX per sample) | ~60,000 metrics/sec |
| Batched (1 TX per shard, parallel) | ~433,000 metrics/sec |
That is a ~7× improvement with the same client configuration. With batch size 20,000, throughput reaches ~460,000 metrics/sec. No client-side changes are needed; the optimisation is fully transparent to callers of the HTTP endpoint.