r/mysql



How do you do reproducible MySQL benchmarking across versions or configs?
question

I’ve been looking into how people actually benchmark MySQL setups in a way that produces results you can trust and compare over time.

On paper it sounds simple, but once you try to compare across:

  • different MySQL versions

  • config changes

  • environments

it gets messy quite quickly.
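For the versions/environments part, one approach that has helped us is pinning exact server images and flags in containers, with an ephemeral datadir so every container start is a clean slate. The compose fragment below is purely illustrative (tags, the buffer pool size, and the password are placeholders, not recommendations):

```yaml
# docker-compose.yml -- illustrative version pinning; tags are examples
services:
  mysql-a:
    image: mysql:8.0.36        # pin an exact tag, never :latest
    command: --innodb_buffer_pool_size=2G
    environment:
      MYSQL_ROOT_PASSWORD: bench
    tmpfs:
      - /var/lib/mysql         # RAM-backed datadir: fresh state on every start
  mysql-b:
    image: mysql:8.4.0         # same flags, different version -> comparable runs
    command: --innodb_buffer_pool_size=2G
    environment:
      MYSQL_ROOT_PASSWORD: bench
    tmpfs:
      - /var/lib/mysql
```

The point is that the only deliberate difference between the two services is the version, so a delta in results has one candidate explanation instead of several.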

Typical issues I keep hearing about:

  • results that are hard to reproduce

  • leftover state affecting runs

  • difficulty explaining why numbers differ, not just that they do

The part that seems especially tricky is controlling the full lifecycle:

  • clean state between runs

  • consistent warmup

  • repeatable execution

  • attaching diagnostics so results are interpretable

We’ve been working on a framework that tries to make this more deterministic:

  • explicit DB lifecycle per iteration

  • hooks for diagnostics/profiling

  • consistent execution + reporting

There’s a beta here if anyone is curious:
https://mariadb.org/mariadb-foundation-releases-the-beta-of-the-test-automation-framework-taf-2-5/

Mostly interested in how others approach this:

  • Do you trust your benchmarking results?

  • How do you ensure reproducibility?

  • Are you using existing tools or mostly custom scripts?

  • What tends to break consistency the most?

Would be great to hear real-world approaches.