workloads

Benchmarking Correctly

How to measure speedups honestly and avoid common benchmarking traps.

Published May 30, 2026

How Pyvorin Benchmarks

Pyvorin's pyvorin benchmark command measures CPython vs Pyvorin execution time honestly:

  • Runs warm executions (not cold compiles)
  • Uses adaptive timing with minimum total duration
  • Checks correctness by comparing output hashes
  • Records timing quality assessment

Common Pitfalls

  • Cold compile timing: Do not measure the compilation step as part of benchmark time.
  • Too-fast functions: Functions completing in <10 ms may show unreliable speedups.
  • Wrong workload: Network I/O and file system operations are not accelerated.
  • Ignoring correctness: Always verify correctness_match=True.

Timing Quality

QualityMeaning
validReliable timing; speedup can be reported.
below_timing_thresholdFunction too fast; increase input size.
native_compiler_unavailableInstall full pyvorin package for native timing.

Running a Benchmark

pyvorin benchmark script.py --function benchmark --runs 5

Full Speed Proof

pyvorin proof script.py --runs-per-day 100