Pyvorin Docs

Supported Workloads

Pyvorin delivers the best results on data-heavy, CPU-bound Python. Speedups are workload-specific and depend on input size, type stability, and compilation tier.

ETL & Data Processing

| Workload | Typical Speedup* | Notes |
| --- | --- | --- |
| Filter / map pipelines | 10–1,000× | List comprehensions and simple lambdas compile well. |
| Group-by aggregations | 20–500× | Hash-based grouping on primitive types is highly optimised. |
| Rolling window calculations | 15–300× | Sliding operations over numeric arrays benefit from loop fusion. |
| CSV transforms | 10–50× | Parsing and columnar extraction on large files. |
| JSON transforms | 5–40× | Key extraction and nested filtering. |
| Log parsing | 10–100× | Regex-lite string splitting and tokenisation. |
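As a rough illustration, the filter/map row describes kernels like the one below. The `pyvorin.jit` decorator name is an assumption for illustration only; the kernel itself is plain Python and runs unmodified.

```python
# import pyvorin          # assumed module name, not confirmed by these docs
# @pyvorin.jit            # assumed decorator; hypothetical compile entry point
def clip_and_scale(values: list[float], lo: float, hi: float) -> list[float]:
    # One comprehension, one element type (float) throughout: the shape
    # of loop that a tiered compiler can lower to a tight native loop.
    return [v * 2.0 for v in values if lo <= v <= hi]

print(clip_and_scale([0.5, 3.0, 9.9, 12.0], 1.0, 10.0))  # [6.0, 19.8]
```

The out-of-range values (0.5 and 12.0) are dropped and the rest doubled; the same function runs under stock CPython, so compilation is an optimisation, not a dependency.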

String Processing

| Workload | Typical Speedup* | Notes |
| --- | --- | --- |
| Tokenisation | 100–500× | Whitespace and delimiter splitting in tight loops. |
| Pattern matching | 50–300× | Fixed-string search and simple character-class scans. |
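A minimal sketch of the tokenisation shape referred to above: whitespace splitting in a tight loop, with every variable holding a single type (`str` tokens, `int` counts).

```python
def count_tokens(lines: list[str]) -> dict[str, int]:
    # Dict of str -> int: a primitive, type-stable collection.
    counts: dict[str, int] = {}
    for line in lines:
        for tok in line.split():  # whitespace splitting in the hot loop
            counts[tok] = counts.get(tok, 0) + 1
    return counts

print(count_tokens(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```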

Finance Kernels

| Workload | Typical Speedup* | Notes |
| --- | --- | --- |
| Black-Scholes | 50–2,000× | Closed-form option pricing on large portfolios. |
| Value at Risk (VaR) | 20–200× | Historical simulation and percentile extraction. |
| Monte Carlo simulation | 10–100× | Path generation depends on random-number backend. |
| Present Value (PV) | 30–500× | Cash-flow discounting loops. |
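The Black-Scholes row is representative of this whole category: pure scalar float arithmetic with no I/O. A standard closed-form European call, written the way these docs suggest kernels should be written:

```python
import math

def bs_call(s: float, k: float, r: float, sigma: float, t: float) -> float:
    """Black-Scholes European call price: pure float math on scalars."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma * sigma) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    # Standard normal CDF via the error function (stdlib only).
    n = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s * n(d1) - k * math.exp(-r * t) * n(d2)

# At-the-money, zero rate, 20% vol, 1 year: price is about 7.97.
print(bs_call(100.0, 100.0, 0.0, 0.2, 1.0))
```

Pricing a large portfolio is then just this function inside a loop over positions, which is exactly the amortisation pattern the footnote below describes.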

What Makes a Good Workload

  • Type-stable loops: Variables keep the same type (e.g., int or float) across iterations.
  • Primitive collections: Lists and dicts of scalars compile better than deeply nested heterogeneous structures.
  • Pure computation: Heavy arithmetic, reductions, and transformations with minimal I/O inside hot paths.
  • Large inputs: Compilation overhead is amortised over millions of operations, not dozens.
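The four traits above can be seen together in one small reduction: the accumulator stays `float` on every iteration, the input is a flat list of scalars, and the loop body is pure arithmetic.

```python
def mean_abs(xs: list[float]) -> float:
    total = 0.0            # stays float for every iteration: type-stable
    for x in xs:           # flat list of scalars: primitive collection
        total += abs(x)    # pure arithmetic, no I/O in the hot path
    return total / len(xs)

print(mean_abs([-2.0, 2.0, 4.0]))  # 2.666...
```

The cost of compiling a loop like this is paid once; per the "large inputs" point, it only pays off when `xs` holds millions of elements rather than dozens.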

What to Avoid

  • Async / await and coroutine-heavy code
  • Dynamic code generation (eval, exec)
  • GPU training or CUDA-dependent libraries
  • Heavy NumPy / Pandas internals (ndarray methods are not yet fully lowered)
  • Network-bound services where Python overhead is negligible
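On the dynamic-code-generation point, the usual workaround is a static dispatch table, which keeps the call target visible to the compiler. A small sketch (the restructuring is a general technique, not a Pyvorin-specific API):

```python
import operator

# Instead of eval(f"{a} {op} {b}") — dynamic code generation a compiler
# cannot see through — dispatch through a fixed table of known functions.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def apply_op(a: float, op: str, b: float) -> float:
    return OPS[op](a, b)

print(apply_op(2.0, "*", 3.0))  # 6.0
```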

See Unsupported & Fallback for details on how Pyvorin handles these constructs.

* Speedup ranges are indicative only, derived from internal benchmarking on x86_64 Linux. Your results will vary based on hardware, input size, and compiler version.