Benchmarking with the Pyvorin Edge SDK
Complete guide to the built-in benchmark suite, run-pi-benchmarks.sh, the 8 verticals, attested reports, prove-benchmark.sh, and creating custom benchmarks.
Published Jun 2, 2026
Overview
The Pyvorin Edge SDK ships with a comprehensive benchmark suite located in examples/industry_benchmarks/. It measures pipeline latency, data-reduction ratios, and cloud-cost savings across eight real-world industry verticals. The suite is designed to run on Raspberry Pi hardware (Pi 4 or Pi 5) but works on any Linux host with Python 3.11+.
This article covers the two main entry points — run-pi-benchmarks.sh for quick local runs and prove-benchmark.sh for cryptographically attested customer proofs — and explains how to interpret results and add your own benchmarks.
Quick Start: run-pi-benchmarks.sh
The run-pi-benchmarks.sh script is a self-contained bash runner that activates the project virtual environment, executes the full 8-vertical suite, generates case-study markdown, and copies artefacts to a predictable directory.
cd ~/pyvorin-edge
bash pifiles/run-pi-benchmarks.sh
The script performs the following steps:
- Verifies it is running inside a valid pyvorin-edge installation (/opt/pyvorin-edge or /home/*/pyvorin-edge).
- Locates and activates the virtual environment (venv/bin/activate or .venv/bin/activate).
- Runs the benchmark harness: python examples/industry_benchmarks/run_all.py
- Generates case-study markdown: python examples/industry_benchmarks/generate_case_studies.py
- Copies results to ~/pyvorin-edge/benchmark_results_pi.json and ~/pyvorin-edge/CASE_STUDIES_PI.md.
- Prints a summary table with reduction percentages and cost savings in GBP.
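The summary table from the last step can be approximated from the copied results file. The following is a minimal sketch, not the script's own formatter: format_summary is a hypothetical helper, and it assumes each record carries the vertical, reduction_percent, and cost_model.cost_savings fields that run_all.py writes.

```python
def format_summary(results):
    """Render per-vertical records as a fixed-width text table."""
    lines = [f"{'Vertical':<26}{'Reduction %':>12}{'Savings (GBP)':>15}"]
    for rec in results:
        # cost_model may be absent in partial runs; fall back to 0.0
        savings = rec.get("cost_model", {}).get("cost_savings", 0.0)
        lines.append(
            f"{rec['vertical']:<26}{rec['reduction_percent']:>12.1f}{savings:>15.2f}"
        )
    return "\n".join(lines)


# Inline sample record; in practice, load the list with
# json.load(open("benchmark_results_pi.json")) instead.
sample = [
    {
        "vertical": "solar_farm",
        "reduction_percent": 99.4,
        "cost_model": {"cost_savings": 12.34},
    }
]
print(format_summary(sample))
```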
Understanding the 8 Verticals
Each vertical is a self-contained simulation with sensor configurations, anomaly injection, and pipeline rules. The verticals are:
| Vertical | Sensors | Typical Rule |
|---|---|---|
| smart_buildings | Temperature, humidity, CO2, occupancy | HVAC efficiency and over-crowding alerts |
| predictive_maintenance | Vibration, current, temperature | Bearing-fault and motor-overload detection |
| cold_chain | Temperature, humidity, door sensors | Excursion alerts and compliance logging |
| precision_agriculture | Soil moisture, light, temperature | Irrigation trigger and frost warnings |
| solar_farm | DC current, voltage, irradiance | Panel-soiling and inverter-fault alerts |
| telecom_tower | Current, vibration, temperature | Generator-fuel and structural-resonance alerts |
| smart_warehouse | Occupancy, temperature, leak | Inventory-climate and spill detection |
| water_utilities | Pressure, flow, turbidity | Leak and contamination events |
Each vertical runs a 24-hour simulation at 1 Hz sample rate. Anomalies are injected at deterministic time windows (e.g., hours 4–6) so every run produces identical event counts, making the results reproducible across hardware revisions.
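To see why fixed anomaly windows give identical event counts, consider this toy single-sensor simulation. It is only a sketch under stated assumptions: the simulate function, the seeded random generator, the baseline of 20.0, the offset of 15.0, and the threshold of 30.0 are all illustrative and are not taken from the SDK's simulators, which may achieve determinism differently.

```python
import random

DURATION_S = 24 * 3600  # 24-hour simulation, one sample per second (1 Hz)


def simulate(seed=0, window=(4 * 3600, 6 * 3600), threshold=30.0):
    """Count threshold crossings for one hypothetical sensor."""
    rng = random.Random(seed)  # fixed seed: the noise sequence repeats exactly
    events = 0
    for t in range(DURATION_S):
        value = 20.0 + rng.gauss(0.0, 0.5)  # baseline plus small noise
        if window[0] <= t < window[1]:
            value += 15.0  # anomaly injected in the fixed window (hours 4 to 6)
        if value > threshold:
            events += 1
    return events


first = simulate()
second = simulate()
print(first, second, first == second)
```

Because the window boundaries and the seed are fixed, both runs report the same count, one event per second of the two-hour window.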
Interpreting Results
The JSON output from run_all.py contains the following fields per vertical:
[
  {
    "vertical": "solar_farm",
    "num_sensors": 12,
    "total_readings": 1036800,
    "raw_bytes": 20736000,
    "edge_bytes": 124416,
    "reduction_percent": 99.4,
    "events_triggered": 42,
    "latency_p50_ms": 0.0032,
    "cost_model": {
      "reduction_percent": 99.4,
      "cost_savings": 12.34
    }
  }
]
- total_readings: 24 h × 1 Hz × num_sensors. This is the raw telemetry volume.
- raw_bytes: Estimated uncompressed JSON payload if every reading were sent to the cloud.
- edge_bytes: Actual payload after windowing and event compression.
- reduction_percent: (1 - edge_bytes / raw_bytes) × 100. Typical values range from 85% to 99.5%.
- latency_p50_ms: Median time to evaluate all rules for one frame of readings. Values below 0.01 ms leave ample headroom against the 1-second sample interval, so the pipeline keeps up with real-time ingestion.
- cost_model.cost_savings: Estimated monthly savings in GBP compared to sending all raw data over a SIM plan to AWS IoT Core and S3.
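The field definitions can be checked against the sample record shown earlier. This is a small worked example; the helper names expected_readings and reduction_percent are illustrative and not part of the SDK.

```python
def expected_readings(num_sensors, hours=24, sample_rate_hz=1):
    """Raw telemetry volume: seconds in the run x sample rate x sensors."""
    return hours * 3600 * sample_rate_hz * num_sensors


def reduction_percent(raw_bytes, edge_bytes):
    """Share of the raw payload that never leaves the edge device."""
    return (1 - edge_bytes / raw_bytes) * 100


# Values copied from the solar_farm sample record
record = {
    "num_sensors": 12,
    "total_readings": 1036800,
    "raw_bytes": 20736000,
    "edge_bytes": 124416,
}

print(expected_readings(record["num_sensors"]))                                # 1036800
print(round(reduction_percent(record["raw_bytes"], record["edge_bytes"]), 1))  # 99.4
print(record["raw_bytes"] / record["total_readings"])                          # 20.0 bytes per reading
```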
Attested Reports with prove-benchmark.sh
For customer presentations and procurement workflows, prove-benchmark.sh generates a tamper-evident, hardware-attested benchmark report. The script performs six steps:
- Generate Ed25519 key pair: a one-time signing key for this proof.
- Screen recording setup: checks for asciinema and prints instructions.
- Hardware sanity check: collects CPU model, serial number, temperature, and throttle state.
- Run attested benchmarks: executes run_all.py with the --attest --sign flags.
- Generate markdown report: produces a human-readable report with verification instructions.
- Recording instructions: writes a guide for creating video proof.
cd ~/pyvorin-edge
bash pifiles/prove-benchmark.sh
# Outputs:
# proof/benchmark_attested.json — signed results
# proof/benchmark_report.md — human-readable summary
# proof/public_key.b64 — verification key
# proof/RECORDING.txt — screen-recording guide
Verifying a Signed Proof
Anyone can verify the cryptographic signature and integrity hash without trusting the benchmark author:
# Verify the Ed25519 signature
python3 -c "
import json, base64
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
data = json.load(open('proof/benchmark_attested.json'))
pub = Ed25519PublicKey.from_public_bytes(base64.b64decode(data['public_key_b64']))
payload = {k: v for k, v in data.items() if k not in ('signature', 'public_key_b64')}
canonical = json.dumps(payload, sort_keys=True, separators=(',', ':')).encode()
pub.verify(base64.b64decode(data['signature']), canonical)
print('SIGNATURE VALID')
"
# Verify the integrity hash
python3 -c "
import json, hashlib
data = json.load(open('proof/benchmark_attested.json'))
expected = data['integrity_hash']
payload = {k: v for k, v in data.items() if k != 'integrity_hash'}
canonical = json.dumps(payload, sort_keys=True, separators=(',', ':')).encode()
actual = hashlib.sha256(canonical).hexdigest()
assert actual == expected
print('INTEGRITY HASH VALID')
"
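Both checks depend on canonical JSON: sorted keys and compact separators ensure the verifier hashes the same bytes that were produced when the report was written. The round trip below is a sketch using only the standard library. It assumes the hash is generated the same way the verification snippet recomputes it; add_integrity_hash and verify_integrity_hash are hypothetical helpers, not SDK functions.

```python
import hashlib
import json


def canonical_bytes(payload):
    """Serialize with sorted keys and no whitespace so the bytes are stable."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def add_integrity_hash(report):
    """Return a copy of the report with a SHA-256 hash over all other fields."""
    body = {k: v for k, v in report.items() if k != "integrity_hash"}
    digest = hashlib.sha256(canonical_bytes(body)).hexdigest()
    return {**body, "integrity_hash": digest}


def verify_integrity_hash(report):
    """Recompute the hash and compare it with the stored value."""
    body = {k: v for k, v in report.items() if k != "integrity_hash"}
    actual = hashlib.sha256(canonical_bytes(body)).hexdigest()
    return actual == report.get("integrity_hash")


report = add_integrity_hash({"vertical": "solar_farm", "reduction_percent": 99.4})
tampered = {**report, "reduction_percent": 99.9}  # edit a value after hashing

print(verify_integrity_hash(report))    # True
print(verify_integrity_hash(tampered))  # False
```

The hash alone only detects accidental or naive edits, since anyone can recompute it; the Ed25519 signature is what ties the report to the key holder.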
Creating Custom Benchmarks
You can add a new vertical to the benchmark suite in four steps:
# Step 1 — Create a config directory and JSON file
# examples/industry_benchmarks/my_vertical/simulator_config.json
{
  "sensors": [
    {
      "id": "tank_level",
      "name": "Tank Level",
      "type": "pressure",
      "unit": "bar",
      "normal_range": {"min": 1.0, "max": 5.0},
      "alert_threshold": {"min": 0.5, "max": 5.5},
      "noise_std": 0.05,
      "baseline": 3.0,
      "sampling_interval_seconds": 60
    }
  ]
}
# Step 2 — Register the vertical in run_all.py
VERTICALS = [
    # ... existing verticals ...
    ("my_vertical", Path("examples/industry_benchmarks/my_vertical/simulator_config.json")),
]
# Step 3 — Add a scenario injector
def _inject_my_vertical(values, timestamps, sensor_names, config):
    # Inject a pressure spike between hours 10 and 12
    # (36000 s = hour 10, 43200 s = hour 12; values is modified in place)
    for i, ts in enumerate(timestamps):
        if 36000 <= ts < 43200:
            values[i] += 2.0

SCENARIO_INJECTORS = {
    # ... existing injectors ...
    "my_vertical": _inject_my_vertical,
}
# Step 4 — Run and verify
python examples/industry_benchmarks/run_all.py
python examples/industry_benchmarks/generate_case_studies.py
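Before running the full suite, the injector can be exercised on its own. The sketch below assumes timestamps are seconds since the start of the simulation and that values is a flat list aligned with them; the harness's real data layout is not documented in this article, so treat both as assumptions.

```python
def _inject_my_vertical(values, timestamps, sensor_names, config):
    # Same injector as in Step 3: +2.0 bar between hours 10 and 12, in place
    for i, ts in enumerate(timestamps):
        if 36000 <= ts < 43200:
            values[i] += 2.0


# One sample per minute over 24 h, matching sampling_interval_seconds = 60
timestamps = list(range(0, 86400, 60))
values = [3.0] * len(timestamps)  # flat baseline of 3.0 bar, no noise

_inject_my_vertical(values, timestamps, ["tank_level"], {})

spiked = [ts for ts, v in zip(timestamps, values) if v > 3.0]
print(len(spiked), spiked[0], spiked[-1])  # 120 36000 43140
print(max(values))                         # 5.0
```

Note that a 2.0 bar offset on a 3.0 bar baseline peaks at 5.0 bar, which is the top of normal_range but still under the alert_threshold maximum of 5.5 in the Step 1 config. Whether that produces an event depends on how the pipeline rules use these thresholds, so a larger offset may be needed if no events appear.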
Cost Model Assumptions
The built-in cost model uses conservative UK pricing:
- SIM plan: £0.01/day active fee + £0.005/MB data
- AWS IoT Core: £1.00 per million messages
- S3 storage: £0.023/GB/month
You can override these when constructing CostModel programmatically:
from pyvorin_edge.cost_model import CostModel, TrafficModel

traffic = TrafficModel(
    properties=1,
    sensors_per_property=8,
    readings_per_sensor_per_day=86400,
    raw_payload_bytes=120,
    edge_summaries_per_sensor_per_day=24,
    edge_payload_bytes=256,
)
cost = CostModel(
    traffic=traffic,
    sim_active_fee_per_day=0.02,  # Your carrier pricing
    sim_data_per_mb=0.008,
    aws_iot_price_per_million=1.20,
    s3_price_per_gb_month=0.025,
)
print(cost.to_dict())
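To get a feel for how these prices interact, the built-in rates can be applied by hand to the traffic figures above. This is a back-of-the-envelope sketch, not the CostModel implementation: monthly_cost is a hypothetical function, and it assumes a 30-day month, 1 MB = 10^6 bytes, one IoT Core message per reading or summary, and one month of data held in S3. The real model may combine the components differently.

```python
def monthly_cost(messages_per_day, payload_bytes, days=30,
                 sim_fee_per_day=0.01, sim_per_mb=0.005,
                 iot_per_million=1.00, s3_per_gb_month=0.023):
    """Estimate monthly GBP cost for one property (assumptions in lead-in)."""
    total_bytes = messages_per_day * payload_bytes * days
    sim = sim_fee_per_day * days + sim_per_mb * total_bytes / 1e6
    iot = iot_per_million * messages_per_day * days / 1e6
    s3 = s3_per_gb_month * total_bytes / 1e9
    return sim + iot + s3


# 8 sensors: raw sends every 1 Hz reading, edge sends 24 summaries per sensor
raw = monthly_cost(messages_per_day=8 * 86400, payload_bytes=120)
edge = monthly_cost(messages_per_day=8 * 24, payload_bytes=256)

print(f"raw: {raw:.2f} GBP, edge: {edge:.2f} GBP, savings: {raw - edge:.2f} GBP")
```

Under these assumptions the per-message and per-megabyte charges dominate the raw path, while the edge path is almost entirely the fixed SIM fee.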
Automating Benchmark Runs in CI
For regression testing, run the benchmark suite in GitHub Actions or GitLab CI:
# .github/workflows/benchmark.yml
name: Edge Benchmark
on: [push]
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -e ./edge_sdk
      - run: python examples/industry_benchmarks/run_all.py --output benchmark_results.json
      - run: python examples/industry_benchmarks/generate_case_studies.py
Note that NEON is an ARM SIMD extension, so NEON-dependent results will not be representative on x86 runners; use a self-hosted ARM runner or a Raspberry Pi for accurate performance numbers.
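A CI job only catches regressions if something inspects the output. One possible gate is sketched below; check_regressions is a hypothetical helper, and the default thresholds simply reuse the figures quoted earlier in this article (85% reduction, 0.01 ms median latency). The second sample record is made up to show a failure.

```python
def check_regressions(results, min_reduction=85.0, max_latency_ms=0.01):
    """Return human-readable failures; an empty list means the run passed."""
    failures = []
    for rec in results:
        name = rec["vertical"]
        if rec["reduction_percent"] < min_reduction:
            failures.append(
                f"{name}: reduction {rec['reduction_percent']}% < {min_reduction}%"
            )
        if rec["latency_p50_ms"] > max_latency_ms:
            failures.append(
                f"{name}: p50 latency {rec['latency_p50_ms']} ms > {max_latency_ms} ms"
            )
    return failures


# Illustrative records only; in CI, load benchmark_results.json with json.load
sample = [
    {"vertical": "solar_farm", "reduction_percent": 99.4, "latency_p50_ms": 0.0032},
    {"vertical": "cold_chain", "reduction_percent": 80.0, "latency_p50_ms": 0.0032},
]
for line in check_regressions(sample):
    print(line)
```

In a workflow step, the script would exit with a non-zero status when the returned list is non-empty. On x86 runners it may be safer to gate only on reduction_percent and leave the latency check to ARM hardware.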