Regular Expression Optimisation

When to use the re module versus pure Python string parsing.

Published May 30, 2026

When the re Module Wins

In CPython the re module's matching engine is implemented in C, so it is usually faster than an equivalent pure Python loop for pattern matching:

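import re
text = "Contact alice@example.com or bob@example.org"  # sample input
emails = re.findall(r'[\w.]+@[\w.]+', text)  # loose pattern, not a full address validator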
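
When the same pattern runs many times, one common way to keep regex overhead low is to compile it once at module level and reuse the compiled object. The re module already caches recently used patterns internally, so the saving is modest, but the compiled object skips the cache lookup on each call. The sketch below reuses the email pattern from above; EMAIL_RE, find_emails and the sample strings are illustrative names, not part of the original example.

```python
import re

# Compiled once at import time. The pattern is the loose one from above,
# not a full address validator.
EMAIL_RE = re.compile(r'[\w.]+@[\w.]+')


def find_emails(text: str) -> list[str]:
    """Return every substring that looks like an email address."""
    return EMAIL_RE.findall(text)


if __name__ == "__main__":
    lines = [
        "Contact alice@example.com",
        "or bob@example.org today",
        "no address here",
    ]
    for line in lines:
        print(find_emails(line))
```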

When Pyvorin Wins

For simple character scanning or custom tokenisation, compiled Python can match or beat a regex once the regex's startup overhead is counted:

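def extract_numbers(text: str) -> list[int]:
    numbers = []
    current = []  # digits of the number being scanned
    for ch in text:
        # isdecimal() accepts only characters int() can parse;
        # isdigit() also accepts e.g. '²', which int() rejects.
        if ch.isdecimal():
            current.append(ch)
        elif current:
            # A non-digit character ends the current run.
            numbers.append(int(''.join(current)))
            current = []
    if current:  # flush a number that runs to the end of the text
        numbers.append(int(''.join(current)))
    return numbers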
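
Whether a hand-written scanner actually wins depends on input size and on how the code is run, so it is worth measuring. The sketch below assumes nothing about Pyvorin itself and runs under a plain interpreter: it times a scanner like the one above against a precompiled \d+ pattern using the standard timeit module. The names scan_numbers and regex_numbers and the sample string are illustrative.

```python
import re
import timeit

DIGITS = re.compile(r'\d+')  # compiled once, reused on every call


def scan_numbers(text: str) -> list[int]:
    """Hand-written scanner, same approach as extract_numbers above."""
    numbers = []
    current = []
    for ch in text:
        if ch.isdecimal():
            current.append(ch)
        elif current:
            numbers.append(int(''.join(current)))
            current = []
    if current:
        numbers.append(int(''.join(current)))
    return numbers


def regex_numbers(text: str) -> list[int]:
    """Regex equivalent of scan_numbers."""
    return [int(match) for match in DIGITS.findall(text)]


if __name__ == "__main__":
    sample = "order 66 shipped 12 items on day 7 " * 100  # illustrative input
    # Check that both versions agree before comparing their speed.
    assert scan_numbers(sample) == regex_numbers(sample)
    for fn in (scan_numbers, regex_numbers):
        seconds = timeit.timeit(lambda: fn(sample), number=200)
        print(f"{fn.__name__}: {seconds:.4f}s for 200 calls")
```

Run the same harness against the compiled build to see where the crossover falls for your inputs; the figures from a plain interpreter say nothing about compiled performance.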

Hybrid Approach

Use regex for complex patterns and Pyvorin for the surrounding transformation logic:

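import re

def process_document(text):
    tokens = re.split(r'\s+', text)  # regex handles the pattern work
    # compiled_transform: the compiled function, defined elsewhere
    return compiled_transform(tokens)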
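
To try the hybrid shape end to end, the sketch below fills in compiled_transform with a hypothetical stand-in that counts token frequencies. How a function is handed to Pyvorin is not covered here, so the stand-in is ordinary Python. The call to str.strip() and the empty-token filter are added assumptions, since re.split otherwise yields empty strings when the text starts or ends with whitespace.

```python
import re
from collections import Counter

WHITESPACE = re.compile(r'\s+')  # compiled once, reused on every call


def compiled_transform(tokens: list[str]) -> dict[str, int]:
    """Hypothetical stand-in for the compiled transformation step."""
    counts = Counter(token.lower() for token in tokens if token)
    return dict(counts)


def process_document(text: str) -> dict[str, int]:
    # Regex does the pattern work: split on runs of whitespace.
    tokens = WHITESPACE.split(text.strip())
    # The loop-style logic lives in the function intended for compilation.
    return compiled_transform(tokens)


if __name__ == "__main__":
    print(process_document("  The quick fox\njumps over the lazy dog  "))
```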