how it works. Two complementary passes. Positional: for each byte offset, if at least a fraction T of frames carry the same byte, emit it as a literal; otherwise emit a wildcard. This catches fixed-offset structure (Ethernet, IP, port-aware fields). Floating: globally mine recurring byte n-grams that appear in at least a fraction T of frames at any offset, greedily extend them, dedupe, and stitch them into the regex with .* between them. This catches attacks like DNS water torture, where the target domain shifts position per packet because the random prefix is variable-length.
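The two passes can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names positional_regex and floating_substrings are hypothetical, frames are compared only up to the shortest frame's length, and the floating pass grows candidates level by level instead of by the greedy extension described above. Stitching the floating substrings into the regex with .* is left out.

```python
from collections import Counter


def positional_regex(frames: list[bytes], threshold: float) -> bytes:
    """Emit a literal where >= threshold of frames agree on a byte, else '.'."""
    if not frames:
        return b""
    # Assumption: only offsets present in every frame are considered.
    length = min(len(f) for f in frames)
    parts = []
    for offset in range(length):
        byte, count = Counter(f[offset] for f in frames).most_common(1)[0]
        if count / len(frames) >= threshold:
            parts.append(b"\\x%02x" % byte)  # hex escape, safe for any byte value
        else:
            parts.append(b".")
    return b"".join(parts)


def floating_substrings(frames: list[bytes], threshold: float, min_len: int = 4) -> list[bytes]:
    """Return maximal substrings that occur in >= threshold of frames at any offset."""
    need = threshold * len(frames)

    def frequent(n: int) -> set[bytes]:
        # Document frequency: each frame counts an n-gram at most once.
        df = Counter()
        for f in frames:
            df.update({f[i:i + n] for i in range(len(f) - n + 1)})
        return {gram for gram, c in df.items() if c >= need}

    kept = []
    n = min_len
    current = frequent(n)
    while current:
        longer = frequent(n + 1)
        # Dedupe: drop any n-gram already covered by a longer frequent substring.
        kept.extend(g for g in current
                    if not any(g in longer_gram for longer_gram in longer))
        current, n = longer, n + 1
    return kept
```

The positional pattern is a bytes regex and should be compiled with re.DOTALL so that the wildcard also matches the 0x0a byte.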
switched-position alternation. Within wildcard runs of 4+ bytes, recurring n-grams of length ≥4 that appear at varying offsets in different packet subsets are emitted as (seqA|seqB|…). Capped at 5 alternations per region; if more candidates exist, the longest are kept.
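A rough sketch of how one alternation group could be built from the candidates found in a wildcard region. The helper name alternation_group is hypothetical, and how candidates are discovered is not shown; the sketch covers only the length filter, the cap of 5, and the longest-first selection described above.

```python
def alternation_group(candidates: list[bytes], cap: int = 5, min_len: int = 4) -> bytes:
    """Build (seqA|seqB|...) from candidates, keeping at most `cap` of the longest."""
    # Longest first; ties broken by byte value so the output is deterministic.
    ordered = sorted(set(candidates), key=lambda s: (-len(s), s))
    kept = [s for s in ordered if len(s) >= min_len][:cap]
    if not kept:
        return b""
    # Hex-escape every byte so regex metacharacters in the data stay literal.
    escaped = [b"".join(b"\\x%02x" % b for b in seq) for seq in kept]
    return b"(" + b"|".join(escaped) + b")"
```

For example, alternation_group([b"AAAA", b"BBBBB", b"CC"]) drops the 2-byte candidate and lists the 5-byte sequence before the 4-byte one.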
example packet. Pulled from a random matching frame. Left side: hex dump with literal-matched bytes highlighted against muted wildcard positions. Right side: parsed protocol tree (Ethernet / IP / L4 / DNS / HTTP / TLS / NTP) with each field's value highlighted based on whether the regex pins its bytes. Highlighting reflects only the positional pass: floating-substring matches do not land at fixed offsets, so they do not highlight any bytes.
slider intuition. 50% target → tightest regex, but only the cleanest cohort matches. 90% target → loosest regex, matches most of the pcap including noise. The "actual match" stat above the cards may differ slightly from the target because positional and floating constraints intersect imperfectly; if it falls well below the target, the pcap is heterogeneous and you may need to pre-filter.
multi-vector mode. When a single regex can't capture a mixed attack (e.g. DNS amplification + SYN flood + HTTP Slowloris in one pcap), switch the mode toggle. Frames are grouped by (L4 proto, dst port), the top 5 groups by frame count are kept, and each gets its own analysis pipeline with an independent coverage slider and example packet. The combined-regex card at the bottom shows two outputs: separate per-vector regexes you can plug into a filter for each protocol/port, plus a single alternation regex (rx1)|(rx2)|… for whole-payload engines like Suricata content+pcre. Enable or disable individual vectors with the pill toggles to focus the combined output.
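The grouping and combining steps can be sketched as below. This assumes frames have already been parsed into (l4_proto, dst_port, payload) tuples, which is a hypothetical input shape; the names top_vectors and combine_regexes are likewise illustrative. The per-vector regex generation itself is not shown.

```python
def top_vectors(frames, limit: int = 5) -> dict:
    """Group payloads by (L4 proto, dst port) and keep the `limit` largest groups."""
    groups: dict[tuple, list[bytes]] = {}
    for proto, port, payload in frames:
        groups.setdefault((proto, port), []).append(payload)
    # Rank by frame count, largest first; sorted() is stable, so ties keep input order.
    ranked = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
    return dict(ranked[:limit])


def combine_regexes(regexes: list[str]) -> str:
    """Wrap each per-vector regex in a group and join them as (rx1)|(rx2)|..."""
    return "|".join(f"({rx})" for rx in regexes)
```

Each key of the returned dict corresponds to one vector; disabling a vector amounts to leaving its regex out of the list passed to combine_regexes.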