Rendered at 17:36:45 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
nextaccountic 2 days ago [-]
> input size normal hardened speedup w/ hardened
> 1,000 0.7ms 28us 25x
> 5,000 18ms 146us 123x
> 10,000 73ms 303us 241x
> 50,000 1.8s 1.6ms 1,125x
Why is there a normal mode if hardened mode is faster for all input sizes?
ieviev 2 days ago [-]
Sorry, finished the post just now with more comparisons on other inputs
The reason is just that the normal mode is faster in average non pathological cases
tracnar 2 days ago [-]
Could you have a heuristics based on the input size and the pattern to decide what to use?
ieviev 2 days ago [-]
Yes, this is entirely possible. you can even explore the automaton eagerly and detect if it's possible to loop from an accepting state to a nonaccepting one.
Exciting stuff for future work
nextaccountic 2 days ago [-]
Ripgrep does something like thhis. It has a meta regex engine that switches engine when it finds what looks like pathological cases (or rather, the regex-automata crate does, which is used by the regex crate, which powers ripgrep).
> 1,000 0.7ms 28us 25x
> 5,000 18ms 146us 123x
> 10,000 73ms 303us 241x
> 50,000 1.8s 1.6ms 1,125x
Why is there a normal mode if hardened mode is faster for all input sizes?
The reason is just that the normal mode is faster in average non pathological cases
Exciting stuff for future work
https://docs.rs/regex-automata/latest/regex_automata/meta/st...
Ripgrep in turn exposes some knobs to tweak the heuristics
https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#how...