regex_automata::hybrid::dfa

Struct Builder

pub struct Builder { /* private fields */ }

A builder for constructing a lazy deterministic finite automaton from regular expressions.

As a convenience, DFA::builder is an alias for Builder::new. The advantage of the former is that it often lets you avoid importing the Builder type directly.

This builder provides two main things:

  1. A few build routines for constructing a DFA from different kinds of inputs. The most convenient is Builder::build, which builds a DFA directly from a pattern string. The most flexible is Builder::build_from_nfa, which builds a DFA straight from an NFA.
  2. Configuration. Builder::configure is used with Config to configure aspects of the DFA and the construction process itself. Builder::syntax and Builder::thompson configure the regex parser and Thompson NFA construction, respectively. The syntax and thompson configurations only apply when building from a pattern string.

This builder always constructs a single lazy DFA. As such, it can only be used to construct regexes that either detect the presence of a match or find the end location of a match. A single DFA cannot produce both the start and end of a match. For that information, use a Regex, which can be similarly configured using regex::Builder.

The main reason to use a DFA directly is that the end location of a match is enough for your use case. A Regex constructs two lazy DFAs instead of one, since a second reverse DFA is needed to find the start of a match.

Example

This example shows how to build a lazy DFA that uses a tiny cache capacity and completely disables Unicode. That is:

  • Things such as \w, . and \b are no longer Unicode-aware. \w and \b are ASCII-only while . matches any byte except for \n (instead of any UTF-8 encoding of a Unicode scalar value except for \n). Things that are Unicode only, such as \pL, are not allowed.
  • The pattern itself is permitted to match invalid UTF-8. For example, things like [^a] that match any byte except for a are permitted.

use regex_automata::{
    hybrid::dfa::DFA,
    nfa::thompson,
    util::syntax,
    HalfMatch, Input,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Use a tiny cache and disable Unicode and UTF-8 in both the parser
    // and the Thompson NFA compiler.
    let dfa = DFA::builder()
        .configure(DFA::config().cache_capacity(5_000))
        .thompson(thompson::Config::new().utf8(false))
        .syntax(syntax::Config::new().unicode(false).utf8(false))
        .build(r"foo[^b]ar.*")?;
    let mut cache = dfa.create_cache();

    // The haystack is not valid UTF-8. The match ends at offset 10, just
    // before the trailing \n, because . does not match \n.
    let haystack = b"\xFEfoo\xFFar\xE2\x98\xFF\n";
    let expected = Some(HalfMatch::must(0, 10));
    let got = dfa.try_search_fwd(&mut cache, &Input::new(haystack))?;
    assert_eq!(expected, got);
    Ok(())
}

Implementations

impl Builder

pub fn new() -> Builder

Create a new lazy DFA builder with the default configuration.

pub fn build(&self, pattern: &str) -> Result<DFA, BuildError>

Build a lazy DFA from the given pattern.

If there was a problem parsing or compiling the pattern, then an error is returned.

pub fn build_many<P: AsRef<str>>( &self, patterns: &[P], ) -> Result<DFA, BuildError>

Build a lazy DFA from the given patterns.

When matches are returned, the pattern ID corresponds to the index of the pattern in the slice given.

pub fn build_from_nfa(&self, nfa: NFA) -> Result<DFA, BuildError>

Build a lazy DFA from the given NFA.

Note that this requires owning a thompson::NFA. While this may force you to clone the NFA, such a clone is not a deep clone. Namely, NFAs are defined internally to support shared ownership such that cloning is very cheap.

Example

This example shows how to build a lazy DFA if you already have an NFA in hand.

use regex_automata::{
    hybrid::dfa::DFA,
    nfa::thompson,
    HalfMatch, Input,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let haystack = "foo123bar";

    // This shows how to set non-default options for building an NFA.
    let nfa = thompson::Compiler::new()
        .configure(thompson::Config::new().shrink(true))
        .build(r"[0-9]+")?;
    // The builder takes ownership of the NFA.
    let dfa = DFA::builder().build_from_nfa(nfa)?;
    let mut cache = dfa.create_cache();
    // The digits span offsets 3..6, so the match ends at offset 6.
    let expected = Some(HalfMatch::must(0, 6));
    let got = dfa.try_search_fwd(&mut cache, &Input::new(haystack))?;
    assert_eq!(expected, got);
    Ok(())
}

pub fn configure(&mut self, config: Config) -> &mut Builder

Apply the given lazy DFA configuration options to this builder.

pub fn syntax(&mut self, config: syntax::Config) -> &mut Builder

Set the syntax configuration for this builder using syntax::Config.

This permits setting things like case insensitivity, Unicode and multi line mode.

These settings only apply when constructing a lazy DFA directly from a pattern.

pub fn thompson(&mut self, config: thompson::Config) -> &mut Builder

Set the Thompson NFA configuration for this builder using nfa::thompson::Config.

This permits setting things like whether the DFA should match the regex in reverse or if additional time should be spent shrinking the size of the NFA.

These settings only apply when constructing a lazy DFA directly from a pattern.

Trait Implementations

impl Clone for Builder

fn clone(&self) -> Builder

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Builder

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut T)

This is a nightly-only experimental API (clone_to_uninit).
Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.