aho_corasick::dfa

Struct DFA

source
pub struct DFA { /* private fields */ }
Expand description

A DFA implementation of Aho-Corasick.

When possible, prefer using AhoCorasick instead of this type directly. Using a DFA directly is typically only necessary when one needs access to the Automaton trait implementation.

This DFA can only be built by first constructing a noncontiguous::NFA. Both DFA::new and Builder::build do this for you automatically, but Builder::build_from_noncontiguous permits doing it explicitly.

A DFA provides the best possible search performance (in this crate) via two mechanisms:

  • All states use a dense representation for their transitions.
  • All failure transitions are pre-computed such that they are never explicitly handled at search time.

These two facts combined mean that every state transition is performed using a constant number of instructions. However, this comes at great cost. The memory usage of a DFA can be quite exorbitant. It is potentially multiple orders of magnitude greater than a contiguous::NFA for example. In exchange, a DFA will typically have better search speed than a contiguous::NFA, but not by orders of magnitude.

Unless you have a small number of patterns or memory usage is not a concern and search performance is critical, a DFA is usually not the best choice.

Moreover, unlike the NFAs in this crate, it is costly for a DFA to support for anchored and unanchored search configurations. Namely, since failure transitions are pre-computed, supporting both anchored and unanchored searches requires a duplication of the transition table, making the memory usage of such a DFA ever bigger. (The NFAs in this crate unconditionally support both anchored and unanchored searches because there is essentially no added cost for doing so.) It is for this reason that a DFA’s support for anchored and unanchored searches can be configured via Builder::start_kind. By default, a DFA only supports unanchored searches.

§Example

This example shows how to build an DFA directly and use it to execute Automaton::try_find:

use aho_corasick::{
    automaton::Automaton,
    dfa::DFA,
    Input, Match,
};

let patterns = &["b", "abc", "abcd"];
let haystack = "abcd";

let nfa = DFA::new(patterns).unwrap();
assert_eq!(
    Some(Match::must(0, 1..2)),
    nfa.try_find(&Input::new(haystack))?,
);

It is also possible to implement your own version of try_find. See the Automaton documentation for an example.

Implementations§

source§

impl DFA

source

pub fn new<I, P>(patterns: I) -> Result<DFA, BuildError>
where I: IntoIterator<Item = P>, P: AsRef<[u8]>,

Create a new Aho-Corasick DFA using the default configuration.

Use a Builder if you want to change the configuration.

source

pub fn builder() -> Builder

A convenience method for returning a new Aho-Corasick DFA builder.

This usually permits one to just import the DFA type.

Trait Implementations§

source§

impl Automaton for DFA

source§

fn start_state(&self, anchored: Anchored) -> Result<StateID, MatchError>

Returns the starting state for the given anchor mode. Read more
source§

fn next_state(&self, _anchored: Anchored, sid: StateID, byte: u8) -> StateID

Performs a state transition from sid for byte and returns the next state. Read more
source§

fn is_special(&self, sid: StateID) -> bool

Returns true if the given ID represents a “special” state. A special state is a dead, match or start state. Read more
source§

fn is_dead(&self, sid: StateID) -> bool

Returns true if the given ID represents a dead state. Read more
source§

fn is_match(&self, sid: StateID) -> bool

Returns true if the given ID represents a match state. Read more
source§

fn is_start(&self, sid: StateID) -> bool

Returns true if the given ID represents a start state. Read more
source§

fn match_kind(&self) -> MatchKind

Returns the match semantics that this automaton was built with.
source§

fn patterns_len(&self) -> usize

Returns the total number of patterns compiled into this automaton.
source§

fn pattern_len(&self, pid: PatternID) -> usize

Returns the length of the pattern for the given ID. Read more
source§

fn min_pattern_len(&self) -> usize

Returns the length, in bytes, of the shortest pattern in this automaton.
source§

fn max_pattern_len(&self) -> usize

Returns the length, in bytes, of the longest pattern in this automaton.
source§

fn match_len(&self, sid: StateID) -> usize

Returns the total number of matches for the given state ID. Read more
source§

fn match_pattern(&self, sid: StateID, index: usize) -> PatternID

Returns the pattern ID for the match state given by sid at the index given. Read more
source§

fn memory_usage(&self) -> usize

Returns the heap memory usage, in bytes, used by this automaton.
source§

fn prefilter(&self) -> Option<&Prefilter>

Returns a prefilter, if available, that can be used to accelerate searches for this automaton. Read more
source§

fn try_find(&self, input: &Input<'_>) -> Result<Option<Match>, MatchError>

Executes a non-overlapping search with this automaton using the given configuration. Read more
source§

fn try_find_overlapping( &self, input: &Input<'_>, state: &mut OverlappingState, ) -> Result<(), MatchError>

Executes a overlapping search with this automaton using the given configuration. Read more
source§

fn try_find_iter<'a, 'h>( &'a self, input: Input<'h>, ) -> Result<FindIter<'a, 'h, Self>, MatchError>
where Self: Sized,

Returns an iterator of non-overlapping matches with this automaton using the given configuration. Read more
source§

fn try_find_overlapping_iter<'a, 'h>( &'a self, input: Input<'h>, ) -> Result<FindOverlappingIter<'a, 'h, Self>, MatchError>
where Self: Sized,

Returns an iterator of overlapping matches with this automaton using the given configuration. Read more
source§

fn try_replace_all<B>( &self, haystack: &str, replace_with: &[B], ) -> Result<String, MatchError>
where Self: Sized, B: AsRef<str>,

Replaces all non-overlapping matches in haystack with strings from replace_with depending on the pattern that matched. The replace_with slice must have length equal to Automaton::patterns_len. Read more
source§

fn try_replace_all_bytes<B>( &self, haystack: &[u8], replace_with: &[B], ) -> Result<Vec<u8>, MatchError>
where Self: Sized, B: AsRef<[u8]>,

Replaces all non-overlapping matches in haystack with strings from replace_with depending on the pattern that matched. The replace_with slice must have length equal to Automaton::patterns_len. Read more
source§

fn try_replace_all_with<F>( &self, haystack: &str, dst: &mut String, replace_with: F, ) -> Result<(), MatchError>
where Self: Sized, F: FnMut(&Match, &str, &mut String) -> bool,

Replaces all non-overlapping matches in haystack by calling the replace_with closure given. Read more
source§

fn try_replace_all_with_bytes<F>( &self, haystack: &[u8], dst: &mut Vec<u8>, replace_with: F, ) -> Result<(), MatchError>
where Self: Sized, F: FnMut(&Match, &[u8], &mut Vec<u8>) -> bool,

Replaces all non-overlapping matches in haystack by calling the replace_with closure given. Read more
source§

fn try_stream_find_iter<'a, R: Read>( &'a self, rdr: R, ) -> Result<StreamFindIter<'a, Self, R>, MatchError>
where Self: Sized,

Returns an iterator of non-overlapping matches with this automaton from the stream given. Read more
source§

fn try_stream_replace_all<R, W, B>( &self, rdr: R, wtr: W, replace_with: &[B], ) -> Result<()>
where Self: Sized, R: Read, W: Write, B: AsRef<[u8]>,

Replaces all non-overlapping matches in rdr with strings from replace_with depending on the pattern that matched, and writes the result to wtr. The replace_with slice must have length equal to Automaton::patterns_len. Read more
source§

fn try_stream_replace_all_with<R, W, F>( &self, rdr: R, wtr: W, replace_with: F, ) -> Result<()>
where Self: Sized, R: Read, W: Write, F: FnMut(&Match, &[u8], &mut W) -> Result<()>,

Replaces all non-overlapping matches in rdr by calling the replace_with closure given and writing the result to wtr. Read more
source§

impl Clone for DFA

source§

fn clone(&self) -> DFA

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl Debug for DFA

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Auto Trait Implementations§

§

impl Freeze for DFA

§

impl RefUnwindSafe for DFA

§

impl Send for DFA

§

impl Sync for DFA

§

impl Unpin for DFA

§

impl UnwindSafe for DFA

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> CloneToUninit for T
where T: Clone,

source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> ToOwned for T
where T: Clone,

source§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

source§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.