onigleam/token

Types

Kind of assertion

pub type AssertionKind {
  StartOfLine
  EndOfLine
  StartOfString
  EndOfString
  EndOfStringBeforeNewline
  WordBoundary
  NonWordBoundary
  GraphemeBoundary
  NonGraphemeBoundary
  SearchStart
}

Constructors

  • StartOfLine

    ^ - start of line (or string in singleline mode)

  • EndOfLine

    $ - end of line (or string in singleline mode)

  • StartOfString

    \A - start of string

  • EndOfString

    \z - end of string (absolute)

  • EndOfStringBeforeNewline

    \Z - end of string (allows trailing newline)

  • WordBoundary

    \b - word boundary

  • NonWordBoundary

    \B - non-word boundary

  • GraphemeBoundary

    \y - grapheme cluster boundary

  • NonGraphemeBoundary

    \Y - non-grapheme cluster boundary

  • SearchStart

    \G - start of search (not supported in gleam_regexp)
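The mapping listed above can be written as an exhaustive case expression. This is a minimal sketch: `assertion_syntax` is a hypothetical helper, not part of onigleam, and it only restates the syntax documented for each constructor.

```gleam
import onigleam/token

/// Hypothetical helper (not part of onigleam): returns the pattern syntax
/// documented for each assertion kind.
pub fn assertion_syntax(kind: token.AssertionKind) -> String {
  case kind {
    token.StartOfLine -> "^"
    token.EndOfLine -> "$"
    token.StartOfString -> "\\A"
    token.EndOfString -> "\\z"
    token.EndOfStringBeforeNewline -> "\\Z"
    token.WordBoundary -> "\\b"
    token.NonWordBoundary -> "\\B"
    token.GraphemeBoundary -> "\\y"
    token.NonGraphemeBoundary -> "\\Y"
    token.SearchStart -> "\\G"
  }
}
```

Because the case expression has no catch-all branch, the compiler reports a missing branch if a constructor is ever added to `AssertionKind`.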

Reference for backreference

pub type BackreferenceRef {
  BackrefNumber(Int)
  BackrefName(String)
  BackrefRelative(Int)
}

Constructors

  • BackrefNumber(Int)

    Numbered backreference \1, \2, etc.

  • BackrefName(String)

    Named backreference \k<name> or \k'name'

  • BackrefRelative(Int)

    Relative backreference \k<-1>
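As an illustration of consuming the three reference forms, the sketch below turns a `BackreferenceRef` into a short description. `describe_backreference` is a hypothetical name, and whether the tokenizer stores the offset in `BackrefRelative` as a positive or negative `Int` is not stated here, so the code prints the value as-is.

```gleam
import gleam/int
import onigleam/token

/// Hypothetical helper (not part of onigleam): describes which group a
/// backreference points at.
pub fn describe_backreference(ref: token.BackreferenceRef) -> String {
  case ref {
    token.BackrefNumber(n) -> "group number " <> int.to_string(n)
    token.BackrefName(name) -> "group named " <> name
    // The sign convention of the offset is an assumption left open here.
    token.BackrefRelative(offset) -> "relative offset " <> int.to_string(offset)
  }
}
```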

Kind of character set

pub type CharacterSetKind {
  Dot
  Digit
  Word
  Space
  Hex
  Newline
  NotNewline
  Any
  TextSegment
  Property
  Posix
}

Constructors

  • Dot

    . - any character (except newline unless dotall)

  • Digit

    \d, \D - digit

  • Word

    \w, \W - word character

  • Space

    \s, \S - whitespace

  • Hex

    \h, \H - hex digit

  • Newline

    \R - newline sequence

  • NotNewline

    \N - not newline

  • Any

    \O - any character (true any)

  • TextSegment

    \X - extended grapheme cluster

  • Property

    \p{…} or \P{…} - Unicode property

  • Posix

    [:alpha:] etc - POSIX class

Kind of directive

pub type DirectiveKind {
  Keep
  Flags
}

Constructors

  • Keep

    \K - keep (reset match start)

  • Flags

    (?imx) - flag directive (modifies flags for rest of pattern/group)

Flag modifiers for groups

pub type FlagModifiers {
  FlagModifiers(
    enable: option.Option(FlagSet),
    disable: option.Option(FlagSet),
  )
}

Constructors

  • FlagModifiers(
      enable: option.Option(FlagSet),
      disable: option.Option(FlagSet),
    )

Set of flags

pub type FlagSet {
  FlagSet(ignore_case: Bool, dot_all: Bool, extended: Bool)
}

Constructors

  • FlagSet(ignore_case: Bool, dot_all: Bool, extended: Bool)

    Arguments

    ignore_case

    i - case insensitive

    dot_all

    m in Oniguruma (s in PCRE/JS) - dot matches newline

    extended

    x - extended mode (whitespace ignored, # comments)
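A `FlagSet` can be read through its labelled fields. The sketch below renders a flag set back into the Oniguruma flag letters given above (i, m, x). `flag_letters` is a hypothetical helper, not part of onigleam.

```gleam
import onigleam/token

/// Hypothetical helper (not part of onigleam): renders a flag set as
/// Oniguruma flag letters, using "m" for dot_all as documented above.
pub fn flag_letters(flags: token.FlagSet) -> String {
  let i = case flags.ignore_case {
    True -> "i"
    False -> ""
  }
  let m = case flags.dot_all {
    True -> "m"
    False -> ""
  }
  let x = case flags.extended {
    True -> "x"
    False -> ""
  }
  i <> m <> x
}
```

For example, `flag_letters(token.FlagSet(ignore_case: True, dot_all: False, extended: True))` evaluates to `"ix"`.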

Kind of group

pub type GroupKind {
  Capturing
  NonCapturing
  Atomic
  Lookahead
  NegativeLookahead
  Lookbehind
  NegativeLookbehind
  Absence
}

Constructors

  • Capturing

    (…) or (?<name>…) - capturing group

  • NonCapturing

    (?:…) - non-capturing group (also used for flag groups like (?i:…))

  • Atomic

    (?>…) - atomic group (not supported)

  • Lookahead

    (?=…) - positive lookahead

  • NegativeLookahead

    (?!…) - negative lookahead (negate=true)

  • Lookbehind

    (?<=…) - positive lookbehind

  • NegativeLookbehind

    (?<!…) - negative lookbehind (negate=true)

  • Absence

    (?~…) - absence operator
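Code that walks tokens often needs to treat the four lookaround kinds alike. A minimal sketch, assuming a hypothetical helper named `is_lookaround` that is not part of onigleam:

```gleam
import onigleam/token

/// Hypothetical helper (not part of onigleam): True for the four
/// lookahead/lookbehind kinds, False for every other group kind.
pub fn is_lookaround(kind: token.GroupKind) -> Bool {
  case kind {
    token.Lookahead
    | token.NegativeLookahead
    | token.Lookbehind
    | token.NegativeLookbehind -> True
    _ -> False
  }
}
```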

Kind of quantifier

pub type QuantifierKind {
  Greedy
  Lazy
  Possessive
}

Constructors

  • Greedy

    Greedy quantifier (default)

  • Lazy

    Lazy quantifier (?)

  • Possessive

    Possessive quantifier (+) - not supported

Reference for subroutine calls

pub type SubroutineRef {
  SubroutineRecursion
  SubroutineNumber(Int)
  SubroutineName(String)
  SubroutineRelative(Int)
}

Constructors

  • SubroutineRecursion

    \g<0> - recursion (whole pattern)

  • SubroutineNumber(Int)

    \g<n> - call to numbered group

  • SubroutineName(String)

    \g<name> - call to named group

  • SubroutineRelative(Int)

    \g<+n> or \g<-n> - relative call

A token produced by the tokenizer

pub type Token {
  Alternator(raw: String)
  Assertion(kind: AssertionKind, raw: String)
  Backreference(ref: BackreferenceRef, raw: String)
  Character(value: Int, raw: String)
  CharacterClassClose(raw: String)
  CharacterClassHyphen(raw: String)
  CharacterClassIntersector(raw: String)
  CharacterClassOpen(negate: Bool, raw: String)
  CharacterSet(
    kind: CharacterSetKind,
    negate: Bool,
    value: option.Option(String),
    raw: String,
  )
  Directive(
    kind: DirectiveKind,
    flags: option.Option(FlagModifiers),
    raw: String,
  )
  GroupClose(raw: String)
  GroupOpen(
    kind: GroupKind,
    flags: option.Option(FlagModifiers),
    name: option.Option(String),
    number: option.Option(Int),
    negate: Bool,
    raw: String,
  )
  Quantifier(
    kind: QuantifierKind,
    min: Int,
    max: option.Option(Int),
    raw: String,
  )
  Subroutine(ref: SubroutineRef, raw: String)
}

Constructors

  • Alternator(raw: String)

    Alternation operator |

  • Assertion(kind: AssertionKind, raw: String)

    Assertion like ^, $, \b, \B, \A, \z, \Z, \y, \Y, \G

  • Backreference(ref: BackreferenceRef, raw: String)

    Backreference like \1, \k<name>, \k'name'

  • Character(value: Int, raw: String)

    A single character (literal or escape sequence)

  • CharacterClassClose(raw: String)

    Character class close ]

  • CharacterClassHyphen(raw: String)

    Hyphen in character class (potential range operator)

  • CharacterClassIntersector(raw: String)

    Character class intersection &&

  • CharacterClassOpen(negate: Bool, raw: String)

    Character class open [ or [^

  • CharacterSet(
      kind: CharacterSetKind,
      negate: Bool,
      value: option.Option(String),
      raw: String,
    )

    Character set like ., \d, \w, \s, \p{…}, [:alpha:]

  • Directive(
      kind: DirectiveKind,
      flags: option.Option(FlagModifiers),
      raw: String,
    )

    Directive like \K or flag directive (?im)

  • GroupClose(raw: String)

    Group close )

  • GroupOpen(
      kind: GroupKind,
      flags: option.Option(FlagModifiers),
      name: option.Option(String),
      number: option.Option(Int),
      negate: Bool,
      raw: String,
    )

    Group open with various kinds

  • Quantifier(
      kind: QuantifierKind,
      min: Int,
      max: option.Option(Int),
      raw: String,
    )

    Quantifier like *, +, ?, {n}, {n,}, {n,m}

  • Subroutine(ref: SubroutineRef, raw: String)

    Subroutine call \g<name> or \g<0>
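Since every `Token` constructor has labelled fields, a consumer can match on only the fields it needs and ignore the rest with `..`. The sketch below is an illustration only: `summarize` is a hypothetical helper, and reading `max: None` as "no upper bound" is an assumption about the tokenizer, not something this page states.

```gleam
import gleam/int
import gleam/option.{None, Some}
import onigleam/token

/// Hypothetical helper (not part of onigleam): a one-line summary of a token.
pub fn summarize(tok: token.Token) -> String {
  case tok {
    token.GroupOpen(name: Some(name), ..) -> "open group named " <> name
    token.GroupOpen(..) -> "open group"
    token.GroupClose(..) -> "close group"
    token.Quantifier(min: min, max: Some(max), ..) ->
      "repeat " <> int.to_string(min) <> " to " <> int.to_string(max)
    // Assumption: None means no maximum was given.
    token.Quantifier(min: min, max: None, ..) ->
      "repeat " <> int.to_string(min) <> ", no max"
    // Every other token is reported by its raw source text.
    other -> "other: " <> token.raw(other)
  }
}
```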

Values

pub fn empty_flag_set() -> FlagSet

Create an empty flag set

pub fn flag_set_with_dot_all() -> FlagSet
pub fn flag_set_with_extended() -> FlagSet
pub fn flag_set_with_ignore_case() -> FlagSet

Create a flag set with a single flag enabled
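These helpers can be combined with `FlagModifiers` when building tokens by hand, for instance in tests. A minimal sketch: `enable_ignore_case` is a hypothetical name, and it is assumed (not confirmed here) that `flag_set_with_ignore_case` leaves the other two flags set to `False`.

```gleam
import gleam/option.{None, Some}
import onigleam/token

/// Hypothetical helper (not part of onigleam): modifiers that enable
/// case-insensitive matching and disable nothing.
pub fn enable_ignore_case() -> token.FlagModifiers {
  token.FlagModifiers(
    enable: Some(token.flag_set_with_ignore_case()),
    disable: None,
  )
}
```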

pub fn raw(token: Token) -> String

Get the raw string from a token
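One use of `raw` is to rebuild source text from a list of tokens. The sketch below builds the tokens for `a|b` by hand rather than calling the tokenizer. `join_raw` and `example` are hypothetical names, and treating `Character.value` as a Unicode code point (97 for `a`, 98 for `b`) is an assumption.

```gleam
import gleam/list
import gleam/string
import onigleam/token

/// Hypothetical helper (not part of onigleam): concatenates the raw text
/// of each token in order.
pub fn join_raw(tokens: List(token.Token)) -> String {
  tokens
  |> list.map(token.raw)
  |> string.concat
}

/// Hand-built tokens for the pattern `a|b`.
pub fn example() -> String {
  join_raw([
    // Assumption: value holds the character's code point.
    token.Character(value: 97, raw: "a"),
    token.Alternator(raw: "|"),
    token.Character(value: 98, raw: "b"),
  ])
}
```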
