Module tokenizers::tokenizer::normalizer
Structs
- A NormalizedString takes care of processing an “original” string to modify it and obtain a “normalized” string. It keeps both versions of the string and the alignment information between them, and provides an interface to retrieve ranges of each string, using offsets from either of them.
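To make the idea of alignments concrete, here is a minimal standalone sketch. It is not the crate's implementation: the `AlignedString` type, its fields, and the `lowercase_without_spaces` and `original_for` methods are hypothetical names invented for illustration, and the sketch assumes one alignment entry per normalized char, which may differ from how the library stores alignments.

```rust
use std::ops::Range;

/// Hypothetical type: keeps an original string, a normalized copy,
/// and where each normalized char came from in the original.
struct AlignedString {
    original: String,
    normalized: String,
    /// For each char of `normalized`, the (start, end) byte range
    /// of `original` that produced it.
    alignments: Vec<(usize, usize)>,
}

impl AlignedString {
    /// Example normalization (an assumption of this sketch):
    /// drop whitespace and lowercase everything else.
    fn lowercase_without_spaces(original: &str) -> Self {
        let mut normalized = String::new();
        let mut alignments = Vec::new();
        for (start, ch) in original.char_indices() {
            if ch.is_whitespace() {
                continue; // removed chars leave no alignment entry
            }
            let end = start + ch.len_utf8();
            // Lowercasing can yield several chars; all map to the same source.
            for lower in ch.to_lowercase() {
                normalized.push(lower);
                alignments.push((start, end));
            }
        }
        Self { original: original.to_string(), normalized, alignments }
    }

    /// Given a char range over `normalized`, return the slice of
    /// `original` that it covers, or None if the range is empty or invalid.
    fn original_for(&self, chars: Range<usize>) -> Option<&str> {
        if chars.start >= chars.end {
            return None;
        }
        let first = self.alignments.get(chars.start)?;
        let last = self.alignments.get(chars.end - 1)?;
        self.original.get(first.0..last.1)
    }
}
```

With this sketch, normalizing `"Hello World"` yields `"helloworld"`, and asking for normalized chars `5..10` maps back to `"World"` in the original even though the space was removed.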
Enums
- The possible offsets referential
- Represents a Range usable by the NormalizedString to index its content. A Range can use indices relative to either the Original or the Normalized string.
- Defines the expected behavior for the delimiter of a Split Pattern. When splitting on '-' for example, with input the-final--countdown:
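As an illustration of why delimiter handling matters, the following standalone sketch splits that same input in two different ways. The `DelimiterHandling` enum, its two variants, and the `split_on` function are hypothetical names for this example only and are not taken from the crate; skipping empty pieces between adjacent delimiters is also an assumption of the sketch.

```rust
/// Hypothetical options for what to do with the delimiter itself.
#[derive(Clone, Copy)]
enum DelimiterHandling {
    /// Drop every delimiter from the output.
    Removed,
    /// Keep every delimiter as its own piece.
    Isolated,
}

/// Split `input` on `delimiter`, handling the delimiter as requested.
/// Empty pieces (e.g. between two adjacent delimiters) are skipped.
fn split_on(input: &str, delimiter: char, handling: DelimiterHandling) -> Vec<&str> {
    let mut pieces = Vec::new();
    let mut start = 0; // byte offset where the current piece begins
    for (idx, ch) in input.char_indices() {
        if ch == delimiter {
            if start < idx {
                pieces.push(&input[start..idx]);
            }
            let after = idx + ch.len_utf8();
            if let DelimiterHandling::Isolated = handling {
                pieces.push(&input[idx..after]);
            }
            start = after;
        }
    }
    // Whatever follows the last delimiter is the final piece.
    if start < input.len() {
        pieces.push(&input[start..]);
    }
    pieces
}
```

Under these assumptions, `split_on("the-final--countdown", '-', DelimiterHandling::Removed)` gives `["the", "final", "countdown"]`, while the `Isolated` variant gives `["the", "-", "final", "-", "-", "countdown"]`.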
Functions
- Convert the given range from bytes to char
- Convert the given range from char to bytes
- Returns a range of the given string slice, by indexing chars instead of bytes
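The difference between byte and char indexing only shows up with multi-byte UTF-8 text, so a small standalone sketch may help. The three functions below (`byte_range_to_char_range`, `char_range_to_byte_range`, `slice_by_chars`) are hypothetical helpers written for this example using only the standard library; they mirror the descriptions above but are not the crate's functions, and their exact signatures and error behavior are assumptions.

```rust
use std::ops::Range;

/// Map a byte range of `s` to the matching char range.
/// Returns None if the range is reversed, out of bounds,
/// or does not fall on char boundaries.
fn byte_range_to_char_range(s: &str, range: Range<usize>) -> Option<Range<usize>> {
    if range.start > range.end || range.end > s.len() {
        return None;
    }
    if !s.is_char_boundary(range.start) || !s.is_char_boundary(range.end) {
        return None;
    }
    let start = s[..range.start].chars().count();
    let len = s[range.start..range.end].chars().count();
    Some(start..start + len)
}

/// Map a char range of `s` to the matching byte range.
/// Returns None if the range is reversed or reaches past the last char.
fn char_range_to_byte_range(s: &str, range: Range<usize>) -> Option<Range<usize>> {
    if range.start > range.end {
        return None;
    }
    // Byte offset of every char start, plus the end of the string.
    let boundaries: Vec<usize> = s
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(s.len()))
        .collect();
    let start = *boundaries.get(range.start)?;
    let end = *boundaries.get(range.end)?;
    Some(start..end)
}

/// Slice `s` by char indices instead of byte indices.
fn slice_by_chars(s: &str, range: Range<usize>) -> Option<&str> {
    let bytes = char_range_to_byte_range(s, range)?;
    s.get(bytes)
}
```

For the string `"héllo"`, where `é` occupies two bytes, the byte range `1..3` corresponds to the char range `1..2`, and `slice_by_chars("héllo", 1..3)` returns `"él"`.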