Struct tokenizers::tokenizer::AddedToken
pub struct AddedToken {
pub content: String,
pub single_word: bool,
pub lstrip: bool,
pub rstrip: bool,
pub normalized: bool,
pub special: bool,
}
Represents a token added by the user on top of the existing Model vocabulary. An AddedToken can be configured to specify how it should behave in various situations, such as:
- Whether it should only match whole single words
- Whether it should include any whitespace on its left or right
Fields
content: String
The content of the added token
single_word: bool
Whether this token must match a whole single word, or may also match inside a word
lstrip: bool
Whether this token should strip whitespace on its left
rstrip: bool
Whether this token should strip whitespace on its right
normalized: bool
Whether this token should be normalized
special: bool
Whether this token is special
Implementations
impl AddedToken
pub fn from<S: Into<String>>(content: S, special: bool) -> Self
Build this token from the given content, specifying whether it is intended to be a special token. Special tokens are not normalized by default.
pub fn single_word(self, single_word: bool) -> Self
Specify whether this token should only match on whole single words, and never part of a word.
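The whole-word rule can be illustrated with a small boundary check. This is only a sketch: is_whole_word is a hypothetical helper, not part of the tokenizers API, and treating alphanumerics and '_' as word characters is an assumption about what counts as "part of a word".

```rust
/// Hypothetical helper (not part of the `tokenizers` API): returns true when
/// the byte range `start..end` of `text` is not glued to a word character on
/// either side. Assumption: word characters are alphanumerics and '_'.
fn is_whole_word(text: &str, start: usize, end: usize) -> bool {
    let is_word = |c: char| c.is_alphanumeric() || c == '_';
    // Character immediately before and immediately after the candidate match.
    let before = text[..start].chars().next_back();
    let after = text[end..].chars().next();
    !before.map_or(false, is_word) && !after.map_or(false, is_word)
}

fn main() {
    let token = "ing";
    for text in ["sing a song", "the ing suffix"] {
        let start = text.find(token).unwrap();
        let end = start + token.len();
        println!("{:?}: whole word = {}", text, is_whole_word(text, start, end));
    }
}
```

With single_word set to true, a match like the one inside "sing" would be rejected under this rule, while the standalone "ing" would be accepted.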
pub fn lstrip(self, lstrip: bool) -> Self
Specify whether this token should include all the whitespace on its left, in order to strip it out.
pub fn rstrip(self, rstrip: bool) -> Self
Specify whether this token should include all the whitespace on its right, in order to strip it out.
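One way to picture lstrip and rstrip is as widening the matched span so that it swallows adjacent whitespace. The sketch below assumes that reading; widen_match is a hypothetical helper written for illustration and is not how the crate necessarily implements it.

```rust
/// Hypothetical helper (not part of the `tokenizers` API): widen the byte
/// range `start..end` of a match so that it also covers adjacent whitespace,
/// depending on the `lstrip` / `rstrip` flags.
fn widen_match(
    text: &str,
    start: usize,
    end: usize,
    lstrip: bool,
    rstrip: bool,
) -> (usize, usize) {
    let mut new_start = start;
    let mut new_end = end;
    if lstrip {
        // Length of the text before the match once its trailing whitespace is trimmed.
        new_start = text[..start].trim_end().len();
    }
    if rstrip {
        // Number of leading whitespace bytes in the text after the match.
        let rest = &text[end..];
        new_end = end + (rest.len() - rest.trim_start().len());
    }
    (new_start, new_end)
}

fn main() {
    let text = "Hello  <mask>  world";
    let start = text.find("<mask>").unwrap();
    let end = start + "<mask>".len();
    let (s, e) = widen_match(text, start, end, true, true);
    println!("matched span: {:?}", &text[s..e]);
}
```

Under this reading, the whitespace inside the widened span belongs to the added token's match and is therefore absent from the neighbouring pieces of text.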
pub fn normalized(self, normalized: bool) -> Self
Specify whether this token should be normalized and match against its normalized version in the input text.
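The methods above each consume self and return Self, so they chain in builder style. The sketch below uses a local stand-in struct that mirrors the documented fields and signatures so that it compiles on its own; it is not the crate's type. Setting normalized to !special in from is an assumption derived from "Special tokens are not normalized by default", and the false defaults for the other flags are likewise assumed.

```rust
/// Local stand-in mirroring the documented fields; NOT the crate's own type.
#[derive(Debug, Clone, PartialEq)]
struct AddedToken {
    content: String,
    single_word: bool,
    lstrip: bool,
    rstrip: bool,
    normalized: bool,
    special: bool,
}

impl AddedToken {
    /// Same signature as documented. Assumptions: `normalized` defaults to
    /// `!special`, and the remaining flags default to `false`.
    fn from<S: Into<String>>(content: S, special: bool) -> Self {
        AddedToken {
            content: content.into(),
            single_word: false,
            lstrip: false,
            rstrip: false,
            normalized: !special,
            special,
        }
    }

    // Each setter takes `self` by value and returns it, which enables chaining.
    fn single_word(mut self, single_word: bool) -> Self {
        self.single_word = single_word;
        self
    }

    fn lstrip(mut self, lstrip: bool) -> Self {
        self.lstrip = lstrip;
        self
    }

    fn rstrip(mut self, rstrip: bool) -> Self {
        self.rstrip = rstrip;
        self
    }

    fn normalized(mut self, normalized: bool) -> Self {
        self.normalized = normalized;
        self
    }
}

fn main() {
    // A special token that also swallows whitespace on both sides.
    let mask = AddedToken::from("<mask>", true).lstrip(true).rstrip(true);
    // A regular token restricted to whole words, with normalization turned off.
    let word = AddedToken::from("tokenizers", false)
        .single_word(true)
        .normalized(false);
    println!("{:?}", mask);
    println!("{:?}", word);
}
```

With the real crate, the same chains should read identically, for example AddedToken::from("<mask>", true).lstrip(true), since the signatures listed above have the same shape.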
Trait Implementations
impl Clone for AddedToken
fn clone(&self) -> AddedToken
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for AddedToken
impl Default for AddedToken
impl<'de> Deserialize<'de> for AddedToken
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl Hash for AddedToken
impl PartialEq for AddedToken
fn eq(&self, other: &AddedToken) -> bool
Tests for self and other values to be equal, and is used by ==.