pub struct TokenOutputStream { /* private fields */ }
TokenOutputStream is a wrapper around a tokenizer that allows for streaming tokens to the user
rather than waiting for the full decoding to complete.
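A minimal sketch of the intended usage inside a generation loop, assuming the wrapped Tokenizer is the Hugging Face tokenizers::Tokenizer and that max_new_tokens and sample_next_id() stand in for a hypothetical model:
use tokenizers::Tokenizer;
let tokenizer = Tokenizer::from_file("tokenizer.json").unwrap(); // illustrative path
let mut stream = TokenOutputStream::new(tokenizer);
for _ in 0..max_new_tokens {
    let id: u32 = sample_next_id(); // hypothetical model call returning the next token id
    // Emit text as soon as the new token completes a word.
    if let Some(text) = stream.next_token(id)? {
        print!("{text}");
    }
}
// Flush anything still buffered once generation stops.
if let Some(rest) = stream.decode_rest()? {
    print!("{rest}");
}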
Implementations
impl TokenOutputStream
pub fn new(tokenizer: Tokenizer) -> Self
Creates a new TokenOutputStream instance.
Arguments
tokenizer - A Tokenizer instance to be used for tokenizing.
Returns
A new TokenOutputStream instance.
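Example
A short sketch, assuming the Tokenizer comes from the Hugging Face tokenizers crate and the file path is illustrative:
use tokenizers::Tokenizer;
// Load a tokenizer definition from disk, then wrap it for streaming decoding.
let tokenizer = Tokenizer::from_file("tokenizer.json").unwrap();
let stream = TokenOutputStream::new(tokenizer);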
pub fn into_inner(self) -> Tokenizer
Consumes the TokenOutputStream, returning the inner Tokenizer.
This method is used when the TokenOutputStream is no longer needed,
and you want to access the underlying Tokenizer.
Returns
The inner Tokenizer instance.
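Example
A sketch of reclaiming the tokenizer once streaming is finished; the encode call assumes the Hugging Face tokenizers API and the prompt text is illustrative:
// Recover the wrapped tokenizer; the stream can no longer be used afterwards.
let tokenizer = stream.into_inner();
// It can now be used directly again, e.g. to encode a new prompt.
let encoding = tokenizer.encode("a new prompt", true).unwrap();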
pub fn tokenizer(&self) -> &Tokenizer
Provides a reference to the inner Tokenizer.
This method is used when you want to access the underlying Tokenizer
but still keep the TokenOutputStream for further use.
Returns
A reference to the inner Tokenizer instance.
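Example
A sketch of inspecting the tokenizer without consuming the stream; get_vocab_size assumes the Hugging Face tokenizers API:
// Borrow the tokenizer while keeping the stream alive.
let vocab_size = stream.tokenizer().get_vocab_size(true);
println!("vocab size: {vocab_size}");
// `stream` remains usable afterwards.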
pub fn clear(&mut self)
Clears the TokenOutputStream.
This method is used to reset the state of the TokenOutputStream. It clears the tokens
and resets the prev_index and current_index to 0.
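Example
A sketch of reusing one stream across two unrelated sequences; the token ids are illustrative:
// Stream a first sequence of token ids.
for &id in &[1u32, 2, 3] {
    if let Some(text) = stream.next_token(id)? {
        print!("{text}");
    }
}
// Reset the token buffer and indices before starting a second, unrelated sequence.
stream.clear();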
pub fn next_token(&mut self, token: u32) -> Result<Option<String>>
Processes the next token and returns the decoded string if the token leads to a new word.
Arguments
token - The next token to process.
Returns
A Result which contains an Option with the decoded string if the token leads to a new word,
or None if it does not. Returns an error if the decoding fails.
Example
// Assuming that the `tokenizer.json` file contains the following vocab:
// { "hello": 1, "world": 2, "everybody": 3 }
let tokenizer = Tokenizer::from_file("path/to/tokenizer.json").unwrap();
let mut stream = TokenOutputStream::new(tokenizer);
let tokens: [u32; 4] = [1, 2, 1, 3];
let sent: String = tokens
.iter()
.filter_map(|token| stream.next_token(*token).ok())
.flatten()
.collect();
assert_eq!(sent, "hello world hello everybody");
pub fn decode_rest(&self) -> Result<Option<String>>
Decodes the remaining tokens and returns the decoded string if there are any new words.
Returns
A Result which contains an Option with the decoded string if there are any new words,
or None if there are not. Returns an error if the decoding fails.
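Example
This is typically called once after the generation loop, since next_token may hold back text that does not yet form a complete word; a minimal sketch:
// After the last next_token call, flush any text still held back.
if let Some(rest) = stream.decode_rest()? {
    print!("{rest}");
}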
pub fn decode_all(&self) -> Result<String>
Decodes all tokens in the TokenOutputStream and returns the decoded string.
Returns
A Result which contains the decoded string if the decoding is successful,
or an error if the decoding fails.
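Example
A sketch of retrieving the full text after it has already been streamed piece by piece:
// The complete decoded output, independent of what was already printed.
let full_text = stream.decode_all()?;
println!("generated {} chars", full_text.len());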
Auto Trait Implementations
impl !Freeze for TokenOutputStream
impl RefUnwindSafe for TokenOutputStream
impl Send for TokenOutputStream
impl Sync for TokenOutputStream
impl Unpin for TokenOutputStream
impl UnwindSafe for TokenOutputStream