pub struct TokenOutputStream { /* private fields */ }
`TokenOutputStream` is a wrapper around a tokenizer that allows for streaming tokens to the user rather than waiting for the full decoding to complete.
Implementations
impl TokenOutputStream

pub fn new(tokenizer: Tokenizer) -> Self

Creates a new `TokenOutputStream` instance.
Arguments

`tokenizer` - A `Tokenizer` instance to be used for tokenizing.

Returns

A new `TokenOutputStream` instance.
pub fn into_inner(self) -> Tokenizer

Consumes the `TokenOutputStream`, returning the inner `Tokenizer`.

This method is used when the `TokenOutputStream` is no longer needed and you want to take ownership of the underlying `Tokenizer`.

Returns

The inner `Tokenizer` instance.
pub fn tokenizer(&self) -> &Tokenizer

Provides a reference to the inner `Tokenizer`.

This method is used when you want to access the underlying `Tokenizer` but still keep the `TokenOutputStream` for further use.

Returns

A reference to the inner `Tokenizer` instance.
pub fn clear(&mut self)

Clears the `TokenOutputStream`.

This method resets the state of the `TokenOutputStream`: it clears the buffered tokens and resets `prev_index` and `current_index` to 0.
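The reset can be pictured with a toy sketch. The field names below are the ones the description above mentions; the struct itself and the leftover values are hypothetical, not the crate's real layout:

```rust
/// Toy state holder with the three fields named in the docs above
/// (`tokens`, `prev_index`, `current_index`). Hypothetical, for illustration.
struct ToyState {
    tokens: Vec<u32>,
    prev_index: usize,
    current_index: usize,
}

impl ToyState {
    /// What `clear` amounts to: drop the buffered tokens and
    /// rewind both indices to 0.
    fn clear(&mut self) {
        self.tokens.clear();
        self.prev_index = 0;
        self.current_index = 0;
    }
}

fn main() {
    // Leftover state from a hypothetical previous generation.
    let mut s = ToyState { tokens: vec![5, 6, 7], prev_index: 2, current_index: 3 };
    s.clear(); // ready to stream a fresh sequence
    assert!(s.tokens.is_empty());
    assert_eq!((s.prev_index, s.current_index), (0, 0));
}
```

Calling `clear` between prompts lets one `TokenOutputStream` be reused across generations instead of constructing a new wrapper each time.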
pub fn next_token(&mut self, token: u32) -> Result<Option<String>>

Processes the next token and returns the decoded string if the token leads to a new word.

Arguments

`token` - The next token to process.

Returns

A `Result` which contains an `Option` with the decoded string if the token leads to a new word, or `None` if it does not. Returns an error if the decoding fails.
Example

```rust
// Assuming that the `tokenizer.json` file contains the following vocab:
// { "hello": 1, "world": 2, "everybody": 3 }
let tokenizer = Tokenizer::from_file("path/to/tokenizer.json").unwrap();
let mut stream = TokenOutputStream::new(tokenizer);
let tokens: [u32; 4] = [1, 2, 1, 3];
let sent: String = tokens
    .iter()
    .filter_map(|token| stream.next_token(*token).ok())
    .flatten()
    .collect();
assert_eq!(sent, "hello world hello everybody");
```
pub fn decode_rest(&self) -> Result<Option<String>>

Decodes the remaining tokens and returns the decoded string if there are any new words.

Returns

A `Result` which contains an `Option` with the decoded string if there are any new words, or `None` if there are not. Returns an error if the decoding fails.
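Why a final flush is needed can be shown with a self-contained toy. The vocab, the word-boundary heuristic, and both helper functions below are hypothetical stand-ins for the real tokenizer logic; they only mimic the shape of `next_token` (buffer sub-word pieces, emit on a boundary) and `decode_rest` (flush whatever is still buffered):

```rust
use std::collections::HashMap;

/// Toy stand-in for `next_token`: buffer the piece and only emit once the
/// buffered text ends at a naive word boundary (a trailing space).
fn emit_piece(vocab: &HashMap<u32, &str>, pending: &mut Vec<u32>, token: u32) -> Option<String> {
    pending.push(token);
    let text: String = pending.iter().map(|t| vocab[t]).collect();
    if text.ends_with(' ') {
        pending.clear();
        Some(text)
    } else {
        None
    }
}

/// Toy stand-in for `decode_rest`: flush whatever is still buffered.
fn flush(vocab: &HashMap<u32, &str>, pending: &[u32]) -> Option<String> {
    if pending.is_empty() {
        None
    } else {
        Some(pending.iter().map(|t| vocab[t]).collect())
    }
}

fn main() {
    // Hypothetical sub-word vocab: "world" only exists split across pieces.
    let vocab: HashMap<u32, &str> = HashMap::from([(1, "hello "), (2, "wor"), (3, "ld")]);
    let mut pending = Vec::new();
    let mut out = String::new();
    for tok in [1u32, 2, 3] {
        if let Some(piece) = emit_piece(&vocab, &mut pending, tok) {
            out.push_str(&piece); // this is what gets streamed to the user
        }
    }
    // Without this final flush the trailing "world" would never be emitted,
    // because no word boundary follows the last token.
    if let Some(rest) = flush(&vocab, &pending) {
        out.push_str(&rest);
    }
    assert_eq!(out, "hello world");
}
```

The same pattern applies to the real API: after the `next_token` loop ends, call `decode_rest` once to recover any text still held back by the word-boundary check.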
pub fn decode_all(&self) -> Result<String>

Decodes all tokens in the `TokenOutputStream` and returns the decoded string.

Returns

A `Result` which contains the decoded string if the decoding is successful, or an error if the decoding fails.
Trait Implementations
Auto Trait Implementations
impl !Freeze for TokenOutputStream
impl RefUnwindSafe for TokenOutputStream
impl Send for TokenOutputStream
impl Sync for TokenOutputStream
impl Unpin for TokenOutputStream
impl UnwindSafe for TokenOutputStream
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoRequest<T> for T

fn into_request(self) -> Request<T>

Wraps `T` in a `tonic::Request`.