| # Encode Inputs |
|
|
| <tokenizerslangcontent> |
| <python> |
| These types represent all the different kinds of input that a `Tokenizer` accepts |
| when using [`~tokenizers.Tokenizer.encode_batch`]. |
|
|
| ## TextEncodeInput |
|
|
| <code>tokenizers.TextEncodeInput</code> |
|
|
| Represents a textual input for encoding. Can be either: |
| - A single sequence: [TextInputSequence](/docs/tokenizers/api/input-sequences#tokenizers.TextInputSequence) |
| - A pair of sequences: |
|   - A Tuple of [TextInputSequence](/docs/tokenizers/api/input-sequences#tokenizers.TextInputSequence) |
|   - Or a List of [TextInputSequence](/docs/tokenizers/api/input-sequences#tokenizers.TextInputSequence) of size 2 |
|
|
| alias of `Union`. |
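The three shapes listed above can be sketched with plain `typing` aliases. This is a minimal illustration, not the library's own definition: it assumes `TextInputSequence` is a plain `str`, and `is_text_encode_input` is a hypothetical helper introduced only for this example.

```python
from typing import List, Tuple, Union

# Local mirror of the alias described above (an assumption for illustration;
# the real definitions live in the `tokenizers` package).
TextInputSequence = str
TextEncodeInput = Union[
    TextInputSequence,
    Tuple[TextInputSequence, TextInputSequence],
    List[TextInputSequence],
]


def is_text_encode_input(value: object) -> bool:
    """Hypothetical helper: True if `value` has one of the shapes listed above."""
    if isinstance(value, str):
        # A single sequence.
        return True
    if isinstance(value, (tuple, list)):
        # A pair of sequences: exactly two raw strings.
        return len(value) == 2 and all(isinstance(item, str) for item in value)
    return False


single: TextEncodeInput = "Hello, world!"
pair_tuple: TextEncodeInput = ("What is a tokenizer?", "It splits text into tokens.")
pair_list: TextEncodeInput = ["What is a tokenizer?", "It splits text into tokens."]
```

A batch passed to `encode_batch` would then be a list whose items each have one of these shapes.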
|
|
| ## PreTokenizedEncodeInput |
|
|
| <code>tokenizers.PreTokenizedEncodeInput</code> |
|
|
| Represents a pre-tokenized input for encoding. Can be either: |
| - A single sequence: [PreTokenizedInputSequence](/docs/tokenizers/api/input-sequences#tokenizers.PreTokenizedInputSequence) |
| - A pair of sequences: |
|   - A Tuple of [PreTokenizedInputSequence](/docs/tokenizers/api/input-sequences#tokenizers.PreTokenizedInputSequence) |
|   - Or a List of [PreTokenizedInputSequence](/docs/tokenizers/api/input-sequences#tokenizers.PreTokenizedInputSequence) of size 2 |
|
|
| alias of `Union`. |
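A pre-tokenized input follows the same single-or-pair pattern, except that each sequence is already split into words. The sketch below assumes a `PreTokenizedInputSequence` is a list or tuple of `str`; the aliases are local mirrors and `classify_pretokenized` is a hypothetical helper, not part of the library.

```python
from typing import Any, List, Tuple, Union

# Local mirrors of the aliases described above (assumptions for illustration).
PreTokenizedInputSequence = Union[List[str], Tuple[str, ...]]
PreTokenizedEncodeInput = Union[
    PreTokenizedInputSequence,
    Tuple[PreTokenizedInputSequence, PreTokenizedInputSequence],
    List[PreTokenizedInputSequence],
]


def _is_word_sequence(value: Any) -> bool:
    """True if `value` is a list or tuple containing only strings."""
    return isinstance(value, (list, tuple)) and all(isinstance(w, str) for w in value)


def classify_pretokenized(value: Any) -> str:
    """Hypothetical helper: return 'single', 'pair', or 'invalid'."""
    if _is_word_sequence(value):
        return "single"
    if (
        isinstance(value, (list, tuple))
        and len(value) == 2
        and all(_is_word_sequence(v) for v in value)
    ):
        return "pair"
    return "invalid"


single = ["Hello", "world", "!"]
pair = (["What", "is", "this", "?"], ["A", "tokenizer", "."])
```

Note that a raw string such as `"Hello"` is not a valid pre-tokenized input under this sketch, since it has not been split into words.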
|
|
| ## EncodeInput |
|
|
| <code>tokenizers.EncodeInput</code> |
|
|
| Represents all the possible types of input for encoding. Can be: |
| - When `is_pretokenized=False`: [TextEncodeInput](#tokenizers.TextEncodeInput) |
| - When `is_pretokenized=True`: [PreTokenizedEncodeInput](#tokenizers.PreTokenizedEncodeInput) |
|
|
| alias of `Union`. |
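The `is_pretokenized` flag matters because the same Python value can be read in two ways. The sketch below illustrates this with a hypothetical helper, `describe_encode_input`, which is not part of the library and only mirrors the shapes described on this page (assuming text sequences are `str` and pre-tokenized sequences are lists or tuples of `str`).

```python
from typing import Any


def _all_str(items: Any) -> bool:
    """True if `items` is a list or tuple containing only strings."""
    return isinstance(items, (list, tuple)) and all(isinstance(i, str) for i in items)


def describe_encode_input(value: Any, is_pretokenized: bool) -> str:
    """Hypothetical helper: name the shape `value` takes under the given flag."""
    if not is_pretokenized:
        # TextEncodeInput: a raw string, or a pair of raw strings.
        if isinstance(value, str):
            return "single text sequence"
        if _all_str(value) and len(value) == 2:
            return "pair of text sequences"
        return "invalid"
    # PreTokenizedEncodeInput: a sequence of words, or a pair of such sequences.
    if _all_str(value):
        return "single pre-tokenized sequence"
    if (
        isinstance(value, (list, tuple))
        and len(value) == 2
        and all(_all_str(v) for v in value)
    ):
        return "pair of pre-tokenized sequences"
    return "invalid"


# The same two-element list is a pair of texts or one list of words,
# depending on the flag.
ambiguous = ["Hello", "world"]
as_text = describe_encode_input(ambiguous, is_pretokenized=False)
as_words = describe_encode_input(ambiguous, is_pretokenized=True)
```

In practice, the flag is passed alongside the batch when calling `encode_batch`, so every item in the batch is interpreted consistently.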
| </python> |
| <rust> |
| The Rust API Reference is available directly on the [Docs.rs](https://docs.rs/tokenizers/latest/tokenizers/) website. |
| </rust> |
| <node> |
| The Node API has not been documented yet. |
| </node> |
| </tokenizerslangcontent> |