rodrigo-nogueira committed on
Commit
16fdda0
1 Parent(s): 90dc4d7

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -4,4 +4,15 @@ MariTalk Large is a proprietary LLM that can be used through an API endpoint, wh
 
  The purpose of including this tokenizer is to allow you to estimate the number of tokens in your prompts and, therefore, the cost of using the model.
 
+ ```python
+ import transformers
+ tokenizer = transformers.AutoTokenizer.from_pretrained("maritaca-ai/maritalk-tokenizer-large")
+
+ prompt = "Com quantos paus se faz uma canoa?"
+
+ tokens = tokenizer.encode(prompt)
+
+ print(f'The prompt "{prompt}" contains {len(tokens)} tokens.') # Should print 12 tokens.
+ ```
+
  For more information on how to use the model, please refer to our documentation at [this link](https://maritaca-ai.github.io/maritalk-api/maritalk.html).
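
The README says the token count is meant for estimating cost, but the commit stops at counting. A minimal sketch of the remaining step follows, assuming a flat price per million input tokens. The helper `estimate_cost`, the parameter `price_per_million_tokens`, and the stand-in `WhitespaceTokenizer` are hypothetical names introduced for illustration and are not part of the repository; the price used is a placeholder, not a real MariTalk price.

```python
from typing import List, Protocol


class SupportsEncode(Protocol):
    """Any object with an encode(text) method that returns token ids."""

    def encode(self, text: str) -> List[int]: ...


def estimate_cost(
    tokenizer: SupportsEncode,
    prompt: str,
    price_per_million_tokens: float,
) -> float:
    """Estimate the input cost of a prompt from its token count.

    price_per_million_tokens is an assumed flat rate (hypothetical);
    replace it with the rate that applies to your account.
    """
    n_tokens = len(tokenizer.encode(prompt))
    return n_tokens * price_per_million_tokens / 1_000_000


class WhitespaceTokenizer:
    """Offline stand-in that splits on whitespace. It does NOT reproduce
    the real tokenizer's counts; it only keeps this sketch runnable
    without downloading anything."""

    def encode(self, text: str) -> List[int]:
        return list(range(len(text.split())))


if __name__ == "__main__":
    prompt = "Com quantos paus se faz uma canoa?"
    # 5.0 is a placeholder price, not a published one.
    cost = estimate_cost(WhitespaceTokenizer(), prompt, 5.0)
    print(f"Estimated input cost: {cost:.8f}")
```

To get a realistic estimate, pass the tokenizer loaded with `transformers.AutoTokenizer.from_pretrained("maritaca-ai/maritalk-tokenizer-large")` in place of the stand-in. The sketch covers prompt (input) tokens only; tokens generated by the model are not counted here.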