guillermoruiz committed on
Commit ac6585c
1 Parent(s): 6641744

Update README.md

Files changed (1)
  1. README.md +24 -21
README.md CHANGED
@@ -30,37 +30,40 @@ You will need TensorFlow 2.4 or newer.
 
  # Quick guide
 
- You can see the demo notebooks for a quick guide on how to use the models.
 
- Clone this repository and then run
  ```
- bash download-emoji15-bilma.sh
  ```
 
- to download the MX model. Then to load the model you can use the code:
  ```
- from bilma import bilma_model
- vocab_file = "vocab_file_All.txt"
- model_file = "bilma_small_MX_epoch-1_classification_epochs-13.h5"
- model = bilma_model.load(model_file)
- tokenizer = bilma_model.tokenizer(vocab_file=vocab_file,
-                                    max_length=280)
  ```
 
- Now you will need some text:
  ```
- texts = ["Tenemos tres dias sin internet ni senal de celular en el pueblo.",
-          "Incomunicados en el siglo XXI tampoco hay servicio de telefonia fija",
-          "Vamos a comer unos tacos",
-          "Los del banco no dejan de llamarme"]
- toks = tokenizer.tokenize(texts)
  ```
 
- With this, you are ready to use the model
  ```
- p = model.predict(toks)
- tokenizer.decode_emo(p[1])
  ```
 
- which produces the output: ![emoji-output](https://user-images.githubusercontent.com/392873/165176270-77dd32ca-377e-4d29-ab4a-bc5f75913241.jpg)
- each emoji correspond to each entry in `texts`.
 
  # Quick guide
 
+ Install the following version of the transformers library:
+ ```
+ !pip install transformers==4.30.2
+ ```
+
 
+
+ Instantiate the tokenizer and the trained model:
  ```
+ from transformers import AutoTokenizer
+ tok = AutoTokenizer.from_pretrained("guillermoruiz/bilma_mx")
+ from transformers import TFAutoModel
+ model = TFAutoModel.from_pretrained("guillermoruiz/bilma_mx", trust_remote_code=True, include_top=False)
  ```
 
+ Now, we will need some text to pass through the tokenizer:
  ```
+ text = ["Vamos a comer [MASK].",
+         "Hace mucho que no voy al [MASK]."]
+ t = tok(text, padding="max_length", return_tensors="tf", max_length=280)
  ```
 
+ With this, we are ready to use the model:
  ```
+ p = model(t)
  ```
 
+ Now, we get the most likely words with:
  ```
+ import tensorflow as tf
+ tok.batch_decode(tf.argmax(p["logits"], 2)[:,1:], skip_special_tokens=True)
  ```
 
+ which produces the output:
+ ```
+ ['vamos a comer tacos.', 'hace mucho que no voy al gym.']
+ ```
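
For reference, here are the new snippets assembled into a single runnable script. This is only a minimal sketch of the steps above put together, assuming transformers==4.30.2 and TensorFlow are installed; the printed predictions may vary with the model revision.

```
# End-to-end sketch assembled from the "Quick guide" snippets above.
# trust_remote_code loads the custom BILMA model code from the Hub repo.
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

tok = AutoTokenizer.from_pretrained("guillermoruiz/bilma_mx")
model = TFAutoModel.from_pretrained("guillermoruiz/bilma_mx",
                                    trust_remote_code=True, include_top=False)

# Sentences with a [MASK] token for the model to fill in.
text = ["Vamos a comer [MASK].",
        "Hace mucho que no voy al [MASK]."]
t = tok(text, padding="max_length", return_tensors="tf", max_length=280)

p = model(t)  # forward pass; p["logits"] holds per-position vocabulary scores

# Take the most likely token at each position and decode back to text,
# as in the README snippet.
print(tok.batch_decode(tf.argmax(p["logits"], 2)[:, 1:], skip_special_tokens=True))
```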