guillermoruiz committed on
Commit ac6585c
1 Parent(s): 6641744

Update README.md

Files changed (1)
  1. README.md +24 -21
README.md CHANGED
@@ -30,37 +30,40 @@ You will need TensorFlow 2.4 or newer.
 
  # Quick guide
 
- You can see the demo notebooks for a quick guide on how to use the models.
 
- Clone this repository and then run
  ```
- bash download-emoji15-bilma.sh
  ```
 
- to download the MX model. Then to load the model you can use the code:
  ```
- from bilma import bilma_model
- vocab_file = "vocab_file_All.txt"
- model_file = "bilma_small_MX_epoch-1_classification_epochs-13.h5"
- model = bilma_model.load(model_file)
- tokenizer = bilma_model.tokenizer(vocab_file=vocab_file,
-                                    max_length=280)
  ```
 
- Now you will need some text:
  ```
- texts = ["Tenemos tres dias sin internet ni senal de celular en el pueblo.",
-          "Incomunicados en el siglo XXI tampoco hay servicio de telefonia fija",
-          "Vamos a comer unos tacos",
-          "Los del banco no dejan de llamarme"]
- toks = tokenizer.tokenize(texts)
  ```
 
- With this, you are ready to use the model
  ```
- p = model.predict(toks)
- tokenizer.decode_emo(p[1])
  ```
 
- which produces the output: ![emoji-output](https://user-images.githubusercontent.com/392873/165176270-77dd32ca-377e-4d29-ab4a-bc5f75913241.jpg)
- each emoji correspond to each entry in `texts`.
 
  # Quick guide
 
+ Install the following version of the transformers library:
+ ```
+ !pip install transformers==4.30.2
+ ```
+
 
+
+ Instantiate the tokenizer and the trained model:
  ```
+ from transformers import AutoTokenizer
+ tok = AutoTokenizer.from_pretrained("guillermoruiz/bilma_mx")
+ from transformers import TFAutoModel
+ model = TFAutoModel.from_pretrained("guillermoruiz/bilma_mx", trust_remote_code=True, include_top=False)
  ```
 
+ Now, we will need some text to pass through the tokenizer:
  ```
+ text = ["Vamos a comer [MASK].",
+         "Hace mucho que no voy al [MASK]."]
+ t = tok(text, padding="max_length", return_tensors="tf", max_length=280)
  ```
 
+ With this, we are ready to use the model:
  ```
+ p = model(t)
  ```
 
+ Now, we get the most likely words with:
  ```
+ import tensorflow as tf
+ tok.batch_decode(tf.argmax(p["logits"], 2)[:,1:], skip_special_tokens=True)
  ```
 
+ which produces the output:
+ ```
+ ['vamos a comer tacos.', 'hace mucho que no voy al gym.']
+ ```
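
For reference, here are the new snippets assembled into a single runnable script. This is only a minimal sketch of the steps above put together, assuming transformers==4.30.2 and TensorFlow are installed; the printed predictions may vary with the model revision.

```
# End-to-end sketch assembled from the "Quick guide" snippets above.
# trust_remote_code loads the custom BILMA model code from the Hub repo.
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

tok = AutoTokenizer.from_pretrained("guillermoruiz/bilma_mx")
model = TFAutoModel.from_pretrained("guillermoruiz/bilma_mx",
                                    trust_remote_code=True, include_top=False)

# Sentences with a [MASK] token for the model to fill in.
text = ["Vamos a comer [MASK].",
        "Hace mucho que no voy al [MASK]."]
t = tok(text, padding="max_length", return_tensors="tf", max_length=280)

p = model(t)  # forward pass; p["logits"] holds per-position vocabulary scores

# Take the most likely token at each position and decode back to text,
# as in the README snippet.
print(tok.batch_decode(tf.argmax(p["logits"], 2)[:, 1:], skip_special_tokens=True))
```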