mrm8488 committed on
Commit
c9da272
1 Parent(s): c737857

Update README.md

  1. README.md +26 -1
README.md CHANGED
@@ -12,4 +12,29 @@ license: mit
12
  ---
13
  # Spanish GPT-2 trained on BETO's corpus (large_spanish_corpus)
14
 
15
- ## Details are WIP
15
+ # BERTIN
16
+ This is a Spanish GPT-2 model trained from scratch with [Flax](https://github.com/google/flax) on the [large_spanish_corpus](), also known as BETO's corpus.
17
+ This is part of the
18
+ [Flax/JAX Community Week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104), organised by [Hugging Face](https://huggingface.co/), with TPU usage sponsored by Google.
19
+ ## Dataset
20
+ The dataset is about 20 GB. 95% of the data was used for training and the remaining 5% for validation.
21
+
22
+ ## Metrics (on evaluation dataset)
23
+
24
+
25
+
27
+ ## Team members
28
+ - Javier de la Rosa ([versae](https://huggingface.co/versae))
29
+ - Eduardo González ([edugp](https://huggingface.co/edugp))
30
+ - Paulo Villegas ([paulo](https://huggingface.co/paulo))
31
+ - Pablo González de Prado ([Pablogps](https://huggingface.co/Pablogps))
32
+ - Manu Romero ([mrm8488](https://huggingface.co/))
33
+ - María Grandury ([mariagrandury](https://huggingface.co/))
34
+ ## Useful links
35
+ - [Community Week timeline](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104#summary-timeline-calendar-6)
36
+ - [Community Week README](https://github.com/huggingface/transformers/blob/master/examples/research_projects/jax-projects/README.md)
37
+ - [Community Week thread](https://discuss.huggingface.co/t/bertin-pretrain-roberta-large-from-scratch-in-spanish/7125)
38
+ - [Community Week channel](https://discord.com/channels/858019234139602994/859113060068229190)
39
+ - [Masked Language Modelling example scripts](https://github.com/huggingface/transformers/tree/master/examples/flax/language-modeling)
40
+ - [Model Repository](https://huggingface.co/flax-community/bertin-roberta-large-spanish/)