GorkaUrbizu commited on
Commit
63af604
1 Parent(s): bbc7c4e

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -10,7 +10,7 @@ tags:
10
 
11
  # ElhBERTeu
12
 
13
- This is a BERT model for Basque introduced in [BasqueGLUE: A Natural Language Understanding Benchmark for Basque]().
14
 
15
  To train ElhBERTeu, we collected different corpora sources from several domains: updated (2021) national and local news sources, Basque Wikipedia, as well as novel news sources and texts from other domains, such as science (both academic and divulgative), literature or subtitles. More details about the corpora used and their sizes are shown in the following table. Texts from news sources were oversampled (duplicated) as done during the training of BERTeus. In total 575M tokens were used for pre-training ElhBERTeu.
16
 
 
10
 
11
  # ElhBERTeu
12
 
13
+ This is a BERT model for Basque introduced in [BasqueGLUE: A Natural Language Understanding Benchmark for Basque](https://aclanthology.org/2022.lrec-1.172/).
14
 
15
  To train ElhBERTeu, we collected different corpora sources from several domains: updated (2021) national and local news sources, Basque Wikipedia, as well as novel news sources and texts from other domains, such as science (both academic and divulgative), literature or subtitles. More details about the corpora used and their sizes are shown in the following table. Texts from news sources were oversampled (duplicated) as done during the training of BERTeus. In total 575M tokens were used for pre-training ElhBERTeu.
16