Pablo committed on
Commit ae788a5
1 Parent(s): e951a81

Further format improvements

Files changed (1)
  1. app.py +3 -2
app.py CHANGED
@@ -52,12 +52,13 @@ st.sidebar.image(LOGO)
 # Body
 st.markdown(
     """
-BERTIN is a series of BERT-based models for Spanish.
+BERTIN is a series of BERT-based models for Spanish.
+
 The models are trained with Flax and using TPUs sponsored by Google since this is part of the
 [Flax/Jax Community Week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104)
 organised by HuggingFace.
 
-All models are variations of RoBERTa-base trained from scratch in Spanish using the mc4 dataset.
+All models are variations of **RoBERTa-base** trained from scratch in **Spanish** using the **mc4 dataset**.
 We reduced the dataset size to 50 million documents to keep training times shorter, and also to be able to bias training examples based on their perplexity.
 
 The idea is to favour examples with perplexities that are neither too small (short, repetitive texts) or too long (potentially poor quality).
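The perplexity-based biasing mentioned in the app copy is not part of this diff. Below is a minimal sketch of one way such a sampling bias could be implemented; the function name, the quartile thresholds, and the down-weighting factor are illustrative assumptions, not BERTIN's actual implementation.

```python
import numpy as np

def sampling_weights(perplexities, low_q=0.25, high_q=0.75):
    """Give higher sampling weight to documents whose perplexity falls
    between the given quantiles (illustrative thresholds only)."""
    ppl = np.asarray(perplexities, dtype=float)
    low, high = np.quantile(ppl, [low_q, high_q])
    # Mid-range perplexities keep full weight; extremes are down-weighted.
    weights = np.where((ppl >= low) & (ppl <= high), 1.0, 0.1)
    return weights / weights.sum()

# Example: bias a document sample toward mid-perplexity texts.
rng = np.random.default_rng(0)
doc_perplexities = rng.lognormal(mean=4.0, sigma=1.0, size=1000)
weights = sampling_weights(doc_perplexities)
sampled_ids = rng.choice(len(doc_perplexities), size=100, replace=False, p=weights)
```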