versae committed on
Commit 9ed2311
1 Parent(s): f97cce1

Reorganizing demo

Files changed (1)
  1. app.py +27 -11
app.py CHANGED
@@ -52,17 +52,27 @@ st.set_page_config(page_title="BERTIN Demo", page_icon=LOGO)
 st.title("BERTIN")
 
 #Sidebar
-st.sidebar.image(LOGO)
+st.sidebar.markdown(f"""
+<div align=center>
+<img src="{LOGO}" width=200/>
+
+# BERTIN
+
+</div>
+
+BERTIN is a series of BERT-based models for Spanish.
+
+The models are trained with Flax and using TPUs sponsored by Google since this is part of the
+[Flax/Jax Community Week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104)
+organised by HuggingFace.
+
+Please read our [full report](https://huggingface.co/bertin-project/bertin-roberta-base-spanish) for more details on the methodology and metrics on downstream tasks.
+
+""", unsafe_allow_html=True)
 
 # Body
 st.markdown(
 """
-BERTIN is a series of BERT-based models for Spanish.
-
-The models are trained with Flax and using TPUs sponsored by Google since this is part of the
-[Flax/Jax Community Week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104)
-organised by HuggingFace.
-
 All models are variations of **RoBERTa-base** trained from scratch in **Spanish** using a sample from the **mc4 dataset**.
 We reduced the dataset size to 50 million documents to keep training times shorter, and also to be able to bias training examples based on their perplexity.
 
@@ -72,15 +82,21 @@ st.markdown(
 * **Stepwise** applies different four sampling probabilities to each of the four quartiles of the perplexity distribution.
 
 The first models have been trained (250.000 steps) on sequence length 128, and then training for Gaussian changed to sequence length 512 for the last 25.000 training steps to yield another version.
-
+
 Please read our [full report](https://huggingface.co/bertin-project/bertin-roberta-base-spanish) for more details on the methodology and metrics on downstream tasks.
 """
 )
 
-model_name = st.selectbox("Model", list(MODELS.keys()))
-model_url = MODELS[model_name]["url"]
+col1, col2, col3 = st.beta_columns(3)
+strategy = col1.selectbox("Sampling strategy", ["Gaussian", "Stepwise", "Random"])
+seq_len = col2.selectbox("Sequence length", [128, 512])
+
+if seq_len == 128:
+    model_url = f"bertin-project/bertin-base-{str(strategy).lower()}"
+else:
+    model_url = f"bertin-project/bertin-base-{str(strategy).lower()}-exp-512seqlen"
 
-prompt = st.selectbox("Prompt", ["Random", "Custom"])
+prompt = col3.selectbox("Prompt", ["Random", "Custom"])
 if prompt == "Custom":
     prompt_box = "Enter your masked text here..."
 else:
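
Note on the new model selection: the commit replaces the `MODELS` lookup with a model id built from the two select boxes. The sketch below isolates that naming rule in a plain function so it can be checked without running Streamlit. The function name `bertin_model_id` is hypothetical and not part of `app.py`. Whether every strategy and sequence length combination exists as a repository on the Hub is not confirmed by this diff.

```python
def bertin_model_id(strategy: str, seq_len: int) -> str:
    """Build the model id the way the new select boxes do in this commit.

    Hypothetical helper for illustration; app.py inlines this logic.
    """
    base = f"bertin-project/bertin-base-{str(strategy).lower()}"
    # As in the diff, 128 maps to the plain name and any other value
    # falls through to the 512 sequence length variant.
    if seq_len == 128:
        return base
    return f"{base}-exp-512seqlen"


# Enumerate the ids the three strategies and two lengths would produce.
for strategy in ["Gaussian", "Stepwise", "Random"]:
    for seq_len in [128, 512]:
        print(strategy, seq_len, bertin_model_id(strategy, seq_len))
```

For example, `bertin_model_id("Gaussian", 512)` returns `"bertin-project/bertin-base-gaussian-exp-512seqlen"`. Keeping the rule in one function would also make it easy to validate `seq_len` explicitly rather than treating every non-128 value as 512.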