Update app.py
app.py
CHANGED
@@ -28,13 +28,12 @@ We curated a new dataset that combines existing corpora for readability assessme
 
 Each text has two readability labels, according to the following mapping:
 
-| |
+| | 2-class | | | 3-class | |
 |------------------|--------------|--------------|-----------------|-----------------|------------------|
-| |
+| | Simple | Complex | Basic | Intermediate | Advanced |
 | With CERF Levels | A1, A2, B1 | B2, C1, C2 | A1, A2 | B1,B2 | C1,C2 |
 | Newsela Corpus | Versions 3-4 | Versions 0-1 | Grade Level 2-5 | Grade Level 6-8 | Grade Level 9-12 |
 
-
 In addition, texts in the dataset could be too long to fit in a model. As such, we created two versions of the dataset, dividing each text into [sentences](https://huggingface.co/datasets/hackathon-pln-es/readability-es-sentences) and [paragraphs](https://huggingface.co/datasets/hackathon-pln-es/readability-es-paragraphs).
 
 We also scraped several texts from the ["Corpus de Aprendices del Español" (CAES)](http://galvan.usc.es/caes/). However, due to the time constraints, we leave experiments with it for future work. The data is available [here](https://huggingface.co/datasets/hackathon-pln-es/readability-es-caes).
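The label mapping in the table touched by this change could be expressed in code roughly as follows. This is a minimal sketch and not taken from app.py: the function names, the lowercase label strings, and the choice to raise `ValueError` for values the table does not list (for example Newsela version 2) are all assumptions made for illustration.

```python
# Hypothetical lookup tables mirroring the "With CERF Levels" row of the table.
LEVEL_TO_2CLASS = {
    "A1": "simple", "A2": "simple", "B1": "simple",
    "B2": "complex", "C1": "complex", "C2": "complex",
}
LEVEL_TO_3CLASS = {
    "A1": "basic", "A2": "basic",
    "B1": "intermediate", "B2": "intermediate",
    "C1": "advanced", "C2": "advanced",
}


def labels_from_level(level: str) -> tuple[str, str]:
    """Return the (2-class, 3-class) labels for a level such as "B1"."""
    key = level.strip().upper()
    if key not in LEVEL_TO_2CLASS:
        raise ValueError(f"level not covered by the table: {level!r}")
    return LEVEL_TO_2CLASS[key], LEVEL_TO_3CLASS[key]


def newsela_2class(version: int) -> str:
    """Map a Newsela version to the 2-class label (versions 3-4 vs. 0-1)."""
    if version in (3, 4):
        return "simple"
    if version in (0, 1):
        return "complex"
    # The table lists no label for other versions, so this sketch rejects them.
    raise ValueError(f"version not covered by the table: {version}")


def newsela_3class(grade: int) -> str:
    """Map a Newsela grade level to the 3-class label."""
    if 2 <= grade <= 5:
        return "basic"
    if 6 <= grade <= 8:
        return "intermediate"
    if 9 <= grade <= 12:
        return "advanced"
    raise ValueError(f"grade level not covered by the table: {grade}")
```

For instance, `labels_from_level("B1")` yields `("simple", "intermediate")`, which shows why each text carries two labels: the 2-class and 3-class schemes cut the same scale at different points.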