1-800-BAD-CODE committed
Commit 0dc2ad3 • 1 Parent(s): c502d01
Update README.md

README.md CHANGED
```diff
@@ -178,10 +178,11 @@ This model was trained on news data, and may not perform well on conversational
 Further, this model is unlikely to be of production quality.
 It was trained with "only" 1M lines per language, and the dev sets may have been noisy due to the nature of web-scraped news data.
 
-This model over-predicts the inverted Spanish question mark,
+This model over-predicts the inverted Spanish question mark, `¿` (see metrics below). Since `¿` is a rare token, especially in the
 context of a 47-language model, Spanish questions were over-sampled by selecting more of these sentences from
 additional training data that was not used. However, this seems to have "over-corrected" the problem and a lot
-of Spanish question marks are predicted.
+of Spanish question marks are predicted. This can be fixed by exposing prior probabilities, but I'll fine-tune
+it later to fix this the right way.
 
 
 # Evaluation
```
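The new paragraph above says the `¿` over-prediction "can be fixed by exposing prior probabilities". A minimal sketch of what that could look like, assuming the punct_pre head's per-token label logits were exposed; the function name, label set, and numbers here are hypothetical illustrations, not code from this repository:

```python
import numpy as np

# Hypothetical post-processing (not part of this commit): add a log-prior to
# the per-token label logits so that a rare, over-predicted label such as `¿`
# needs stronger evidence to win the argmax.

PRE_LABELS = ["<NULL>", "¿"]  # assumed punct_pre label set (see report below)

def apply_label_prior(logits: np.ndarray, priors: np.ndarray) -> np.ndarray:
    """Re-rank (num_tokens, num_labels) logits with a (num_labels,) prior."""
    return logits + np.log(priors)

# Toy scores for two tokens; the prior is chosen by hand to suppress `¿`.
logits = np.array([[2.0, 2.5],   # borderline: raw argmax would emit `¿`
                   [4.0, 0.5]])  # clearly <NULL>
priors = np.array([0.99, 0.01])

reranked = apply_label_prior(logits, priors)
print([PRE_LABELS[i] for i in reranked.argmax(axis=-1)])  # ['<NULL>', '<NULL>']
```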
````diff
@@ -269,4 +270,70 @@ seg test report:
     weighted avg                  99.96     99.96     99.96       597175
 ```
 
-
+</details>
+
+
+
+<details>
+<summary>Spanish</summary>
+
+```text
+punct_pre test report:
+    label                     precision    recall       f1        support
+    <NULL> (label_id: 0)          99.96     99.76     99.86       609200
+    ¿ (label_id: 1)               39.66     77.89     52.56         1221
+    -------------------
+    micro avg                     99.72     99.72     99.72       610421
+    macro avg                     69.81     88.82     76.21       610421
+    weighted avg                  99.83     99.72     99.76       610421
+```
+
+```text
+punct_post test report:
+    label                     precision    recall       f1        support
+    <NULL> (label_id: 0)          99.17     98.44     98.80       553100
+    <ACRONYM> (label_id: 1)       23.33     43.75     30.43           48
+    . (label_id: 2)               91.92     92.58     92.25        29623
+    , (label_id: 3)               73.07     82.04     77.30        26432
+    ? (label_id: 4)               49.44     71.84     58.57         1218
+    ? (label_id: 5)                0.00      0.00      0.00            0
+    , (label_id: 6)                0.00      0.00      0.00            0
+    。 (label_id: 7)               0.00      0.00      0.00            0
+    、 (label_id: 8)               0.00      0.00      0.00            0
+    ・ (label_id: 9)               0.00      0.00      0.00            0
+    । (label_id: 10)               0.00      0.00      0.00            0
+    ؟ (label_id: 11)               0.00      0.00      0.00            0
+    ، (label_id: 12)               0.00      0.00      0.00            0
+    ; (label_id: 13)               0.00      0.00      0.00            0
+    ። (label_id: 14)               0.00      0.00      0.00            0
+    ፣ (label_id: 15)               0.00      0.00      0.00            0
+    ፧ (label_id: 16)               0.00      0.00      0.00            0
+    -------------------
+    micro avg                     97.39     97.39     97.39       610421
+    macro avg                     67.39     77.73     71.47       610421
+    weighted avg                  97.58     97.39     97.47       610421
+```
+
+```text
+cap test report:
+    label                     precision    recall       f1        support
+    LOWER (label_id: 0)           99.82     99.86     99.84      2222062
+    UPPER (label_id: 1)           95.96     94.64     95.29        75940
+    -------------------
+    micro avg                     99.69     99.69     99.69      2298002
+    macro avg                     97.89     97.25     97.57      2298002
+    weighted avg                  99.69     99.69     99.69      2298002
+```
+
+```text
+seg test report:
+    label                     precision    recall       f1        support
+    NOSTOP (label_id: 0)          99.99     99.97     99.98       580519
+    FULLSTOP (label_id: 1)        99.52     99.81     99.66        32902
+    -------------------
+    micro avg                     99.96     99.96     99.96       613421
+    macro avg                     99.75     99.89     99.82       613421
+    weighted avg                  99.96     99.96     99.96       613421
+```
+
+</details>
````
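For reference, per-label tables in the format added above can be reproduced from flattened token-level targets and predictions. A minimal sketch using scikit-learn's `classification_report`; the labels and arrays below are placeholders, not this repository's evaluation code:

```python
from sklearn.metrics import classification_report

# Placeholder token-level tags for the punct_pre task: 0 = <NULL>, 1 = ¿.
# The pattern mirrors the Spanish report above: `¿` recall is high but
# precision is low, because the model also fires on non-questions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # reference tags, one per token
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # predictions that over-predict ¿

print(classification_report(y_true, y_pred, target_names=["<NULL>", "¿"], digits=2))
```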