1-800-BAD-CODE committed
Commit 0dc2ad3 • 1 Parent(s): c502d01
Update README.md

README.md CHANGED
```diff
@@ -178,10 +178,11 @@ This model was trained on news data, and may not perform well on conversational
 Further, this model is unlikely to be of production quality.
 It was trained with "only" 1M lines per language, and the dev sets may have been noisy due to the nature of web-scraped news data.
 
-This model over-predicts the inverted Spanish question mark,
+This model over-predicts the inverted Spanish question mark, `¿` (see metrics below). Since `¿` is a rare token, especially in the
 context of a 47-language model, Spanish questions were over-sampled by selecting more of these sentences from
 additional training data that was not used. However, this seems to have "over-corrected" the problem and a lot
-of Spanish question marks are predicted.
+of Spanish question marks are predicted. This can be fixed by exposing prior probabilities, but I'll fine-tune
+it later to fix this the right way.
 
 
 # Evaluation
```
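The new paragraph above says the `¿` over-prediction "can be fixed by exposing prior probabilities". A minimal sketch of what that could look like, assuming the punct_pre head's per-token label logits were exposed; the function name, label set, and numbers here are hypothetical illustrations, not code from this repository:

```python
import numpy as np

# Hypothetical post-processing (not part of this commit): add a log-prior to
# the per-token label logits so that a rare, over-predicted label such as `¿`
# needs stronger evidence to win the argmax.

PRE_LABELS = ["<NULL>", "¿"]  # assumed punct_pre label set (see report below)

def apply_label_prior(logits: np.ndarray, priors: np.ndarray) -> np.ndarray:
    """Re-rank (num_tokens, num_labels) logits with a (num_labels,) prior."""
    return logits + np.log(priors)

# Toy scores for two tokens; the prior is chosen by hand to suppress `¿`.
logits = np.array([[2.0, 2.5],   # borderline: raw argmax would emit `¿`
                   [4.0, 0.5]])  # clearly <NULL>
priors = np.array([0.99, 0.01])

reranked = apply_label_prior(logits, priors)
print([PRE_LABELS[i] for i in reranked.argmax(axis=-1)])  # ['<NULL>', '<NULL>']
```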
````diff
@@ -269,4 +270,70 @@ seg test report:
     weighted avg                  99.96     99.96     99.96       597175
 ```
 
-
+</details>
+
+
+
+<details>
+<summary>Spanish</summary>
+
+```text
+punct_pre test report:
+    label                     precision    recall       f1        support
+    <NULL> (label_id: 0)          99.96     99.76     99.86       609200
+    ¿ (label_id: 1)               39.66     77.89     52.56         1221
+    -------------------
+    micro avg                     99.72     99.72     99.72       610421
+    macro avg                     69.81     88.82     76.21       610421
+    weighted avg                  99.83     99.72     99.76       610421
+```
+
+```text
+punct_post test report:
+    label                     precision    recall       f1        support
+    <NULL> (label_id: 0)          99.17     98.44     98.80       553100
+    <ACRONYM> (label_id: 1)       23.33     43.75     30.43           48
+    . (label_id: 2)               91.92     92.58     92.25        29623
+    , (label_id: 3)               73.07     82.04     77.30        26432
+    ? (label_id: 4)               49.44     71.84     58.57         1218
+    ? (label_id: 5)                0.00      0.00      0.00            0
+    , (label_id: 6)                0.00      0.00      0.00            0
+    。 (label_id: 7)               0.00      0.00      0.00            0
+    、 (label_id: 8)               0.00      0.00      0.00            0
+    ・ (label_id: 9)               0.00      0.00      0.00            0
+    । (label_id: 10)               0.00      0.00      0.00            0
+    ؟ (label_id: 11)               0.00      0.00      0.00            0
+    ، (label_id: 12)               0.00      0.00      0.00            0
+    ; (label_id: 13)               0.00      0.00      0.00            0
+    ። (label_id: 14)               0.00      0.00      0.00            0
+    ፣ (label_id: 15)               0.00      0.00      0.00            0
+    ፧ (label_id: 16)               0.00      0.00      0.00            0
+    -------------------
+    micro avg                     97.39     97.39     97.39       610421
+    macro avg                     67.39     77.73     71.47       610421
+    weighted avg                  97.58     97.39     97.47       610421
+```
+
+```text
+cap test report:
+    label                     precision    recall       f1        support
+    LOWER (label_id: 0)           99.82     99.86     99.84      2222062
+    UPPER (label_id: 1)           95.96     94.64     95.29        75940
+    -------------------
+    micro avg                     99.69     99.69     99.69      2298002
+    macro avg                     97.89     97.25     97.57      2298002
+    weighted avg                  99.69     99.69     99.69      2298002
+```
+
+```text
+seg test report:
+    label                     precision    recall       f1        support
+    NOSTOP (label_id: 0)          99.99     99.97     99.98       580519
+    FULLSTOP (label_id: 1)        99.52     99.81     99.66        32902
+    -------------------
+    micro avg                     99.96     99.96     99.96       613421
+    macro avg                     99.75     99.89     99.82       613421
+    weighted avg                  99.96     99.96     99.96       613421
+```
+
+</details>
````
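For reference, per-label tables in the format added above can be reproduced from flattened token-level targets and predictions. A minimal sketch using scikit-learn's `classification_report`; the labels and arrays below are placeholders, not this repository's evaluation code:

```python
from sklearn.metrics import classification_report

# Placeholder token-level tags for the punct_pre task: 0 = <NULL>, 1 = ¿.
# The pattern mirrors the Spanish report above: `¿` recall is high but
# precision is low, because the model also fires on non-questions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # reference tags, one per token
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # predictions that over-predict ¿

print(classification_report(y_true, y_pred, target_names=["<NULL>", "¿"], digits=2))
```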