metadata
language:
  - ca
tags:
  - punctuation prediction
  - punctuation
datasets: softcatala/Europarl-catalan
widget:
  - text: >-
      Ara tenim ratolins de quatre mesos que no són diabètics tot i que solien
      ser-ho va afegir
    example_title: Catalan
metrics:
  - f1

This model predicts punctuation for Catalan text.

The model restores the following punctuation markers: "." "," "?" "-" ":"

It is based on the work at https://github.com/oliverguhr/fullstop-deep-punctuation-prediction
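To illustrate what "restoring" punctuation means in practice, here is a minimal sketch that turns per-word labels back into punctuated text. It uses the label names listed in the Results table of this card (LABEL_0 to LABEL_5). It assumes that each label describes the marker that follows its word, and the `restore_punctuation` helper and the hand-written labels are hypothetical, not part of the model's own code.

```python
# Label names as listed in the Results table of this card.
LABEL_TO_MARK = {
    "LABEL_0": "",   # no punctuation after the word
    "LABEL_1": ".",
    "LABEL_2": ",",
    "LABEL_3": "?",
    "LABEL_4": "-",
    "LABEL_5": ":",
}


def restore_punctuation(words, labels):
    """Append the marker for each label to its word and join the result.

    Assumption: one label per word, describing the marker AFTER that word.
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(word + LABEL_TO_MARK[label] for word, label in zip(words, labels))


# Hand-written labels for illustration only; these are not model output.
words = "Ara tenim ratolins que no són diabètics va afegir".split()
labels = ["LABEL_0"] * 6 + ["LABEL_2", "LABEL_0", "LABEL_1"]
print(restore_punctuation(words, labels))
```

In a real setup the labels would come from the model's token-classification output, aggregated to one label per word, rather than being written by hand.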

Results

Performance differs across the individual punctuation markers because hyphens and colons are often optional and can be replaced by either a comma or a full stop. The model achieves the following F1 scores for Catalan:

| Label | CA |
| --- | --- |
| 0 (LABEL_0) | 0.99 |
| . (LABEL_1) | 0.93 |
| , (LABEL_2) | 0.82 |
| ? (LABEL_3) | 0.76 |
| - (LABEL_4) | 0.89 |
| : (LABEL_5) | 0.64 |
| macro average | 0.84 |
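The macro average is the unweighted mean of the per-label F1 scores, so rare markers such as ":" count as much as the frequent "no punctuation" class. The short sketch below recomputes it from the scores reported above; the variable names are illustrative only.

```python
# Per-label F1 scores copied from the results above.
f1_scores = {
    "0": 0.99,
    ".": 0.93,
    ",": 0.82,
    "?": 0.76,
    "-": 0.89,
    ":": 0.64,
}

# Macro average: every label has equal weight, regardless of its frequency.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
print(round(macro_f1, 2))  # 0.84
```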

Contact

Jordi Mas <jmas@softcatala.org>