---
tags:
- spacy
- token-classification
language:
- vi
license: cc-by-sa-4.0
model-index:
- name: vi_udv25_vietnamesevtb_trf
  results:
  - task:
      name: TAG
      type: token-classification
    metrics:
    - name: TAG (XPOS) Accuracy
      type: accuracy
      value: 0.8805048216
  - task:
      name: POS
      type: token-classification
    metrics:
    - name: POS (UPOS) Accuracy
      type: accuracy
      value: 0.9018631331
  - task:
      name: MORPH
      type: token-classification
    metrics:
    - name: Morph (UFeats) Accuracy
      type: accuracy
      value: 0.9695345305
  - task:
      name: LEMMA
      type: token-classification
    metrics:
    - name: Lemma Accuracy
      type: accuracy
      value: 0.8934519139
  - task:
      name: UNLABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Unlabeled Attachment Score (UAS)
      type: f_score
      value: 0.6807696182
  - task:
      name: LABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Labeled Attachment Score (LAS)
      type: f_score
      value: 0.6063552526
  - task:
      name: SENTS
      type: token-classification
    metrics:
    - name: Sentences F-Score
      type: f_score
      value: 0.943275972
---

UD v2.5 benchmarking pipeline for UD_Vietnamese-VTB

| Feature | Description |
| --- | --- |
| **Name** | `vi_udv25_vietnamesevtb_trf` |
| **Version** | `0.0.1` |
| **spaCy** | `>=3.2.1,<3.3.0` |
| **Default Pipeline** | `experimental_char_ner_tokenizer`, `transformer`, `tagger`, `morphologizer`, `parser`, `experimental_edit_tree_lemmatizer` |
| **Components** | `experimental_char_ner_tokenizer`, `transformer`, `senter`, `tagger`, `morphologizer`, `parser`, `experimental_edit_tree_lemmatizer` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | [Universal Dependencies v2.5](https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-3105) (Zeman, Daniel; et al.) |
| **License** | `CC BY-SA 4.0` |
| **Author** | [Explosion](https://explosion.ai) |

### Label Scheme

The label scheme comprises 81 labels across 6 components.

| Component | Labels |
| --- | --- |
| **`experimental_char_ner_tokenizer`** | `TOKEN` |
| **`senter`** | `I`, `S` |
| **`tagger`** | `!`, `"`, `,`, `-`, `.`, `...`, `:`, `;`, `?`, `@`, `A`, `C`, `CC`, `E`, `I`, `L`, `LBKT`, `M`, `N`, `NP`, `Nb`, `Nc`, `Np`, `Nu`, `Ny`, `P`, `R`, `RBKT`, `T`, `V`, `VP`, `X`, `Y`, `Z` |
| **`morphologizer`** | `POS=NOUN`, `POS=ADP`, `POS=X\|Polarity=Neg`, `POS=VERB`, `POS=ADJ`, `POS=PUNCT`, `POS=X`, `POS=SCONJ`, `NumType=Card\|POS=NUM`, `POS=DET`, `POS=CCONJ`, `POS=PROPN`, `POS=AUX`, `POS=PART`, `POS=INTJ` |
| **`parser`** | `ROOT`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `cop`, `csubj`, `dep`, `det`, `discourse`, `iobj`, `list`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `parataxis`, `punct`, `xcomp` |
| **`experimental_edit_tree_lemmatizer`** | `0` |
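The `morphologizer` labels above pack several `Key=Value` features into one string, separated by `|` (shown as `\|` in the table only because of Markdown escaping). As a minimal sketch of how such a label can be read, the snippet below splits a few of the listed labels into dictionaries. The helper name `parse_morph_label` is hypothetical and not part of spaCy.

```python
def parse_morph_label(label: str) -> dict:
    """Split a label such as 'NumType=Card|POS=NUM' into a feature dict.

    Hypothetical helper for illustration only; not a spaCy function.
    """
    features = {}
    for pair in label.split("|"):
        # Each pair has the form Key=Value; partition splits on the first '='.
        key, _, value = pair.partition("=")
        features[key] = value
    return features


# Three labels copied from the morphologizer row of the table above.
labels = ["POS=NOUN", "POS=X|Polarity=Neg", "NumType=Card|POS=NUM"]
parsed = {label: parse_morph_label(label) for label in labels}

for label, feats in parsed.items():
    print(label, "->", feats)
```

Every label in this scheme carries a `POS` key, so the coarse part of speech can always be looked up with `parsed[label]["POS"]`.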

### Accuracy

| Type | Score |
| --- | --- |
| `TOKEN_F` | 87.90 |
| `TOKEN_P` | 86.84 |
| `TOKEN_R` | 89.00 |
| `TOKEN_ACC` | 98.42 |
| `SENTS_F` | 94.33 |
| `SENTS_P` | 96.23 |
| `SENTS_R` | 92.50 |
| `TAG_ACC` | 88.05 |
| `POS_ACC` | 90.19 |
| `MORPH_ACC` | 96.95 |
| `DEP_UAS` | 68.08 |
| `DEP_LAS` | 60.64 |
| `LEMMA_ACC` | 89.35 |
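To relate the `_P`, `_R`, and `_F` rows, the sketch below recomputes the F-scores from the reported precision and recall. It assumes the `_F` values are the usual harmonic mean of precision and recall (F1), which the card does not state explicitly; the function name `f_score` is illustrative. Because the inputs are already rounded to two decimals, the recomputed values may differ from the reported ones in the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1), on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) taken from the accuracy table, in percent.
reported = {
    "TOKEN": (86.84, 89.00, 87.90),
    "SENTS": (96.23, 92.50, 94.33),
}

for name, (p, r, f) in reported.items():
    recomputed = f_score(p, r)
    print(f"{name}: recomputed {recomputed:.2f}, reported {f:.2f}")
```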