
roberta-base-vietnamese-upos

Model Description

This is a RoBERTa model pre-trained on Vietnamese texts for POS-tagging and dependency-parsing, derived from roberta-base-vietnamese. Every word is tagged with its UPOS (Universal Part-Of-Speech) label.

How to Use

from transformers import AutoTokenizer, AutoModelForTokenClassification, TokenClassificationPipeline

# Load the tokenizer and the token-classification (UPOS) model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-base-vietnamese-upos")
model = AutoModelForTokenClassification.from_pretrained("KoichiYasuoka/roberta-base-vietnamese-upos")
# aggregation_strategy="simple" groups adjacent tokens into labeled spans.
pipeline = TokenClassificationPipeline(tokenizer=tokenizer, model=model, aggregation_strategy="simple")
# Map each span back to its surface text using the character offsets.
nlp = lambda x: [(x[t["start"]:t["end"]], t["entity_group"]) for t in pipeline(x)]
print(nlp("Hai cái đầu thì tốt hơn một."))

or, with the esupar package:

import esupar
nlp = esupar.load("KoichiYasuoka/roberta-base-vietnamese-upos")
print(nlp("Hai cái đầu thì tốt hơn một."))
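
The lambda in the first snippet slices the input string with each span's "start" and "end" character offsets and pairs the slice with its "entity_group" label. The sketch below isolates that step so it can be run without downloading the model. The helpers spans_to_pairs and make_spans are hypothetical names introduced here, and the tags are illustrative placeholders, not actual output of this model.

```python
def spans_to_pairs(text, spans):
    """Turn aggregated span dicts into (surface form, tag) pairs."""
    return [(text[s["start"]:s["end"]], s["entity_group"]) for s in spans]


def make_spans(text, words, tags):
    """Build stand-in span dicts by locating each word left to right.

    This only mimics the shape of the pipeline output (start, end,
    entity_group); it does not run any model.
    """
    spans, cursor = [], 0
    for word, tag in zip(words, tags):
        start = text.index(word, cursor)
        cursor = start + len(word)
        spans.append({"start": start, "end": cursor, "entity_group": tag})
    return spans


text = "Hai cái đầu thì tốt hơn một."
# Split on whitespace and treat the final period as its own token.
words = text.rstrip(".").split() + ["."]
# Illustrative UPOS tags only (assumed for the example, not model output).
tags = ["NUM", "NOUN", "NOUN", "SCONJ", "ADJ", "ADV", "NUM", "PUNCT"]

spans = make_spans(text, words, tags)
pairs = spans_to_pairs(text, spans)
print(pairs)
```

Because the pairs are recovered from character offsets rather than from the tokenizer's subword pieces, the surface forms keep the original spelling and diacritics of the input.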

See Also

esupar: Tokenizer POS-tagger and Dependency-parser with BERT/RoBERTa/DeBERTa models

