How to use this model directly from the 🤗/transformers library:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("iarfmoose/roberta-small-bulgarian-pos")
model = AutoModelForTokenClassification.from_pretrained("iarfmoose/roberta-small-bulgarian-pos")
```
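For a quick try-out, the token-classification pipeline can also load the model directly. This is a minimal sketch rather than an official usage example: the pipeline helper and the Bulgarian example sentence are assumptions, and the output is one tag per subword token rather than per word (see "Intended uses" below).

```python
from transformers import pipeline

# Hypothetical quick-start: the pipeline returns one prediction per subword token.
tagger = pipeline(
    "token-classification",
    model="iarfmoose/roberta-small-bulgarian-pos",
)

for prediction in tagger("Това е изречение на български."):
    print(prediction["word"], prediction["entity"], round(prediction["score"], 3))
```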

RoBERTa-small-bulgarian-POS

The RoBERTa model was originally introduced in [this paper](https://arxiv.org/abs/1907.11692). This model is a version of RoBERTa-small-Bulgarian fine-tuned for part-of-speech tagging.

Intended uses

The model can be used to predict part-of-speech tags in Bulgarian text. Since the tokenizer uses byte-pair encoding, each word in the text may be split into more than one token. When predicting POS-tags, the last token from each word can be used. Using the last token was found to slightly outperform predictions based on the first token.

An example of this can be found here.
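The sketch below shows one way to apply the last-token rule. It assumes the hub tokenizer loads as a fast tokenizer (so that word_ids() is available) and accepts add_prefix_space=True for pre-tokenized input; the example sentence is invented for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "iarfmoose/roberta-small-bulgarian-pos"

# add_prefix_space=True lets a RoBERTa-style byte-level BPE tokenizer
# accept pre-tokenized (word-split) input.
tokenizer = AutoTokenizer.from_pretrained(model_name, add_prefix_space=True)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Hypothetical example sentence, pre-split into words so that word_ids()
# can map each subword token back to its word position.
words = ["Котката", "спи", "на", "дивана", "."]

encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**encoding).logits
predicted_ids = logits.argmax(dim=-1)[0]

word_ids = encoding.word_ids()  # None for special tokens such as <s> and </s>

# Keep only the prediction made at the *last* subword token of each word.
last_prediction = {}
for position, word_index in enumerate(word_ids):
    if word_index is not None:
        last_prediction[word_index] = predicted_ids[position].item()

for word_index, word in enumerate(words):
    print(word, model.config.id2label[last_prediction[word_index]])
```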

Limitations and bias

The pretraining data is unfiltered text from the internet and may contain all sorts of biases.

Training data

In addition to the pretraining data used in RoBERTa-base-Bulgarian, the model was trained on the UPOS tags from [UD_Bulgarian-BTB](https://github.com/UniversalDependencies/UD_Bulgarian-BTB).
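UD_Bulgarian-BTB is distributed in CoNLL-U format, where each token line carries its UPOS tag in the fourth column. The sketch below extracts (word, UPOS) pairs from such a file; the file path is a placeholder and the parser is a simplified reading of the format, not something taken from the model card.

```python
def read_upos(path):
    """Yield one sentence at a time as a list of (word, UPOS) pairs."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:                      # blank line ends a sentence
                if sentence:
                    yield sentence
                    sentence = []
                continue
            if line.startswith("#"):          # comment lines (sent_id, text, ...)
                continue
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:  # skip multiword-token and empty-node lines
                continue
            # CoNLL-U columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
            sentence.append((cols[1], cols[3]))
    if sentence:
        yield sentence


# Placeholder path; point it at a local checkout of the treebank.
for pairs in read_upos("UD_Bulgarian-BTB/bg_btb-ud-train.conllu"):
    print(pairs)
    break
```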

Training procedure

The model was trained for 5 epochs over the training set. The loss was calculated from the label predictions for the last token of each word. The model achieves 98% accuracy on the test set.
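The training script is not included in the card, but one common way to restrict the loss to the last sub-token of each word is to set the labels of all other positions to -100, which the cross-entropy loss used by transformers token-classification models ignores by default. The helper below is a sketch under that assumption; its name and the reliance on a fast tokenizer's word_ids() are mine, not from the card.

```python
def align_labels_to_last_subtoken(encoding, word_labels, ignore_index=-100):
    """Assign each word's POS label id to its last subword token only.

    `encoding` is a fast-tokenizer BatchEncoding for one pre-split sentence and
    `word_labels` holds one label id per word. Every other position (special
    tokens and non-final subwords) gets `ignore_index`, so it is skipped by the
    cross-entropy loss.
    """
    word_ids = encoding.word_ids()
    labels = [ignore_index] * len(word_ids)
    for position, word_index in enumerate(word_ids):
        if word_index is None:
            continue
        next_word = word_ids[position + 1] if position + 1 < len(word_ids) else None
        if next_word != word_index:           # this is the last subword of the word
            labels[position] = word_labels[word_index]
    return labels
```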