pos-polish-gpt2-small
This model is a fine-tuned version of polish-gpt2-small, trained on the clarin-pl/nkjp-pos dataset. It achieves the following results on the evaluation set after the final (10th) epoch:
- Loss: 0.3109
- Precision: 0.8793
- Recall: 0.9255
- F1: 0.9018
- Accuracy: 0.9371
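As a quick consistency check on the numbers above, the sketch below recomputes F1 from the reported precision and recall. It assumes F1 is the usual harmonic mean of the two; the card does not say which metric library or averaging mode was used, and the function name `f1_score` is just an illustrative helper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed in this card.
reported = {"precision": 0.8793, "recall": 0.9255, "f1": 0.9018}

computed = f1_score(reported["precision"], reported["recall"])
print(f"reported F1 = {reported['f1']:.4f}, recomputed F1 = {computed:.4f}")
```

The two values agree to within rounding of the published precision and recall.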
Model description
A token-classification model for Polish part-of-speech tagging, fine-tuned from polish-gpt2-small.
Intended uses & limitations
Part-of-speech tagging for Polish. The tags are described at the bottom of http://nkjp.pl/poliqarp/help/plse2.html
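A minimal usage sketch with the Transformers `pipeline` API might look like the following. The Hub namespace is not stated in this card, so `MODEL_ID` is a placeholder you need to replace; `tag_sentence` and `format_predictions` are hypothetical helper names introduced here for illustration.

```python
from typing import Dict, List, Tuple

# Placeholder: the Hub namespace is not given in this card, replace it.
MODEL_ID = "<namespace>/pos-polish-gpt2-small"


def tag_sentence(text: str, model_id: str = MODEL_ID) -> List[Dict]:
    """Run token classification on one sentence (downloads the model)."""
    from transformers import pipeline  # imported lazily, only when needed

    tagger = pipeline("token-classification", model=model_id)
    return tagger(text)


def format_predictions(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce pipeline output to (token, tag) pairs."""
    return [(p["word"], p["entity"]) for p in predictions]


# Example call (requires network access and a valid model id):
# print(format_predictions(tag_sentence("Ala ma kota.")))
```

Because the base model uses a GPT-2 style subword tokenizer, predictions are returned per subword token rather than per word, so a word split into several pieces yields several entries.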
Training and evaluation data
Dataset: clarin-pl/nkjp-pos
Data collator:
from transformers import DataCollatorForTokenClassification
data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)
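The collator above pads label sequences with `-100` by default, a value that PyTorch cross-entropy ignores. Before collation, word-level POS labels usually have to be aligned with subword tokens. The card does not show how this was done, so the sketch below is an assumption based on the common recipe: label the first subword of each word and mask the rest. `word_ids` is assumed to come from a fast tokenizer's `BatchEncoding.word_ids()`; `align_labels` is a hypothetical helper.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # default label padding value of the collator


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Map word-level labels onto subword tokens.

    The first subword of each word keeps the word's label; continuation
    subwords and special tokens (word id None) get IGNORE_INDEX.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)      # special token
        elif word_id != previous:
            aligned.append(word_labels[word_id])  # first subword of a word
        else:
            aligned.append(IGNORE_INDEX)      # continuation subword
        previous = word_id
    return aligned


# Three words split into 2, 1 and 3 subwords respectively.
example = align_labels([0, 0, 1, 2, 2, 2], [5, 1, 7])
print(example)  # [5, -100, 1, 7, -100, -100]
```

Note that GPT-2 style fast tokenizers need `add_prefix_space=True` when they are fed pre-split words via `is_split_into_words=True`.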
Training procedure
GPU: RTX 3090
Training time: 00:50:24
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
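The relationship between these values can be sketched in a few lines. The code assumes a single GPU (the card lists one RTX 3090) and no warmup, since no warmup setting is reported; the total of 12220 optimizer steps is taken from the last row of the training results table. `linear_lr` is an illustrative helper that mirrors a linear decay schedule, not the author's code.

```python
learning_rate = 2e-5
train_batch_size = 16
gradient_accumulation_steps = 4
num_devices = 1        # assumption: single RTX 3090
total_steps = 12220    # final step in the training results table

# Sequences contributing to each optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float = learning_rate,
              total: int = total_steps, warmup: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_train_batch_size)          # 64
print(linear_lr(0), linear_lr(6110), linear_lr(12220))
```

Under these assumptions the learning rate starts at 2e-05, is halved at the midpoint of training, and reaches zero at the final step.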
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
n/a | 0.0 | 0 | 3.6116 | 0.0464 | 0.0524 | 0.0492 | 0.0676 |
0.2303 | 1.0 | 1222 | 0.2159 | 0.8737 | 0.9225 | 0.8974 | 0.9347 |
0.1776 | 2.0 | 2444 | 0.2124 | 0.8799 | 0.9254 | 0.9021 | 0.9381 |
0.1467 | 3.0 | 3666 | 0.2205 | 0.8759 | 0.9241 | 0.8994 | 0.9368 |
0.1254 | 4.0 | 4889 | 0.2304 | 0.8792 | 0.9256 | 0.9018 | 0.9377 |
0.1091 | 5.0 | 6111 | 0.2480 | 0.8787 | 0.9251 | 0.9013 | 0.9375 |
0.0949 | 6.0 | 7333 | 0.2651 | 0.8794 | 0.9250 | 0.9016 | 0.9373 |
0.0857 | 7.0 | 8555 | 0.2794 | 0.8791 | 0.9251 | 0.9015 | 0.9372 |
0.079 | 8.0 | 9778 | 0.2922 | 0.8789 | 0.9247 | 0.9012 | 0.9366 |
0.0736 | 9.0 | 11000 | 0.3037 | 0.8807 | 0.9256 | 0.9026 | 0.9375 |
0.0691 | 10.0 | 12220 | 0.3109 | 0.8793 | 0.9255 | 0.9018 | 0.9371 |
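In the table above, validation loss is lowest at epoch 2 and rises afterwards while training loss keeps falling, which is a pattern consistent with overfitting, although F1 and accuracy stay nearly flat. The sketch below picks the best epoch by each criterion from values copied out of the table. It is only an illustration of checkpoint selection; the card does not say whether any selection was applied, and the headline results match the final epoch.

```python
# (epoch, validation loss, F1) copied from the training results table.
rows = [
    (1, 0.2159, 0.8974),
    (2, 0.2124, 0.9021),
    (3, 0.2205, 0.8994),
    (4, 0.2304, 0.9018),
    (5, 0.2480, 0.9013),
    (6, 0.2651, 0.9016),
    (7, 0.2794, 0.9015),
    (8, 0.2922, 0.9012),
    (9, 0.3037, 0.9026),
    (10, 0.3109, 0.9018),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_f1 = max(rows, key=lambda r: r[2])    # highest F1

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest F1: epoch {best_by_f1[0]} ({best_by_f1[2]})")
```

The spread in F1 across epochs 2 to 10 is under 0.004, so the choice of checkpoint matters more for calibration (loss) than for tagging quality.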
Framework versions
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0