---
license: mit
base_model: FacebookAI/xlm-roberta-large
tags:
- generated_from_trainer
datasets:
- cnec
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: CNEC_xlm-roberta-large
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: cnec
      type: cnec
      config: default
      split: validation
      args: default
    metrics:
    - name: Precision
      type: precision
      value: 0.8566729323308271
    - name: Recall
      type: recall
      value: 0.9047146401985111
    - name: F1
      type: f1
      value: 0.8800386193579531
    - name: Accuracy
      type: accuracy
      value: 0.9771662763466042
language:
- cs
---

# CNEC_xlm-roberta-large

This model is a fine-tuned version of [FacebookAI/xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) on the [CNEC](https://lindat.cz/repository/xmlui/handle/11234/1-3493) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1471
- Precision: 0.8567
- Recall: 0.9047
- F1: 0.8800
- Accuracy: 0.9772

## Model description

The labels follow the BIO scheme over 7 entity supertypes:
- 'O' = Outside of a named entity
- 'B-A' = Beginning of a number in an address (postal code, street number, phone number, etc.)
- 'I-A' = Inside of a number in an address
- 'B-G' = Beginning of a geographical name
- 'I-G' = Inside of a geographical name
- 'B-I' = Beginning of an institution name
- 'I-I' = Inside of an institution name
- 'B-M' = Beginning of a media name (email, server, website, TV series, etc.)
- 'I-M' = Inside of a media name
- 'B-O' = Beginning of an artifact name (book, film, etc.)
- 'I-O' = Inside of an artifact name
- 'B-P' = Beginning of a person's name
- 'I-P' = Inside of a person's name
- 'B-T' = Beginning of a time expression
- 'I-T' = Inside of a time expression

## Intended uses & limitations

CNEC, the Czech Named Entity Corpus, is a named entity recognition dataset for the Czech language. The model was trained on an edited version of the corpus that keeps only the 7 entity supertypes listed above plus one non-entity label ('O'); it does not predict the fine-grained CNEC subtypes. An inference example is given at the end of this card.

## Training and evaluation data

The model was trained with increased dropout: `hidden_dropout_prob` was raised to 0.2 and `attention_probs_dropout_prob` to 0.15 (the default for both is 0.1). A configuration sketch appears at the end of this card.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- weight_decay: 0.01
- num_epochs: 10

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.2836        | 1.12  | 500  | 0.1341          | 0.7486    | 0.8467 | 0.7946 | 0.9649   |
| 0.116         | 2.24  | 1000 | 0.1048          | 0.7866    | 0.8655 | 0.8242 | 0.9734   |
| 0.0832        | 3.36  | 1500 | 0.1066          | 0.7967    | 0.8734 | 0.8333 | 0.9746   |
| 0.0577        | 4.47  | 2000 | 0.1112          | 0.8408    | 0.8834 | 0.8616 | 0.9753   |
| 0.0445        | 5.59  | 2500 | 0.1378          | 0.8384    | 0.8883 | 0.8627 | 0.9751   |
| 0.0337        | 6.71  | 3000 | 0.1272          | 0.8505    | 0.8978 | 0.8735 | 0.9770   |
| 0.025         | 7.83  | 3500 | 0.1447          | 0.8462    | 0.9007 | 0.8726 | 0.9760   |
| 0.0191        | 8.95  | 4000 | 0.1471          | 0.8567    | 0.9047 | 0.8800 | 0.9772   |

### Framework versions

- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
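
## How to use

A minimal inference sketch using the `transformers` pipeline API. The model id below is a placeholder; point it at wherever this checkpoint is stored (a local path or its Hub repo id):

```python
from transformers import pipeline

# Placeholder id: replace with the actual local path or Hub id of this checkpoint.
ner = pipeline(
    "token-classification",
    model="CNEC_xlm-roberta-large",
    aggregation_strategy="simple",  # merge B-/I- subword tags into whole entity spans
)

# "Václav Havel was born in Prague."
print(ner("Václav Havel se narodil v Praze."))
# Expected: a P (person) span over "Václav Havel" and a G (geographical) span over "Praze".
```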
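
## Training configuration sketch

For reference, a sketch of how the dropout overrides and hyperparameters above map onto the `transformers` API. This is an illustration, not the exact training script; `output_dir` and the dataset wiring are assumptions:

```python
from transformers import (
    AutoConfig,
    AutoModelForTokenClassification,
    TrainingArguments,
)

# Dropout raised from the 0.1 defaults, as noted under "Training and evaluation data".
config = AutoConfig.from_pretrained(
    "FacebookAI/xlm-roberta-large",
    num_labels=15,  # 7 supertypes x (B-, I-) + 'O'
    hidden_dropout_prob=0.2,
    attention_probs_dropout_prob=0.15,
)
model = AutoModelForTokenClassification.from_pretrained(
    "FacebookAI/xlm-roberta-large", config=config
)

# Hyperparameters as listed above; the Adam betas/epsilon and the linear
# scheduler are the TrainingArguments defaults, shown here explicitly.
args = TrainingArguments(
    output_dir="CNEC_xlm-roberta-large",  # illustrative
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    seed=42,
)
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# over the tokenized CNEC splits would complete the setup.
```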