---
license: mit
base_model: vicgalle/xlm-roberta-large-xnli-anli
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: xlm-roberta-large-xnli-anli
  results: []
---

# xlm-roberta-large-xnli-anli

This model is a fine-tuned version of [vicgalle/xlm-roberta-large-xnli-anli](https://huggingface.co/vicgalle/xlm-roberta-large-xnli-anli) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3689
- F1 Macro: 0.8721
- F1 Micro: 0.8729
- Accuracy Balanced: 0.8725
- Accuracy: 0.8729
- Precision Macro: 0.8718
- Recall Macro: 0.8725
- Precision Micro: 0.8729
- Recall Micro: 0.8729

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 9e-06
- train_batch_size: 8
- eval_batch_size: 64
- seed: 40
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------------:|:--------:|:---------------:|:------------:|:---------------:|:------------:|
| 0.4942 | 0.17 | 200 | 0.4413 | 0.8081 | 0.8089 | 0.8093 | 0.8089 | 0.8076 | 0.8093 | 0.8089 | 0.8089 |
| 0.4114 | 0.34 | 400 | 0.3991 | 0.8227 | 0.8232 | 0.8250 | 0.8232 | 0.8226 | 0.8250 | 0.8232 | 0.8232 |
| 0.3467 | 0.51 | 600 | 0.3584 | 0.8388 | 0.8391 | 0.8421 | 0.8391 | 0.8396 | 0.8421 | 0.8391 | 0.8391 |
| 0.3402 | 0.68 | 800 | 0.3620 | 0.8534 | 0.8544 | 0.8536 | 0.8544 | 0.8532 | 0.8536 | 0.8544 | 0.8544 |
| 0.3304 | 0.85 | 1000 | 0.3385 | 0.8566 | 0.8576 | 0.8567 | 0.8576 | 0.8565 | 0.8567 | 0.8576 | 0.8576 |
| 0.3234 | 1.02 | 1200 | 0.3456 | 0.8637 | 0.8650 | 0.8631 | 0.8650 | 0.8645 | 0.8631 | 0.8650 | 0.8650 |
| 0.2702 | 1.19 | 1400 | 0.3201 | 0.8606 | 0.8613 | 0.8616 | 0.8613 | 0.8600 | 0.8616 | 0.8613 | 0.8613 |
| 0.2581 | 1.36 | 1600 | 0.3233 | 0.8619 | 0.8624 | 0.8639 | 0.8624 | 0.8615 | 0.8639 | 0.8624 | 0.8624 |
| 0.2414 | 1.52 | 1800 | 0.3451 | 0.8674 | 0.8687 | 0.8664 | 0.8687 | 0.8687 | 0.8664 | 0.8687 | 0.8687 |
| 0.2687 | 1.69 | 2000 | 0.3415 | 0.8577 | 0.8608 | 0.8544 | 0.8608 | 0.8677 | 0.8544 | 0.8608 | 0.8608 |
| 0.2518 | 1.86 | 2200 | 0.3378 | 0.8684 | 0.8692 | 0.8688 | 0.8692 | 0.8681 | 0.8688 | 0.8692 | 0.8692 |
| 0.2182 | 2.03 | 2400 | 0.3581 | 0.8698 | 0.8708 | 0.8697 | 0.8708 | 0.8700 | 0.8697 | 0.8708 | 0.8708 |
| 0.1919 | 2.2 | 2600 | 0.3671 | 0.8677 | 0.8687 | 0.8676 | 0.8687 | 0.8678 | 0.8676 | 0.8687 | 0.8687 |
| 0.1771 | 2.37 | 2800 | 0.3790 | 0.8709 | 0.8719 | 0.8707 | 0.8719 | 0.8710 | 0.8707 | 0.8719 | 0.8719 |
| 0.1793 | 2.54 | 3000 | 0.3856 | 0.8687 | 0.8692 | 0.8701 | 0.8692 | 0.8680 | 0.8701 | 0.8692 | 0.8692 |
| 0.1909 | 2.71 | 3200 | 0.3777 | 0.8686 | 0.8698 | 0.8682 | 0.8698 | 0.8691 | 0.8682 | 0.8698 | 0.8698 |
| 0.2021 | 2.88 | 3400 | 0.3685 | 0.8701 | 0.8708 | 0.8710 | 0.8708 | 0.8696 | 0.8710 | 0.8708 | 0.8708 |

### Eval results

| Datasets | asadfgglie/nli-zh-tw-all/test | asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test | eval_dataset | test_dataset |
| :---: | :---: | :---: | :---: | :---: |
| eval_loss | 0.355 | 0.246 | 0.369 | 0.337 |
| eval_f1_macro | 0.872 | 0.932 | 0.872 | 0.88 |
| eval_f1_micro | 0.873 | 0.932 | 0.873 | 0.881 |
| eval_accuracy_balanced | 0.872 | 0.932 | 0.873 | 0.88 |
| eval_accuracy | 0.873 | 0.932 | 0.873 | 0.881 |
| eval_precision_macro | 0.873 | 0.932 | 0.872 | 0.881 |
| eval_recall_macro | 0.872 | 0.932 | 0.873 | 0.88 |
| eval_precision_micro | 0.873 | 0.932 | 0.873 | 0.881 |
| eval_recall_micro | 0.873 | 0.932 | 0.873 | 0.881 |
| eval_runtime (s) | 50.724 | 0.611 | 11.126 | 44.342 |
| eval_samples_per_second | 167.574 | 1547.575 | 169.783 | 170.424 |
| eval_steps_per_second | 2.622 | 24.539 | 2.696 | 2.684 |
| Size of dataset | 8500 | 946 | 1889 | 7557 |

### Framework versions

- Transformers 4.33.3
- Pytorch 2.5.1+cu121
- Datasets 2.14.7
- Tokenizers 0.13.3
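
## How to use

The card does not include a usage snippet, so the following is a minimal sketch. It assumes the fine-tuned checkpoint keeps the NLI (entailment) label mapping of the base model `vicgalle/xlm-roberta-large-xnli-anli`, which lets it drive the `zero-shot-classification` pipeline; the repo id below is a placeholder, not a confirmed path.

```python
from transformers import pipeline

# Placeholder repo id (assumption): replace with the actual checkpoint
# location on the Hub or a local directory.
classifier = pipeline(
    "zero-shot-classification",
    model="<your-username>/xlm-roberta-large-xnli-anli",
)

result = classifier(
    "I am thrilled about the results of the experiment.",
    candidate_labels=["happy", "angry", "sad"],
    hypothesis_template="This text expresses a {} emotion.",
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```

Because the backbone is XLM-RoBERTa and the evaluation sets include Traditional Chinese NLI data (`asadfgglie/nli-zh-tw-all`), the same pipeline should also accept non-English inputs, candidate labels, and hypothesis templates.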