---
license: apache-2.0
base_model: bert-base-multilingual-cased
tags:
- generated_from_trainer
- Multilingual
model-index:
- name: bert-base-multilingual-cased-fine_tuned-ner-WikiNeural_Multilingual
  results: []
datasets:
- dmargutierrez/Babelscape-wikineural-joined
language:
- en
- es
- nl
- fr
- it
- ru
- de
- pt
- pl
metrics:
- seqeval
pipeline_tag: token-classification
---

# bert-base-multilingual-cased-fine_tuned-ner-WikiNeural_Multilingual

This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) for multilingual named-entity recognition. It achieves the following results on the evaluation set:
- Loss: 0.0168
- Overall
  - Precision: 0.9957
  - Recall: 0.9961
  - F1: 0.9959
  - Accuracy: 0.9947
- LOC
  - Precision: 0.9983
  - Recall: 0.9982
  - F1: 0.9983
  - Number: 1,932,180
- MISC
  - Precision: 0.9809
  - Recall: 0.9833
  - F1: 0.9821
  - Number: 122,787
- ORG
  - Precision: 0.9869
  - Recall: 0.9881
  - F1: 0.9875
  - Number: 59,813
- PER
  - Precision: 0.9386
  - Recall: 0.9517
  - F1: 0.9451
  - Number: 47,216

## Model description

For more information on how this model was created, see the project notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/Token%20Classification/Multilingual/Babelscape-WikiNeural-Joined%20Dataset/Babelscape%20WikiNeural%20Joined%20Dataset%20With%20Multilingual%20BERT.ipynb

## Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex problem using technology. You are welcome to test and experiment with it, but you do so at your own risk.

## Training and evaluation data

Dataset source: https://huggingface.co/datasets/dmargutierrez/Babelscape-wikineural-joined

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1

### Training results

| Train Loss | Epoch | Step   | Valid Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
|:----------:|:-----:|:------:|:----------:|:-----------------:|:--------------:|:----------:|:----------------:|
| 0.015      | 1.0   | 102700 | 0.0168     | 0.9957            | 0.9961         | 0.9959     | 0.9947           |

| Train Loss | Epoch | Valid Loss | LOC Precision | LOC Recall | LOC F1 | LOC Number | MISC Precision | MISC Recall | MISC F1 | MISC Number | ORG Precision | ORG Recall | ORG F1 | ORG Number | PER Precision | PER Recall | PER F1 | PER Number |
|:----------:|:-----:|:----------:|:-------------:|:----------:|:------:|:----------:|:--------------:|:-----------:|:-------:|:-----------:|:-------------:|:----------:|:------:|:----------:|:-------------:|:----------:|:------:|:----------:|
| 0.015      | 1.0   | 0.0168     | 0.9983        | 0.9982     | 0.9983 | 1,932,180  | 0.9809         | 0.9833      | 0.9821  | 122,787     | 0.9869        | 0.9881     | 0.9875 | 59,813     | 0.9386        | 0.9517     | 0.9451 | 47,216     |

### Framework versions

- Transformers 4.31.0
- PyTorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3
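
## How to use

Below is a minimal inference sketch using the standard `transformers` token-classification pipeline. The repository id is an assumption inferred from the model name and the author's GitHub handle; substitute the actual checkpoint path if it differs.

```python
from transformers import pipeline

# Assumed repository id (inferred from the model name); replace with the actual path.
model_id = "DunnBC22/bert-base-multilingual-cased-fine_tuned-ner-WikiNeural_Multilingual"

# aggregation_strategy="simple" groups B-/I- sub-token predictions into entity spans.
ner = pipeline("token-classification", model=model_id, aggregation_strategy="simple")

print(ner("George Washington lived in Virginia."))
# Each result is a dict with entity_group (PER/ORG/LOC/MISC), score, word, start, end.
```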
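
## Training sketch

The linked notebook is the authoritative source for how the model was built; the snippets in this section are reconstructions under common assumptions. This first one loads the dataset and aligns the BIO labels with mBERT's WordPiece sub-token boundaries, assuming the dataset exposes the usual `tokens` and `ner_tags` columns and a `train` split.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

raw = load_dataset("dmargutierrez/Babelscape-wikineural-joined")
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

def tokenize_and_align_labels(batch):
    # Re-tokenize the pre-split words into WordPiece sub-tokens.
    tokenized = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    all_labels = []
    for i, word_labels in enumerate(batch["ner_tags"]):
        word_ids = tokenized.word_ids(batch_index=i)
        previous_word = None
        aligned = []
        for word_id in word_ids:
            if word_id is None:
                aligned.append(-100)                  # special tokens: ignored by the loss
            elif word_id != previous_word:
                aligned.append(word_labels[word_id])  # label the first sub-token of a word
            else:
                aligned.append(-100)                  # mask continuation sub-tokens
            previous_word = word_id
        all_labels.append(aligned)
    tokenized["labels"] = all_labels
    return tokenized

tokenized_ds = raw.map(tokenize_and_align_labels, batched=True,
                       remove_columns=raw["train"].column_names)
```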
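
### Metrics

The seqeval numbers reported above follow the standard token-classification recipe: strip the -100 padding positions, map ids back to tag strings, and score at the entity level. This sketch continues from the preprocessing snippet, assumes the `evaluate` library (the notebook may load seqeval differently), and reads the tag names from the dataset features rather than hard-coding them.

```python
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")
# Read the BIO tag names (e.g. O, B-PER, I-PER, ...) from the dataset itself.
label_list = raw["train"].features["ner_tags"].feature.names

def compute_metrics(eval_preds):
    logits, labels = eval_preds
    predictions = np.argmax(logits, axis=-1)
    # Drop the -100 positions inserted for special and continuation sub-tokens.
    true_labels = [[label_list[l] for l in row if l != -100] for row in labels]
    true_preds = [[label_list[p] for p, l in zip(p_row, l_row) if l != -100]
                  for p_row, l_row in zip(predictions, labels)]
    results = seqeval.compute(predictions=true_preds, references=true_labels)
    return {"precision": results["overall_precision"],
            "recall": results["overall_recall"],
            "f1": results["overall_f1"],
            "accuracy": results["overall_accuracy"]}
```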
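
### Fine-tuning

Finally, the hyperparameters listed under "Training procedure" map onto `TrainingArguments` roughly as follows, continuing from the two snippets above. The Adam betas and epsilon shown on the card are the optimizer defaults, and the split names (`train`/`validation`) are assumptions; check the dataset card for the actual splits.

```python
from transformers import (AutoModelForTokenClassification,
                          DataCollatorForTokenClassification,
                          Trainer, TrainingArguments)

model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(label_list))

args = TrainingArguments(
    output_dir="bert-base-multilingual-cased-fine_tuned-ner-WikiNeural_Multilingual",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=1,
    seed=42,
    lr_scheduler_type="linear",   # the Transformers default, listed for completeness
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_ds["train"],        # assumed split name
    eval_dataset=tokenized_ds["validation"],    # assumed split name
    data_collator=DataCollatorForTokenClassification(tokenizer),
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```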