---
license: apache-2.0
base_model: bert-base-cased
tags:
- generated_from_trainer
datasets:
- conll2002
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: bert-finetuned-ner
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: conll2002
      type: conll2002
      config: es
      split: validation
      args: es
    metrics:
    - name: Precision
      type: precision
      value: 0.7640546993705232
    - name: Recall
      type: recall
      value: 0.8088235294117647
    - name: F1
      type: f1
      value: 0.7858019868288871
    - name: Accuracy
      type: accuracy
      value: 0.9676902769959431
---
# bert-finetuned-ner
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the Spanish (`es`) configuration of the conll2002 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1912
- Precision: 0.7641
- Recall: 0.8088
- F1: 0.7858
- Accuracy: 0.9677
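As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below uses the unrounded values from the card's metadata; the helper name `f1_score` is chosen here for illustration and is not part of any library.

```python
# Precision and recall as reported in the model card metadata.
precision = 0.7640546993705232
recall = 0.8088235294117647


def f1_score(p: float, r: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if p + r == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * p * r / (p + r)


f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.7858
```

The result agrees with the F1 value listed above to four decimal places.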
## Model description
The base model, bert-base-cased, is a pre-trained version of Google's popular BERT language model. It was initially trained on large amounts of text to learn dense representations of words and sequences.
This model takes the architecture and pre-trained weights of bert-base-cased and fine-tunes them further on the specific task of Named Entity Recognition (NER), using the conll2002 dataset.
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("JoshuaAAX/bert-finetuned-ner")
model = AutoModelForTokenClassification.from_pretrained("JoshuaAAX/bert-finetuned-ner")

text = "La Federación nacional de cafeteros de Colombia es una entidad del estado. El primer presidente el Dr Augusto Guerra contó con el aval de la Asociación Colombiana de Aviación."

# aggregation_strategy="max" groups sub-word tokens into entity spans; each word takes the label of its highest-scoring token.
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="max")
print(ner_pipeline(text))
```
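With an aggregation strategy set, the pipeline returns a list of dictionaries, one per detected entity, with keys that include `entity_group`, `score`, and `word`. The sketch below shows one possible way to organize such a list by entity type. The helper `group_entities`, its `min_score` parameter, and the `sample` list are hypothetical: the sample is hand-written to show the shape of the data (character offsets omitted, scores made up) and is not real output from this model.

```python
from collections import defaultdict


def group_entities(predictions, min_score=0.0):
    """Group predicted entity strings by label, dropping low-confidence hits."""
    grouped = defaultdict(list)
    for pred in predictions:
        if pred["score"] >= min_score:
            grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)


# Hand-written placeholder in the shape of aggregated pipeline output.
# The scores are invented for illustration only.
sample = [
    {"entity_group": "ORG", "score": 0.9, "word": "Federación nacional de cafeteros de Colombia"},
    {"entity_group": "PER", "score": 0.9, "word": "Augusto Guerra"},
    {"entity_group": "ORG", "score": 0.4, "word": "Asociación Colombiana de Aviación"},
]

grouped = group_entities(sample, min_score=0.5)
print(grouped)
```

In practice, `sample` would be replaced by the return value of `ner_pipeline(text)`.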
## Training data
| Abbreviation | Description |
|:-------------:|:-------------:|
| O | Outside of NE |
| PER | Person's name |
| ORG | Organization |
| LOC | Location |
| MISC | Miscellaneous |
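At the token level, conll2002 annotations carry `B-` (begin) and `I-` (inside) prefixes on the entity types above, for example `B-PER` and `I-PER`. The following is a minimal sketch of how such per-token tags can be collapsed into entity spans. The function name `iob2_to_spans` and the example sentence are illustrative assumptions, not code from the training script.

```python
def iob2_to_spans(tokens, tags):
    """Collapse per-token IOB2 tags into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # Close any open span, then start a new one. An I- tag without
            # a matching open span is tolerated and treated like B-.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            # Continuation of the currently open span.
            current_tokens.append(token)
        else:
            # "O" tag: close any open span.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Illustrative, hand-tagged sentence (not taken from the dataset).
tokens = ["El", "Dr", "Augusto", "Guerra", "visitó", "Colombia", "."]
tags = ["O", "O", "B-PER", "I-PER", "O", "B-LOC", "O"]
print(iob2_to_spans(tokens, tags))
# [('PER', 'Augusto Guerra'), ('LOC', 'Colombia')]
```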
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
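To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under a linear decay from the base rate to zero. It assumes zero warmup steps, which this card does not state, and takes the total of 5210 steps from the final row of the training results. The function `linear_lr` is a hypothetical helper written for illustration, not the scheduler code used during training.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 5210   # final step count reported in the training results
WARMUP_STEPS = 0     # assumption: no warmup is listed in this card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 2605, 5210):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the end of epoch 5 and reaches zero at the last step.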
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.1713 | 1.0 | 521 | 0.1404 | 0.6859 | 0.7387 | 0.7114 | 0.9599 |
| 0.0761 | 2.0 | 1042 | 0.1404 | 0.6822 | 0.7693 | 0.7231 | 0.9623 |
| 0.05 | 3.0 | 1563 | 0.1304 | 0.7488 | 0.7937 | 0.7706 | 0.9672 |
| 0.0355 | 4.0 | 2084 | 0.1454 | 0.7585 | 0.7960 | 0.7768 | 0.9664 |
| 0.0253 | 5.0 | 2605 | 0.1501 | 0.7549 | 0.8095 | 0.7812 | 0.9677 |
| 0.0184 | 6.0 | 3126 | 0.1726 | 0.7581 | 0.7992 | 0.7781 | 0.9662 |
| 0.0138 | 7.0 | 3647 | 0.1743 | 0.7524 | 0.8042 | 0.7774 | 0.9676 |
| 0.0112 | 8.0 | 4168 | 0.1853 | 0.7576 | 0.8022 | 0.7792 | 0.9674 |
| 0.0082 | 9.0 | 4689 | 0.1914 | 0.7595 | 0.8061 | 0.7821 | 0.9667 |
| 0.0073 | 10.0 | 5210 | 0.1912 | 0.7641 | 0.8088 | 0.7858 | 0.9677 |
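The table can be read in two ways depending on which metric is used to pick a checkpoint. The sketch below copies the epoch, validation loss, and F1 columns from the table and reports the best epoch under each criterion. The variable names are illustrative.

```python
# (epoch, validation_loss, f1) copied from the training results table.
history = [
    (1, 0.1404, 0.7114),
    (2, 0.1404, 0.7231),
    (3, 0.1304, 0.7706),
    (4, 0.1454, 0.7768),
    (5, 0.1501, 0.7812),
    (6, 0.1726, 0.7781),
    (7, 0.1743, 0.7774),
    (8, 0.1853, 0.7792),
    (9, 0.1914, 0.7821),
    (10, 0.1912, 0.7858),
]

# Epoch with the lowest validation loss.
best_loss_epoch = min(history, key=lambda row: row[1])[0]
# Epoch with the highest F1.
best_f1_epoch = max(history, key=lambda row: row[2])[0]

print("lowest validation loss at epoch", best_loss_epoch)  # 3
print("highest F1 at epoch", best_f1_epoch)                # 10
```

Validation loss is lowest at epoch 3 and rises afterwards while training loss keeps falling, which may indicate overfitting in terms of loss, yet F1 peaks at epoch 10. The headline metrics reported at the top of this card match the epoch 10 row.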
### Framework versions
- Transformers 4.41.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1