ashaduzzaman committed on
Commit 565ab4c
1 Parent(s): eb4e046

Update README.md

Files changed (1): README.md +42 -16
README.md CHANGED
@@ -42,27 +42,51 @@ should probably proofread and complete it, then remove this comment. -->
 # bert-finetuned-ner

- This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the conll2003 dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0599
- - Precision: 0.9347
- - Recall: 0.9512
- - F1: 0.9429
- - Accuracy: 0.9864

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

 ### Training hyperparameters
@@ -75,7 +99,8 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 3

- ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
@@ -83,8 +108,9 @@ The following hyperparameters were used during training:
 | 0.0359 | 2.0 | 3512 | 0.0693 | 0.9265 | 0.9418 | 0.9341 | 0.9847 |
 | 0.0222 | 3.0 | 5268 | 0.0599 | 0.9347 | 0.9512 | 0.9429 | 0.9864 |

- ### Framework versions

 - Transformers 4.42.4
 - Pytorch 2.3.1+cu121
 
 # bert-finetuned-ner

+ ## Model Description
+ This model is a Named Entity Recognition (NER) model fine-tuned with PyTorch from [bert-base-cased](https://huggingface.co/bert-base-cased) on the CoNLL-2003 dataset. It identifies and classifies named entities in text into four categories: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).
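+
+ The tags follow the BIO scheme. As a quick sanity check, the label mapping can be read directly from the model config; a minimal sketch, assuming the same repo id as in the usage example below:
+
+ ```python
+ from transformers import AutoConfig
+
+ # Inspect the label set shipped with the checkpoint (repo id assumed)
+ config = AutoConfig.from_pretrained("Ashaduzzaman/bert-finetuned-ner")
+
+ # For CoNLL-2003 NER this is conventionally:
+ # O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
+ print(config.id2label)
+ ```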
 
 
 
 
 
+ ## Intended Uses & Limitations
+ **Intended Uses:**
+ - **Text Analysis:** This model can be used for extracting named entities from unstructured text data, which is useful in NLP tasks such as information retrieval, content categorization, and automated summarization.
+ - **NER Task:** Specifically designed for NER tasks in English.
+
+ **Limitations:**
+ - **Language Dependency:** The model is trained on English data and may not perform well on texts in other languages.
+ - **Domain Specificity:** Performance may degrade on text from domains significantly different from the training data.
+ - **Error Propagation:** Incorrect predictions may propagate to downstream tasks, affecting overall performance.
+
+ ## How to Use
+ To use this model, load it through the Hugging Face Transformers library. Below is a basic example:
+
+ ```python
+ from transformers import pipeline
+
+ # Load the NER pipeline
+ ner_pipeline = pipeline("ner", model="Ashaduzzaman/bert-finetuned-ner")
+
+ # Example text
+ text = "Hugging Face Inc. is based in New York City."
+
+ # Perform NER
+ entities = ner_pipeline(text)
+
+ print(entities)
+ ```
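+
+ By default the pipeline emits one prediction per subword token. If whole entities are preferred, the pipeline's `aggregation_strategy` argument (available in recent Transformers releases) can merge them; for example:
+
+ ```python
+ from transformers import pipeline
+
+ # Merge subword predictions into whole entities (repo id assumed as above)
+ grouped = pipeline(
+     "ner",
+     model="Ashaduzzaman/bert-finetuned-ner",
+     aggregation_strategy="simple",
+ )
+ print(grouped("Hugging Face Inc. is based in New York City."))
+ ```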
+
+ ## Limitations and Bias
+ - **Bias in Data:** The model is trained on the CoNLL-2003 dataset, which may carry biases from its source texts; the model might underperform on entities not well represented in the training data.
+ - **Overfitting:** The model may overfit to the specific entities present in the CoNLL-2003 dataset, affecting its generalization to new entities or text styles.
+
+ ## Training Data
+ The model was trained on the CoNLL-2003 dataset, a widely used benchmark for NER. The English portion consists of annotated Reuters newswire text, labeled for persons, organizations, locations, and miscellaneous entities.
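+
+ For reference, the dataset can be loaded with the `datasets` library (recent releases may require `trust_remote_code=True` for this loading script); a minimal sketch:
+
+ ```python
+ from datasets import load_dataset
+
+ # Load the CoNLL-2003 benchmark
+ dataset = load_dataset("conll2003")
+
+ # Tokens and integer-encoded NER tags for the first training example
+ print(dataset["train"][0]["tokens"])
+ print(dataset["train"][0]["ner_tags"])
+
+ # Human-readable tag names (O, B-PER, I-PER, ...)
+ print(dataset["train"].features["ner_tags"].feature.names)
+ ```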
+
+ ## Training Procedure
+ The model was fine-tuned from the pre-trained bert-base-cased checkpoint with a token classification head for NER; a sketch of the setup follows the list below. The training involved:
+ - **Optimizer:** AdamW
+ - **Learning Rate:** a linear learning-rate scheduler was employed
+ - **Batch Size:** set in the training notebook based on available resources
+ - **Epochs:** the model was trained for 3 epochs (see the hyperparameters below)
+ - **Evaluation:** model performance was evaluated on a validation set, with metrics such as precision, recall, and F1
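+
+ A minimal, illustrative sketch of that setup with the Transformers `Trainer` is shown below; hyperparameter values other than the scheduler type and epoch count are assumptions, not taken from this card:
+
+ ```python
+ from datasets import load_dataset
+ from transformers import (
+     AutoModelForTokenClassification,
+     AutoTokenizer,
+     DataCollatorForTokenClassification,
+     Trainer,
+     TrainingArguments,
+ )
+
+ raw = load_dataset("conll2003")
+ label_names = raw["train"].features["ner_tags"].feature.names
+
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
+
+ def tokenize_and_align(batch):
+     # Tokenize pre-split words; label only the first subword of each word
+     # and mask remaining subwords and special tokens with -100, which the
+     # token-classification loss ignores.
+     enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
+     all_labels = []
+     for i, tags in enumerate(batch["ner_tags"]):
+         previous = None
+         labels = []
+         for word_id in enc.word_ids(batch_index=i):
+             if word_id is None or word_id == previous:
+                 labels.append(-100)
+             else:
+                 labels.append(tags[word_id])
+             previous = word_id
+         all_labels.append(labels)
+     enc["labels"] = all_labels
+     return enc
+
+ tokenized = raw.map(tokenize_and_align, batched=True,
+                     remove_columns=raw["train"].column_names)
+
+ model = AutoModelForTokenClassification.from_pretrained(
+     "bert-base-cased",
+     num_labels=len(label_names),
+     id2label=dict(enumerate(label_names)),
+     label2id={name: i for i, name in enumerate(label_names)},
+ )
+
+ args = TrainingArguments(
+     output_dir="bert-finetuned-ner",
+     learning_rate=2e-5,          # assumed value, not listed in this card
+     num_train_epochs=3,          # matches num_epochs below
+     lr_scheduler_type="linear",  # matches lr_scheduler_type below
+ )
+
+ trainer = Trainer(               # Trainer defaults to the AdamW optimizer
+     model=model,
+     args=args,
+     train_dataset=tokenized["train"],
+     eval_dataset=tokenized["validation"],
+     data_collator=DataCollatorForTokenClassification(tokenizer),
+     tokenizer=tokenizer,
+ )
+ trainer.train()
+ ```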

 ### Training hyperparameters

 - lr_scheduler_type: linear
 - num_epochs: 3

+ ## Evaluation Results
+ This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the CoNLL-2003 dataset; its performance on the evaluation set was measured using standard NER metrics:

 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
 | 0.0359 | 2.0 | 3512 | 0.0693 | 0.9265 | 0.9418 | 0.9341 | 0.9847 |
 | 0.0222 | 3.0 | 5268 | 0.0599 | 0.9347 | 0.9512 | 0.9429 | 0.9864 |

+ These results indicate the model's ability to correctly identify and classify named entities in text.
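+
+ The precision, recall, and F1 figures above are entity-level metrics of the kind computed by seqeval. A minimal sketch of computing such scores from predicted and gold tag sequences, assuming the `evaluate` and `seqeval` packages are installed:
+
+ ```python
+ import evaluate
+
+ seqeval = evaluate.load("seqeval")
+
+ # Toy predicted and reference tag sequences, one list per sentence
+ predictions = [["B-PER", "I-PER", "O", "B-LOC"]]
+ references = [["B-PER", "I-PER", "O", "B-ORG"]]
+
+ results = seqeval.compute(predictions=predictions, references=references)
+ print(results["overall_precision"], results["overall_recall"],
+       results["overall_f1"], results["overall_accuracy"])
+ ```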

+ ## Framework versions

 - Transformers 4.42.4
 - Pytorch 2.3.1+cu121