IEETA/Multi-Head-CRF

Spanish


richardjonker2000 committed on May 14

Commit bd2bf08 • 1 Parent(s): 88880eb

Update README.md


Files changed (1)

README.md +6 -6


@@ -12,7 +12,7 @@ metrics:
 Our model focuses on Biomedical Named Entity Recognition (NER) in Spanish clinical texts, crucial for automated information extraction in medical research and treatment improvements. It proposes a novel approach using a Multi-Head Conditional Random Field (CRF) classifier to tackle multi-class NER tasks, overcoming challenges of overlapping entity instances. The classes it recognizes include symptoms, procedures, diseases, chemicals, and proteins.
-We provide 4 different, models, available as branches of this repository.
+We provide 4 different models, available as branches of this repository.
 ## Model Details
@@ -51,14 +51,14 @@ Please refer to our GitHub repository for more information on how to train the m
 The training data can be found on IEETA/SPACCC-Spanish-NER, which is further described on the dataset card.
 The dataset used consists of 4 seperate datasets:
-- [MedProcNer](https://zenodo.org/records/8224056)
+- [SympTEMIST](https://zenodo.org/records/10635215)
+- [MedProcNER](https://zenodo.org/records/8224056)
 - [DisTEMIST](https://zenodo.org/records/7614764)
 - [PharmaCoNER](https://zenodo.org/records/4270158)
-- [SympTEMIST](https://zenodo.org/records/10635215)
 ### Speeds, Sizes, Times
-The models were trained using an Nvidia Quadra RTX 8000. The models for 5 classes took approximately 1 hour to train and occupy around 1GB of disk space. Additionally, this model shows linear complexity (+8 minutes) per entity class to classify.
+The models were trained using an Nvidia Quadro RTX 8000. The models for 5 classes took approximately 1 hour to train and occupy around 1GB of disk space. Additionally, this model shows linear complexity (+8 minutes) per entity class to classify.
 ### Testing Data, Factors & Metrics
@@ -67,7 +67,7 @@ The testing data can be found on IEETA/SPACCC-Spanish-NER, which is further desc
 #### Metrics
-The models were evaluated using the F1 score metric, the standard for entity recognition tasks.
+The models were evaluated using the micro-averaged F1-score metric, the standard for entity recognition tasks.
 ### Results
@@ -80,7 +80,7 @@ We provide 4 separate models with various hyperparameter changes:
 | 3            | None         | -               | -                        | **78.89** |
 | 1            | Random       | 0.25            | 0.50                     | **78.89** |
-All models are trained with a context size of 32 for 60 epochs.
+All models are trained with a context size of 32 tokens for 60 epochs.
 ## Citation