Sarmila committed on
Commit
2850262
1 Parent(s): 343875a

Update README.md

Files changed (1)
  1. README.md +18 -4
README.md CHANGED
@@ -3,11 +3,16 @@ license: mit
3
  base_model: Sarmila/pubmed-bert-squad-covidqa
4
  tags:
5
  - generated_from_trainer
 
6
  datasets:
7
  - covid_qa_deepset
 
8
  model-index:
9
  - name: pubmed-bert-squad-covidqa
10
  results: []
 
 
 
11
  ---
12
 
13
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,13 +20,22 @@ should probably proofread and complete it, then remove this comment. -->
15
 
16
  # pubmed-bert-squad-covidqa
17
 
18
- This model is a fine-tuned version of [Sarmila/pubmed-bert-squad-covidqa](https://huggingface.co/Sarmila/pubmed-bert-squad-covidqa) on the covid_qa_deepset dataset.
19
- It achieves the following results on the evaluation set:
 
 
 
20
  - Loss: 0.4876
21
 
22
  ## Model description
23
 
24
- More information needed
 
 
 
 
 
 
25
 
26
  ## Intended uses & limitations
27
 
@@ -58,4 +72,4 @@ The following hyperparameters were used during training:
58
  - Transformers 4.33.0
59
  - Pytorch 2.0.0
60
  - Datasets 2.1.0
61
- - Tokenizers 0.13.3
 
3
  base_model: Sarmila/pubmed-bert-squad-covidqa
4
  tags:
5
  - generated_from_trainer
6
+ - biology
7
  datasets:
8
  - covid_qa_deepset
9
+ - squad
10
  model-index:
11
  - name: pubmed-bert-squad-covidqa
12
  results: []
13
+ language:
14
+ - en
15
+ pipeline_tag: question-answering
16
  ---
17
 
18
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
20
 
21
  # pubmed-bert-squad-covidqa
22
 
23
+ This model is a fine-tuned version of [Sarmila/pubmed-bert-squad-covidqa](https://huggingface.co/Sarmila/pubmed-bert-squad-covidqa), trained first on the squad dataset and then on the covid_qa_deepset dataset.
24
+ It achieves the following results on the evaluation set for squad:
25
+ {'exact_match': 59.0, 'f1': 76.32473929579194}
26
+
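The `exact_match` and `f1` figures reported for squad are conventionally computed per question on normalized answer strings and then averaged. The snippet below is a minimal sketch of that SQuAD-style scoring, not necessarily the evaluation script used for this model card. The function names and the single-reference assumption (one gold answer per question) are illustrative choices.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most min(count) times.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions, truths):
    """Average both metrics over all questions, scaled to percentages."""
    n = len(truths)
    em = sum(exact_match(p, t) for p, t in zip(predictions, truths)) / n
    f1 = sum(f1_score(p, t) for p, t in zip(predictions, truths)) / n
    return {"exact_match": 100.0 * em, "f1": 100.0 * f1}
```

Because F1 gives partial credit for overlapping tokens while exact match does not, an F1 noticeably higher than the exact-match score, as in the figures above, usually indicates predictions that overlap the gold span without matching its boundaries exactly.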
27
+ It achieves the following results on the evaluation set for covidqa:
28
  - Loss: 0.4876
29
 
30
  ## Model description
31
 
32
+ This model was trained to test the PubMed BERT bionlp language model in a question-answering pipeline.
33
+ While testing on our custom dataset, we realized that the model performed poorly when used directly for QA. Hence, we decided to train on covidqa
34
+ to make the model accustomed to answer extraction. Although the covidqa data is very similar to what we intend to use, it is small, so it yielded little improvement.
35
+
36
+ Therefore, we first trained the model on the squad dataset, which is larger, and then trained it on covid qa. Squad helped the model learn how to extract answers, and covid qa helped adapt the model to a domain similar to ours, i.e. biomedicine.
37
+
38
+ Further, we first performed MLM using our dataset on PubMed BERT bionlp and then ran exactly the same pipeline to see the difference; that model is [here]
39
 
40
  ## Intended uses & limitations
41
 
 
72
  - Transformers 4.33.0
73
  - Pytorch 2.0.0
74
  - Datasets 2.1.0
75
+ - Tokenizers 0.13.3