import gradio as gr
from transformers import pipeline
import torch

title = "Extractive QA Biomedicine"
description = """
<p style="text-align: justify;"> 
Given the existence of masked language models trained on Spanish Biomedical corpora, the objective of this project is to use them to generate extractive QA models for Biomedicine and compare their effectiveness with that of general masked language models.

The models were trained on the <a href="https://huggingface.co/datasets/squad_es">SQUAD_ES Dataset</a> (an automatic translation of the Stanford Question Answering Dataset into Spanish). The SQuAD v2 version was chosen in order to include questions that cannot be answered from the provided context.

The models were evaluated on the <a href="https://huggingface.co/datasets/hackathon-pln-es/biomed_squad_es_v2">BIOMED_SQUAD_ES_V2 Dataset</a>, a subset of the SQUAD_ES dev set containing questions related to the Biomedical domain.
</p>
"""
article = """
<p style="text-align: justify;">
<h3>Results</h3>
<table class="table table-bordered table-hover table-condensed">
<thead><tr><th title="Field #1">Model</th>
<th title="Field #2">Base Model Domain</th>
<th title="Field #3">exact</th>
<th title="Field #4">f1</th>
<th title="Field #5">HasAns_exact</th>
<th title="Field #6">HasAns_f1</th>
<th title="Field #7">NoAns_exact</th>
<th title="Field #8">NoAns_f1</th>
</tr></thead>
<tbody><tr>
<td><a href="https://huggingface.co/hackathon-pln-es/roberta-base-bne-squad2-es">hackathon-pln-es/roberta-base-bne-squad2-es</a></td>
<td>General</td>
<td align="right">67.6341</td>
<td align="right">75.6988</td>
<td align="right">53.7367</td>
<td align="right">70.0526</td>
<td align="right">81.2174</td>
<td align="right">81.2174</td>
</tr>
<tr>
<td><a href="https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es">hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es</a></td>
<td>Biomedical</td>
<td align="right">66.8426</td>
<td align="right">75.2346</td>
<td align="right">53.0249</td>
<td align="right">70.0031</td>
<td align="right">80.3478</td>
<td align="right">80.3478</td>
</tr>
<tr>
<td><a href="https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-es-squad2-es">hackathon-pln-es/roberta-base-biomedical-es-squad2-es</a></td>
<td>Biomedical</td>
<td align="right">67.6341</td>
<td align="right">74.5612</td>
<td align="right">47.6868</td>
<td align="right">61.7012</td>
<td align="right">87.1304</td>
<td align="right">87.1304</td>
</tr>
<tr>
<td><a href="https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es">hackathon-pln-es/biomedtra-small-es-squad2-es</a></td>
<td>Biomedical</td>
<td align="right">29.6394</td>
<td align="right">36.317</td>
<td align="right">32.2064</td>
<td align="right">45.716</td>
<td align="right">27.1304</td>
<td align="right">27.1304</td>
</tr>
</tbody></table>
<h3>Conclusion and Future Work</h3>
Considering the F1 score, the results suggest that there may be no advantage in using domain-specific masked language models to generate Biomedical QA models.
In any case, the scores reported for the Biomedical RoBERTa-based models are not far below those of the general RoBERTa-based model.

However, if only unanswerable questions are taken into account, the model with the best F1 score is hackathon-pln-es/roberta-base-biomedical-es-squad2-es.

As future work, the following experiments could be carried out:
<ul>
<li>Use Biomedical masked language models that were not trained from scratch on a Biomedical corpus but adapted from a general model, so as not to lose words and features of Spanish that also appear in Biomedical questions and articles.</li>
<li>Create a Biomedical training dataset in SQuAD v2 format.</li>
<li>Generate a new, larger Spanish Biomedical validation dataset that is not translated from English, as the SQUAD_ES Dataset is.</li>
<li>Ensemble different models.</li>
</ul>
</p>

<h3>Team</h3>
Santiago Maximo
"""

# Run inference on GPU (device 0) if available, otherwise fall back to CPU.
device = 0 if torch.cuda.is_available() else -1
MODEL_NAMES = ["hackathon-pln-es/roberta-base-bne-squad2-es",
               "hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es",
               "hackathon-pln-es/roberta-base-biomedical-es-squad2-es",
               "hackathon-pln-es/biomedtra-small-es-squad2-es"]

examples = [
    [MODEL_NAMES[2], "¿Qué cidippido se utiliza como descripción de los ctenóforos en la mayoría de los libros de texto?","Para un filo con relativamente pocas especies, los ctenóforos tienen una amplia gama de planes corporales. Las especies costeras necesitan ser lo suficientemente duras para soportar las olas y remolcar partículas de sedimentos, mientras que algunas especies oceánicas son tan frágiles que es muy difícil capturarlas intactas para su estudio. Además, las especies oceánicas no conservan bien, y son conocidas principalmente por fotografías y notas de observadores. Por lo tanto, la mayor atención se ha concentrado recientemente en tres géneros costeros: Pleurobrachia, Beroe y Mnemiopsis. Al menos dos libros de texto basan sus descripciones de ctenóforos en los cidipépidos Pleurobrachia."],
    [MODEL_NAMES[0], "¿Dónde se atasca un fagocito en un patógeno?", "La fagocitosis es una característica importante de la inmunidad celular innata llevada a cabo por células llamadas fagocitos que absorben, o comen, patógenos o partículas. Los fagocitos generalmente patrullan el cuerpo en busca de patógenos, pero pueden ser llamados a lugares específicos por citoquinas. Una vez que un patógeno ha sido absorbido por un fagocito, queda atrapado en una vesícula intracelular llamada fagosoma, que posteriormente se fusiona con otra vesícula llamada lisosoma para formar un fagocito."],

]

def getanswer(model_name, question, context):
    # Build the QA pipeline for the selected model on each call,
    # so the radio-button choice always takes effect.
    question_answerer = pipeline("question-answering", model=model_name, device=device)

    response = question_answerer({
        'question': question,
        'context': context
    })
    return response['answer'], response['score']
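
# Example usage of getanswer, kept as a commented-out sketch so the demo's
# behavior is unchanged; the question/context are shortened from the examples
# below, and the printed values depend on the selected model rather than being
# fixed results. Note that the transformers QA pipeline also accepts
# handle_impossible_answer=True to allow empty answers for unanswerable questions.
#
#   answer, score = getanswer(
#       MODEL_NAMES[0],
#       "¿Dónde queda atrapado un patógeno?",
#       "Una vez que un patógeno ha sido absorbido por un fagocito, queda "
#       "atrapado en una vesícula intracelular llamada fagosoma.",
#   )
#   print(answer, score)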

face = gr.Interface(
    fn=getanswer, 
    inputs=[
        gr.inputs.Radio(
            label="Pick a QA Model",
            choices=MODEL_NAMES,
        ),            
        gr.inputs.Textbox(lines=1, placeholder="Question Here… "), 
        gr.inputs.Textbox(lines=10, placeholder="Context Here… ")
    ], 
    outputs=[
        gr.outputs.Textbox(label="Answer"),
        gr.outputs.Textbox(label="Score"),
    ],
    layout="vertical",
    title=title,
    examples=examples,
    description=description,
    article=article,
    allow_flagging="never"
)
face.launch()