Tags: Question Answering · Transformers · PyTorch · English · llama · text-generation · biology · medical · Inference Endpoints · text-generation-inference
Commit da62bce by G-AshwinKumar (1 parent: 8353550)

Update README.md

Files changed (1): README.md (+11, -2)
README.md CHANGED
@@ -22,12 +22,13 @@ library_name: transformers
 tags:
 - biology
 - medical
+pipeline_tag: question-answering
 ---
 # Aloe: A New Family of Healthcare LLMs
 
 Aloe is a new family of healthcare LLMs that is highly competitive with all previous open models of its range and reaches state-of-the-art results at its size by using model merging and advanced prompting strategies. Aloe scores high in metrics measuring ethics and factuality, thanks to a combined red teaming and alignment effort. Complete training details, model merging configurations, and all training data (including synthetically generated data) will be shared. Additionally, the prompting repository used in this work to produce state-of-the-art results during inference will also be shared. Aloe comes with a healthcare-specific risk assessment to contribute to the safe use and deployment of such systems.
 
-<img src="https://cdn-uploads.huggingface.co/production/uploads/62f7a16192950415b637e201/HMD6WEoqqrAV8Ng_fAcnN.png" width="95%">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/62972c4979f193515da1d38e/xlssx5_3_kLQlJlmE-aya.png" width="95%">
 
 ## Model Details
 
@@ -168,6 +169,9 @@ Supervised fine-tuning on top of Llama 3 8B using medical and general domain dat
 ### Training Data
 
 - Medical domain datasets, including synthetic data generated using Mixtral-8x7B and Genstruct
+- HPAI-BSC/pubmedqa-cot
+- HPAI-BSC/medqa-cot
+- HPAI-BSC/medmcqa-cot
 - LDJnr/Capybara
 - hkust-nlp/deita-10k-v0
 - jondurbin/airoboros-3.2
@@ -212,6 +216,9 @@ With the help of prompting techniques the performance of Llama3-Aloe-8B-Alpha is
 - **Compute Region:** Spain
 - **Carbon Emitted:** 439.25kg
 
+## Model Card Authors
+[Ashwin Kumar Gururajan](https://huggingface.co/G-AshwinKumar)
+
 ## Model Card Contact
 
 mailto:hpai@bsc.es
@@ -220,6 +227,7 @@ mailto:hpai@bsc.es
 
 If you use this repository in a published work, please cite the following papers as source:
 
+```
 @misc{gururajan2024aloe,
 title={Aloe: A Family of Fine-tuned Open Healthcare LLMs},
 author={Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Jordi Bayarri-Planas and Adrian Tormos and Daniel Hinjos and Pablo Bernabeu-Perez and Anna Arias-Duart and Pablo Agustin Martin-Torres and Lucia Urcelay-Ganzabal and Marta Gonzalez-Mallo and Sergio Alvarez-Napagao and Eduard Ayguadé-Parra and Ulises Cortés Dario Garcia-Gasulla},
@@ -227,4 +235,5 @@ If you use this repository in a published work, please cite the following papers
 eprint={2405.01886},
 archivePrefix={arXiv},
 primaryClass={cs.CL}
-}
+}
+```
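For context, the card this commit edits describes a fine-tune of Llama 3 8B tagged for transformers/PyTorch text generation, so it can be driven like any causal LM. Below is a minimal sketch, not taken from the model card itself; the repository id `HPAI-BSC/Llama3-Aloe-8B-Alpha`, the system prompt, and the example question are assumptions inferred from the model name and contact address that appear in the diff context.

```python
# Hedged sketch: load the Aloe checkpoint as a plain causal LM and ask one
# medical question. "HPAI-BSC/Llama3-Aloe-8B-Alpha" is an assumed repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HPAI-BSC/Llama3-Aloe-8B-Alpha"  # assumption, not stated in the diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 to keep the 8B model memory-friendly
    device_map="auto",
)

# Build a Llama 3 chat-style prompt via the tokenizer's chat template.
messages = [
    {"role": "system", "content": "You are a helpful medical assistant."},
    {"role": "user", "content": "What are first-line treatments for type 2 diabetes?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Plain greedy decoding; print only the newly generated tokens.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that the state-of-the-art numbers mentioned in the card rely on the advanced inference-time prompting strategies referenced there; the snippet above is only a plain single-turn generation call.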