RichardErkhov committed on
Commit
691aada
1 Parent(s): 710f20d

uploaded readme

Files changed (1): README.md added (+109 lines)
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

medalpaca-7b - bnb 8bits
- Model creator: https://huggingface.co/medalpaca/
- Original model: https://huggingface.co/medalpaca/medalpaca-7b/

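Since this upload stores the weights in bitsandbytes 8-bit format, loading follows the standard `transformers` + `bitsandbytes` path. A minimal sketch, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available; it loads the original `medalpaca/medalpaca-7b` weights and quantizes them to 8 bits on the fly (the exact repo id of this quantized upload is not restated here, so the original id is used for illustration):

```python
# Sketch only: nothing heavy runs at import time.
MODEL_ID = "medalpaca/medalpaca-7b"  # original weights, quantized on load

def load_8bit_model():
    # Imports kept inside the function so the sketch can be read and
    # type-checked without the heavy dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(load_in_8bit=True)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        quantization_config=quant_config,  # bitsandbytes 8-bit weights
        device_map="auto",                 # spread layers across devices
    )
    return model, tokenizer
```

Loading a pre-quantized upload works the same way, with the repo id of this repository in place of `MODEL_ID`.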

Original model description:
---
license: cc
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- medical
---

# MedAlpaca 7b

## Table of Contents

[Model Description](#model-description)
- [Architecture](#architecture)
- [Training Data](#training-data)

[Model Usage](#model-usage)

[Limitations](#limitations)

## Model Description

### Architecture

`medalpaca-7b` is a large language model specifically fine-tuned for medical domain tasks. It is based on LLaMA (Large Language Model Meta AI) and contains 7 billion parameters. The primary goal of this model is to improve question answering and medical dialogue tasks.

### Training Data

The training data for this project was sourced from several resources. First, we used Anki flashcards to automatically generate question-answer pairs, taking questions from the front of each card and answers from the back. Second, we generated medical question-answer pairs from [Wikidoc](https://www.wikidoc.org/index.php/Main_Page): we extracted paragraphs with relevant headings and used ChatGPT (GPT-3.5) to generate questions from the headings, with the corresponding paragraphs as answers. This dataset is still under development, and we believe that approximately 70% of these question-answer pairs are factually correct. Third, we extracted question-answer pairs from StackExchange, taking the top-rated questions from five categories: Academia, Bioinformatics, Biology, Fitness, and Health. Additionally, we used a dataset from [ChatDoctor](https://arxiv.org/abs/2303.14070) consisting of 200,000 question-answer pairs, available at https://github.com/Kent0n-Li/ChatDoctor.

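The flashcard conversion described above can be sketched in a few lines. The field names (`"front"`, `"back"`) are assumptions for illustration, not the actual Anki export schema:

```python
# Hypothetical sketch of the flashcard-to-QA conversion: the front of a
# card becomes the question, the back becomes the answer.
def cards_to_qa(cards):
    return [{"question": c["front"], "answer": c["back"]} for c in cards]

cards = [{"front": "What hormone lowers blood glucose?", "back": "Insulin"}]
pairs = cards_to_qa(cards)
print(pairs[0])  # {'question': 'What hormone lowers blood glucose?', 'answer': 'Insulin'}
```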
| Source                       | n items |
|------------------------------|---------|
| ChatDoc large                | 200000  |
| wikidoc                      | 67704   |
| Stackexchange academia       | 40865   |
| Anki flashcards              | 33955   |
| Stackexchange biology        | 27887   |
| Stackexchange fitness        | 9833    |
| Stackexchange health         | 7721    |
| Wikidoc patient information  | 5942    |
| Stackexchange bioinformatics | 5407    |

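Summing the per-source counts in the table gives the overall corpus size, roughly 400k items:

```python
# Per-source item counts, taken directly from the table above.
counts = {
    "ChatDoc large": 200000,
    "wikidoc": 67704,
    "Stackexchange academia": 40865,
    "Anki flashcards": 33955,
    "Stackexchange biology": 27887,
    "Stackexchange fitness": 9833,
    "Stackexchange health": 7721,
    "Wikidoc patient information": 5942,
    "Stackexchange bioinformatics": 5407,
}
total = sum(counts.values())
print(total)  # 399314
```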
## Model Usage

To evaluate the performance of the model on a specific dataset, you can use the Hugging Face Transformers library's built-in evaluation scripts; please refer to the evaluation guide for more information.

### Inference

You can use the model for inference tasks such as question answering and medical dialogues with the Hugging Face Transformers library. Here is an example of using the model for a question-answering task:

```python
from transformers import pipeline

# Build a text-generation pipeline from the MedAlpaca weights and tokenizer.
pl = pipeline("text-generation", model="medalpaca/medalpaca-7b", tokenizer="medalpaca/medalpaca-7b")

question = "What are the symptoms of diabetes?"
context = "Diabetes is a metabolic disease that causes high blood sugar. The symptoms include increased thirst, frequent urination, and unexplained weight loss."

# Context and question are laid out in a single prompt ending in "Answer: ",
# which the model completes.
answer = pl(f"Context: {context}\n\nQuestion: {question}\n\nAnswer: ")
print(answer)
```
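Note that a text-generation pipeline typically returns the prompt together with the continuation, so the answer usually has to be split off after the final `Answer: ` marker. A minimal post-processing sketch, using a hypothetical generated string rather than actual model output:

```python
def extract_answer(generated: str) -> str:
    # Keep only the text after the last "Answer: " marker in the prompt echo.
    return generated.rsplit("Answer: ", 1)[-1].strip()

# Hypothetical pipeline output (prompt echoed back, then the completion).
generated = (
    "Context: Diabetes is a metabolic disease that causes high blood sugar.\n\n"
    "Question: What are the symptoms of diabetes?\n\n"
    "Answer: Increased thirst, frequent urination, and unexplained weight loss."
)
print(extract_answer(generated))  # Increased thirst, frequent urination, and unexplained weight loss.
```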

## Limitations

The model may not perform effectively outside the scope of the medical domain. The training data primarily targets the knowledge level of medical students, which may result in limitations when addressing the needs of board-certified physicians. The model has not been tested in real-world applications, so its efficacy and accuracy are currently unknown. It should never be used as a substitute for a doctor's opinion and must be treated as a research tool only.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_medalpaca__medalpaca-7b).

| Metric              | Value |
|---------------------|-------|
| Avg.                | 44.98 |
| ARC (25-shot)       | 54.1  |
| HellaSwag (10-shot) | 80.42 |
| MMLU (5-shot)       | 41.47 |
| TruthfulQA (0-shot) | 40.46 |
| Winogrande (5-shot) | 71.19 |
| GSM8K (5-shot)      | 3.03  |
| DROP (3-shot)       | 24.21 |
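As a sanity check, the reported average is consistent with the mean of the seven individual benchmark scores in the table:

```python
# Benchmark scores copied from the leaderboard table above.
scores = {
    "ARC (25-shot)": 54.1,
    "HellaSwag (10-shot)": 80.42,
    "MMLU (5-shot)": 41.47,
    "TruthfulQA (0-shot)": 40.46,
    "Winogrande (5-shot)": 71.19,
    "GSM8K (5-shot)": 3.03,
    "DROP (3-shot)": 24.21,
}
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 44.98, matching the reported "Avg." row
```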