S-Dreamer commited on
Commit
248b0ce
·
verified ·
1 Parent(s): 6568226

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +84 -9
README.md CHANGED
@@ -1,13 +1,88 @@
1
  ---
2
- title: Raft Qa Space
3
- emoji: 🚀
4
- colorFrom: green
5
- colorTo: indigo
 
 
 
 
 
 
 
 
 
6
  sdk: gradio
7
- sdk_version: 5.20.1
8
- app_file: app.py
9
- pinned: false
10
- license: apache-2.0
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
+ language:
3
+ - en
4
+ tags:
5
+ - retrieval-augmented-learning
6
+ - question-answering
7
+ - fine-tuning
8
+ - transformers
9
+ - llm
10
+ license: mit
11
+ datasets:
12
+ - pubmedqa
13
+ - hotpotqa
14
+ - gorilla
15
  sdk: gradio
 
 
 
 
16
  ---
17
 
18
+
19
+ # RAFT-QA: Retrieval-Augmented Fine-Tuning for Question Answering
20
+
21
+ ## Model Overview
22
+
23
+ RAFT-QA is a sophisticated **retrieval-augmented** question-answering model designed to significantly enhance answer accuracy through the integration of **retrieved documents** during the fine-tuning process. By utilizing retrieval-enhanced training, it advances traditional fine-tuning techniques.
24
+
25
+ ## Model Details
26
+
27
+ - **Base Model Options:** `mistral-7b`, `falcon-40b-instruct`, or other leading large language models (LLMs)
28
+
29
+ - **Fine-Tuning Technique:** RAFT (Retrieval-Augmented Fine-Tuning)
30
+
31
+ - **Retrieval Strategy:** FAISS-based document embedding retrieval
32
+
33
+ - **Training Datasets Included:** PubMedQA, HotpotQA, Gorilla
34
+
35
+ ## How It Works
36
+
37
+ 1. **Retrieve Relevant Documents:** FAISS efficiently retrieves the most pertinent documents in response to a query.
38
+
39
+ 2. **Augment Input with Retrieved Context:** Incorporates the retrieved documents into the input data.
40
+
41
+ 3. **Fine-Tune the Model:** The model learns to effectively weigh the retrieved context to produce improved answers.
42
+
43
+ ## Performance Comparison
44
+
45
+ | Model | Exact Match (EM) | F1 Score |
46
+ |------------------------|------------------|----------|
47
+ | GPT-3.5 (baseline) | 74.8 | 84.5 |
48
+ | Standard Fine-Tuning | 76.2 | 85.6 |
49
+ | **RAFT-QA (ours)** | **79.3** | **87.1** |
50
+
51
+ ## Usage
52
+
53
+ To load the model using the `transformers` library:
54
+
55
+ ```python
56
+ from transformers import AutoModelForQuestionAnswering, AutoTokenizer
57
+
58
+ model_name = "your-hf-username/raft-qa"
59
+
60
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
61
+ model = AutoModelForQuestionAnswering.from_pretrained(model_name)
62
+ ```
63
+
64
+ ## Limitations
65
+
66
+ - The model's performance is contingent on the quality of the retrieved documents.
67
+ - For optimal results, domain-specific tuning may be necessary.
68
+
69
+ ## Citation
70
+
71
+ If you utilize this model in your work, please cite it as follows:
72
+
73
+ ```
74
+ @article{raft2025,
75
+ title={Retrieval-Augmented Fine-Tuning (RAFT) for Enhanced Question Answering},
76
+ author={Your Name et al.},
77
+ journal={ArXiv},
78
+ year={2025}
79
+ }
80
+ ```
81
+
82
+ ## License
83
+
84
+ This model is released under the Apache 2.0 License.
85
+
86
+ ---
87
+
88
+ This version provides clarity and conciseness, ensuring all sections are clear and correctly formatted according to the Hugging Face repository standards. Make sure the dataset type (`question-answering`) matches your intended use case.