Weyaxi committed on
Commit
7d3ef8b
1 Parent(s): 9fcd0e5

readme draft 2

Files changed (1)
  1. README.md +41 -15
README.md CHANGED
@@ -140,35 +140,55 @@ tokens:
140
 
141
  # 📊 Datasets
142
 
143
  The following datasets were used in this model:
144
 
145
- - [MATH](https://huggingface.co/datasets/dahendrycks/competition_math)
146
 
147
- - [ARC](https://huggingface.co/datasets/allenai/ai2_arc) (Note: Only **train** part)
148
 
149
- - [camel-ai/physics](https://huggingface.co/datasets/camel-ai/physics)
150
 
151
- - [camel-ai/chemistry](https://huggingface.co/datasets/camel-ai/chemistry)
152
 
153
- - [camel-ai/biology](https://huggingface.co/datasets/camel-ai/biology)
154
 
155
- - [camel-ai/math](https://huggingface.co/datasets/camel-ai/math)
156
 
157
- - [STEM-AI-mtl/Electrical-engineering](https://huggingface.co/datasets/STEM-AI-mtl/Electrical-engineering)
158
 
159
- - [openbookqa](https://huggingface.co/datasets/openbookqa)
160
 
161
- - [piqa](https://huggingface.co/datasets/piqa)
162
 
163
- - [reclor](https://huggingface.co/datasets/metaeval/reclor)
164
 
165
- - [scibench](https://github.com/mandyyyyii/scibench)
166
 
167
- - [ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA)
168
 
169
- - [sciq](https://huggingface.co/datasets/sciq)
170
 
171
- - [ScienceEval](https://huggingface.co/datasets/TIGER-Lab/ScienceEval)
172
 
173
  # 💬 Prompt Template
174
 
@@ -193,14 +213,20 @@ tokens = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
193
 
194
  # 🤝 Acknowledgments
195
 
196
- Thanks to [@jondurbin](https://hf.co/jondurbin) for the code used to reformat some datasets: [bagel/data_sources](https://github.com/jondurbin/bagel/tree/main/bagel/data_sources)
197
 
198
  Thanks to [Together AI](https://www.together.ai) for providing everyone with free credits, which I used to generate a dataset in multiple-choice-to-explanation format.
199
 
200
  Thanks to all the dataset authors mentioned in the datasets section.
201
 
202
  Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for making the repository I used to make this model.
203
 
204
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
205
 
206
  If you would like to support me:
 
140
 
141
  # 📊 Datasets
142
 
143
+ You can find the datasets I used and the work I am doing with them here:
144
+
145
+ https://huggingface.co/datasets/Weyaxi/sci-datasets
146
+
147
  The following datasets were used in this model:
148
 
149
+ - 📐 [MATH](https://huggingface.co/datasets/dahendrycks/competition_math)
150
+
151
+ - 🧠 [ARC](https://huggingface.co/datasets/allenai/ai2_arc) (Note: Only **train** part)
152
+
153
+ - 🧲 [camel-ai/physics](https://huggingface.co/datasets/camel-ai/physics)
154
+
155
+ - ⚗️ [camel-ai/chemistry](https://huggingface.co/datasets/camel-ai/chemistry)
156
+
157
+ - 🦠 [camel-ai/biology](https://huggingface.co/datasets/camel-ai/biology)
158
+
159
+ - 📊 [camel-ai/math](https://huggingface.co/datasets/camel-ai/math)
160
 
161
+ - [STEM-AI-mtl/Electrical-engineering](https://huggingface.co/datasets/STEM-AI-mtl/Electrical-engineering)
162
 
163
+ - 📚 [openbookqa](https://huggingface.co/datasets/openbookqa)
164
 
165
+ - 🧠 [piqa](https://huggingface.co/datasets/piqa)
166
 
167
+ - 🎨 [reclor](https://huggingface.co/datasets/metaeval/reclor)
168
 
169
+ - 🔬 [scibench](https://github.com/mandyyyyii/scibench)
170
 
171
+ - 🧪 [ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA)
172
 
173
+ - 🧬 [sciq](https://huggingface.co/datasets/sciq)
174
 
175
+ - 📝 [ScienceEval](https://huggingface.co/datasets/TIGER-Lab/ScienceEval)
176
 
177
+ ## 🛠️ Multiple Choice Question & Answer Dataset Conversion Process
178
 
179
+ I used [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) to generate a reasonable and logical answer by providing it with the question and the answer key.
180
 
181
+ I used the [Together AI](https://www.together.ai) API for this task.
182
 
183
+ The following datasets were converted using this method:
184
 
185
+ - 🧠 [ARC](https://huggingface.co/datasets/allenai/ai2_arc) (Note: Only **train** part)
186
+
187
+ - 📚 [openbookqa](https://huggingface.co/datasets/openbookqa)
188
+
189
+ - 🎨 [reclor](https://huggingface.co/datasets/metaeval/reclor)
190
+
191
+ - 🧬 [sciq](https://huggingface.co/datasets/sciq)
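
The conversion described above (question plus answer key in, explanation out) can be sketched as a prompt-building step. This is only a minimal illustration: the helper name `build_explanation_prompt`, the prompt wording, and the sample question are assumptions and not the prompt actually used for this model. The call to the Together AI API is deliberately left out; the returned string is what would be sent to the model.

```python
from typing import Sequence


def build_explanation_prompt(
    question: str, choices: Sequence[str], answer_key: str
) -> str:
    """Build a prompt that asks a model to explain a known correct answer.

    Hypothetical helper: the wording below is an assumption, not the
    author's actual prompt.
    """
    labels = "ABCDEFGH"
    if len(choices) > len(labels):
        raise ValueError("too many choices for the available labels")

    # One label per choice, e.g. ["A", "B", "C", "D"] for four choices.
    valid_labels = list(labels[: len(choices)])
    if answer_key not in valid_labels:
        raise ValueError(f"answer key {answer_key!r} does not match any choice")

    choice_lines = [
        f"{label}. {choice}" for label, choice in zip(valid_labels, choices)
    ]
    parts = [
        f"Question: {question}",
        "Choices:\n" + "\n".join(choice_lines),
        f"Correct answer: {answer_key}",
        "Explain step by step why this answer is correct.",
    ]
    return "\n\n".join(parts)


# Illustrative question, not taken from any of the datasets listed above.
prompt = build_explanation_prompt(
    "Which gas do plants absorb for photosynthesis?",
    ["Oxygen", "Carbon dioxide", "Nitrogen", "Helium"],
    "B",
)
print(prompt)
```

Giving the model the answer key keeps the generated explanation anchored to the dataset's label, so the model only has to justify the answer rather than find it.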
192
 
193
  # 💬 Prompt Template
194
 
 
213
 
214
  # 🤝 Acknowledgments
215
 
216
+ Thanks to the [openchat](https://huggingface.co/openchat) team for fine-tuning an excellent model that I used as a base model.
217
+
218
+ Thanks to [@jondurbin](https://huggingface.co/jondurbin) for the code used to reformat some datasets: [bagel/data_sources](https://github.com/jondurbin/bagel/tree/main/bagel/data_sources)
219
 
220
  Thanks to [Together AI](https://www.together.ai) for providing everyone with free credits, which I used to generate a dataset in multiple-choice-to-explanation format.
221
 
222
+ Thanks to [Tim Dettmers](https://huggingface.co/timdettmers) for his excellent [QLoRA](https://arxiv.org/abs/2305.14314) work.
223
+
224
  Thanks to all the dataset authors mentioned in the datasets section.
225
 
226
  Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for making the repository I used to make this model.
227
 
228
+ Overall, thanks to the whole open-source AI community! 🚀
229
+
230
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
231
 
232
  If you would like to support me: