Commit a592a49 (committed by Chanukya Patnaik)
1 Parent(s): 1b4b4c7

Update README.md

Files changed (1)
  1. README.md +5 -38
README.md CHANGED
@@ -14,8 +14,7 @@ This model card aims to be a base template for new models. It has been generated
 
  ## Why use effi-13B-Instruct?
  - This is a ready-to-use chat/instruct model based on Llama-2-13b-chat-hf, which provides a rationale for the context provided.
- - Llama-2 is the best open-source model available.
- This is an instruct model, which may not be ideal for further finetuning. If you are interested in building your own instruct/chat model, we recommend starting from **Llama-2-13b-chat-hf**
+ - Llama-2 is one of the strongest open-source models available. This is an instruct model, which may not be ideal for further fine-tuning. If you are interested in building your own instruct/chat model, we recommend starting from **Llama-2-13b-chat-hf**.
 
  You will need at least **85-100GB of memory to swiftly run inference with effi-13b**.
 
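For readers who want to try the model, a minimal inference sketch with `transformers` follows. The repo id `aiplanet/effi-13b` is an assumption (substitute the model's actual Hugging Face path), and half-precision loading is one common way to reduce the memory footprint below the figure quoted above.

```python
# Minimal inference sketch for effi-13b. The repo id below is an assumption;
# replace it with the model's actual Hugging Face path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aiplanet/effi-13b"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: roughly 26 GB of weights for 13B params
    device_map="auto",          # shard layers across available GPUs/CPU automatically
)

prompt = "Explain why the sky appears blue, and give your rationale."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```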
@@ -23,7 +22,7 @@ You will need at least **85-100GB of memory to swiftly run inference with effi-1
 
  ### Model Description
 
- This model has been fine tuned on Chain of Thought datasets which has context from mixed sources with corresponding rationale. The final finetuned Large Language Model(LLM) have shown enhanced capabilities of solving novel tasks by providing a reasoning.
+ This model has been fine-tuned on Chain-of-Thought datasets, which have context from mixed sources with corresponding rationales. The final fine-tuned Large Language Model (LLM) has shown enhanced capabilities for solving novel tasks by providing its reasoning.
 
 
 
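To make the Chain-of-Thought data format concrete, here is a hypothetical record of the kind such datasets typically contain; the field names and text are illustrative assumptions, not the actual schema used for this model.

```python
# A hypothetical Chain-of-Thought training record: context from a source
# document, a question, the intermediate rationale, and the final answer.
# Field names and content are illustrative assumptions, not the real schema.
cot_record = {
    "context": "Photosynthesis converts light energy into chemical energy stored in glucose.",
    "question": "Why do plants need sunlight?",
    "rationale": (
        "Photosynthesis requires light energy to drive the reactions that "
        "produce glucose, so without sunlight the plant cannot make food."
    ),
    "answer": "Sunlight supplies the energy plants use to synthesize glucose.",
}
```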
@@ -31,19 +30,9 @@ This model has been fine tuned on Chain of Thought datasets which has context f
  - **Model type:** Causal decoder-only
  - **Language(s) (NLP):** English
  - **License:** Apache 2.0
- - **Finetuned from model :** Llama-2-13b-chat-hf
+ - **Finetuned from model:** Llama-2-13b-chat-hf
 
- ### Model Sources [optional]
 
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
  ### Direct Use
 
@@ -170,11 +159,7 @@ The data was tokenized with the **meta-llama/Llama-2-13b-chat-hf** tokenizer.
 
  ### Training Procedure
 
- Finetuning approach using PefT and Qlora(https://huggingface.co/blog/4bit-transformers-bitsandbytes)
-
- #### Preprocessing [optional]
-
- [More Information Needed]
+ Fine-tuning approach using [PEFT and QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes).
 
 
  #### Training Hyperparameters
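To make the linked approach concrete, a hedged sketch of a PEFT + QLoRA setup in the spirit of that blog post follows. Only the base model and tokenizer (`meta-llama/Llama-2-13b-chat-hf`) come from the card itself; the LoRA rank, target modules, and other hyperparameters are illustrative assumptions, not the configuration actually used.

```python
# Sketch of a PEFT + QLoRA fine-tuning setup, following the linked
# bitsandbytes blog post. Hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/Llama-2-13b-chat-hf"  # base model named in the card

# Load the base model in 4-bit NF4 quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)  # prep quantized weights for training

# Attach LoRA adapters; rank and target modules are assumed values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```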
@@ -217,25 +202,7 @@ Finetuning approach using PefT and Qlora(https://huggingface.co/blog/4bit-transf
 
  Paper coming soon.
 
- See the OpenLLM Leaderboard(https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)for early results.
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
+ See the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) for early results.
 
  ## Citation
 
 