Sandiago21 committed

Commit 6e1cb11
1 Parent(s): 325d3f4

update readme with instructions on how to load and test the model

Files changed (1):
  1. README.md  +28 -5

README.md CHANGED
@@ -20,7 +20,7 @@ This repository contains a LLaMA-13B further fine-tuned model on conversations a
 
 ## Model Details
 
-Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder.
+Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder. The Jupyter Notebook contains example code to load the model and ask prompts to it as well as example prompts to get you started.
 
 ### Model Description
 
@@ -102,7 +102,7 @@ Use the code below to get started with the model.
 import torch
 from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
 
-MODEL_NAME = "Sandiago21/llama-7b-hf-prompt-answering"
+MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"
 
 config = PeftConfig.from_pretrained(MODEL_NAME)
 
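The hunk above only shows a few context lines of the README's quick-start code. A minimal self-contained loading sketch following the standard peft/transformers pattern is shown below; the fp16 dtype and `device_map="auto"` setting are assumptions, not taken from this commit.

```python
# Sketch only: load the base model named in the adapter config, then attach
# the fine-tuned adapter. fp16 and device_map="auto" (requires accelerate)
# are assumptions, not settings from the repository's README.
import torch
from peft import PeftConfig, PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"

config = PeftConfig.from_pretrained(MODEL_NAME)

model = LlamaForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, MODEL_NAME)
model.eval()
```

`PeftConfig.from_pretrained` reads only the small adapter config file, so it is a cheap way to discover which base checkpoint the adapter expects before downloading the full weights.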
 
@@ -159,8 +159,8 @@ print(response)
 import torch
 from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
 
-MODEL_NAME = "Sandiago21/llama-7b-hf-prompt-answering"
-BASE_MODEL = "decapoda-research/llama-7b-hf
+MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"
+BASE_MODEL = "decapoda-research/llama-13b-hf"
 
 config = PeftConfig.from_pretrained(MODEL_NAME)
 
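Assuming `model` and `tokenizer` are loaded as in the sketch above, a hypothetical generation example that ends in the `print(response)` call named in the hunk header is given below; the prompt template and sampling settings are illustrative, not the README's exact values.

```python
# Assumes `model` and `tokenizer` are already loaded (see the loading sketch above).
# Prompt template and generation settings are illustrative assumptions.
import torch
from transformers import GenerationConfig

prompt = "### Question: What is the capital of Greece?\n### Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.2,
    top_p=0.75,
    max_new_tokens=128,
)

with torch.no_grad():
    output = model.generate(**inputs, generation_config=generation_config)

response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```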
 
@@ -213,6 +213,29 @@ print(response)
 
 ## Training Details
 
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 2
+- mixed_precision_training: Native AMP
+
+### Framework versions
+
+- Transformers 4.28.1
+- Pytorch 2.0.0+cu117
+- Datasets 2.12.0
+- Tokenizers 0.12.1
 
 ### Training Data
 
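The hyperparameters added in this hunk map directly onto a `transformers.Trainer` configuration. A hypothetical reconstruction follows; the output directory and the exact optimizer variant are assumptions, not stated in the commit.

```python
# Hypothetical reconstruction of a Trainer setup matching the hyperparameters
# listed above. The output path and optimizer name are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./llama-13b-prompt-answering",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=2,
    fp16=True,                      # Native AMP mixed precision
    optim="adamw_torch",            # Adam betas=(0.9, 0.999), eps=1e-8 are the defaults
)
```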
 
@@ -226,4 +249,4 @@ The decapoda-research/llama-13b-hf model was further trained and finetuned on qu
 
 ## Model Architecture and Objective
 
-The model is based on decapoda-research/llama-13b-hf model and finetuned adapters on top of the main model on conversations and question answering data.
+The model is based on decapoda-research/llama-13b-hf model and finetuned adapters on top of the main model on conversations and question answering data.
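Because the card describes the artifact as adapters fine-tuned on top of decapoda-research/llama-13b-hf, a hedged sketch of inspecting the adapter config and optionally merging the adapter into the base weights is shown below; merging is not mentioned in the README and is included only for illustration.

```python
# Sketch: the published artifact is a PEFT adapter over the LLaMA-13B base.
# merge_and_unload() folds the adapter weights into the base model; whether
# merged or adapter-style use is intended here is an assumption.
import torch
from peft import PeftConfig, PeftModel
from transformers import LlamaForCausalLM

MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"

config = PeftConfig.from_pretrained(MODEL_NAME)
print(config.base_model_name_or_path)  # expected: decapoda-research/llama-13b-hf

base = LlamaForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, MODEL_NAME)
merged = model.merge_and_unload()  # plain LlamaForCausalLM with adapter weights folded in
```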
 