Sandiago21 committed

Commit 6e1cb11
1 Parent(s): 325d3f4

update readme with instructions on how to load and test the model

Files changed (1):
  1. README.md  +28 -5

README.md CHANGED
@@ -20,7 +20,7 @@ This repository contains a LLaMA-13B further fine-tuned model on conversations a
 
 ## Model Details
 
-Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder.
+Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder. The Jupyter Notebook contains example code to load the model and ask prompts to it as well as example prompts to get you started.
 
 ### Model Description
 
@@ -102,7 +102,7 @@ Use the code below to get started with the model.
 import torch
 from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
 
-MODEL_NAME = "Sandiago21/llama-7b-hf-prompt-answering"
+MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"
 
 config = PeftConfig.from_pretrained(MODEL_NAME)
 
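The hunk above only shows a few context lines of the README's quick-start code. A minimal self-contained loading sketch following the standard peft/transformers pattern is shown below; the fp16 dtype and `device_map="auto"` setting are assumptions, not taken from this commit.

```python
# Sketch only: load the base model named in the adapter config, then attach
# the fine-tuned adapter. fp16 and device_map="auto" (requires accelerate)
# are assumptions, not settings from the repository's README.
import torch
from peft import PeftConfig, PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"

config = PeftConfig.from_pretrained(MODEL_NAME)

model = LlamaForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, MODEL_NAME)
model.eval()
```

`PeftConfig.from_pretrained` reads only the small adapter config file, so it is a cheap way to discover which base checkpoint the adapter expects before downloading the full weights.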
 
@@ -159,8 +159,8 @@ print(response)
 import torch
 from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
 
-MODEL_NAME = "Sandiago21/llama-7b-hf-prompt-answering"
-BASE_MODEL = "decapoda-research/llama-7b-hf
+MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"
+BASE_MODEL = "decapoda-research/llama-13b-hf"
 
 config = PeftConfig.from_pretrained(MODEL_NAME)
 
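Assuming `model` and `tokenizer` are loaded as in the sketch above, a hypothetical generation example that ends in the `print(response)` call named in the hunk header is given below; the prompt template and sampling settings are illustrative, not the README's exact values.

```python
# Assumes `model` and `tokenizer` are already loaded (see the loading sketch above).
# Prompt template and generation settings are illustrative assumptions.
import torch
from transformers import GenerationConfig

prompt = "### Question: What is the capital of Greece?\n### Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.2,
    top_p=0.75,
    max_new_tokens=128,
)

with torch.no_grad():
    output = model.generate(**inputs, generation_config=generation_config)

response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```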
 
@@ -213,6 +213,29 @@ print(response)
 
 ## Training Details
 
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 2
+- mixed_precision_training: Native AMP
+
+### Framework versions
+
+- Transformers 4.28.1
+- Pytorch 2.0.0+cu117
+- Datasets 2.12.0
+- Tokenizers 0.12.1
 
 ### Training Data
 
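The hyperparameters added in this hunk map directly onto a `transformers.Trainer` configuration. A hypothetical reconstruction follows; the output directory and the exact optimizer variant are assumptions, not stated in the commit.

```python
# Hypothetical reconstruction of a Trainer setup matching the hyperparameters
# listed above. The output path and optimizer name are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./llama-13b-prompt-answering",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=2,
    fp16=True,                      # Native AMP mixed precision
    optim="adamw_torch",            # Adam betas=(0.9, 0.999), eps=1e-8 are the defaults
)
```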
 
@@ -226,4 +249,4 @@ The decapoda-research/llama-13b-hf model was further trained and finetuned on qu
 
 ## Model Architecture and Objective
 
-The model is based on decapoda-research/llama-13b-hf model and finetuned adapters on top of the main model on conversations and question answering data.
+The model is based on decapoda-research/llama-13b-hf model and finetuned adapters on top of the main model on conversations and question answering data.
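Because the card describes the artifact as adapters fine-tuned on top of decapoda-research/llama-13b-hf, a hedged sketch of inspecting the adapter config and optionally merging the adapter into the base weights is shown below; merging is not mentioned in the README and is included only for illustration.

```python
# Sketch: the published artifact is a PEFT adapter over the LLaMA-13B base.
# merge_and_unload() folds the adapter weights into the base model; whether
# merged or adapter-style use is intended here is an assumption.
import torch
from peft import PeftConfig, PeftModel
from transformers import LlamaForCausalLM

MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"

config = PeftConfig.from_pretrained(MODEL_NAME)
print(config.base_model_name_or_path)  # expected: decapoda-research/llama-13b-hf

base = LlamaForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, MODEL_NAME)
merged = model.merge_and_unload()  # plain LlamaForCausalLM with adapter weights folded in
```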
 