changed from llama-->llama2
README.md CHANGED
@@ -3,20 +3,20 @@ library_name: peft
 datasets:
 - shareGPT
 tags:
-- llama
+- llama2
 inference: false
 pipeline_tag: text-generation
 ---
 # llama-7b-glora 🦙
 
-This model was built via parameter-efficient GLoRA finetuning of [
+This model was built via parameter-efficient GLoRA finetuning of [llama2-7b](https://huggingface.co/meta-llama/Llama-2-7b) on the shareGPT dataset. We adapt only the attention layers using GLoRA.
 
-* Model license: This model is under a
+* Model license: This model is under the same license (see the LICENSE file) as LLaMA2.
 * GLoRA implementation: [script](https://github.com/Arnav0400/peft/blob/main/src/peft/tuners/glora.py)
 
 ## Model Description
 
-The architecture is similar to
+The architecture is similar to LLaMA2-7B, except that bias is enabled in the attention layers.
 
 ## Limitations and Biases
 _The following language is modified from [EleutherAI's GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)_
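The updated description states that only the attention layers are adapted with GLoRA. For reference, a minimal sketch of how such a finetune could be set up with the linked peft fork; `GLoraConfig` and its arguments are assumed names patterned on peft's LoRA conventions rather than a confirmed API, so check the linked `glora.py` for the actual interface:

```python
# Hypothetical sketch only: GLoraConfig and its arguments are assumptions
# modeled on peft's LoraConfig; the real interface lives in the linked glora.py.
import torch
from transformers import AutoModelForCausalLM
from peft import get_peft_model  # standard peft entry point
from peft import GLoraConfig     # assumed name from the GLoRA fork

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b",
    torch_dtype=torch.bfloat16,
)

config = GLoraConfig(
    r=4,  # assumed low-rank dimension, mirroring LoRA's `r`
    # Attention-only adaptation, matching the card's description:
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # peft helper: shows the small trainable fraction
```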
@@ -42,12 +42,12 @@ Basic model loading:
 
 ```python
 model = AutoModelForCausalLM.from_pretrained(
-    "MBZUAI-LLM/
+    "MBZUAI-LLM/LLaMA2-7B-GLoRA-ShareGPT",
     use_auth_token=True,
     torch_dtype=torch.bfloat16,
     device_map="auto",
 )
-tokenizer = AutoTokenizer.from_pretrained("MBZUAI-LLM/
+tokenizer = AutoTokenizer.from_pretrained("MBZUAI-LLM/LLaMA2-7B-GLoRA-ShareGPT")
 ```
 
 Once loaded, the model and tokenizer can be used with the following code:
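The loading snippet relies on imports that fall outside this hunk's context lines. A self-contained version of the same call, assuming stock transformers and torch:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# use_auth_token matches the transformers v4.29-era API referenced by the card;
# newer transformers releases spell this argument `token`.
model = AutoModelForCausalLM.from_pretrained(
    "MBZUAI-LLM/LLaMA2-7B-GLoRA-ShareGPT",
    use_auth_token=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # needs the accelerate package installed
)
tokenizer = AutoTokenizer.from_pretrained("MBZUAI-LLM/LLaMA2-7B-GLoRA-ShareGPT")
```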
@@ -65,7 +65,7 @@ def llama_generate(
     Uses Hugging Face GenerationConfig defaults
         https://huggingface.co/docs/transformers/v4.29.1/en/main_classes/text_generation#transformers.GenerationConfig
     Args:
-        model (transformers.AutoModelForCausalLM):
+        model (transformers.AutoModelForCausalLM): Model for text generation
         tokenizer (transformers.AutoTokenizer): Tokenizer for model
         prompt (str): Prompt for text generation
         max_new_tokens (int, optional): Max new tokens after the prompt to generate. Defaults to 128.
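Only `llama_generate`'s docstring is visible in this hunk; its body lies outside the diff context. A sketch of a helper consistent with that docstring, assuming Hugging Face GenerationConfig defaults as it states:

```python
import torch
from transformers import GenerationConfig

# Sketch of a generate helper matching the docstring shown in the hunk;
# the actual body of llama_generate is not part of the diff, so this is
# an assumption built on GenerationConfig defaults, as the docstring states.
def llama_generate(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    config = GenerationConfig(max_new_tokens=max_new_tokens)  # other fields at defaults
    with torch.no_grad():
        output_ids = model.generate(**inputs, generation_config=config)
    # Return only the newly generated text after the prompt tokens
    return tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```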