---
title: Alpaca LoRa 7B
language: en
license: other
tags:
  - alpaca
  - lora
  - llama
  - peft
---

# Alpaca LoRa 7B

This repository contains a LLaMA-7B model fine-tuned on the cleaned version of the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset.

⚠️ **I used [LLaMA-7B-hf](https://huggingface.co/decapoda-research/llama-7b-hf) as the base model, so this model is for research purposes only (see the [license](https://huggingface.co/decapoda-research/llama-7b-hf/blob/main/LICENSE)).**

# Usage

## Creating the prompt

The model was trained on prompts of the following form:

```python
from typing import Optional


def generate_prompt(instruction: str, input_ctxt: Optional[str] = None) -> str:
    if input_ctxt:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_ctxt}

### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""
```

## Using the model

```python
import torch
from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("chainyo/alpaca-lora-7b")
model = LlamaForCausalLM.from_pretrained(
    "chainyo/alpaca-lora-7b",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

generation_config = GenerationConfig(
    temperature=0.2,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    max_new_tokens=128,
)

model.eval()
if torch.__version__ >= "2":
    model = torch.compile(model)

instruction = "What is the meaning of life?"
input_ctxt = None  # For some tasks, you can provide an input context to help the model generate a better response.

prompt = generate_prompt(instruction, input_ctxt)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
input_ids = input_ids.to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
    )

response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
print(response)
>>> The meaning of life is to live a life of meaning.
```
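
As the comment in the snippet above notes, some tasks benefit from an input context. As a quick illustration (the instruction and context strings below are made up for this example), this is how a context-bearing prompt is built with the `generate_prompt` helper defined earlier:

```python
# Illustrative only: the instruction and context strings are invented.
instruction = "Summarize the following paragraph in one sentence."
input_ctxt = (
    "LoRA (Low-Rank Adaptation) freezes the pretrained weights and injects "
    "trainable low-rank matrices into each layer, which drastically reduces "
    "the number of trainable parameters."
)

prompt = generate_prompt(instruction, input_ctxt)
print(prompt)  # Prompt now includes ### Instruction, ### Input, and ### Response sections.
```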
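
## Loading the adapter with PEFT

The snippet above loads this repository directly as a full causal-LM checkpoint. If you would rather apply the LoRA weights on top of the original base model yourself, a minimal sketch with the `peft` library looks like this. Note the assumption: this only works if a PEFT adapter checkpoint is actually published in this repository; if only merged weights are available, load the model directly as shown above instead.

```python
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM

# Assumption: a PEFT adapter checkpoint exists in chainyo/alpaca-lora-7b.
base_model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "chainyo/alpaca-lora-7b")
model.eval()
```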