versae committed
Commit 1ec0a9c (1 parent: a286db9)

Update README.md

Files changed (1)
  1. README.md +5 -14
README.md CHANGED
@@ -16,7 +16,7 @@ This is a Spanish adapter generated by fine-tuning LLaMA-7B on a [Spanish Alpaca
 
 ```python
 from peft import PeftModel
-from transformers import LLaMATokenizer, LLaMAForCausalLM
+from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig
 
 base_model = "decapoda-research/llama-7b-hf"
 tokenizer = LLaMATokenizer.from_pretrained(base_model)
@@ -31,15 +31,10 @@ model = PeftModel.from_pretrained(model, "bertin-project/bertin-alpaca-lora-7b")
 Until `PEFT` is fully supported in Hugging Face's pipelines, for generation we can either consolidate the LoRA weights into the LLaMA model weights, or use the adapter's `generate()` method. Remember that the prompt still needs the English template:
 
 ```python
-from transformers import GenerationConfig
-
-# Load the model
-model = ...
-
-# Generate prompts from Alpaca template
-def generate_prompt(instruction, input=None):
+# Generate responses
+def generate(instruction, input=None):
     if input:
-        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501
+        prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501
 
 ### Instruction:
 {instruction}
@@ -50,17 +45,13 @@ def generate_prompt(instruction, input=None):
 ### Response:
 """
     else:
-        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request. # noqa: E501
+        prompt = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request. # noqa: E501
 
 ### Instruction:
 {instruction}
 
 ### Response:
 """
-
-# Generate responses
-def generate(instruction, input=None):
-    prompt = generate_prompt(instruction, input)
     inputs = tokenizer(prompt, return_tensors="pt")
     input_ids = inputs["input_ids"].cuda()
     generation_output = model.generate(
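The changed paragraph above mentions two options for generation, but only the `generate()` route appears in the README. A minimal sketch of the other option (consolidating the LoRA weights into the LLaMA weights), assuming PEFT's `merge_and_unload()` helper and the same model identifiers used in the README; the output path is hypothetical:

```python
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM

base_model = "decapoda-research/llama-7b-hf"
tokenizer = LLaMATokenizer.from_pretrained(base_model)

# Load the base model without 8-bit quantization (merging needs full/half
# precision weights) and attach the BERTIN Alpaca LoRA adapter.
model = LLaMAForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, "bertin-project/bertin-alpaca-lora-7b")

# Fold the LoRA deltas into the base weights; the result is a plain
# LLaMAForCausalLM that no longer needs PEFT at inference time.
model = model.merge_and_unload()
model.save_pretrained("./bertin-alpaca-lora-7b-merged")  # hypothetical output path
```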
 
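The last hunk ends mid-call, so the rest of `generate()` is not visible in this view. A rough sketch of how the truncated call and the decoding step typically continue in Alpaca-LoRA style code, assuming `model`, `tokenizer`, `input_ids`, and the `GenerationConfig` import from the snippet above; the sampling parameters are illustrative, not taken from this commit:

```python
# Illustrative continuation of the truncated generate() body (not part of the commit).
generation_config = GenerationConfig(temperature=0.1, top_p=0.75, num_beams=4)
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=256,
)
output = tokenizer.decode(generation_output.sequences[0])
# The Alpaca template ends with "### Response:", so the model's answer is
# whatever follows that marker.
response = output.split("### Response:")[-1].strip()
print(response)
```

As the README notes, the template itself stays in English even though the instruction passed to `generate()` can be written in Spanish.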