Update README.md
README.md
@@ -28,20 +28,53 @@ model = LLaMAForCausalLM.from_pretrained(
model = PeftModel.from_pretrained(model, "bertin-project/bertin-alpaca-lora-7b")
```

Until `PEFT` is fully supported in Hugging Face's pipelines, for generation we can either consolidate the LoRA weights into the LLaMA model weights, or use the adapter's `generate()` method. Remember that the prompt still needs the English template:
```python
from transformers import GenerationConfig

# Load the model
model = ...

# Generate prompts from Alpaca template
def generate_prompt(instruction, input=None):
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request. # noqa: E501

### Instruction:
{instruction}

### Response:
"""

# Generate responses
def generate(instruction, input=None):
    prompt = generate_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=GenerationConfig(temperature=0.2, top_p=0.75, num_beams=4),
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256
    )
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq)
        print("Respuesta:", output.split("### Response:")[1].strip())

generate("Escribe un correo electrónico dando la bienvenida a un nuevo empleado llamado Manolo.")
# Estimado Manolo,
#
# ¡Bienvenido a nuestro equipo! Estamos muy contentos de que hayas decidido unirse a nosotros y estamos ansiosos por comenzar a trabajar juntos.
```