versae committed
Commit 0c49286 · 1 parent: 880ce9c

Update README.md

Files changed (1): README.md (+2 −6)
README.md CHANGED
@@ -19,14 +19,10 @@ from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig,
 
 base_model = "bertin-project/bertin-gpt-j-6B-alpaca"
 tokenizer = AutoTokenizer.from_pretrained(base_model)
-model = AutoModelForCausalLM.from_pretrained(
-    base_model,
-    load_in_8bit=True,
-    device_map="auto",
-)
+model = AutoModelForCausalLM.from_pretrained(base_model)
 ```
 
-Until `PEFT` is fully supported in Hugginface's pipelines, for generation we can either consolidate the LoRA weights into the LLaMA model weights, or use the adapter's `generate()` method. Remember that the prompt still needs the English template:
+For generation, we can either use `pipeline()` or the model's `.generate()` method. Remember that the prompt needs a **Spanish** template:
 
 ```python
 # Generate responses
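Put together, the loading step introduced by this commit can be exercised as sketched below. The `make_prompt` helper and the exact Spanish template wording are illustrative assumptions, not taken from this diff; the model card for `bertin-project/bertin-gpt-j-6B-alpaca` has the canonical template. The model-loading and generation lines are left commented out because they require downloading the 6B checkpoint.

```python
# Sketch of the updated README usage. The Spanish Alpaca-style template
# below is an assumed/illustrative wording -- consult the model card.
def make_prompt(instruction: str) -> str:
    """Wrap an instruction in an Alpaca-style Spanish prompt (illustrative)."""
    return (
        "A continuación hay una instrucción que describe una tarea. "
        "Escribe una respuesta que complete adecuadamente lo que se pide.\n\n"
        f"### Instrucción:\n{instruction}\n\n### Respuesta:\n"
    )

# Loading and generation as in the updated README (heavy download, so commented):
# from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
# base_model = "bertin-project/bertin-gpt-j-6B-alpaca"
# tokenizer = AutoTokenizer.from_pretrained(base_model)
# model = AutoModelForCausalLM.from_pretrained(base_model)
# generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
# print(generator(make_prompt("Escribe un haiku sobre el mar."),
#                 max_new_tokens=64))

print(make_prompt("Escribe un haiku sobre el mar."))
```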