CLARA-MeD/bertin-gpt

Generated from Trainer

joheras committed on Oct 11, 2023

Commit 2f43cc5 • 1 Parent(s): d1f03d4

Update README.md

Files changed (1)

README.md +43 -3

@@ -6,6 +6,8 @@ tags:
 model-index:
 - name: bertin-gpt-clara-med
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -17,9 +19,47 @@ This model is a fine-tuned version of [bertin-project/bertin-gpt-j-6B-alpaca](ht
 It achieves the following results on the evaluation set:
 - Loss: 0.6110
-## Model description
-More information needed
 ## Intended uses & limitations
@@ -62,4 +102,4 @@ The following hyperparameters were used during training:
 - Transformers 4.32.1
 - Pytorch 2.0.0+cu117
 - Datasets 2.14.4
-- Tokenizers 0.13.3

 model-index:
 - name: bertin-gpt-clara-med
   results: []
+datasets:
+- CLARA-MeD/CLARA-MeD
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 It achieves the following results on the evaluation set:
 - Loss: 0.6110
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, pipeline
+base_model = "CLARA-MeD/bertin-gpt"
+tokenizer = AutoTokenizer.from_pretrained(base_model)
+model = AutoModelForCausalLM.from_pretrained(base_model).cuda()
+```
+For generation, we can use the model's `.generate()` method. Remember that the prompt needs a **Spanish** template:
+```python
+# Generate responses
+def generate(input):
+    prompt = f"""A continuación hay una instrucción que describe una tarea, junto con una entrada que proporciona más contexto. Escribe una respuesta que complete adecuadamente lo que se pide.
+### Instrucción:
+Simplifica la siguiente frase
+### Entrada:
+{input}
+### Respuesta:"""
+    inputs = tokenizer(prompt, return_tensors="pt")
+    input_ids = inputs["input_ids"].cuda()
+    generation_output = model.generate(
+        input_ids=input_ids,
+        generation_config=GenerationConfig(temperature=0.2, top_p=0.75, num_beams=4),
+        return_dict_in_generate=True,
+        output_scores=True,
+        max_new_tokens=256
+    )
+    for seq in generation_output.sequences:
+        output = tokenizer.decode(seq, skip_special_tokens=True)
+        print(output.split("### Respuesta:")[-1].strip())
+generate("Al sujeto se le ha tratado previamente con antagonistas del factor de necrosis tumoral alfa (TNF-α) sin respuesta clínica documentada al tratamiento. También puede ocurrir que al sujeto no se le tratara anteriormente con antagonistas de TNF-α, pero está fallando el tratamiento convencional actual.")
+```
 ## Intended uses & limitations
 - Transformers 4.32.1
 - Pytorch 2.0.0+cu117
 - Datasets 2.14.4
+- Tokenizers 0.13.3
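
The prompt template and the answer extraction used by the `generate` function added in this commit can be checked without loading the model. The following is a minimal sketch under stated assumptions: the helper names `build_prompt` and `extract_response` are hypothetical and not part of the README, the line layout of the template is copied from the diff as displayed, and the simplified sentence is a made-up stand-in for real model output.

```python
# Hypothetical helpers (not part of the README) that isolate the two
# string-handling steps of the usage snippet: building the Spanish prompt
# and cutting the answer out of the decoded generation.

RESPONSE_MARKER = "### Respuesta:"


def build_prompt(text: str) -> str:
    """Wrap the sentence to simplify in the Spanish instruction template."""
    return (
        "A continuación hay una instrucción que describe una tarea, junto con "
        "una entrada que proporciona más contexto. Escribe una respuesta que "
        "complete adecuadamente lo que se pide.\n"
        "### Instrucción:\n"
        "Simplifica la siguiente frase\n"
        "### Entrada:\n"
        f"{text}\n"
        f"{RESPONSE_MARKER}"
    )


def extract_response(decoded: str) -> str:
    """Return whatever follows the last response marker, stripped."""
    return decoded.split(RESPONSE_MARKER)[-1].strip()


if __name__ == "__main__":
    prompt = build_prompt("El paciente presenta cefalea.")
    # Stand-in for tokenizer.decode(...): the decoded sequence echoes the
    # prompt and then continues with the generated answer.
    decoded = prompt + " Al paciente le duele la cabeza."
    print(extract_response(decoded))
```

Splitting on the last occurrence of the marker mirrors the `split("### Respuesta:")[-1]` call in the README snippet, so the echoed prompt is discarded even if the input text itself happens to contain the marker.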