---
license: openrail
base_model: bertin-project/bertin-gpt-j-6B-alpaca
tags:
- generated_from_trainer
model-index:
- name: bertin-gpt-clara-med
  results: []
datasets:
- CLARA-MeD/CLARA-MeD
---

# bertin-gpt-clara-med

This model is a fine-tuned version of [bertin-project/bertin-gpt-j-6B-alpaca](https://huggingface.co/bertin-project/bertin-gpt-j-6B-alpaca) on the `CLARA-MeD/CLARA-MeD` dataset declared in the card metadata.
It achieves the following results on the evaluation set:
- Loss: 0.6110

## Usage

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Hub repository that holds the PEFT adapter
repo_name = "CLARA-MeD/bertin-gpt"

# The adapter config records which base model it was trained on
config = PeftConfig.from_pretrained(repo_name)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the base model in half precision; device_map="auto" requires `accelerate` to be installed
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the fine-tuned adapter weights to the base model
model = PeftModel.from_pretrained(model, repo_name)
```

For generation, use the model's `.generate()` method. The prompt must follow the **Spanish** instruction template shown here:

```python
# Generate responses
def generate(text):
    prompt = f"""A continuación hay una instrucción que describe una tarea, junto con una entrada que proporciona más contexto. Escribe una respuesta que complete adecuadamente lo que se pide.

### Instrucción:
Simplifica la siguiente frase

### Entrada:
{text}

### Respuesta:"""

    inputs = tokenizer(prompt, return_tensors="pt")
    # Assumes a CUDA GPU is available
    input_ids = inputs["input_ids"].cuda()
    # Note: temperature and top_p only take effect when do_sample=True;
    # as configured here, decoding is beam search with 4 beams
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=GenerationConfig(temperature=0.2, top_p=0.75, num_beams=4),
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq, skip_special_tokens=True)
        # Keep only the text after the response marker
        print(output.split("### Respuesta:")[-1].strip())

generate("Acromegalia")
# La acromegalia es un trastorno causado por un exceso de hormona del crecimiento en el cuerpo.
```
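
Because the model only behaves well when the prompt matches the Spanish template, it can help to keep prompt construction and response extraction separate from the generation call, so they can be checked without loading the 6B model. The following is a minimal sketch, not part of the original card: `PROMPT_TEMPLATE`, `build_prompt`, and `extract_response` are hypothetical helper names, and the template text is copied from the `generate` function above.

```python
# Hypothetical helpers (not from the original card) that isolate the prompt format.
PROMPT_TEMPLATE = (
    "A continuación hay una instrucción que describe una tarea, junto con una "
    "entrada que proporciona más contexto. Escribe una respuesta que complete "
    "adecuadamente lo que se pide.\n\n"
    "### Instrucción:\n{instruction}\n\n"
    "### Entrada:\n{entrada}\n\n"
    "### Respuesta:"
)
RESPONSE_MARKER = "### Respuesta:"


def build_prompt(text, instruction="Simplifica la siguiente frase"):
    """Fill the Spanish instruction template with an instruction and an input."""
    return PROMPT_TEMPLATE.format(instruction=instruction, entrada=text)


def extract_response(decoded):
    """Return the text that follows the last response marker in decoded output."""
    return decoded.split(RESPONSE_MARKER)[-1].strip()


prompt = build_prompt("Acromegalia")
print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer in place of the inline f-string, and `extract_response` applied to each decoded sequence.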

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 300
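
Two of the values listed above follow from the others, and a short sketch makes the relationship explicit. It assumes a single training device (so that 4 × 32 gives the reported total batch size of 128) and assumes the `linear` scheduler follows the usual shape of linear warmup followed by linear decay to zero. The function names are illustrative and are not Trainer APIs.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=3e-4, warmup_steps=100, total_steps=300):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(4, 32))
for step in (0, 50, 100, 200, 300):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0003 at step 100 and reaches zero at step 300, the final training step.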

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.5564        | 0.38  | 50   | 0.7804          |
| 0.3879        | 0.75  | 100  | 0.6551          |
| 0.3609        | 1.13  | 150  | 0.6327          |
| 0.3615        | 1.5   | 200  | 0.6179          |
| 0.3371        | 1.88  | 250  | 0.6135          |
| 0.3242        | 2.25  | 300  | 0.6110          |


### Framework versions

- Transformers 4.32.1
- Pytorch 2.0.0+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3