|
--- |
|
library_name: peft |
|
tags: |
|
- dpo |
|
base_model: SGaleshchuk/Llama-2-13b-hf_uk_rank-32_ft |
|
model-index: |
|
- name: Llama-2-13b-summarization_uk_dpo |
|
results: [] |
|
license: apache-2.0 |
|
datasets: |
|
- SGaleshchuk/XL_SUM_ukr_synthetic_hallucinations |
|
- csebuetnlp/xlsum |
|
language: |
|
- uk |
|
metrics: |
|
- rouge |
|
pipeline_tag: summarization |
|
--- |
|
|
|
|
|
|
# Llama-2-13b-summarization_uk_dpo |
|
|
|
This model is a fine-tuned version of [SGaleshchuk/Llama-2-13b-hf_uk_rank-32_ft](https://huggingface.co/SGaleshchuk/Llama-2-13b-hf_uk_rank-32_ft), aligned with DPO for Ukrainian summarization using the Ukrainian subset of [XL-Sum](https://huggingface.co/datasets/csebuetnlp/xlsum) and [SGaleshchuk/XL_SUM_ukr_synthetic_hallucinations](https://huggingface.co/datasets/SGaleshchuk/XL_SUM_ukr_synthetic_hallucinations).
|
|
|
## Set-up description

* Fine-tune the Llama-2 model on the training data.
* Generate summaries for the validation set with the fine-tuned model.
* Corrupt the generated summaries by injecting information that is not present in the input text.
* Align the fine-tuned Llama-2 model with DPO, taking the golden summaries as chosen responses and the corrupted synthetic summaries as rejected ones (see the sketch after this list).
* Apply both the fine-tuned and the DPO-aligned versions to the test set.
* Assess the level of faithfulness hallucinations in the generated texts with GPT-4 and Rouge-L, plus human evaluation on a small subset.
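
A minimal sketch of how such preference pairs could be assembled for DPO, assuming the golden XL-Sum summary is used as the chosen response and a corrupted summary as the rejected one. The `corrupt` helper and the prompt wording are illustrative assumptions; the released corrupted data live in [SGaleshchuk/XL_SUM_ukr_synthetic_hallucinations](https://huggingface.co/datasets/SGaleshchuk/XL_SUM_ukr_synthetic_hallucinations), whose exact column names are not reproduced here:

```python
# Hypothetical sketch: build DPO preference pairs from the XL-Sum (Ukrainian) validation split.
# The corrupt() helper is a placeholder for the actual corruption step described above.
from datasets import Dataset, load_dataset

xlsum_uk = load_dataset("csebuetnlp/xlsum", "ukrainian", split="validation")

def corrupt(summary: str) -> str:
    # placeholder: the real pipeline injected information absent from the source article
    return summary + " (injected unsupported claim)"

pairs = Dataset.from_dict({
    "prompt": [f"The article to summarize in maximum 100 words:{ex['text']}. Summary:" for ex in xlsum_uk],
    "chosen": [ex["summary"] for ex in xlsum_uk],             # golden summary
    "rejected": [corrupt(ex["summary"]) for ex in xlsum_uk],  # noisy synthetic summary
})
```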
|
|
|
|
|
## Intended uses & limitations |
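
The snippet below (tested on Colab with an A100 GPU) loads the adapter in 4-bit precision and generates a summary for a Ukrainian article: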
|
```python |
|
# tested with Colab + A100 GPU
# transformers pinned to the release listed under "Framework versions" below
!pip install -q -U peft transformers==4.38.2
!pip install flash-attn --no-build-isolation
!pip install einops bitsandbytes accelerate

import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
|
|
|
model_id = "SGaleshchuk/Llama-2-13b-summarization_uk_dpo" |
|
|
|
# load the fine-tuned PEFT model (LoRA adapter + base weights) and tokenizer in 4-bit
model = AutoPeftModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
|
|
|
def prepare_instruction(text):
    # prompt template used for summarization; adapt to your needs
    prompt = """The article to summarize in maximum 100 words:{text}. Summary:"""
    return prompt.format(text=text)
|
def summarization(text):
    instruction = prepare_instruction(text)
    input_ids = tokenizer(instruction, return_tensors="pt", truncation=True).input_ids.cuda()
    with torch.inference_mode():
        outputs = model.generate(
            input_ids=input_ids,
            max_new_tokens=128,
            do_sample=True,
            top_p=0.9,
            temperature=1e-2,
        )
    # decode the generated tokens and strip the prompt from the output
    result = tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0]
    result = result[len(instruction):]
    print(result)
    return result
|
|
|
text = """your text here to summarize"""
|
result = summarization(text) |
|
|
|
``` |
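
Rouge-L is one of the metrics used to assess the generated summaries. A minimal scoring sketch with the `evaluate` library (an assumption, since it is not listed under the framework versions) could look like this; note that the default ROUGE tokenizer is English-oriented, and the exact scoring set-up of the original evaluation is not documented here:

```python
# Hypothetical evaluation sketch: score generated summaries against golden ones with ROUGE-L.
# The test articles and reference summaries below are placeholders.
import evaluate

rouge = evaluate.load("rouge")

test_articles = ["your test article here"]
golden_summaries = ["the corresponding golden summary here"]

predictions = [summarization(article) for article in test_articles]
scores = rouge.compute(predictions=predictions, references=golden_summaries)
print(scores["rougeL"])
```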
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a hedged training sketch follows this list):
|
- learning_rate: 2e-06 |
|
- train_batch_size: 1 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: cosine |
|
- num_epochs: 10 |
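
A hedged sketch of how these hyperparameters might map onto a `trl` `DPOTrainer` run on top of the fine-tuned base model. `trl` and the LoRA settings are not listed on this card, so the trainer arguments, the `beta` value, and the LoRA rank below are assumptions:

```python
# Hypothetical training sketch with trl's DPOTrainer, reusing the hyperparameters above.
# beta, the LoRA settings, and the "pairs" dataset are assumptions for illustration.
import torch
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_id = "SGaleshchuk/Llama-2-13b-hf_uk_rank-32_ft"
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

peft_config = LoraConfig(r=32, task_type="CAUSAL_LM")  # assumption: rank inferred from the base model name

training_args = TrainingArguments(
    output_dir="Llama-2-13b-summarization_uk_dpo",
    learning_rate=2e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    num_train_epochs=10,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,          # with a peft_config, trl uses the frozen base model as the reference
    args=training_args,
    beta=0.1,                # assumption: the DPO beta is not reported on this card
    train_dataset=pairs,     # e.g. the preference pairs built in the earlier sketch
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```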
|
|
|
### Training results |
|
|
|
|
|
|
|
### Framework versions |
|
|
|
- PEFT 0.9.0 |
|
- Transformers 4.38.2 |
|
- Pytorch 2.2.1+cu121 |
|
- Datasets 2.19.1 |
|
- Tokenizers 0.15.2 |