
Model Card

NorGPT-3B-summarization-peft is trained on top of the NorGPT-3B model using the RLHF strategy on the NO-CNN-DailyMail dataset.

Unlike step 2 of the original RLHF pipeline, we trained the reward model by estimating the semantic similarity between the candidate generated text and the human-annotated summary (the golden summary) using the NorBERT model. Generated summaries with a higher cosine similarity to the golden summary are ranked higher when training the reward model.
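As an illustration, below is a minimal sketch of this similarity-based ranking. The NorBERT checkpoint name (ltg/norbert2) and the mean-pooling strategy are assumptions for illustration, not details stated in this card:

import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical sketch: rank candidate summaries by cosine similarity
# to the golden summary using NorBERT embeddings. The checkpoint name
# and mean pooling are assumptions.
bert_tokenizer = AutoTokenizer.from_pretrained("ltg/norbert2")
bert_model = AutoModel.from_pretrained("ltg/norbert2")

def embed(text):
    # Mean-pool the last hidden states over non-padding tokens.
    inputs = bert_tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = bert_model(**inputs).last_hidden_state
    mask = inputs["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)

def rank_candidates(candidates, golden_summary):
    # Higher cosine similarity to the golden summary -> higher rank.
    golden = embed(golden_summary)
    scores = [torch.cosine_similarity(embed(c), golden).item() for c in candidates]
    return sorted(zip(candidates, scores), key=lambda s: s[1], reverse=True)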

Training prompt format:

Summarise the article:\\n{article} |||\\n{positive_sample}

Inference prompt:

Summarise the article:\\n{article} |||\\n
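For clarity, both formats can be produced with a small helper; build_prompt is a hypothetical name used only for illustration:

def build_prompt(article, positive_sample=None):
    # The training prompt appends the golden summary; the inference
    # prompt ends at the separator so the model continues with one.
    prompt = 'Summarise the article:\\n' + article + ' |||\\n'
    if positive_sample is not None:
        prompt += positive_sample
    return prompt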

Training Split

We split the data across steps 1-3 of RLHF training:

RLHF step    #samples
step 1       61181
step 2       16798
step 3       9758
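The card does not specify how the training file is partitioned across the steps; the sketch below shows one hypothetical way to carve out three disjoint subsets of these sizes (the contiguous split and the train.csv file name are assumptions):

from datasets import load_dataset

train_data = load_dataset("NorGLM/NO-CNN-DailyMail", data_files="train.csv")["train"]

# Hypothetical contiguous partition into the three step sizes above.
sizes = {"step1": 61181, "step2": 16798, "step3": 9758}
splits, offset = {}, 0
for name, n in sizes.items():
    splits[name] = train_data.select(range(offset, offset + n))
    offset += n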

Run the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "NorGLM/NorGPT-3B-rfhl-summarization"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map='auto',
    torch_dtype=torch.bfloat16
)
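As a quick sanity check, a single article can be summarised with the inference prompt above, reusing the model and tokenizer just loaded (the article text is a placeholder):

article = "..."  # placeholder: any Norwegian news article
prompt = 'Summarise the article:\\n' + article + ' |||\\n'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=200)
summary = tokenizer.decode(output[0], skip_special_tokens=True).split("|||\\n")[-1]
print(summary)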

Inference on test set

Load the model to evaluate it on the test set of the NO-CNN-DailyMail dataset:

import pandas as pd
from datasets import load_dataset

def generate_texts(model, tokenizer, prompts, max_seq_length=200, do_sample=True, top_p=0.95, top_k=10):
    # prompts is a list of news articles
    results = []
    for prompt in prompts:
        # Skip articles that exceed the model's context window.
        if len(prompt.split()) > 1024:
            results.append('')
            continue

        prompt = 'Summarise the article:\\n' + prompt + ' |||\\n'

        model_inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
        output = model.generate(**model_inputs, do_sample=do_sample, top_p=top_p,
                                top_k=top_k, max_new_tokens=max_seq_length)
        result = tokenizer.decode(output[0], skip_special_tokens=True)
        # Keep only the text generated after the prompt separator.
        result = result.split("|||\\n")[-1]
        results.append(result)
    return results

print("--LOADING EVAL DATAS---")
eval_data = load_dataset("NorGLM/NO-CNN-DailyMail", data_files="test.csv")
prompts = eval_data['train']['article']
positive_samples = eval_data['train']['positive_sample']

print("--MAKING PREDICTIONS---")
model.eval()

output_file = "<output file name>"  # fill in the output CSV path
with torch.no_grad():
    results = generate_texts(model, tokenizer, prompts)

df = pd.DataFrame({'article':prompts, 'generated_text':results, 'positive_sample':positive_samples})

print("Save results to csv file...")
df.to_csv(output_file)
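To score the generations against the golden summaries, one option (not part of this card) is ROUGE via the evaluate library:

# Optional: ROUGE scoring of generations against golden summaries.
# Requires `pip install evaluate rouge_score`; this step is a suggestion,
# not part of the original training or evaluation setup.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=df['generated_text'].tolist(),
                       references=df['positive_sample'].tolist())
print(scores)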

Note

More training details will be released soon!
