OpenMath-Mistral-7B-v0.1-hf-dialogsum-test-flash-attention-2

This model is a fine-tuned version of nvidia/OpenMath-Mistral-7B-v0.1-hf, fine-tuned on the neil-code/dialogsum-test dataset.

Load the dataset from the Hub:

```python
from datasets import load_dataset

huggingface_dataset_name = "neil-code/dialogsum-test"
dataset = load_dataset(huggingface_dataset_name)
# print(dataset)
```

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from trl import setup_chat_format

# Hugging Face model id
model_id = "nvidia/OpenMath-Mistral-7B-v0.1-hf"

# BitsAndBytesConfig int-4 config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
tokenizer.padding_side = "right"  # to prevent warnings
```

We redefine pad_token and pad_token_id with the out-of-vocabulary token (unk_token):

```python
tokenizer.pad_token = tokenizer.unk_token
tokenizer.pad_token_id = tokenizer.unk_token_id
```

```python
# Set chat template to OAI ChatML; remove this if you start from a fine-tuned model
model, tokenizer = setup_chat_format(model, tokenizer)
```
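setup_chat_format adds the ChatML special tokens to the tokenizer, sets the corresponding chat template, and resizes the model's token embeddings to match.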

text = "What is the capital of India?"

device = 'cuda' model_inputs = tokenizer(text, return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, temperature=0.1, top_k=1, top_p=1.0, repetition_penalty=1.4, min_new_tokens=16, max_new_tokens=128, do_sample=True) decoded = tokenizer.decode(generated_ids[0]) print(decoded)
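Note that with top_k=1 only the single most probable token is ever kept, so do_sample=True here behaves like greedy decoding (apart from the repetition penalty).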

MODEL GENERATION - AFTER TUNING

```python
index = 10
dataset = dataset_dialogsum_test  # test split of the dataset loaded above (assignment not shown)
TUNE_model = model
prompt = dataset[index]["dialogue"]
summary = dataset[index]["summary"]

formatted_prompt = f"Instruct: Summarize the following conversation.\n{prompt}\nOutput:\n"
# gen_after_tunning is a user-defined generation helper (not defined in this card)
res = gen_after_tunning(TUNE_model, formatted_prompt, 1024)
# print(res[0])
output = res[0].split("Output:\n")[1]

dash_line = "-" * 100
print(dash_line)
print(f"INPUT PROMPT:\n{formatted_prompt}")
print(dash_line)
print(f"BASELINE HUMAN SUMMARY:\n{summary}\n")
print(dash_line)
print(f"MODEL GENERATION - AFTER TUNING:\n{output}")
```
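Since gen_after_tunning is not defined anywhere in this card, here is a minimal sketch of what such a helper might look like, assuming it simply wraps model.generate. The name and call signature are taken from the snippet above; everything else is an assumption:

```python
def gen_after_tunning(model, prompt, max_new_tokens):
    """Hypothetical reconstruction of the helper used above (not the author's code)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # deterministic decoding for a reproducible comparison
    )
    # Return a list so that the res[0] indexing in the snippet above works
    return [tokenizer.decode(output_ids[0], skip_special_tokens=True)]
```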


INPUT PROMPT:

Instruct: Summarize the following conversation. #Person1#: Could you do me a favor? #Person2#: Sure. What is it? #Person1#: Could you run over to the store? We need a few things. #Person2#: All right. What do you want me to get? #Person1#: Well, could you pick up some sugar? #Person2#: Okay. How much? #Person1#: A small bag. I guess we also need a few oranges. #Person2#: How many? #Person1#: Oh, let's see. . . About six. #Person2#: Anything else? #Person1#: Yes. We're out of milk. #Person2#: Okay. How much do you want me to get? A gallon? #Person1#: No. I think a half gallon will be enough. #Person2#: Is that all? #Person1#: I think so. Have you got all that? #Person2#: Yes. That's small bag of sugar, four oranges, and a half gallon of milk. #Person1#: Do you have enough money? #Person2#: I think so. #Person1#: Thanks very much. I appreciate it. Output:


BASELINE HUMAN SUMMARY:

#Person1# asks #Person2# to do a favor. #Person2# agrees and helps buy a small bag of sugar, six oranges, and a half-gallon of milk.


MODEL GENERATION - AFTER TUNING:

#Person1# asks #Person2# to run over to the store to pick up some sugar, oranges, and milk. #Person2# thinks he has got all that.

Model description

More information needed

Intended uses & limitations

More information needed
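This repository hosts a PEFT adapter rather than full model weights (see the framework versions below), so inference requires loading the base model first. A minimal loading sketch; the adapter repo id is a placeholder to fill in, and if the adapter was trained after setup_chat_format, the same setup must be applied before attaching it:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "nvidia/OpenMath-Mistral-7B-v0.1-hf"
adapter_id = "<user>/OpenMath-Mistral-7B-v0.1-hf-dialogsum-test-flash-attention-2"  # placeholder

base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# If the adapter was trained after setup_chat_format, apply it here first:
# from trl import setup_chat_format
# base_model, tokenizer = setup_chat_format(base_model, tokenizer)

model = PeftModel.from_pretrained(base_model, adapter_id)
```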

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of a matching trainer setup follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 3
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 6
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
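The training code itself is not included in this card. As a minimal sketch, assuming TRL's SFTTrainer with a LoRA peft_config was used (the LoRA values, output_dir, bf16 flag, and dataset wiring are assumptions, not taken from the card), the listed hyperparameters map to:

```python
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# LoRA settings are assumptions; the card does not list them
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# These values mirror the hyperparameter list above
training_args = TrainingArguments(
    output_dir="openmath-mistral-dialogsum",  # assumption
    learning_rate=2e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # total train batch size: 3 * 2 = 6
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=3,
    seed=42,
    bf16=True,  # assumption, consistent with bnb_4bit_compute_dtype above
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],  # assumed split name
    peft_config=peft_config,
    tokenizer=tokenizer,
)
trainer.train()
```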

Training results

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2