Weird Responses

#6 opened by Shallowlabs

Hi team, I was trying the instruct model for summarization tasks, but in multiple cases I observed that at the end of the summarized passage it appends some code or arbitrary words, like below:

period.package com.github.tix320.skimp.api.level.entity;\n\nimport com.github.tix320.skimp.api.level.SkimpLevel;\nimport com.github.tix320.skimp.api.util.Tuple;\nimport com.github.tix320.skimp.api.world.World;\nimport com.github.tix320.skimp.api.world.

#ifndef LIB_UTIL_H\n#define LIB_UTIL_H\n\n#include \n#include \n#include \n#include \n#include \n#include \n#include \n#include \n#include \n#include \n\n#include <Eigen/Core>\n\n#define UNUSED(x) ((void)(x))

FYI, I'm using the code below:

import torch
import transformers
from transformers import pipeline
import pandas as pd

name = 'mosaicml/mpt-30b-instruct'

config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.attn_config['attn_impl'] = 'triton'
config.init_device = 'cuda:0'

tokenizer = transformers.AutoTokenizer.from_pretrained("mosaicml/mpt-30b")
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,  # Load model weights in bfloat16
    trust_remote_code=True
)

def format_prompt(instruction):
    template = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction\n{instruction}\n\n### Response\n"
    return template.format(instruction=instruction)

def summarize(text):
    text = 'Please summarize the following article. Only highlighting the important points. \n\n' + text
    fmt_ex = format_prompt(instruction=text)
    with torch.autocast('cuda', dtype=torch.bfloat16):
        inputs = tokenizer(fmt_ex, return_tensors="pt").to('cuda:0')
        outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95, top_k=50, temperature=0.7)
    summary = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    summary = summary[0].split("### Response")[1]
    return summary

I tried playing around with the generate parameters, but it didn't help much.

The EOS token seems to get ignored by default in the generate method for this model. Change your generation setup to:
model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95, top_k=50, temperature=0.7, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.pad_token_id)
and it should work as expected.
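If you'd rather not pass those ids on every call, one alternative is to set them once on the model's generation config. This is just a sketch I haven't verified on MPT-30B specifically; it assumes a recent transformers version where model.generation_config is available, and falls back to the EOS id when the tokenizer defines no pad token:

# Set the ids once so every generate() call picks them up by default
if tokenizer.pad_token_id is None:
    tokenizer.pad_token = tokenizer.eos_token  # common fallback when no pad token is defined
model.generation_config.eos_token_id = tokenizer.eos_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id

# After this, the original call should stop at the EOS token without extra kwargs:
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95, top_k=50, temperature=0.7)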

I'm experiencing similar weird responses, but I'm running the model through an Inference Endpoint; is it possible to specify the EOS token in that setup?
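For reference, text-generation endpoints usually accept generation parameters in the JSON request body. Below is a rough, unverified sketch: the endpoint URL and token are placeholders, and whether the backend honors stop sequences (TGI-style) or would need eos_token_id forwarded by a custom handler depends entirely on how the endpoint serves the model:

import requests

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
HF_TOKEN = "hf_..."  # placeholder

payload = {
    "inputs": fmt_ex,  # the formatted prompt from the code above
    "parameters": {
        "max_new_tokens": 256,
        "do_sample": True,
        "top_p": 0.95,
        "top_k": 50,
        "temperature": 0.7,
        # Assumption: a TGI-style backend accepts stop sequences here;
        # a custom handler might instead need eos_token_id passed to generate().
        "stop": ["<|endoftext|>"],
    },
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}", "Content-Type": "application/json"},
    json=payload,
)
print(response.json())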
