Model Card for MCQ-Classifier-MMLU-XYZ

MCQ-Classifier is a parameter-efficient fine-tune of the 7B Mistral-7B-v0.1 base model that automatically detects the answer a model gives to a multiple-choice question.

This model is trained on annotated model outputs on the MMLU dataset. We collected responses from Llama2-7b-chat, Llama2-13b-chat, and Mistral-7b-Inst-v0.2.

For full details of this model please read our paper.

"XYZ"

During our annotation phase, we noticed that models may not choose any of the available answer candidates but instead refuse to answer or claim "No correct answer available." Therefore, we consider three additional cases, "Refusal", "No correct answer", and "I don't know", and label them "X", "Y", and "Z", respectively.

Note that "I don't know" is an additional mode we assumed a model could have. However, we did not encounter this behaviour during the entire annotation phase, so our classifier cannot identify "I don't know" cases. If you observe such behaviour in your model, feel free to continue fine-tuning our classifier, or even add more modes!

Also note that if your data has "Refuse" among the options, such as "D. Refuse", our classifier will classify this as "X", the refusal label. We explicitly do so because we ignore the difference between refusing to answer and choosing the refusal option. There are many cases where the model first refuses to answer and then chooses the option "D. Refuse", which makes the two difficult to tell apart when labelling. You can use another version of our classifier (EFG), which only maps the answer to the available options.
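The label scheme above can be sketched as a small lookup for post-processing the classifier's one-token prediction. This is only an illustration: the names SPECIAL_LABELS and interpret_label are hypothetical helpers, not part of the released code, and the default option letters A to D are an assumption based on the MMLU format.

```python
from typing import Sequence

# Special labels and their meanings as described in this model card.
SPECIAL_LABELS = {
    "X": "refusal",
    "Y": "no correct answer",
    "Z": "I don't know",  # never seen during annotation, so the classifier cannot identify it
}


def interpret_label(label: str, option_letters: Sequence[str] = ("A", "B", "C", "D")) -> str:
    """Turn the predicted label into a readable category (hypothetical helper)."""
    label = label.strip()
    if label in option_letters:
        return f"option {label}"
    if label in SPECIAL_LABELS:
        return SPECIAL_LABELS[label]
    # Anything else is outside the label set described above.
    return "unrecognized"
```

For instance, interpret_label("B") gives "option B" and interpret_label("X") gives "refusal".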

Run the model

You should construct your input in the following format: model_response + "\nReferences:" + references + "\nAnswer:"

For example:

inputs = " Sure, I'm happy to help! The correct answer is:\n\nB. retraction of the stoma. \nReferences: \nA. high output stomas. \nB. retraction of the stoma. \nC. prolapsed stomas. \nD. herniation around the stoma. \nAnswer:"

Then feed it to the classifier:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig

# Load the adapter config, the Mistral-7B base model, and the classifier adapter on top of it.
config = PeftConfig.from_pretrained("mainlp/MCQ-Classifier-MMLU-XYZ")
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base_model, "mainlp/MCQ-Classifier-MMLU-XYZ")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Wrap the formatted input in the instruction template.
to_classify = f"""<s>[INST] Classify the response.{inputs} [/INST]"""
model_input = tokenizer(to_classify, return_tensors="pt")

# Greedily generate a single new token: the predicted label.
output = model.generate(**model_input, max_new_tokens=1, do_sample=False)
# The decoded text contains the prompt followed by the predicted label.
print(tokenizer.decode(output[0], skip_special_tokens=True))
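If you have the model response and the answer options as separate values, the input format above can be assembled with a small helper. This is a minimal sketch: build_classifier_input and build_prompt are hypothetical names, and the exact whitespace around each option (a space before each newline) is inferred from the single example in this card, so treat it as an assumption rather than a specification.

```python
import string
from typing import Sequence


def build_classifier_input(model_response: str, options: Sequence[str]) -> str:
    """Assemble model_response + "\\nReferences:" + references + "\\nAnswer:"."""
    letters = string.ascii_uppercase
    # Each option becomes " \nA. text", mirroring the spacing of the example above (assumed).
    references = "".join(f" \n{letters[i]}. {text}" for i, text in enumerate(options)) + " "
    return model_response + "\nReferences:" + references + "\nAnswer:"


def build_prompt(inputs: str) -> str:
    """Wrap the formatted input in the instruction template used in the snippet above."""
    return f"<s>[INST] Classify the response.{inputs} [/INST]"


response = " Sure, I'm happy to help! The correct answer is:\n\nB. retraction of the stoma. "
options = [
    "high output stomas.",
    "retraction of the stoma.",
    "prolapsed stomas.",
    "herniation around the stoma.",
]
inputs = build_classifier_input(response, options)
to_classify = build_prompt(inputs)
```

With these values, inputs reproduces the example string shown earlier, and to_classify can be passed to the tokenizer as in the snippet above.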

Cite

@article{wang2024my,
  title={"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models},
  author={Wang, Xinpeng and Ma, Bolei and Hu, Chengzhi and Weber-Genzel, Leon and R{\"o}ttger, Paul and Kreuter, Frauke and Hovy, Dirk and Plank, Barbara},
  journal={arXiv preprint arXiv:2402.14499},
  year={2024}
}

@article{wang2024look,
  title={Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think},
  author={Wang, Xinpeng and Hu, Chengzhi and Ma, Bolei and R{\"o}ttger, Paul and Plank, Barbara},
  journal={arXiv preprint arXiv:2404.08382},
  year={2024}
}