---
language:
- en
datasets:
- Reza8848/MUFFIN_68k
license: mit
---

<img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/J_4FHXmtM6TuRnN3aL06y.png" width="38" height="38">


This repository contains the model weights of **MUFFIN-T5-3B** (**Mu**lti-**F**aceted **In**structions).

We fine-tune the [T5-3B](https://huggingface.co/t5-3b) model on our [MUFFIN dataset](https://arxiv.org/abs/2312.02436).

We release both 3B and 11B models:

|Model|Number of parameters|
|-|-|
|[MUFFIN-T5-3B](https://huggingface.co/Reza8848/MUFFIN-T5-3B)|3 billion|
|[MUFFIN-T5-11B](https://huggingface.co/Reza8848/MUFFIN-T5-11B)|11 billion|

You can also find the Llama2-based model weights [here](https://huggingface.co/Reza8848/MUFFIN-Llama2-lora-13B).


## Prompt Template

Please use the following prompt template for inference, including evaluation on **SuperNI-Test**, **T0-Eval**, and **BBH**:

```python
prompt = "### Input:\n{input}"
prompt += "\n\n"
prompt += "### Instruction:\n{instruction}"
prompt += "\n\n"
prompt += "### Output:\n"

print(prompt)
```
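
For convenience, the same template can be wrapped in a small helper. This is a minimal sketch; the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical and not part of the released code.

```python
# Hypothetical helper (not part of the released code): fills the general-purpose template.
PROMPT_TEMPLATE = (
    "### Input:\n{input}"
    "\n\n"
    "### Instruction:\n{instruction}"
    "\n\n"
    "### Output:\n"
)


def build_prompt(input_text: str, instruction: str) -> str:
    """Return the prompt string for one test instance."""
    # Only the template is parsed for placeholders, so braces inside the values are kept verbatim.
    return PROMPT_TEMPLATE.format(input=input_text, instruction=instruction)


if __name__ == "__main__":
    print(build_prompt("2 + 2", "Compute the sum."))
```

Keeping the template in one constant avoids small whitespace differences between training-style and inference-time prompts.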

Please use the prompt below when testing the models on classification tasks such as **MMLU**:

```python
prompt = "### Input:\n{input}"
prompt += "\n\n"
prompt += "### Instruction:\n{instruction}\n"
prompt += "(A): {option1}\n(B): {option2}\n(C): {option3}\n(D): {option4}\nAvoid answers outside of (A, B, C, D)."  # one extra sentence in the prompt indicates the output space
prompt += "\n\n"
prompt += "### Output:\n"

print(prompt)
```
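
The classification template can be filled from a list of options in the same way. This is a sketch under the assumption that every question has exactly four options, as in the template above; `MC_TEMPLATE` and `build_classification_prompt` are hypothetical names.

```python
from typing import Sequence

# Hypothetical helper (not part of the released code): fills the four-option template.
MC_TEMPLATE = (
    "### Input:\n{input}"
    "\n\n"
    "### Instruction:\n{instruction}\n"
    "(A): {option1}\n(B): {option2}\n(C): {option3}\n(D): {option4}\n"
    "Avoid answers outside of (A, B, C, D)."
    "\n\n"
    "### Output:\n"
)


def build_classification_prompt(
    input_text: str, instruction: str, options: Sequence[str]
) -> str:
    """Return the prompt for one four-way multiple-choice instance."""
    if len(options) != 4:
        raise ValueError("this template expects exactly four options")
    # Map the options onto the placeholders option1..option4.
    fields = {f"option{i}": opt for i, opt in enumerate(options, start=1)}
    return MC_TEMPLATE.format(input=input_text, instruction=instruction, **fields)


if __name__ == "__main__":
    print(build_classification_prompt(
        "What is 2 + 2?", "Choose the correct answer.", ["3", "4", "5", "6"]
    ))
```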


## Model Usage

Load our model weights with Hugging Face 🤗 Transformers:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

## Download
tokenizer = AutoTokenizer.from_pretrained("Reza8848/MUFFIN-T5-3B")
model = AutoModelForSeq2SeqLM.from_pretrained("Reza8848/MUFFIN-T5-3B")


## Inference
#### Please prepare your testing instance (as shown below)
value_dict = {
  "input": "Drink more wine when you feel thirsty.\nDrink more water when you feel thirsty",
  "instruction": "In this task, you are given two unconventional instructions for quenching thirst. Your goal is to identify which instruction is more likely to be followed by a person who wants to try something new or different. Answer \"Wine\" if the person is more likely to drink wine when thirsty, and \"Water\" if they are more likely to drink water."
}

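#### Please use the prompt template mentioned before (the general-purpose one is repeated here)
prompt = "### Input:\n{input}\n\n### Instruction:\n{instruction}\n\n### Output:\n"
input_sequence = prompt.format_map(value_dict)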

input_ids = tokenizer(input_sequence, return_tensors="pt").input_ids
raw_outputs = model.generate(input_ids)  # set the generation arguments according to your needs (e.g., `do_sample`, `num_beams`)
outputs = tokenizer.decode(raw_outputs[0], skip_special_tokens=True)
print(outputs)
```

## Zero-Shot Evaluation Performance

Our training and inference code is based on [Tk-Instruct](https://github.com/yizhongw/Tk-Instruct/tree/main), including the [metric calculation scripts](https://github.com/yizhongw/Tk-Instruct/blob/main/src/compute_metrics.py) (i.e., `ROUGE-L` and `Exact-Match`). 

<div style="text-align:center"><img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/J1ZMmCs6GvrRKyD1hazTs.png" alt="performances.png" width="600"/></div>
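
To give an intuition for the two metrics named above, here is a simplified pure-Python sketch. It is not the Tk-Instruct script: the normalization (lowercasing, punctuation removal, whitespace tokenization) is an assumption made for illustration, and the function names are hypothetical. Use the linked script to reproduce reported numbers.

```python
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (illustrative only)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> bool:
    """True if the normalized prediction equals the normalized reference."""
    return normalize(prediction) == normalize(reference)


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L F-measure computed from the LCS of whitespace tokens."""
    pred, ref = normalize(prediction).split(), normalize(reference).split()
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(pred), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Water.", "water"))
    print(rouge_l_f1("drink water", "drink more water"))
```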


## 🥳 Citation

Please kindly cite our paper if you use any resources in this repository:

```bibtex
@inproceedings{Lou2023MUFFIN,
   title={{MUFFIN}: Curating Multi-Faceted Instructions for Improving Instruction Following},
   author={Renze Lou and Kai Zhang and Jian Xie and Yuxuan Sun and Janice Ahn and Hanzi Xu and Yu Su and Wenpeng Yin},
   booktitle={The Twelfth International Conference on Learning Representations},
   year={2024},
   url={https://openreview.net/forum?id=1vrS1zwekw}
}
```