---
license: apache-2.0
---
# Cappy-Large
## Getting Started
Cappy is a pretrained small scorer designed to enhance the performance and efficiency of multi-task LLMs.
Cappy takes an instruction and a candidate response as input and produces a score between 0 and 1 that estimates the correctness of the response with respect to the instruction.
With merely 360 million parameters, Cappy can either function independently on classification tasks or serve as an auxiliary component for LLMs, boosting their performance.
Cappy also enables efficient integration of downstream supervision without requiring LLM finetuning or access to LLM parameters.
Furthermore, Cappy can be combined with other LLM adaptations, including finetuning, in-context learning, and prompt tuning, for additional performance gains.
- **Repository:** [https://github.com/tanyuqian/cappy](https://github.com/tanyuqian/cappy)
- **Paper:** [arxiv.org/abs/2311.06720](https://arxiv.org/abs/2311.06720)
## Uses
Cappy can be loaded either as a JAX/Flax model or as a PyTorch model.
### JAX/Flax
```python
from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = FlaxAutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

# Raw string so the backslashes in the article text stay literal.
instruction = r"""
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\which has a reputation for making well-timed and occasionally\controversial plays in the defense industry, has quietly placed\its bets on another part of the market.
"""
response = 'Business'

# Flax models take NumPy/JAX arrays, so request NumPy tensors rather than PyTorch ones.
inputs = tokenizer([(instruction, response), ], return_tensors='np')
# The single logit is Cappy's score for this (instruction, response) pair.
score = cappy(**inputs).logits[0][0].item()
```
### PyTorch
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

# Raw string so the backslashes in the article text stay literal.
instruction = r"""
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\which has a reputation for making well-timed and occasionally\controversial plays in the defense industry, has quietly placed\its bets on another part of the market.
"""
response = 'Business'

# Tokenize the (instruction, response) pair as a batch of one.
inputs = tokenizer([(instruction, response), ], return_tensors='pt')
# The single logit is Cappy's score for this (instruction, response) pair.
score = cappy(**inputs).logits[0][0].item()
```
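Because Cappy scores one (instruction, response) pair at a time, choosing a label for a classification task or picking among several LLM-generated responses comes down to scoring every candidate and keeping the highest-scoring one. The following is a minimal sketch of that selection step, not code from the Cappy repository. The names `rank_candidates`, `make_cappy_score_fn`, and `toy_score_fn` are hypothetical, and the toy scores and candidate labels are made up so the sketch runs without downloading the model.

```python
from typing import Callable, List, Sequence, Tuple


def rank_candidates(
    instruction: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate response and return (candidate, score) pairs, best first."""
    scored = [(candidate, score_fn(instruction, candidate)) for candidate in candidates]
    # sorted() is stable, so candidates with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def make_cappy_score_fn(tokenizer, cappy) -> Callable[[str, str], float]:
    """Hypothetical helper: wrap a loaded PyTorch tokenizer/model pair as a score function."""
    def score_fn(instruction: str, response: str) -> float:
        # Same calls as the PyTorch snippet above, applied to a single pair.
        inputs = tokenizer([(instruction, response)], return_tensors='pt')
        return cappy(**inputs).logits[0][0].item()
    return score_fn


def toy_score_fn(instruction: str, response: str) -> float:
    """Stand-in scorer with made-up scores; replace with make_cappy_score_fn(tokenizer, cappy)."""
    toy_scores = {'Business': 0.9, 'World': 0.4, 'Sports': 0.2}
    return toy_scores.get(response, 0.0)


# Illustrative candidate labels; in practice these come from the task or from an LLM's samples.
ranked = rank_candidates(
    'What label best describes this news article?',
    ['World', 'Sports', 'Business'],
    toy_score_fn,
)
best_response, best_score = ranked[0]
```

With the real model, `make_cappy_score_fn(tokenizer, cappy)` would be passed in place of `toy_score_fn`. Scoring pairs one by one keeps the sketch simple; batching all candidates into a single tokenizer call is a natural optimization.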
## Evaluation
We validate Cappy through an extensive suite of held-out tasks distinct from those incorporated in its pretraining.
The overall performance is shown in Fig. 1 and Fig. 2.
Specifically, on 11 language understanding tasks drawn from PromptSource, Cappy, with 360 million parameters, significantly outperforms OPT-IML-30B and OPT-175B and matches the best previous multi-task LLMs.
In addition, on 45 diverse complex tasks from BIG-Bench, Cappy consistently boosts the performance of the advanced multi-task LLM FLAN-T5 by a large margin.
Furthermore, Cappy offers additional performance gains when applied together with finetuning or in-context learning.
Our subsequent ablation study demonstrates the importance of the proposed pretraining and data augmentation strategies.
![](cappy_eval.png)
## Software
Cappy's pretraining uses the code from [this example](https://github.com/tanyuqian/redco/tree/master/examples/classification_regression) in [Red Coast](https://github.com/tanyuqian/redco), a lightweight
toolkit for automating distributed training.
## Citation
```
@inproceedings{
tan2023cappy,
title={Cappy: Outperforming and Boosting Large Multi-Task {LM}s with a Small Scorer},
author={Bowen Tan and Yun Zhu and Lijuan Liu and Eric Xing and Zhiting Hu and Jindong Chen},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=Srt1hhQgqa}
}
```
![](cappy.jpg)