---
license: apache-2.0
---
# Cappy-Large

## Getting Started

Cappy is a pretrained small scorer designed to enhance the performance and efficiency of multi-task LLMs.
Cappy takes in an instruction and a candidate response as input, and produces a score between 0 and 1 indicating the estimated correctness of the response with respect to the instruction.
With merely 360 million parameters, Cappy either functions independently on classification tasks or serves as an auxiliary component for LLMs, boosting their performance.
Cappy also enables efficient integration of downstream supervision without requiring LLM finetuning or access to LLM parameters.
Furthermore, Cappy works flexibly with other LLM adaptations, including finetuning, in-context learning, and prompt tuning, offering additional performance enhancement.

- **Repository:** [https://github.com/tanyuqian/cappy](https://github.com/tanyuqian/cappy)
- **Paper:** [arxiv.org/abs/2311.06720](https://arxiv.org/abs/2311.06720)
## Uses

Cappy can be loaded either as a Jax/Flax model or a PyTorch model.

### Jax/Flax
```python
from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = FlaxAutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

# Flax models take NumPy/JAX arrays, so use return_tensors='np' (not 'pt')
inputs = tokenizer([(instruction, response), ], return_tensors='np')
score = cappy(**inputs).logits[0][0].item()
```

### PyTorch
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

inputs = tokenizer([(instruction, response), ], return_tensors='pt')
score = cappy(**inputs).logits[0][0].item()
```
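Because Cappy emits one scalar score per (instruction, response) pair, a common usage pattern is to score every candidate and keep the highest-scoring one. Below is a minimal sketch of that pattern; the `rerank` helper and the `dummy_score` stand-in are hypothetical illustrations (not part of Cappy or transformers). In practice, `score_fn` would wrap the tokenizer and model call from the snippets above.

```python
# Reranking sketch: score each candidate and keep the argmax.
# With Cappy, score_fn would wrap the tokenizer/model call from the
# snippets above; dummy_score is a hypothetical stand-in so this sketch
# runs without downloading the model.

def rerank(instruction, candidates, score_fn):
    """Return the (candidate, score) pair with the highest score."""
    scored = [(c, score_fn(instruction, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])

def dummy_score(instruction, response):
    # Hypothetical scorer (NOT Cappy): fraction of response words that
    # also appear in the instruction, just to make the sketch runnable.
    inst_words = set(instruction.lower().split())
    resp_words = set(response.lower().split())
    return len(inst_words & resp_words) / max(len(resp_words), 1)

instruction = "What label best describes this news article about the business moves of a private investment firm?"
labels = ['World', 'Sports', 'Business', 'Sci/Tech']
best_label, best_score = rerank(instruction, labels, dummy_score)
# best_label == 'Business'
```

With Cappy, `score_fn` would tokenize each `(instruction, candidate)` pair and return `cappy(**inputs).logits[0][0].item()` as in the examples above, turning Cappy into a classifier over a label set or a reranker over candidate responses sampled from a multi-task LLM.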

## Evaluation

We validate Cappy through an extensive suite of held-out tasks distinct from those incorporated in its pretraining.
The overall performance is shown in Fig. 1 and Fig. 2.
Specifically, on 11 language understanding tasks drawn from PromptSource, Cappy, with 360 million parameters, significantly outperforms OPT-IML-30B and OPT-175B, and matches the best of the previous multi-task LLMs.
Moreover, on 45 diverse complex tasks from BIG-Bench, Cappy consistently boosts the performance of the advanced multi-task LLM FLAN-T5 by a large margin.
Furthermore, Cappy offers additional performance enhancement when applied together with finetuning or in-context learning.
Our ablation study demonstrates the significance of our proposed pretraining and data augmentation strategies.

![](imgs/cappy_eval.png)
## Software

Cappy's pretraining uses the code from [this example](https://github.com/tanyuqian/redco/tree/master/examples/classification_regression) in [Red Coast](https://github.com/tanyuqian/redco), a lightweight toolkit for automating distributed training.

## Citation

```bibtex
@inproceedings{
tan2023cappy,
title={Cappy: Outperforming and Boosting Large Multi-Task {LM}s with a Small Scorer},
author={Bowen Tan and Yun Zhu and Lijuan Liu and Eric Xing and Zhiting Hu and Jindong Chen},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=Srt1hhQgqa}
}
```