---
license: apache-2.0
---

# Cappy-Large

## Getting Started

Cappy is a pretrained small scorer designed to enhance the performance and efficiency of multi-task LLMs.
Cappy takes in an instruction and a candidate response as input, and produces a score between 0 and 1 indicating the estimated correctness of the response with respect to the instruction.
With merely 360 million parameters, Cappy either functions independently on classification tasks or serves as an auxiliary component for LLMs, boosting their performance.
Moreover, Cappy enables efficient integration of downstream supervision without requiring LLM finetuning or access to LLM parameters.
Furthermore, Cappy is flexible enough to cooperate with other LLM adaptations, such as finetuning, in-context learning, and prompt tuning, offering additional performance enhancement.

- **Repository:** [https://github.com/tanyuqian/cappy](https://github.com/tanyuqian/cappy)
- **Paper:** [arxiv.org/abs/2311.06720](https://arxiv.org/abs/2311.06720)

## Uses

Cappy can be loaded either as a Jax/Flax model or as a PyTorch model.

### Jax/Flax
```python
from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = FlaxAutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

# A Flax model expects JAX arrays, so request 'jax' tensors rather than 'pt'.
inputs = tokenizer([(instruction, response)], return_tensors='jax')
score = cappy(**inputs).logits[0][0].item()
```

### PyTorch
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

inputs = tokenizer([(instruction, response)], return_tensors='pt')
score = cappy(**inputs).logits[0][0].item()
```
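
Because Cappy emits a single correctness score per (instruction, response) pair, it can also rank several candidate responses for the same instruction and keep the best one, e.g. candidate labels on a classification task. A minimal PyTorch sketch of this pattern follows; the `rank_candidates` helper and the AG News label set are illustrative, not part of any official API:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def rank_candidates(cappy, tokenizer, instruction, candidates):
    """Score each (instruction, candidate) pair with Cappy and return
    (candidate, score) pairs sorted best-first."""
    inputs = tokenizer([(instruction, c) for c in candidates],
                       padding=True, truncation=True, return_tensors='pt')
    # num_labels is 1, so logits has shape (batch, 1); column 0 is the score.
    scores = cappy(**inputs).logits[:, 0].tolist()
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

if __name__ == '__main__':
    # Downloads the 360M-parameter checkpoint on first use.
    tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
    cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

    instruction = 'What label best describes this news article?\nCarlyle Looks Toward Commercial Aerospace ...'
    for candidate, score in rank_candidates(cappy, tokenizer, instruction,
                                            ['World', 'Sports', 'Business', 'Sci/Tech']):
        print(f'{score:.3f}  {candidate}')
```

The same pattern applies to free-form generation: sample several responses from an LLM, score each with Cappy, and return the highest-scoring one.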

## Evaluation

We validate Cappy through an extensive suite of held-out tasks distinct from those incorporated in its pretraining.
The overall performance is shown in Fig. 1 and Fig. 2.
Specifically, on 11 language understanding tasks drawn from PromptSource, Cappy, with 360 million parameters, significantly outperforms
OPT-IML-30B and OPT-175B, and matches the best of the previous multi-task
LLMs. Moreover, on 45 diverse complex tasks from BIG-Bench, Cappy consistently boosts the
performance of the advanced multi-task LLM FLAN-T5 by a large margin. Furthermore, Cappy
offers additional performance enhancement when applied together with finetuning or in-context
learning. Our subsequent ablation study demonstrates the significance of our proposed pretraining and data
augmentation strategies.

![](imgs/cappy_eval.png)

## Software

Cappy's pretraining uses the code from [this example](https://github.com/tanyuqian/redco/tree/master/examples/classification_regression) in [Red Coast](https://github.com/tanyuqian/redco), a lightweight toolkit for automating distributed training.

## Citation

```bibtex
@inproceedings{tan2023cappy,
  title={Cappy: Outperforming and Boosting Large Multi-Task {LM}s with a Small Scorer},
  author={Bowen Tan and Yun Zhu and Lijuan Liu and Eric Xing and Zhiting Hu and Jindong Chen},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=Srt1hhQgqa}
}
```