---
datasets: 
  - natural_instructions
  - the_pile
  - cot
  - Muennighoff/P3
inference: 
  parameters: 
    max_new_tokens: 5
    temperature: 1.0
    top_k: 1
language: 
  - en
pipeline_tag: text-generation
widget: 
  - 
    example_title: "ADE Corpus V2"
    text: |-
        Label the sentence based on whether it is related to an adverse drug effect (ADE). Details are described below:
        Drugs: Names of drugs and chemicals that include brand names, trivial names, abbreviations and systematic names were annotated. Mentions of drugs or chemicals should strictly be in a therapeutic context. This category does not include the names of metabolites, reaction byproducts, or hospital chemicals (e.g. surgical equipment disinfectants).
        Adverse effect: Mentions of adverse effects include signs, symptoms, diseases, disorders, acquired abnormalities, deficiencies, organ damage or death that strictly occur as a consequence of drug intake.
        Possible labels:
        1. ADE-related
        2. not ADE-related
        
        Sentence: A challenge with clozapine was feasible and showed no clinical symptoms of eosinophilia.
        Label: not ADE-related
        
        Sentence: CONCLUSIONS: These results suggest that clozapine may cause TD; however, the prevalence is low and the severity is relatively mild, with no or mild self-reported discomfort.
        Label: ADE-related
        
        Sentence: Best-corrected visual acuity measurements were performed at every visit.
        Label: not ADE-related
        
        Sentence: These cases were considered unusual in light of the short delay of their onset after initiation of immunosuppressive therapy and their fulminant course: 3 of these patients died of PCP occurring during the first month of treatment with prednisone.
        Label: ADE-related
        
        Sentence: The INR should be monitored more frequently when bosentan is initiated, adjusted, or discontinued in patients taking warfarin.
        Label: not ADE-related
        
        Sentence: NEH must be considered in lupus patients receiving cytotoxic agents to avoid inappropriate use of corticosteroids or antibiotics in this self-limited condition.
        Label:
  - 
    example_title: Banking77
    text: |-
        The following is a banking customer service query. Classify the query into one of the 77 categories available.
        Possible labels:
        1. Refund_not_showing_up
        2. activate_my_card
        3. age_limit
        4. apple_pay_or_google_pay
        5. atm_support
        6. automatic_top_up
        7. balance_not_updated_after_bank_transfer
        8. balance_not_updated_after_cheque_or_cash_deposit
        9. beneficiary_not_allowed
        10. cancel_transfer
        11. card_about_to_expire
        12. card_acceptance
        13. card_arrival
        14. card_delivery_estimate
        15. card_linking
        16. card_not_working
        17. card_payment_fee_charged
        18. card_payment_not_recognised
        19. card_payment_wrong_exchange_rate
        20. card_swallowed
        21. cash_withdrawal_charge
        22. cash_withdrawal_not_recognised
        23. change_pin
        24. compromised_card
        25. contactless_not_working
        26. country_support
        27. declined_card_payment
        28. declined_cash_withdrawal
        29. declined_transfer
        30. direct_debit_payment_not_recognised
        31. disposable_card_limits
        32. edit_personal_details
        33. exchange_charge
        34. exchange_rate
        35. exchange_via_app
        36. extra_charge_on_statement
        37. failed_transfer
        38. fiat_currency_support
        39. get_disposable_virtual_card
        40. get_physical_card
        41. getting_spare_card
        42. getting_virtual_card
        43. lost_or_stolen_card
        44. lost_or_stolen_phone
        45. order_physical_card
        46. passcode_forgotten
        47. pending_card_payment
        48. pending_cash_withdrawal
        49. pending_top_up
        50. pending_transfer
        51. pin_blocked
        52. receiving_money
        53. request_refund
        54. reverted_card_payment?
        55. supported_cards_and_currencies
        56. terminate_account
        57. top_up_by_bank_transfer_charge
        58. top_up_by_card_charge
        59. top_up_by_cash_or_cheque
        60. top_up_failed
        61. top_up_limits
        62. top_up_reverted
        63. topping_up_by_card
        64. transaction_charged_twice
        65. transfer_fee_charged
        66. transfer_into_account
        67. transfer_not_received_by_recipient
        68. transfer_timing
        69. unable_to_verify_identity
        70. verify_my_identity
        71. verify_source_of_funds
        72. verify_top_up
        73. virtual_card_not_working
        74. visa_or_mastercard
        75. why_verify_identity
        76. wrong_amount_of_cash_received
        77. wrong_exchange_rate_for_cash_withdrawal
        
        Query: My card payment was not successful.
        Label: declined_card_payment
        
        Query: Is it possible for me to change my PIN number?
        Label: change_pin
        
        Query: limits on top ups
        Label: top_up_limits
        
        Query: I live in the EU - can I get a card?
        Label: country_support
        
        Query: How can I tell the source for my available funds?
        Label: verify_source_of_funds
        
        Query: Why am I getting declines when trying to make a purchase online?
        Label:
  - 
    example_title: Overruling
    text: |-
        In law, an overruling sentence is a statement that nullifies a previous case decision as a precedent, by a constitutionally valid statute or a decision by the same or higher ranking court which establishes a different rule on the point of law involved. Label the sentence based on whether it is overruling or not.
        Possible labels:
        1. not overruling
        2. overruling
        
        Sentence: see  mciver, 134 n.c.app. at 588, 518 s.e.2d at 526.
        Label: not overruling
        
        Sentence: to the extent that paprskar v. state, supra, applied the general test of waiver of constitutional rights set forth in johnson v. zerbst, supra, it is no longer viable.
        Label: overruling
        
        Sentence:  narrowstep, 2010 wl 5422405, at *12.
        Label: not overruling
        
        Sentence: accordingly, to the extent of any conflict nemecek v. state, 621 s.w.2d 404 (tex.cr.app. 1980) is overruled.
        Label: overruling
        
        Sentence: the following facts are taken from the administrative record.
        Label: not overruling
        
        Sentence: see scott, supra at 352; commonwealth v. ruffin, 475 mass. 1003, 1004 (2016).
        Label:
  - 
    example_title: "Tweet Eval Hate"
    text: |-
        Label whether the following tweet contains hate speech against either immigrants or women. Hate Speech (HS) is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics.
        Possible labels:
        1. hate speech
        2. not hate speech
        
        Tweet: #TakeAKnee trump's a chump! @user is the real deal-helping so many on his journey through greatness! Even @user can't stand #IQ45-stands to CLEAN UP while he's in prison. One hot, lucky #Immigrant babe! #OHi12 #VoteBlue @user in the #Columbus area. #BlueWave baby!
        Label: not hate speech
        
        Tweet: HOW REFRESHING! In South Korea, there is no such thing as 'political correctness" when it comes to dealing with Muslim refugee wannabes via @user
        Label: hate speech
        
        Tweet: New to Twitter-- any men on here know what the process is to get #verified?
        Label: not hate speech
        
        Tweet: UK Pensioner Faces 350 Lashes In Saudi Arabia why does this country exist it does nothing for migrants picks on old men no help from anyone
        Label: not hate speech
        
        Tweet: RT @user Her:I don't get what u want outta this relationship Him:Well, I was only looking for a bj but u kept coming back
        Label: not hate speech
        
        Tweet: Dont worry @user you are and will always be the most hysterical woman.
        Label:
---

<h1 style="font-size: 42px">TOGETHER RESEARCH</h1>

# Model Summary
We present GPT-JT, a fork of GPT-J-6B fine-tuned for 20,000 steps. It outperforms most 100B+ parameter models at classification and improves performance on most tasks. GPT-JT was trained with a new decentralized algorithm over a 1G interconnect.

# Quick Start
```python
from transformers import pipeline
pipe = pipeline(model='togethercomputer/GPT-JT-6B-v1')
pipe('''Please answer the following question:\n\nQuestion: Where is Zurich?\nAnswer:''')
```

Or load the tokenizer and model directly:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-JT-6B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1")
```
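
The widget examples in this card's metadata all share one few-shot layout: an instruction, a numbered list of possible labels, several labelled examples, and a final item whose `Label:` is left empty. A minimal sketch of building such a prompt is shown below. The helper `build_prompt` and its parameter names are hypothetical and are not part of the model or of `transformers`.

```python
from typing import List, Tuple


def build_prompt(
    instruction: str,
    labels: List[str],
    examples: List[Tuple[str, str]],
    query: str,
    field: str = "Sentence",  # the widget examples also use "Query" and "Tweet"
) -> str:
    """Assemble a few-shot classification prompt in the widget-example layout."""
    lines = [instruction, "Possible labels:"]
    lines += [f"{i}. {label}" for i, label in enumerate(labels, start=1)]
    for text, label in examples:
        lines += ["", f"{field}: {text}", f"Label: {label}"]
    # The final item leaves the label empty for the model to complete.
    lines += ["", f"{field}: {query}", "Label:"]
    return "\n".join(lines)


prompt = build_prompt(
    instruction="Label the sentence based on whether it is overruling or not.",
    labels=["not overruling", "overruling"],
    examples=[
        ("the following facts are taken from the administrative record.", "not overruling"),
    ],
    query="see scott, supra at 352; commonwealth v. ruffin, 475 mass. 1003, 1004 (2016).",
)
print(prompt)
```

The resulting string can be passed to the `pipe` object from the first snippet, for instance `pipe(prompt, max_new_tokens=5)`. The widget settings in the metadata (`max_new_tokens: 5`, `top_k: 1`) keep the completion short and always pick the most likely next token.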

# Training Data
We fine-tune [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6B) on Natural-Instructions (NI), P3, MMLU-COT, and the Pile:
- [Natural-Instructions](https://github.com/allenai/natural-instructions)
- [P3](https://huggingface.co/datasets/Muennighoff/P3)
- [MMLU-COT](https://github.com/jasonwei20/flan-2/blob/main/mmlu-cot.json)
- [The Pile](https://huggingface.co/datasets/the_pile)

# Hyperparameters
We used AdamW with a learning rate of 1e-5 and a global batch size of 64, and trained for 20k steps.
We used mixed-precision training, where activations are in FP16 while the optimizer states are kept in FP32.
We used both data parallelism and pipeline parallelism to conduct training.
During training, we truncate each input sequence to 2048 tokens; for input sequences that contain fewer than 2048 tokens, we concatenate multiple sequences into one long sequence to improve data efficiency.
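
The truncate-and-concatenate step can be sketched as follows. This is only one plausible reading of the description above, not the training code: it assumes greedy packing that never splits a sequence across two packs, and the function name `pack_sequences` and the optional `sep_id` separator token are hypothetical.

```python
from typing import Iterable, List, Optional

MAX_LEN = 2048  # sequence length stated in this card


def pack_sequences(
    sequences: Iterable[List[int]],
    max_len: int = MAX_LEN,
    sep_id: Optional[int] = None,  # hypothetical separator token, not from the card
) -> List[List[int]]:
    """Truncate each sequence to max_len, then greedily concatenate short ones."""
    packed: List[List[int]] = []
    buffer: List[int] = []
    for seq in sequences:
        seq = seq[:max_len]  # truncate over-long inputs
        extra = 1 if (sep_id is not None and buffer) else 0
        if buffer and len(buffer) + extra + len(seq) > max_len:
            packed.append(buffer)  # current pack is full; start a new one
            buffer = []
            extra = 0
        if extra:
            buffer.append(sep_id)
        buffer.extend(seq)
    if buffer:
        packed.append(buffer)
    return packed


# Toy example with max_len=8 so the behaviour is easy to follow.
toy = [[1, 2, 3], [4, 5], [6, 7, 8, 9], list(range(100, 112))]
packed = pack_sequences(toy, max_len=8)
print(packed)
```

In the toy run, the first two sequences share one pack, the third starts a new pack, and the 12-token sequence is truncated to 8 tokens. If every packed sequence were completely full, 20,000 steps at a global batch size of 64 would cover at most 20,000 × 64 × 2048 ≈ 2.62 billion tokens.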

# Infrastructure
We used [the Together Research Computer](https://together.xyz/) to conduct training.