---
datasets:
- natural_instructions
- the_pile
- cot
- Muennighoff/P3
inference:
  parameters:
    max_new_tokens: 5
    temperature: 1.0
    top_k: 1
language:
- en
pipeline_tag: text-generation
tags:
- gpt
widget:
- example_title: "ADE Corpus V2"
  text: |-
    Label the sentence based on whether it is related to an adverse drug effect (ADE). Details are described below:
    Drugs: Names of drugs and chemicals that include brand names, trivial names, abbreviations and systematic names were annotated. Mentions of drugs or chemicals should strictly be in a therapeutic context. This category does not include the names of metabolites, reaction byproducts, or hospital chemicals (e.g. surgical equipment disinfectants).
    Adverse effect: Mentions of adverse effects include signs, symptoms, diseases, disorders, acquired abnormalities, deficiencies, organ damage or death that strictly occur as a consequence of drug intake.
    Possible labels:
    1. ADE-related
    2. not ADE-related
    Sentence: A challenge with clozapine was feasible and showed no clinical symptoms of eosinophilia.
    Label: not ADE-related
    Sentence: CONCLUSIONS: These results suggest that clozapine may cause TD; however, the prevalence is low and the severity is relatively mild, with no or mild self-reported discomfort.
    Label: ADE-related
    Sentence: Best-corrected visual acuity measurements were performed at every visit.
    Label: not ADE-related
    Sentence: These cases were considered unusual in light of the short delay of their onset after initiation of immunosuppressive therapy and their fulminant course: 3 of these patients died of PCP occurring during the first month of treatment with prednisone.
    Label: ADE-related
    Sentence: The INR should be monitored more frequently when bosentan is initiated, adjusted, or discontinued in patients taking warfarin.
    Label: not ADE-related
    Sentence: NEH must be considered in lupus patients receiving cytotoxic agents to avoid inappropriate use of corticosteroids or antibiotics in this self-limited condition.
    Label:
- example_title: Banking77
  text: |-
    The following is a banking customer service query. Classify the query into one of the 77 categories available.
    Possible labels:
    1. Refund_not_showing_up
    2. activate_my_card
    3. age_limit
    4. apple_pay_or_google_pay
    5. atm_support
    6. automatic_top_up
    7. balance_not_updated_after_bank_transfer
    8. balance_not_updated_after_cheque_or_cash_deposit
    9. beneficiary_not_allowed
    10. cancel_transfer
    11. card_about_to_expire
    12. card_acceptance
    13. card_arrival
    14. card_delivery_estimate
    15. card_linking
    16. card_not_working
    17. card_payment_fee_charged
    18. card_payment_not_recognised
    19. card_payment_wrong_exchange_rate
    20. card_swallowed
    21. cash_withdrawal_charge
    22. cash_withdrawal_not_recognised
    23. change_pin
    24. compromised_card
    25. contactless_not_working
    26. country_support
    27. declined_card_payment
    28. declined_cash_withdrawal
    29. declined_transfer
    30. direct_debit_payment_not_recognised
    31. disposable_card_limits
    32. edit_personal_details
    33. exchange_charge
    34. exchange_rate
    35. exchange_via_app
    36. extra_charge_on_statement
    37. failed_transfer
    38. fiat_currency_support
    39. get_disposable_virtual_card
    40. get_physical_card
    41. getting_spare_card
    42. getting_virtual_card
    43. lost_or_stolen_card
    44. lost_or_stolen_phone
    45. order_physical_card
    46. passcode_forgotten
    47. pending_card_payment
    48. pending_cash_withdrawal
    49. pending_top_up
    50. pending_transfer
    51. pin_blocked
    52. receiving_money
    53. request_refund
    54. reverted_card_payment?
    55. supported_cards_and_currencies
    56. terminate_account
    57. top_up_by_bank_transfer_charge
    58. top_up_by_card_charge
    59. top_up_by_cash_or_cheque
    60. top_up_failed
    61. top_up_limits
    62. top_up_reverted
    63. topping_up_by_card
    64. transaction_charged_twice
    65. transfer_fee_charged
    66. transfer_into_account
    67. transfer_not_received_by_recipient
    68. transfer_timing
    69. unable_to_verify_identity
    70. verify_my_identity
    71. verify_source_of_funds
    72. verify_top_up
    73. virtual_card_not_working
    74. visa_or_mastercard
    75. why_verify_identity
    76. wrong_amount_of_cash_received
    77. wrong_exchange_rate_for_cash_withdrawal
    Query: My card payment was not successful.
    Label: declined_card_payment
    Query: Is it possible for me to change my PIN number?
    Label: change_pin
    Query: limits on top ups
    Label: top_up_limits
    Query: I live in the EU - can I get a card?
    Label: country_support
    Query: How can I tell the source for my available funds?
    Label: verify_source_of_funds
    Query: Why am I getting declines when trying to make a purchase online?
    Label:
- example_title: Overruling
  text: |-
    In law, an overruling sentence is a statement that nullifies a previous case decision as a precedent, by a constitutionally valid statute or a decision by the same or higher ranking court which establishes a different rule on the point of law involved. Label the sentence based on whether it is overruling or not.
    Possible labels:
    1. not overruling
    2. overruling
    Sentence: see mciver, 134 n.c.app. at 588, 518 s.e.2d at 526.
    Label: not overruling
    Sentence: to the extent that paprskar v. state, supra, applied the general test of waiver of constitutional rights set forth in johnson v. zerbst, supra, it is no longer viable.
    Label: overruling
    Sentence: narrowstep, 2010 wl 5422405, at *12.
    Label: not overruling
    Sentence: accordingly, to the extent of any conflict nemecek v. state, 621 s.w.2d 404 (tex.cr.app. 1980) is overruled.
    Label: overruling
    Sentence: the following facts are taken from the administrative record.
    Label: not overruling
    Sentence: see scott, supra at 352; commonwealth v. ruffin, 475 mass. 1003, 1004 (2016).
    Label:
- example_title: "Tweet Eval Hate"
  text: |-
    Label whether the following tweet contains hate speech against either immigrants or women. Hate Speech (HS) is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics.
    Possible labels:
    1. hate speech
    2. not hate speech
    Tweet: #TakeAKnee trump's a chump! @user is the real deal-helping so many on his journey through greatness! Even @user can't stand #IQ45-stands to CLEAN UP while he's in prison. One hot, lucky #Immigrant babe! #OHi12 #VoteBlue @user in the #Columbus area. #BlueWave baby!
    Label: not hate speech
    Tweet: HOW REFRESHING! In South Korea, there is no such thing as 'political correctness" when it comes to dealing with Muslim refugee wannabes via @user
    Label: hate speech
    Tweet: New to Twitter-- any men on here know what the process is to get #verified?
    Label: not hate speech
    Tweet: UK Pensioner Faces 350 Lashes In Saudi Arabia why does this country exist it does nothing for migrants picks on old men no help from anyone
    Label: not hate speech
    Tweet: RT @user Her:I don't get what u want outta this relationship Him:Well, I was only looking for a bj but u kept coming back
    Label: not hate speech
    Tweet: Dont worry @user you are and will always be the most hysterical woman.
    Label:
---

TOGETHER RESEARCH

# Model Summary

We present GPT-JT, a fork of GPT-J-6B fine-tuned for 20,000 steps, that outperforms most 100B+ parameter models at classification and improves performance on most tasks. GPT-JT was trained with a new decentralized algorithm over a 1G interconnect.

# Quick Start

```python
from transformers import pipeline

# Load the model as a pipeline; the weights are downloaded on first use.
pipe = pipeline(model='togethercomputer/GPT-JT-6B-v1')

# The model continues the prompt after "Answer:".
pipe('''Please answer the following question:\n\nQuestion: Where is Zurich?\nAnswer:''')
```

# Training Data

We fine-tune [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6B) on Natural-Instructions (NI), P3, chain-of-thought (CoT) data, and the Pile:

- [Natural-Instructions](https://github.com/allenai/natural-instructions)
- [P3](https://huggingface.co/datasets/Muennighoff/P3)
- [MMLU-COT](https://github.com/jasonwei20/flan-2/blob/main/mmlu-cot.json)
- [the pile](https://huggingface.co/datasets/the_pile)

# Hyperparameters

We used AdamW with a learning rate of 1e-5 and a global batch size of 64, and trained for 20k steps.
We used mixed-precision training, in which activations are in FP16 while the optimizer states are kept in FP32.
We used both data parallelism and pipeline parallelism to conduct training.
During training, we truncated each input sequence to 2048 tokens; input sequences with fewer than 2048 tokens were concatenated into one long sequence to improve data efficiency.

# Infrastructure

We used [the Together Research Computer](https://together.xyz/) to conduct training.
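
The truncation and concatenation step described under Hyperparameters can be sketched roughly as below. This is an illustrative sketch only, not the training code used for GPT-JT: the function name `pack_sequences`, the separator token id `sep_id`, and the greedy "flush when the next sequence no longer fits" policy are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List

MAX_LEN = 2048  # sequence length stated in the Hyperparameters section


def pack_sequences(
    sequences: Iterable[List[int]],
    max_len: int = MAX_LEN,
    sep_id: int = 0,  # hypothetical separator token id (assumption)
) -> Iterator[List[int]]:
    """Yield packed examples of at most max_len tokens each.

    Over-long sequences are truncated to max_len. Shorter ones are
    concatenated, separated by sep_id, until the next one would not fit.
    """
    buffer: List[int] = []
    for seq in sequences:
        seq = seq[:max_len]  # truncate over-long inputs
        if not seq:
            continue  # skip empty inputs
        # One extra slot is needed for the separator if the buffer is non-empty.
        needed = len(seq) + (1 if buffer else 0)
        if buffer and len(buffer) + needed > max_len:
            yield buffer  # flush the current packed example
            buffer = []
        if buffer:
            buffer.append(sep_id)
        buffer.extend(seq)
    if buffer:
        yield buffer  # flush whatever remains


# Toy demonstration with a tiny max_len so the result is easy to inspect.
packed = list(pack_sequences([[1] * 3, [2] * 4, [3] * 10, [4] * 2], max_len=8))
print(packed)
# [[1, 1, 1, 0, 2, 2, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3], [4, 4]]
```

Packing short sequences this way reduces the number of padding tokens per batch, which is the data-efficiency benefit the section refers to. A real pipeline might instead split sequences across example boundaries or use the tokenizer's end-of-text token as the separator.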