juewang committed
Commit c4d3569
1 Parent(s): d91dcaa

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -192,10 +192,10 @@ widget:
  Label:
  ---
 
- <h1 style="font-size: 42px">TOGETHER RESEARCH<h1/>
+ <h1 style="font-size: 42px">GPT-JT</h1>
 
  # Model Summary
- We present GPT-JT, a fork of GPT-6B, trained for 20,000 steps, that outperforms most 100B+ parameter models at classification, and improves most tasks. GPT-JT was trained with a new decentralized algorithm with 1G interconnect.
+ We present GPT-JT, a fork of GPT-J-6B, trained for 20,000 steps, that outperforms most 100B+ parameter models at classification and improves on most tasks relative to GPT-J-6B. GPT-JT was trained with a new decentralized algorithm on computers networked over slow 1Gbps links.
  GPT-JT is a bidirectional dense model, trained through the UL2 objective with NI, P3, COT, and the Pile data.
 
  **Please check out our demo: [TOMA-app](https://huggingface.co/spaces/togethercomputer/TOMA-app).**
@@ -204,7 +204,7 @@ GPT-JT is a bidirectional dense model, trained through the UL2 objective with NI, P3
  ```python
  from transformers import pipeline
  pipe = pipeline(model='togethercomputer/GPT-JT-6B-v1')
- pipe('''Please answer the following question:\n\nQuestion: Where is Zurich?\nAnswer:''')
+ pipe('''I like this! <-- Is it positive or negative?\nA:''')
  ```
 
  or
 
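The second hunk ends at "or", so the alternative usage the README introduces at that point lies outside this diff's context. For reference only, here is a minimal sketch (not the README's own code) of how the same prompt could be run without the pipeline helper, using the standard transformers AutoTokenizer/AutoModelForCausalLM API; the model id and prompt are taken from the snippet above, while max_new_tokens and the rest of the generation settings are illustrative assumptions.

```python
# Illustrative sketch only -- not taken from the README diff above.
# Loads the tokenizer and model directly instead of using pipeline();
# the model id and prompt come from the diff, generation settings are assumptions.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-JT-6B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1")

prompt = "I like this! <-- Is it positive or negative?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```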