Text Generation · Transformers · PyTorch · English · gptj · Inference Endpoints
xzyao Amrrs committed on
Commit 525e6b8
1 Parent(s): f16ab32

fixed a simple typo, apologies if not required (#3)


- fixed a simple typo, apologies if not required (29aa10c2c0d90c18fb408154e7c5ae5194b442ce)


Co-authored-by: amrrs <Amrrs@users.noreply.huggingface.co>

Files changed (1)

README.md +1 -1
README.md CHANGED
@@ -85,7 +85,7 @@ widget:
  > With a new decentralized training algorithm, we fine-tuned GPT-J (6B) on 3.53 billion tokens, resulting in GPT-JT (6B), a model that outperforms many 100B+ parameter models on classification benchmarks.
 
  We incorporated a collection of open techniques and datasets to build GPT-JT:
- - GPT-JT is a folk of [EleutherAI](https://www.eleuther.ai)'s [GPT-J (6B)](https://huggingface.co/EleutherAI/gpt-j-6B);
+ - GPT-JT is a fork of [EleutherAI](https://www.eleuther.ai)'s [GPT-J (6B)](https://huggingface.co/EleutherAI/gpt-j-6B);
  - We used [UL2](https://github.com/google-research/google-research/tree/master/ul2)'s training objective, allowing the model to see bidirectional context of the prompt;
  - The model was trained on a large collection of diverse data, including [Chain-of-Thought (CoT)](https://ai.googleblog.com/2022/05/language-models-perform-reasoning-via.html), [Public Pool of Prompts (P3) dataset](https://huggingface.co/datasets/bigscience/P3), [Natural-Instructions (NI) dataset](https://github.com/allenai/natural-instructions).
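The card text in the diff above describes GPT-JT (6B) as a Transformers/PyTorch text-generation model forked from GPT-J (6B). As a minimal sketch of how such a checkpoint is typically loaded with the `transformers` library — assuming the published repo id `togethercomputer/GPT-JT-6B-v1`, which is not stated in this diff, and an illustrative prompt:

```python
# Minimal loading sketch for the GPT-JT (6B) checkpoint described above.
# The repo id below is an assumption (not given in this commit); adjust as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Simple text-generation call matching the "Text Generation" task tag.
inputs = tokenizer("The benefits of decentralized training are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```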