YX-Cerebras committed on
Commit
0ea9b34
1 Parent(s): 74e7b13

Update README.md

Files changed (1)
  1. README.md +39 -0
README.md CHANGED
@@ -1,3 +1,42 @@
  ---
+ language:
+ - en
+ inference: false
+ tags:
+ - pytorch
+ - causal-lm
  license: apache-2.0
+ datasets:
+ - the_pile
+ pipeline_tag: text-generation
  ---
+
+ ## Intermediate Checkpoints
+
+ # <span style="color:red"> Warning: the checkpoints in this repo are not fully trained models. </span>
+
+ **For the final model checkpoint, please see:** https://huggingface.co/cerebras/Cerebras-GPT-13B
+
+ ## Uses and Limitations
+
+ ### Intended Use
+ The primary intended use is to further research into large language models. These models can be used as foundation models for NLP, applications, ethics, and alignment research. Our primary intended users are researchers who are working to improve LLMs and practitioners seeking reference implementations, training setups, hyperparameters, or pre-trained models. We release these models under the fully permissive Apache 2.0 license for the community to use freely.
+
+ You may fine-tune and adapt Cerebras-GPT models for deployment via either Cerebras [Model Studio](https://www.cerebras.net/product-cloud/) or third-party libraries. Further safety-related testing and mitigations should be applied before using the Cerebras-GPT model family in production downstream applications.
+
+ Due to financial and compute budget constraints, Cerebras-GPT models were only trained and evaluated following the approaches described in the paper.
+
+ ### Out of Scope Use
+ Cerebras-GPT models are trained on the Pile, in English only, and are not suitable for machine translation tasks.
+
+ Cerebras-GPT models have not been tuned for human-facing dialog applications like chatbots and will not respond to prompts in the same way as models that have received instruction tuning or reinforcement learning from human feedback (RLHF), such as Flan-T5 or ChatGPT. Cerebras-GPT models can be tuned using those methods.
+
+ ### Risk, Bias, Ethical Considerations
+ * **Data**: The Pile dataset has been thoroughly analyzed from various ethical standpoints, such as toxicity analysis, gender bias, pejorative content, and racially sensitive content. Please refer to the Pile dataset references.
+ * **Human life**: The outputs from this model may or may not align with human values. The risk needs to be thoroughly investigated before deploying this model in a production environment where it can directly impact human life.
+ * **Risks and harms**: There can be distributional bias in the Pile dataset that can manifest in various forms in downstream model deployment. There are other risks associated with large language models, such as amplifying stereotypes, memorizing training data, or revealing private or secure information.
+ * **Mitigations**: Only the mitigations in standard Pile dataset pre-processing were employed when pre-training Cerebras-GPT.
+
+ ## Acknowledgements
+
+ We are thankful to all Cerebras engineers, past and present, who made this work possible.