nazneen committed on
Commit 409fd6c
1 Parent(s): 95a7ea7

model documentation

Files changed (1)
  1. README.md +163 -0
README.md ADDED
---
tags:
- text-generation
---
# Model Card for GPT-J-6B-Skein

# Model Details

## Model Description

- **Developed by:** KoboldAI
- **Shared by [Optional]:** More information needed
- **Model type:** Text Generation
- **Language(s) (NLP):** More information needed
- **License:** More information needed
- **Related Models:** [GPT-J 6B](https://huggingface.co/EleutherAI/gpt-j-6B)
- **Parent Model:** GPT-J
- **Resources for more information:**
    - [GitHub Repo](https://github.com/kingoflolz/mesh-transformer-jax)
    - [Associated Model Doc](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForCausalLM)

# Uses

## Direct Use

This model can be used for the task of text generation.
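
A minimal sketch of direct use through the `transformers` pipeline API is shown below; the prompt and the `max_new_tokens` value are illustrative choices, not settings recommended by the model authors.

```python
from transformers import pipeline

# Minimal text-generation sketch; the prompt and generation settings
# are illustrative, not values recommended by the model authors.
generator = pipeline("text-generation", model="KoboldAI/GPT-J-6B-Skein")

result = generator("You enter the forest and stumble upon", max_new_tokens=40)
print(result[0]["generated_text"])
```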

## Downstream Use [Optional]

More information needed

## Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

# Bias, Risks, and Limitations

The core functionality of GPT-J is taking a string of text and predicting the next token. While language models are widely used for tasks other than this, there are a lot of unknowns with this work. When prompting GPT-J, it is important to remember that the statistically most likely next token is often not the token that produces the most "accurate" text. Never depend upon GPT-J to produce factually accurate output.

GPT-J was trained on the Pile, a dataset known to contain profanity, lewd, and otherwise abrasive language. Depending upon the use case, GPT-J may produce socially unacceptable text. See Sections 5 and 6 of the Pile paper for a more detailed analysis of the biases in the Pile.

As with all language models, it is hard to predict in advance how GPT-J will respond to particular prompts, and offensive content may occur without warning. We recommend having a human curate or filter the outputs before releasing them, both to censor undesirable content and to improve the quality of the results; a minimal filtering sketch follows below.

See the [GPT-J 6B model card](https://huggingface.co/EleutherAI/gpt-j-6B) for more information.
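
As a deliberately minimal sketch of the curation step recommended above, the snippet below flags outputs containing blocklisted terms for human review. The `flag_for_review` helper and its placeholder blocklist are hypothetical, and no keyword filter is a substitute for human judgment.

```python
# Hypothetical first-pass filter; the blocklist entries are placeholders.
# A keyword check only routes text to a human reviewer; it does not by
# itself make outputs safe to release.
BLOCKLIST = {"placeholder_term_1", "placeholder_term_2"}

def flag_for_review(generated_text: str) -> bool:
    """Return True when a generated string should be held for human review."""
    lowered = generated_text.lower()
    return any(term in lowered for term in BLOCKLIST)
```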

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.

# Training Details

## Training Data

More information needed

## Training Procedure

### Preprocessing

More information needed

### Speeds, Sizes, Times

More information needed

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

More information needed

### Factors

More information needed

### Metrics

More information needed

## Results

More information needed

# Model Examination

More information needed

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed
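
For runs executed locally, emissions can also be measured programmatically; the sketch below uses the third-party `codecarbon` package, which is an assumed tooling choice rather than something this card prescribes.

```python
# Hedged sketch using the third-party codecarbon package (an assumed
# tooling choice; this card does not prescribe one).
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()
tracker.start()
# ... run training or inference here ...
emissions = tracker.stop()  # estimated emissions in kg of CO2-equivalent
print(f"Estimated emissions: {emissions:.4f} kg CO2eq")
```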

# Technical Specifications [optional]

## Model Architecture and Objective

More information needed

## Compute Infrastructure

More information needed

### Hardware

More information needed

### Software

More information needed

# Citation

**BibTeX:**
```
@misc{mesh-transformer-jax,
  author = {Wang, Ben},
  title = {{Mesh-Transformer-JAX: Model-Parallel Implementation of Transformer Language Model with JAX}},
  howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
  year = 2021,
  month = May
}
```

# Glossary [optional]

More information needed

# More Information [optional]

More information needed

# Model Card Authors [optional]

KoboldAI in collaboration with Ezi Ozoani and the Hugging Face team

# Model Card Contact

More information needed

# How to Get Started with the Model

Use the code below to get started with the model.

<details>
<summary> Click to expand </summary>

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("KoboldAI/GPT-J-6B-Skein")
model = AutoModelForCausalLM.from_pretrained("KoboldAI/GPT-J-6B-Skein")
```
</details>
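
Once the tokenizer and model are loaded, generation might look like the sketch below; the prompt and sampling settings are illustrative only, not recommendations from the model authors.

```python
# Illustrative usage of the objects loaded above; the prompt and the
# sampling settings are placeholders, not author-recommended values.
inputs = tokenizer("You wake up in a dimly lit room.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```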