nazneen committed on
Commit
f932230
1 Parent(s): 8940daf

model documentation

  1. README.md +166 -5
README.md CHANGED
@@ -1,16 +1,177 @@
  ---
  language:
  - ru
  tags:
  - PyTorch
  - Transformers
  thumbnail: "https://github.com/sberbank-ai/model-zoo"
  ---
- # ruT5-base
- Model was trained by [SberDevices](https://sberdevices.ru/) team.
- * Task: `text2text generation`
  * Type: `encoder-decoder`
  * Tokenizer: `bpe`
- * Dict size: `32 101`
  * Num Parameters: `222 M`
- * Training Data Volume `300 GB`

  ---
  language:
  - ru
+
  tags:
  - PyTorch
  - Transformers
+
  thumbnail: "https://github.com/sberbank-ai/model-zoo"
  ---
+
+
+ # Model Card for ruT5-base
+
+ # Model Details
+
+ ## Model Description
+
+ More information needed
+
+ - **Developed by:** [SberDevices](https://sberdevices.ru/) team
+ - **Shared by [Optional]:** [SberDevices](https://sberdevices.ru/) team
+ - **Model type:** Text2text Generation
+ - **Language(s) (NLP):** Russian
+ - **License:** More information needed
+ - **Parent Model:** T5 base
+ - **Resources for more information:** More information needed
+
+
+
+ # Uses
+
+
+ ## Direct Use
+ This model can be used for the task of text2text generation.
+
+ ## Downstream Use [Optional]
+
+ More information needed.
+
+ ## Out-of-Scope Use
+
+ The model should not be used to intentionally create hostile or alienating environments for people.
+
+ # Bias, Risks, and Limitations
+
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
+
+
+
+ ## Recommendations
+
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ # Training Details
+
+ ## Training Data
+
+ * Dict size: `32 101`
+ * Training Data Volume: `300 GB`
+
+ ## Training Procedure
+
+
+ ### Preprocessing
+
+ More information needed
+
+
+
+
+
+ ### Speeds, Sizes, Times
  * Type: `encoder-decoder`
  * Tokenizer: `bpe`
  * Num Parameters: `222 M`
+
+
+ # Evaluation
+
+
+ ## Testing Data, Factors & Metrics
+
+ ### Testing Data
+
+ More information needed
+
+
+ ### Factors
+ More information needed
+
+ ### Metrics
+
+ More information needed
+
+
+ ## Results
+
+ More information needed
+
+
+ # Model Examination
+
+ More information needed
+
+ # Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** More information needed
+ - **Hours used:** More information needed
+ - **Cloud Provider:** More information needed
+ - **Compute Region:** More information needed
+ - **Carbon Emitted:** More information needed
+
+ # Technical Specifications [optional]
+
+ ## Model Architecture and Objective
+
+ * Type: `encoder-decoder`
+
+
+ ## Compute Infrastructure
+
+ More information needed
+
+ ### Hardware
+
+
+ More information needed
+
+ ### Software
+
+ More information needed.
+
+ # Citation
+
+
+ More information needed
+
+
+
+
+ # Glossary [optional]
+ More information needed
+
+ # More Information [optional]
+ More information needed
+
+
+ # Model Card Authors [optional]
+
+ [SberDevices](https://sberdevices.ru/) team in collaboration with Ezi Ozoani and the Hugging Face team
+
+
+ # Model Card Contact
+
+ More information needed
+
+ # How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained("sberbank-ai/ruT5-base")
+
+ model = AutoModelForSeq2SeqLM.from_pretrained("sberbank-ai/ruT5-base")
+ ```
+ </details>
+
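
The snippet added by this commit only loads the tokenizer and the model. As a possible next step, the sketch below shows one way to pass text through the checkpoint. It is not part of the commit and rests on assumptions: that ruT5-base follows the original T5 sentinel convention (`<extra_id_0>`, `<extra_id_1>`, ...) and that the pretrained checkpoint, without fine-tuning, is best probed by filling masked spans. The helper names `mask_spans` and `fill_spans` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def mask_spans(words: List[str], spans: List[Tuple[int, int]]) -> str:
    """Replace each half-open (start, end) word span with a T5-style sentinel.

    Spans must be sorted and non-overlapping. The sentinel naming is the
    original T5 convention and is assumed (not confirmed by the model card)
    to carry over to ruT5-base.
    """
    out: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        out.extend(words[cursor:start])   # keep the words before the span
        out.append(f"<extra_id_{i}>")     # one sentinel per masked span
        cursor = end
    out.extend(words[cursor:])            # keep the words after the last span
    return " ".join(out)


def fill_spans(
    text: str,
    model_name: str = "sberbank-ai/ruT5-base",
    max_new_tokens: int = 20,
) -> str:
    """Ask the checkpoint to generate fillers for the sentinels in `text`."""
    # Imported lazily so mask_spans stays usable without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep special tokens so the sentinels that delimit each filler stay visible.
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)
```

A call such as `fill_spans(mask_spans("Москва столица России".split(), [(1, 2)]))` downloads the weights on first use. The generated text depends on the checkpoint and decoding settings, so no output is shown here. For a downstream task, the model would normally be fine-tuned first.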