---
license: mit
---

# ruGPT-Neo 1.3B

## Model Description

ruGPT-Neo 1.3B is a transformer model designed using EleutherAI's replication of the GPT-3 architecture. ruGPT-Neo refers to the class of models, while 1.3B represents the number of parameters of this particular pre-trained model.

## Training procedure

This model was trained on Wikipedia and the Gazeta summarization dataset for 38k steps on 4 × V100 GPUs; training is still in progress. It was trained as a masked autoregressive language model, using cross-entropy loss.
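The objective above (next-token prediction with cross-entropy loss) can be sketched in plain PyTorch. This is an illustrative toy, not the actual training code: the layer sizes are made up, and a single embedding plus linear head stands in for the full transformer stack.

```python
import torch
import torch.nn as nn

# Toy sizes for illustration only -- not the 1.3B configuration.
vocab_size, hidden, seq_len, batch = 100, 32, 16, 4

embed = nn.Embedding(vocab_size, hidden)
lm_head = nn.Linear(hidden, vocab_size)
loss_fn = nn.CrossEntropyLoss()

# Random token ids stand in for a tokenized training batch.
tokens = torch.randint(0, vocab_size, (batch, seq_len))
hidden_states = embed(tokens)    # stand-in for the transformer stack
logits = lm_head(hidden_states)  # (batch, seq_len, vocab_size)

# Shift so position t predicts token t+1 (the autoregressive objective).
shift_logits = logits[:, :-1, :].reshape(-1, vocab_size)
shift_labels = tokens[:, 1:].reshape(-1)
loss = loss_fn(shift_logits, shift_labels)
print(loss.item())
```

Each optimizer step minimizes this loss over a batch; with random weights it starts near `log(vocab_size)`.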

## Intended Use and Limitations

In this way, the model learns an inner representation of the Russian language that can then be used to extract features useful for downstream tasks. The model is best at what it was pretrained for, however: generating texts from a prompt.

### How to use

You can use this model directly with a pipeline for text generation. This example generates a different sequence each time it's run:

```py
>>> from transformers import pipeline
>>> generator = pipeline('text-generation', model='AlexWortega/rugpt-neo-1.3b')
>>> generator("EleutherAI has", do_sample=True, min_length=50)

[{'generated_text': 'EleutherAI has made a commitment to create new software packages for each of its major clients and has'}]
```
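The `do_sample=True` argument makes generation stochastic: at each step the next token is drawn from the softmax distribution over the logits instead of being picked greedily, which is why the pipeline returns a different sequence on each run. A minimal sketch of one such sampling step, using toy logits rather than actual model output:

```python
import torch

# Toy logits for a 4-token vocabulary -- illustrative values only.
torch.manual_seed(0)
logits = torch.tensor([1.0, 2.0, 0.5, 3.0])

# Convert logits to a probability distribution over the vocabulary.
probs = torch.softmax(logits, dim=-1)

# Draw the next token id from that distribution (the stochastic step
# that do_sample=True enables; greedy decoding would take argmax).
next_token = torch.multinomial(probs, num_samples=1).item()
print(next_token)
```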