
ruGPT-Neo 1.3B [IN TRAINING, 100k/2M, NOT A FINAL CHECKPOINT]

Model Description

ruGPT-Neo 1.3B is a transformer model designed using EleutherAI's replication of the GPT-3 architecture. ruGPT-Neo refers to the class of models, while 1.3B represents the number of parameters of this particular pre-trained model.

Training procedure

This model has so far been trained on Wikipedia and the Gazeta summarization dataset for 38k steps on a single V100 GPU; training is still in progress. It is trained as an autoregressive language model (next-token prediction with a causal attention mask), using cross-entropy loss.
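For reference, this objective corresponds to the standard causal-LM loss that the transformers library computes when labels are supplied. The snippet below is a minimal sketch of that loss computation, not the author's actual training script; the sample text is illustrative:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tok = AutoTokenizer.from_pretrained('AlexWortega/rugpt-neo-1.3b')
>>> model = AutoModelForCausalLM.from_pretrained('AlexWortega/rugpt-neo-1.3b')
>>> batch = tok('Пример текста для обучения.', return_tensors='pt')
>>> # With labels equal to input_ids, transformers shifts the targets by one
>>> # position internally and returns the mean next-token cross-entropy.
>>> loss = model(**batch, labels=batch['input_ids']).loss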

How to use

You can use this model directly with a pipeline for text generation. This example generates a different sequence each time it's run:

>>> from transformers import pipeline
>>> generator = pipeline('text-generation', model='AlexWortega/rugpt-neo-1.3b')
>>> generator("Как какать? Ответ:", do_sample=True, min_length=50)

[{'generated_text': 'Как какать? Ответ: Спустите штаны и покакайте, затем воспользуйтесь бумагой'}]

(The Russian prompt and sampled continuation translate roughly as: "How to poop? Answer: pull down your pants and poop, then use paper.")
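The pipeline above wraps the lower-level generate API. An equivalent sketch follows; the generation settings here are illustrative, not values tuned for this model:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tok = AutoTokenizer.from_pretrained('AlexWortega/rugpt-neo-1.3b')
>>> model = AutoModelForCausalLM.from_pretrained('AlexWortega/rugpt-neo-1.3b')
>>> inputs = tok('Как какать? Ответ:', return_tensors='pt')
>>> # Sampled generation; pad_token_id is set to eos to silence the warning
>>> # GPT-style models raise when they have no dedicated pad token.
>>> out = model.generate(**inputs, do_sample=True, min_length=50, max_length=100, pad_token_id=tok.eos_token_id)
>>> print(tok.decode(out[0], skip_special_tokens=True))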
