---
license: apache-2.0
language:
  - ru
  - en
library_name: transformers
---

# RoBERTa-base from deepvk

A pretrained bidirectional encoder for the Russian language.

## Model Details

### Model Description

The model was pretrained with the standard masked language modeling (MLM) objective on a large text corpus, including open social data, books, Wikipedia, web pages, etc.

- **Developed by:** VK Applied Research Team
- **Model type:** RoBERTa
- **Languages:** Mostly Russian, with a small fraction of other languages
- **License:** Apache 2.0
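
Because the model was pretrained with the MLM objective, its masked-language-modeling head can be queried directly. Below is a minimal fill-mask sketch; the example sentence is only illustrative, and the mask token is read from the tokenizer rather than assumed:

```python
from transformers import pipeline

# Fill-mask sketch: predict the masked token in a Russian sentence.
fill_mask = pipeline("fill-mask", model="deepvk/roberta-base")
masked = f"Москва является столицей {fill_mask.tokenizer.mask_token}."  # "Moscow is the capital of <mask>."
print(fill_mask(masked))  # top candidate tokens with scores
```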

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("deepvk/roberta-base")
model = AutoModel.from_pretrained("deepvk/roberta-base")

text = "Привет, мир!"  # "Hello, world!"

# Tokenize and run the encoder; the output contains per-token hidden states.
inputs = tokenizer(text, return_tensors="pt")
predictions = model(**inputs)
```
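
The encoder returns per-token hidden states. If a single sentence vector is needed, one common (though not prescribed here) approach is mean pooling over non-padded tokens:

```python
# Mean-pool the token embeddings, weighting by the attention mask so padding is ignored.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (predictions.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)  # (1, hidden_size)
```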

## Training Details

### Training Data

A mix of the following data sources:

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

Standard RoBERTa-base architecture (12 transformer layers, hidden size 768, 12 attention heads).
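
To confirm the exact configuration of this checkpoint, the config can be read directly from the Hub; the values noted in the comment are the usual RoBERTa-base defaults, not a specification of this model:

```python
from transformers import AutoConfig

# Inspect the checkpoint's actual configuration; expected RoBERTa-base defaults: 12, 768, 12.
config = AutoConfig.from_pretrained("deepvk/roberta-base")
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
```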

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Compute Infrastructure

The model was trained on 8×A100 GPUs for approximately 22 days.
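
At face value, this amounts to roughly 8 GPUs × 22 days × 24 h ≈ 4,200 A100 GPU-hours.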