
This is a tiny Longformer model for the Russian language. It was initialized from the cointegrated/rubert-tiny2 weights and modified to support a context length of up to 16384 tokens. We fine-tuned it on a dataset of Russian books, news, wiki and Habr texts; it still understands English thanks to the source model. For details, see our post on Habr.

Model attributes:

  • 12 attention heads
  • 3 hidden layers
  • context length of 16384 tokens

The model can be used as-is to produce text embeddings or it can be further fine-tuned for a specific downstream task.

Text embeddings can be produced as follows:

# pip install transformers sentencepiece
import torch
from transformers import LongformerModel, LongformerTokenizerFast

model = LongformerModel.from_pretrained('kazzand/ru-longformer-tiny-16384')
tokenizer = LongformerTokenizerFast.from_pretrained('kazzand/ru-longformer-tiny-16384')

def get_cls_embedding(text, model, tokenizer, device='cuda'):
    model.to(device)
    batch = tokenizer(text, return_tensors='pt')

    # set global attention on the CLS token only (1 = global, 0 = local)
    batch["global_attention_mask"] = (
        batch["input_ids"] == tokenizer.cls_token_id
    ).long()

    with torch.no_grad():
        output = model(**batch.to(device))
    # the CLS token sits at position 0 of each sequence
    return output.last_hidden_state[:, 0, :]
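
To make the global attention step in get_cls_embedding more concrete, here is a minimal pure-Python sketch of the same masking logic on toy data. The helper name build_global_attention_mask and the token IDs are hypothetical and chosen only for illustration; in real use the CLS ID comes from tokenizer.cls_token_id.

```python
def build_global_attention_mask(batch_input_ids, cls_token_id):
    """Return one 0/1 mask per sequence: 1 where the token is CLS, else 0."""
    return [
        [1 if token_id == cls_token_id else 0 for token_id in input_ids]
        for input_ids in batch_input_ids
    ]


# Hypothetical token IDs for illustration only; 2 stands in for the CLS ID.
toy_batch = [[2, 17, 48, 3], [2, 95, 3, 0]]
toy_mask = build_global_attention_mask(toy_batch, cls_token_id=2)
print(toy_mask)  # [[1, 0, 0, 0], [1, 0, 0, 0]]
```

Only the first position of each sequence gets global attention here, which matches returning the hidden state at index 0 as the text embedding.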

P.S. Thanks to AbstractDL for moral and technical support.
