This is a finetuned version of RuRoBERTa-large for linguistic acceptability classification on the RuCoLA benchmark.

The hyperparameters used for finetuning are listed below; a sketch of the corresponding training setup follows the list.

- 5 training epochs, with early stopping based on validation MCC
- Peak learning rate: 1e-5, with linear warmup for the first 10% of training steps
- Weight decay: 1e-4
- Batch size: 32
- Random seed: 5
- Optimizer: `torch.optim.AdamW`
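
For reference, here is a minimal sketch of how these hyperparameters might map onto the Hugging Face `Trainer` API. The model and dataset identifiers (`ai-forever/ruRoberta-large`, `RussianNLP/rucola`), the column names, and the early-stopping patience are assumptions for illustration rather than details taken from this card, and exact `TrainingArguments` argument names vary across `transformers` versions.

```python
import numpy as np
from datasets import load_dataset
from sklearn.metrics import matthews_corrcoef
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Model and dataset identifiers are assumptions for illustration.
MODEL_NAME = "ai-forever/ruRoberta-large"
dataset = load_dataset("RussianNLP/rucola")

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def tokenize(batch):
    # Assumes the text column is named "sentence".
    return tokenizer(batch["sentence"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    # Matthews correlation coefficient, the standard (Ru)CoLA metric.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"mcc": matthews_corrcoef(labels, preds)}

args = TrainingArguments(
    output_dir="ruroberta-rucola",
    num_train_epochs=5,              # 5 training epochs
    learning_rate=1e-5,              # peak learning rate
    lr_scheduler_type="linear",
    warmup_ratio=0.1,                # linear warmup for 10% of training steps
    weight_decay=1e-4,
    per_device_train_batch_size=32,  # batch size 32
    seed=5,                          # random seed 5
    optim="adamw_torch",             # torch.optim.AdamW
    eval_strategy="epoch",           # "evaluation_strategy" in older versions
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="mcc",     # early stopping tracks validation MCC
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    # Patience is an assumption; the card only states that early stopping was used.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```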