Greek Longformer

A Greek version of the Longformer Language Model.

This is a Greek Longformer model trained from scratch. It uses the configuration of allenai/longformer-base-4096 and was trained on the combined Greek Wikipedia and the Greek part of OSCAR. It achieves the following results on the evaluation set:

  • Loss: 1.1080
  • Accuracy: 0.7765
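For intuition, the reported loss can be converted to a perplexity. The sketch below assumes the loss is a mean cross-entropy in nats over the evaluated tokens, which this card does not state explicitly; the function name is illustrative only.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean cross-entropy (assumed to be in nats) to a perplexity."""
    return math.exp(mean_cross_entropy)


# Evaluation loss as reported in this card.
eval_loss = 1.1080
eval_perplexity = perplexity_from_loss(eval_loss)
print(f"perplexity ~ {eval_perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out at roughly 3.03.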

Pre-training corpora

The pre-training corpora of greek-longformer-base-4096 include:

  • The Greek Wikipedia
  • The Greek part of OSCAR

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 6.0
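The values above can be written as keyword arguments for the Hugging Face `TrainingArguments` class, roughly as sketched below. The card does not show the training script, so this mapping is an assumption, as is reading `train_batch_size` as a per-device value. The helper functions, the single-device setting, and the total step count of 1000 are hypothetical and serve only to illustrate how the total batch size and the linear schedule with warmup behave.

```python
import math

# Keyword names follow transformers.TrainingArguments; values come from the list above.
# They could be passed as TrainingArguments(output_dir=..., **training_kwargs).
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,
    "num_train_epochs": 6.0,
}


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, peak_lr: float = 5e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["gradient_accumulation_steps"],
)
print(total_batch)

# Hypothetical run of 1000 optimizer steps: the first 100 are warmup.
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, total_steps=1000))
```

With a per-device batch of 8 and 8 accumulation steps, a single device gives the total train batch size of 64 listed above.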

Training results

Framework versions

  • Transformers 4.28.0.dev0
  • Pytorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.2

Citing & Authors

The model was officially released with the article "From Pre-training to Meta-Learning: A journey in Low-Resource-Language Representation Learning" by Dimitrios Zaikis and Ioannis Vlahavas, published in IEEE Access.

If you use the model, please cite the following:


@ARTICLE{10288436,
    author =  {Zaikis, Dimitrios and Vlahavas, Ioannis},
    journal = {IEEE Access},
    title =   {From Pre-training to Meta-Learning: A journey in Low-Resource-Language Representation Learning},
    year =    {2023},
    volume =  {},
    number =  {},
    pages =   {1-1},
    doi =     {10.1109/ACCESS.2023.3326337}
}

Evaluation results

  • Accuracy on dataset/wiki_oscar_combined_normalized_uncased (self-reported): 0.777