FINE-TUNED-VIQUAD-HGF

This model is a fine-tuned version of bhavikardeshna/xlm-roberta-base-vietnamese on the UIT-ViQuAD dataset.

Model description

The model is described in the paper "Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages".

Training and evaluation data

UIT-ViQuAD is a dataset for evaluating machine reading comprehension (MRC) models on Vietnamese, a low-resource language. It comprises over 23,000 human-generated question-answer pairs based on 5,109 passages from 174 Vietnamese Wikipedia articles. During preprocessing, I removed more than 3,000 questions that have no answer.
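The filtering step can be sketched roughly as follows. This is a minimal illustration, not the preprocessing script actually used: it assumes SQuAD-style records with an `answers` field holding a list of answer texts, and the function name `drop_unanswerable` and the sample records are hypothetical.

```python
def drop_unanswerable(examples):
    """Keep only examples that have at least one non-empty answer text.

    Assumes SQuAD-style records: {"answers": {"text": [...], "answer_start": [...]}}.
    """
    kept = []
    for ex in examples:
        texts = ex.get("answers", {}).get("text", [])
        if any(t.strip() for t in texts):
            kept.append(ex)
    return kept


# Hypothetical records, for illustration only.
sample = [
    {
        "id": "q1",
        "question": "Thủ đô của Việt Nam là gì?",
        "answers": {"text": ["Hà Nội"], "answer_start": [0]},
    },
    {
        "id": "q2",
        "question": "Câu hỏi không có câu trả lời?",
        "answers": {"text": [], "answer_start": []},
    },
]

answerable = drop_unanswerable(sample)
print([ex["id"] for ex in answerable])
```

Only the record with a non-empty answer list survives the filter.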

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
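
To make the `lr_scheduler_type: linear` setting concrete, here is a small sketch of a linear learning-rate schedule in plain Python. It assumes no warmup, since no warmup setting is listed above, and the total number of steps is a placeholder because the card does not state it. The helper name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate at a given optimizer step under a linear schedule.

    Rises linearly from 0 to base_lr over warmup_steps (assumed 0 here),
    then decays linearly to 0 at total_steps.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps is a placeholder: it depends on dataset size, batch size (32)
# and the number of epochs (3), which together are not given as a step count.
total_steps = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate starts at 2e-05 and reaches zero at the final step.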

Training results

  • EM: 52.38
  • F1-SCORE: 77.67
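
For readers unfamiliar with these metrics, the sketch below shows how exact match (EM) and token-level F1 are commonly computed in SQuAD-style evaluation. It is an assumption that the scores above follow this convention; the card does not name the evaluation script. The normalization here lowercases, strips ASCII punctuation and collapses whitespace, and it omits the English article removal used in the original SQuAD script because that step does not apply to Vietnamese.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, remove ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Hà Nội.", "hà nội"))
print(f1_score("thủ đô Hà Nội", "Hà Nội"))
```

A prediction that contains the reference plus extra words fails exact match but still earns partial F1 credit, which is why F1 is typically higher than EM.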

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.10.1
  • Tokenizers 0.13.2