LernnaviBERT Model Card

LernnaviBERT is a fine-tuned version of German BERT, trained on educational text data from the Lernnavi Intelligent Tutoring System (ITS). It is trained with the masked language modeling objective, following the BERT training scheme.

Direct Use

Because it is a fine-tuned version of a base BERT model, LernnaviBERT is suitable for the usual BERT use cases, especially on German-language text from the educational domain.
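
As a minimal sketch of masked-token prediction, assuming the checkpoint is available on the Hugging Face Hub under the identifier epfl-ml4ed/LernnaviBERT and that the transformers library is installed, the standard fill-mask pipeline could be used as follows. The helper names load_fill_mask and top_tokens are hypothetical and chosen only for illustration.

```python
MODEL_ID = "epfl-ml4ed/LernnaviBERT"  # assumed Hub identifier for this checkpoint


def load_fill_mask(model_id=MODEL_ID):
    """Build a fill-mask pipeline (downloads the weights on first use)."""
    # Imported lazily so that top_tokens below has no hard dependency.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_tokens(fill_mask, text, k=3):
    """Return the k best (token, score) pairs for the single [MASK] in text."""
    predictions = fill_mask(text, top_k=k)
    return [(p["token_str"], round(p["score"], 4)) for p in predictions]
```

For example, top_tokens(load_fill_mask(), "Die Hauptstadt von Deutschland ist [MASK].") would return the three most likely fillers for the masked position. The sentence is a made-up example, not taken from Lernnavi data.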

Downstream Use

As described in https://arxiv.org/abs/2405.20079, LernnaviBERT has been further fine-tuned for multiple-choice question (MCQ) answering and Student Answer Forecasting (for example, in the MCQStudentBertCat and MCQStudentBertSum models).

Training Details

The model was trained on text data from Lernnavi, a real-world ITS. Training used ~40k text pieces for 3 epochs with a batch size of 16, and reduced the perplexity on Lernnavi data from 1.21 initially to 1.01 at the end of training.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
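
The list above can be sketched as a Hugging Face Trainer configuration. This is an illustrative mapping, not the authors' training script. It assumes that the batch sizes are per-device values on a single device, that "Native AMP" corresponds to fp16=True, and that the listed Adam settings map onto the Trainer's adam_* arguments (the Trainer's default optimizer is AdamW). The output directory and the masking probability of 0.15 (the usual BERT default) are assumptions that the card does not state.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # assumed equivalent of "Native AMP"; needs a CUDA device
}


def build_training_args(output_dir="lernnavibert-mlm"):
    """Create TrainingArguments; output_dir is a hypothetical path."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


def build_mlm_collator(tokenizer, mlm_probability=0.15):
    """Create a masked-language-modeling collator (0.15 is an assumed rate)."""
    from transformers import DataCollatorForLanguageModeling

    return DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=mlm_probability
    )
```

Both objects would then be passed to a Trainer together with a masked-language-modeling model and the tokenized Lernnavi text, neither of which is shown here.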

Training results

Training Loss   Epoch   Step   Validation Loss
0.0385          1.0     2405   0.0137
0.0142          2.0     4810   0.0084
0.0096          3.0     7215   0.0072
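
The results above can be related to the reported perplexity and dataset size with a short calculation. This sketch assumes that perplexity is the exponential of the mean cross-entropy loss, that training ran on a single device, and that no gradient accumulation was used. The card does not state any of these.

```python
import math

# Values copied from the training results above.
VALIDATION_LOSS = {1: 0.0137, 2: 0.0084, 3: 0.0072}
STEPS_AT_EPOCH_END = {1: 2405, 2: 4810, 3: 7215}
BATCH_SIZE = 16


def perplexity(mean_cross_entropy_loss):
    """Perplexity as exp(loss), the usual convention for language models."""
    return math.exp(mean_cross_entropy_loss)


def perplexity_per_epoch():
    """Return (epoch, perplexity) pairs, rounded to four decimals."""
    return [(epoch, round(perplexity(loss), 4))
            for epoch, loss in VALIDATION_LOSS.items()]


# Upper bound on training examples per epoch under the assumptions above.
steps_per_epoch = STEPS_AT_EPOCH_END[1]
max_examples_per_epoch = steps_per_epoch * BATCH_SIZE
```

Under these assumptions, the final validation loss of 0.0072 gives a perplexity of about 1.007, which rounds to the 1.01 reported in the training details. Likewise, 2405 steps per epoch at a batch size of 16 gives at most 38,480 examples, in line with the ~40k text pieces.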

Citation

If you find this model useful in your work, please cite our paper:

@misc{gado2024student,
      title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning}, 
      author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser},
      year={2024},
      eprint={2405.20079},
      archivePrefix={arXiv},
}

Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., Käser, T. (2024). Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning. In: Proceedings of the Conference on Educational Data Mining (EDM 2024).

Framework versions

  • Transformers 4.37.1
  • PyTorch 2.2.0
  • Datasets 2.2.1
  • Tokenizers 0.15.1

Model size: 110M params (Safetensors, tensor type F32)