---
license: apache-2.0
datasets:
- ehovy/race
language:
- en
base_model:
- google-t5/t5-base
---
# Model Card for t5-base-rc-feedback (220M parameters)

## Description
This model was trained to respond to incorrect student answers in an interactive reading comprehension exercise setting. Incorrect student answers can become valuable learning opportunities, provided that the student understands where they went wrong and why. To this end, rather than being given the correct answer, students should receive elaborated feedback that helps them correct the mistake on their own. Because generating such feedback places complex demands on a model's ability to use its input, we proposed two extensions to the training pipeline. First, we employed a KL regularization term between a standard and an enriched input format to obtain more targeted input representations. Second, we added a preference optimization step to encourage feedback generation that adapts to the student's answer.
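The snippet below is a minimal sketch of the first extension, assuming a standard seq2seq cross-entropy objective: it regularizes the output distribution obtained from the standard input toward the one obtained from the enriched input. The input construction, the stop-gradient on the enriched pass, and the `kl_weight` hyperparameter are illustrative assumptions, not the exact implementation from the paper.

```python
# Sketch of a KL regularization term between a standard and an enriched
# input format. Details (inputs, gradient flow, weighting) are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

def kl_regularized_loss(standard_input: str, enriched_input: str,
                        feedback: str, kl_weight: float = 1.0) -> torch.Tensor:
    labels = tokenizer(feedback, return_tensors="pt").input_ids
    std = tokenizer(standard_input, return_tensors="pt")
    enr = tokenizer(enriched_input, return_tensors="pt")

    # Forward pass on the standard input; its cross-entropy is the base loss.
    std_out = model(**std, labels=labels)
    # Forward pass on the enriched input, used here as a fixed reference
    # distribution (whether it is detached is an assumption).
    with torch.no_grad():
        enr_out = model(**enr, labels=labels)

    # KL term pulling the standard-input distribution toward the
    # enriched-input one, computed over the decoder's token distributions.
    kl = F.kl_div(
        F.log_softmax(std_out.logits, dim=-1),
        F.log_softmax(enr_out.logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    return std_out.loss + kl_weight * kl
```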
## Automatic Evaluation Results
The final model was trained and evaluated on all feedback turns from the DIRECT and DIRECT-Feedback datasets, partially available at https://github.com/DIRECTDataset/DIRECTFeedback/blob/main/data/feedback_data_partial.csv
| BLEU | METEOR | ROUGE | BERTScore |
|---|---|---|---|
| 6.9 | 21.7 | 21.4 | 19.0 |
For additional details, we refer the reader to our paper.
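As a rough pointer for reproducing such scores, the sketch below computes the four metrics with the Hugging Face `evaluate` library. The specific metric variants reported above (e.g. which ROUGE score, how BERTScore is scaled) are assumptions here; consult the paper for the exact evaluation setup.

```python
# Sketch: scoring generated feedback against references with `evaluate`.
# Metric variants and configurations are assumptions, not the paper's setup.
import evaluate

predictions = ["Look again at the second paragraph of the passage."]
references = ["Re-read paragraph two before answering."]

bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
meteor = evaluate.load("meteor").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bertscore = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"
)

print(bleu["bleu"], meteor["meteor"], rouge["rougeL"], bertscore["f1"])
```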
## Manual Evaluation Results

We sampled 250 items from the combined DIRECT+DIRECT-F feedback set and had one of the authors of this paper manually evaluate the generated feedback.
| appropriate (verification, explanation, and hint feedback) | direct (correction feedback) | irrelevant or ambiguous | unfaithful (contradicting the passage or alluding to an incorrect answer) |
|---|---|---|---|
| 43.6% | 23.6% | 22.0% | 10.8% |
## Execution
Code and instructions on how to perform inference on the model are provided at https://github.com/DIRECTDataset/DIRECTFeedback
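For a quick start, a minimal inference sketch with `transformers` is shown below. The Hub id placeholder and the prompt layout are illustrative assumptions; the repository linked above defines the exact input format the model expects.

```python
# Minimal inference sketch; model id and prompt layout are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "t5-base-rc-feedback"  # replace with this model's full Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input layout: passage, question, and the student's answer.
prompt = (
    "passage: <reading passage> "
    "question: <comprehension question> "
    "student answer: <incorrect answer>"
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```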
## Citation

Liermann, W., Huang, J., Lee, Y., & Lee, K. (2024, November). More Insightful Feedback for Tutoring: Enhancing Generation Mechanisms and Automatic Evaluation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.