AI & ML interests

Question Generation, Evaluation

Dataset

A benchmark for multi-dimensional evaluation of question generation (QG). It consists of 200 instances drawn from SQuAD and HotpotQA. Each instance contains 15 questions, one generated by each of 15 different QG models, for 3,000 questions in total.

Evaluation dimensions:

  • fluency
  • clarity
  • conciseness
  • relevance
  • consistency
  • answerability
  • answer consistency
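
As a rough illustration of how ratings along these seven dimensions might be organized and summarized per QG model, here is a minimal Python sketch. The class name, field names, and the numeric score values are assumptions made for the example; they are not the benchmark's actual schema or rating scale.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

# The seven evaluation dimensions listed above.
DIMENSIONS = (
    "fluency",
    "clarity",
    "conciseness",
    "relevance",
    "consistency",
    "answerability",
    "answer consistency",
)


@dataclass
class RatedQuestion:
    """One generated question with a score per dimension (hypothetical layout)."""
    model: str                # name of the QG model that produced the question
    question: str             # the generated question text
    scores: Dict[str, float]  # dimension name -> score (scale is an assumption)


def mean_scores_by_model(questions: List[RatedQuestion]) -> Dict[str, Dict[str, float]]:
    """Average each dimension's score over all questions from the same model."""
    grouped: Dict[str, List[RatedQuestion]] = {}
    for q in questions:
        grouped.setdefault(q.model, []).append(q)
    return {
        model: {dim: mean(q.scores[dim] for q in qs) for dim in DIMENSIONS}
        for model, qs in grouped.items()
    }
```

Grouping by model before averaging makes it easy to compare the 15 QG models dimension by dimension.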

Models

Trained QG models used to generate the questions that are evaluated.