Paper accepted at the READI workshop at LREC-COLING 2024: "Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models"

  • We developed a new method for evaluating the quality of multiple-choice reading comprehension test items.
  • The method supports both human evaluation and automatic evaluation with large language models.
  • We used the method to evaluate items generated by Llama 2 and GPT-4 in a zero-shot setting.
  • The results show that the method is effective, and that the quality of the generated items is still limited, especially for Llama 2.

The paper was accepted for a poster presentation at the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) at LREC-COLING in Turin (Italy) on May 20, 2024.

Read the paper here.