Abstract
Purpose :
Measuring reading performance using reading charts requires highly standardized reading material. Because such material is hard to produce, most charts are currently limited by their relatively small number of texts or sentences. Using the MNREAD chart as a proof of concept, we propose a new automated method that generates highly constrained short sentences following the strict MNREAD rules and that can be applied to different languages.
Methods :
Our approach is based on Multi-valued Decision Diagrams (MDD), a data structure widely used in the field of constraint programming to solve optimization problems. First, we segment a large text corpus into n-grams (sequences of n words) and filter out the n-grams that are not allowed by the lexicon and grammar imposed by the MNREAD rules. Then we compute an MDD by linking allowed n-grams with each other while satisfying the MNREAD constraints (length, display). In this way, we generate under constraints all possible MNREAD sentences that can be built from a given corpus. Finally, sentences are ranked from ‘good’ to ‘poor’ using a transformer-based language model (GPT-2).
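The segmentation, filtering and linking steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy English corpus, the lexicon, the function names and the simplified constraints (a fixed word count and a character-length window) are all assumptions made for illustration, and the real MNREAD rules are stricter. Nodes are keyed by the last n-1 words and the characters used so far, so partial sentences that reach the same state share one node, which is the merging idea behind an MDD. The sketch assumes n of at least 2.

```python
from collections import defaultdict


def allowed_ngrams(sentences, lexicon, n):
    """Segment sentences into n-grams and keep those using only lexicon words."""
    kept = set()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if all(word in lexicon for word in gram):
                kept.add(gram)
    return kept


def build_layers(grams, n, n_words, max_chars):
    """Build a layered graph; each node is (last n-1 words, characters used).

    Every node stores its incoming arcs as (previous node, words added).
    Hypothetical simplification: only a maximum length is enforced here.
    """
    successors = defaultdict(set)
    for gram in grams:
        successors[gram[:-1]].add(gram[-1])

    # First layer: one arc per allowed n-gram, carrying the whole n-gram.
    first = defaultdict(list)
    for gram in grams:
        chars = len(" ".join(gram))
        if chars <= max_chars:
            first[(gram[1:], chars)].append((None, gram))
    layers = [first]

    # Each further layer appends one word through an overlapping n-gram.
    for _ in range(n_words - n):
        nxt = defaultdict(list)
        for context, chars in layers[-1]:
            for word in successors.get(context, ()):
                total = chars + 1 + len(word)  # +1 for the separating space
                if total <= max_chars:
                    state = (context[1:] + (word,), total)
                    nxt[state].append(((context, chars), (word,)))
        layers.append(nxt)
    return layers


def enumerate_sentences(layers, min_chars):
    """List every sentence encoded by the layers that is long enough."""
    def paths(depth, state):
        for prev, words in layers[depth][state]:
            if prev is None:
                yield words
            else:
                for head in paths(depth - 1, prev):
                    yield head + words

    last = len(layers) - 1
    found = set()
    for state in layers[last]:
        if state[1] >= min_chars:
            for words in paths(last, state):
                found.add(" ".join(words))
    return sorted(found)


# Toy data (hypothetical): "a" is not in the lexicon, so its n-grams are dropped.
corpus = ["the cat sat on the mat", "the dog sat on the rug", "a cat sat on the rug"]
lexicon = {"the", "cat", "dog", "sat", "on", "mat", "rug"}
grams = allowed_ngrams(corpus, lexicon, n=3)
layers = build_layers(grams, n=3, n_words=6, max_chars=24)
generated = enumerate_sentences(layers, min_chars=20)
```

With this toy input, `generated` contains sentences that recombine corpus n-grams, including some that never appear in the corpus itself. A language-model score could then be used to order them, as described above.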
To validate our method, we asked 14 normally sighted participants (aged 14 to 56 years) to read 3 sets of French sentences: 30 MNREAD sentences, 30 ‘good’ generated sentences and 30 ‘poor’ generated sentences. All sentences were displayed at 40 cm, in regular polarity, with a fixed print size of 1.3 logMAR. Corrected reading speed was measured in words/min (wpm) and analyzed using a mixed-effects model.
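The abstract does not state how reading speed was corrected. The sketch below assumes the usual MNREAD scoring convention, in which each sentence counts as 10 standard-length words and misread words are subtracted before dividing by the reading time; the function name and parameters are hypothetical.

```python
def corrected_reading_speed(time_s, errors, words_per_sentence=10):
    """Return reading speed in words/min, corrected for misread words.

    Assumed formula (standard MNREAD scoring, not stated in the abstract):
        wpm = 60 * (words_per_sentence - errors) / time_s
    """
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    correct_words = max(words_per_sentence - errors, 0)
    return 60.0 * correct_words / time_s


# Example: a sentence read in 4 s without errors, then with one misread word.
speed_no_error = corrected_reading_speed(4.0, 0)
speed_one_error = corrected_reading_speed(4.0, 1)
```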
Results :
On average, ‘good’ generated sentences were read at 164 wpm. This value was not significantly different from the reading speed of 162 wpm yielded by MNREAD sentences (p=0.5). In contrast, reading speed was significantly slower for ‘poor’ generated sentences, with an average value of 151 wpm (p<0.001).
Conclusions :
Our method seems to provide valid standardized sentences that follow the MNREAD rules and yield similar reading performance, at least in French. Since our method is easily applicable to other languages, further investigation is needed to validate the generation of sentences in English, Spanish, Italian, etc.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.