ETDD70: Eye-Tracking Dyslexia Dataset

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://zenodo.org/record/13332133

Description:

The ETDD70 dataset comprises eye-tracking recordings from 70 Czech participants, equally divided into dyslexic and non-dyslexic readers, all aged 9–10 years. The dataset captures eye movements during three text-reading tasks in Czech: syllable reading (Task 1), meaningful text reading (Task 4), and pseudo-text reading (Task 5).

This dataset is the result of the project “Diagnostics of Dyslexia Using Eye-Tracking and Artificial Intelligence” conducted by our research team. The project aims to leverage artificial intelligence tools and advanced technical equipment (eye tracking) to more effectively diagnose dyslexia, one of the most common specific learning disorders, and thereby significantly improve re-education strategies for dyslexic students. The primary goal is to develop models that accurately distinguish between dyslexic and non-dyslexic readers based on eye movement patterns recorded during these tasks.

Data collection took place between October 2022 and August 2023, adhering to ethical standards. The project was approved by the Research Ethics Committee of Masaryk University in Brno, Czech Republic.

Please contact us if you have any questions or feedback at nicol.dostalova@mail.muni.cz or at svaricek@phil.muni.cz. The ETDD70 dataset is freely available for research purposes.

PARTICIPANTS

The eye-tracking data were captured from 70 participants: 35 dyslexic and 35 non-dyslexic readers. In all cases, participants are elementary school pupils aged 9–10 years (i.e., 4th grade of elementary school). Recruitment of suitable participants was conducted in cooperation with a psychological counseling center, which facilitated the recruitment of pupils diagnosed with dyslexia. The non-dyslexic readers, who showed no symptoms of dyslexia, were recruited in cooperation with the counseling facilities of selected elementary schools. The dataset was collected from October 2022 to August 2023.
The legal representatives of all participants were properly informed about the research procedure and agreed to participate in the study, for which they subsequently received compensation.

STIMULI

We designed three verbal tasks based on standardized paper-based dyslexia diagnostics used in the Czech Republic. These source texts were transferred to a digital version in a controlled form (e.g., amount of text, font size, line spacing, background color) to meet the requirements of eye-tracking measurements.

The Syllables task contains 90 syllables arranged in a 9 × 10 matrix. The syllables are commonly encountered in the Czech language. The individual rows of syllables were categorized according to syllable composition (based on linguistic aspects) as follows: open syllables with no meaning, i.e., consonant + vowel (e.g., "ta," "na"); closed syllables with a central vowel bearing a meaning, i.e., consonant + vowel + consonant (e.g., "suk," "mák"); meaningless syllables consisting of two consonants (e.g., "vl," "bz"); meaningless syllables formed by a cluster of two consonants ending in a vowel (e.g., "tle," "mra"); and finally meaningful syllables formed by a cluster of three consonants with one vowel in the 3rd position (e.g., "mrak," "vlak"). All syllables were presented in black Times New Roman font on a gray background. The objective of the task is to read aloud all syllables from left to right and from top to bottom. A fixation cross was placed in the lower right corner for gaze-contingent task closure: when the participant looks at this cross, the recording is automatically terminated.

The MeaningfulText task consists of a passage about a young boy who watches a squirrel from his window. This text is intended for elementary school readers in grades 3 and 4. The stimulus text contains a total of seven text lines with six logical sentences.
The text is again written in black font with double line spacing on a gray background, with the fixation cross in the lower right corner. The aim of the task is to read the entire text aloud.

The PseudoText task comprises fictional, meaningless words. This text has a total of seven lines with 15 artificial sentences. The text formatting, as well as the ending fixation cross, are the same as in the MeaningfulText task. The objective of the task is to read the entire text aloud as smoothly as possible.

EYE-TRACKING FEATURES

The raw eye-tracking data recorded for each task were further processed to extract event-based characteristics: fixations, saccades, and dozens of derived statistical characteristics. The fixations were detected using the I2MC algorithm (Hessels et al., 2017), as it was specifically designed to be noise-robust for measurements in children. The minimum fixation duration was set to 40 ms.

The derived characteristics provide additional information about how participants interact with text. These characteristics are divided into whole-task and region-of-interest (ROI) characteristics. Whole-task characteristics describe behavior at the global level of the whole screen, whereas ROI characteristics describe it at the local level of a small rectangular area.

Feature-based characteristics for each task:

Syllables: first fixation duration, average fixation duration, number of fixations, number of fixations and saccades without the incoming/outgoing saccade, number of revisits (incoming saccades hitting this ROI from outside).

MeaningfulText, PseudoText:
Whole-task (features extracted from the whole trial): number of regressions, ratio of progressive to regressive saccades, average saccadic amplitude, total reading duration, average fixation duration, number of fixations.
ROI (features extracted for separate regions of interest, i.e., lines and words): average fixation duration, number of fixations, number of revisits (incoming saccades hitting this ROI from outside), landing position of the first fixation.

AI CLASSIFICATION APPROACH

The AI-based methods used for the classification of dyslexia are available here: https://gitlab.fi.muni.cz/xsedmid/dyslex

CITE THIS DATASET

Dostalova, N., Svaricek, R., Sedmidubsky, J., Culemann, W., Sasinka, C., Zezula, P., & Cenek, J. (2024). ETDD70: Eye-tracking Dyslexia Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13332134

CITE THE ASSOCIATED PAPER

Sedmidubsky, J., Dostalova, N., Svaricek, R., & Culemann, W. (2024). ETDD70: Eye-tracking dataset for classification of dyslexia using AI-based methods. In Proceedings of the 17th International Conference on Similarity Search and Applications (SISAP) (pp. 1-14). Springer.
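
EXAMPLE: COMPUTING FEATURES FROM FIXATIONS

To illustrate the whole-task and ROI characteristics listed under EYE-TRACKING FEATURES, here is a minimal Python sketch that derives a few of them from an ordered list of fixations. It is not the authors' pipeline and does not reflect the dataset's file format. The `Fixation` class, the function names, the pixel units, the `line_height` threshold, and the rules for what counts as a regression or a revisit are all assumptions made for illustration; the dataset's own definitions may differ.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Fixation:  # hypothetical container, not the dataset's schema
    x: float         # horizontal position in pixels (assumed unit)
    y: float         # vertical position in pixels (assumed unit)
    duration: float  # fixation duration in milliseconds


def whole_task_features(fixations, line_height=60.0):
    """Simplified whole-task features from an ordered fixation list.

    Assumptions: a saccade is the jump between two consecutive fixations.
    On the same text line (vertical shift below line_height / 2) a leftward
    jump is regressive and a rightward jump is progressive. Jumps between
    lines (e.g. return sweeps) count as neither.
    """
    progressive = regressive = 0
    amplitudes = []
    for prev, cur in zip(fixations, fixations[1:]):
        dx, dy = cur.x - prev.x, cur.y - prev.y
        amplitudes.append(hypot(dx, dy))
        if abs(dy) < line_height / 2:
            if dx < 0:
                regressive += 1
            elif dx > 0:
                progressive += 1
    n = len(fixations)
    return {
        "n_fixations": n,
        "avg_fixation_duration": sum(f.duration for f in fixations) / n if n else None,
        "n_regressions": regressive,
        "progressive_to_regressive_ratio": progressive / regressive if regressive else None,
        "avg_saccadic_amplitude": sum(amplitudes) / len(amplitudes) if amplitudes else None,
    }


def roi_features(fixations, roi):
    """Simplified ROI features for a rectangle roi = (left, top, right, bottom).

    Assumption: a visit is a run of consecutive fixations inside the ROI,
    and every visit after the first one counts as a revisit.
    """
    left, top, right, bottom = roi
    inside = [left <= f.x <= right and top <= f.y <= bottom for f in fixations]
    hits = [f for f, flag in zip(fixations, inside) if flag]
    visits = sum(
        1 for i, flag in enumerate(inside) if flag and (i == 0 or not inside[i - 1])
    )
    return {
        "n_fixations": len(hits),
        "first_fixation_duration": hits[0].duration if hits else None,
        "avg_fixation_duration": sum(f.duration for f in hits) / len(hits) if hits else None,
        "n_revisits": max(visits - 1, 0),
    }


# Usage with made-up fixations: two lines of text, one regression on the first.
example = [
    Fixation(100, 100, 200),
    Fixation(200, 100, 180),
    Fixation(150, 102, 150),  # leftward jump on the same line
    Fixation(300, 100, 220),
    Fixation(100, 160, 210),  # jump to the next line, neither kind
]
print(whole_task_features(example))
print(roi_features(example, roi=(80, 80, 160, 120)))
```

Keeping the regression and revisit rules in small, separate functions makes it easy to swap in the definitions used by the dataset authors once they are confirmed against the published code.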

Created:

2024-09-05
