MultiplEYE Data Collection Metadata Form

    1. Enter the title for your data collection

    To ensure consistency within the COST Action Project “MultiplEYE”, it is essential that the name of your data collection/dataset follows the MultiplEYE naming convention and is consistently applied throughout. The name is composed of the term “MultiplEYE”, the tested language (ISO-639-1, 2-letter language code), the name of your country (ISO-3166, 2-letter country code), the name of your city, your identifier, and the year when your data collection will end. The name should have already been generated at pre-registration. The entire name MUST be identical to the one that has been pre-registered.

    MultiplEYE_

    _

    _

    _

    _

    Example: MultiplEYE_DE_DE_Berlin_1_2024
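
The naming convention above can be checked automatically before the form is submitted. The following is only a minimal sketch, not an official MultiplEYE tool: the function name `parse_dataset_name` and the pattern are hypothetical, and the pattern assumes (based solely on the example above) that both codes are written as two uppercase letters, that the city contains no underscore, that the identifier is a number, and that the year has four digits.

```python
import re
from typing import Dict

# Hypothetical pattern derived from the example "MultiplEYE_DE_DE_Berlin_1_2024".
# Assumptions: two uppercase letters per code, no underscore or whitespace in
# the city name, a numeric identifier, and a four-digit end year.
NAME_PATTERN = re.compile(
    r"MultiplEYE_(?P<language>[A-Z]{2})_(?P<country>[A-Z]{2})_"
    r"(?P<city>[^_\s]+)_(?P<identifier>\d+)_(?P<end_year>\d{4})"
)


def parse_dataset_name(name: str) -> Dict[str, str]:
    """Split a data collection name into its parts or raise ValueError."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"Name does not follow the assumed convention: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_dataset_name("MultiplEYE_DE_DE_Berlin_1_2024"))
```

A check like this only confirms the shape of the name. Whether the name is identical to the pre-registered one still has to be compared against the pre-registration itself.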

    2. Enter all person(s) responsible for the research data / for creating the dataset at the collection site

    2.1. Please provide the contact information (i.e., full name and email address of the corresponding contributor at the responsible institution).

    Contact:

    *

    2.2. Provide the name of any person(s) as lead creator(s)/contributor(s) who contributed to your dataset and how they contributed. A lead contributor to a data collection is anyone who is responsible for the data collection at a lab. This includes those who take on central roles in planning, organizing, and executing the data collection process, ensuring the quality and accuracy of the data collected, and providing documentation. Please name all lead contributors specifically from your collection site / lab / institution.

    Note: Add as many names as you need. Use a semicolon to separate the persons. Use full names (titles are omitted) and state the type(s) of contribution.

    Lead Creator(s):

    *

    Responsible researcher; Supervision of the data collection process; Organizational duties; Administrative duties; Translation duties (e.g. translation of stimuli, comprehension questions, instructions etc.); Other type of contribution (please specify below); Not specified

    If you would like to add another lead creator / contributor, please select below:

    Responsible researcher; Supervision of the data collection process; Organizational duties; Administrative duties; Translation duties (e.g. translation of stimuli, comprehension questions, instructions etc.); Other type of contribution (please specify below); Not specified

    If you would like to add another lead creator / contributor, please select below:

    Responsible researcher; Supervision of the data collection process; Organizational duties; Administrative duties; Translation duties (e.g. translation of stimuli, comprehension questions, instructions etc.); Other type of contribution (please specify below); Not specified

    If you would like to add another lead creator / contributor, please select below:

    Responsible researcher; Supervision of the data collection process; Organizational duties; Administrative duties; Translation duties (e.g. translation of stimuli, comprehension questions, instructions etc.); Other type of contribution (please specify below); Not specified

    If you would like to add another lead creator / contributor, please select below:

    Responsible researcher; Supervision of the data collection process; Organizational duties; Administrative duties; Translation duties (e.g. translation of stimuli, comprehension questions, instructions etc.); Other type of contribution (please specify below); Not specified

    2.3. If applicable, provide the name of any person(s) as supporting creator(s) who contributed to your dataset and how they contributed. Supporting contributors to a data collection include students or other supporting staff who assist with data collection and support running the experiment in the lab with participants. They play a supporting role alongside the lead contributors, assisting in various aspects of the data collection process, but do not hold overall responsibility for it.

    Note: Add as many names as you need. Use a semicolon to separate the persons.

    Supporting Creator(s):

    Example: Martin Average, collecting data from participants in the lab; Hailey Miller, participant appointment coordination; …

    3. Please specify the time frame of your data collection

    When did your data collection start? (use: yyyy-mm)

    When did your data collection end? (use: yyyy-mm)

    Example: 2022-11, 2023-10
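
The yyyy-mm format and the order of the two dates can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the helper names `parse_year_month` and `months_covered` are hypothetical, and counting the start and end month inclusively is a choice made here, not a MultiplEYE rule.

```python
import re
from datetime import date

# Four-digit year, a hyphen, and a two-digit month from 01 to 12.
YEAR_MONTH = re.compile(r"(\d{4})-(0[1-9]|1[0-2])")


def parse_year_month(value: str) -> date:
    """Parse a 'yyyy-mm' string into the first day of that month."""
    match = YEAR_MONTH.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Expected yyyy-mm, got {value!r}")
    return date(int(match.group(1)), int(match.group(2)), 1)


def months_covered(start: str, end: str) -> int:
    """Return the number of calendar months from start to end, inclusive."""
    first, last = parse_year_month(start), parse_year_month(end)
    if last < first:
        raise ValueError("The end of the data collection precedes its start")
    return (last.year - first.year) * 12 + (last.month - first.month) + 1


if __name__ == "__main__":
    print(months_covered("2022-11", "2023-10"))
```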

    4. State the location where the study took place

    Note: Complete below: name of institution or lab, city and country

    Location “institution”:

    Example: University of Zurich, Department of Computational Linguistics

    Location “city”:

    Example: Zurich

    Location “country”:

    Example: Switzerland

    Please select below if your setting includes more than one location (i.e., if your data collection occurred at two different labs) and is therefore connected to another data collection registration.
    If applicable, please enter the pre-registered title / ID of the corresponding data collection below.

    Note: If your data collection happens at two different labs, BOTH labs must have pre-registered and will have to complete this form.

    Corresponding data collection name:

    Example: MultiplEYE_DE_DE_Berlin_1_2024

    5. Enter a description of your dataset

    Please specify whether your dataset is part of the core data collection of MultiplEYE or if it pertains to any other additional data collection related to the MultiplEYE core dataset. This includes additional datasets serving different purposes or aimed at investigating supplementary research questions.

    Note: Please see the MultiplEYE data management plan or data collection guidelines for definitions of core and additional data collection. If you have also collected or are planning to collect data for additional datasets (next to this core dataset), please provide further information below, so the core dataset can be linked to the additional dataset(s).

    Please select:


    a. Data collection description: The core data collection for the MultiplEYE COST Action aims at fostering an interdisciplinary network of research groups working on eye-tracking data from reading across multiple languages. For this purpose, the data is collected through eye trackers from adult native speakers in many different countries. The development of such a large multilingual eye-tracking corpus enables researchers to study human language processing from a psycholinguistic perspective as well as to improve and evaluate computational language processing from a machine learning perspective.

    Please indicate here if there are any additional existing datasets that are related to this dataset. If applicable, please enter the name of the related data collection (i.e., “MultiplEYE_languageCode_countryCode_city_identifier_endYearOfDataCollection”):


    Example: MultiplEYE_DE_CH_Zurich_2_2025

    Please specify here how your core dataset is related to the other mentioned dataset.


    Example: This core dataset is related to an additional dataset investigating individual differences in human language processing.

    b. This dataset belongs to an additional data collection which was collected in addition to the core data collection for MultiplEYE. Please provide a short description of your additional data collection.

    The additional dataset has tested / consists of:

    Same participants but different stimuli or different experiment than the MultiplEYE stimuli/experiment; Direct replication of the MultiplEYE experiment (i.e., different participants but same stimuli as MultiplEYE); MultiplEYE experiment was conducted as a pilot study; Different group of participants with core MultiplEYE stimuli (e.g., L2 speakers, elderly participants, children, or participants with dyslexia etc.); Same MultiplEYE stimuli but different procedure (e.g. different stimulus presentation such as a different font, or a different order or number of stimuli presented per session etc.)

    Other, please specify:


    Please state here to which core dataset it is related. If applicable, also provide the name of your related core dataset (i.e., “MultiplEYE_languageCode_countryCode_city_identifier_endYearOfDataCollection”):


    Example: MultiplEYE_DE_CH_Zurich_1_2024

    Please specify here how this dataset is related to the core dataset.

    If your core dataset (which is related to the above mentioned additional dataset) has already been published, insert the link to its publication below:

    Are there any other additional datasets which are related to this dataset? If so, enter their names here (i.e., “MultiplEYE_languageCode_countryCode_city_identifier_endYearOfDataCollection”):

    6. Please provide information about sponsoring agencies, individuals, or contractual arrangements for your study


    This study is part of the COST Action “MultiplEYE” (CA21131) funded by the European Union. COST Actions are research networks supported by the European Cooperation in Science and Technology.


    If your data collection was funded by any funding agency, please add the information below.

    Note: Include the grant ID. If more than one funder/sponsor is involved in your research, please separate the sponsors by a semicolon.

    Example: Institute for Advanced Research (IAR), grant no./funding code: IAR-2024-001; Foundation for Scientific Innovation (FSI), grant no./funding code: FSI-2024-123

    7. Describe the research goals and objectives for your data collection

    Please specify whether the goals and objectives refer to the core or additional data collection by checking one of the following boxes. If you are filling out this form for an additional data collection, check the associated box and describe the research objectives in your own words.

    The following description of the objectives refers to the core dataset: The aim of this data collection/study, within the framework of the COST Action MultiplEYE, is to contribute to the development of a multilingual eye-tracking corpus. Each dataset contributed to the core data collection will be made accessible to collaborators of the COST Action MultiplEYE, as well as to the broader scientific community and the general public. These datasets will serve to investigate research inquiries related to human language processing from a psycholinguistic perspective and to enhance computational language processing techniques using machine learning methodologies. The primary focus of the MultiplEYE core dataset is to investigate reading comprehension across multiple languages based on eye movement measurements.

    The following description of the objectives refers to the additional dataset:

    8. Choose the language that has been tested within your reading experiments / for your data collection


    Note: Use ISO-639-1 for the representation of names of languages.

    8.1. Language code:

    8.2. Select the language family to which the language belongs.

    Other, namely:

    8.3. Select the language script.

    Other, namely:

    9. State your type of research design and describe the methods used

    9.1. If you are filling out this form for a core dataset, please check the following box describing the research design of the MultiplEYE experiment.

    The study, as a part of the MultiplEYE core data collection, uses an experimental research design (i.e., eye-tracking-while-reading experiment).

    9.2. If you are filling out this form for an additional dataset and the research design differs from the one above (for example, if your research design does not include an experimental manipulation, or if it does not represent an experimental research design), please describe it here and select the method you used.

    Have you used a reading experiment for the collection of your additional data?

    Yes; No

    Method used:

    Eye-tracking; Self-paced reading; Web-based eye-tracking; EEG; Elicitation (i.e., text continuation, cloze text); Question answering task

    Other, namely:

    10. Tests and measures

    10.1. Select all your measures collected in your study:

    Recording of eye movements (i.e., horizontal and vertical gaze coordinates) via eye tracker; Responses to the MultiplEYE comprehension questions; Standard MultiplEYE participant questionnaire (i.e., demographic data, text familiarity and difficulty assessment); Psychometric tests (please indicate further in section 10.2.)

    Other measures, namely:

    10.2. State here if you collected any data through psychometric test(s).

    MultiplEYE has selected and implemented six psychometric tests to assess various cognitive capacities. Collecting data through those tests was non-obligatory for the core data collection. Please indicate below if you have collected data from psychometric tests. If yes, select which of the following tests you have conducted.

    Verbal and non-verbal working memory: Lewandowsky et al. Working Memory Capacity (LWMC) battery

    Collected for all participants

    Collected for the following number of participants:

    Rapid automatized naming: RAN task

    Collected for all participants

    Collected for the following number of participants:

    Cognitive control: Stroop task

    Collected for all participants

    Collected for the following number of participants:

    Cognitive control: Flanker task

    Collected for all participants

    Collected for the following number of participants:

    Metalinguistic aptitude: PLAB

    Collected for all participants

    Collected for the following number of participants:

    Vocabulary test: WikiVocab

    Collected for all participants

    Collected for the following number of participants:

    Other psychometric test, namely:

    Collected for all participants

    Collected for the following number of participants:

    10.3. State here if you collected any data through additional participant questionnaire(s) (in addition to the MultiplEYE participant questionnaire).

    No additional participant information/demographic data collected

    Additional participant/demographic data, collected for the following number of participants:

    10.4. State here whether you collected any data through additional stimuli (next to the 10 stimulus texts + 2 practice texts), and/or through additional experimental tasks.

    No data collected through additional stimuli and/or through additional experimental tasks

    Data collected through additional stimuli, for the following number of participants:

    Data collected through additional experimental tasks, for the following number of participants:

    10.5. Testing

    Testing for the MultiplEYE core data collection should only include monocular dominant eye testing. In this case, please check the corresponding box.

    Monocular dominant RIGHT eye testing, for the following number of participants:

    Monocular dominant LEFT eye testing, for the following number of participants:

    If your testing differs from the MultiplEYE specification (for example, if you fill out this form for an additional dataset which might include other types of testing OR if testing the dominant eye was not possible for the core data collection), please indicate here.

    Please select first to which data collection the deviation in testing applies:

    Core data collection; Additional data collection

    The deviating testing includes (please select):

    Testing of both monocular left and right; Testing of >80% dominant eye, but some participants' non-dominant eye; Binocular testing

    Please explain the reason for this deviation in testing (for instance, if the determination of the dominant eye was unsuccessful):

    Tracking of pupil size (please note that the tracking of the pupil size is non-obligatory for the core data collection):

    Yes; By radius; By diameter; By surface; No tracking of pupil size

    Other tracking used, namely:

    If additional details about your data collection are required, please provide them in the space below. This may include specific methodologies employed, participant demographics, experimental conditions, or any other relevant information that enhances understanding and interpretation of the collected data.

    11. Apparatus and lab equipment

    11.1. Was your data collection conducted in a stationary setting (i.e., the data collection took place in one or several rooms that remained the same throughout the testing phase)?

    11.2. Please provide information on the computer operating system (OS) you use in your lab.

    Name of computer operating system:

    Example: Windows 10, Linux Mint 21.3, macOS 14 Sonoma

    11.3. Precise name of eye tracker:

    Did you use more than one eye tracker (i.e., data collection via 2 different eye trackers)?

    11.4. Description of eye tracker camera system:

    Desktop mount; Tower mount; Long-range mount camera; Webcam; Other

    If “other”, please specify below:

    If you have used more than one eye tracker (i.e., data collection via 2 different eye trackers) and the eye tracker camera system differs, please state below:

    11.5. Use of chin or forehead rest:

    Other equipment, namely:

    See example below for a combination of a chin- and forehead rest:

    11.6. Give information on the sampling rate.

    Maximum sampling rate of the eye tracker in Hz:

    What frequency did you set as your default?

    Was another sampling rate used in your standard setting (i.e., default setting)? Please indicate here:

    11.7. Please select (!!) and describe distances from eye(s) to camera in your standard setting.

    For MONOCULAR testing, eye to camera distances in cm:

    From right angle of the tracked eye to camera:

    From middle of the tracked eye to camera:

    From top edge of the tracked eye to camera:

    From bottom edge of the tracked eye to camera:

    For BINOCULAR testing, eye to camera distances in cm:

    From nasal bone to camera:

    11.8. Monitor(s)/screen(s):

    Number of monitors:

    Precise name of monitor (provide the following information for each monitor/screen):

    Monitor physical size in cm without frame:

    Resolution in px:

    If you would like to add another monitor, please select below:

    Monitor physical size in cm without frame:

    Resolution in px:

    11.9. Please select (!!) one of the following options and describe distances from eye(s) to screen in your standard setting.

    For MONOCULAR testing, eye to screen distances in cm: The OBLIGATORY eye-to-screen distance for monocular testing (measured in cm, 90 degree angle) is 60 cm. If the distance measured at your lab differs from the MultiplEYE specification, please state below.

    For BINOCULAR testing, enter the eye to screen distance below:

    From nasal bone to screen:

    11.10. Please select and name the response device for participants which you have used within your data collection.

    Name / model of the selected device:

    Examples: QWERTY keyboard (Fujitsu); gaming controller (hand console of Xbox series)

    11.11. Use of specific chair / table / setup (please see the MultiplEYE data collection guidelines section 11.3. for the lab specifications):

    Use of height-adjustable table; Use of non-moving chair (i.e., chair with NO wheels)

    Other setup, namely:

    11.12. Please describe here if there are any deviations from the MultiplEYE specifications (MultiplEYE data collection guidelines section 11.3.) regarding your lab setup:

    11.13. Please also describe here if there is anything you would like to add.

    12. Informed consent

    Please state here how the informed consent was presented:

    13. Stimulus presentation

    How are the stimuli and instructions presented on screen? If the stimulus presentation in your standard setting complies with the MultiplEYE specifications (meaning there were no issues with implementing the experiment screens/images), please check the box below:

    The MultiplEYE experiment includes the following specification regarding the screen images: Font: JetBrains Mono (Google Font); Font size: approx. 7 millimeters; Background color: grey scale; Foreground (i.e., font) color: black

    If the stimulus presentation is different in your study, please specify here:

    14. Describe symbol(s) used for calibration / drift correction / page turning.

    Describe symbols used for calibration.

    Describe how you performed drift correction (for example, if you have used a fixation trigger to check the calibration, or if you have used a drift check and re-calibration, etc.). If you have used the MultiplEYE experiment implementation for drift correction, please tick the check box below. If the drift correction in your experiment setup differs from the MultiplEYE implementation, please provide a detailed description below.

    For the MultiplEYE experiment, a fixation trigger has been used to check calibration. Before each text or question page, there is a fixation trigger. A black dot with a white center is shown in the top left corner of the screen. It triggers if there is a fixation on the dot.

    Describe how participants had to move / turn to the next page (for instance, by clicking with the mouse on a black dot in the right bottom corner).

    15. Experiment Procedure / Protocol

    15.1. Describe the procedure of the experiment. Select first if the procedure refers to the MultiplEYE core data collection or to the collection of additional data.

    For the collection of the core dataset, please check the box describing the standard MultiplEYE procedure below:

    The experiment starts with the experimenter entering the participant ID into the experiment software. Subsequently, the participant is greeted by a welcome screen and prompted to electronically sign the consent form. Following this, the experiment instructions are displayed on the screen. Next, the camera setup is initiated, which includes calibration and validation processes. A practice phase commences, involving two short texts followed by two comprehension questions each. These practice trials serve to familiarize participants with the experimental setup and keyboard usage, as well as to provide practice with comprehension questions. The main phase of the reading experiment starts thereafter. Ten texts are presented in a randomized order, shuffled uniquely for each participant. A fixation trigger precedes the turn to each new page, requiring participants to fixate on a dot on the screen to validate calibration accuracy and allow for a smooth and fast transition to the next page, if the calibration is still sufficiently precise. Re-calibration procedures follow if warranted to regain precision of the eye-tracking measurements. Three rating scales follow the presentation of each text: two inquiring about the participant’s familiarity with the text (rating the familiarity of the text and whether participants have read or listened to the text previously), and the last one inquiring about the perceived difficulty. Following the rating scales, participants are prompted to respond to six comprehension questions after having completely read each text. Re-visitations of the texts are disabled during the presentation of the comprehension questions. After the completion of the experiment phase, participants are asked to fill out a participant questionnaire on a screen.

    Is there anything you would like to add:

    If you are filling out this form for your additional data collection, please describe your procedure below.


    Note: Refer to how it differed from the experiment procedure for the core data collection. What aspects have changed compared to the standardized MultiplEYE procedure?


    15.2. Please specify here, if your data collection followed the MultiplEYE experiment protocol, but the reading task of 10 texts was not conducted in a single session and was instead split into several sessions.

    Please select:

    Other, namely:

    15.3. If applicable, please specify here when you conducted psychometric tests during the experiment.


    Note: Conducting psychometric tests is always considered a separate session in addition to the eye-tracking session with the reading task (even if there is only a small break between sessions).

    Please select:

    Other, namely:

    Were the psychometric tests conducted in a single session or split into several sessions?

    Please select:

    Other, namely:

    Did any psychometric test session include more than one participant? In other words, were psychometric tests conducted simultaneously with multiple participants in a single session?

    Please select:

    16. Participants

    16.1. Please describe how you recruited the participants for the MultiplEYE experiment (core data collection and / or additional dataset).

    16.2. The invitation letter was sent via (multiple answers can be selected):

    Postal mail; Email (no distributor); Social media (please specify further below); Email distributor (please specify further below); Other method (please specify further below)

    16.3. Describe your participants:

    Please select first whether you are describing your participants for the core data collection or for the collection of additional data.

    For the collection of the core dataset, please check the box describing the participants’ inclusion criteria below:

    Participants for the MultiplEYE core data collection are native speakers of the language being tested in the experiment. They are adults and literate, meaning they have received reading instruction through standard formal education and have no suspected or attested reading difficulties. They must have corrected-to-normal vision. They report having no attested reading or language disorders, whether developmental or acquired. They also report no intellectual disability or any psychiatric diagnoses. Participants' ages may range from 18 to 65 years.

    For the collection of additional dataset(s), please choose from the following selection to further specify how the inclusion criteria of the participants for your additional data collection might have differed from the ones of the core data collection (i.e.: Participants for the MultiplEYE core data collection are native speakers of the language being tested in the experiment. They are adults and literate, meaning they have received reading instruction through standard formal education and have no suspected or attested reading difficulties. They must have corrected-to-normal vision. They report having no attested reading or language disorders, whether developmental or acquired. They also report no intellectual disability or any psychiatric diagnoses. Participants’ ages may range from 18 to 65 years.):

    Same criteria as for core data collection, no other inclusion criteria added

    Same criteria as for core data collection, but with the following inclusion criteria added

    Different inclusion criteria than for core data collection

    For the collection of additional dataset(s), additionally describe your participants in your own words (include information about age, gender, native language, any impairments and other relevant details):

    16.4. Exclusion criteria:

    For the collection of the core dataset, we use the following exclusion criteria for participants (please check the following box, if applicable):

    Participants must NOT: 1) be minors (i.e., younger than 18 years of age) or elderly (i.e., older than 65 years of age), 2) be second language learners of the language of the experiment, 3) be of low literacy (as indicated by absence of formal reading instruction, suspected or attested reading difficulty), 4) report language or reading disorders (whether developmental, e.g., dyslexia, reading or writing impairment, learning disorder, developmental language disorder, auditory processing disorder, expressive or receptive language disorder, language disorder associated with another medical condition; or acquired, such as cerebral palsy, childhood aphasia, aphasia, acquired dyslexia), 5) report any psychiatric disorders, 6) report any intellectual disability, 7) report sensory disability (e.g., being hard of hearing, deafness or blindness)

    Add additional lab-specific exclusion criteria, if applicable, for example if you are completing this form for the collection of additional dataset(s):

    Participants wearing glasses with more than one power: bifocals, trifocals, and progressives (e.g., varilux); Participants wearing glasses in general (participants were not allowed to wear any glasses); Participants wearing soft lenses; Participants wearing hard lenses; Participants had eye surgery: corneal (e.g., LASIK, RK), cataract, intraocular implants; Participants with eye movement or alignment abnormalities: lazy eye, strabismus, nystagmus; Participants with eyelid ptosis; Participants with epiphora (wet eyes)

    Is there anything you would like to add?

    16.5. Please specify your participants’ age.

    Specify your participants’ age range (from minimum age to maximum age, e.g. “22 – 56”):

    Mean age of all participants (rounded to one decimal place):

    Standard deviation of participants’ age (rounded to one decimal place):

    Median of participants’ age:
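
The age figures requested in 16.5 can be derived from a plain list of participant ages. The sketch below is illustrative only: `summarize_ages` is a hypothetical helper, and it assumes the sample standard deviation (n - 1 in the denominator), since the form does not state which variant is expected.

```python
import statistics
from typing import Dict, List, Union

Number = Union[int, float]


def summarize_ages(ages: List[int]) -> Dict[str, Number]:
    """Return minimum, maximum, mean, standard deviation, and median age."""
    if len(ages) < 2:
        raise ValueError("At least two ages are needed for a standard deviation")
    return {
        "min": min(ages),
        "max": max(ages),
        # Mean and SD are rounded to one decimal place, as the form requests.
        "mean": round(statistics.mean(ages), 1),
        # Assumption: sample standard deviation (statistics.stdev).
        "sd": round(statistics.stdev(ages), 1),
        "median": statistics.median(ages),
    }


if __name__ == "__main__":
    print(summarize_ages([20, 30, 40, 50]))
```

If the population standard deviation is intended instead, `statistics.pstdev` can be substituted for `statistics.stdev`.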

    16.6. Sex / Gender:

    Male; Female; Other

    16.7. Participants are (please select the following options):

    Native speakers; With corrected-to-normal vision; Other population as a part of an add-on data collection

    If participants belong to another population as a part of an add-on data collection, please specify below:

    Non-native speakers; Participants with dyslexia; Participants with aphasia; Participants with other cognitive impairment

    16.8. Provide some information about the years of education of your participants:

    Specify your participants’ years of education range (from minimum to maximum years of education, e.g. “5 – 16”):

    16.9. State your sampling method:

    Note: MultiplEYE allows all kinds of sampling methods.

    Sampling consists of every person who applied for the experiment participation and met the inclusion criteria; Convenience sampling; (Simple) random sampling; Systematic sampling; Other (please specify further below)

    Find here some information about convenience sampling, random sampling, and systematic sampling.

    16.10. Provide the achieved sample size as total number of cases N:

    Note: Include only samples which have completed at least one trial, meaning one reading task (i.e., one experiment text; practice texts not included) plus the comprehension questions associated with the text.

    Provide the total number of participants who have completed at least ONE trial (i.e., one reading task and its associated comprehension questions):

    Provide the total number of participants who have completed at least ONE trial and each of the six psychometric tests mentioned in section 10.2. of this form:

    Provide the total number of participants who have completed ALL trials (i.e., all reading tasks and their associated comprehension questions):

    Provide the total number of participants who have completed ALL trials and ALL six psychometric tests:
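
The four counts requested in 16.10 are nested subsets of the same participant list, which a short tally makes explicit. This is a sketch under assumptions: the `ParticipantRecord` structure and its field names are hypothetical, a trial is taken to mean one experiment text plus its comprehension questions, and the totals of ten texts and six psychometric tests are taken from sections 10.2. and 10.4. of this form.

```python
from dataclasses import dataclass
from typing import Dict, List

TOTAL_TRIALS = 10  # ten stimulus texts; the two practice texts are not counted
TOTAL_TESTS = 6    # six psychometric tests listed in section 10.2.


@dataclass
class ParticipantRecord:
    """Hypothetical per-participant tally of completed trials and tests."""
    completed_trials: int
    completed_tests: int


def sample_sizes(records: List[ParticipantRecord]) -> Dict[str, int]:
    """Count participants for each of the four sample size questions."""
    one_trial = [r for r in records if r.completed_trials >= 1]
    all_trials = [r for r in one_trial if r.completed_trials == TOTAL_TRIALS]
    return {
        "at_least_one_trial": len(one_trial),
        "at_least_one_trial_and_all_tests": sum(
            r.completed_tests == TOTAL_TESTS for r in one_trial
        ),
        "all_trials": len(all_trials),
        "all_trials_and_all_tests": sum(
            r.completed_tests == TOTAL_TESTS for r in all_trials
        ),
    }


if __name__ == "__main__":
    demo = [ParticipantRecord(10, 6), ParticipantRecord(4, 6), ParticipantRecord(0, 6)]
    print(sample_sizes(demo))
```

Participants without a single completed trial are excluded from every count, in line with the note above.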

    17. Additional information about your dataset

    17.1. Does your dataset include data from the 2 practice trials?

    17.2. Is there any information about your data collection or dataset that you would like to add or comment on which has not been addressed yet?