Automated Processing of Mobile Eye-Tracking Across Contexts: Accessible Methods for Implementing a Pre-Trained Computer Vision Model
Research Poster | Social & Behavioral Sciences | 2025 Graduate Exhibition
Presentation by Marisa Lytle
Exhibition Number 72
Abstract
Psychology has often confined its work to the laboratory because of technological constraints and the desire for experimental control. However, observations made in the laboratory may not transfer to more natural behaviors or settings. Mobile eye tracking (MET) technology has gained popularity for studying visual attention in observational and naturalistic settings, but quantifying gaze targets for behavioral analysis has relied heavily on manual video coding by trained staff, which is labor-intensive, costly, and sensitive to human biases. Manual coding at the finest frame-by-frame timescale can take up to an hour per minute of video, creating significant barriers for researchers with limited resources. While machine learning models for object and human detection have seen substantial accuracy improvements, applying them to MET data often requires advanced programming skills or relies on “black box” tools that limit flexibility and control across varied social contexts. The goal of this project is to provide an open-source implementation of a pre-trained computer vision model (YOLOv8) that reduces both time and resource demands while maintaining accuracy in processing gaze data. Our pipeline is designed to be user-friendly for researchers with basic programming knowledge, offering a more accessible alternative to existing tools. I am applying this approach across two studies, one with adults and one with children, demonstrating how MET can capture social interactions across diverse, naturalistic environments. My research will generate validation results comparing the computer vision model's output to hand-coded data.
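To make the general approach concrete, the sketch below shows one way such a pipeline might map mobile eye-tracking gaze points onto pre-trained YOLOv8 detections, frame by frame. This is a minimal illustration, not the project's actual implementation: it assumes the ultralytics package, pre-trained COCO weights, and gaze coordinates already exported to scene-camera pixel space, and the names label_gaze_targets and gaze_points are hypothetical placeholders.

```python
# Minimal sketch: label the object class a gaze point lands on in each frame.
# Assumes `pip install ultralytics opencv-python` and gaze data in pixel space.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # pre-trained COCO weights; a fine-tuned model could be swapped in

def label_gaze_targets(video_path, gaze_points):
    """Return {frame index: class label} for the detection each gaze point falls in.

    gaze_points: dict mapping frame index -> (x, y) gaze coordinates,
    assumed to be exported from the eye tracker in scene-camera pixels.
    """
    labels = {}
    cap = cv2.VideoCapture(video_path)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gaze = gaze_points.get(frame_idx)
        if gaze is not None:
            gx, gy = gaze
            result = model(frame, verbose=False)[0]  # single-frame inference
            label = "none"  # default when gaze falls outside all detections
            for box, cls in zip(result.boxes.xyxy.tolist(),
                                result.boxes.cls.tolist()):
                x1, y1, x2, y2 = box
                if x1 <= gx <= x2 and y1 <= gy <= y2:
                    label = result.names[int(cls)]  # e.g., "person"
                    break
            labels[frame_idx] = label
        frame_idx += 1
    cap.release()
    return labels
```

A per-frame output like this can then be aggregated into looking durations per target or compared directly against hand-coded frames for validation.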
Importance
This project provides both a technical resource and practical guidance for researchers interested in using mobile eye tracking in their own work. By showing how we implement a pre-trained computer vision model to accelerate the processing of gaze data, future researchers can readily adapt this procedure to their own needs. Automated pipelines remove the time-intensive requirement of hand-coding gaze data, a task often delegated to undergraduate research assistants. By transferring this work to the computer, students may instead focus on developing their own research projects and learning advanced skills that prepare them for graduate school and the workforce.