Data-Centric Pruning Pipeline for Robust Gaze Vector Estimation: Relabelled GazeCapture subset

Source: IEEE; indexed 2026-04-17
Download link:
https://ieee-dataport.org/documents/data-centric-pruning-pipeline-robust-gaze-vector-estimation-relabelled-gazecapture-subset
Description:
In-the-wild eye-tracking datasets used to train gaze vector estimation methods are notoriously noisy and unreliable, owing both to limited image quality and to weak guarantees on label accuracy. At the same time, simply increasing dataset size does not guarantee better results, especially when diversity, balance, and label reliability are not taken into account. In this work, we adopt a data-centric perspective and introduce a data-pruning pipeline for robust gaze vector estimation. Our methodology combines vision-language model embeddings and their zero-shot classification capabilities with an Early-Learning prediction-error analysis to assign task-aware quality scores to individual samples. In addition, we conduct an exploratory study of sample uniqueness and representativeness in the latent space, which provides additional insight into the structure and reliability of a large-scale dataset. Finally, we validate our approach on a large relabelled subset of the GazeCapture dataset by training a hybrid gaze-estimation model, and show that a model trained on a curated subset of 100K samples achieves higher accuracy than one trained on the full 2.45M-sample dataset.
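The description above does not say how the zero-shot and Early-Learning signals are combined, so the following is only an illustrative sketch, not the authors' pipeline. It assumes precomputed image and text-prompt embeddings, a per-sample prediction error recorded early in training, and a simple weighted sum of the two signals. The function names, the `temperature` and `weight` parameters, and the "good"/"bad" prompt pair are all hypothetical.

```python
import numpy as np


def zero_shot_quality(image_emb, good_text_emb, bad_text_emb, temperature=0.01):
    """Probability that each image matches the 'good' prompt rather than the 'bad' one.

    Assumption: embeddings come from some vision-language model and are
    compared by cosine similarity followed by a two-way softmax.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    img = normalize(np.asarray(image_emb, dtype=float))
    good = normalize(np.asarray(good_text_emb, dtype=float))
    bad = normalize(np.asarray(bad_text_emb, dtype=float))
    sims = np.stack([img @ good, img @ bad], axis=1)   # shape (n, 2)
    logits = sims / temperature
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[:, 0]


def early_learning_score(early_errors):
    """Rank-based score in [0, 1]: the lowest early error gets 1, the highest gets 0."""
    early_errors = np.asarray(early_errors, dtype=float)
    ranks = np.argsort(np.argsort(early_errors))
    return 1.0 - ranks / max(len(early_errors) - 1, 1)


def select_top_k(image_emb, good_text_emb, bad_text_emb, early_errors, k, weight=0.5):
    """Combine both signals (hypothetical weighted sum) and keep the k best samples."""
    score = (weight * zero_shot_quality(image_emb, good_text_emb, bad_text_emb)
             + (1.0 - weight) * early_learning_score(early_errors))
    keep = np.argsort(-score)[:k]
    return keep, score


# Toy usage with 2-D embeddings standing in for real model outputs.
images = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]])
keep, score = select_top_k(images, [1.0, 0.0], [0.0, 1.0], [0.1, 0.9, 0.5], k=2)
```

In this toy case the second sample is discarded because it both resembles the "bad" prompt and has the highest early error. A real pipeline would replace the weighted sum with whatever scoring rule the dataset's accompanying paper specifies.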
Provided by:
Alejandro Garcia de la Santa; Adrian Carrizo Pérez