WebEyeTrack: Scalable Eye Tracking for the Browser Through On-Device Few-Shot Personalization


With advances in AI, new gaze estimation methods are exceeding state-of-the-art (SOTA) benchmarks, but their real-world application reveals a gap with commercial eye-tracking solutions. Factors such as model size, inference time, and privacy often go unaddressed. Meanwhile, webcam-based eye-tracking methods lack sufficient accuracy, especially under head movement. To tackle these issues, we introduce WebEyeTrack, a framework that integrates lightweight SOTA gaze estimation models directly in the browser. Eye tracking has been a transformative instrument for investigating human-computer interaction, because it uncovers subtle shifts in visual attention (Jacob and Karn 2003). However, its reliance on expensive specialized hardware, such as the EyeLink 1000 and Tobii Pro Fusion, has confined most gaze-tracking research to controlled laboratory environments (Heck, Becker, and Deutscher 2023). Similarly, virtual-reality solutions like the Apple Vision Pro remain financially out of reach for widespread use. These limitations have hindered the scalability and practical application of gaze-enhanced technologies and feedback systems. To reduce reliance on specialized hardware, researchers have actively pursued webcam-based eye-tracking solutions that utilize the built-in cameras on consumer devices.


Two key areas of focus in this field are appearance-based gaze estimation and webcam-based eye tracking, both of which have made significant advances using standard monocular cameras (Cheng et al. 2021). For instance, recent appearance-based methods have shown improved accuracy on commonly used gaze estimation datasets such as MPIIGaze (Zhang et al. 2015), MPIIFaceGaze (Zhang et al. 2016), and EyeDiap (Alberto Funes Mora, Monay, and Odobez 2014). However, many of these AI models primarily aim to achieve state-of-the-art (SOTA) performance without considering practical deployment constraints. These constraints include varying display sizes, computational efficiency, model size, ease of calibration, and the ability to generalize to new users. While some efforts have successfully integrated gaze estimation models into comprehensive eye-tracking solutions (Heck, Becker, and Deutscher 2023), achieving real-time, fully functional eye-tracking systems remains a considerable technical challenge. Retrofitting existing models that do not address these design concerns often involves extensive optimization and may still fail to meet practical requirements.


As a result, state-of-the-art gaze estimation methods have not yet been widely deployed, primarily due to the difficulty of running these AI models on resource-constrained devices. At the same time, webcam-based eye-tracking methods have taken a pragmatic approach, addressing real-world deployment challenges (Heck, Becker, and Deutscher 2023). These solutions are often tied to specific software ecosystems and toolkits, hindering portability to platforms such as mobile devices or web browsers. As web applications gain popularity for their scalability, ease of deployment, and cloud integration (Shukla et al. 2023), tools like WebGazer (Papoutsaki et al. 2016) have emerged to support eye tracking directly within the browser. However, many browser-friendly approaches rely on simple statistical or classical machine-learning models (Heck, Becker, and Deutscher 2023), such as ridge regression (Xu et al. 2015) or support vector regression (Papoutsaki et al. 2016), and avoid 3D gaze reasoning to reduce computational load. While these techniques improve accessibility, they typically compromise accuracy and robustness under natural head motion.
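The ridge-regression recipe used by classical browser eye trackers can be sketched as follows. This is a minimal illustration of the general technique under assumed conditions (made-up feature dimensionality and synthetic calibration data), not WebGazer's actual implementation: a linear map from simple eye-patch features to 2D screen coordinates, fit from samples gathered during calibration clicks.

```python
import numpy as np

def fit_ridge(features, targets, lam=1e-3):
    """Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T Y,
    with a bias column appended to the features."""
    X = np.hstack([features, np.ones((len(features), 1))])
    I = np.eye(X.shape[1])
    return np.linalg.solve(X.T @ X + lam * I, X.T @ targets)

def predict_gaze(W, features):
    """Map eye features to predicted on-screen (x, y) coordinates."""
    X = np.hstack([features, np.ones((len(features), 1))])
    return X @ W

# Synthetic calibration data: 50 clicks, 12-D eye features per sample.
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 12))
true_W = rng.normal(size=(13, 2))
clicks = np.hstack([feats, np.ones((50, 1))]) @ true_W

W = fit_ridge(feats, clicks)
print(np.allclose(predict_gaze(W, feats), clicks, atol=1e-2))  # → True
```

The appeal of this approach in the browser is that fitting is a single small linear solve per calibration; its weakness, as noted above, is that a fixed linear map cannot account for head motion.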


To bridge the gap between high-accuracy appearance-based gaze estimation methods and scalable webcam-based eye-tracking solutions, we introduce WebEyeTrack, a few-shot, headpose-aware gaze estimation solution for the browser (Fig 2). WebEyeTrack combines model-based headpose estimation (via 3D face reconstruction and radial Procrustes analysis) with BlazeGaze, a lightweight CNN model optimized for real-time inference. We provide both Python and client-side JavaScript implementations to support model development and seamless integration into research and deployment pipelines. In evaluations on standard gaze datasets, WebEyeTrack achieves comparable SOTA performance and demonstrates real-time efficiency on mobile phones, tablets, and laptops. Our contributions are:

- WebEyeTrack: an open-source, browser-friendly framework that performs few-shot gaze estimation with privacy-preserving on-device personalization and inference.
- A novel model-based metric headpose estimation via face-mesh reconstruction and radial Procrustes analysis.
- BlazeGaze: a novel, 670 KB CNN model based on BlazeBlocks that achieves real-time inference on mobile CPUs and GPUs.

Classical gaze estimation relied on model-based approaches for (1) 3D gaze estimation (predicting gaze direction as a unit vector) and (2) 2D gaze estimation (predicting the gaze target on a screen).


These methods used predefined eyeball models and extensive calibration procedures (Dongheng Li, Winfield, and Parkhurst 2005).
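The Procrustes-alignment idea behind model-based headpose estimation, as described above, can be sketched as follows. This is an illustrative sketch of the general technique, not WebEyeTrack's actual implementation, and the function name is an assumption: align detected 3D face-mesh landmarks to a canonical neutral-pose face model; the rotation that best maps one onto the other is the head pose.

```python
import numpy as np

def procrustes_headpose(landmarks, canonical):
    """Find rotation R, uniform scale s, and translation t minimizing
    sum_i || s * R @ canonical[i] + t - landmarks[i] ||^2
    (orthogonal Procrustes / Kabsch with scaling)."""
    mu_l, mu_c = landmarks.mean(axis=0), canonical.mean(axis=0)
    L, C = landmarks - mu_l, canonical - mu_c     # center both point sets
    M = L.T @ C                                   # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(M)
    d = np.sign(np.linalg.det(U @ Vt))            # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt                                # optimal rotation
    s = (S * np.diag(D)).sum() / (C ** 2).sum()   # optimal uniform scale
    t = mu_l - s * R @ mu_c                       # optimal translation
    return R, s, t
```

Given face-mesh landmarks from a detector such as MediaPipe, the recovered `R` can then be converted to yaw/pitch/roll angles, and `s` relates the canonical model's units to the camera's metric scale.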