Abstract
We investigate the hypothesis that the gaze signal can improve egocentric action recognition on the standard benchmark, the EGTEA Gaze+ dataset. In contrast to prior work, where the gaze signal was used only during training, we formulate a novel neural fusion approach, Cross-modality Attention Blocks (CMA), that leverages the gaze signal for action recognition during inference as well. CMA combines information from different modalities at different levels of abstraction to achieve state-of-the-art performance for egocentric action recognition. Specifically, fusing the video stream with optical flow using CMA outperforms the current state of the art by 3%. However, when CMA is employed to fuse the gaze signal with video-stream data, no improvement is observed. Further investigation of this counter-intuitive finding indicates that the small spatial overlap between the network's attention map and the ground-truth gaze renders the gaze signal uninformative for this benchmark. Based on our empirical findings, we recommend improvements to the current benchmark to enable practical systems for egocentric video understanding with gaze.
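The abstract does not spell out how the Cross-modality Attention Blocks fuse two streams, so the following is only a minimal, hypothetical sketch of the general idea: tokens from one modality (e.g. RGB features) attend over tokens from another (e.g. optical-flow features), and the attended context is added back residually. All function and variable names here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, context_feats):
    """One cross-attention step: each query token (modality A) attends
    over all context tokens (modality B) via scaled dot-product scores."""
    d_k = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d_k)  # (Nq, Nc)
    attn = softmax(scores, axis=-1)                        # rows sum to 1
    return attn @ context_feats                            # (Nq, d)

# Toy example: 16 tokens of 64-dim features per modality.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((16, 64))   # video-stream features
flow = rng.standard_normal((16, 64))  # optical-flow features

# Residual fusion: RGB tokens enriched with flow context.
fused = rgb + cross_modal_attention(rgb, flow)
```

In a full model this block would typically include learned query/key/value projections and be applied at several feature levels, matching the abstract's description of fusion "at different levels of abstraction".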
Original language | English |
---|---|
Title of host publication | Proceedings - ETRA 2022 |
Subtitle of host publication | ACM Symposium on Eye Tracking Research and Applications |
Editors | Stephen N. Spencer |
Publisher | Association for Computing Machinery |
ISBN (Electronic) | 9781450392525 |
Publication status | Published - 8 Jun 2022 |
Event | 2022 ACM Symposium on Eye Tracking Research and Applications, ETRA 2022 - Virtual, Online, United States. Duration: 8 Jun 2022 → 11 Jun 2022 |
Publication series
Name | Eye Tracking Research and Applications Symposium (ETRA) |
---|
Conference
Conference | 2022 ACM Symposium on Eye Tracking Research and Applications, ETRA 2022 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 8/06/22 → 11/06/22 |
Bibliographical note
Publisher Copyright: © 2022 ACM.
Keywords
- attention
- deep neural networks
- egocentric action recognition
- gaze
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Human-Computer Interaction
- Ophthalmology
- Sensory Systems
Fingerprint
Dive into the research topics of 'Can Gaze Inform Egocentric Action Recognition?'. Together they form a unique fingerprint.
Projects
Centre for the Analysis of Motion, Entertainment Research and Applications (CAMERA) - 2.0
Campbell, N. (PI), Cosker, D. (PI), Bilzon, J. (CoI), Campbell, N. (CoI), Cazzola, D. (CoI), Colyer, S. (CoI), Cosker, D. (CoI), Lutteroth, C. (CoI), McGuigan, P. (CoI), O'Neill, E. (CoI), Petrini, K. (CoI), Proulx, M. (CoI) & Yang, Y. (CoI)
Engineering and Physical Sciences Research Council
1/11/20 → 31/10/25
Project: Research council
-
Centre for the Analysis of Motion, Entertainment Research and Applications (CAMERA)
Cosker, D. (PI), Bilzon, J. (CoI), Campbell, N. (CoI), Cazzola, D. (CoI), Colyer, S. (CoI), Fincham Haines, T. (CoI), Hall, P. (CoI), Kim, K. I. (CoI), Lutteroth, C. (CoI), McGuigan, P. (CoI), O'Neill, E. (CoI), Richardt, C. (CoI), Salo, A. (CoI), Seminati, E. (CoI), Tabor, A. (CoI) & Yang, Y. (CoI)
Engineering and Physical Sciences Research Council
1/09/15 → 28/02/21
Project: Research council