Coarse or Fine? Recognising Action End States without Labels

Davide Moltisanti, Hakan Bilen, Laura Sevilla-Lara, Frank Keller

Research output: Chapter or section in a book/report/conference proceeding > Chapter in a published conference proceeding

Abstract

We focus on the problem of recognising the end state of an action in an image, which is critical for understanding what action is performed and in which manner. We study this problem through the task of predicting the coarseness of a cut, i.e., deciding whether an object was cut "coarsely" or "finely". No dataset with these annotated end states is available, so we propose an augmentation method to synthesise training data. We apply this method to cutting actions extracted from an existing action recognition dataset. Our method is object agnostic, i.e., it presupposes the location of the object but not its identity. Starting from fewer than a hundred images of a whole object, we can generate several thousand images simulating visually diverse cuts of different coarseness. We use our synthetic data to train a model based on UNet and test it on real images showing coarsely/finely cut objects. Results demonstrate that the model successfully recognises the end state of the cutting action despite the domain gap between training and testing, and that the model generalises well to unseen objects.

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Place of Publication: U. S. A.
Publisher: IEEE
Pages: 1191-1200
Number of pages: 10
ISBN (Electronic): 9798350365474
Publication status: Published - 27 Sept 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 - Seattle, United States
Duration: 16 Jun 2024 to 22 Jun 2024

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Country/Territory: United States
City: Seattle
Period: 16/06/24 to 22/06/24

Keywords

  • adverb recognition
  • fine-grained recognition
  • object end-state recognition

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
