Visually Precise Query

Riddhiman Dasgupta, Francis Tom, Sudhir Kumar, Mithun Das Gupta, Yokesh Kumar, Badri N. Patro, Vinay P. Namboodiri

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding

1 Citation (SciVal)

Abstract

We present the problem of Visually Precise Query (VPQ) generation, which enables a more intuitive match between a user's information need and an e-commerce site's product description. Given an image of a fashion item, what is the optimal search query that will retrieve the exact same or closely related product(s) with high probability? In this paper we introduce the task of VPQ generation, which takes a product image and its title as input and provides a word-level extractive summary of the title, containing a list of salient attributes, which can then be used as a query to search for similar products. We collect a large dataset of fashion images and their titles and merge it with an existing research dataset which was created for a different task. Given an image and title pair, the VPQ problem is posed as identifying a non-contiguous collection of spans within the title. We provide a dataset of around 400K image, title and corresponding VPQ entries and release it to the research community. We provide a detailed description of the data collection process and discuss the future direction of research for the problem introduced in this work. We provide standard text-domain and visual-domain baseline comparisons, as well as multi-modal baseline models, to analyze the task introduced in this work. Finally, we propose a hybrid fusion model which promises to be the direction of research in the multi-modal community.
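The output representation described in the abstract (a word-level extractive summary, expressed as a non-contiguous collection of spans within the title) can be illustrated with a minimal sketch. The product title, the keep-mask, and the function names `mask_to_spans` and `extract_query` are all hypothetical and are not taken from the paper or its dataset. In the paper the selection would come from a model that also sees the product image, whereas here the mask is simply written by hand.

```python
from typing import List, Tuple


def mask_to_spans(mask: List[int]) -> List[Tuple[int, int]]:
    """Group consecutive kept word positions into half-open (start, end) spans."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, keep in enumerate(mask):
        if keep and start is None:
            start = i                      # a new span begins here
        elif not keep and start is not None:
            spans.append((start, i))       # the current span ends before i
            start = None
    if start is not None:
        spans.append((start, len(mask)))   # span runs to the end of the title
    return spans


def extract_query(title: str, mask: List[int]) -> str:
    """Build the query from the title words whose mask entry is 1."""
    tokens = title.split()
    if len(tokens) != len(mask):
        raise ValueError("mask must have one entry per whitespace-separated word")
    return " ".join(tok for tok, keep in zip(tokens, mask) if keep)


# Hypothetical title and hand-written mask (1 = keep the word, 0 = drop it).
title = "Acme Women's Blue Floral Print Cotton Maxi Dress with Pockets"
mask = [0, 1, 1, 1, 1, 0, 1, 1, 0, 0]

print(mask_to_spans(mask))         # [(1, 5), (6, 8)]
print(extract_query(title, mask))  # Women's Blue Floral Print Maxi Dress
```

The two spans are non-contiguous because the dropped word between them is left out, and every word in the resulting query is copied from the title in its original order.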

Original language: English
Title of host publication: MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
Place of Publication: U.S.A.
Publisher: Association for Computing Machinery
Pages: 3550-3558
Number of pages: 9
ISBN (Electronic): 9781450379885
DOIs
Publication status: Published - 12 Oct 2020
Event: 28th ACM International Conference on Multimedia, MM 2020 - Virtual, Online, United States
Duration: 12 Oct 2020 to 16 Oct 2020

Publication series

Name: MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia

Conference

Conference: 28th ACM International Conference on Multimedia, MM 2020
Country/Territory: United States
City: Virtual, Online
Period: 12/10/20 to 16/10/20

Keywords

  • e-commerce
  • hybrid fusion
  • multi-modal information retrieval
  • visual attributes

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Human-Computer Interaction
