Learning Similarity Metrics for Dynamic Scene Segmentation

Damien Teney, Matthew Brown, Dmitry Kit, Peter Hall

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding



This paper addresses the segmentation of videos with arbitrary motion, including dynamic textures, using novel motion features and a supervised learning approach. Dynamic textures are commonplace in natural scenes and exhibit complex patterns of appearance and motion (e.g. water, smoke, swaying foliage). They are difficult for existing segmentation algorithms, often violate the brightness constancy assumption needed for optical flow, and have complex segment characteristics beyond uniform appearance or motion. Our solution uses custom spatiotemporal filters that capture texture and motion cues, along with a novel metric-learning framework that optimizes this representation for specific objects and scenes. This is used within a hierarchical, graph-based segmentation setting, yielding state-of-the-art results for dynamic texture segmentation. We also demonstrate the applicability of our approach to general object and motion segmentation, showing significant improvements over unsupervised segmentation and results comparable to the best task-specific approaches.
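The general idea of pairing a learned distance with graph-based merging can be illustrated with a small sketch. This is not the paper's method: the abstract does not specify the filters, the learning objective, or the hierarchy, so everything below is an assumption made for illustration. The helper names (`learn_diagonal_metric`, `learned_distance`, `segment`), the variance-ratio weighting heuristic, the single merge threshold, and the toy features are all hypothetical.

```python
import math

def learn_diagonal_metric(features, same_pairs, diff_pairs, eps=1e-9):
    """Toy stand-in for metric learning (an assumption, not the paper's objective).

    Each feature dimension gets a weight proportional to how much it separates
    labelled "different segment" pairs relative to "same segment" pairs.
    """
    dims = len(features[0])

    def mean_sq_diff(pairs, k):
        return sum((features[i][k] - features[j][k]) ** 2 for i, j in pairs) / len(pairs)

    raw = [mean_sq_diff(diff_pairs, k) / (mean_sq_diff(same_pairs, k) + eps)
           for k in range(dims)]
    total = sum(raw)
    return [r / total for r in raw]  # normalise so the weights sum to 1

def learned_distance(a, b, weights):
    """Weighted Euclidean distance, i.e. a diagonal Mahalanobis-style metric."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def segment(features, edges, weights, threshold):
    """Single-level greedy merging over a graph (a simplification of a hierarchy).

    Edges are visited in order of increasing learned distance and their
    endpoints are merged with union-find when the distance is below `threshold`.
    """
    parent = list(range(len(features)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def dist(edge):
        return learned_distance(features[edge[0]], features[edge[1]], weights)

    for i, j in sorted(edges, key=dist):
        if dist((i, j)) < threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(features))]

# Hypothetical toy data: dimension 0 separates the two regions, dimension 1 is noise.
features = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (3.0, 5.2), (3.1, 0.5), (3.2, 8.0)]
same_pairs = [(0, 1), (1, 2), (3, 4), (4, 5)]   # labelled as the same segment
diff_pairs = [(2, 3), (0, 5)]                   # labelled as different segments
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # chain of neighbouring nodes

weights = learn_diagonal_metric(features, same_pairs, diff_pairs)
labels = segment(features, edges, weights, threshold=1.0)
```

On this toy input the learned weights suppress the noisy dimension, so nodes 0 to 2 and nodes 3 to 5 end up in two separate groups, whereas an unweighted Euclidean distance would rate the within-region edge (0, 1) almost as far apart as the boundary edge (2, 3).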
Original language: English
Title of host publication: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
Number of pages: 10
ISBN (Print): 9781467369640
Publication status: Published - 15 Oct 2015
Event: Computer Vision and Pattern Recognition 2015 - Boston, United States
Duration: 8 Jun 2015 to 10 Jun 2015

Publication series

Name: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
ISSN (Print): 1063-6919
ISSN (Electronic): 1063-6919


Conference: Computer Vision and Pattern Recognition 2015
Country/Territory: United States


  • dynamic scene
  • segmentation
  • distance metric
  • learning

