Abstract
In this paper, we examine the problem of internet video categorization. Specifically, we explore the representation of a video as a "bag of words" using various combinations of spatial and temporal descriptors. The descriptors incorporate both spatial and temporal gradients as well as optical flow information. We achieve state-of-the-art results on a standard human activity recognition database and demonstrate promising category recognition performance on two new databases of approximately 1000 and 1500 online user-submitted videos, which we will be making available to the community.
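As a rough illustration of the "bag of words" idea mentioned in the abstract, the sketch below assigns each local descriptor of a video to its nearest codeword and summarizes the video as a normalized histogram of codeword counts. The codebook, the two-dimensional toy descriptors, and the function names are all hypothetical placeholders; the abstract does not specify how the descriptors are computed, how the codebook is built, or which classifier is used.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codeword closest to the given descriptor."""
    return min(
        range(len(codebook)),
        key=lambda k: squared_distance(descriptor, codebook[k]),
    )


def bag_of_words(
    descriptors: Sequence[Vector], codebook: Sequence[Vector]
) -> List[float]:
    """Normalized histogram of codeword assignments for one video."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # A video with no descriptors maps to an all-zero histogram.
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Toy values (assumptions for illustration only): a 3-word codebook and
# four local descriptors standing in for gradient or optical flow features.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
video_descriptors = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]]

histogram = bag_of_words(video_descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

In a setup like this, the histogram would serve as the fixed-length feature vector passed to a category classifier, regardless of how many descriptors each video yields.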
Original language | English |
---|---|
Publication status | Published - Jun 2008 |
Event | CVPRW '08: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008, Anchorage (23 Jun 2008 → 28 Jun 2008) |
Conference
Conference | CVPRW '08: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008 |
---|---|
City | Anchorage |
Period | 23/06/08 → 28/06/08 |