Abstract
We present a new method for categorizing video sequences that capture different scene classes. This can be seen as a generalization of previous work on scene classification from single images. A scene is represented by a collection of 3D points with an appearance-based codeword attached to each point. The point cloud is recovered by applying a robust structure-from-motion (SFM) algorithm to the video sequence. A hierarchical structure of histograms, placed at different locations and at different scales, is used to capture the typical spatial distribution of 3D points and codewords in the working volume. The scene is classified by an SVM equipped with a histogram matching kernel, similar to [21, 10, 16]. Results on a challenging dataset of 5 scene categories show competitive classification accuracy and superior performance with respect to a state-of-the-art 2D pyramid matching method [16] applied to individual image frames.
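As a rough illustration of the representation described above, the following sketch builds a pyramid of codeword histograms over a 3D point cloud and compares two such descriptors with a histogram intersection kernel. It is only a sketch under assumptions the abstract does not state: regular grids with 2^l cells per axis at level l, points already normalized to the unit cube, unweighted levels, and histogram intersection as the matching kernel. The function names `pyramid_histogram` and `intersection_kernel` are hypothetical and do not come from the paper.

```python
import numpy as np


def pyramid_histogram(points, codewords, vocab_size, levels=3):
    """Concatenate per-cell codeword histograms over a 3D grid pyramid.

    points:    (N, 3) array of 3D coordinates, assumed to lie in [0, 1]^3
    codewords: (N,) integer array with values in [0, vocab_size)
    """
    points = np.asarray(points, dtype=float)
    codewords = np.asarray(codewords, dtype=int)
    parts = []
    for level in range(levels):
        cells = 2 ** level  # assumed layout: 2^level cells per axis
        # Cell index per axis; clip so that coordinate 1.0 lands in the last cell.
        idx = np.clip((points * cells).astype(int), 0, cells - 1)
        flat_cell = (idx[:, 0] * cells + idx[:, 1]) * cells + idx[:, 2]
        # One bin per (cell, codeword) pair.
        bins = flat_cell * vocab_size + codewords
        hist = np.bincount(bins, minlength=cells ** 3 * vocab_size)
        parts.append(hist.astype(float))
    return np.concatenate(parts)


def intersection_kernel(h1, h2):
    """Histogram intersection: the sum of bin-wise minima."""
    return float(np.minimum(h1, h2).sum())
```

A Gram matrix of such kernel values between training sequences could then be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.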
Original language | English |
---|---|
Pages | 1655-1662 |
Number of pages | 8 |
Publication status | Published - Sept 2009 |
Event | ICCV 2009: IEEE 12th International Conference on Computer Vision, Kyoto, 29 Sept 2009 → 2 Oct 2009 |