Abstract
While speech-driven animation for lip-synching and facial expression synthesis from speech has received much attention, there is no previous work on generating non-verbal actions such as laughing and crying automatically from an audio signal. This article presents initial results from a system designed to address this issue. 3D facial data was recorded for a participant performing different actions (laughing, crying, yawning and sneezing) using a Qualisys (Sweden) optical motion-capture system while audio was recorded simultaneously. Thirty retro-reflective markers were placed on the participant's face to capture movement. Using this data, an analysis and synthesis machine was trained, consisting of a dual-input Hidden Markov Model (HMM) and a trellis search algorithm that converts HMM visual states and new input audio into new 3D motion-capture data.
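The abstract only names the components of the synthesis machine. As a rough illustration of the idea, the sketch below shows one way a trellis (Viterbi) search over HMM states, driven by per-frame audio features, could be mapped back to 3D marker motion. Everything here (the diagonal-Gaussian emission model, the per-state mean marker mapping, and all function and parameter names) is a hypothetical sketch under simplifying assumptions, not the authors' actual implementation.

```python
import numpy as np

def viterbi(log_emission, log_trans, log_start):
    """Trellis search: most likely state path given per-frame emission scores.

    log_emission : (T, S) log p(audio_t | state s)
    log_trans    : (S, S) log transition probabilities
    log_start    : (S,)   log initial state probabilities
    """
    T, S = log_emission.shape
    delta = np.full((T, S), -np.inf)   # best path score ending in each state
    back = np.zeros((T, S), dtype=int) # backpointers for path recovery
    delta[0] = log_start + log_emission[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # (S, S): prev -> next
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emission[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):      # backtrace the best state sequence
        path[t] = back[t + 1, path[t + 1]]
    return path

def gaussian_log_emission(audio_feats, means, variances):
    """Diagonal-Gaussian log-likelihood of each audio frame under each state."""
    # audio_feats: (T, D); means, variances: (S, D)
    diff = audio_feats[:, None, :] - means[None, :, :]  # (T, S, D)
    return -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)

def synthesise_motion(audio_feats, means, variances, log_trans, log_start, state_markers):
    """Decode a state path from new audio, then emit each decoded state's mean
    30-marker (x, y, z) configuration as the synthesised animation frames."""
    log_em = gaussian_log_emission(audio_feats, means, variances)
    states = viterbi(log_em, log_trans, log_start)
    return state_markers[states]  # (T, 30, 3) marker trajectory
```

In the paper's system the dual-input HMM is trained jointly on the audio and visual streams; emitting each state's mean marker configuration, as done here, is a deliberate simplification to keep the example self-contained.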
| Original language | English |
| --- | --- |
| Title of host publication | IET 4th European Conference on Visual Media Production (CVMP 2007) |
| Publisher | IET |
| Pages | 16 |
| DOIs | |
| Publication status | Published - 2007 |
| Event | IET 4th European Conference on Visual Media Production, London, United Kingdom. Duration: 27 Nov 2007 → 28 Nov 2007 |
Conference
| Conference | IET 4th European Conference on Visual Media Production |
| --- | --- |
| Country/Territory | United Kingdom |
| City | London |
| Period | 27/11/07 → 28/11/07 |