Graphics Insertions into Real Video for Market Research

  • Joanna Tarko

Student thesis: Doctoral Thesis, Doctor of Engineering (EngD)


Combining real videos with computer-generated content, either off-line (compositing) or in real-time (augmented and mixed reality, AR/MR), is an extensive field of research. It has numerous applications, including entertainment, medical imaging, education, sport, architecture, and marketing (advertising and commerce). However, although merging real and virtual content is well established in marketing as part of the retail environment, there seem to be no known applications of it in market research.

The aim of market research is to help explain why a customer decided to buy a specific product. In a perfect scenario, study participants are placed in a real but fully controlled shopping environment, but in practice, such environments are very expensive or even impossible to build. Using virtual reality (VR) environments instead significantly reduces costs. VR is fully controllable and immersive, but computer-generated (CG) models often lack realism.

This research project aims to provide mixed-reality tools that combine real camera footage with computer-generated elements to create plausible but still controlled environments that can be used for market research. My work consists of the full graphics insertion pipeline for both perspective and 360° spherical cameras, with real-time user interaction with the inserted objects. It addresses three main technical challenges: tracking the camera, estimating the illumination to light virtual objects plausibly, and rendering virtual objects and compositing them with the video in real-time.

Tracking and image-based lighting techniques for perspective cameras are well established both in research and industry. Therefore, I focused only on real-time compositing for perspective video. My pipeline takes camera tracking data and reconstructed points from external software and synchronises them with the video sequence in the Unity game engine. Virtual objects can be dynamically inserted, and users can interact with them. Differential rendering for image-based shadows adds to the realism of insertions.
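As an illustration of the differential rendering step mentioned above, the following is a minimal Python sketch of the general technique, not the thesis implementation (which runs in the Unity game engine). It assumes three per-pixel inputs in linear intensity: the real video frame, a render of a proxy of the local scene with the virtual objects, and a render of the same proxy without them. The function name, the flat-list image representation, and the clamping to zero are all assumptions made for the example.

```python
def differential_composite(background, with_objects, without_objects, object_mask):
    """Composite virtual objects into a real frame via differential rendering.

    All arguments are equally long lists of linear-intensity floats, one
    value per pixel. object_mask is 1.0 where a virtual object covers the
    pixel and 0.0 elsewhere (fractional values blend at object edges).
    """
    composite = []
    for bg, obj, local, mask in zip(background, with_objects,
                                    without_objects, object_mask):
        # Change in the local scene caused by the objects
        # (negative inside shadows, positive for bounced light).
        delta = obj - local
        # Apply that change to the real pixel; clamp so shadows never go below black.
        scene = max(0.0, bg + delta)
        # Object pixels come straight from the render, the rest from the adjusted video.
        composite.append(mask * obj + (1.0 - mask) * scene)
    return composite


# Hypothetical 3-pixel frame: unaffected floor, shadowed floor, object pixel.
frame = differential_composite(
    background=[0.8, 0.8, 0.8],
    with_objects=[0.6, 0.3, 0.5],
    without_objects=[0.6, 0.6, 0.6],
    object_mask=[0.0, 0.0, 1.0],
)
```

In this toy frame the first pixel keeps the video value, the second is darkened by the shadow difference, and the third takes the rendered object colour.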

Then I extend the pipeline to 360° spherical cameras with my implementation of omnidirectional structure from motion for camera tracking and scene reconstruction. Selected 360° video frames, after inverse tone mapping, act as spatially distributed environment maps for image-based lighting. As in the perspective camera case, differential rendering enables shadow casting, and users can interact with inserted objects.
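Using a 360° frame as an environment map relies on a mapping between directions on the sphere and pixels of the frame. The sketch below shows one such mapping for the equirectangular projection, which is an assumption here, as the abstract does not name the projection. The axis convention (y up, image centre looking along +z) and the function names are also assumptions chosen for the example; other conventions differ only in signs and offsets.

```python
import math


def direction_to_equirect(x, y, z):
    """Map a 3D direction to normalised equirectangular coordinates (u, v).

    u runs left to right in [0, 1], v runs top to bottom in [0, 1].
    """
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    longitude = math.atan2(x, z)                    # in [-pi, pi]
    latitude = math.asin(max(-1.0, min(1.0, y)))    # in [-pi/2, pi/2]
    u = longitude / (2.0 * math.pi) + 0.5
    v = 0.5 - latitude / math.pi
    return u, v


def equirect_to_direction(u, v):
    """Inverse mapping: normalised (u, v) back to a unit direction."""
    longitude = (u - 0.5) * 2.0 * math.pi
    latitude = (0.5 - v) * math.pi
    return (
        math.cos(latitude) * math.sin(longitude),
        math.sin(latitude),
        math.cos(latitude) * math.cos(longitude),
    )


# The forward direction lands at the image centre; straight up lands on the top row.
centre = direction_to_equirect(0.0, 0.0, 1.0)
top = direction_to_equirect(0.0, 1.0, 0.0)
```

A renderer would use the first function to look up incoming light for a given direction, and the second to turn a pixel of the frame into a light direction.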

The proposed pipeline enables compositing in the Unity game engine with correct synchronisation between the camera pose and the video, for both perspective and 360° videos; Unity does not provide this by default. As a result, virtual objects can be inserted into videos captured with a moving camera, extending the state of the art, which is limited to static videos.
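The core idea of pose-to-video synchronisation can be sketched as a lookup from playback time to the tracked pose of the frame being displayed. The Python sketch below is only an illustration under stated assumptions: one tracked pose per video frame, a constant frame rate, and hypothetical function names. It does not reproduce how the thesis pipeline achieves this inside Unity.

```python
import math


def frame_index_for_time(time_s, fps, frame_count):
    """Return the index of the video frame shown at playback time time_s."""
    index = math.floor(time_s * fps)
    # Clamp so times before the start or after the end map to a valid frame.
    return max(0, min(frame_count - 1, index))


def pose_for_time(poses, time_s, fps):
    """Pick the tracked camera pose that belongs to the displayed frame.

    poses holds one entry per video frame, in frame order.
    """
    return poses[frame_index_for_time(time_s, fps, len(poses))]


# Hypothetical 2-second clip at 25 fps with placeholder poses.
poses = [("pose", i) for i in range(50)]
current = pose_for_time(poses, 0.5, 25.0)
```

Driving the virtual camera from the displayed frame index, rather than from wall-clock time, keeps inserted objects locked to the footage even if playback stalls or skips.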

In user studies, I evaluated the perceived quality of virtual object insertions, and I compared their level of realism against purely virtual environments.
Date of Award: 18 Nov 2020
Original language: English
Awarding Institution
  • University of Bath
Sponsors: Checkmate VR Ltd.
Supervisor: Christian Richardt (Supervisor) & Peter Hall (Supervisor)
