EgoCap

egocentric marker-less motion capture with two fisheye cameras

Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, H.-P. Seidel, Bernt Schiele, Christian Theobalt

Research output: Contribution to journal › Article

29 Citations (Scopus)
56 Downloads (Pure)

Abstract

Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. The marker suits they may require cause discomfort, their recording volume is severely restricted, and they are often constrained to indoor scenes with controlled backgrounds. Alternative suit-based systems use several inertial measurement units or an exoskeleton to capture motion. This makes capture independent of a confined volume, but requires substantial, often constraining, and hard-to-set-up body instrumentation. We therefore propose a new method for real-time, marker-less and egocentric motion capture that estimates the full-body skeleton pose from a lightweight stereo pair of fisheye cameras attached to a helmet or virtual-reality headset. It combines the strengths of a new generative pose-estimation framework for fisheye views with a ConvNet-based body-part detector trained on a large new dataset. Our inside-in method captures full-body motion in general indoor and outdoor scenes, including crowded scenes with many people in close vicinity. Setup time and effort are minimal, and the captured user can move around freely, which enables reconstruction of larger-scale activities and is particularly useful in virtual reality for roaming and interacting freely while seeing one's fully motion-captured virtual body.
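The hybrid estimation mentioned in the abstract — a generative pose model fused with ConvNet body-part detections — can be loosely illustrated as minimizing an energy that balances detection evidence against a prior. The sketch below is purely illustrative, not the paper's actual formulation: the function name `fuse_pose`, the weights, and the toy quadratic energy (whose minimizer is a closed-form weighted average) are all assumptions made for this example.

```python
import numpy as np

# Illustrative toy only: fuse per-joint 2D detections D with a temporal
# prior (the previous pose P_prev) by minimizing
#     E(P) = w_det * ||P - D||^2 + w_prior * ||P - P_prev||^2.
# Setting dE/dP = 0 gives the closed-form weighted average below.
def fuse_pose(detections, prev_pose, w_det=0.8, w_prior=0.2):
    detections = np.asarray(detections, dtype=float)
    prev_pose = np.asarray(prev_pose, dtype=float)
    return (w_det * detections + w_prior * prev_pose) / (w_det + w_prior)

# Two toy joints: detections pull the estimate away from the previous pose.
detections = [[1.0, 0.0], [0.0, 1.0]]
prev_pose = [[0.0, 0.0], [0.0, 0.0]]
fused = fuse_pose(detections, prev_pose)  # each detected coordinate scaled by 0.8
```

The actual system optimizes a full articulated skeleton against fisheye reprojections rather than this closed-form toy, but the trade-off between detector confidence and a model-based prior is the same basic idea.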
Original language: English
Article number: 162
Pages (from-to): 1-11
Number of pages: 11
Journal: ACM Transactions on Graphics
Volume: 35
Issue number: 6
DOI: 10.1145/2980179.2980235
Publication status: Published - 30 Nov 2016
Event: SIGGRAPH Asia 2016: The 9th ACM SIGGRAPH conference and exhibition on computer graphics and interactive techniques in Asia - Macao, Macao
Duration: 5 Dec 2016 - 8 Dec 2016
https://sa2016.siggraph.org/en/

Keywords

  • motion capture
  • first-person vision
  • markerless
  • optical
  • inside-in
  • crowded scenes
  • large-scale

Cite this

Rhodin, H., Richardt, C., Casas, D., Insafutdinov, E., Shafiei, M., Seidel, H.-P., ... Theobalt, C. (2016). EgoCap: egocentric marker-less motion capture with two fisheye cameras. ACM Transactions on Graphics, 35(6), 1-11. [162]. https://doi.org/10.1145/2980179.2980235

EgoCap: egocentric marker-less motion capture with two fisheye cameras. / Rhodin, Helge; Richardt, Christian; Casas, Dan; Insafutdinov, Eldar; Shafiei, Mohammad; Seidel, H.-P.; Schiele, Bernt; Theobalt, Christian.

In: ACM Transactions on Graphics, Vol. 35, No. 6, 162, 30.11.2016, p. 1-11.

Rhodin, H, Richardt, C, Casas, D, Insafutdinov, E, Shafiei, M, Seidel, H-P, Schiele, B & Theobalt, C 2016, 'EgoCap: egocentric marker-less motion capture with two fisheye cameras', ACM Transactions on Graphics, vol. 35, no. 6, 162, pp. 1-11. https://doi.org/10.1145/2980179.2980235
Rhodin, Helge ; Richardt, Christian ; Casas, Dan ; Insafutdinov, Eldar ; Shafiei, Mohammad ; Seidel, H.-P. ; Schiele, Bernt ; Theobalt, Christian. / EgoCap: egocentric marker-less motion capture with two fisheye cameras. In: ACM Transactions on Graphics. 2016 ; Vol. 35, No. 6. pp. 1-11.
@article{91fb6996bb234330a8c9c336a0b2935c,
title = "EgoCap: egocentric marker-less motion capture with two fisheye cameras",
abstract = "Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. They often create discomfort by possibly needed marker suits, and their recording volume is severely restricted and often constrained to indoor scenes with controlled backgrounds. Alternative suit-based systems use several inertial measurement units or an exoskeleton to capture motion. This makes capturing independent of a confined volume, but requires substantial, often constraining, and hard to set up body instrumentation. We therefore propose a new method for real-time, marker-less and egocentric motion capture which estimates the full-body skeleton pose from a lightweight stereo pair of fisheye cameras that are attached to a helmet or virtual reality headset. It combines the strength of a new generative pose estimation framework for fisheye views with a ConvNet-based body-part detector trained on a large new dataset. Our inside-in method captures full-body motion in general indoor and outdoor scenes, and also crowded scenes with many people in close vicinity. The setup time and effort is minimal and the captured user can freely move around, which enables reconstruction of larger-scale activities and is particularly useful in virtual reality to freely roam and interact, while seeing the fully motion-captured virtual body.",
keywords = "motion capture, first-person vision, markerless, optical, inside-in, crowded scenes, large-scale",
author = "Helge Rhodin and Christian Richardt and Dan Casas and Eldar Insafutdinov and Mohammad Shafiei and H.-P. Seidel and Bernt Schiele and Christian Theobalt",
year = "2016",
month = "11",
day = "30",
doi = "10.1145/2980179.2980235",
language = "English",
volume = "35",
pages = "1--11",
journal = "ACM Transactions on Graphics",
issn = "0730-0301",
publisher = "Association for Computing Machinery",
number = "6",

}

TY - JOUR

T1 - EgoCap

T2 - egocentric marker-less motion capture with two fisheye cameras

AU - Rhodin, Helge

AU - Richardt, Christian

AU - Casas, Dan

AU - Insafutdinov, Eldar

AU - Shafiei, Mohammad

AU - Seidel, H.-P.

AU - Schiele, Bernt

AU - Theobalt, Christian

PY - 2016/11/30

Y1 - 2016/11/30

N2 - Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. They often create discomfort by possibly needed marker suits, and their recording volume is severely restricted and often constrained to indoor scenes with controlled backgrounds. Alternative suit-based systems use several inertial measurement units or an exoskeleton to capture motion. This makes capturing independent of a confined volume, but requires substantial, often constraining, and hard to set up body instrumentation. We therefore propose a new method for real-time, marker-less and egocentric motion capture which estimates the full-body skeleton pose from a lightweight stereo pair of fisheye cameras that are attached to a helmet or virtual reality headset. It combines the strength of a new generative pose estimation framework for fisheye views with a ConvNet-based body-part detector trained on a large new dataset. Our inside-in method captures full-body motion in general indoor and outdoor scenes, and also crowded scenes with many people in close vicinity. The setup time and effort is minimal and the captured user can freely move around, which enables reconstruction of larger-scale activities and is particularly useful in virtual reality to freely roam and interact, while seeing the fully motion-captured virtual body.

AB - Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. They often create discomfort by possibly needed marker suits, and their recording volume is severely restricted and often constrained to indoor scenes with controlled backgrounds. Alternative suit-based systems use several inertial measurement units or an exoskeleton to capture motion. This makes capturing independent of a confined volume, but requires substantial, often constraining, and hard to set up body instrumentation. We therefore propose a new method for real-time, marker-less and egocentric motion capture which estimates the full-body skeleton pose from a lightweight stereo pair of fisheye cameras that are attached to a helmet or virtual reality headset. It combines the strength of a new generative pose estimation framework for fisheye views with a ConvNet-based body-part detector trained on a large new dataset. Our inside-in method captures full-body motion in general indoor and outdoor scenes, and also crowded scenes with many people in close vicinity. The setup time and effort is minimal and the captured user can freely move around, which enables reconstruction of larger-scale activities and is particularly useful in virtual reality to freely roam and interact, while seeing the fully motion-captured virtual body.

KW - motion capture

KW - first-person vision

KW - markerless

KW - optical

KW - inside-in

KW - crowded scenes

KW - large-scale

UR - https://doi.org/10.1145/2980179.2980235

UR - http://gvv.mpi-inf.mpg.de/projects/EgoCap/

U2 - 10.1145/2980179.2980235

DO - 10.1145/2980179.2980235

M3 - Article

VL - 35

SP - 1

EP - 11

JO - ACM Transactions on Graphics

JF - ACM Transactions on Graphics

SN - 0730-0301

IS - 6

M1 - 162

ER -