Real-time RGBD-based Extended Body Pose Estimation

AI-generated keywords: RGBD, Pose Estimation, Human Mesh Model, Kinect Azure Camera, Facial Expression

AI-generated Key Points

  • System for real-time RGBD-based estimation of 3D human pose
  • Focus on body pose, hand pose, and facial expression
  • Utilizes parametric 3D deformable human mesh model (SMPL-X) and Kinect Azure RGB-D camera
  • Estimators trained for body pose and facial expression parameters, using previously published landmark extractors as input and custom annotated datasets for supervision
  • Hand pose estimated using a previously published method
  • Predictions combined to generate temporally-smooth human pose
  • Facial expression extractor trained with annotated talking face dataset
  • Body pose dataset collected and annotated from 56 people captured by a rig of 5 Kinect Azure RGB-D cameras, used together with the large AMASS motion capture dataset
  • RGB-D body pose model outperforms state-of-the-art RGB-only methods and matches the accuracy of a slower RGB-D optimization-based solution
  • Entire system runs at 30 frames per second on a server with a single GPU
  • Overall: a real-time extended body pose estimation system that covers body pose, hand pose, and facial expression from RGBD input
  • Improves accuracy over RGB-only methods while running in real time on a server with a single GPU

Authors: Renat Bashirov, Anastasia Ianina, Karim Iskakov, Yevgeniy Kononenko, Valeriya Strizhkova, Victor Lempitsky, Alexander Vakhitov

WACV 2021
License: CC BY-NC-SA 4.0

Abstract: We present a system for real-time RGBD-based estimation of 3D human pose. We use a parametric 3D deformable human mesh model (SMPL-X) as a representation and focus on the real-time estimation of parameters for the body pose, hand pose and facial expression from a Kinect Azure RGB-D camera. We train estimators of body pose and facial expression parameters. Both estimators use previously published landmark extractors as input and custom annotated datasets for supervision, while hand pose is estimated directly by a previously published method. We combine the predictions of those estimators into a temporally smooth human pose. We train the facial expression extractor on a large talking face dataset, which we annotate with facial expression parameters. For the body pose, we collect and annotate a dataset of 56 people captured from a rig of 5 Kinect Azure RGB-D cameras and use it together with the large AMASS motion capture dataset. Our RGB-D body pose model outperforms the state-of-the-art RGB-only methods and reaches the same level of accuracy as a slower RGB-D optimization-based solution. The combined system runs at 30 FPS on a server with a single GPU. The code will be available at https://saic-violet.github.io/rgbd-kinect-pose

Submitted to arXiv on 05 Mar. 2021

Results of the summarizing process for the arXiv paper: 2103.03663v1

This paper presents a system for real-time RGBD-based estimation of 3D human pose, covering body pose, hand pose, and facial expression. The system uses a parametric 3D deformable human mesh model (SMPL-X) as its representation and takes its input from a Kinect Azure RGB-D camera. The authors train estimators for body pose and facial expression parameters; both take the output of previously published landmark extractors as input and are supervised with custom annotated datasets. Hand pose is estimated directly by a previously published method. The predictions from these estimators are combined into a temporally smooth human pose. To train the facial expression extractor, the authors annotate a large talking face dataset with facial expression parameters. For body pose, they collect and annotate a dataset of 56 people captured from a rig of 5 Kinect Azure RGB-D cameras and use it together with the large AMASS motion capture dataset. The RGB-D body pose model outperforms state-of-the-art RGB-only methods and reaches the same level of accuracy as a slower RGB-D optimization-based solution. The entire system runs at 30 frames per second on a server with a single GPU.
Created on 08 Jul. 2023
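
The summary says that the body, hand, and face predictions are combined into a temporally smooth pose, but it does not say how. The sketch below shows one simple way such a step could look: the three per-frame predictions are merged into one record, and an exponential moving average (EMA) filters them over time. The EMA, the `PoseFrame` container, and all field and function names are assumptions made for illustration; they are not taken from the paper or its code, and a real system would likely treat rotation parameters more carefully than a plain element-wise average.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class PoseFrame:
    """Hypothetical per-frame container; the field names are illustrative only."""
    body: List[float]        # body pose parameters from the body estimator
    hands: List[float]       # hand pose parameters from the hand estimator
    expression: List[float]  # facial expression parameters from the face estimator


def combine(body: Sequence[float],
            hands: Sequence[float],
            expression: Sequence[float]) -> PoseFrame:
    """Merge the three independent predictions into a single frame record."""
    return PoseFrame(list(body), list(hands), list(expression))


def _ema(prev: List[float], cur: List[float], alpha: float) -> List[float]:
    """Element-wise exponential moving average: alpha * cur + (1 - alpha) * prev."""
    return [alpha * c + (1.0 - alpha) * p for p, c in zip(prev, cur)]


class EmaSmoother:
    """Assumed smoothing scheme (not the paper's): EMA over consecutive frames."""

    def __init__(self, alpha: float = 0.5) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state: Optional[PoseFrame] = None

    def update(self, frame: PoseFrame) -> PoseFrame:
        """Feed one new frame and return the smoothed estimate."""
        if self.state is None:
            # The first frame has no history, so it passes through unchanged.
            self.state = frame
        else:
            self.state = PoseFrame(
                _ema(self.state.body, frame.body, self.alpha),
                _ema(self.state.hands, frame.hands, self.alpha),
                _ema(self.state.expression, frame.expression, self.alpha),
            )
        return self.state


if __name__ == "__main__":
    smoother = EmaSmoother(alpha=0.5)
    smoother.update(combine([0.0], [0.0], [0.0]))
    print(smoother.update(combine([1.0], [2.0], [4.0])))
```

A larger `alpha` follows new predictions more closely and a smaller one suppresses more jitter at the cost of added lag, which matters for a system targeting 30 frames per second.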
