Real-time RGBD-based Extended Body Pose Estimation
AI-generated Key Points
- System for real-time RGBD-based estimation of 3D human pose
- Focus on body pose, hand pose, and facial expression
- Uses a parametric 3D deformable human mesh model (SMPL-X) as the representation, with input from a Kinect Azure RGB-D camera
- Estimators of body pose and facial expression parameters take previously published landmark extractors as input and are supervised with custom annotated datasets
- Hand pose estimated using a previously published method
- Predictions of the estimators are combined into a temporally smooth human pose
- Facial expression extractor trained on a large talking face dataset annotated with facial expression parameters
- Body pose dataset of 56 people collected and annotated from a rig of 5 Kinect Azure RGB-D cameras, used together with the large AMASS motion capture dataset
- The RGB-D body pose model outperforms state-of-the-art RGB-only methods and matches the accuracy of a slower RGB-D optimization-based solution
- Entire system runs at 30 frames per second on a server with a single GPU
- Taken together, the system produces an extended pose (body, hands, and face) from RGB-D input, improving on the accuracy of RGB-only methods while keeping real-time speed
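The key points say the predictions are combined into a temporally smooth pose, but they do not say how the smoothing is done. As a minimal sketch of one common option, the block below applies an exponential moving average to per-frame parameter vectors. The class name `ExponentialSmoother` and the `alpha` parameter are hypothetical and not taken from the paper, and averaging rotation parameters component-wise is only a rough approximation.

```python
from typing import List, Optional, Sequence


class ExponentialSmoother:
    """Exponential moving average over per-frame parameter vectors.

    Illustrative only: the paper's actual smoothing method is not
    described in this summary.
    """

    def __init__(self, alpha: float) -> None:
        # alpha close to 1 trusts the newest frame; close to 0 smooths more.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state: Optional[List[float]] = None

    def update(self, params: Sequence[float]) -> List[float]:
        """Blend the new frame into the running estimate and return it."""
        if self.state is None:
            # First frame: nothing to blend with yet.
            self.state = list(params)
        else:
            if len(params) != len(self.state):
                raise ValueError("parameter vector changed length")
            a = self.alpha
            self.state = [a * x + (1.0 - a) * s
                          for x, s in zip(params, self.state)]
        return list(self.state)


smoother = ExponentialSmoother(alpha=0.5)
first = smoother.update([0.0, 2.0])   # the first frame passes through unchanged
second = smoother.update([1.0, 0.0])  # each value moves halfway toward the new frame
```

A lower `alpha` suppresses jitter at the cost of added lag, which matters for a system targeting 30 frames per second.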
Authors: Renat Bashirov, Anastasia Ianina, Karim Iskakov, Yevgeniy Kononenko, Valeriya Strizhkova, Victor Lempitsky, Alexander Vakhitov
Abstract: We present a system for real-time RGBD-based estimation of 3D human pose. We use a parametric 3D deformable human mesh model (SMPL-X) as a representation and focus on the real-time estimation of parameters for the body pose, hand pose, and facial expression from a Kinect Azure RGB-D camera. We train estimators of body pose and facial expression parameters. Both estimators use previously published landmark extractors as input and custom annotated datasets for supervision, while hand pose is estimated directly by a previously published method. We combine the predictions of those estimators into a temporally smooth human pose. We train the facial expression extractor on a large talking face dataset, which we annotate with facial expression parameters. For the body pose, we collect and annotate a dataset of 56 people captured from a rig of 5 Kinect Azure RGB-D cameras and use it together with the large AMASS motion capture dataset. Our RGB-D body pose model outperforms the state-of-the-art RGB-only methods and matches the accuracy of a slower RGB-D optimization-based solution. The combined system runs at 30 FPS on a server with a single GPU. The code will be available at https://saic-violet.github.io/rgbd-kinect-pose
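The abstract describes three separate estimators (body pose, hand pose, facial expression) whose outputs are merged into one set of parameters for the mesh model. The sketch below shows one way such a merge could be represented. The names `ExtendedPose` and `combine_predictions`, the field layout, and the vector lengths in the usage lines are all assumptions made for illustration, not the authors' code or the SMPL-X library API.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ExtendedPose:
    """Hypothetical container for one frame of extended pose parameters."""
    body_pose: List[float]
    left_hand_pose: List[float]
    right_hand_pose: List[float]
    expression: List[float]

    def as_vector(self) -> List[float]:
        """Concatenate all parts in a fixed order: body, hands, face."""
        return (self.body_pose + self.left_hand_pose
                + self.right_hand_pose + self.expression)


def combine_predictions(body: Sequence[float],
                        left_hand: Sequence[float],
                        right_hand: Sequence[float],
                        expression: Sequence[float]) -> ExtendedPose:
    """Bundle the per-estimator outputs for a single frame."""
    # Copy the inputs so later changes by the caller do not alter the pose.
    return ExtendedPose(list(body), list(left_hand),
                        list(right_hand), list(expression))


# Toy vectors; real parameter vectors would be much longer.
pose = combine_predictions([0.1, 0.2], [0.3], [0.4], [0.5, 0.6])
flat = pose.as_vector()
```

Keeping the parts as separate fields lets each estimator run and be replaced independently, while `as_vector` gives a single flat representation for downstream steps such as smoothing.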