Human Pose Regression with Residual Log-likelihood Estimation

Jiefeng Li1
Siyuan Bian1
Ailing Zeng2
Can Wang3
Bo Pang1
Wentao Liu3
Cewu Lu1

1Shanghai Jiao Tong University
2The Chinese University of Hong Kong
3SenseTime Research

In ICCV 2021 (Oral)



Heatmap-based methods dominate the field of human pose estimation by modelling the output distribution through likelihood heatmaps. In contrast, regression-based methods are more efficient but suffer from inferior performance. In this work, we explore maximum likelihood estimation (MLE) to develop an efficient and effective regression-based method. From the perspective of MLE, adopting a particular regression loss amounts to making a particular assumption about the output density function. A density function closer to the true distribution leads to better regression performance. In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution. Concretely, RLE learns the change of the distribution instead of the unreferenced underlying distribution to facilitate the training process. With the proposed reparameterization design, our method is compatible with off-the-shelf flow models. The proposed method is effective, efficient and flexible. We show its potential in various human pose estimation tasks with comprehensive experiments. Compared to the conventional regression paradigm, regression with RLE brings a 12.4 mAP improvement on MSCOCO without any test-time overhead. Moreover, for the first time, our regression method outperforms heatmap-based methods, especially on multi-person pose estimation.
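The MLE view of regression losses mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: the function names, the target value and the candidate predictions are all hypothetical, and the scale parameters are fixed to 1 as an assumption. It only shows that, for a fixed scale, the Gaussian negative log-likelihood equals the L2 loss up to a constant and the Laplace negative log-likelihood equals the L1 loss up to a constant, so choosing one of these losses implicitly chooses the corresponding density.

```python
import math


def gaussian_nll(x, mu, sigma=1.0):
    """Negative log-likelihood of x under a Gaussian N(mu, sigma^2)."""
    return 0.5 * ((x - mu) / sigma) ** 2 + math.log(sigma) + 0.5 * math.log(2 * math.pi)


def laplace_nll(x, mu, b=1.0):
    """Negative log-likelihood of x under a Laplace(mu, b) density."""
    return abs(x - mu) / b + math.log(2 * b)


def l2_loss(x, mu):
    """Squared-error loss, scaled by 1/2 to match the Gaussian exponent."""
    return 0.5 * (x - mu) ** 2


def l1_loss(x, mu):
    """Absolute-error loss."""
    return abs(x - mu)


# Hypothetical ground-truth coordinate and two candidate predictions.
target = 0.30
preds = [0.25, 0.50]

# With the scale fixed (assumed to be 1 here), each NLL differs from its
# loss only by a constant that does not depend on the prediction, so
# minimizing the loss is the same as maximizing the likelihood.
gauss_gaps = [gaussian_nll(target, p) - l2_loss(target, p) for p in preds]
laplace_gaps = [laplace_nll(target, p) - l1_loss(target, p) for p in preds]

print("Gaussian NLL - L2:", gauss_gaps)
print("Laplace NLL - L1:", laplace_gaps)
```

In both lists the gap is the same for every prediction, which is why a fixed-scale Gaussian or Laplace assumption reduces to plain L2 or L1 regression. A learned density, as proposed with RLE, would replace these fixed shapes; that part is not sketched here.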


Results on COCO Keypoint

Results on Human3.6M dataset

Results on Retina OCT Segmentation

Robustness to Truncations

Our method can infer joints that lie outside the input bounding box, whereas heatmap-based methods fail.

Paper and Supplementary Material

Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu
Human Pose Regression with Residual Log-likelihood Estimation
In ICCV, 2021. (Oral)


