The Importance of Disentanglement: SAGE Outperforms Unified VQ-VAE Baselines in Full-Body Motion

This article presents an ablation study showing that disentangling motion latents into upper-body and lower-body parts significantly improves 3D avatar reconstruction accuracy.


This content originally appeared on HackerNoon and was authored by Zaddy

Abstract and 1. Introduction

  2. Related Work

    2.1. Motion Reconstruction from Sparse Input

    2.2. Human Motion Generation

  3. SAGE: Stratified Avatar Generation and 3.1. Problem Statement and Notation

    3.2. Disentangled Motion Representation

    3.3. Stratified Motion Diffusion

    3.4. Implementation Details

  4. Experiments and Evaluation Metrics

    4.1. Dataset and Evaluation Metrics

    4.2. Quantitative and Qualitative Results

    4.3. Ablation Study

  5. Conclusion and References

\ Supplementary Material

A. Extra Ablation Studies

B. Implementation Details

4.3. Ablation Study

We perform an ablation study under setting S1 to justify the design choice of each component in our SAGE Net.

\ Table 4. Evaluation results under setting S3.

\ Table 5. Ablation results of different components in SAGE Net under setting S1.

\ Table 6. Evaluation results on the conditional strategy of the diffusion model under setting S1.

\ Disentangled Codebook: To evaluate the disentanglement strategy, we establish a baseline that uses a unified motion representation. Specifically, we develop a full-body VQ-VAE model that encodes full-body motion into a single, unified discrete codebook; all other components are the same as in the original model. The results in the first and last rows of Table 5 show that our approach, which employs disentangled latents, significantly outperforms the baseline on all evaluation metrics. This suggests that disentanglement simplifies the learning process by allowing the model to focus on a more limited set of movements and interactions. Additionally, Fig. 5 visually compares our model with the baseline, showing that disentanglement significantly improves reconstruction of the most challenging lower-body motions.
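As a rough illustration of the two setups being compared, the sketch below quantizes a toy latent with one unified codebook and with separate upper-body and lower-body codebooks. It assumes plain nearest-neighbour lookup under squared Euclidean distance; the codebook values, latent sizes, and the names `nearest_code`, `quantize_unified`, and `quantize_disentangled` are hypothetical and are not taken from the SAGE implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_code(latent: Vector, codebook: List[Vector]) -> int:
    """Return the index of the codebook entry closest to `latent` (squared L2)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(latent, codebook[i]))


def quantize_unified(full_latent: Vector, codebook: List[Vector]) -> int:
    """Baseline-style lookup: a single index describes the whole body."""
    return nearest_code(full_latent, codebook)


def quantize_disentangled(
    upper_latent: Vector,
    lower_latent: Vector,
    upper_codebook: List[Vector],
    lower_codebook: List[Vector],
) -> Tuple[int, int]:
    """Disentangled lookup: upper and lower body are quantized independently."""
    return (
        nearest_code(upper_latent, upper_codebook),
        nearest_code(lower_latent, lower_codebook),
    )


# Toy codebooks (hypothetical values): two entries each.
UPPER_CODEBOOK = [[0.0, 0.0], [1.0, 1.0]]
LOWER_CODEBOOK = [[0.0, 1.0], [1.0, 0.0]]
# A unified codebook with the same number of entries must pair each
# upper-body pattern with one fixed lower-body pattern.
UNIFIED_CODEBOOK = [[0.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0, 0.0]]

upper_latent = [0.9, 1.1]  # close to upper entry 1
lower_latent = [0.1, 0.9]  # close to lower entry 0

pair = quantize_disentangled(
    upper_latent, lower_latent, UPPER_CODEBOOK, LOWER_CODEBOOK
)
single = quantize_unified(upper_latent + lower_latent, UNIFIED_CODEBOOK)
```

In this toy setting the disentangled lookup can represent the combination "upper entry 1, lower entry 0", while the unified codebook has no entry for that pairing and must settle for its closest full-body entry. This is only one way to picture why separate codebooks might ease learning; the ablation in Table 5 is the actual evidence.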

\

\ Figure 6. Failure cases. All models are trained under setting S1.

\ Disentanglement Strategy: To investigate the optimal disentanglement strategy, we explore an extreme disentanglement configuration that follows the path from the root (Pelvis) node to each leaf node along the kinematic tree. Specifically, we break the body down into five segments: the paths from the root to the left hand (a), right hand (b), head (c), left foot (d), and right foot (e). As reported in the last two rows of Tab. 5, disentangling the body further disrupts the natural joint interconnections within the upper (or lower) body, which lowers performance and complicates the model design.
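The five-way split can be sketched as a walk from each leaf joint back to the root of the kinematic tree. The example below assumes an SMPL-style 22-joint body skeleton; the joint names and parent indices are that assumption, not something stated in this section, and `root_to_leaf_paths` is a hypothetical helper. Under this skeleton the "hand" leaves are the wrist joints.

```python
from typing import Dict, List

# ASSUMPTION: SMPL-style 22-joint body skeleton (names and parents are
# illustrative and not taken from this section of the paper).
JOINT_NAMES = [
    "pelvis", "left_hip", "right_hip", "spine1", "left_knee", "right_knee",
    "spine2", "left_ankle", "right_ankle", "spine3", "left_foot", "right_foot",
    "neck", "left_collar", "right_collar", "head", "left_shoulder",
    "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist",
]
# PARENTS[j] is the parent joint index of joint j; -1 marks the root.
PARENTS = [-1, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 12, 13, 14, 16, 17, 18, 19]


def root_to_leaf_paths(parents: List[int]) -> List[List[int]]:
    """Return one root-to-leaf joint index path per leaf of the tree."""
    has_child = {p for p in parents if p >= 0}
    leaves = [j for j in range(len(parents)) if j not in has_child]
    paths = []
    for leaf in leaves:
        path = []
        joint = leaf
        while joint != -1:  # climb from the leaf up to the root
            path.append(joint)
            joint = parents[joint]
        paths.append(path[::-1])  # reverse so the path starts at the root
    return paths


# Map each leaf joint name to the named joints on its root-to-leaf path.
SEGMENTS: Dict[str, List[str]] = {
    JOINT_NAMES[path[-1]]: [JOINT_NAMES[j] for j in path]
    for path in root_to_leaf_paths(PARENTS)
}

for leaf_name, joints in SEGMENTS.items():
    print(f"{leaf_name}: {' -> '.join(joints)}")
```

Note that in this decomposition the pelvis appears in every segment and the spine joints are shared by the head and both arm segments, so the segments are not independent groups of joints.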

\

\ Limitation: As shown in Fig. 6, both the previous state-of-the-art method and our model encounter difficulties in two main situations: (1) external force-induced movements (top row) and (2) unconventional poses (bottom row). Adding more varied samples to the training dataset could potentially improve the model’s performance in these areas.

\

:::info Authors:

(1) Han Feng, Wuhan University (equal contribution, ordered alphabetically);

(2) Wenchao Ma, Pennsylvania State University (equal contribution, ordered alphabetically);

(3) Quankai Gao, University of Southern California;

(4) Xianwei Zheng, Wuhan University;

(5) Nan Xue, Ant Group (xuenan@ieee.org);

(6) Huijuan Xu, Pennsylvania State University.

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::

\



