DexAvatar: 3D Sign Language Reconstruction with Hand and Body Pose Priors [WACV'26 🏆 Best Paper Award Finalist 🏆]

The official repository for the paper and its supplementary material: DexAvatar

conda create -n dexavatar -y python=3.10
conda activate dexavatar
bash scripts/env_install.sh
bash scripts/bug_fix_dexavatar.sh
conda deactivate

Download the SGNify frames from this link and place them in the ./data folder. For evaluation, also download the smplxgt files.

The folder structure should be as follows:

data/
└── images_sgnify/
    ├── sign1/
    │   └── images/
    │       ├── Img1.png
    │       ├── Img2.png
    │       └── ...
    ├── sign2/
    │   └── images/
    │       ├── Img1.png
    │       ├── Img2.png
    │       └── ...
    └── ...
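
Before running anything, it can help to confirm that the frames landed in the layout shown above. The following is a minimal sketch, not part of the repository: the helper name `find_layout_problems` is hypothetical, and it assumes frames are .png files inside an images/ subfolder of each sign folder, as in the tree.

```python
from pathlib import Path


def find_layout_problems(data_root):
    """Return a list of human-readable problems with the expected layout.

    Expected: <data_root>/images_sgnify/<sign>/images/*.png
    (hypothetical helper, not part of the DexAvatar repository).
    """
    problems = []
    frames_root = Path(data_root) / "images_sgnify"
    if not frames_root.is_dir():
        return [f"missing folder: {frames_root}"]

    # Every direct subfolder is treated as one sign.
    sign_dirs = sorted(p for p in frames_root.iterdir() if p.is_dir())
    if not sign_dirs:
        problems.append(f"no sign folders found in {frames_root}")

    for sign_dir in sign_dirs:
        images_dir = sign_dir / "images"
        if not images_dir.is_dir():
            problems.append(f"{sign_dir.name}: missing images/ subfolder")
        elif not any(images_dir.glob("*.png")):
            problems.append(f"{sign_dir.name}: no .png frames in images/")
    return problems


if __name__ == "__main__":
    for problem in find_layout_problems("./data"):
        print(problem)
```

An empty list means every sign folder has an images/ subfolder with at least one .png frame.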

The sign segmentations and the corresponding class for each sign are already provided in the ./data folder for the SGNify dataset. To use your own sign segmentations and classes, generate them with the previous work at this link.

For Sapiens

Install Sapiens Lite from the original Sapiens GitHub repository, creating a new environment called sapiens_lite as described in their instructions. Then download the rtmpose checkpoint from Google Drive and place the checkpoint files in the following directory structure.

sapiens/
└── lite/
    └── torchscript/
        ├── detector/
        │   └── checkpoints/
        │       └── rtmpose/
        │           └── rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth
        └── pose/
            └── checkpoints/
                └── sapiens_1b/
                    └── sapiens_1b_coco_wholebody_best_coco_wholebody_AP_727_torchscript.pt2
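
A misplaced checkpoint is easy to miss at this depth of nesting, so a short existence check may save a failed run. This is a sketch under assumptions: `missing_files` and `SAPIENS_CHECKPOINTS` are hypothetical names, and the script assumes it is run from the directory that contains the sapiens/ folder.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
SAPIENS_CHECKPOINTS = [
    "sapiens/lite/torchscript/detector/checkpoints/rtmpose/"
    "rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth",
    "sapiens/lite/torchscript/pose/checkpoints/sapiens_1b/"
    "sapiens_1b_coco_wholebody_best_coco_wholebody_AP_727_torchscript.pt2",
]


def missing_files(repo_root, relative_paths):
    """Return the entries of relative_paths that are not files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the current directory is the one containing sapiens/.
    for rel in missing_files(".", SAPIENS_CHECKPOINTS):
        print(f"missing checkpoint: {rel}")
```

The same function can be reused for the other checkpoint trees in this README by passing a different list of relative paths.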

For SMPLer-X

conda create -n smpler_x python=3.8 -y
conda activate smpler_x
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install mmcv-full==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu116/torch1.12.0/index.html
pip install -r preprocess/SMPLer-X/requirements.txt
cd preprocess/SMPLer-X/main/transformer_utils
pip install -v -e .
cd ../../../../
pip install setuptools==69.5.1 yapf==0.40.1 numpy==1.23.5
bash scripts/bug_fix_dexavatar.sh

Download the following checkpoints and SMPL-X files from Google Drive and place them in the following directory structure.

DexAvatar/
├── checkpoints/
│   ├── smpler_x_h32.pth.tar
│   └── mmdet/
│       ├── faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
│       └── mmdet_faster_rcnn_r50_fpn_coco.py
└── SMPLer-X/
    └── common/
        └── utils/
            └── human_model_files/

Download SignBPoser and SignHPoser from Google Drive and place them in the following directory structure.

dexavatar_fitting/
└── smplifyx/
    ├── signbposer/
    └── signhposer/

Run Fitting Code

Run the following command to execute the code:

python run_dexavatar.py --input_img_folder DATA_PATH --output_path OUTPUT_FOLDER --fitting_experiment ./dexavatar_fitting
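
If you want to launch the fitting from another Python script, the command above can be assembled as an argument list and passed to `subprocess`. This is only a sketch: `build_command` and `run_fitting` are hypothetical helpers, they use only the three flags shown above, and they assume the script is started from the repository root with the interpreter of the environment you normally use for run_dexavatar.py.

```python
import subprocess
import sys


def build_command(input_img_folder, output_path,
                  fitting_experiment="./dexavatar_fitting"):
    """Assemble the run_dexavatar.py invocation shown above as an argument list."""
    return [
        sys.executable, "run_dexavatar.py",
        "--input_img_folder", str(input_img_folder),
        "--output_path", str(output_path),
        "--fitting_experiment", str(fitting_experiment),
    ]


def run_fitting(input_img_folder, output_path, dry_run=False):
    """Run the fitting command; with dry_run=True only return the command."""
    command = build_command(input_img_folder, output_path)
    if not dry_run:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(command, check=True)
    return command
```

For example, `run_fitting("DATA_PATH", "OUTPUT_FOLDER", dry_run=True)` returns the argument list without launching anything, which is useful for checking the paths first.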

About the project

This project is carried out at the Human-Centered AI Lab in the Faculty of Information Technology, Monash University, Melbourne (Clayton), Australia.

Project Members:

Kaustubh Kundu (Monash University, Melbourne, Australia),
Hrishav Bakul Barua (Monash University and TCS Research, Kolkata, India),
Lucy Robertson-Bell (Monash University, Melbourne, Australia),
Zhixi Cai (Monash University, Melbourne, Australia), and
Kalin Stefanov (Monash University, Melbourne, Australia)

Funding details

This work is supported by the Discovery Early Career Researcher Award (DECRA) fellowship from the Australian Research Council (ARC) [Grant no. DE230100049 | Project: Towards automated Australian Sign Language translation]. We also acknowledge Monash University (M3 Cluster) and the National Computational Infrastructure (NCI) for providing the High Performance Computing (HPC) resources used to carry out the experiments.

Overview

The trend in sign language generation is centered around data-driven generative methods. These methods require vast amounts of precise 2D and 3D human pose data to achieve a generation quality acceptable to the Deaf community. However, currently, most sign language datasets are video-based and limited to automatically reconstructed 2D human poses (i.e., keypoints) and lack accurate 3D information. However, manual production of accurate 2D and 3D human pose information from videos is a labor-intensive process. Furthermore, existing state-of-the-art for automatic 3D human pose estimation from sign language videos is prone to self-occlusion, noise, and motion blur effects, resulting in poor reconstruction quality. In response to this, we introduce DexAvatar, a novel framework to reconstruct bio-mechanically accurate fine-grained hand articulations and body movements from in-the-wild monocular sign language videos, guided by learned 3D hand and body priors. DexAvatar achieves strong performance in the SGNify motion capture dataset, the only benchmark available for this task, reaching an improvement of 35.11% in the estimation of body and hand poses compared to the state-of-the-art.

Overall Architecture

Qualitative Results (check out the videos!!)

General.mp4

Motion blur cases

blur.mp4

Self-occlusion cases

occlusion.mp4

Gaussian Noise cases

noise.mp4

Citation

If you find our work (i.e., the code, the theory/concept, or the dataset) useful for your research or development activities, please consider citing our work as follows:

@InProceedings{Kundu_2026_WACV,
    author    = {Kundu, Kaustubh and Barua, Hrishav Bakul and Robertson-Bell, Lucy and Cai, Zhixi and Stefanov, Kalin},
    title     = {DexAvatar: 3D Sign Language Reconstruction with Hand and Body Pose Priors},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {5842-5852}
}

License and Copyright

----------------------------------------------------------------------------------------
Copyright 2025 | All the authors and contributors of this repository as mentioned above.
----------------------------------------------------------------------------------------

Please check the License Agreement.
