Motion VQ-Trans

Pytorch implementation of paper "Generating Human Motion from Textual Descriptions with High Quality Discrete Representation"

[Notebook Demo]

If our project is helpful for your research, please consider citing : (todo)

@inproceedings{shen2020ransac,
          title={RANSAC-Flow: generic two-stage image alignment},
          author={Shen, Xi and Darmon, Fran{\c{c}}ois and Efros, Alexei A and Aubry, Mathieu},
          booktitle={16th European Conference on Computer Vision}
          year={2020}
        }

Table of Content

1. Visual Results
2. Installation
3. Quick Start
4. Train
5. Evaluation
6. Motion Render
7. Acknowledgement
8. ChangLog

1. Visual Results (More results can be found in our project page (todo))

2. Installation

2.1. Environment

Our model can be learnt in a single GPU V100-32G

conda env create -f environment.yml
conda activate VQTrans

The code was tested on Python 3.8 and PyTorch 1.8.1.

2.2. Dependencies

bash dataset/prepare/download_glove.sh

2.3. Datasets

We are using two 3D human motion-language dataset: HumanML3D and KIT-ML. For both datasets, you could find the details as well as download link [here].

Take HumanML3D for an example, the file directory should look like this:

./dataset/HumanML3D/
├── new_joint_vecs/
├── texts/
├── Mean.npy # same as in [HumanML3D](https://github.com/EricGuo5513/HumanML3D) 
├── Std.npy # same as in [HumanML3D](https://github.com/EricGuo5513/HumanML3D) 
├── train.txt
├── val.txt
├── test.txt
├── train_val.txt
└──all.txt

2.4. Motion & text feature extractors:

We use the same extractors provided by t2m to evaluate our generated motions. Please download the extractors.

bash dataset/prepare/download_extractor.sh

2.5. Pre-trained models

The pretrained model files will be stored in the 'pretrained' folder:

bash dataset/prepare/download_model.sh

2.6. Render motion (optional)

If you want to render the generated motion, you need to install:

sudo sh dataset/prepare/download_smpl.sh
conda install -c menpo osmesa
conda install h5py
conda install -c conda-forge shapely pyrender trimesh mapbox_earcut

3. Quick Start

A quick start guide of how to use our code is available in demo.ipynb

demo

4. Train

Note that, for kit dataset, just need to set '--dataname kit'.

4.1. VQ-VAE

The results are saved in the folder output_vqfinal.

VQ training

python3 train_vq.py \
--batch-size 256 \
--lr 2e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 512 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname t2m \
--vq-act relu \
--quantizer ema_reset \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name VQVAE

4.2. Motion-Transformer

The results are saved in the folder output_transformer.

MoTrans training

python3 train_t2m_trans.py  \
--exp-name VQTransformer \
--batch-size 128 \
--num-layers 9 \
--embed-dim-gpt 1024 \
--nb-code 512 \
--n-head-gpt 16 \
--block-size 51 \
--ff-rate 4 \
--drop-out-rate 0.1 \
--resume-pth output/VQVAE/net_last.pth \
--vq-name VQVAE \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname t2m \
--down-t 2 \
--depth 3 \
--quantizer ema_reset \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--vq-act relu

5. Evaluation

5.1. VQ-VAE

VQ eval

python3 VQ_eval.py \
--batch-size 256 \
--lr 2e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 512 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname t2m \
--vq-act relu \
--quantizer ema_reset \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name TEST_VQVAE \
--resume-pth output/VQVAE/net_last.pth

5.2. Motion-Transformer

MoTrans eval

python3 GPT_eval_multi.py  \
--exp-name TEST_VQTransformer \
--batch-size 128 \
--num-layers 9 \
--embed-dim-gpt 1024 \
--nb-code 512 \
--n-head-gpt 16 \
--block-size 51 \
--ff-rate 4 \
--drop-out-rate 0.1 \
--resume-pth output/VQVAE/net_last.pth \
--vq-name VQVAE \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname t2m \
--down-t 2 \
--depth 3 \
--quantizer ema_reset \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--vq-act relu \
--resume-gpt output/VQTransformer/net_best_fid.pth

6. Motion Render

Motion Render

You should input the npy folder address and the motion names. Here is an example:

python3 render_final.py --filedir output/TEST_VQTransformer/ --motion-list 000019 005485

7. Acknowledgement

We appreciate helps from :

Public code like text-to-motion, TM2T etc.