meow committed on
Commit
d6d3a5b
1 Parent(s): f9fd2fa
This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. LICENSE +21 -0
  2. README.md +462 -12
  3. app.py +52 -0
  4. cog.yaml +38 -0
  5. common/.gitignore +1 -0
  6. common/___init___.py +0 -0
  7. common/abstract_pl.py +180 -0
  8. common/args_utils.py +15 -0
  9. common/body_models.py +146 -0
  10. common/camera.py +474 -0
  11. common/comet_utils.py +158 -0
  12. common/data_utils.py +371 -0
  13. common/ld_utils.py +116 -0
  14. common/list_utils.py +52 -0
  15. common/mesh.py +94 -0
  16. common/metrics.py +51 -0
  17. common/np_utils.py +7 -0
  18. common/object_tensors.py +293 -0
  19. common/pl_utils.py +63 -0
  20. common/rend_utils.py +139 -0
  21. common/rot.py +782 -0
  22. common/sys_utils.py +44 -0
  23. common/thing.py +66 -0
  24. common/torch_utils.py +212 -0
  25. common/transforms.py +356 -0
  26. common/viewer.py +287 -0
  27. common/vis_utils.py +129 -0
  28. common/xdict.py +288 -0
  29. data_loaders/.DS_Store +0 -0
  30. data_loaders/__pycache__/get_data.cpython-38.pyc +0 -0
  31. data_loaders/__pycache__/tensors.cpython-38.pyc +0 -0
  32. data_loaders/get_data.py +178 -0
  33. data_loaders/humanml/.DS_Store +0 -0
  34. data_loaders/humanml/README.md +1 -0
  35. data_loaders/humanml/common/__pycache__/quaternion.cpython-38.pyc +0 -0
  36. data_loaders/humanml/common/__pycache__/skeleton.cpython-38.pyc +0 -0
  37. data_loaders/humanml/common/quaternion.py +423 -0
  38. data_loaders/humanml/common/skeleton.py +199 -0
  39. data_loaders/humanml/data/__init__.py +0 -0
  40. data_loaders/humanml/data/__pycache__/__init__.cpython-38.pyc +0 -0
  41. data_loaders/humanml/data/__pycache__/dataset.cpython-38.pyc +0 -0
  42. data_loaders/humanml/data/__pycache__/dataset_ours.cpython-38.pyc +0 -0
  43. data_loaders/humanml/data/__pycache__/dataset_ours_single_seq.cpython-38.pyc +0 -0
  44. data_loaders/humanml/data/__pycache__/utils.cpython-38.pyc +0 -0
  45. data_loaders/humanml/data/dataset.py +795 -0
  46. data_loaders/humanml/data/dataset_ours.py +0 -0
  47. data_loaders/humanml/data/dataset_ours_single_seq.py +0 -0
  48. data_loaders/humanml/data/utils.py +507 -0
  49. data_loaders/humanml/motion_loaders/__init__.py +0 -0
  50. data_loaders/humanml/motion_loaders/__pycache__/__init__.cpython-38.pyc +0 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2022 Guy Tevet
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,12 +1,462 @@
- ---
- title: Gene Hoi Denoising
- emoji: 📉
- colorFrom: yellow
- colorTo: yellow
- sdk: gradio
- sdk_version: 4.17.0
- app_file: app.py
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # MDM: Human Motion Diffusion Model
+
+
+ data in what format and data in this format
+
+
+ [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/human-motion-diffusion-model/motion-synthesis-on-humanact12)](https://paperswithcode.com/sota/motion-synthesis-on-humanact12?p=human-motion-diffusion-model)
+ [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/human-motion-diffusion-model/motion-synthesis-on-humanml3d)](https://paperswithcode.com/sota/motion-synthesis-on-humanml3d?p=human-motion-diffusion-model)
+ [![arXiv](https://img.shields.io/badge/arXiv-<2209.14916>-<COLOR>.svg)](https://arxiv.org/abs/2209.14916)
+
+ <a href="https://replicate.com/arielreplicate/motion_diffusion_model"><img src="https://replicate.com/arielreplicate/motion_diffusion_model/badge"></a>
+
+ The official PyTorch implementation of the paper [**"Human Motion Diffusion Model"**](https://arxiv.org/abs/2209.14916).
+
+ Please visit our [**webpage**](https://guytevet.github.io/mdm-page/) for more details.
+
+ ![teaser](https://github.com/GuyTevet/mdm-page/raw/main/static/figures/github.gif)
+
+ #### Bibtex
+ If you find this code useful in your research, please cite:
+
+ ```
+ @article{tevet2022human,
+   title={Human Motion Diffusion Model},
+   author={Tevet, Guy and Raab, Sigal and Gordon, Brian and Shafir, Yonatan and Bermano, Amit H and Cohen-Or, Daniel},
+   journal={arXiv preprint arXiv:2209.14916},
+   year={2022}
+ }
+ ```
+
+ ## News
+
+ 📢 **23/Nov/22** - Fixed evaluation issue (#42) - Please pull and run `bash prepare/download_t2m_evaluators.sh` from the top of the repo to adapt.
+
+ 📢 **4/Nov/22** - Added sampling, training and evaluation of unconstrained tasks.
+ Note slight env changes to support the new code. If you already have an installed environment, run `bash prepare/download_unconstrained_assets.sh; conda install -y -c anaconda scikit-learn` to adapt.
+
+ 📢 **3/Nov/22** - Added in-between and upper-body editing.
+
+ 📢 **31/Oct/22** - Added sampling, training and evaluation of action-to-motion tasks.
+
+ 📢 **9/Oct/22** - Added training and evaluation scripts.
+ Note slight env changes to support the new code. If you already have an installed environment, run `bash prepare/download_glove.sh; pip install clearml` to adapt.
+
+ 📢 **6/Oct/22** - First release - sampling and rendering using pre-trained models.
+
+
+ ## Getting started
+
+ This code was tested on `Ubuntu 18.04.5 LTS` and requires:
+
+ * Python 3.7
+ * conda3 or miniconda3
+ * CUDA capable GPU (one is enough)
+
+ ### 1. Setup environment
+
+ Install ffmpeg (if not already installed):
+
+ ```shell
+ sudo apt update
+ sudo apt install ffmpeg
+ ```
+ For Windows, use [this](https://www.geeksforgeeks.org/how-to-install-ffmpeg-on-windows/) instead.
+
+ Set up the conda env:
+ ```shell
+ conda env create -f environment.yml
+ conda activate mdm
+ python -m spacy download en_core_web_sm
+ pip install git+https://github.com/openai/CLIP.git
+ ```
+
+ Download dependencies:
+
+ <details>
+ <summary><b>Text to Motion</b></summary>
+
+ ```bash
+ bash prepare/download_smpl_files.sh
+ bash prepare/download_glove.sh
+ bash prepare/download_t2m_evaluators.sh
+ ```
+ </details>
+
+ <details>
+ <summary><b>Action to Motion</b></summary>
+
+ ```bash
+ bash prepare/download_smpl_files.sh
+ bash prepare/download_recognition_models.sh
+ ```
+ </details>
+
+ <details>
+ <summary><b>Unconstrained</b></summary>
+
+ ```bash
+ bash prepare/download_smpl_files.sh
+ bash prepare/download_recognition_models.sh
+ bash prepare/download_recognition_unconstrained_models.sh
+ ```
+ </details>
+
+ ### 2. Get data
+
+ <details>
+ <summary><b>Text to Motion</b></summary>
+
+ There are two paths to get the data:
+
+ (a) **Go the easy way** if you just want to generate text-to-motion (excluding editing, which does require motion capture data).
+
+ (b) **Get full data** to train and evaluate the model.
+
+
+ #### a. The easy way (text only)
+
+ **HumanML3D** - Clone HumanML3D, then copy the data dir to our repository:
+
+ ```shell
+ cd ..
+ git clone https://github.com/EricGuo5513/HumanML3D.git
+ unzip ./HumanML3D/HumanML3D/texts.zip -d ./HumanML3D/HumanML3D/
+ cp -r HumanML3D/HumanML3D motion-diffusion-model/dataset/HumanML3D
+ cd motion-diffusion-model
+ ```
+
+
+ #### b. Full data (text + motion capture)
+
+ **HumanML3D** - Follow the instructions in [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git),
+ then copy the resulting dataset to our repository:
+
+ ```shell
+ cp -r ../HumanML3D/HumanML3D ./dataset/HumanML3D
+ ```
+
+ **KIT** - Download from [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git) (no processing needed this time) and place the result in `./dataset/KIT-ML`.
+ </details>
+
+ <details>
+ <summary><b>Action to Motion</b></summary>
+
+ **UESTC, HumanAct12**
+ ```bash
+ bash prepare/download_a2m_datasets.sh
+ ```
+ </details>
+
+ <details>
+ <summary><b>Unconstrained</b></summary>
+
+ **HumanAct12**
+ ```bash
+ bash prepare/download_unconstrained_datasets.sh
+ ```
+ </details>
+
+ ### 3. Download the pretrained models
+
+ Download the model(s) you wish to use, then unzip and place them in `./save/`.
+
+ <details>
+ <summary><b>Text to Motion</b></summary>
+
+ **You need only the first one.**
+
+ **HumanML3D**
+
+ [humanml-encoder-512](https://drive.google.com/file/d/1PE0PK8e5a5j-7-Xhs5YET5U5pGh0c821/view?usp=sharing) (best model)
+
+ [humanml-decoder-512](https://drive.google.com/file/d/1q3soLadvVh7kJuJPd2cegMNY2xVuVudj/view?usp=sharing)
+
+ [humanml-decoder-with-emb-512](https://drive.google.com/file/d/1GnsW0K3UjuOkNkAWmjrGIUmeDDZrmPE5/view?usp=sharing)
+
+ **KIT**
+
+ [kit-encoder-512](https://drive.google.com/file/d/1SHCRcE0es31vkJMLGf9dyLe7YsWj7pNL/view?usp=sharing)
+
+ </details>
+
+ <details>
+ <summary><b>Action to Motion</b></summary>
+
+ **UESTC**
+
+ [uestc](https://drive.google.com/file/d/1goB2DJK4B-fLu2QmqGWKAqWGMTAO6wQ6/view?usp=sharing)
+
+ [uestc_no_fc](https://drive.google.com/file/d/1fpv3mR-qP9CYCsi9CrQhFqlLavcSQky6/view?usp=sharing)
+
+ **HumanAct12**
+
+ [humanact12](https://drive.google.com/file/d/154X8_Lgpec6Xj0glEGql7FVKqPYCdBFO/view?usp=sharing)
+
+ [humanact12_no_fc](https://drive.google.com/file/d/1frKVMBYNiN5Mlq7zsnhDBzs9vGJvFeiQ/view?usp=sharing)
+
+ </details>
+
+ <details>
+ <summary><b>Unconstrained</b></summary>
+
+ **HumanAct12**
+
+ [humanact12_unconstrained](https://drive.google.com/file/d/1uG68m200pZK3pD-zTmPXu5XkgNpx_mEx/view?usp=share_link)
+
+ </details>
+
+
+ ## Example Usage
+
+
+ Example usage and results on the TACO dataset:
+
+
+ | Input | Result | Overlaid |
+ | :----------------------: | :---------------------: | :-----------------------: |
+ | ![](assets/taco-20231104_017-src-a.gif) | ![](assets/taco-20231104_017-res-a.gif) | ![](assets/taco-20231104_017-overlayed-a.gif) |
+
+
+ Follow the steps below to reproduce the above result.
+
+ 1. **Denoising**
+    ```bash
+    bash scripts/val_examples/predict_taco_rndseed_spatial_20231104_017.sh
+    ```
+    Ten random seeds will be used for prediction. The predicted results will be saved in the folder `./data/taco/result`.
+ 2. **Mesh reconstruction**
+    ```bash
+    bash scripts/val_examples/reconstruct_taco_20231104_017.sh
+    ```
+    Results will be saved in the same folder as in the previous step.
+ 3. **Extracting results and visualization**
+
+
+
+ <details>
+ <summary><b>Text to Motion</b></summary>
+
+ ### Generate from test set prompts
+
+ ```shell
+ python -m sample.generate --model_path ./save/humanml_trans_enc_512/model000200000.pt --num_samples 10 --num_repetitions 3
+ ```
+
+ ### Generate from your text file
+
+ ```shell
+ python -m sample.generate --model_path ./save/humanml_trans_enc_512/model000200000.pt --input_text ./assets/example_text_prompts.txt
+ ```
+
+ ### Generate a single prompt
+
+ ```shell
+ python -m sample.generate --model_path ./save/humanml_trans_enc_512/model000200000.pt --text_prompt "the person walked forward and is picking up his toolbox."
+ ```
+ </details>
+
+ <details>
+ <summary><b>Action to Motion</b></summary>
+
+ ### Generate from test set actions
+
+ ```shell
+ python -m sample.generate --model_path ./save/humanact12/model000350000.pt --num_samples 10 --num_repetitions 3
+ ```
+
+ ### Generate from your actions file
+
+ ```shell
+ python -m sample.generate --model_path ./save/humanact12/model000350000.pt --action_file ./assets/example_action_names_humanact12.txt
+ ```
+
+ ### Generate a single action
+
+ ```shell
+ python -m sample.generate --model_path ./save/humanact12/model000350000.pt --text_prompt "drink"
+ ```
+ </details>
+
+ <details>
+ <summary><b>Unconstrained</b></summary>
+
+ ```shell
+ python -m sample.generate --model_path ./save/unconstrained/model000450000.pt --num_samples 10 --num_repetitions 3
+ ```
+
+ In total, (num_samples * num_repetitions) samples are created and visually organized in a grid of num_samples rows and num_repetitions columns (the command above yields 10 * 3 = 30 motions).
+
+ </details>
+
+ **You may also define:**
+ * `--device` id.
+ * `--seed` to sample different prompts.
+ * `--motion_length` (text-to-motion only) in seconds (maximum is 9.8 seconds).
+
+ **Running those will get you:**
+
+ * `results.npy` file with text prompts and xyz positions of the generated animation (see the loading sketch below)
+ * `sample##_rep##.mp4` - a stick figure animation for each generated motion.
+
+ It will look something like this:
+
+ ![example](assets/example_stick_fig.gif)
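+
+ If you want to inspect `results.npy` programmatically, here is a minimal sketch (not an official script). It assumes the file stores a pickled dict; the exact key names (e.g. `text`, `motion`) are an assumption, so print the keys of your own file first:
+
+ ```python
+ import numpy as np
+
+ # results.npy stores a pickled dict, so allow_pickle is required.
+ data = np.load("results.npy", allow_pickle=True).item()
+ print(list(data.keys()))  # check what was actually saved
+
+ # Assumed keys (verify against the printout above):
+ # 'text'   - the text prompts
+ # 'motion' - xyz joint positions of the generated animations
+ if "text" in data and "motion" in data:
+     print(data["text"][0])                   # prompt of the first sample
+     print(np.asarray(data["motion"]).shape)  # layout depends on the model/dataset
+ ```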
+
+ You can stop here, or render the SMPL mesh using the following script.
+
+ ### Render SMPL mesh
+
+ To create an SMPL mesh per frame, run:
+
+ ```shell
+ python -m visualize.render_mesh --input_path /path/to/mp4/stick/figure/file
+ ```
+
+ **This script outputs:**
+ * `sample##_rep##_smpl_params.npy` - SMPL parameters (thetas, root translations, vertices and faces); a loading sketch follows below.
+ * `sample##_rep##_obj` - Mesh per frame in `.obj` format.
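+
+ As a quick check of the saved parameters, here is a minimal sketch (not an official script); the key names are an assumption based on the description above, so inspect the keys of your own file first:
+
+ ```python
+ import numpy as np
+
+ # sample##_rep##_smpl_params.npy (## replaced by actual indices) stores a pickled dict.
+ params = np.load("sample00_rep00_smpl_params.npy", allow_pickle=True).item()
+ print(list(params.keys()))  # expected to cover thetas, root translations, vertices and faces
+
+ # Assuming per-frame vertices and a shared face list are stored, a mesh for one
+ # frame could be rebuilt with any mesh library, e.g.:
+ #   import trimesh
+ #   mesh = trimesh.Trimesh(vertices=params["vertices"][frame_idx], faces=params["faces"])
+ ```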
+
+ **Notes:**
+ * The `.obj` files can be imported into Blender/Maya/3DS-MAX and rendered there.
+ * This script runs [SMPLify](https://smplify.is.tue.mpg.de/) and needs a GPU as well (can be specified with the `--device` flag).
+ * **Important** - Do not change the original `.mp4` path before running the script.
+
+ **Notes for 3D makers:**
+ * You have two ways to animate the sequence:
+   1. Use the [SMPL add-on](https://smpl.is.tue.mpg.de/index.html) and the theta parameters saved to `sample##_rep##_smpl_params.npy` (we always use beta=0 and the gender-neutral model).
+   1. A more straightforward way is using the mesh data itself. All meshes have the same topology (SMPL), so you just need to keyframe vertex locations.
+      Since the OBJs do not preserve vertex order, we also save this data to the `sample##_rep##_smpl_params.npy` file for your convenience.
+
+ ## Motion Editing
+
+ * This feature is available for text-to-motion datasets (HumanML3D and KIT).
+ * In order to use it, you need to acquire the full data (not just the texts).
+ * We support the two modes presented in the paper: `in_between` and `upper_body`.
+
+ ### Unconditioned editing
+
+ ```shell
+ python -m sample.edit --model_path ./save/humanml_trans_enc_512/model000200000.pt --edit_mode in_between
+ ```
+
+ **You may also define:**
+ * `--num_samples` (default is 10) / `--num_repetitions` (default is 3).
+ * `--device` id.
+ * `--seed` to sample different prompts.
+ * `--edit_mode upper_body` for upper-body editing (the lower body is fixed).
+
+
+ The output will look like this (blue frames are from the input motion; orange were generated by the model):
+
+ ![example](assets/in_between_edit.gif)
+
+ * As in *Motion Synthesis*, you may follow the **Render SMPL mesh** section to obtain meshes for your edited motions.
+
+ ### Text conditioned editing
+
+ Just add the text conditioning using `--text_condition`. For example:
+
+ ```shell
+ python -m sample.edit --model_path ./save/humanml_trans_enc_512/model000200000.pt --edit_mode upper_body --text_condition "A person throws a ball"
+ ```
+
+ The output will look like this (blue joints are from the input motion; orange were generated by the model):
+
+ ![example](assets/upper_body_edit.gif)
+
+ ## Train your own MDM
+
+ <details>
+ <summary><b>Text to Motion</b></summary>
+
+ **HumanML3D**
+ ```shell
+ python -m train.train_mdm --save_dir save/my_humanml_trans_enc_512 --dataset humanml
+ ```
+
+ **KIT**
+ ```shell
+ python -m train.train_mdm --save_dir save/my_kit_trans_enc_512 --dataset kit
+ ```
+ </details>
+ <details>
+ <summary><b>Action to Motion</b></summary>
+
+ ```shell
+ python -m train.train_mdm --save_dir save/my_name --dataset {humanact12,uestc} --cond_mask_prob 0 --lambda_rcxyz 1 --lambda_vel 1 --lambda_fc 1
+ ```
+ </details>
+
+ <details>
+ <summary><b>Unconstrained</b></summary>
+
+ ```shell
+ python -m train.train_mdm --save_dir save/my_name --dataset humanact12 --cond_mask_prob 0 --lambda_rcxyz 1 --lambda_vel 1 --lambda_fc 1 --unconstrained
+ ```
+ </details>
+
+ * Use `--device` to define GPU id.
+ * Use `--arch` to choose one of the architectures reported in the paper `{trans_enc, trans_dec, gru}` (`trans_enc` is default).
+ * Add `--train_platform_type {ClearmlPlatform, TensorboardPlatform}` to track results with either [ClearML](https://clear.ml/) or [Tensorboard](https://www.tensorflow.org/tensorboard).
+ * Add `--eval_during_training` to run a short (90 minutes) evaluation for each saved checkpoint.
+   This will slow down training but will give you better monitoring.
+
+ ## Evaluate
+
+ <details>
+ <summary><b>Text to Motion</b></summary>
+
+ * Takes about 20 hours (on a single GPU)
+ * The output of this script for the pre-trained models (as reported in the paper) is provided in the checkpoints zip file.
+
+ **HumanML3D**
+ ```shell
+ python -m eval.eval_humanml --model_path ./save/humanml_trans_enc_512/model000475000.pt
+ ```
+
+ **KIT**
+ ```shell
+ python -m eval.eval_humanml --model_path ./save/kit_trans_enc_512/model000400000.pt
+ ```
+ </details>
+
+ <details>
+ <summary><b>Action to Motion</b></summary>
+
+ * Takes about 7 hours for UESTC and 2 hours for HumanAct12 (on a single GPU)
+ * The output of this script for the pre-trained models (as reported in the paper) is provided in the checkpoints zip file.
+
+ ```shell
+ python -m eval.eval_humanact12_uestc --model <path-to-model-ckpt> --eval_mode full
+ ```
+ where `path-to-model-ckpt` can be a path to any of the pretrained action-to-motion models listed above, or to a checkpoint trained by the user.
+
+ </details>
+
+
+ <details>
+ <summary><b>Unconstrained</b></summary>
+
+ * Takes about 3 hours (on a single GPU)
+
+ ```shell
+ python -m eval.eval_humanact12_uestc --model ./save/unconstrained/model000450000.pt --eval_mode full
+ ```
+
+ Precision and recall are not computed to save computing time. If you wish to compute them, edit the file `eval/a2m/gru_eval.py` and change `fast=True` to `fast=False`.
+
+ </details>
+
+ ## Acknowledgments
+
+ This code stands on the shoulders of giants. We want to thank the following projects that our code is based on:
+
+ [guided-diffusion](https://github.com/openai/guided-diffusion), [MotionCLIP](https://github.com/GuyTevet/MotionCLIP), [text-to-motion](https://github.com/EricGuo5513/text-to-motion), [actor](https://github.com/Mathux/ACTOR), [joints2smpl](https://github.com/wangsen1312/joints2smpl), [MoDi](https://github.com/sigal-raab/MoDi).
+
+ ## License
+ This code is distributed under an [MIT LICENSE](LICENSE).
+
+ Note that our code depends on other libraries, including CLIP, SMPL, SMPL-X, PyTorch3D, and uses datasets that each have their own respective licenses that must also be followed.
app.py ADDED
@@ -0,0 +1,52 @@
+ import numpy as np
+
+ import gradio as gr
+
+
+ import os
+
+ import tempfile
+
+ import shutil
+
+ # from gradio_inter.predict_from_file import predict_from_file
+ from gradio_inter.create_bash_file import create_bash_file
+
+
+ def create_temp_file(path: str) -> str:
+     temp_dir = tempfile.gettempdir()
+     temp_folder = os.path.join(temp_dir, "denoising")
+     os.makedirs(temp_folder, exist_ok=True)
+     # Clean up directory
+     # for i in os.listdir(temp_folder):
+     #     print("Removing", i)
+     #     os.remove(os.path.join(temp_folder, i))
+
+     temp_path = os.path.join(temp_folder, path.split("/")[-1])
+     shutil.copy2(path, temp_path)
+     return temp_path
+
+
+ # from gradio_inter.predict import predict_from_data
+ # from gradio_inter.predi
+
+
+ def transpose(matrix):
+     return matrix.T
+
+
+ def predict(file_path: str):
+     temp_file_path = create_temp_file(file_path)
+     # predict_from_file
+     temp_bash_file = create_bash_file(temp_file_path)
+
+     os.system(f"bash {temp_bash_file}")
+
+
+ demo = gr.Interface(
+     predict,
+     # gr.Dataframe(type="numpy", datatype="number", row_count=5, col_count=3),
+     gr.File(type="filepath"),
+     "dict",
+     cache_examples=False
+ )
+
+ if __name__ == "__main__":
+     demo.launch()
cog.yaml ADDED
@@ -0,0 +1,38 @@
+ build:
+   gpu: true
+   cuda: "11.3"
+   python_version: 3.8
+   system_packages:
+     - libgl1-mesa-glx
+     - libglib2.0-0
+
+   python_packages:
+     - imageio==2.22.2
+     - matplotlib==3.1.3
+     - spacy==3.3.1
+     - smplx==0.1.28
+     - chumpy==0.70
+     - blis==0.7.8
+     - click==8.1.3
+     - confection==0.0.2
+     - ftfy==6.1.1
+     - importlib-metadata==5.0.0
+     - lxml==4.9.1
+     - murmurhash==1.0.8
+     - preshed==3.0.7
+     - pycryptodomex==3.15.0
+     - regex==2022.9.13
+     - srsly==2.4.4
+     - thinc==8.0.17
+     - typing-extensions==4.1.1
+     - urllib3==1.26.12
+     - wasabi==0.10.1
+     - wcwidth==0.2.5
+
+   run:
+     - apt update -y && apt-get install ffmpeg -y
+     # - python -m spacy download en_core_web_sm
+     - git clone https://github.com/openai/CLIP.git sub_modules/CLIP
+     - pip install -e sub_modules/CLIP
+
+ predict: "sample/predict.py:Predictor"
common/.gitignore ADDED
@@ -0,0 +1 @@
+ __pycache__
common/___init___.py ADDED
File without changes
common/abstract_pl.py ADDED
@@ -0,0 +1,180 @@
+ import time
+
+ import numpy as np
+ import pytorch_lightning as pl
+ import torch
+ import torch.optim as optim
+
+ import common.pl_utils as pl_utils
+ from common.comet_utils import log_dict
+ from common.pl_utils import avg_losses_cpu, push_checkpoint_metric
+ from common.xdict import xdict
+
+
+ class AbstractPL(pl.LightningModule):
+     def __init__(
+         self,
+         args,
+         push_images_fn,
+         tracked_metric,
+         metric_init_val,
+         high_loss_val,
+     ):
+         super().__init__()
+         self.experiment = args.experiment
+         self.args = args
+         self.tracked_metric = tracked_metric
+         self.metric_init_val = metric_init_val
+
+         self.started_training = False
+         self.loss_dict_vec = []
+         self.push_images = push_images_fn
+         self.vis_train_batches = []
+         self.vis_val_batches = []
+         self.high_loss_val = high_loss_val
+         self.max_vis_examples = 20
+         self.val_step_outputs = []
+         self.test_step_outputs = []
+
+     def set_training_flags(self):
+         self.started_training = True
+
+     def load_from_ckpt(self, ckpt_path):
+         sd = torch.load(ckpt_path)["state_dict"]
+         print(self.load_state_dict(sd))
+
+     def training_step(self, batch, batch_idx):
+         self.set_training_flags()
+         if len(self.vis_train_batches) < self.num_vis_train:
+             self.vis_train_batches.append(batch)
+         inputs, targets, meta_info = batch
+
+         out = self.forward(inputs, targets, meta_info, "train")
+         loss = out["loss"]
+
+         loss = {k: loss[k].mean().view(-1) for k in loss}
+         total_loss = sum(loss[k] for k in loss)
+
+         loss_dict = {"total_loss": total_loss, "loss": total_loss}
+         loss_dict.update(loss)
+
+         for k, v in loss_dict.items():
+             if k != "loss":
+                 loss_dict[k] = v.detach()
+
+         log_every = self.args.log_every
+         self.loss_dict_vec.append(loss_dict)
+         self.loss_dict_vec = self.loss_dict_vec[len(self.loss_dict_vec) - log_every :]
+         if batch_idx % log_every == 0 and batch_idx != 0:
+             running_loss_dict = avg_losses_cpu(self.loss_dict_vec)
+             running_loss_dict = xdict(running_loss_dict).postfix("__train")
+             log_dict(self.experiment, running_loss_dict, step=self.global_step)
+         return loss_dict
+
+     def on_train_epoch_end(self):
+         self.experiment.log_epoch_end(self.current_epoch)
+
+     def validation_step(self, batch, batch_idx):
+         if len(self.vis_val_batches) < self.num_vis_val:
+             self.vis_val_batches.append(batch)
+         out = self.inference_step(batch, batch_idx)
+         self.val_step_outputs.append(out)
+         return out
+
+     def on_validation_epoch_end(self):
+         outputs = self.val_step_outputs
+         outputs = self.inference_epoch_end(outputs, postfix="__val")
+         self.log("loss__val", outputs["loss__val"])
+         self.val_step_outputs.clear()  # free memory
+         return outputs
+
+     def inference_step(self, batch, batch_idx):
+         if self.training:
+             self.eval()
+         with torch.no_grad():
+             inputs, targets, meta_info = batch
+             out, loss = self.forward(inputs, targets, meta_info, "test")
+             return {"out_dict": out, "loss": loss}
+
+     def inference_epoch_end(self, out_list, postfix):
+         if not self.started_training:
+             self.started_training = True
+             result = push_checkpoint_metric(self.tracked_metric, self.metric_init_val)
+             return result
+
+         # unpack
+         outputs, loss_dict = pl_utils.reform_outputs(out_list)
+
+         if "test" in postfix:
+             per_img_metric_dict = {}
+             for k, v in outputs.items():
+                 if "metric." in k:
+                     per_img_metric_dict[k] = np.array(v)
+
+         metric_dict = {}
+         for k, v in outputs.items():
+             if "metric." in k:
+                 metric_dict[k] = np.nanmean(np.array(v))
+
+         loss_metric_dict = {}
+         loss_metric_dict.update(metric_dict)
+         loss_metric_dict.update(loss_dict)
+         loss_metric_dict = xdict(loss_metric_dict).postfix(postfix)
+
+         log_dict(
+             self.experiment,
+             loss_metric_dict,
+             step=self.global_step,
+         )
+
+         if self.args.interface_p is None and "test" not in postfix:
+             result = push_checkpoint_metric(
+                 self.tracked_metric, loss_metric_dict[self.tracked_metric]
+             )
+             self.log(self.tracked_metric, result[self.tracked_metric])
+
+         if not self.args.no_vis:
+             print("Rendering train images")
+             self.visualize_batches(self.vis_train_batches, "_train", False)
+             print("Rendering val images")
+             self.visualize_batches(self.vis_val_batches, "_val", False)
+
+         if "test" in postfix:
+             return (
+                 outputs,
+                 {"per_img_metric_dict": per_img_metric_dict},
+                 metric_dict,
+             )
+         return loss_metric_dict
+
+     def configure_optimizers(self):
+         optimizer = torch.optim.Adam(self.parameters(), lr=self.args.lr)
+         scheduler = optim.lr_scheduler.MultiStepLR(
+             optimizer, self.args.lr_dec_epoch, gamma=self.args.lr_decay, verbose=True
+         )
+         return [optimizer], [scheduler]
+
+     def visualize_batches(self, batches, postfix, no_tqdm=True):
+         im_list = []
+         if self.training:
+             self.eval()
+
+         tic = time.time()
+         for batch_idx, batch in enumerate(batches):
+             with torch.no_grad():
+                 inputs, targets, meta_info = batch
+                 vis_dict = self.forward(inputs, targets, meta_info, "vis")
+                 for vis_fn in self.vis_fns:
+                     curr_im_list = vis_fn(
+                         vis_dict,
+                         self.max_vis_examples,
+                         self.renderer,
+                         postfix=postfix,
+                         no_tqdm=no_tqdm,
+                     )
+                     im_list += curr_im_list
+                 print("Rendering: %d/%d" % (batch_idx + 1, len(batches)))
+
+         self.push_images(self.experiment, im_list, self.global_step)
+         print("Done rendering (%.1fs)" % (time.time() - tic))
+         return im_list
common/args_utils.py ADDED
@@ -0,0 +1,15 @@
+ from loguru import logger
+
+
+ def set_default_params(args, default_args):
+     # if a val is not set on argparse, use default val
+     # else, use the one in the argparse
+     custom_dict = {}
+     for key, val in args.items():
+         if val is None:
+             args[key] = default_args[key]
+         else:
+             custom_dict[key] = val
+
+     logger.info(f"Using custom values: {custom_dict}")
+     return args
common/body_models.py ADDED
@@ -0,0 +1,146 @@
+ import json
+
+ import numpy as np
+ import torch
+ from smplx import MANO
+
+ from common.mesh import Mesh
+
+
+ class MANODecimator:
+     def __init__(self):
+         data = np.load(
+             "./data/arctic_data/data/meta/mano_decimator_195.npy", allow_pickle=True
+         ).item()
+         mydata = {}
+         for key, val in data.items():
+             # only consider decimation matrix so far
+             if "D" in key:
+                 mydata[key] = torch.FloatTensor(val)
+         self.data = mydata
+
+     def downsample(self, verts, is_right):
+         dev = verts.device
+         flag = "right" if is_right else "left"
+         if self.data[f"D_{flag}"].device != dev:
+             self.data[f"D_{flag}"] = self.data[f"D_{flag}"].to(dev)
+         D = self.data[f"D_{flag}"]
+         batch_size = verts.shape[0]
+         D_batch = D[None, :, :].repeat(batch_size, 1, 1)
+         verts_sub = torch.bmm(D_batch, verts)
+         return verts_sub
+
+
+ MODEL_DIR = "./data/body_models/mano"
+
+ SEAL_FACES_R = [
+     [120, 108, 778],
+     [108, 79, 778],
+     [79, 78, 778],
+     [78, 121, 778],
+     [121, 214, 778],
+     [214, 215, 778],
+     [215, 279, 778],
+     [279, 239, 778],
+     [239, 234, 778],
+     [234, 92, 778],
+     [92, 38, 778],
+     [38, 122, 778],
+     [122, 118, 778],
+     [118, 117, 778],
+     [117, 119, 778],
+     [119, 120, 778],
+ ]
+
+ # vertex ids around the ring of the wrist
+ CIRCLE_V_ID = np.array(
+     [108, 79, 78, 121, 214, 215, 279, 239, 234, 92, 38, 122, 118, 117, 119, 120],
+     dtype=np.int64,
+ )
+
+
+ def seal_mano_mesh(v3d, faces, is_rhand):
+     # v3d: B, 778, 3
+     # faces: 1538, 3
+     # output: v3d(B, 779, 3); faces (1554, 3)
+
+     seal_faces = torch.LongTensor(np.array(SEAL_FACES_R)).to(faces.device)
+     if not is_rhand:
+         # left hand
+         seal_faces = seal_faces[:, np.array([1, 0, 2])]  # invert face normal
+     centers = v3d[:, CIRCLE_V_ID].mean(dim=1)[:, None, :]
+     sealed_vertices = torch.cat((v3d, centers), dim=1)
+     faces = torch.cat((faces, seal_faces), dim=0)
+     return sealed_vertices, faces
+
+
+ def build_layers(device=None):
+     from common.object_tensors import ObjectTensors
+
+     layers = {
+         "right": build_mano_aa(True),
+         "left": build_mano_aa(False),
+         "object_tensors": ObjectTensors(),
+     }
+
+     if device is not None:
+         layers["right"] = layers["right"].to(device)
+         layers["left"] = layers["left"].to(device)
+         layers["object_tensors"].to(device)
+     return layers
+
+
+ MANO_MODEL_DIR = "./data/body_models/mano"
+ SMPLX_MODEL_P = {
+     "male": "./data/body_models/smplx/SMPLX_MALE.npz",
+     "female": "./data/body_models/smplx/SMPLX_FEMALE.npz",
+     "neutral": "./data/body_models/smplx/SMPLX_NEUTRAL.npz",
+ }
+
+
+ def build_smplx(batch_size, gender, vtemplate):
+     import smplx
+
+     subj_m = smplx.create(
+         model_path=SMPLX_MODEL_P[gender],
+         model_type="smplx",
+         gender=gender,
+         num_pca_comps=45,
+         v_template=vtemplate,
+         flat_hand_mean=True,
+         use_pca=False,
+         batch_size=batch_size,
+         # batch_size=320,
+     )
+     return subj_m
+
+
+ def build_subject_smplx(batch_size, subject_id):
+     with open("./data/arctic_data/data/meta/misc.json", "r") as f:
+         misc = json.load(f)
+     vtemplate_p = f"./data/arctic_data/data/meta/subject_vtemplates/{subject_id}.obj"
+     mesh = Mesh(filename=vtemplate_p)
+     vtemplate = mesh.v
+     gender = misc[subject_id]["gender"]
+     return build_smplx(batch_size, gender, vtemplate)
+
+
+ def build_mano_aa(is_rhand, create_transl=False, flat_hand=False):
+     return MANO(
+         MODEL_DIR,
+         create_transl=create_transl,
+         use_pca=False,
+         flat_hand_mean=flat_hand,
+         is_rhand=is_rhand,
+     )
+
+ ##
+ def construct_layers(dev):
+     mano_layers = {
+         "right": build_mano_aa(True, create_transl=True, flat_hand=False),
+         "left": build_mano_aa(False, create_transl=True, flat_hand=False),
+         "smplx": build_smplx(1, "neutral", None),
+     }
+     for layer in mano_layers.values():
+         layer.to(dev)
+     return mano_layers
common/camera.py ADDED
@@ -0,0 +1,474 @@
+ import numpy as np
+ import torch
+
+ """
+ Useful geometric operations, e.g. Perspective projection and a differentiable Rodrigues formula
+ Parts of the code are taken from https://github.com/MandyMo/pytorch_HMR
+ """
+
+
+ def perspective_to_weak_perspective_torch(
+     perspective_camera,
+     focal_length,
+     img_res,
+ ):
+     # Convert Weak Perspective Camera [s, tx, ty] to camera translation [tx, ty, tz]
+     # in 3D given the bounding box size
+     # This camera translation can be used in a full perspective projection
+     # if isinstance(focal_length, torch.Tensor):
+     #     focal_length = focal_length[:, 0]
+
+     tx = perspective_camera[:, 0]
+     ty = perspective_camera[:, 1]
+     tz = perspective_camera[:, 2]
+
+     weak_perspective_camera = torch.stack(
+         [2 * focal_length / (img_res * tz + 1e-9), tx, ty],
+         dim=-1,
+     )
+     return weak_perspective_camera
+
+
+ def convert_perspective_to_weak_perspective(
+     perspective_camera,
+     focal_length,
+     img_res,
+ ):
+     # Convert Weak Perspective Camera [s, tx, ty] to camera translation [tx, ty, tz]
+     # in 3D given the bounding box size
+     # This camera translation can be used in a full perspective projection
+     # if isinstance(focal_length, torch.Tensor):
+     #     focal_length = focal_length[:, 0]
+
+     weak_perspective_camera = torch.stack(
+         [
+             2 * focal_length / (img_res * perspective_camera[:, 2] + 1e-9),
+             perspective_camera[:, 0],
+             perspective_camera[:, 1],
+         ],
+         dim=-1,
+     )
+     return weak_perspective_camera
+
+
+ def convert_weak_perspective_to_perspective(
+     weak_perspective_camera, focal_length, img_res
+ ):
+     # Convert Weak Perspective Camera [s, tx, ty] to camera translation [tx, ty, tz]
+     # in 3D given the bounding box size
+     # This camera translation can be used in a full perspective projection
+     # if isinstance(focal_length, torch.Tensor):
+     #     focal_length = focal_length[:, 0]
+
+     perspective_camera = torch.stack(
+         [
+             weak_perspective_camera[:, 1],
+             weak_perspective_camera[:, 2],
+             2 * focal_length / (img_res * weak_perspective_camera[:, 0] + 1e-9),
+         ],
+         dim=-1,
+     )
+     return perspective_camera
+
+
+ def get_default_cam_t(f, img_res):
+     cam = torch.tensor([[5.0, 0.0, 0.0]])
+     return convert_weak_perspective_to_perspective(cam, f, img_res)
+
+
+ def estimate_translation_np(S, joints_2d, joints_conf, focal_length, img_size):
+     """Find camera translation that brings 3D joints S closest to 2D the corresponding joints_2d.
+     Input:
+         S: (25, 3) 3D joint locations
+         joints: (25, 3) 2D joint locations and confidence
+     Returns:
+         (3,) camera translation vector
+     """
+     num_joints = S.shape[0]
+     # focal length
+
+     f = np.array([focal_length[0], focal_length[1]])
+     # optical center
+     center = np.array([img_size[1] / 2.0, img_size[0] / 2.0])
+
+     # transformations
+     Z = np.reshape(np.tile(S[:, 2], (2, 1)).T, -1)
+     XY = np.reshape(S[:, 0:2], -1)
+     O = np.tile(center, num_joints)
+     F = np.tile(f, num_joints)
+     weight2 = np.reshape(np.tile(np.sqrt(joints_conf), (2, 1)).T, -1)
+
+     # least squares
+     Q = np.array(
+         [
+             F * np.tile(np.array([1, 0]), num_joints),
+             F * np.tile(np.array([0, 1]), num_joints),
+             O - np.reshape(joints_2d, -1),
+         ]
+     ).T
+     c = (np.reshape(joints_2d, -1) - O) * Z - F * XY
+
+     # weighted least squares
+     W = np.diagflat(weight2)
+     Q = np.dot(W, Q)
+     c = np.dot(W, c)
+
+     # square matrix
+     A = np.dot(Q.T, Q)
+     b = np.dot(Q.T, c)
+
+     # solution
+     trans = np.linalg.solve(A, b)
+
+     return trans
+
+
+ def estimate_translation(
+     S,
+     joints_2d,
+     focal_length,
+     img_size,
+     use_all_joints=False,
+     rotation=None,
+     pad_2d=False,
+ ):
+     """Find camera translation that brings 3D joints S closest to 2D the corresponding joints_2d.
+     Input:
+         S: (B, 49, 3) 3D joint locations
+         joints: (B, 49, 3) 2D joint locations and confidence
+     Returns:
+         (B, 3) camera translation vectors
+     """
+     if pad_2d:
+         batch, num_pts = joints_2d.shape[:2]
+         joints_2d_pad = torch.ones((batch, num_pts, 3))
+         joints_2d_pad[:, :, :2] = joints_2d
+         joints_2d_pad = joints_2d_pad.to(joints_2d.device)
+         joints_2d = joints_2d_pad
+
+     device = S.device
+
+     if rotation is not None:
+         S = torch.einsum("bij,bkj->bki", rotation, S)
+
+     # Use only joints 25:49 (GT joints)
+     if use_all_joints:
+         S = S.cpu().numpy()
+         joints_2d = joints_2d.cpu().numpy()
+     else:
+         S = S[:, 25:, :].cpu().numpy()
+         joints_2d = joints_2d[:, 25:, :].cpu().numpy()
+
+     joints_conf = joints_2d[:, :, -1]
+     joints_2d = joints_2d[:, :, :-1]
+     trans = np.zeros((S.shape[0], 3), dtype=np.float32)
+     # Find the translation for each example in the batch
+     for i in range(S.shape[0]):
+         S_i = S[i]
+         joints_i = joints_2d[i]
+         conf_i = joints_conf[i]
+         trans[i] = estimate_translation_np(
+             S_i, joints_i, conf_i, focal_length=focal_length, img_size=img_size
+         )
+     return torch.from_numpy(trans).to(device)
+
+
+ def estimate_translation_cam(
+     S, joints_2d, focal_length, img_size, use_all_joints=False, rotation=None
+ ):
+     """Find camera translation that brings 3D joints S closest to 2D the corresponding joints_2d.
+     Input:
+         S: (B, 49, 3) 3D joint locations
+         joints: (B, 49, 3) 2D joint locations and confidence
+     Returns:
+         (B, 3) camera translation vectors
+     """
+
+     def estimate_translation_np(S, joints_2d, joints_conf, focal_length, img_size):
+         """Find camera translation that brings 3D joints S closest to 2D the corresponding joints_2d.
+         Input:
+             S: (25, 3) 3D joint locations
+             joints: (25, 3) 2D joint locations and confidence
+         Returns:
+             (3,) camera translation vector
+         """
+
+         num_joints = S.shape[0]
+         # focal length
+         f = np.array([focal_length[0], focal_length[1]])
+         # optical center
+         center = np.array([img_size[0] / 2.0, img_size[1] / 2.0])
+
+         # transformations
+         Z = np.reshape(np.tile(S[:, 2], (2, 1)).T, -1)
+         XY = np.reshape(S[:, 0:2], -1)
+         O = np.tile(center, num_joints)
+         F = np.tile(f, num_joints)
+         weight2 = np.reshape(np.tile(np.sqrt(joints_conf), (2, 1)).T, -1)
+
+         # least squares
+         Q = np.array(
+             [
+                 F * np.tile(np.array([1, 0]), num_joints),
+                 F * np.tile(np.array([0, 1]), num_joints),
+                 O - np.reshape(joints_2d, -1),
+             ]
+         ).T
+         c = (np.reshape(joints_2d, -1) - O) * Z - F * XY
+
+         # weighted least squares
+         W = np.diagflat(weight2)
+         Q = np.dot(W, Q)
+         c = np.dot(W, c)
+
+         # square matrix
+         A = np.dot(Q.T, Q)
+         b = np.dot(Q.T, c)
+
+         # solution
+         trans = np.linalg.solve(A, b)
+
+         return trans
+
+     device = S.device
+
+     if rotation is not None:
+         S = torch.einsum("bij,bkj->bki", rotation, S)
+
+     # Use only joints 25:49 (GT joints)
+     if use_all_joints:
+         S = S.cpu().numpy()
+         joints_2d = joints_2d.cpu().numpy()
+     else:
+         S = S[:, 25:, :].cpu().numpy()
+         joints_2d = joints_2d[:, 25:, :].cpu().numpy()
+
+     joints_conf = joints_2d[:, :, -1]
+     joints_2d = joints_2d[:, :, :-1]
+     trans = np.zeros((S.shape[0], 3), dtype=np.float32)
+     # Find the translation for each example in the batch
+     for i in range(S.shape[0]):
+         S_i = S[i]
+         joints_i = joints_2d[i]
+         conf_i = joints_conf[i]
+         trans[i] = estimate_translation_np(
+             S_i, joints_i, conf_i, focal_length=focal_length, img_size=img_size
+         )
+     return torch.from_numpy(trans).to(device)
+
+
+ def get_coord_maps(size=56):
+     xx_ones = torch.ones([1, size], dtype=torch.int32)
+     xx_ones = xx_ones.unsqueeze(-1)
+
+     xx_range = torch.arange(size, dtype=torch.int32).unsqueeze(0)
+     xx_range = xx_range.unsqueeze(1)
+
+     xx_channel = torch.matmul(xx_ones, xx_range)
+     xx_channel = xx_channel.unsqueeze(-1)
+
+     yy_ones = torch.ones([1, size], dtype=torch.int32)
+     yy_ones = yy_ones.unsqueeze(1)
+
+     yy_range = torch.arange(size, dtype=torch.int32).unsqueeze(0)
+     yy_range = yy_range.unsqueeze(-1)
+
+     yy_channel = torch.matmul(yy_range, yy_ones)
+     yy_channel = yy_channel.unsqueeze(-1)
+
+     xx_channel = xx_channel.permute(0, 3, 1, 2)
+     yy_channel = yy_channel.permute(0, 3, 1, 2)
+
+     xx_channel = xx_channel.float() / (size - 1)
+     yy_channel = yy_channel.float() / (size - 1)
+
+     xx_channel = xx_channel * 2 - 1
+     yy_channel = yy_channel * 2 - 1
+
+     out = torch.cat([xx_channel, yy_channel], dim=1)
+     return out
+
+
+ def look_at(eye, at=np.array([0, 0, 0]), up=np.array([0, 0, 1]), eps=1e-5):
+     at = at.astype(float).reshape(1, 3)
+     up = up.astype(float).reshape(1, 3)
+
+     eye = eye.reshape(-1, 3)
+     up = up.repeat(eye.shape[0] // up.shape[0], axis=0)
+     eps = np.array([eps]).reshape(1, 1).repeat(up.shape[0], axis=0)
+
+     z_axis = eye - at
+     z_axis /= np.max(np.stack([np.linalg.norm(z_axis, axis=1, keepdims=True), eps]))
+
+     x_axis = np.cross(up, z_axis)
+     x_axis /= np.max(np.stack([np.linalg.norm(x_axis, axis=1, keepdims=True), eps]))
+
+     y_axis = np.cross(z_axis, x_axis)
+     y_axis /= np.max(np.stack([np.linalg.norm(y_axis, axis=1, keepdims=True), eps]))
+
+     r_mat = np.concatenate(
+         (x_axis.reshape(-1, 3, 1), y_axis.reshape(-1, 3, 1), z_axis.reshape(-1, 3, 1)),
+         axis=2,
+     )
+
+     return r_mat
+
+
+ def to_sphere(u, v):
+     theta = 2 * np.pi * u
+     phi = np.arccos(1 - 2 * v)
+     cx = np.sin(phi) * np.cos(theta)
+     cy = np.sin(phi) * np.sin(theta)
+     cz = np.cos(phi)
+     s = np.stack([cx, cy, cz])
+     return s
+
+
+ def sample_on_sphere(range_u=(0, 1), range_v=(0, 1)):
+     u = np.random.uniform(*range_u)
+     v = np.random.uniform(*range_v)
+     return to_sphere(u, v)
+
+
+ def sample_pose_on_sphere(range_v=(0, 1), range_u=(0, 1), radius=1, up=[0, 1, 0]):
+     # sample location on unit sphere
+     loc = sample_on_sphere(range_u, range_v)
+
+     # sample radius if necessary
+     if isinstance(radius, tuple):
+         radius = np.random.uniform(*radius)
+
+     loc = loc * radius
+     R = look_at(loc, up=np.array(up))[0]
+
+     RT = np.concatenate([R, loc.reshape(3, 1)], axis=1)
+     RT = torch.Tensor(RT.astype(np.float32))
+     return RT
+
+
+ def rectify_pose(camera_r, body_aa, rotate_x=False):
+     body_r = batch_rodrigues(body_aa).reshape(-1, 3, 3)
+
+     if rotate_x:
+         rotate_x = torch.tensor([[[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]])
+         body_r = body_r @ rotate_x
+
+     final_r = camera_r @ body_r
+     body_aa = batch_rot2aa(final_r)
+     return body_aa
+
+
+ def estimate_translation_k_np(S, joints_2d, joints_conf, K):
+     """Find camera translation that brings 3D joints S closest to 2D the corresponding joints_2d.
+     Input:
+         S: (25, 3) 3D joint locations
+         joints: (25, 3) 2D joint locations and confidence
+     Returns:
+         (3,) camera translation vector
+     """
+     num_joints = S.shape[0]
+     # focal length
+
+     focal = np.array([K[0, 0], K[1, 1]])
+     # optical center
+     center = np.array([K[0, 2], K[1, 2]])
+
+     # transformations
+     Z = np.reshape(np.tile(S[:, 2], (2, 1)).T, -1)
+     XY = np.reshape(S[:, 0:2], -1)
+     O = np.tile(center, num_joints)
+     F = np.tile(focal, num_joints)
+     weight2 = np.reshape(np.tile(np.sqrt(joints_conf), (2, 1)).T, -1)
+
+     # least squares
+     Q = np.array(
+         [
+             F * np.tile(np.array([1, 0]), num_joints),
+             F * np.tile(np.array([0, 1]), num_joints),
+             O - np.reshape(joints_2d, -1),
+         ]
+     ).T
+     c = (np.reshape(joints_2d, -1) - O) * Z - F * XY
+
+     # weighted least squares
+     W = np.diagflat(weight2)
+     Q = np.dot(W, Q)
+     c = np.dot(W, c)
+
+     # square matrix
+     A = np.dot(Q.T, Q)
+     b = np.dot(Q.T, c)
+
+     # solution
+     trans = np.linalg.solve(A, b)
+
+     return trans
+
+
+ def estimate_translation_k(
+     S,
+     joints_2d,
+     K,
+     use_all_joints=False,
+     rotation=None,
+     pad_2d=False,
+ ):
+     """Find camera translation that brings 3D joints S closest to 2D the corresponding joints_2d.
+     Input:
+         S: (B, 49, 3) 3D joint locations
+         joints: (B, 49, 3) 2D joint locations and confidence
+     Returns:
+         (B, 3) camera translation vectors
+     """
+     if pad_2d:
+         batch, num_pts = joints_2d.shape[:2]
+         joints_2d_pad = torch.ones((batch, num_pts, 3))
+         joints_2d_pad[:, :, :2] = joints_2d
+         joints_2d_pad = joints_2d_pad.to(joints_2d.device)
+         joints_2d = joints_2d_pad
+
+     device = S.device
+
+     if rotation is not None:
+         S = torch.einsum("bij,bkj->bki", rotation, S)
+
+     # Use only joints 25:49 (GT joints)
+     if use_all_joints:
+         S = S.cpu().numpy()
+         joints_2d = joints_2d.cpu().numpy()
+     else:
+         S = S[:, 25:, :].cpu().numpy()
+         joints_2d = joints_2d[:, 25:, :].cpu().numpy()
+
+     joints_conf = joints_2d[:, :, -1]
+     joints_2d = joints_2d[:, :, :-1]
+     trans = np.zeros((S.shape[0], 3), dtype=np.float32)
+     # Find the translation for each example in the batch
+     for i in range(S.shape[0]):
+         S_i = S[i]
+         joints_i = joints_2d[i]
+         conf_i = joints_conf[i]
+         K_i = K[i]
+         trans[i] = estimate_translation_k_np(S_i, joints_i, conf_i, K_i)
+     return torch.from_numpy(trans).to(device)
+
+
+ def weak_perspective_to_perspective_torch(
+     weak_perspective_camera, focal_length, img_res, min_s
+ ):
+     # Convert Weak Perspective Camera [s, tx, ty] to camera translation [tx, ty, tz]
+     # in 3D given the bounding box size
+     # This camera translation can be used in a full perspective projection
+     s = weak_perspective_camera[:, 0]
+     s = torch.clamp(s, min_s)
+     tx = weak_perspective_camera[:, 1]
+     ty = weak_perspective_camera[:, 2]
+     perspective_camera = torch.stack(
+         [
+             tx,
+             ty,
+             2 * focal_length / (img_res * s + 1e-9),
+         ],
+         dim=-1,
+     )
+     return perspective_camera
common/comet_utils.py ADDED
@@ -0,0 +1,158 @@
+ import json
+ import os
+ import os.path as op
+ import time
+
+ import comet_ml
+ import numpy as np
+ import torch
+ from loguru import logger
+ from tqdm import tqdm
+
+ from src.datasets.dataset_utils import copy_repo_arctic
+
+ # folder used for debugging
+ DUMMY_EXP = "xxxxxxxxx"
+
+
+ def add_paths(args):
+     exp_key = args.exp_key
+     args_p = f"./logs/{exp_key}/args.json"
+     ckpt_p = f"./logs/{exp_key}/checkpoints/last.ckpt"
+     if not op.exists(ckpt_p) or DUMMY_EXP in ckpt_p:
+         ckpt_p = ""
+     if args.resume_ckpt != "":
+         ckpt_p = args.resume_ckpt
+     args.ckpt_p = ckpt_p
+     args.log_dir = f"./logs/{exp_key}"
+
+     if args.infer_ckpt != "":
+         basedir = "/".join(args.infer_ckpt.split("/")[:2])
+         basename = op.basename(args.infer_ckpt).replace(".ckpt", ".params.pt")
+         args.interface_p = op.join(basedir, basename)
+     args.args_p = args_p
+     if args.cluster:
+         args.run_p = op.join(args.log_dir, "condor", "run.sh")
+         args.submit_p = op.join(args.log_dir, "condor", "submit.sub")
+         args.repo_p = op.join(args.log_dir, "repo")
+
+     return args
+
+
+ def save_args(args, save_keys):
+     args_save = {}
+     for key, val in args.items():
+         if key in save_keys:
+             args_save[key] = val
+     with open(args.args_p, "w") as f:
+         json.dump(args_save, f, indent=4)
+     logger.info(f"Saved args at {args.args_p}")
+
+
+ def create_files(args):
+     os.makedirs(args.log_dir, exist_ok=True)
+     if args.cluster:
+         os.makedirs(op.dirname(args.run_p), exist_ok=True)
+         copy_repo_arctic(args.exp_key)
+
+
+ def log_exp_meta(args):
+     tags = [args.method]
+     logger.info(f"Experiment tags: {tags}")
+     args.experiment.set_name(args.exp_key)
+     args.experiment.add_tags(tags)
+     args.experiment.log_parameters(args)
+
+
+ def init_experiment(args):
+     if args.resume_ckpt != "":
+         args.exp_key = args.resume_ckpt.split("/")[1]
+     if args.fast_dev_run:
+         args.exp_key = DUMMY_EXP
+     if args.exp_key == "":
+         args.exp_key = generate_exp_key()
+     args = add_paths(args)
+     if op.exists(args.args_p) and args.exp_key not in [DUMMY_EXP]:
+         with open(args.args_p, "r") as f:
+             args_disk = json.load(f)
+             if "comet_key" in args_disk.keys():
+                 args.comet_key = args_disk["comet_key"]
+
+     create_files(args)
+
+     project_name = args.project
+     disabled = args.mute
+     comet_url = args["comet_key"] if "comet_key" in args.keys() else None
+
+     api_key = os.environ["COMET_API_KEY"]
+     workspace = os.environ["COMET_WORKSPACE"]
+     if not args.cluster:
+         if comet_url is None:
+             experiment = comet_ml.Experiment(
+                 api_key=api_key,
+                 workspace=workspace,
+                 project_name=project_name,
+                 disabled=disabled,
+                 display_summary_level=0,
+             )
+             args.comet_key = experiment.get_key()
+         else:
+             experiment = comet_ml.ExistingExperiment(
+                 previous_experiment=comet_url,
+                 api_key=api_key,
+                 project_name=project_name,
+                 workspace=workspace,
+                 disabled=disabled,
+                 display_summary_level=0,
+             )
+
+         device = "cuda" if torch.cuda.is_available() else "cpu"
+         logger.add(
+             os.path.join(args.log_dir, "train.log"),
+             level="INFO",
+             colorize=True,
+         )
+         logger.info(torch.cuda.get_device_properties(device))
+         args.gpu = torch.cuda.get_device_properties(device).name
+     else:
+         experiment = None
+     args.experiment = experiment
+     return experiment, args
+
+
+ def log_dict(experiment, metric_dict, step, postfix=None):
+     if experiment is None:
+         return
+     for key, value in metric_dict.items():
+         if postfix is not None:
+             key = key + postfix
+         if isinstance(value, torch.Tensor) and len(value.view(-1)) == 1:
+             value = value.item()
+
+         if isinstance(value, (int, float, np.float32)):
+             experiment.log_metric(key, value, step=step)
+
+
+ def generate_exp_key():
+     import random
+
+     hash = random.getrandbits(128)
+     key = "%032x" % (hash)
+     key = key[:9]
+     return key
+
+
+ def push_images(experiment, all_im_list, global_step=None, no_tqdm=False, verbose=True):
+     if verbose:
+         print("Pushing PIL images")
+         tic = time.time()
+     iterator = all_im_list if no_tqdm else tqdm(all_im_list)
+     for im in iterator:
+         im_np = np.array(im["im"])
+         if "fig_name" in im.keys():
+             experiment.log_image(im_np, im["fig_name"], step=global_step)
+         else:
+             experiment.log_image(im_np, "unnamed", step=global_step)
+     if verbose:
+         toc = time.time()
+         print("Done pushing PIL images (%.1fs)" % (toc - tic))
common/data_utils.py ADDED
@@ -0,0 +1,371 @@
1
+ """
2
+ This file contains functions that are used to perform data augmentation.
3
+ """
4
+ import cv2
5
+ import numpy as np
6
+ import torch
7
+ from loguru import logger
8
+
9
+
10
+ def get_transform(center, scale, res, rot=0):
11
+ """Generate transformation matrix."""
12
+ h = 200 * scale
13
+ t = np.zeros((3, 3))
14
+ t[0, 0] = float(res[1]) / h
15
+ t[1, 1] = float(res[0]) / h
16
+ t[0, 2] = res[1] * (-float(center[0]) / h + 0.5)
17
+ t[1, 2] = res[0] * (-float(center[1]) / h + 0.5)
18
+ t[2, 2] = 1
19
+ if not rot == 0:
20
+ rot = -rot # To match direction of rotation from cropping
21
+ rot_mat = np.zeros((3, 3))
22
+ rot_rad = rot * np.pi / 180
23
+ sn, cs = np.sin(rot_rad), np.cos(rot_rad)
24
+ rot_mat[0, :2] = [cs, -sn]
25
+ rot_mat[1, :2] = [sn, cs]
26
+ rot_mat[2, 2] = 1
27
+ # Need to rotate around center
28
+ t_mat = np.eye(3)
29
+ t_mat[0, 2] = -res[1] / 2
30
+ t_mat[1, 2] = -res[0] / 2
31
+ t_inv = t_mat.copy()
32
+ t_inv[:2, 2] *= -1
33
+ t = np.dot(t_inv, np.dot(rot_mat, np.dot(t_mat, t)))
34
+ return t
35
+
36
+
37
+ def transform(pt, center, scale, res, invert=0, rot=0):
38
+ """Transform pixel location to different reference."""
39
+ t = get_transform(center, scale, res, rot=rot)
40
+ if invert:
41
+ t = np.linalg.inv(t)
42
+ new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.0]).T
43
+ new_pt = np.dot(t, new_pt)
44
+ return new_pt[:2].astype(int) + 1
45
+
46
+
47
+ def rotate_2d(pt_2d, rot_rad):
48
+ x = pt_2d[0]
49
+ y = pt_2d[1]
50
+ sn, cs = np.sin(rot_rad), np.cos(rot_rad)
51
+ xx = x * cs - y * sn
52
+ yy = x * sn + y * cs
53
+ return np.array([xx, yy], dtype=np.float32)
54
+
55
+
56
+ def gen_trans_from_patch_cv(
57
+ c_x, c_y, src_width, src_height, dst_width, dst_height, scale, rot, inv=False
58
+ ):
59
+ # augment size with scale
60
+ src_w = src_width * scale
61
+ src_h = src_height * scale
62
+ src_center = np.array([c_x, c_y], dtype=np.float32)
63
+
64
+ # augment rotation
65
+ rot_rad = np.pi * rot / 180
66
+ src_downdir = rotate_2d(np.array([0, src_h * 0.5], dtype=np.float32), rot_rad)
67
+ src_rightdir = rotate_2d(np.array([src_w * 0.5, 0], dtype=np.float32), rot_rad)
68
+
69
+ dst_w = dst_width
70
+ dst_h = dst_height
71
+ dst_center = np.array([dst_w * 0.5, dst_h * 0.5], dtype=np.float32)
72
+ dst_downdir = np.array([0, dst_h * 0.5], dtype=np.float32)
73
+ dst_rightdir = np.array([dst_w * 0.5, 0], dtype=np.float32)
74
+
75
+ src = np.zeros((3, 2), dtype=np.float32)
76
+ src[0, :] = src_center
77
+ src[1, :] = src_center + src_downdir
78
+ src[2, :] = src_center + src_rightdir
79
+
80
+ dst = np.zeros((3, 2), dtype=np.float32)
81
+ dst[0, :] = dst_center
82
+ dst[1, :] = dst_center + dst_downdir
83
+ dst[2, :] = dst_center + dst_rightdir
84
+
85
+ if inv:
86
+ trans = cv2.getAffineTransform(np.float32(dst), np.float32(src))
87
+ else:
88
+ trans = cv2.getAffineTransform(np.float32(src), np.float32(dst))
89
+
90
+ trans = trans.astype(np.float32)
91
+ return trans
92
+
93
+
94
+ def generate_patch_image(
95
+ cvimg,
96
+ bbox,
97
+ scale,
98
+ rot,
99
+ out_shape,
100
+ interpl_strategy,
101
+ gauss_kernel=5,
102
+ gauss_sigma=8.0,
103
+ ):
104
+ img = cvimg.copy()
105
+
106
+ bb_c_x = float(bbox[0])
107
+ bb_c_y = float(bbox[1])
108
+ bb_width = float(bbox[2])
109
+ bb_height = float(bbox[3])
110
+
111
+ trans = gen_trans_from_patch_cv(
112
+ bb_c_x, bb_c_y, bb_width, bb_height, out_shape[1], out_shape[0], scale, rot
113
+ )
114
+
115
+ # anti-aliasing
116
+ blur = cv2.GaussianBlur(img, (gauss_kernel, gauss_kernel), gauss_sigma)
117
+ img_patch = cv2.warpAffine(
118
+ blur, trans, (int(out_shape[1]), int(out_shape[0])), flags=interpl_strategy
119
+ )
120
+ img_patch = img_patch.astype(np.float32)
121
+ inv_trans = gen_trans_from_patch_cv(
122
+ bb_c_x,
123
+ bb_c_y,
124
+ bb_width,
125
+ bb_height,
126
+ out_shape[1],
127
+ out_shape[0],
128
+ scale,
129
+ rot,
130
+ inv=True,
131
+ )
132
+
133
+ return img_patch, trans, inv_trans
134
+
135
+
136
+ def augm_params(is_train, flip_prob, noise_factor, rot_factor, scale_factor):
137
+ """Get augmentation parameters."""
138
+ flip = 0 # flipping
139
+ pn = np.ones(3) # per channel pixel-noise
140
+ rot = 0 # rotation
141
+ sc = 1 # scaling
142
+ if is_train:
143
+ # Flip with probability flip_prob (flipping is currently unsupported)
144
+ if np.random.uniform() <= flip_prob:
145
+ flip = 1
146
+ assert False, "Flipping not supported"
147
+
148
+ # Each channel is multiplied with a number
149
+ # in the range [1 - noise_factor, 1 + noise_factor]
150
+ pn = np.random.uniform(1 - noise_factor, 1 + noise_factor, 3)
151
+
152
+ # The rotation is a number in the range [-2 * rot_factor, 2 * rot_factor]
153
+ rot = min(
154
+ 2 * rot_factor,
155
+ max(
156
+ -2 * rot_factor,
157
+ np.random.randn() * rot_factor,
158
+ ),
159
+ )
160
+
161
+ # The scale is multiplied with a number
162
+ # in the range [1 - scale_factor, 1 + scale_factor]
163
+ sc = min(
164
+ 1 + scale_factor,
165
+ max(
166
+ 1 - scale_factor,
167
+ np.random.randn() * scale_factor + 1,
168
+ ),
169
+ )
170
+ # but it is zero with probability 3/5
171
+ if np.random.uniform() <= 0.6:
172
+ rot = 0
173
+
174
+ augm_dict = {}
175
+ augm_dict["flip"] = flip
176
+ augm_dict["pn"] = pn
177
+ augm_dict["rot"] = rot
178
+ augm_dict["sc"] = sc
179
+ return augm_dict
180
+
181
+
182
+ def rgb_processing(is_train, rgb_img, center, bbox_dim, augm_dict, img_res):
183
+ rot = augm_dict["rot"]
184
+ sc = augm_dict["sc"]
185
+ pn = augm_dict["pn"]
186
+ scale = sc * bbox_dim
187
+
188
+ crop_dim = int(scale * 200)
189
+ # faster cropping!!
190
+ rgb_img = generate_patch_image(
191
+ rgb_img,
192
+ [center[0], center[1], crop_dim, crop_dim],
193
+ 1.0,
194
+ rot,
195
+ [img_res, img_res],
196
+ cv2.INTER_CUBIC,
197
+ )[0]
198
+
199
+ # in the rgb image we add pixel noise in a channel-wise manner
200
+ rgb_img[:, :, 0] = np.minimum(255.0, np.maximum(0.0, rgb_img[:, :, 0] * pn[0]))
201
+ rgb_img[:, :, 1] = np.minimum(255.0, np.maximum(0.0, rgb_img[:, :, 1] * pn[1]))
202
+ rgb_img[:, :, 2] = np.minimum(255.0, np.maximum(0.0, rgb_img[:, :, 2] * pn[2]))
203
+ rgb_img = np.transpose(rgb_img.astype("float32"), (2, 0, 1)) / 255.0
204
+ return rgb_img
205
+
206
+
207
+ def transform_kp2d(kp2d, bbox):
208
+ # bbox: (cx, cy, scale) in the original image space
209
+ # scale is normalized
210
+ assert isinstance(kp2d, np.ndarray)
211
+ assert len(kp2d.shape) == 2
212
+ cx, cy, scale = bbox
213
+ s = 200 * scale # to px
214
+ cap_dim = 1000 # px
215
+ factor = cap_dim / (1.5 * s)
216
+ kp2d_cropped = np.copy(kp2d)
217
+ kp2d_cropped[:, 0] -= cx - 1.5 / 2 * s
218
+ kp2d_cropped[:, 1] -= cy - 1.5 / 2 * s
219
+ kp2d_cropped[:, 0] *= factor
220
+ kp2d_cropped[:, 1] *= factor
221
+ return kp2d_cropped
222
+
223
+
224
+ def j2d_processing(kp, center, bbox_dim, augm_dict, img_res):
225
+ """Process gt 2D keypoints and apply all augmentation transforms."""
226
+ scale = augm_dict["sc"] * bbox_dim
227
+ rot = augm_dict["rot"]
228
+
229
+ nparts = kp.shape[0]
230
+ for i in range(nparts):
231
+ kp[i, 0:2] = transform(
232
+ kp[i, 0:2] + 1,
233
+ center,
234
+ scale,
235
+ [img_res, img_res],
236
+ rot=rot,
237
+ )
238
+ # convert to normalized coordinates
239
+ kp = normalize_kp2d_np(kp, img_res)
240
+ kp = kp.astype("float32")
241
+ return kp
242
+
243
+
244
+ def pose_processing(pose, augm_dict):
245
+ """Process SMPL theta parameters and apply all augmentation transforms."""
246
+ rot = augm_dict["rot"]
247
+ # rotation or the pose parameters
248
+ pose[:3] = rot_aa(pose[:3], rot)
249
+ # flip the pose parameters
250
+ # (72),float
251
+ pose = pose.astype("float32")
252
+ return pose
253
+
254
+
255
+ def rot_aa(aa, rot):
256
+ """Rotate axis angle parameters."""
257
+ # pose parameters
258
+ R = np.array(
259
+ [
260
+ [np.cos(np.deg2rad(-rot)), -np.sin(np.deg2rad(-rot)), 0],
261
+ [np.sin(np.deg2rad(-rot)), np.cos(np.deg2rad(-rot)), 0],
262
+ [0, 0, 1],
263
+ ]
264
+ )
265
+ # find the rotation of the body in camera frame
266
+ per_rdg, _ = cv2.Rodrigues(aa)
267
+ # apply the global rotation to the global orientation
268
+ resrot, _ = cv2.Rodrigues(np.dot(R, per_rdg))
269
+ aa = (resrot.T)[0]
270
+ return aa
271
+
272
+
273
+ def denormalize_images(images):
274
+ images = images * torch.tensor([0.229, 0.224, 0.225], device=images.device).reshape(
275
+ 1, 3, 1, 1
276
+ )
277
+ images = images + torch.tensor([0.485, 0.456, 0.406], device=images.device).reshape(
278
+ 1, 3, 1, 1
279
+ )
280
+ return images
281
+
282
+
283
+ def read_img(img_fn, dummy_shape):
284
+ try:
285
+ cv_img = _read_img(img_fn)
286
+ except Exception:
287
+ logger.warning(f"Unable to load {img_fn}")
288
+ cv_img = np.zeros(dummy_shape, dtype=np.float32)
289
+ return cv_img, False
290
+ return cv_img, True
291
+
292
+
293
+ def _read_img(img_fn):
294
+ img = cv2.cvtColor(cv2.imread(img_fn), cv2.COLOR_BGR2RGB)
295
+ return img.astype(np.float32)
296
+
297
+
298
+ def normalize_kp2d_np(kp2d: np.ndarray, img_res):
299
+ assert kp2d.shape[1] == 3
300
+ kp2d_normalized = kp2d.copy()
301
+ kp2d_normalized[:, :2] = 2.0 * kp2d[:, :2] / img_res - 1.0
302
+ return kp2d_normalized
303
+
304
+
305
+ def unnormalize_2d_kp(kp_2d_np: np.ndarray, res):
306
+ assert kp_2d_np.shape[1] == 3
307
+ kp_2d = np.copy(kp_2d_np)
308
+ kp_2d[:, :2] = 0.5 * res * (kp_2d[:, :2] + 1)
309
+ return kp_2d
310
+
311
+
312
+ def normalize_kp2d(kp2d: torch.Tensor, img_res):
313
+ assert len(kp2d.shape) == 3
314
+ kp2d_normalized = kp2d.clone()
315
+ kp2d_normalized[:, :, :2] = 2.0 * kp2d[:, :, :2] / img_res - 1.0
316
+ return kp2d_normalized
317
+
318
+
319
+ def unormalize_kp2d(kp2d_normalized: torch.Tensor, img_res):
320
+ assert len(kp2d_normalized.shape) == 3
321
+ assert kp2d_normalized.shape[2] == 2
322
+ kp2d = kp2d_normalized.clone()
323
+ kp2d = 0.5 * img_res * (kp2d + 1)
324
+ return kp2d
325
+
326
+
327
+ def get_wp_intrix(fixed_focal: float, img_res):
328
+ # construct weak-perspective intrinsics on the patch
329
+ camera_center = np.array([img_res // 2, img_res // 2])
330
+ intrx = torch.zeros([3, 3])
331
+ intrx[0, 0] = fixed_focal
332
+ intrx[1, 1] = fixed_focal
333
+ intrx[2, 2] = 1.0
334
+ intrx[0, -1] = camera_center[0]
335
+ intrx[1, -1] = camera_center[1]
336
+ return intrx
337
+
338
+
339
+ def get_aug_intrix(
340
+ intrx, fixed_focal: float, img_res, use_gt_k, bbox_cx, bbox_cy, scale
341
+ ):
342
+ """
343
+ This function returns camera intrinsics under scaling.
344
+ If use_gt_k, the GT K is used, but scaled based on the amount of scaling in the patch.
345
+ Else, we construct an intrinsic camera with a fixed focal length and fixed camera center.
346
+ """
347
+
348
+ if not use_gt_k:
349
+ # construct weak-perspective intrinsics on the patch
350
+ intrx = get_wp_intrix(fixed_focal, img_res)
351
+ else:
352
+ # update the GT intrinsics (full image space)
353
+ # such that it matches the scale of the patch
354
+
355
+ dim = scale * 200.0 # bbox size
356
+ k_scale = float(img_res) / dim # resized_dim / bbox_size in full image space
357
+ """
358
+ Intrinsics after data augmentation,
359
+ where (x1, y1) is the top-left corner of the bbox:
360
+ fx' = k*fx
361
+ fy' = k*fy
362
+ cx' = k*(cx - x1)
363
+ cy' = k*(cy - y1)
364
+ """
365
+ intrx[0, 0] *= k_scale # k*fx
366
+ intrx[1, 1] *= k_scale # k*fy
367
+ intrx[0, 2] -= bbox_cx - dim / 2.0
368
+ intrx[1, 2] -= bbox_cy - dim / 2.0
369
+ intrx[0, 2] *= k_scale
370
+ intrx[1, 2] *= k_scale
371
+ return intrx
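A minimal sketch of the cropping and keypoint-normalization pipeline above (illustrative values, assuming common.data_utils is importable from the repo root):

```python
import cv2
import numpy as np

from common.data_utils import (
    augm_params,
    generate_patch_image,
    normalize_kp2d_np,
    unnormalize_2d_kp,
)

img = (np.random.rand(480, 640, 3) * 255).astype(np.float32)  # dummy RGB frame
bbox = [320.0, 240.0, 200.0, 200.0]  # (cx, cy, w, h) in pixels

# is_train=False returns identity augmentation (no noise, rotation or scaling);
# with is_train=True the pn/rot/sc entries are sampled as described above.
augm = augm_params(
    is_train=False, flip_prob=0.0, noise_factor=0.4, rot_factor=30.0, scale_factor=0.25
)

patch, trans, inv_trans = generate_patch_image(
    img, bbox, augm["sc"], augm["rot"], (224, 224), cv2.INTER_CUBIC
)
print(patch.shape)  # (224, 224, 3)

# Keypoints are (x, y, confidence); x and y are mapped to [-1, 1] and back.
kp2d = np.array([[112.0, 112.0, 1.0], [10.0, 200.0, 1.0]])
kp_norm = normalize_kp2d_np(kp2d, img_res=224)
kp_back = unnormalize_2d_kp(kp_norm, res=224)
print(np.allclose(kp2d[:, :2], kp_back[:, :2]))  # True
```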
common/ld_utils.py ADDED
@@ -0,0 +1,116 @@
1
+ import itertools
2
+
3
+ import numpy as np
4
+ import torch
5
+
6
+
7
+ def sort_dict(disordered):
8
+ sorted_dict = {k: disordered[k] for k in sorted(disordered)}
9
+ return sorted_dict
10
+
11
+
12
+ def prefix_dict(mydict, prefix):
13
+ out = {prefix + k: v for k, v in mydict.items()}
14
+ return out
15
+
16
+
17
+ def postfix_dict(mydict, postfix):
18
+ out = {k + postfix: v for k, v in mydict.items()}
19
+ return out
20
+
21
+
22
+ def unsort(L, sort_idx):
23
+ assert isinstance(sort_idx, list)
24
+ assert isinstance(L, list)
25
+ LL = zip(sort_idx, L)
26
+ LL = sorted(LL, key=lambda x: x[0])
27
+ _, L = zip(*LL)
28
+ return list(L)
29
+
30
+
31
+ def cat_dl(out_list, dim, verbose=True, squeeze=True):
32
+ out = {}
33
+ for key, val in out_list.items():
34
+ if isinstance(val[0], torch.Tensor):
35
+ out[key] = torch.cat(val, dim=dim)
36
+ if squeeze:
37
+ out[key] = out[key].squeeze()
38
+ elif isinstance(val[0], np.ndarray):
39
+ out[key] = np.concatenate(val, axis=dim)
40
+ if squeeze:
41
+ out[key] = np.squeeze(out[key])
42
+ elif isinstance(val[0], list):
43
+ out[key] = sum(val, [])
44
+ else:
45
+ if verbose:
46
+ print(f"Ignoring {key} undefined type {type(val[0])}")
47
+ return out
48
+
49
+
50
+ def stack_dl(out_list, dim, verbose=True, squeeze=True):
51
+ out = {}
52
+ for key, val in out_list.items():
53
+ if isinstance(val[0], torch.Tensor):
54
+ out[key] = torch.stack(val, dim=dim)
55
+ if squeeze:
56
+ out[key] = out[key].squeeze()
57
+ elif isinstance(val[0], np.ndarray):
58
+ out[key] = np.stack(val, axis=dim)
59
+ if squeeze:
60
+ out[key] = np.squeeze(out[key])
61
+ elif isinstance(val[0], list):
62
+ out[key] = sum(val, [])
63
+ else:
64
+ out[key] = val
65
+ if verbose:
66
+ print(f"Processing {key} undefined type {type(val[0])}")
67
+ return out
68
+
69
+
70
+ def add_prefix_postfix(mydict, prefix="", postfix=""):
71
+ assert isinstance(mydict, dict)
72
+ return dict((prefix + key + postfix, value) for (key, value) in mydict.items())
73
+
74
+
75
+ def ld2dl(LD):
76
+ assert isinstance(LD, list)
77
+ assert isinstance(LD[0], dict)
78
+ """
79
+ A list of dict (same keys) to a dict of lists
80
+ """
81
+ dict_list = {k: [dic[k] for dic in LD] for k in LD[0]}
82
+ return dict_list
83
+
84
+
85
+ class NameSpace(object):
86
+ def __init__(self, adict):
87
+ self.__dict__.update(adict)
88
+
89
+
90
+ def dict2ns(mydict):
91
+ """
92
+ Convert dict objec to namespace
93
+ """
94
+ return NameSpace(mydict)
95
+
96
+
97
+ def ld2dev(ld, dev):
98
+ """
99
+ Convert tensors in a list or dict to a device recursively
100
+ """
101
+ if isinstance(ld, torch.Tensor):
102
+ return ld.to(dev)
103
+ if isinstance(ld, dict):
104
+ for k, v in ld.items():
105
+ ld[k] = ld2dev(v, dev)
106
+ return ld
107
+ if isinstance(ld, list):
108
+ return [ld2dev(x, dev) for x in ld]
109
+ return ld
110
+
111
+
112
+ def all_comb_dict(hyper_dict):
113
+ assert isinstance(hyper_dict, dict)
114
+ keys, values = zip(*hyper_dict.items())
115
+ permute_dicts = [dict(zip(keys, v)) for v in itertools.product(*values)]
116
+ return permute_dicts
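A short sketch of the list-of-dicts helpers above (assumed usage with dummy tensors):

```python
import torch

from common.ld_utils import all_comb_dict, ld2dl, prefix_dict, stack_dl

batches = [
    {"loss": torch.tensor(0.5), "names": ["a"]},
    {"loss": torch.tensor(0.3), "names": ["b"]},
]
dl = ld2dl(batches)        # {"loss": [tensor, tensor], "names": [["a"], ["b"]]}
out = stack_dl(dl, dim=0)  # tensors stacked, nested lists concatenated
print(out["loss"].shape, out["names"])  # torch.Size([2]) ['a', 'b']

print(prefix_dict({"mpjpe": 1.0}, "val/"))  # {'val/mpjpe': 1.0}

# every hyper-parameter combination as its own dict
grid = all_comb_dict({"lr": [1e-4, 1e-5], "batch_size": [32, 64]})
print(len(grid))  # 4
```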
common/list_utils.py ADDED
@@ -0,0 +1,52 @@
1
+ import math
2
+
3
+
4
+ def chunks_by_len(L, n):
5
+ """
6
+ Split a list into at most n chunks of near-equal size.
7
+ """
8
+ num_chunks = int(math.ceil(float(len(L)) / n))
9
+ splits = [L[x : x + num_chunks] for x in range(0, len(L), num_chunks)]
10
+ return splits
11
+
12
+
13
+ def chunks_by_size(L, n):
14
+ """Yield successive n-sized chunks from lst."""
15
+ seqs = []
16
+ for i in range(0, len(L), n):
17
+ seqs.append(L[i : i + n])
18
+ return seqs
19
+
20
+
21
+ def unsort(L, sort_idx):
22
+ assert isinstance(sort_idx, list)
23
+ assert isinstance(L, list)
24
+ LL = zip(sort_idx, L)
25
+ LL = sorted(LL, key=lambda x: x[0])
26
+ _, L = zip(*LL)
27
+ return list(L)
28
+
29
+
30
+ def add_prefix_postfix(mydict, prefix="", postfix=""):
31
+ assert isinstance(mydict, dict)
32
+ return dict((prefix + key + postfix, value) for (key, value) in mydict.items())
33
+
34
+
35
+ def ld2dl(LD):
36
+ assert isinstance(LD, list)
37
+ assert isinstance(LD[0], dict)
38
+ """
39
+ A list of dict (same keys) to a dict of lists
40
+ """
41
+ dict_list = {k: [dic[k] for dic in LD] for k in LD[0]}
42
+ return dict_list
43
+
44
+
45
+ def chunks(lst, n):
46
+ """Yield successive n-sized chunks from lst."""
47
+ seqs = []
48
+ for i in range(0, len(lst), n):
49
+ seqs.append(lst[i : i + n])
50
+ seqs_chunked = sum(seqs, [])
51
+ assert set(seqs_chunked) == set(lst)
52
+ return seqs
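A short sketch of the chunking helpers above (assumed usage):

```python
from common.list_utils import chunks_by_len, chunks_by_size, unsort

L = list(range(10))
print(chunks_by_len(L, 3))   # at most 3 chunks: [0..3], [4..7], [8, 9]
print(chunks_by_size(L, 3))  # chunks of size 3: [0,1,2], [3,4,5], [6,7,8], [9]

# unsort places the i-th element back at position sort_idx[i]
sort_idx = [2, 0, 1]
sorted_vals = ["c", "a", "b"]
print(unsort(sorted_vals, sort_idx))  # ['a', 'b', 'c']
```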
common/mesh.py ADDED
@@ -0,0 +1,94 @@
1
+ import numpy as np
2
+ import trimesh
3
+
4
+ colors = {
5
+ "pink": [1.00, 0.75, 0.80],
6
+ "purple": [0.63, 0.13, 0.94],
7
+ "red": [1.0, 0.0, 0.0],
8
+ "green": [0.0, 1.0, 0.0],
9
+ "yellow": [1.0, 1.0, 0],
10
+ "brown": [1.00, 0.25, 0.25],
11
+ "blue": [0.0, 0.0, 1.0],
12
+ "white": [1.0, 1.0, 1.0],
13
+ "orange": [1.00, 0.65, 0.00],
14
+ "grey": [0.75, 0.75, 0.75],
15
+ "black": [0.0, 0.0, 0.0],
16
+ }
17
+
18
+
19
+ class Mesh(trimesh.Trimesh):
20
+ def __init__(
21
+ self,
22
+ filename=None,
23
+ v=None,
24
+ f=None,
25
+ vc=None,
26
+ fc=None,
27
+ process=False,
28
+ visual=None,
29
+ **kwargs
30
+ ):
31
+ if filename is not None:
32
+ mesh = trimesh.load(filename, process=process)
33
+ v = mesh.vertices
34
+ f = mesh.faces
35
+ visual = mesh.visual
36
+
37
+ super(Mesh, self).__init__(
38
+ vertices=v, faces=f, visual=visual, process=process, **kwargs
39
+ )
40
+
41
+ self.v = self.vertices
42
+ self.f = self.faces
43
+ assert self.v is self.vertices
44
+ assert self.f is self.faces
45
+
46
+ if vc is not None:
47
+ self.set_vc(vc)
48
+ self.vc = self.visual.vertex_colors
49
+ assert self.vc is self.visual.vertex_colors
50
+ if fc is not None:
51
+ self.set_fc(fc)
52
+ self.fc = self.visual.face_colors
53
+ assert self.fc is self.visual.face_colors
54
+
55
+ def rot_verts(self, vertices, rxyz):
56
+ return np.array(vertices * rxyz.T)
57
+
58
+ def colors_like(self, color, array, ids):
59
+ color = np.array(color)
60
+
61
+ if color.max() <= 1.0:
62
+ color = color * 255
63
+ color = color.astype(np.int8)
64
+
65
+ n_color = color.shape[0]
66
+ n_ids = ids.shape[0]
67
+
68
+ new_color = np.array(array)
69
+ if n_color <= 4:
70
+ new_color[ids, :n_color] = np.repeat(color[np.newaxis], n_ids, axis=0)
71
+ else:
72
+ new_color[ids, :] = color
73
+
74
+ return new_color
75
+
76
+ def set_vc(self, vc, vertex_ids=None):
77
+ all_ids = np.arange(self.vertices.shape[0])
78
+ if vertex_ids is None:
79
+ vertex_ids = all_ids
80
+
81
+ vertex_ids = all_ids[vertex_ids]
82
+ new_vc = self.colors_like(vc, self.visual.vertex_colors, vertex_ids)
83
+ self.visual.vertex_colors[:] = new_vc
84
+
85
+ def set_fc(self, fc, face_ids=None):
86
+ if face_ids is None:
87
+ face_ids = np.arange(self.faces.shape[0])
88
+
89
+ new_fc = self.colors_like(fc, self.visual.face_colors, face_ids)
90
+ self.visual.face_colors[:] = new_fc
91
+
92
+ @staticmethod
93
+ def cat(meshes):
94
+ return trimesh.util.concatenate(meshes)
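A minimal sketch of the Mesh wrapper above, built from a trimesh primitive instead of a file on disk (assumed usage):

```python
import trimesh

from common.mesh import Mesh, colors

box = trimesh.creation.box(extents=(0.1, 0.1, 0.1))
m = Mesh(v=box.vertices, f=box.faces, vc=colors["red"])  # per-vertex color

combined = Mesh.cat([m, box])   # plain trimesh concatenation of both meshes
print(combined.vertices.shape)  # (16, 3)
```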
common/metrics.py ADDED
@@ -0,0 +1,51 @@
1
+ import math
2
+
3
+ import numpy as np
4
+ import torch
5
+
6
+
7
+ def compute_v2v_dist_no_reduce(v3d_cam_gt, v3d_cam_pred, is_valid):
8
+ assert isinstance(v3d_cam_gt, list)
9
+ assert isinstance(v3d_cam_pred, list)
10
+ assert len(v3d_cam_gt) == len(v3d_cam_pred)
11
+ assert len(v3d_cam_gt) == len(is_valid)
12
+ v2v = []
13
+ for v_gt, v_pred, valid in zip(v3d_cam_gt, v3d_cam_pred, is_valid):
14
+ if valid:
15
+ dist = ((v_gt - v_pred) ** 2).sum(dim=1).sqrt().cpu().numpy() # meter
16
+ else:
17
+ dist = None
18
+ v2v.append(dist)
19
+ return v2v
20
+
21
+
22
+ def compute_joint3d_error(joints3d_cam_gt, joints3d_cam_pred, valid_jts):
23
+ valid_jts = valid_jts.view(-1)
24
+ assert joints3d_cam_gt.shape == joints3d_cam_pred.shape
25
+ assert joints3d_cam_gt.shape[0] == valid_jts.shape[0]
26
+ dist = ((joints3d_cam_gt - joints3d_cam_pred) ** 2).sum(dim=2).sqrt()
27
+ invalid_idx = torch.nonzero((1 - valid_jts).long()).view(-1)
28
+ dist[invalid_idx, :] = float("nan")
29
+ dist = dist.cpu().numpy()
30
+ return dist
31
+
32
+
33
+ def compute_mrrpe(root_r_gt, root_l_gt, root_r_pred, root_l_pred, is_valid):
34
+ rel_vec_gt = root_l_gt - root_r_gt
35
+ rel_vec_pred = root_l_pred - root_r_pred
36
+
37
+ invalid_idx = torch.nonzero((1 - is_valid).long()).view(-1)
38
+ mrrpe = ((rel_vec_pred - rel_vec_gt) ** 2).sum(dim=1).sqrt()
39
+ mrrpe[invalid_idx] = float("nan")
40
+ mrrpe = mrrpe.cpu().numpy()
41
+ return mrrpe
42
+
43
+
44
+ def compute_arti_deg_error(pred_radian, gt_radian):
45
+ assert pred_radian.shape == gt_radian.shape
46
+
47
+ # articulation error in degree
48
+ pred_degree = pred_radian / math.pi * 180 # degree
49
+ gt_degree = gt_radian / math.pi * 180 # degree
50
+ err_deg = torch.abs(pred_degree - gt_degree).tolist()
51
+ return np.array(err_deg, dtype=np.float32)
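A minimal sketch of the metric helpers above with random tensors (assumed usage; distances are in metres):

```python
import torch

from common.metrics import compute_joint3d_error, compute_mrrpe

B, J = 4, 21
gt = torch.rand(B, J, 3)
pred = gt + 0.01 * torch.randn(B, J, 3)
valid = torch.tensor([1.0, 1.0, 0.0, 1.0])  # third sample is invalid

err = compute_joint3d_error(gt, pred, valid)  # (B, J), NaN rows where invalid
print(err.shape)

mrrpe = compute_mrrpe(
    root_r_gt=torch.rand(B, 3),
    root_l_gt=torch.rand(B, 3),
    root_r_pred=torch.rand(B, 3),
    root_l_pred=torch.rand(B, 3),
    is_valid=valid,
)
print(mrrpe.shape)  # (4,)
```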
common/np_utils.py ADDED
@@ -0,0 +1,7 @@
1
+ import numpy as np
2
+
3
+
4
+ def permute_np(x, idx):
5
+ original_perm = tuple(range(len(x.shape)))
6
+ x = np.moveaxis(x, original_perm, idx)
7
+ return x
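A short sketch (assumed usage): permute_np moves axis i of x to position idx[i] (np.moveaxis semantics), which is the inverse convention of np.transpose:

```python
import numpy as np

from common.np_utils import permute_np

x = np.zeros((2, 3, 4))
print(permute_np(x, (1, 2, 0)).shape)    # (4, 2, 3): axis 0 -> pos 1, 1 -> 2, 2 -> 0
print(np.transpose(x, (1, 2, 0)).shape)  # (3, 4, 2)
```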
common/object_tensors.py ADDED
@@ -0,0 +1,293 @@
1
+ import json
2
+ import os.path as op
3
+ import sys
4
+
5
+ import numpy as np
6
+ import torch
7
+ import torch.nn as nn
8
+ import trimesh
9
+ from easydict import EasyDict
10
+ from scipy.spatial.distance import cdist
11
+
12
+ sys.path = [".."] + sys.path
13
+ import common.thing as thing
14
+ from common.rot import axis_angle_to_quaternion, quaternion_apply
15
+ from common.torch_utils import pad_tensor_list
16
+ from common.xdict import xdict
17
+
18
+ # objects to consider for training so far
19
+ OBJECTS = [
20
+ "capsulemachine",
21
+ "box",
22
+ "ketchup",
23
+ "laptop",
24
+ "microwave",
25
+ "mixer",
26
+ "notebook",
27
+ "espressomachine",
28
+ "waffleiron",
29
+ "scissors",
30
+ "phone",
31
+ ]
32
+
33
+
34
+ class ObjectTensors(nn.Module):
35
+ def __init__(self):
36
+ super(ObjectTensors, self).__init__()
37
+ self.obj_tensors = thing.thing2dev(construct_obj_tensors(OBJECTS), "cpu")
38
+ self.dev = None
39
+
40
+ def forward_7d_batch(
41
+ self,
42
+ angles: (None, torch.Tensor),
43
+ global_orient: (None, torch.Tensor),
44
+ transl: (None, torch.Tensor),
45
+ query_names: list,
46
+ fwd_template: bool,
47
+ ):
48
+ self._sanity_check(angles, global_orient, transl, query_names, fwd_template)
49
+
50
+ # store output
51
+ out = xdict()
52
+
53
+ # meta info
54
+ obj_idx = np.array(
55
+ [self.obj_tensors["names"].index(name) for name in query_names]
56
+ )
57
+ out["diameter"] = self.obj_tensors["diameter"][obj_idx]
58
+ out["f"] = self.obj_tensors["f"][obj_idx]
59
+ out["f_len"] = self.obj_tensors["f_len"][obj_idx]
60
+ out["v_len"] = self.obj_tensors["v_len"][obj_idx]
61
+
62
+ max_len = out["v_len"].max()
63
+ out["v"] = self.obj_tensors["v"][obj_idx][:, :max_len]
64
+ out["mask"] = self.obj_tensors["mask"][obj_idx][:, :max_len]
65
+ out["v_sub"] = self.obj_tensors["v_sub"][obj_idx]
66
+ out["parts_ids"] = self.obj_tensors["parts_ids"][obj_idx][:, :max_len]
67
+ out["parts_sub_ids"] = self.obj_tensors["parts_sub_ids"][obj_idx]
68
+
69
+ if fwd_template:
70
+ return out
71
+
72
+ # articulation + global rotation
73
+ quat_arti = axis_angle_to_quaternion(self.obj_tensors["z_axis"] * angles)
74
+ quat_global = axis_angle_to_quaternion(global_orient.view(-1, 3))
75
+
76
+ # mm
77
+ # collect entities to be transformed
78
+ tf_dict = xdict()
79
+ tf_dict["v_top"] = out["v"].clone()
80
+ tf_dict["v_sub_top"] = out["v_sub"].clone()
81
+ tf_dict["v_bottom"] = out["v"].clone()
82
+ tf_dict["v_sub_bottom"] = out["v_sub"].clone()
83
+ tf_dict["bbox_top"] = self.obj_tensors["bbox_top"][obj_idx]
84
+ tf_dict["bbox_bottom"] = self.obj_tensors["bbox_bottom"][obj_idx]
85
+ tf_dict["kp_top"] = self.obj_tensors["kp_top"][obj_idx]
86
+ tf_dict["kp_bottom"] = self.obj_tensors["kp_bottom"][obj_idx]
87
+
88
+ # articulate top parts
89
+ for key, val in tf_dict.items():
90
+ if "top" in key:
91
+ val_rot = quaternion_apply(quat_arti[:, None, :], val)
92
+ tf_dict.overwrite(key, val_rot)
93
+
94
+ # global rotation for all
95
+ for key, val in tf_dict.items():
96
+ val_rot = quaternion_apply(quat_global[:, None, :], val)
97
+ if transl is not None:
98
+ val_rot = val_rot + transl[:, None, :]
99
+ tf_dict.overwrite(key, val_rot)
100
+
101
+ # prep output
102
+ top_idx = out["parts_ids"] == 1
103
+ v_tensor = tf_dict["v_bottom"].clone()
104
+ v_tensor[top_idx, :] = tf_dict["v_top"][top_idx, :]
105
+
106
+ top_idx = out["parts_sub_ids"] == 1
107
+ v_sub_tensor = tf_dict["v_sub_bottom"].clone()
108
+ v_sub_tensor[top_idx, :] = tf_dict["v_sub_top"][top_idx, :]
109
+
110
+ bbox = torch.cat((tf_dict["bbox_top"], tf_dict["bbox_bottom"]), dim=1)
111
+ kp3d = torch.cat((tf_dict["kp_top"], tf_dict["kp_bottom"]), dim=1)
112
+
113
+ out.overwrite("v", v_tensor)
114
+ out.overwrite("v_sub", v_sub_tensor)
115
+ out.overwrite("bbox3d", bbox)
116
+ out.overwrite("kp3d", kp3d)
117
+ return out
118
+
119
+ def forward(self, angles, global_orient, transl, query_names):
120
+ out = self.forward_7d_batch(
121
+ angles, global_orient, transl, query_names, fwd_template=False
122
+ )
123
+ return out
124
+
125
+ def forward_template(self, query_names):
126
+ out = self.forward_7d_batch(
127
+ angles=None,
128
+ global_orient=None,
129
+ transl=None,
130
+ query_names=query_names,
131
+ fwd_template=True,
132
+ )
133
+ return out
134
+
135
+ def to(self, dev):
136
+ self.obj_tensors = thing.thing2dev(self.obj_tensors, dev)
137
+ self.dev = dev
138
+
139
+ def _sanity_check(self, angles, global_orient, transl, query_names, fwd_template):
140
+ # sanity check
141
+ if not fwd_template:
142
+ # assume transl is in meter
143
+ if transl is not None:
144
+ transl = transl * 1000 # mm
145
+
146
+ batch_size = angles.shape[0]
147
+ assert angles.shape == (batch_size, 1)
148
+ assert global_orient.shape == (batch_size, 3)
149
+ if transl is not None:
150
+ assert isinstance(transl, torch.Tensor)
151
+ assert transl.shape == (batch_size, 3)
152
+ assert len(query_names) == batch_size
153
+
154
+
155
+ def construct_obj(object_model_p):
156
+ # load vtemplate
157
+ mesh_p = op.join(object_model_p, "mesh.obj")
158
+ parts_p = op.join(object_model_p, f"parts.json")
159
+ json_p = op.join(object_model_p, "object_params.json")
160
+ obj_name = op.basename(object_model_p)
161
+
162
+ top_sub_p = f"./data/arctic_data/data/meta/object_vtemplates/{obj_name}/top_keypoints_300.json"
163
+ bottom_sub_p = top_sub_p.replace("top_", "bottom_")
164
+ with open(top_sub_p, "r") as f:
165
+ sub_top = np.array(json.load(f)["keypoints"])
166
+
167
+ with open(bottom_sub_p, "r") as f:
168
+ sub_bottom = np.array(json.load(f)["keypoints"])
169
+ sub_v = np.concatenate((sub_top, sub_bottom), axis=0)
170
+
171
+ with open(parts_p, "r") as f:
172
+ parts = np.array(json.load(f), dtype=bool)
173
+
174
+ assert op.exists(mesh_p), f"Not found: {mesh_p}"
175
+
176
+ mesh = trimesh.exchange.load.load_mesh(mesh_p, process=False)
177
+ mesh_v = mesh.vertices
178
+
179
+ mesh_f = torch.LongTensor(mesh.faces)
180
+ vidx = np.argmin(cdist(sub_v, mesh_v, metric="euclidean"), axis=1)
181
+ parts_sub = parts[vidx]
182
+
183
+ vsk = object_model_p.split("/")[-1]
184
+
185
+ with open(json_p, "r") as f:
186
+ params = json.load(f)
187
+ rest = EasyDict()
188
+ rest.top = np.array(params["mocap_top"])
189
+ rest.bottom = np.array(params["mocap_bottom"])
190
+ bbox_top = np.array(params["bbox_top"])
191
+ bbox_bottom = np.array(params["bbox_bottom"])
192
+ kp_top = np.array(params["keypoints_top"])
193
+ kp_bottom = np.array(params["keypoints_bottom"])
194
+
195
+ np.random.seed(1)
196
+
197
+ obj = EasyDict()
198
+ obj.name = vsk
199
+ obj.obj_name = "".join([i for i in vsk if not i.isdigit()])
200
+ obj.v = torch.FloatTensor(mesh_v)
201
+ obj.v_sub = torch.FloatTensor(sub_v)
202
+ obj.f = torch.LongTensor(mesh_f)
203
+ obj.parts = torch.LongTensor(parts)
204
+ obj.parts_sub = torch.LongTensor(parts_sub)
205
+
206
+ with open("./data/arctic_data/data/meta/object_meta.json", "r") as f:
207
+ object_meta = json.load(f)
208
+ obj.diameter = torch.FloatTensor(np.array(object_meta[obj.obj_name]["diameter"]))
209
+ obj.bbox_top = torch.FloatTensor(bbox_top)
210
+ obj.bbox_bottom = torch.FloatTensor(bbox_bottom)
211
+ obj.kp_top = torch.FloatTensor(kp_top)
212
+ obj.kp_bottom = torch.FloatTensor(kp_bottom)
213
+ obj.mocap_top = torch.FloatTensor(np.array(params["mocap_top"]))
214
+ obj.mocap_bottom = torch.FloatTensor(np.array(params["mocap_bottom"]))
215
+ return obj
216
+
217
+
218
+ def construct_obj_tensors(object_names):
219
+ obj_list = []
220
+ for k in object_names:
221
+ object_model_p = f"./data/arctic_data/data/meta/object_vtemplates/%s" % (k)
222
+ obj = construct_obj(object_model_p)
223
+ obj_list.append(obj)
224
+
225
+ bbox_top_list = []
226
+ bbox_bottom_list = []
227
+ mocap_top_list = []
228
+ mocap_bottom_list = []
229
+ kp_top_list = []
230
+ kp_bottom_list = []
231
+ v_list = []
232
+ v_sub_list = []
233
+ f_list = []
234
+ parts_list = []
235
+ parts_sub_list = []
236
+ diameter_list = []
237
+ for obj in obj_list:
238
+ v_list.append(obj.v)
239
+ v_sub_list.append(obj.v_sub)
240
+ f_list.append(obj.f)
241
+
242
+ # root_list.append(obj.root)
243
+ bbox_top_list.append(obj.bbox_top)
244
+ bbox_bottom_list.append(obj.bbox_bottom)
245
+ kp_top_list.append(obj.kp_top)
246
+ kp_bottom_list.append(obj.kp_bottom)
247
+ mocap_top_list.append(obj.mocap_top / 1000)
248
+ mocap_bottom_list.append(obj.mocap_bottom / 1000)
249
+ parts_list.append(obj.parts + 1)
250
+ parts_sub_list.append(obj.parts_sub + 1)
251
+ diameter_list.append(obj.diameter)
252
+
253
+ v_list, v_len_list = pad_tensor_list(v_list)
254
+ p_list, p_len_list = pad_tensor_list(parts_list)
255
+ ps_list = torch.stack(parts_sub_list, dim=0)
256
+ assert (p_len_list - v_len_list).sum() == 0
257
+
258
+ max_len = v_len_list.max()
259
+ mask = torch.zeros(len(obj_list), max_len)
260
+ for idx, vlen in enumerate(v_len_list):
261
+ mask[idx, :vlen] = 1.0
262
+
263
+ v_sub_list = torch.stack(v_sub_list, dim=0)
264
+ diameter_list = torch.stack(diameter_list, dim=0)
265
+
266
+ f_list, f_len_list = pad_tensor_list(f_list)
267
+
268
+ bbox_top_list = torch.stack(bbox_top_list, dim=0)
269
+ bbox_bottom_list = torch.stack(bbox_bottom_list, dim=0)
270
+ kp_top_list = torch.stack(kp_top_list, dim=0)
271
+ kp_bottom_list = torch.stack(kp_bottom_list, dim=0)
272
+
273
+ obj_tensors = {}
274
+ obj_tensors["names"] = object_names
275
+ obj_tensors["parts_ids"] = p_list
276
+ obj_tensors["parts_sub_ids"] = ps_list
277
+
278
+ obj_tensors["v"] = v_list.float() / 1000
279
+ obj_tensors["v_sub"] = v_sub_list.float() / 1000
280
+ obj_tensors["v_len"] = v_len_list
281
+ obj_tensors["f"] = f_list
282
+ obj_tensors["f_len"] = f_len_list
283
+ obj_tensors["diameter"] = diameter_list.float()
284
+
285
+ obj_tensors["mask"] = mask
286
+ obj_tensors["bbox_top"] = bbox_top_list.float() / 1000
287
+ obj_tensors["bbox_bottom"] = bbox_bottom_list.float() / 1000
288
+ obj_tensors["kp_top"] = kp_top_list.float() / 1000
289
+ obj_tensors["kp_bottom"] = kp_bottom_list.float() / 1000
290
+ obj_tensors["mocap_top"] = mocap_top_list
291
+ obj_tensors["mocap_bottom"] = mocap_bottom_list
292
+ obj_tensors["z_axis"] = torch.FloatTensor(np.array([0, 0, -1])).view(1, 3)
293
+ return obj_tensors
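A minimal usage sketch for the object layer above. It assumes the ARCTIC meta files under ./data/arctic_data/data/meta/ have been downloaded, since ObjectTensors loads every object template from disk at construction time:

```python
import torch

from common.object_tensors import ObjectTensors

obj_layer = ObjectTensors()
obj_layer.to("cpu")

B = 2
out = obj_layer(
    angles=torch.zeros(B, 1),         # articulation angle in radians
    global_orient=torch.zeros(B, 3),  # axis-angle global rotation
    transl=torch.zeros(B, 3),         # translation in metres
    query_names=["box", "laptop"],
)
# padded vertices, per-vertex validity mask and 3D keypoints (top + bottom parts)
print(out["v"].shape, out["mask"].shape, out["kp3d"].shape)
```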
common/pl_utils.py ADDED
@@ -0,0 +1,63 @@
1
+ import random
2
+ import time
3
+
4
+ import torch
5
+
6
+ import common.thing as thing
7
+ from common.ld_utils import ld2dl
8
+
9
+
10
+ def reweight_loss_by_keys(loss_dict, keys, alpha):
11
+ for key in keys:
12
+ val, weight = loss_dict[key]
13
+ weight_new = weight * alpha
14
+ loss_dict[key] = (val, weight_new)
15
+ return loss_dict
16
+
17
+
18
+ def select_loss_group(groups, agent_id, alphas):
19
+ random.seed(1)
20
+ random.shuffle(groups)
21
+
22
+ keys = groups[agent_id % len(groups)]
23
+
24
+ random.seed(time.time())
25
+ alpha = random.choice(alphas)
26
+ random.seed(1)
27
+ return keys, alpha
28
+
29
+
30
+ def push_checkpoint_metric(key, val):
31
+ val = float(val)
32
+ checkpt_metric = torch.FloatTensor([val])
33
+ result = {key: checkpt_metric}
34
+ return result
35
+
36
+
37
+ def avg_losses_cpu(outputs):
38
+ outputs = ld2dl(outputs)
39
+ for key, val in outputs.items():
40
+ val = [v.cpu() for v in val]
41
+ val = torch.cat(val, dim=0).view(-1)
42
+ outputs[key] = val.mean()
43
+ return outputs
44
+
45
+
46
+ def reform_outputs(out_list):
47
+ out_list_dict = ld2dl(out_list)
48
+ outputs = ld2dl(out_list_dict["out_dict"])
49
+ losses = ld2dl(out_list_dict["loss"])
50
+
51
+ for k, tensor in outputs.items():
52
+ if isinstance(tensor[0], list):
53
+ outputs[k] = sum(tensor, [])
54
+ else:
55
+ outputs[k] = torch.cat(tensor)
56
+
57
+ for k, tensor in losses.items():
58
+ tensor = [ten.view(-1) for ten in tensor]
59
+ losses[k] = torch.cat(tensor)
60
+
61
+ outputs = {k: thing.thing2np(v) for k, v in outputs.items()}
62
+ loss_dict = {k: v.mean().item() for k, v in losses.items()}
63
+ return outputs, loss_dict
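A short sketch of the loss bookkeeping helpers above (assumed usage with dummy values):

```python
import torch

from common.pl_utils import avg_losses_cpu, push_checkpoint_metric, reweight_loss_by_keys

loss_dict = {
    "loss/kp2d": (torch.tensor(0.8), 10.0),  # (value, weight) pairs
    "loss/pose": (torch.tensor(0.2), 1.0),
}
loss_dict = reweight_loss_by_keys(loss_dict, keys=["loss/kp2d"], alpha=0.5)
print(loss_dict["loss/kp2d"][1])  # weight halved to 5.0

print(push_checkpoint_metric("mpjpe", 42.0))  # {'mpjpe': tensor([42.])}

# average per-step losses collected over a validation epoch
steps = [{"loss": torch.tensor([0.5])}, {"loss": torch.tensor([0.3])}]
print(avg_losses_cpu(steps)["loss"])  # tensor(0.4000)
```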
common/rend_utils.py ADDED
@@ -0,0 +1,139 @@
1
+ import copy
2
+ import os
3
+
4
+ import numpy as np
5
+ import pyrender
6
+ import trimesh
7
+
8
+ # offline rendering
9
+ os.environ["PYOPENGL_PLATFORM"] = "egl"
10
+
11
+
12
+ def flip_meshes(meshes):
13
+ rot = trimesh.transformations.rotation_matrix(np.radians(180), [1, 0, 0])
14
+ for mesh in meshes:
15
+ mesh.apply_transform(rot)
16
+ return meshes
17
+
18
+
19
+ def color2material(mesh_color: list):
20
+ material = pyrender.MetallicRoughnessMaterial(
21
+ metallicFactor=0.1,
22
+ alphaMode="OPAQUE",
23
+ baseColorFactor=(
24
+ mesh_color[0] / 255.0,
25
+ mesh_color[1] / 255.0,
26
+ mesh_color[2] / 255.0,
27
+ 0.5,
28
+ ),
29
+ )
30
+ return material
31
+
32
+
33
+ class Renderer:
34
+ def __init__(self, img_res: int) -> None:
35
+ self.renderer = pyrender.OffscreenRenderer(
36
+ viewport_width=img_res, viewport_height=img_res, point_size=1.0
37
+ )
38
+
39
+ self.img_res = img_res
40
+
41
+ def render_meshes_pose(
42
+ self,
43
+ meshes,
44
+ image=None,
45
+ cam_transl=None,
46
+ cam_center=None,
47
+ K=None,
48
+ materials=None,
49
+ sideview_angle=None,
50
+ ):
51
+ # unpack
52
+ if cam_transl is not None:
53
+ cam_trans = np.copy(cam_transl)
54
+ cam_trans[0] *= -1.0
55
+ else:
56
+ cam_trans = None
57
+ meshes = copy.deepcopy(meshes)
58
+ meshes = flip_meshes(meshes)
59
+
60
+ if sideview_angle is not None:
61
+ # center around the final mesh
62
+ anchor_mesh = meshes[-1]
63
+ center = anchor_mesh.vertices.mean(axis=0)
64
+
65
+ rot = trimesh.transformations.rotation_matrix(
66
+ np.radians(sideview_angle), [0, 1, 0]
67
+ )
68
+ out_meshes = []
69
+ for mesh in copy.deepcopy(meshes):
70
+ mesh.vertices -= center
71
+ mesh.apply_transform(rot)
72
+ mesh.vertices += center
73
+ # further away to see more
74
+ mesh.vertices += np.array([0, 0, -0.10])
75
+ out_meshes.append(mesh)
76
+ meshes = out_meshes
77
+
78
+ # setting up
79
+ self.create_scene()
80
+ self.setup_light()
81
+ self.position_camera(cam_trans, K)
82
+ if materials is not None:
83
+ meshes = [
84
+ pyrender.Mesh.from_trimesh(mesh, material=material)
85
+ for mesh, material in zip(meshes, materials)
86
+ ]
87
+ else:
88
+ meshes = [pyrender.Mesh.from_trimesh(mesh) for mesh in meshes]
89
+
90
+ for mesh in meshes:
91
+ self.scene.add(mesh)
92
+
93
+ color, valid_mask = self.render_rgb()
94
+ if image is None:
95
+ output_img = color[:, :, :3]
96
+ else:
97
+ output_img = self.overlay_image(color, valid_mask, image)
98
+ rend_img = (output_img * 255).astype(np.uint8)
99
+ return rend_img
100
+
101
+ def render_rgb(self):
102
+ color, rend_depth = self.renderer.render(
103
+ self.scene, flags=pyrender.RenderFlags.RGBA
104
+ )
105
+ color = color.astype(np.float32) / 255.0
106
+ valid_mask = (rend_depth > 0)[:, :, None]
107
+ return color, valid_mask
108
+
109
+ def overlay_image(self, color, valid_mask, image):
110
+ output_img = color[:, :, :3] * valid_mask + (1 - valid_mask) * image
111
+ return output_img
112
+
113
+ def position_camera(self, cam_transl, K):
114
+ camera_pose = np.eye(4)
115
+ if cam_transl is not None:
116
+ camera_pose[:3, 3] = cam_transl
117
+
118
+ fx = K[0, 0]
119
+ fy = K[1, 1]
120
+ cx = K[0, 2]
121
+ cy = K[1, 2]
122
+ camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)
123
+ self.scene.add(camera, pose=camera_pose)
124
+
125
+ def setup_light(self):
126
+ light = pyrender.DirectionalLight(color=[1.0, 1.0, 1.0], intensity=1)
127
+ light_pose = np.eye(4)
128
+
129
+ light_pose[:3, 3] = np.array([0, -1, 1])
130
+ self.scene.add(light, pose=light_pose)
131
+
132
+ light_pose[:3, 3] = np.array([0, 1, 1])
133
+ self.scene.add(light, pose=light_pose)
134
+
135
+ light_pose[:3, 3] = np.array([1, 1, 2])
136
+ self.scene.add(light, pose=light_pose)
137
+
138
+ def create_scene(self):
139
+ self.scene = pyrender.Scene(ambient_light=(0.5, 0.5, 0.5))
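A minimal sketch of the offscreen renderer above (assumed usage; it needs a working EGL/GPU setup because PYOPENGL_PLATFORM is forced to "egl" at import time):

```python
import numpy as np
import trimesh

from common.rend_utils import Renderer, color2material

img_res = 256
K = np.array(
    [[500.0, 0.0, img_res / 2.0],
     [0.0, 500.0, img_res / 2.0],
     [0.0, 0.0, 1.0]]
)

# a small sphere in front of the camera (meshes are flipped 180 deg about x inside)
sphere = trimesh.creation.icosphere(radius=0.05)
sphere.vertices += np.array([0.0, 0.0, 0.5])

rend = Renderer(img_res)
img = rend.render_meshes_pose(
    meshes=[sphere],
    cam_transl=np.zeros(3),
    K=K,
    materials=[color2material([0, 255, 0])],
)
print(img.shape, img.dtype)  # (256, 256, 3) uint8
```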
common/rot.py ADDED
@@ -0,0 +1,782 @@
1
+ import cv2
2
+ import numpy as np
3
+ import torch
4
+ from torch.nn import functional as F
5
+
6
+ """
7
+ Taken from https://pytorch3d.readthedocs.io/en/latest/_modules/pytorch3d/transforms/rotation_conversions.html
8
+ Copied here to avoid a hard dependency on pytorch3d.
9
+ """
10
+
11
+
12
+ def standardize_quaternion(quaternions: torch.Tensor) -> torch.Tensor:
13
+ """
14
+ Convert a unit quaternion to a standard form: one in which the real
15
+ part is non negative.
16
+
17
+ Args:
18
+ quaternions: Quaternions with real part first,
19
+ as tensor of shape (..., 4).
20
+
21
+ Returns:
22
+ Standardized quaternions as tensor of shape (..., 4).
23
+ """
24
+ return torch.where(quaternions[..., 0:1] < 0, -quaternions, quaternions)
25
+
26
+
27
+ def quaternion_multiply(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
28
+ """
29
+ Multiply two quaternions representing rotations, returning the quaternion
30
+ representing their composition, i.e. the versor with nonnegative real part.
31
+ Usual torch rules for broadcasting apply.
32
+
33
+ Args:
34
+ a: Quaternions as tensor of shape (..., 4), real part first.
35
+ b: Quaternions as tensor of shape (..., 4), real part first.
36
+
37
+ Returns:
38
+ The product of a and b, a tensor of quaternions of shape (..., 4).
39
+ """
40
+ ab = quaternion_raw_multiply(a, b)
41
+ return standardize_quaternion(ab)
42
+
43
+
44
+ def _sqrt_positive_part(x: torch.Tensor) -> torch.Tensor:
45
+ """
46
+ Returns torch.sqrt(torch.max(0, x))
47
+ but with a zero subgradient where x is 0.
48
+ """
49
+ ret = torch.zeros_like(x)
50
+ positive_mask = x > 0
51
+ ret[positive_mask] = torch.sqrt(x[positive_mask])
52
+ return ret
53
+
54
+
55
+ def quaternion_to_axis_angle(quaternions: torch.Tensor) -> torch.Tensor:
56
+ """
57
+ Convert rotations given as quaternions to axis/angle.
58
+
59
+ Args:
60
+ quaternions: quaternions with real part first,
61
+ as tensor of shape (..., 4).
62
+
63
+ Returns:
64
+ Rotations given as a vector in axis angle form, as a tensor
65
+ of shape (..., 3), where the magnitude is the angle
66
+ turned anticlockwise in radians around the vector's
67
+ direction.
68
+ """
69
+ norms = torch.norm(quaternions[..., 1:], p=2, dim=-1, keepdim=True)
70
+ half_angles = torch.atan2(norms, quaternions[..., :1])
71
+ angles = 2 * half_angles
72
+ eps = 1e-6
73
+ small_angles = angles.abs() < eps
74
+ sin_half_angles_over_angles = torch.empty_like(angles)
75
+ sin_half_angles_over_angles[~small_angles] = (
76
+ torch.sin(half_angles[~small_angles]) / angles[~small_angles]
77
+ )
78
+ # for x small, sin(x/2) is about x/2 - (x/2)^3/6
79
+ # so sin(x/2)/x is about 1/2 - (x*x)/48
80
+ sin_half_angles_over_angles[small_angles] = (
81
+ 0.5 - (angles[small_angles] * angles[small_angles]) / 48
82
+ )
83
+ return quaternions[..., 1:] / sin_half_angles_over_angles
84
+
85
+
86
+ def quaternion_to_matrix(quaternions: torch.Tensor) -> torch.Tensor:
87
+ """
88
+ Convert rotations given as quaternions to rotation matrices.
89
+
90
+ Args:
91
+ quaternions: quaternions with real part first,
92
+ as tensor of shape (..., 4).
93
+
94
+ Returns:
95
+ Rotation matrices as tensor of shape (..., 3, 3).
96
+ """
97
+ r, i, j, k = torch.unbind(quaternions, -1)
98
+ # pyre-fixme[58]: `/` is not supported for operand types `float` and `Tensor`.
99
+ two_s = 2.0 / (quaternions * quaternions).sum(-1)
100
+
101
+ o = torch.stack(
102
+ (
103
+ 1 - two_s * (j * j + k * k),
104
+ two_s * (i * j - k * r),
105
+ two_s * (i * k + j * r),
106
+ two_s * (i * j + k * r),
107
+ 1 - two_s * (i * i + k * k),
108
+ two_s * (j * k - i * r),
109
+ two_s * (i * k - j * r),
110
+ two_s * (j * k + i * r),
111
+ 1 - two_s * (i * i + j * j),
112
+ ),
113
+ -1,
114
+ )
115
+ return o.reshape(quaternions.shape[:-1] + (3, 3))
116
+
117
+
118
+ def matrix_to_quaternion(matrix: torch.Tensor) -> torch.Tensor:
119
+ """
120
+ Convert rotations given as rotation matrices to quaternions.
121
+
122
+ Args:
123
+ matrix: Rotation matrices as tensor of shape (..., 3, 3).
124
+
125
+ Returns:
126
+ quaternions with real part first, as tensor of shape (..., 4).
127
+ """
128
+ if matrix.size(-1) != 3 or matrix.size(-2) != 3:
129
+ raise ValueError(f"Invalid rotation matrix shape {matrix.shape}.")
130
+
131
+ batch_dim = matrix.shape[:-2]
132
+ m00, m01, m02, m10, m11, m12, m20, m21, m22 = torch.unbind(
133
+ matrix.reshape(batch_dim + (9,)), dim=-1
134
+ )
135
+
136
+ q_abs = _sqrt_positive_part(
137
+ torch.stack(
138
+ [
139
+ 1.0 + m00 + m11 + m22,
140
+ 1.0 + m00 - m11 - m22,
141
+ 1.0 - m00 + m11 - m22,
142
+ 1.0 - m00 - m11 + m22,
143
+ ],
144
+ dim=-1,
145
+ )
146
+ )
147
+
148
+ # we produce the desired quaternion multiplied by each of r, i, j, k
149
+ quat_by_rijk = torch.stack(
150
+ [
151
+ # pyre-fixme[58]: `**` is not supported for operand types `Tensor` and
152
+ # `int`.
153
+ torch.stack([q_abs[..., 0] ** 2, m21 - m12, m02 - m20, m10 - m01], dim=-1),
154
+ # pyre-fixme[58]: `**` is not supported for operand types `Tensor` and
155
+ # `int`.
156
+ torch.stack([m21 - m12, q_abs[..., 1] ** 2, m10 + m01, m02 + m20], dim=-1),
157
+ # pyre-fixme[58]: `**` is not supported for operand types `Tensor` and
158
+ # `int`.
159
+ torch.stack([m02 - m20, m10 + m01, q_abs[..., 2] ** 2, m12 + m21], dim=-1),
160
+ # pyre-fixme[58]: `**` is not supported for operand types `Tensor` and
161
+ # `int`.
162
+ torch.stack([m10 - m01, m20 + m02, m21 + m12, q_abs[..., 3] ** 2], dim=-1),
163
+ ],
164
+ dim=-2,
165
+ )
166
+
167
+ # We floor here at 0.1 but the exact level is not important; if q_abs is small,
168
+ # the candidate won't be picked.
169
+ flr = torch.tensor(0.1).to(dtype=q_abs.dtype, device=q_abs.device)
170
+ quat_candidates = quat_by_rijk / (2.0 * q_abs[..., None].max(flr))
171
+
172
+ # if not for numerical problems, quat_candidates[i] should be same (up to a sign),
173
+ # forall i; we pick the best-conditioned one (with the largest denominator)
174
+
175
+ return quat_candidates[
176
+ F.one_hot(q_abs.argmax(dim=-1), num_classes=4) > 0.5, :
177
+ ].reshape(batch_dim + (4,))
178
+
179
+
180
+ def matrix_to_axis_angle(matrix: torch.Tensor) -> torch.Tensor:
181
+ """
182
+ Convert rotations given as rotation matrices to axis/angle.
183
+
184
+ Args:
185
+ matrix: Rotation matrices as tensor of shape (..., 3, 3).
186
+
187
+ Returns:
188
+ Rotations given as a vector in axis angle form, as a tensor
189
+ of shape (..., 3), where the magnitude is the angle
190
+ turned anticlockwise in radians around the vector's
191
+ direction.
192
+ """
193
+ return quaternion_to_axis_angle(matrix_to_quaternion(matrix))
194
+
195
+
196
+ def rot_aa(aa, rot):
197
+ """Rotate axis angle parameters."""
198
+ # pose parameters
199
+ R = np.array(
200
+ [
201
+ [np.cos(np.deg2rad(-rot)), -np.sin(np.deg2rad(-rot)), 0],
202
+ [np.sin(np.deg2rad(-rot)), np.cos(np.deg2rad(-rot)), 0],
203
+ [0, 0, 1],
204
+ ]
205
+ )
206
+ # find the rotation of the body in camera frame
207
+ per_rdg, _ = cv2.Rodrigues(aa)
208
+ # apply the global rotation to the global orientation
209
+ resrot, _ = cv2.Rodrigues(np.dot(R, per_rdg))
210
+ aa = (resrot.T)[0]
211
+ return aa
212
+
213
+
214
+ def quat2mat(quat):
215
+ """
216
+ This function is borrowed from https://github.com/MandyMo/pytorch_HMR/blob/master/src/util.py#L50
217
+ Convert quaternion coefficients to rotation matrix.
218
+ Args:
219
+ quat: size = [batch_size, 4] 4 <===>(w, x, y, z)
220
+ Returns:
221
+ Rotation matrix corresponding to the quaternion -- size = [batch_size, 3, 3]
222
+ """
223
+ norm_quat = quat
224
+ norm_quat = norm_quat / norm_quat.norm(p=2, dim=1, keepdim=True)
225
+ w, x, y, z = norm_quat[:, 0], norm_quat[:, 1], norm_quat[:, 2], norm_quat[:, 3]
226
+
227
+ batch_size = quat.size(0)
228
+
229
+ w2, x2, y2, z2 = w.pow(2), x.pow(2), y.pow(2), z.pow(2)
230
+ wx, wy, wz = w * x, w * y, w * z
231
+ xy, xz, yz = x * y, x * z, y * z
232
+
233
+ rotMat = torch.stack(
234
+ [
235
+ w2 + x2 - y2 - z2,
236
+ 2 * xy - 2 * wz,
237
+ 2 * wy + 2 * xz,
238
+ 2 * wz + 2 * xy,
239
+ w2 - x2 + y2 - z2,
240
+ 2 * yz - 2 * wx,
241
+ 2 * xz - 2 * wy,
242
+ 2 * wx + 2 * yz,
243
+ w2 - x2 - y2 + z2,
244
+ ],
245
+ dim=1,
246
+ ).view(batch_size, 3, 3)
247
+ return rotMat
248
+
249
+
250
+ def batch_aa2rot(axisang):
251
+ # This function is borrowed from https://github.com/MandyMo/pytorch_HMR/blob/master/src/util.py#L37
252
+ assert len(axisang.shape) == 2
253
+ assert axisang.shape[1] == 3
254
+ # axisang N x 3
255
+ axisang_norm = torch.norm(axisang + 1e-8, p=2, dim=1)
256
+ angle = torch.unsqueeze(axisang_norm, -1)
257
+ axisang_normalized = torch.div(axisang, angle)
258
+ angle = angle * 0.5
259
+ v_cos = torch.cos(angle)
260
+ v_sin = torch.sin(angle)
261
+ quat = torch.cat([v_cos, v_sin * axisang_normalized], dim=1)
262
+ rot_mat = quat2mat(quat)
263
+ rot_mat = rot_mat.view(rot_mat.shape[0], 9)
264
+ return rot_mat
265
+
266
+
267
+ def batch_rot2aa(Rs):
268
+ assert len(Rs.shape) == 3
269
+ assert Rs.shape[1] == Rs.shape[2]
270
+ assert Rs.shape[1] == 3
271
+
272
+ """
273
+ Rs is B x 3 x 3
274
+ void cMathUtil::RotMatToAxisAngle(const tMatrix& mat, tVector& out_axis,
275
+ double& out_theta)
276
+ {
277
+ double c = 0.5 * (mat(0, 0) + mat(1, 1) + mat(2, 2) - 1);
278
+ c = cMathUtil::Clamp(c, -1.0, 1.0);
279
+
280
+ out_theta = std::acos(c);
281
+
282
+ if (std::abs(out_theta) < 0.00001)
283
+ {
284
+ out_axis = tVector(0, 0, 1, 0);
285
+ }
286
+ else
287
+ {
288
+ double m21 = mat(2, 1) - mat(1, 2);
289
+ double m02 = mat(0, 2) - mat(2, 0);
290
+ double m10 = mat(1, 0) - mat(0, 1);
291
+ double denom = std::sqrt(m21 * m21 + m02 * m02 + m10 * m10);
292
+ out_axis[0] = m21 / denom;
293
+ out_axis[1] = m02 / denom;
294
+ out_axis[2] = m10 / denom;
295
+ out_axis[3] = 0;
296
+ }
297
+ }
298
+ """
299
+ cos = 0.5 * (torch.stack([torch.trace(x) for x in Rs]) - 1)
300
+ cos = torch.clamp(cos, -1, 1)
301
+
302
+ theta = torch.acos(cos)
303
+
304
+ m21 = Rs[:, 2, 1] - Rs[:, 1, 2]
305
+ m02 = Rs[:, 0, 2] - Rs[:, 2, 0]
306
+ m10 = Rs[:, 1, 0] - Rs[:, 0, 1]
307
+ denom = torch.sqrt(m21 * m21 + m02 * m02 + m10 * m10)
308
+
309
+ axis0 = torch.where(torch.abs(theta) < 0.00001, m21, m21 / denom)
310
+ axis1 = torch.where(torch.abs(theta) < 0.00001, m02, m02 / denom)
311
+ axis2 = torch.where(torch.abs(theta) < 0.00001, m10, m10 / denom)
312
+
313
+ return theta.unsqueeze(1) * torch.stack([axis0, axis1, axis2], 1)
314
+
315
+
316
+ def batch_rodrigues(theta):
317
+ """Convert axis-angle representation to rotation matrix.
318
+ Args:
319
+ theta: size = [B, 3]
320
+ Returns:
321
+ Rotation matrix corresponding to the quaternion -- size = [B, 3, 3]
322
+ """
323
+ l1norm = torch.norm(theta + 1e-8, p=2, dim=1)
324
+ angle = torch.unsqueeze(l1norm, -1)
325
+ normalized = torch.div(theta, angle)
326
+ angle = angle * 0.5
327
+ v_cos = torch.cos(angle)
328
+ v_sin = torch.sin(angle)
329
+ quat = torch.cat([v_cos, v_sin * normalized], dim=1)
330
+ return quat_to_rotmat(quat)
331
+
332
+
333
+ def quat_to_rotmat(quat):
334
+ """Convert quaternion coefficients to rotation matrix.
335
+ Args:
336
+ quat: size = [B, 4] 4 <===>(w, x, y, z)
337
+ Returns:
338
+ Rotation matrix corresponding to the quaternion -- size = [B, 3, 3]
339
+ """
340
+ norm_quat = quat
341
+ norm_quat = norm_quat / norm_quat.norm(p=2, dim=1, keepdim=True)
342
+ w, x, y, z = norm_quat[:, 0], norm_quat[:, 1], norm_quat[:, 2], norm_quat[:, 3]
343
+
344
+ B = quat.size(0)
345
+
346
+ w2, x2, y2, z2 = w.pow(2), x.pow(2), y.pow(2), z.pow(2)
347
+ wx, wy, wz = w * x, w * y, w * z
348
+ xy, xz, yz = x * y, x * z, y * z
349
+
350
+ rotMat = torch.stack(
351
+ [
352
+ w2 + x2 - y2 - z2,
353
+ 2 * xy - 2 * wz,
354
+ 2 * wy + 2 * xz,
355
+ 2 * wz + 2 * xy,
356
+ w2 - x2 + y2 - z2,
357
+ 2 * yz - 2 * wx,
358
+ 2 * xz - 2 * wy,
359
+ 2 * wx + 2 * yz,
360
+ w2 - x2 - y2 + z2,
361
+ ],
362
+ dim=1,
363
+ ).view(B, 3, 3)
364
+ return rotMat
365
+
366
+
367
+ def rot6d_to_rotmat(x):
368
+ """Convert 6D rotation representation to 3x3 rotation matrix.
369
+ Based on Zhou et al., "On the Continuity of Rotation Representations in Neural Networks", CVPR 2019
370
+ Input:
371
+ (B,6) Batch of 6-D rotation representations
372
+ Output:
373
+ (B,3,3) Batch of corresponding rotation matrices
374
+ """
375
+ x = x.reshape(-1, 3, 2)
376
+ a1 = x[:, :, 0]
377
+ a2 = x[:, :, 1]
378
+ b1 = F.normalize(a1)
379
+ b2 = F.normalize(a2 - torch.einsum("bi,bi->b", b1, a2).unsqueeze(-1) * b1)
380
+ b3 = torch.cross(b1, b2, dim=-1)
381
+ return torch.stack((b1, b2, b3), dim=-1)
382
+
383
+
384
+ def rotmat_to_rot6d(x):
385
+ rotmat = x.reshape(-1, 3, 3)
386
+ rot6d = rotmat[:, :, :2].reshape(x.shape[0], -1)
387
+ return rot6d
388
+
389
+
390
+ def rotation_matrix_to_angle_axis(rotation_matrix):
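A quick round-trip sketch (assumed usage) tying together the conversion helpers in this file:

```python
import torch

from common.rot import (
    batch_rodrigues,
    matrix_to_quaternion,
    quaternion_to_axis_angle,
    rot6d_to_rotmat,
    rotmat_to_rot6d,
)

aa = torch.tensor([[0.0, 0.0, 1.5708]])      # ~90 deg about z, shape (B, 3)
R = batch_rodrigues(aa)                      # (B, 3, 3)
r6d = rotmat_to_rot6d(R)                     # (B, 6): first two matrix columns
R_back = rot6d_to_rotmat(r6d)                # Gram-Schmidt recovers the matrix
print(torch.allclose(R, R_back, atol=1e-5))  # True

q = matrix_to_quaternion(R)                  # (B, 4), real part first
aa_back = quaternion_to_axis_angle(q)        # (B, 3)
print(torch.allclose(aa, aa_back, atol=1e-4))  # True
```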
391
+ """
392
+ This function is borrowed from https://github.com/kornia/kornia
393
+
394
+ Convert 3x4 rotation matrix to Rodrigues vector
395
+
396
+ Args:
397
+ rotation_matrix (Tensor): rotation matrix.
398
+
399
+ Returns:
400
+ Tensor: Rodrigues vector transformation.
401
+
402
+ Shape:
403
+ - Input: :math:`(N, 3, 4)`
404
+ - Output: :math:`(N, 3)`
405
+
406
+ Example:
407
+ >>> input = torch.rand(2, 3, 4) # Nx4x4
408
+ >>> output = tgm.rotation_matrix_to_angle_axis(input) # Nx3
409
+ """
410
+ if rotation_matrix.shape[1:] == (3, 3):
411
+ rot_mat = rotation_matrix.reshape(-1, 3, 3)
412
+ hom = (
413
+ torch.tensor([0, 0, 1], dtype=torch.float32, device=rotation_matrix.device)
414
+ .reshape(1, 3, 1)
415
+ .expand(rot_mat.shape[0], -1, -1)
416
+ )
417
+ rotation_matrix = torch.cat([rot_mat, hom], dim=-1)
418
+
419
+ quaternion = rotation_matrix_to_quaternion(rotation_matrix)
420
+ aa = quaternion_to_angle_axis(quaternion)
421
+ aa[torch.isnan(aa)] = 0.0
422
+ return aa
423
+
424
+
425
+ def quaternion_to_angle_axis(quaternion: torch.Tensor) -> torch.Tensor:
426
+ """
427
+ This function is borrowed from https://github.com/kornia/kornia
428
+
429
+ Convert quaternion vector to angle axis of rotation.
430
+
431
+ Adapted from ceres C++ library: ceres-solver/include/ceres/rotation.h
432
+
433
+ Args:
434
+ quaternion (torch.Tensor): tensor with quaternions.
435
+
436
+ Return:
437
+ torch.Tensor: tensor with angle axis of rotation.
438
+
439
+ Shape:
440
+ - Input: :math:`(*, 4)` where `*` means, any number of dimensions
441
+ - Output: :math:`(*, 3)`
442
+
443
+ Example:
444
+ >>> quaternion = torch.rand(2, 4) # Nx4
445
+ >>> angle_axis = tgm.quaternion_to_angle_axis(quaternion) # Nx3
446
+ """
447
+ if not torch.is_tensor(quaternion):
448
+ raise TypeError(
449
+ "Input type is not a torch.Tensor. Got {}".format(type(quaternion))
450
+ )
451
+
452
+ if not quaternion.shape[-1] == 4:
453
+ raise ValueError(
454
+ "Input must be a tensor of shape Nx4 or 4. Got {}".format(quaternion.shape)
455
+ )
456
+ # unpack input and compute conversion
457
+ q1: torch.Tensor = quaternion[..., 1]
458
+ q2: torch.Tensor = quaternion[..., 2]
459
+ q3: torch.Tensor = quaternion[..., 3]
460
+ sin_squared_theta: torch.Tensor = q1 * q1 + q2 * q2 + q3 * q3
461
+
462
+ sin_theta: torch.Tensor = torch.sqrt(sin_squared_theta)
463
+ cos_theta: torch.Tensor = quaternion[..., 0]
464
+ two_theta: torch.Tensor = 2.0 * torch.where(
465
+ cos_theta < 0.0,
466
+ torch.atan2(-sin_theta, -cos_theta),
467
+ torch.atan2(sin_theta, cos_theta),
468
+ )
469
+
470
+ k_pos: torch.Tensor = two_theta / sin_theta
471
+ k_neg: torch.Tensor = 2.0 * torch.ones_like(sin_theta)
472
+ k: torch.Tensor = torch.where(sin_squared_theta > 0.0, k_pos, k_neg)
473
+
474
+ angle_axis: torch.Tensor = torch.zeros_like(quaternion)[..., :3]
475
+ angle_axis[..., 0] += q1 * k
476
+ angle_axis[..., 1] += q2 * k
477
+ angle_axis[..., 2] += q3 * k
478
+ return angle_axis
479
+
480
+
481
+ def rotation_matrix_to_quaternion(rotation_matrix, eps=1e-6):
482
+ """
483
+ This function is borrowed from https://github.com/kornia/kornia
484
+
485
+ Convert 3x4 rotation matrix to 4d quaternion vector
486
+
487
+ This algorithm is based on algorithm described in
488
+ https://github.com/KieranWynn/pyquaternion/blob/master/pyquaternion/quaternion.py#L201
489
+
490
+ Args:
491
+ rotation_matrix (Tensor): the rotation matrix to convert.
492
+
493
+ Return:
494
+ Tensor: the rotation in quaternion
495
+
496
+ Shape:
497
+ - Input: :math:`(N, 3, 4)`
498
+ - Output: :math:`(N, 4)`
499
+
500
+ Example:
501
+ >>> input = torch.rand(4, 3, 4) # Nx3x4
502
+ >>> output = tgm.rotation_matrix_to_quaternion(input) # Nx4
503
+ """
504
+ if not torch.is_tensor(rotation_matrix):
505
+ raise TypeError(
506
+ "Input type is not a torch.Tensor. Got {}".format(type(rotation_matrix))
507
+ )
508
+
509
+ if len(rotation_matrix.shape) > 3:
510
+ raise ValueError(
511
+ "Input size must be a three dimensional tensor. Got {}".format(
512
+ rotation_matrix.shape
513
+ )
514
+ )
515
+ if not rotation_matrix.shape[-2:] == (3, 4):
516
+ raise ValueError(
517
+ "Input size must be a N x 3 x 4 tensor. Got {}".format(
518
+ rotation_matrix.shape
519
+ )
520
+ )
521
+
522
+ rmat_t = torch.transpose(rotation_matrix, 1, 2)
523
+
524
+ mask_d2 = rmat_t[:, 2, 2] < eps
525
+
526
+ mask_d0_d1 = rmat_t[:, 0, 0] > rmat_t[:, 1, 1]
527
+ mask_d0_nd1 = rmat_t[:, 0, 0] < -rmat_t[:, 1, 1]
528
+
529
+ t0 = 1 + rmat_t[:, 0, 0] - rmat_t[:, 1, 1] - rmat_t[:, 2, 2]
530
+ q0 = torch.stack(
531
+ [
532
+ rmat_t[:, 1, 2] - rmat_t[:, 2, 1],
533
+ t0,
534
+ rmat_t[:, 0, 1] + rmat_t[:, 1, 0],
535
+ rmat_t[:, 2, 0] + rmat_t[:, 0, 2],
536
+ ],
537
+ -1,
538
+ )
539
+ t0_rep = t0.repeat(4, 1).t()
540
+
541
+ t1 = 1 - rmat_t[:, 0, 0] + rmat_t[:, 1, 1] - rmat_t[:, 2, 2]
542
+ q1 = torch.stack(
543
+ [
544
+ rmat_t[:, 2, 0] - rmat_t[:, 0, 2],
545
+ rmat_t[:, 0, 1] + rmat_t[:, 1, 0],
546
+ t1,
547
+ rmat_t[:, 1, 2] + rmat_t[:, 2, 1],
548
+ ],
549
+ -1,
550
+ )
551
+ t1_rep = t1.repeat(4, 1).t()
552
+
553
+ t2 = 1 - rmat_t[:, 0, 0] - rmat_t[:, 1, 1] + rmat_t[:, 2, 2]
554
+ q2 = torch.stack(
555
+ [
556
+ rmat_t[:, 0, 1] - rmat_t[:, 1, 0],
557
+ rmat_t[:, 2, 0] + rmat_t[:, 0, 2],
558
+ rmat_t[:, 1, 2] + rmat_t[:, 2, 1],
559
+ t2,
560
+ ],
561
+ -1,
562
+ )
563
+ t2_rep = t2.repeat(4, 1).t()
564
+
565
+ t3 = 1 + rmat_t[:, 0, 0] + rmat_t[:, 1, 1] + rmat_t[:, 2, 2]
566
+ q3 = torch.stack(
567
+ [
568
+ t3,
569
+ rmat_t[:, 1, 2] - rmat_t[:, 2, 1],
570
+ rmat_t[:, 2, 0] - rmat_t[:, 0, 2],
571
+ rmat_t[:, 0, 1] - rmat_t[:, 1, 0],
572
+ ],
573
+ -1,
574
+ )
575
+ t3_rep = t3.repeat(4, 1).t()
576
+
577
+ mask_c0 = mask_d2 * mask_d0_d1
578
+ mask_c1 = mask_d2 * ~mask_d0_d1
579
+ mask_c2 = ~mask_d2 * mask_d0_nd1
580
+ mask_c3 = ~mask_d2 * ~mask_d0_nd1
581
+ mask_c0 = mask_c0.view(-1, 1).type_as(q0)
582
+ mask_c1 = mask_c1.view(-1, 1).type_as(q1)
583
+ mask_c2 = mask_c2.view(-1, 1).type_as(q2)
584
+ mask_c3 = mask_c3.view(-1, 1).type_as(q3)
585
+
586
+ q = q0 * mask_c0 + q1 * mask_c1 + q2 * mask_c2 + q3 * mask_c3
587
+ q /= torch.sqrt(
588
+ t0_rep * mask_c0
589
+ + t1_rep * mask_c1
590
+ + t2_rep * mask_c2 # noqa
591
+ + t3_rep * mask_c3
592
+ ) # noqa
593
+ q *= 0.5
594
+ return q
595
+
596
+
597
+ def batch_euler2matrix(r):
598
+ return quaternion_to_rotation_matrix(euler_to_quaternion(r))
599
+
600
+
601
+ def euler_to_quaternion(r):
602
+ x = r[..., 0]
603
+ y = r[..., 1]
604
+ z = r[..., 2]
605
+
606
+ z = z / 2.0
607
+ y = y / 2.0
608
+ x = x / 2.0
609
+ cz = torch.cos(z)
610
+ sz = torch.sin(z)
611
+ cy = torch.cos(y)
612
+ sy = torch.sin(y)
613
+ cx = torch.cos(x)
614
+ sx = torch.sin(x)
615
+ quaternion = torch.zeros_like(r.repeat(1, 2))[..., :4].to(r.device)
616
+ quaternion[..., 0] += cx * cy * cz - sx * sy * sz
617
+ quaternion[..., 1] += cx * sy * sz + cy * cz * sx
618
+ quaternion[..., 2] += cx * cz * sy - sx * cy * sz
619
+ quaternion[..., 3] += cx * cy * sz + sx * cz * sy
620
+ return quaternion
621
+
622
+
623
+ def quaternion_to_rotation_matrix(quat):
624
+ """Convert quaternion coefficients to rotation matrix.
625
+ Args:
626
+ quat: size = [B, 4] 4 <===>(w, x, y, z)
627
+ Returns:
628
+ Rotation matrix corresponding to the quaternion -- size = [B, 3, 3]
629
+ """
630
+ norm_quat = quat
631
+ norm_quat = norm_quat / norm_quat.norm(p=2, dim=1, keepdim=True)
632
+ w, x, y, z = norm_quat[:, 0], norm_quat[:, 1], norm_quat[:, 2], norm_quat[:, 3]
633
+
634
+ B = quat.size(0)
635
+
636
+ w2, x2, y2, z2 = w.pow(2), x.pow(2), y.pow(2), z.pow(2)
637
+ wx, wy, wz = w * x, w * y, w * z
638
+ xy, xz, yz = x * y, x * z, y * z
639
+
640
+ rotMat = torch.stack(
641
+ [
642
+ w2 + x2 - y2 - z2,
643
+ 2 * xy - 2 * wz,
644
+ 2 * wy + 2 * xz,
645
+ 2 * wz + 2 * xy,
646
+ w2 - x2 + y2 - z2,
647
+ 2 * yz - 2 * wx,
648
+ 2 * xz - 2 * wy,
649
+ 2 * wx + 2 * yz,
650
+ w2 - x2 - y2 + z2,
651
+ ],
652
+ dim=1,
653
+ ).view(B, 3, 3)
654
+ return rotMat
655
+
656
+
657
+ def euler_angles_from_rotmat(R):
658
+ """
659
+ compute euler angles for rotation around x, y, z axis
660
+ from rotation matrix
661
+ R: 4x4 rotation matrix
662
+ https://www.gregslabaugh.net/publications/euler.pdf
663
+ """
664
+ r21 = np.round(R[:, 2, 0].item(), 4)
665
+ if abs(r21) != 1:
666
+ y_angle1 = -1 * torch.asin(R[:, 2, 0])
667
+ y_angle2 = math.pi + torch.asin(R[:, 2, 0])
668
+ cy1, cy2 = torch.cos(y_angle1), torch.cos(y_angle2)
669
+
670
+ x_angle1 = torch.atan2(R[:, 2, 1] / cy1, R[:, 2, 2] / cy1)
671
+ x_angle2 = torch.atan2(R[:, 2, 1] / cy2, R[:, 2, 2] / cy2)
672
+ z_angle1 = torch.atan2(R[:, 1, 0] / cy1, R[:, 0, 0] / cy1)
673
+ z_angle2 = torch.atan2(R[:, 1, 0] / cy2, R[:, 0, 0] / cy2)
674
+
675
+ s1 = (x_angle1, y_angle1, z_angle1)
676
+ s2 = (x_angle2, y_angle2, z_angle2)
677
+ s = (s1, s2)
678
+
679
+ else:
680
+ z_angle = torch.tensor([0], device=R.device).float()
681
+ if r21 == -1:
682
+ y_angle = torch.tensor([math.pi / 2], device=R.device).float()
683
+ x_angle = z_angle + torch.atan2(R[:, 0, 1], R[:, 0, 2])
684
+ else:
685
+ y_angle = -torch.tensor([math.pi / 2], device=R.device).float()
686
+ x_angle = -z_angle + torch.atan2(-R[:, 0, 1], R[:, 0, 2])
687
+ s = ((x_angle, y_angle, z_angle),)
688
+ return s
689
+
690
+
691
+ def quaternion_raw_multiply(a, b):
692
+ """
693
+ Source: https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/transforms/rotation_conversions.py
694
+ Multiply two quaternions.
695
+ Usual torch rules for broadcasting apply.
696
+
697
+ Args:
698
+ a: Quaternions as tensor of shape (..., 4), real part first.
699
+ b: Quaternions as tensor of shape (..., 4), real part first.
700
+
701
+ Returns:
702
+ The product of a and b, a tensor of quaternions shape (..., 4).
703
+ """
704
+ aw, ax, ay, az = torch.unbind(a, -1)
705
+ bw, bx, by, bz = torch.unbind(b, -1)
706
+ ow = aw * bw - ax * bx - ay * by - az * bz
707
+ ox = aw * bx + ax * bw + ay * bz - az * by
708
+ oy = aw * by - ax * bz + ay * bw + az * bx
709
+ oz = aw * bz + ax * by - ay * bx + az * bw
710
+ return torch.stack((ow, ox, oy, oz), -1)
711
+
712
+
713
+ def quaternion_invert(quaternion):
714
+ """
715
+ Source: https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/transforms/rotation_conversions.py
716
+ Given a quaternion representing rotation, get the quaternion representing
717
+ its inverse.
718
+
719
+ Args:
720
+ quaternion: Quaternions as tensor of shape (..., 4), with real part
721
+ first, which must be versors (unit quaternions).
722
+
723
+ Returns:
724
+ The inverse, a tensor of quaternions of shape (..., 4).
725
+ """
726
+
727
+ return quaternion * quaternion.new_tensor([1, -1, -1, -1])
728
+
729
+
730
+ def quaternion_apply(quaternion, point):
731
+ """
732
+ Source: https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/transforms/rotation_conversions.py
733
+ Apply the rotation given by a quaternion to a 3D point.
734
+ Usual torch rules for broadcasting apply.
735
+
736
+ Args:
737
+ quaternion: Tensor of quaternions, real part first, of shape (..., 4).
738
+ point: Tensor of 3D points of shape (..., 3).
739
+
740
+ Returns:
741
+ Tensor of rotated points of shape (..., 3).
742
+ """
743
+ if point.size(-1) != 3:
744
+ raise ValueError(f"Points are not in 3D, f{point.shape}.")
745
+ real_parts = point.new_zeros(point.shape[:-1] + (1,))
746
+ point_as_quaternion = torch.cat((real_parts, point), -1)
747
+ out = quaternion_raw_multiply(
748
+ quaternion_raw_multiply(quaternion, point_as_quaternion),
749
+ quaternion_invert(quaternion),
750
+ )
751
+ return out[..., 1:]
752
+
753
+
754
+ def axis_angle_to_quaternion(axis_angle: torch.Tensor) -> torch.Tensor:
755
+ """
756
+ Source: https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/transforms/rotation_conversions.py
757
+ Convert rotations given as axis/angle to quaternions.
758
+ Args:
759
+ axis_angle: Rotations given as a vector in axis angle form,
760
+ as a tensor of shape (..., 3), where the magnitude is
761
+ the angle turned anticlockwise in radians around the
762
+ vector's direction.
763
+ Returns:
764
+ quaternions with real part first, as tensor of shape (..., 4).
765
+ """
766
+ angles = torch.norm(axis_angle, p=2, dim=-1, keepdim=True)
767
+ half_angles = angles * 0.5
768
+ eps = 1e-6
769
+ small_angles = angles.abs() < eps
770
+ sin_half_angles_over_angles = torch.empty_like(angles)
771
+ sin_half_angles_over_angles[~small_angles] = (
772
+ torch.sin(half_angles[~small_angles]) / angles[~small_angles]
773
+ )
774
+ # for x small, sin(x/2) is about x/2 - (x/2)^3/6
775
+ # so sin(x/2)/x is about 1/2 - (x*x)/48
776
+ sin_half_angles_over_angles[small_angles] = (
777
+ 0.5 - (angles[small_angles] * angles[small_angles]) / 48
778
+ )
779
+ quaternions = torch.cat(
780
+ [torch.cos(half_angles), axis_angle * sin_half_angles_over_angles], dim=-1
781
+ )
782
+ return quaternions
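A minimal round-trip sketch for the conversion helpers above (dummy values; assumes these functions are in scope, e.g. imported from this module):

import torch

aa = torch.tensor([[0.3, -0.2, 0.1]])      # (N, 3) axis-angle, angle = vector norm
quat = axis_angle_to_quaternion(aa)        # (N, 4) quaternion, real part first
aa_rec = quaternion_to_angle_axis(quat)    # (N, 3) back to axis-angle
assert torch.allclose(aa, aa_rec, atol=1e-5)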
common/sys_utils.py ADDED
@@ -0,0 +1,44 @@
1
+ import os
2
+ import os.path as op
3
+ import shutil
4
+ from glob import glob
5
+
6
+ from loguru import logger
7
+
8
+
9
+ def copy(src, dst):
10
+ if os.path.islink(src):
11
+ linkto = os.readlink(src)
12
+ os.symlink(linkto, dst)
13
+ else:
14
+ if os.path.isdir(src):
15
+ shutil.copytree(src, dst)
16
+ else:
17
+ shutil.copy(src, dst)
18
+
19
+
20
+ def copy_repo(src_files, dst_folder, filter_keywords):
21
+ src_files = [
22
+ f for f in src_files if not any(keyword in f for keyword in filter_keywords)
23
+ ]
24
+ dst_files = [op.join(dst_folder, op.basename(f)) for f in src_files]
25
+ for src_f, dst_f in zip(src_files, dst_files):
26
+ logger.info(f"FROM: {src_f}\nTO:{dst_f}")
27
+ copy(src_f, dst_f)
28
+
29
+
30
+ def mkdir(directory):
31
+ if not os.path.exists(directory):
32
+ os.makedirs(directory)
33
+
34
+
35
+ def mkdir_p(exp_path):
36
+ os.makedirs(exp_path, exist_ok=True)
37
+
38
+
39
+ def count_files(path):
40
+ """
41
+ Non-recursively count number of files in a folder.
42
+ """
43
+ files = glob(path)
44
+ return len(files)
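A small usage sketch (all paths hypothetical). Note that count_files takes a glob pattern rather than a directory, since it calls glob(path) directly:

mkdir_p("./out/src_backup")                              # create nested folders, no error if present
copy_repo(["train.py", "model.py"], "./out/src_backup",  # copy files, skipping filtered keywords
          filter_keywords=["__pycache__"])
n_py = count_files("./out/src_backup/*.py")              # counts entries matching the glob pattern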
common/thing.py ADDED
@@ -0,0 +1,66 @@
1
+ import numpy as np
2
+ import torch
3
+
4
+ """
5
+ This file stores functions for conversion between numpy, torch, list, etc.
6
+ Also deal with general operations such as to(dev), detach, etc.
7
+ """
8
+
9
+
10
+ def thing2list(thing):
11
+ if isinstance(thing, torch.Tensor):
12
+ return thing.tolist()
13
+ if isinstance(thing, np.ndarray):
14
+ return thing.tolist()
15
+ if isinstance(thing, dict):
16
+ return {k: thing2list(v) for k, v in thing.items()}
17
+ if isinstance(thing, list):
18
+ return [thing2list(ten) for ten in thing]
19
+ return thing
20
+
21
+
22
+ def thing2dev(thing, dev):
23
+ if hasattr(thing, "to"):
24
+ thing = thing.to(dev)
25
+ return thing
26
+ if isinstance(thing, list):
27
+ return [thing2dev(ten, dev) for ten in thing]
28
+ if isinstance(thing, tuple):
29
+ return tuple(thing2dev(list(thing), dev))
30
+ if isinstance(thing, dict):
31
+ return {k: thing2dev(v, dev) for k, v in thing.items()}
32
+ if isinstance(thing, torch.Tensor):
33
+ return thing.to(dev)
34
+ return thing
35
+
36
+
37
+ def thing2np(thing):
38
+ if isinstance(thing, list):
39
+ return np.array(thing)
40
+ if isinstance(thing, torch.Tensor):
41
+ return thing.cpu().detach().numpy()
42
+ if isinstance(thing, dict):
43
+ return {k: thing2np(v) for k, v in thing.items()}
44
+ return thing
45
+
46
+
47
+ def thing2torch(thing):
48
+ if isinstance(thing, list):
49
+ return torch.tensor(np.array(thing))
50
+ if isinstance(thing, np.ndarray):
51
+ return torch.from_numpy(thing)
52
+ if isinstance(thing, dict):
53
+ return {k: thing2torch(v) for k, v in thing.items()}
54
+ return thing
55
+
56
+
57
+ def detach_thing(thing):
58
+ if isinstance(thing, torch.Tensor):
59
+ return thing.cpu().detach()
60
+ if isinstance(thing, list):
61
+ return [detach_thing(ten) for ten in thing]
62
+ if isinstance(thing, tuple):
63
+ return tuple(detach_thing(list(thing)))
64
+ if isinstance(thing, dict):
65
+ return {k: detach_thing(v) for k, v in thing.items()}
66
+ return thing
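A short sketch of how these converters compose on nested containers (dummy data; the device string is just an example):

import numpy as np
import torch

batch = {"verts": np.zeros((8, 3)), "ids": [0, 1, 2], "tag": "demo"}
batch_t = thing2torch(batch)           # numpy arrays / numeric lists -> torch tensors, other types untouched
batch_t = thing2dev(batch_t, "cpu")    # recursively move anything with a .to() method to the device
batch_np = thing2np(batch_t)           # tensors -> numpy arrays (detached, on cpu)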
common/torch_utils.py ADDED
@@ -0,0 +1,212 @@
1
+ import random
2
+
3
+ import numpy as np
4
+ import torch
5
+ import torch.nn as nn
6
+ import torch.optim as optim
7
+
8
+ from common.ld_utils import unsort as unsort_list
9
+
10
+
11
+ # pytorch implementation for np.nanmean
12
+ # https://github.com/pytorch/pytorch/issues/21987#issuecomment-539402619
13
+ def nanmean(v, *args, inplace=False, **kwargs):
14
+ if not inplace:
15
+ v = v.clone()
16
+ is_nan = torch.isnan(v)
17
+ v[is_nan] = 0
18
+ return v.sum(*args, **kwargs) / (~is_nan).float().sum(*args, **kwargs)
19
+
20
+
21
+ def grad_norm(model):
22
+ # compute norm of gradient for a model
23
+ total_norm = None
24
+ for p in model.parameters():
25
+ if p.grad is not None:
26
+ if total_norm is None:
27
+ total_norm = 0
28
+ param_norm = p.grad.detach().data.norm(2)
29
+ total_norm += param_norm.item() ** 2
30
+
31
+ if total_norm is not None:
32
+ total_norm = total_norm ** (1.0 / 2)
33
+ else:
34
+ total_norm = 0.0
35
+ return total_norm
36
+
37
+
38
+ def pad_tensor_list(v_list: list):
39
+ dev = v_list[0].device
40
+ num_meshes = len(v_list)
41
+ num_dim = 1 if len(v_list[0].shape) == 1 else v_list[0].shape[1]
42
+ v_len_list = []
43
+ for verts in v_list:
44
+ v_len_list.append(verts.shape[0])
45
+
46
+ pad_len = max(v_len_list)
47
+ dtype = v_list[0].dtype
48
+ if num_dim == 1:
49
+ padded_tensor = torch.zeros(num_meshes, pad_len, dtype=dtype)
50
+ else:
51
+ padded_tensor = torch.zeros(num_meshes, pad_len, num_dim, dtype=dtype)
52
+ for idx, (verts, v_len) in enumerate(zip(v_list, v_len_list)):
53
+ padded_tensor[idx, :v_len] = verts
54
+ padded_tensor = padded_tensor.to(dev)
55
+ v_len_list = torch.LongTensor(v_len_list).to(dev)
56
+ return padded_tensor, v_len_list
57
+
58
+
59
+ def unpad_vtensor(
60
+ vtensor: (torch.Tensor), lens: (torch.LongTensor, torch.cuda.LongTensor)
61
+ ):
62
+ tensors_list = []
63
+ for verts, vlen in zip(vtensor, lens):
64
+ tensors_list.append(verts[:vlen])
65
+ return tensors_list
66
+
67
+
68
+ def one_hot_embedding(labels, num_classes):
69
+ """Embedding labels to one-hot form.
70
+ Args:
71
+ labels: (LongTensor) class labels, sized [N, D1, D2, ..].
72
+ num_classes: (int) number of classes.
73
+ Returns:
74
+ (tensor) encoded labels, sized [N, D1, D2, .., Dk, #classes].
75
+ """
76
+ y = torch.eye(num_classes).float()
77
+ return y[labels]
78
+
79
+
80
+ def unsort(ten, sort_idx):
81
+ """
82
+ Unsort a tensor of shape (N, *) using the sort_idx list(N).
83
+ Return a tensor of the pre-sorting order in shape (N, *)
84
+ """
85
+ assert isinstance(ten, torch.Tensor)
86
+ assert isinstance(sort_idx, list)
87
+ assert ten.shape[0] == len(sort_idx)
88
+
89
+ out_list = list(torch.chunk(ten, ten.size(0), dim=0))
90
+ out_list = unsort_list(out_list, sort_idx)
91
+ out_list = torch.cat(out_list, dim=0)
92
+ return out_list
93
+
94
+
95
+ def all_comb(X, Y):
96
+ """
97
+ Returns all possible combinations of elements in X and Y.
98
+ X: (n_x, d_x)
99
+ Y: (n_y, d_y)
100
+ Output: Z: (n_x*n_y, d_x+d_y)
101
+ Example:
102
+ X = tensor([[8, 8, 8],
103
+ [7, 5, 9]])
104
+ Y = tensor([[3, 8, 7, 7],
105
+ [3, 7, 9, 9],
106
+ [6, 4, 3, 7]])
107
+ Z = tensor([[8, 8, 8, 3, 8, 7, 7],
108
+ [8, 8, 8, 3, 7, 9, 9],
109
+ [8, 8, 8, 6, 4, 3, 7],
110
+ [7, 5, 9, 3, 8, 7, 7],
111
+ [7, 5, 9, 3, 7, 9, 9],
112
+ [7, 5, 9, 6, 4, 3, 7]])
113
+ """
114
+ assert len(X.size()) == 2
115
+ assert len(Y.size()) == 2
116
+ X1 = X.unsqueeze(1)
117
+ Y1 = Y.unsqueeze(0)
118
+ X2 = X1.repeat(1, Y.shape[0], 1)
119
+ Y2 = Y1.repeat(X.shape[0], 1, 1)
120
+ Z = torch.cat([X2, Y2], -1)
121
+ Z = Z.view(-1, Z.shape[-1])
122
+ return Z
123
+
124
+
125
+ def toggle_parameters(model, requires_grad):
126
+ """
127
+ Set all weights to requires_grad or not.
128
+ """
129
+ for param in model.parameters():
130
+ param.requires_grad = requires_grad
131
+
132
+
133
+ def detach_tensor(ten):
134
+ """This function move tensor to cpu and convert to numpy"""
135
+ if isinstance(ten, torch.Tensor):
136
+ return ten.cpu().detach().numpy()
137
+ return ten
138
+
139
+
140
+ def count_model_parameters(model):
141
+ """
142
+ Return the number of parameters that require gradients.
143
+ """
144
+ return sum(p.numel() for p in model.parameters() if p.requires_grad)
145
+
146
+
147
+ def reset_all_seeds(seed):
148
+ """Reset all seeds for reproduciability."""
149
+ random.seed(seed)
150
+ torch.manual_seed(seed)
151
+ np.random.seed(seed)
152
+
153
+
154
+ def get_activation(name):
155
+ """This function return an activation constructor by name."""
156
+ if name == "tanh":
157
+ return nn.Tanh()
158
+ elif name == "sigmoid":
159
+ return nn.Sigmoid()
160
+ elif name == "relu":
161
+ return nn.ReLU()
162
+ elif name == "selu":
163
+ return nn.SELU()
164
+ elif name == "relu6":
165
+ return nn.ReLU6()
166
+ elif name == "softplus":
167
+ return nn.Softplus()
168
+ elif name == "softshrink":
169
+ return nn.Softshrink()
170
+ else:
171
+ print("Undefined activation: %s" % (name))
172
+ assert False
173
+
174
+
175
+ def stack_ll_tensors(tensor_list_list):
176
+ """
177
+ Recursively stack a list of lists of lists .. whose elements are tensors with the same shape
178
+ """
179
+ if isinstance(tensor_list_list, torch.Tensor):
180
+ return tensor_list_list
181
+ assert isinstance(tensor_list_list, list)
182
+ if isinstance(tensor_list_list[0], torch.Tensor):
183
+ return torch.stack(tensor_list_list)
184
+
185
+ stacked_tensor = []
186
+ for tensor_list in tensor_list_list:
187
+ stacked_tensor.append(stack_ll_tensors(tensor_list))
188
+ stacked_tensor = torch.stack(stacked_tensor)
189
+ return stacked_tensor
190
+
191
+
192
+ def get_optim(name):
193
+ """This function return an optimizer constructor by name."""
194
+ if name == "adam":
195
+ return optim.Adam
196
+ elif name == "rmsprop":
197
+ return optim.RMSprop
198
+ elif name == "sgd":
199
+ return optim.SGD
200
+ else:
201
+ print("Undefined optim: %s" % (name))
202
+ assert False
203
+
204
+
205
+ def decay_lr(optimizer, gamma):
206
+ """
207
+ Decay the learning rate by gamma
208
+ """
209
+ assert isinstance(gamma, float)
210
+ assert 0 <= gamma and gamma <= 1.0
211
+ for param_group in optimizer.param_groups:
212
+ param_group["lr"] *= gamma
common/transforms.py ADDED
@@ -0,0 +1,356 @@
1
+ import numpy as np
2
+ import torch
3
+
4
+ import common.data_utils as data_utils
5
+ from common.np_utils import permute_np
6
+
7
+ """
8
+ Useful geometric operations, e.g. Perspective projection and a differentiable Rodrigues formula
9
+ Parts of the code are taken from https://github.com/MandyMo/pytorch_HMR
10
+ """
11
+
12
+
13
+ def to_xy(x_homo):
14
+ assert isinstance(x_homo, (torch.FloatTensor, torch.cuda.FloatTensor))
15
+ assert x_homo.shape[1] == 3
16
+ assert len(x_homo.shape) == 2
17
+ batch_size = x_homo.shape[0]
18
+ x = torch.ones(batch_size, 2, device=x_homo.device)
19
+ x = x_homo[:, :2] / x_homo[:, 2:3]
20
+ return x
21
+
22
+
23
+ def to_xyz(x_homo):
24
+ assert isinstance(x_homo, (torch.FloatTensor, torch.cuda.FloatTensor))
25
+ assert x_homo.shape[1] == 4
26
+ assert len(x_homo.shape) == 2
27
+ batch_size = x_homo.shape[0]
28
+ x = torch.ones(batch_size, 3, device=x_homo.device)
29
+ x = x_homo[:, :3] / x_homo[:, 3:4]
30
+ return x
31
+
32
+
33
+ def to_homo(x):
34
+ assert isinstance(x, (torch.FloatTensor, torch.cuda.FloatTensor))
35
+ assert x.shape[1] == 3
36
+ assert len(x.shape) == 2
37
+ batch_size = x.shape[0]
38
+ x_homo = torch.ones(batch_size, 4, device=x.device)
39
+ x_homo[:, :3] = x.clone()
40
+ return x_homo
41
+
42
+
43
+ def to_homo_batch(x):
44
+ assert isinstance(x, (torch.FloatTensor, torch.cuda.FloatTensor))
45
+ assert x.shape[2] == 3
46
+ assert len(x.shape) == 3
47
+ batch_size = x.shape[0]
48
+ num_pts = x.shape[1]
49
+ x_homo = torch.ones(batch_size, num_pts, 4, device=x.device)
50
+ x_homo[:, :, :3] = x.clone()
51
+ return x_homo
52
+
53
+
54
+ def to_xyz_batch(x_homo):
55
+ """
56
+ Input: (B, N, 4)
57
+ Output: (B, N, 3)
58
+ """
59
+ assert isinstance(x_homo, (torch.FloatTensor, torch.cuda.FloatTensor))
60
+ assert x_homo.shape[2] == 4
61
+ assert len(x_homo.shape) == 3
62
+ batch_size = x_homo.shape[0]
63
+ num_pts = x_homo.shape[1]
64
+ x = torch.ones(batch_size, num_pts, 3, device=x_homo.device)
65
+ x = x_homo[:, :, :3] / x_homo[:, :, 3:4]
66
+ return x
67
+
68
+
69
+ def to_xy_batch(x_homo):
70
+ assert isinstance(x_homo, (torch.FloatTensor, torch.cuda.FloatTensor))
71
+ assert x_homo.shape[2] == 3
72
+ assert len(x_homo.shape) == 3
73
+ batch_size = x_homo.shape[0]
74
+ num_pts = x_homo.shape[1]
75
+ x = torch.ones(batch_size, num_pts, 2, device=x_homo.device)
76
+ x = x_homo[:, :, :2] / x_homo[:, :, 2:3]
77
+ return x
78
+
79
+
80
+ # VR Distortion Correction Using Vertex Displacement
81
+ # https://stackoverflow.com/questions/44489686/camera-lens-distortion-in-opengl
82
+ def distort_pts3d_all(_pts_cam, dist_coeffs):
83
+ # egocentric cameras commonly has heavy distortion
84
+ # this function transform points in the undistorted camera coord
85
+ # to distorted camera coord such that the 2d projection can match the pixels.
86
+ pts_cam = _pts_cam.clone().double()
87
+ z = pts_cam[:, :, 2]
88
+
89
+ z_inv = 1 / z
90
+
91
+ x1 = pts_cam[:, :, 0] * z_inv
92
+ y1 = pts_cam[:, :, 1] * z_inv
93
+
94
+ # precalculations
95
+ x1_2 = x1 * x1
96
+ y1_2 = y1 * y1
97
+ x1_y1 = x1 * y1
98
+ r2 = x1_2 + y1_2
99
+ r4 = r2 * r2
100
+ r6 = r4 * r2
101
+
102
+ r_dist = (1 + dist_coeffs[0] * r2 + dist_coeffs[1] * r4 + dist_coeffs[4] * r6) / (
103
+ 1 + dist_coeffs[5] * r2 + dist_coeffs[6] * r4 + dist_coeffs[7] * r6
104
+ )
105
+
106
+ # full (rational + tangential) distortion
107
+ x2 = x1 * r_dist + 2 * dist_coeffs[2] * x1_y1 + dist_coeffs[3] * (r2 + 2 * x1_2)
108
+ y2 = y1 * r_dist + 2 * dist_coeffs[3] * x1_y1 + dist_coeffs[2] * (r2 + 2 * y1_2)
109
+ # denormalize for projection (which is a linear operation)
110
+ cam_pts_dist = torch.stack([x2 * z, y2 * z, z], dim=2).float()
111
+ return cam_pts_dist
112
+
113
+
114
+ def rigid_tf_torch_batch(points, R, T):
115
+ """
116
+ Performs rigid transformation to incoming points but batched
117
+ Q = (points*R.T) + T
118
+ points: (batch, num, 3)
119
+ R: (batch, 3, 3)
120
+ T: (batch, 3, 1)
121
+ out: (batch, num, 3)
122
+ """
123
+ points_out = torch.bmm(R, points.permute(0, 2, 1)) + T
124
+ points_out = points_out.permute(0, 2, 1)
125
+ return points_out
126
+
127
+
128
+ def solve_rigid_tf_np(A: np.ndarray, B: np.ndarray):
129
+ """
130
+ “Least-Squares Fitting of Two 3-D Point Sets”, Arun, K. S. , May 1987
131
+ Input: expects Nx3 matrix of points
132
+ Returns R,t
133
+ R = 3x3 rotation matrix
134
+ t = 3x1 column vector
135
+
136
+ This function should be a fix for compute_rigid_tf when the det == -1
137
+ """
138
+
139
+ assert A.shape == B.shape
140
+ A = A.T
141
+ B = B.T
142
+
143
+ num_rows, num_cols = A.shape
144
+ if num_rows != 3:
145
+ raise Exception(f"matrix A is not 3xN, it is {num_rows}x{num_cols}")
146
+
147
+ num_rows, num_cols = B.shape
148
+ if num_rows != 3:
149
+ raise Exception(f"matrix B is not 3xN, it is {num_rows}x{num_cols}")
150
+
151
+ # find mean column wise
152
+ centroid_A = np.mean(A, axis=1)
153
+ centroid_B = np.mean(B, axis=1)
154
+
155
+ # ensure centroids are 3x1
156
+ centroid_A = centroid_A.reshape(-1, 1)
157
+ centroid_B = centroid_B.reshape(-1, 1)
158
+
159
+ # subtract mean
160
+ Am = A - centroid_A
161
+ Bm = B - centroid_B
162
+
163
+ H = Am @ np.transpose(Bm)
164
+
165
+ # find rotation
166
+ U, S, Vt = np.linalg.svd(H)
167
+ R = Vt.T @ U.T
168
+
169
+ # special reflection case
170
+ if np.linalg.det(R) < 0:
171
+ Vt[2, :] *= -1
172
+ R = Vt.T @ U.T
173
+
174
+ t = -R @ centroid_A + centroid_B
175
+
176
+ return R, t
177
+
178
+
179
+ def batch_solve_rigid_tf(A, B):
180
+ """
181
+ “Least-Squares Fitting of Two 3-D Point Sets”, Arun, K. S. , May 1987
182
+ Input: expects BxNx3 matrix of points
183
+ Returns R,t
184
+ R = Bx3x3 rotation matrix
185
+ t = Bx3x1 column vector
186
+ """
187
+
188
+ assert A.shape == B.shape
189
+ dev = A.device
190
+ A = A.cpu().numpy()
191
+ B = B.cpu().numpy()
192
+ A = permute_np(A, (0, 2, 1))
193
+ B = permute_np(B, (0, 2, 1))
194
+
195
+ batch, num_rows, num_cols = A.shape
196
+ if num_rows != 3:
197
+ raise Exception(f"matrix A is not 3xN, it is {num_rows}x{num_cols}")
198
+
199
+ _, num_rows, num_cols = B.shape
200
+ if num_rows != 3:
201
+ raise Exception(f"matrix B is not 3xN, it is {num_rows}x{num_cols}")
202
+
203
+ # find mean column wise
204
+ centroid_A = np.mean(A, axis=2)
205
+ centroid_B = np.mean(B, axis=2)
206
+
207
+ # ensure centroids are 3x1
208
+ centroid_A = centroid_A.reshape(batch, -1, 1)
209
+ centroid_B = centroid_B.reshape(batch, -1, 1)
210
+
211
+ # subtract mean
212
+ Am = A - centroid_A
213
+ Bm = B - centroid_B
214
+
215
+ H = np.matmul(Am, permute_np(Bm, (0, 2, 1)))
216
+
217
+ # find rotation
218
+ U, S, Vt = np.linalg.svd(H)
219
+ R = np.matmul(permute_np(Vt, (0, 2, 1)), permute_np(U, (0, 2, 1)))
220
+
221
+ # special reflection case
222
+ neg_idx = np.linalg.det(R) < 0
223
+ if neg_idx.sum() > 0:
224
+ raise Exception(
225
+ f"some rotation matrices are not orthogonal; make sure implementation is correct for such case: {neg_idx}"
226
+ )
227
+ Vt[neg_idx, 2, :] *= -1
228
+ R[neg_idx, :, :] = np.matmul(
229
+ permute_np(Vt[neg_idx], (0, 2, 1)), permute_np(U[neg_idx], (0, 2, 1))
230
+ )
231
+
232
+ t = np.matmul(-R, centroid_A) + centroid_B
233
+
234
+ R = torch.FloatTensor(R).to(dev)
235
+ t = torch.FloatTensor(t).to(dev)
236
+ return R, t
237
+
238
+
239
+ def rigid_tf_np(points, R, T):
240
+ """
241
+ Performs rigid transformation to incoming points
242
+ Q = (points*R.T) + T
243
+ points: (num, 3)
244
+ R: (3, 3)
245
+ T: (1, 3)
246
+
247
+ out: (num, 3)
248
+ """
249
+
250
+ assert isinstance(points, np.ndarray)
251
+ assert isinstance(R, np.ndarray)
252
+ assert isinstance(T, np.ndarray)
253
+ assert len(points.shape) == 2
254
+ assert points.shape[1] == 3
255
+ assert R.shape == (3, 3)
256
+ assert T.shape == (1, 3)
257
+ points_new = np.matmul(R, points.T).T + T
258
+ return points_new
259
+
260
+
261
+ def transform_points(world2cam_mat, pts):
262
+ """
263
+ Map points from one coord to another based on the 4x4 matrix.
264
+ e.g., map points from world to camera coord.
265
+ pts: (N, 3), in METERS!!
266
+ world2cam_mat: (4, 4)
267
+ Output: points in cam coord (N, 3)
268
+ We follow this convention:
269
+ | R T | |pt|
270
+ | 0 1 | * | 1|
271
+ i.e. we rotate first then translate as T is the camera translation not position.
272
+ """
273
+ assert isinstance(pts, (torch.FloatTensor, torch.cuda.FloatTensor))
274
+ assert isinstance(world2cam_mat, (torch.FloatTensor, torch.cuda.FloatTensor))
275
+ assert world2cam_mat.shape == (4, 4)
276
+ assert len(pts.shape) == 2
277
+ assert pts.shape[1] == 3
278
+ pts_homo = to_homo(pts)
279
+
280
+ # mocap to cam
281
+ pts_cam_homo = torch.matmul(world2cam_mat, pts_homo.T).T
282
+ pts_cam = to_xyz(pts_cam_homo)
283
+
284
+ assert pts_cam.shape[1] == 3
285
+ return pts_cam
286
+
287
+
288
+ def transform_points_batch(world2cam_mat, pts):
289
+ """
290
+ Map points from one coord to another based on the 4x4 matrix.
291
+ e.g., map points from world to camera coord.
292
+ pts: (B, N, 3), in METERS!!
293
+ world2cam_mat: (B, 4, 4)
294
+ Output: points in cam coord (B, N, 3)
295
+ We follow this convention:
296
+ | R T | |pt|
297
+ | 0 1 | * | 1|
298
+ i.e. we rotate first then translate as T is the camera translation not position.
299
+ """
300
+ assert isinstance(pts, (torch.FloatTensor, torch.cuda.FloatTensor))
301
+ assert isinstance(world2cam_mat, (torch.FloatTensor, torch.cuda.FloatTensor))
302
+ assert world2cam_mat.shape[1:] == (4, 4)
303
+ assert len(pts.shape) == 3
304
+ assert pts.shape[2] == 3
305
+ batch_size = pts.shape[0]
306
+ pts_homo = to_homo_batch(pts)
307
+
308
+ # mocap to cam
309
+ pts_cam_homo = torch.bmm(world2cam_mat, pts_homo.permute(0, 2, 1)).permute(0, 2, 1)
310
+ pts_cam = to_xyz_batch(pts_cam_homo)
311
+
312
+ assert pts_cam.shape[2] == 3
313
+ return pts_cam
314
+
315
+
316
+ def project2d_batch(K, pts_cam):
317
+ """
318
+ K: (B, 3, 3)
319
+ pts_cam: (B, N, 3)
320
+ """
321
+
322
+ assert isinstance(K, (torch.FloatTensor, torch.cuda.FloatTensor))
323
+ assert isinstance(pts_cam, (torch.FloatTensor, torch.cuda.FloatTensor))
324
+ assert K.shape[1:] == (3, 3)
325
+ assert pts_cam.shape[2] == 3
326
+ assert len(pts_cam.shape) == 3
327
+ pts2d_homo = torch.bmm(K, pts_cam.permute(0, 2, 1)).permute(0, 2, 1)
328
+ pts2d = to_xy_batch(pts2d_homo)
329
+ return pts2d
330
+
331
+
332
+ def project2d_norm_batch(K, pts_cam, patch_width):
333
+ """
334
+ K: (B, 3, 3)
335
+ pts_cam: (B, N, 3)
336
+ """
337
+
338
+ assert isinstance(K, (torch.FloatTensor, torch.cuda.FloatTensor))
339
+ assert isinstance(pts_cam, (torch.FloatTensor, torch.cuda.FloatTensor))
340
+ assert K.shape[1:] == (3, 3)
341
+ assert pts_cam.shape[2] == 3
342
+ assert len(pts_cam.shape) == 3
343
+ v2d = project2d_batch(K, pts_cam)
344
+ v2d_norm = data_utils.normalize_kp2d(v2d, patch_width)
345
+ return v2d_norm
346
+
347
+
348
+ def project2d(K, pts_cam):
349
+ assert isinstance(K, (torch.FloatTensor, torch.cuda.FloatTensor))
350
+ assert isinstance(pts_cam, (torch.FloatTensor, torch.cuda.FloatTensor))
351
+ assert K.shape == (3, 3)
352
+ assert pts_cam.shape[1] == 3
353
+ assert len(pts_cam.shape) == 2
354
+ pts2d_homo = torch.matmul(K, pts_cam.T).T
355
+ pts2d = to_xy(pts2d_homo)
356
+ return pts2d
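A compact sketch of the world-to-camera plus pinhole projection path above (identity extrinsics and a made-up intrinsic matrix):

import torch

B, N = 2, 5
pts_world = torch.rand(B, N, 3) + torch.tensor([0.0, 0.0, 2.0])   # keep z > 0 in front of the camera
world2cam = torch.eye(4).expand(B, 4, 4).contiguous()             # (B, 4, 4) identity extrinsics
K = torch.tensor([[1000.0, 0.0, 112.0],
                  [0.0, 1000.0, 112.0],
                  [0.0, 0.0, 1.0]]).expand(B, 3, 3).contiguous()   # (B, 3, 3) example intrinsics

pts_cam = transform_points_batch(world2cam, pts_world)   # (B, N, 3) points in camera coordinates
pts2d = project2d_batch(K, pts_cam)                       # (B, N, 2) pixel coordinates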
common/viewer.py ADDED
@@ -0,0 +1,287 @@
1
+ import os
2
+ import os.path as op
3
+ import re
4
+ from abc import abstractmethod
5
+
6
+ import matplotlib.cm as cm
7
+ import numpy as np
8
+ from aitviewer.headless import HeadlessRenderer
9
+ from aitviewer.renderables.billboard import Billboard
10
+ from aitviewer.renderables.meshes import Meshes
11
+ from aitviewer.scene.camera import OpenCVCamera
12
+ from aitviewer.scene.material import Material
13
+ from aitviewer.utils.so3 import aa2rot_numpy
14
+ from aitviewer.viewer import Viewer
15
+ from easydict import EasyDict as edict
16
+ from loguru import logger
17
+ from PIL import Image
18
+ from tqdm import tqdm
19
+
20
+ OBJ_ID = 100
21
+ SMPLX_ID = 150
22
+ LEFT_ID = 200
23
+ RIGHT_ID = 250
24
+ SEGM_IDS = {"object": OBJ_ID, "smplx": SMPLX_ID, "left": LEFT_ID, "right": RIGHT_ID}
25
+
26
+ cmap = cm.get_cmap("plasma")
27
+ materials = {
28
+ "none": None,
29
+ "white": Material(color=(1.0, 1.0, 1.0, 1.0), ambient=0.2),
30
+ "red": Material(color=(0.969, 0.106, 0.059, 1.0), ambient=0.2),
31
+ "blue": Material(color=(0.0, 0.0, 1.0, 1.0), ambient=0.2),
32
+ "green": Material(color=(1.0, 0.0, 0.0, 1.0), ambient=0.2),
33
+ "cyan": Material(color=(0.051, 0.659, 0.051, 1.0), ambient=0.2),
34
+ "light-blue": Material(color=(0.588, 0.5647, 0.9725, 1.0), ambient=0.2),
35
+ "cyan-light": Material(color=(0.051, 0.659, 0.051, 1.0), ambient=0.2),
36
+ "dark-light": Material(color=(0.404, 0.278, 0.278, 1.0), ambient=0.2),
37
+ "rice": Material(color=(0.922, 0.922, 0.102, 1.0), ambient=0.2),
38
+ }
39
+
40
+
41
+ class ViewerData(edict):
42
+ """
43
+ Interface to standardize viewer data.
44
+ """
45
+
46
+ def __init__(self, Rt, K, cols, rows, imgnames=None):
47
+ self.imgnames = imgnames
48
+ self.Rt = Rt
49
+ self.K = K
50
+ self.num_frames = Rt.shape[0]
51
+ self.cols = cols
52
+ self.rows = rows
53
+ self.validate_format()
54
+
55
+ def validate_format(self):
56
+ assert len(self.Rt.shape) == 3
57
+ assert self.Rt.shape[0] == self.num_frames
58
+ assert self.Rt.shape[1] == 3
59
+ assert self.Rt.shape[2] == 4
60
+
61
+ assert len(self.K.shape) == 2
62
+ assert self.K.shape[0] == 3
63
+ assert self.K.shape[1] == 3
64
+ if self.imgnames is not None:
65
+ assert self.num_frames == len(self.imgnames)
66
+ assert self.num_frames > 0
67
+ im_p = self.imgnames[0]
68
+ assert op.exists(im_p), f"Image path {im_p} does not exist"
69
+
70
+
71
+ class ARCTICViewer:
72
+ def __init__(
73
+ self,
74
+ render_types=["rgb", "depth", "mask"],
75
+ interactive=True,
76
+ size=(2024, 2024),
77
+ ):
78
+ if not interactive:
79
+ v = HeadlessRenderer()
80
+ else:
81
+ v = Viewer(size=size)
82
+
83
+ self.v = v
84
+ self.interactive = interactive
85
+ # self.layers = layers
86
+ self.render_types = render_types
87
+
88
+ def view_interactive(self):
89
+ self.v.run()
90
+
91
+ def view_fn_headless(self, num_iter, out_folder):
92
+ v = self.v
93
+
94
+ v._init_scene()
95
+
96
+ logger.info("Rendering to video")
97
+ if "video" in self.render_types:
98
+ vid_p = op.join(out_folder, "video.mp4")
99
+ v.save_video(video_dir=vid_p)
100
+
101
+ pbar = tqdm(range(num_iter))
102
+ for fidx in pbar:
103
+ out_rgb = op.join(out_folder, "images", f"rgb/{fidx:04d}.png")
104
+ out_mask = op.join(out_folder, "images", f"mask/{fidx:04d}.png")
105
+ out_depth = op.join(out_folder, "images", f"depth/{fidx:04d}.npy")
106
+
107
+ # render RGB, depth, segmentation masks
108
+ if "rgb" in self.render_types:
109
+ v.export_frame(out_rgb)
110
+ if "depth" in self.render_types:
111
+ os.makedirs(op.dirname(out_depth), exist_ok=True)
112
+ render_depth(v, out_depth)
113
+ if "mask" in self.render_types:
114
+ os.makedirs(op.dirname(out_mask), exist_ok=True)
115
+ render_mask(v, out_mask)
116
+ v.scene.next_frame()
117
+ logger.info(f"Exported to {out_folder}")
118
+
119
+ @abstractmethod
120
+ def load_data(self):
121
+ pass
122
+
123
+ def check_format(self, batch):
124
+ meshes_all, data = batch
125
+ assert isinstance(meshes_all, dict)
126
+ assert len(meshes_all) > 0
127
+ for mesh in meshes_all.values():
128
+ assert isinstance(mesh, Meshes)
129
+ assert isinstance(data, ViewerData)
130
+
131
+ def render_seq(self, batch, out_folder="./render_out"):
132
+ meshes_all, data = batch
133
+ self.setup_viewer(data)
134
+ for mesh in meshes_all.values():
135
+ self.v.scene.add(mesh)
136
+ if self.interactive:
137
+ self.view_interactive()
138
+ else:
139
+ num_iter = data["num_frames"]
140
+ self.view_fn_headless(num_iter, out_folder)
141
+
142
+ def setup_viewer(self, data):
143
+ v = self.v
144
+ fps = 30
145
+ if "imgnames" in data:
146
+ setup_billboard(data, v)
147
+
148
+ # camera.show_path()
149
+ v.run_animations = True # autoplay
150
+ v.run_animations = False # autoplay
151
+ v.playback_fps = fps
152
+ v.scene.fps = fps
153
+ v.scene.origin.enabled = False
154
+ v.scene.floor.enabled = False
155
+ v.auto_set_floor = False
156
+ v.scene.floor.position[1] = -3
157
+ # v.scene.camera.position = np.array((0.0, 0.0, 0))
158
+ self.v = v
159
+
160
+
161
+ def dist2vc(dist_ro, dist_lo, dist_o, _cmap, tf_fn=None):
162
+ if tf_fn is not None:
163
+ exp_map = tf_fn
164
+ else:
165
+ exp_map = small_exp_map
166
+ dist_ro = exp_map(dist_ro)
167
+ dist_lo = exp_map(dist_lo)
168
+ dist_o = exp_map(dist_o)
169
+
170
+ vc_ro = _cmap(dist_ro)
171
+ vc_lo = _cmap(dist_lo)
172
+ vc_o = _cmap(dist_o)
173
+ return vc_ro, vc_lo, vc_o
174
+
175
+
176
+ def small_exp_map(_dist):
177
+ dist = np.copy(_dist)
178
+ # dist = 1.0 - np.clip(dist, 0, 0.1) / 0.1
179
+ dist = np.exp(-20.0 * dist)
180
+ return dist
181
+
182
+
183
+ def construct_viewer_meshes(data, draw_edges=False, flat_shading=True):
184
+ rotation_flip = aa2rot_numpy(np.array([1, 0, 0]) * np.pi)
185
+ meshes = {}
186
+ for key, val in data.items():
187
+ if "object" in key:
188
+ flat_shading = False
189
+ else:
190
+ flat_shading = flat_shading
191
+ v3d = val["v3d"]
192
+ meshes[key] = Meshes(
193
+ v3d,
194
+ val["f3d"],
195
+ vertex_colors=val["vc"],
196
+ name=val["name"],
197
+ flat_shading=flat_shading,
198
+ draw_edges=draw_edges,
199
+ material=materials[val["color"]],
200
+ rotation=rotation_flip,
201
+ )
202
+ return meshes
203
+
204
+
205
+ def setup_viewer(
206
+ v, shared_folder_p, video, images_path, data, flag, seq_name, side_angle
207
+ ):
208
+ fps = 10
209
+ cols, rows = 224, 224
210
+ focal = 1000.0
211
+
212
+ # setup image paths
213
+ regex = re.compile(r"(\d*)$")
214
+
215
+ def sort_key(x):
216
+ name = os.path.splitext(x)[0]
217
+ return int(regex.search(name).group(0))
218
+
219
+ # setup billboard
220
+ images_path = op.join(shared_folder_p, "images")
221
+ images_paths = [
222
+ os.path.join(images_path, f)
223
+ for f in sorted(os.listdir(images_path), key=sort_key)
224
+ ]
225
+ assert len(images_paths) > 0
226
+
227
+ cam_t = data[f"{flag}.object.cam_t"]
228
+ num_frames = min(cam_t.shape[0], len(images_paths))
229
+ cam_t = cam_t[:num_frames]
230
+ # setup camera
231
+ K = np.array([[focal, 0, rows / 2.0], [0, focal, cols / 2.0], [0, 0, 1]])
232
+ Rt = np.zeros((num_frames, 3, 4))
233
+ Rt[:, :, 3] = cam_t
234
+ Rt[:, :3, :3] = np.eye(3)
235
+ Rt[:, 1:3, :3] *= -1.0
236
+
237
+ camera = OpenCVCamera(K, Rt, cols, rows, viewer=v)
238
+ if side_angle is None:
239
+ billboard = Billboard.from_camera_and_distance(
240
+ camera, 10.0, cols, rows, images_paths
241
+ )
242
+ v.scene.add(billboard)
243
+ v.scene.add(camera)
244
+ v.run_animations = True # autoplay
245
+ v.playback_fps = fps
246
+ v.scene.fps = fps
247
+ v.scene.origin.enabled = False
248
+ v.scene.floor.enabled = False
249
+ v.auto_set_floor = False
250
+ v.scene.floor.position[1] = -3
251
+ v.set_temp_camera(camera)
252
+ # v.scene.camera.position = np.array((0.0, 0.0, 0))
253
+ return v
254
+
255
+
256
+ def render_depth(v, depth_p):
257
+ depth = np.array(v.get_depth()).astype(np.float16)
258
+ np.save(depth_p, depth)
259
+
260
+
261
+ def render_mask(v, mask_p):
262
+ nodes_uid = {node.name: node.uid for node in v.scene.collect_nodes()}
263
+ my_cmap = {
264
+ uid: [SEGM_IDS[name], SEGM_IDS[name], SEGM_IDS[name]]
265
+ for name, uid in nodes_uid.items()
266
+ if name in SEGM_IDS.keys()
267
+ }
268
+ mask = np.array(v.get_mask(color_map=my_cmap)).astype(np.uint8)
269
+ mask = Image.fromarray(mask)
270
+ mask.save(mask_p)
271
+
272
+
273
+ def setup_billboard(data, v):
274
+ images_paths = data.imgnames
275
+ K = data.K
276
+ Rt = data.Rt
277
+ rows = data.rows
278
+ cols = data.cols
279
+ camera = OpenCVCamera(K, Rt, cols, rows, viewer=v)
280
+ if images_paths is not None:
281
+ billboard = Billboard.from_camera_and_distance(
282
+ camera, 10.0, cols, rows, images_paths
283
+ )
284
+ v.scene.add(billboard)
285
+ v.scene.add(camera)
286
+ v.scene.camera.load_cam()
287
+ v.set_temp_camera(camera)
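A minimal sketch of packing camera data into ViewerData (dummy extrinsics and intrinsics; assumes aitviewer is installed, since this module imports it at load time):

import numpy as np

T = 30                                     # number of frames
Rt = np.zeros((T, 3, 4), dtype=np.float32)
Rt[:, :3, :3] = np.eye(3)                  # identity rotation per frame
Rt[:, :, 3] = np.array([0.0, 0.0, 2.0])    # constant camera translation
K = np.array([[1000.0, 0.0, 112.0],
              [0.0, 1000.0, 112.0],
              [0.0, 0.0, 1.0]])

data = ViewerData(Rt, K, cols=224, rows=224, imgnames=None)   # passes validate_format()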
common/vis_utils.py ADDED
@@ -0,0 +1,129 @@
1
+ import matplotlib.cm as cm
2
+ import matplotlib.pyplot as plt
3
+ import numpy as np
4
+ from PIL import Image
5
+
6
+ # connection between the 8 points of 3d bbox
7
+ BONES_3D_BBOX = [
8
+ (0, 1),
9
+ (1, 2),
10
+ (2, 3),
11
+ (3, 0),
12
+ (0, 4),
13
+ (1, 5),
14
+ (2, 6),
15
+ (3, 7),
16
+ (4, 5),
17
+ (5, 6),
18
+ (6, 7),
19
+ (7, 4),
20
+ ]
21
+
22
+
23
+ def plot_2d_bbox(bbox_2d, bones, color, ax):
24
+ if ax is None:
25
+ axx = plt
26
+ else:
27
+ axx = ax
28
+ colors = cm.rainbow(np.linspace(0, 1, len(bbox_2d)))
29
+ for pt, c in zip(bbox_2d, colors):
30
+ axx.scatter(pt[0], pt[1], color=c, s=50)
31
+
32
+ if bones is None:
33
+ bones = BONES_3D_BBOX
34
+ for bone in bones:
35
+ sidx, eidx = bone
36
+ # bottom of bbox is white
37
+ if min(sidx, eidx) >= 4:
38
+ color = "w"
39
+ axx.plot(
40
+ [bbox_2d[sidx][0], bbox_2d[eidx][0]],
41
+ [bbox_2d[sidx][1], bbox_2d[eidx][1]],
42
+ color,
43
+ )
44
+ return axx
45
+
46
+
47
+ # http://www.icare.univ-lille1.fr/tutorials/convert_a_matplotlib_figure
48
+ def fig2data(fig):
49
+ """
50
+ @brief Convert a Matplotlib figure to a 3D
51
+ numpy array with RGBA channels and return it
52
+ @param fig a matplotlib figure
53
+ @return a numpy 3D array of RGBA values
54
+ """
55
+ # draw the renderer
56
+ fig.canvas.draw()
57
+
58
+ # Get the RGBA buffer from the figure
59
+ w, h = fig.canvas.get_width_height()
60
+ buf = np.frombuffer(fig.canvas.tostring_argb(), dtype=np.uint8)
61
+ buf.shape = (w, h, 4)
62
+
63
+ # canvas.tostring_argb give pixmap in ARGB mode.
64
+ # Roll the ALPHA channel to have it in RGBA mode
65
+ buf = np.roll(buf, 3, axis=2)
66
+ return buf
67
+
68
+
69
+ # http://www.icare.univ-lille1.fr/tutorials/convert_a_matplotlib_figure
70
+ def fig2img(fig):
71
+ """
72
+ @brief Convert a Matplotlib figure to a PIL Image
73
+ in RGBA format and return it
74
+ @param fig a matplotlib figure
75
+ @return a Python Imaging Library ( PIL ) image
76
+ """
77
+ # put the figure pixmap into a numpy array
78
+ buf = fig2data(fig)
79
+ w, h, _ = buf.shape
80
+ return Image.frombytes("RGBA", (w, h), buf.tobytes())
81
+
82
+
83
+ def concat_pil_images(images):
84
+ """
85
+ Put a list of PIL images next to each other
86
+ """
87
+ assert isinstance(images, list)
88
+ widths, heights = zip(*(i.size for i in images))
89
+
90
+ total_width = sum(widths)
91
+ max_height = max(heights)
92
+
93
+ new_im = Image.new("RGB", (total_width, max_height))
94
+
95
+ x_offset = 0
96
+ for im in images:
97
+ new_im.paste(im, (x_offset, 0))
98
+ x_offset += im.size[0]
99
+ return new_im
100
+
101
+
102
+ def stack_pil_images(images):
103
+ """
104
+ Stack a list of PIL images next to each other
105
+ """
106
+ assert isinstance(images, list)
107
+ widths, heights = zip(*(i.size for i in images))
108
+
109
+ total_height = sum(heights)
110
+ max_width = max(widths)
111
+
112
+ new_im = Image.new("RGB", (max_width, total_height))
113
+
114
+ y_offset = 0
115
+ for im in images:
116
+ new_im.paste(im, (0, y_offset))
117
+ y_offset += im.size[1]
118
+ return new_im
119
+
120
+
121
+ def im_list_to_plt(image_list, figsize, title_list=None):
122
+ fig, axes = plt.subplots(nrows=1, ncols=len(image_list), figsize=figsize)
123
+ for idx, (ax, im) in enumerate(zip(axes, image_list)):
124
+ ax.imshow(im)
125
+ ax.set_title(title_list[idx])
126
+ fig.tight_layout()
127
+ im = fig2img(fig)
128
+ plt.close()
129
+ return im
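A tiny sketch of the PIL tiling helpers above (solid-color dummy images):

from PIL import Image

im_a = Image.new("RGB", (64, 48), (255, 0, 0))
im_b = Image.new("RGB", (32, 48), (0, 255, 0))

row = concat_pil_images([im_a, im_b])   # 96x48, images placed side by side
col = stack_pil_images([im_a, im_b])    # 64x96, images stacked vertically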
common/xdict.py ADDED
@@ -0,0 +1,288 @@
1
+ import numpy as np
2
+ import torch
3
+
4
+ import common.thing as thing
5
+
6
+
7
+ def _print_stat(key, thing):
8
+ """
9
+ Helper function for printing statistics about a key-value pair in an xdict.
10
+ """
11
+ mytype = type(thing)
12
+ if isinstance(thing, (list, tuple)):
13
+ print("{:<20}: {:<30}\t{:}".format(key, len(thing), mytype))
14
+ elif isinstance(thing, (torch.Tensor)):
15
+ dev = thing.device
16
+ shape = str(thing.shape).replace(" ", "")
17
+ print("{:<20}: {:<30}\t{:}\t{}".format(key, shape, mytype, dev))
18
+ elif isinstance(thing, (np.ndarray)):
19
+ dev = ""
20
+ shape = str(thing.shape).replace(" ", "")
21
+ print("{:<20}: {:<30}\t{:}".format(key, shape, mytype))
22
+ else:
23
+ print("{:<20}: {:}".format(key, mytype))
24
+
25
+
26
+ class xdict(dict):
27
+ """
28
+ A subclass of Python's built-in dict class, which provides additional methods for manipulating and operating on dictionaries.
29
+ """
30
+
31
+ def __init__(self, mydict=None):
32
+ """
33
+ Constructor for the xdict class. Creates a new xdict object and optionally initializes it with key-value pairs from the provided dictionary mydict. If mydict is not provided, an empty xdict is created.
34
+ """
35
+ if mydict is None:
36
+ return
37
+
38
+ for k, v in mydict.items():
39
+ super().__setitem__(k, v)
40
+
41
+ def subset(self, keys):
42
+ """
43
+ Returns a new xdict object containing only the key-value pairs with keys in the provided list 'keys'.
44
+ """
45
+ out_dict = {}
46
+ for k in keys:
47
+ out_dict[k] = self[k]
48
+ return xdict(out_dict)
49
+
50
+ def __setitem__(self, key, val):
51
+ """
52
+ Overrides the dict.__setitem__ method to raise an assertion error if a key already exists.
53
+ """
54
+ assert key not in self.keys(), f"Key already exists {key}"
55
+ super().__setitem__(key, val)
56
+
57
+ def search(self, keyword, replace_to=None):
58
+ """
59
+ Returns a new xdict object containing only the key-value pairs with keys that contain the provided keyword.
60
+ """
61
+ out_dict = {}
62
+ for k in self.keys():
63
+ if keyword in k:
64
+ if replace_to is None:
65
+ out_dict[k] = self[k]
66
+ else:
67
+ out_dict[k.replace(keyword, replace_to)] = self[k]
68
+ return xdict(out_dict)
69
+
70
+ def rm(self, keyword, keep_list=[], verbose=False):
71
+ """
72
+ Returns a new xdict object with keys that contain keyword removed. Keys in keep_list are excluded from the removal.
73
+ """
74
+ out_dict = {}
75
+ for k in self.keys():
76
+ if keyword not in k or k in keep_list:
77
+ out_dict[k] = self[k]
78
+ else:
79
+ if verbose:
80
+ print(f"Removing: {k}")
81
+ return xdict(out_dict)
82
+
83
+ def overwrite(self, k, v):
84
+ """
85
+ The original assignment operation of Python dict
86
+ """
87
+ super().__setitem__(k, v)
88
+
89
+ def merge(self, dict2):
90
+ """
91
+ Same as dict.update(), but raises an assertion error if there are duplicate keys between the two dictionaries.
92
+
93
+ Args:
94
+ dict2 (dict or xdict): The dictionary or xdict instance to merge with.
95
+
96
+ Raises:
97
+ AssertionError: If dict2 is not a dictionary or xdict instance.
98
+ AssertionError: If there are duplicate keys between the two instances.
99
+ """
100
+ assert isinstance(dict2, (dict, xdict))
101
+ mykeys = set(self.keys())
102
+ intersect = mykeys.intersection(set(dict2.keys()))
103
+ assert len(intersect) == 0, f"Merge failed: duplicate keys ({intersect})"
104
+ self.update(dict2)
105
+
106
+ def mul(self, scalar):
107
+ """
108
+ Multiplies each value (could be tensor, np.array, list) in the xdict instance by the provided scalar.
109
+
110
+ Args:
111
+ scalar (float): The scalar to multiply the values by.
112
+
113
+ Raises:
114
+ AssertionError: If scalar is not a float.
115
+ """
116
+ if isinstance(scalar, int):
117
+ scalar = 1.0 * scalar
118
+ assert isinstance(scalar, float)
119
+ out_dict = {}
120
+ for k in self.keys():
121
+ if isinstance(self[k], list):
122
+ out_dict[k] = [v * scalar for v in self[k]]
123
+ else:
124
+ out_dict[k] = self[k] * scalar
125
+ return xdict(out_dict)
126
+
127
+ def prefix(self, text):
128
+ """
129
+ Adds a prefix to each key in the xdict instance.
130
+
131
+ Args:
132
+ text (str): The prefix to add.
133
+
134
+ Returns:
135
+ xdict: The xdict instance with the added prefix.
136
+ """
137
+ out_dict = {}
138
+ for k in self.keys():
139
+ out_dict[text + k] = self[k]
140
+ return xdict(out_dict)
141
+
142
+ def replace_keys(self, str_src, str_tar):
143
+ """
144
+ Replaces a substring in all keys of the xdict instance.
145
+
146
+ Args:
147
+ str_src (str): The substring to replace.
148
+ str_tar (str): The replacement string.
149
+
150
+ Returns:
151
+ xdict: The xdict instance with the replaced keys.
152
+ """
153
+ out_dict = {}
154
+ for k in self.keys():
155
+ old_key = k
156
+ new_key = old_key.replace(str_src, str_tar)
157
+ out_dict[new_key] = self[k]
158
+ return xdict(out_dict)
159
+
160
+ def postfix(self, text):
161
+ """
162
+ Adds a postfix to each key in the xdict instance.
163
+
164
+ Args:
165
+ text (str): The postfix to add.
166
+
167
+ Returns:
168
+ xdict: The xdict instance with the added postfix.
169
+ """
170
+ out_dict = {}
171
+ for k in self.keys():
172
+ out_dict[k + text] = self[k]
173
+ return xdict(out_dict)
174
+
175
+ def sorted_keys(self):
176
+ """
177
+ Returns a sorted list of the keys in the xdict instance.
178
+
179
+ Returns:
180
+ list: A sorted list of keys in the xdict instance.
181
+ """
182
+ return sorted(list(self.keys()))
183
+
184
+ def to(self, dev):
185
+ """
186
+ Moves the xdict instance to a specific device.
187
+
188
+ Args:
189
+ dev (torch.device): The device to move the instance to.
190
+
191
+ Returns:
192
+ xdict: The xdict instance moved to the specified device.
193
+ """
194
+ if dev is None:
195
+ return self
196
+ raw_dict = dict(self)
197
+ return xdict(thing.thing2dev(raw_dict, dev))
198
+
199
+ def to_torch(self):
200
+ """
201
+ Converts elements in the xdict to Torch tensors and returns a new xdict.
202
+
203
+ Returns:
204
+ xdict: A new xdict with Torch tensors as values.
205
+ """
206
+ return xdict(thing.thing2torch(self))
207
+
208
+ def to_np(self):
209
+ """
210
+ Converts elements in the xdict to numpy arrays and returns a new xdict.
211
+
212
+ Returns:
213
+ xdict: A new xdict with numpy arrays as values.
214
+ """
215
+ return xdict(thing.thing2np(self))
216
+
217
+ def tolist(self):
218
+ """
219
+ Converts elements in the xdict to Python lists and returns a new xdict.
220
+
221
+ Returns:
222
+ xdict: A new xdict with Python lists as values.
223
+ """
224
+ return xdict(thing.thing2list(self))
225
+
226
+ def print_stat(self):
227
+ """
228
+ Prints statistics for each item in the xdict.
229
+ """
230
+ for k, v in self.items():
231
+ _print_stat(k, v)
232
+
233
+ def detach(self):
234
+ """
235
+ Detaches all Torch tensors in the xdict from the computational graph and moves them to the CPU.
236
+ Non-tensor objects are ignored.
237
+
238
+ Returns:
239
+ xdict: A new xdict with detached Torch tensors as values.
240
+ """
241
+ return xdict(thing.detach_thing(self))
242
+
243
+ def has_invalid(self):
244
+ """
245
+ Checks if any of the Torch tensors in the xdict contain NaN or Inf values.
246
+
247
+ Returns:
248
+ bool: True if at least one tensor contains NaN or Inf values, False otherwise.
249
+ """
250
+ for k, v in self.items():
251
+ if isinstance(v, torch.Tensor):
252
+ if torch.isnan(v).any():
253
+ print(f"{k} contains nan values")
254
+ return True
255
+ if torch.isinf(v).any():
256
+ print(f"{k} contains inf values")
257
+ return True
258
+ return False
259
+
260
+ def apply(self, operation, criterion=None):
261
+ """
262
+ Applies an operation to the values in the xdict, based on an optional criterion.
263
+
264
+ Args:
265
+ operation (callable): A callable object that takes a single argument and returns a value.
266
+ criterion (callable, optional): A callable object that takes two arguments (key and value) and returns a boolean.
267
+
268
+ Returns:
269
+ xdict: A new xdict with the same keys as the original, but with the values modified by the operation.
270
+ """
271
+ out = {}
272
+ for k, v in self.items():
273
+ if criterion is None or criterion(k, v):
274
+ out[k] = operation(v)
275
+ return xdict(out)
276
+
277
+ def save(self, path, dev=None, verbose=True):
278
+ """
279
+ Saves the xdict to disk as a Torch tensor.
280
+
281
+ Args:
282
+ path (str): The path to save the xdict.
283
+ dev (torch.device, optional): The device to use for saving the tensor (default is CPU).
284
+ verbose (bool, optional): Whether to print a message indicating that the xdict has been saved (default is True).
285
+ """
286
+ if verbose:
287
+ print(f"Saving to {path}")
288
+ torch.save(self.to(dev), path)
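A short sketch of chaining a few xdict helpers (dummy tensors; key names are illustrative only):

import torch

out = xdict()
out["pred.verts"] = torch.rand(4, 100, 3)
out["pred.joints"] = torch.rand(4, 21, 3)
out["targets.verts"] = torch.rand(4, 100, 3)

preds = out.search("pred.", replace_to="")    # keep matching keys, renamed to "verts", "joints"
preds = preds.prefix("right.").to_np()        # "right.verts", "right.joints" as numpy arrays
preds.print_stat()                            # per-key shape/type summary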
data_loaders/.DS_Store ADDED
Binary file (6.15 kB). View file
 
data_loaders/__pycache__/get_data.cpython-38.pyc ADDED
Binary file (4.42 kB). View file
 
data_loaders/__pycache__/tensors.cpython-38.pyc ADDED
Binary file (6.98 kB). View file
 
data_loaders/get_data.py ADDED
@@ -0,0 +1,178 @@
1
+ from torch.utils.data import DataLoader
2
+ from data_loaders.tensors import collate as all_collate
3
+ from data_loaders.tensors import t2m_collate, motion_ours_collate, motion_ours_singe_seq_collate, motion_ours_obj_base_rel_dist_collate
4
+ # from data_loaders.humanml.data.dataset import HumanML3D
5
+ import torch
6
+
7
+ def get_dataset_class(name, args=None):
8
+ if name == "amass":
9
+ from .amass import AMASS
10
+ return AMASS
11
+ elif name == "uestc":
12
+ from .a2m.uestc import UESTC
13
+ return UESTC
14
+ elif name == "humanact12":
15
+ from .a2m.humanact12poses import HumanAct12Poses
16
+ return HumanAct12Poses ## to pose ##
17
+ elif name == "humanml":
18
+ from data_loaders.humanml.data.dataset import HumanML3D
19
+ return HumanML3D
20
+ elif name == "kit":
21
+ from data_loaders.humanml.data.dataset import KIT
22
+ return KIT
23
+ elif name == "motion_ours": # motion ours
24
+ if len(args.single_seq_path) > 0 and not args.use_predicted_infos and not args.use_interpolated_infos:
25
+ print(f"Using single frame dataset for evaluation purpose...")
26
+ # from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V16
27
+ if args.rep_type == "obj_base_rel_dist":
28
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V17 as my_data
29
+ elif args.rep_type == "ambient_obj_base_rel_dist":
30
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V18 as my_data
31
+ elif args.rep_type in ["obj_base_rel_dist_we", "obj_base_rel_dist_we_wj", "obj_base_rel_dist_we_wj_latents"]:
32
+ if args.use_arctic and args.use_pose_pred:
33
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V19_Arctic_from_Pred as my_data
34
+ elif args.use_hho:
35
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V19_HHO as my_data
36
+ elif args.use_arctic:
37
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V19_Arctic as my_data
38
+ elif len(args.cad_model_fn) > 0:
39
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V19_Ours as my_data
40
+ elif len(args.predicted_info_fn) > 0:
41
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V19_From_Evaluated_Info as my_data
42
+ else:
43
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V19 as my_data
44
+ else:
45
+ from data_loaders.humanml.data.dataset_ours_single_seq import GRAB_Dataset_V16 as my_data
46
+ return my_data
47
+ else:
48
+ if args.rep_type == "obj_base_rel_dist":
49
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V17 as my_data
50
+ elif args.rep_type == "ambient_obj_base_rel_dist":
51
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V18 as my_data
52
+ elif args.rep_type in ["obj_base_rel_dist_we", "obj_base_rel_dist_we_wj", "obj_base_rel_dist_we_wj_latents"]:
53
+ if args.use_arctic:
54
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V19_ARCTIC as my_data
55
+ elif args.use_vox_data: # use vox data here #
56
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V20 as my_data
57
+ elif args.use_predicted_infos: # train with predicted infos for test-time adaptation
58
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V21 as my_data
59
+ elif args.use_interpolated_infos:
60
+ # GRAB_Dataset_V22
61
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V22 as my_data
62
+ else:
63
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V19 as my_data
64
+ else:
65
+ from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V16 as my_data
66
+ return my_data
67
+ # from data_loaders.humanml.data.dataset_ours import GRAB_Dataset_V16
68
+ # return GRAB_Dataset_V16
69
+ else:
70
+ raise ValueError(f'Unsupported dataset name [{name}]')
71
+
72
+ def get_collate_fn(name, hml_mode='train', args=None):
73
+ print(f"name: {name}, hml_mode: {hml_mode}")
74
+ if hml_mode == 'gt':
75
+ from data_loaders.humanml.data.dataset import collate_fn as t2m_eval_collate
76
+ return t2m_eval_collate
77
+ if name in ["humanml", "kit"]:
78
+ return t2m_collate
79
+ elif name in ["motion_ours"]:
80
+ ## === single seq path === ##
81
+ print(f"single_seq_path: {args.single_seq_path}, rep_type: {args.rep_type}")
82
+ # motion_ours_obj_base_rel_dist_collate
83
+ ## rep_type of the obj_base_pts rel_dist; ambient obj base rel dist ##
84
+ if args.rep_type in ["obj_base_rel_dist", "ambient_obj_base_rel_dist", "obj_base_rel_dist_we", "obj_base_rel_dist_we_wj", "obj_base_rel_dist_we_wj_latents"]:
85
+ return motion_ours_obj_base_rel_dist_collate
86
+ else: # single_seq_path #
87
+ if len(args.single_seq_path) > 0:
88
+ return motion_ours_singe_seq_collate
89
+ else:
90
+ return motion_ours_collate
91
+ # if len(args.single_seq_path) > 0:
92
+ # return motion_ours_singe_seq_collate
93
+ # else:
94
+ # if args.rep_type == "obj_base_rel_dist":
95
+ # return motion_ours_obj_base_rel_dist_collate
96
+ # else:
97
+ # return motion_ours_collate
98
+ else:
99
+ return all_collate
100
+
101
+ ## get dataset ##
102
+ def get_dataset(name, num_frames, split='train', hml_mode='train', args=None):
103
+ DATA = get_dataset_class(name, args=args)
104
+ if name in ["humanml", "kit"]:
105
+ dataset = DATA(split=split, num_frames=num_frames, mode=hml_mode)
106
+ elif name in ["motion_ours"]:
107
+ # humanml_datawarper = HumanML3D(split=split, num_frames=num_frames, mode=hml_mode, load_vectorizer=True)
108
+ # w_vectorizer = humanml_datawarper.w_vectorizer
109
+
110
+ w_vectorizer = None
111
+ # split = "val" ## add split, split here --> split --> split and split ##
112
+ data_path = "/data1/sim/GRAB_processed"
113
+ # split, w_vectorizer, window_size=30, step_size=15, num_points=8000, args=None
114
+ window_size = args.window_size
115
+ # split= "val"
116
+ dataset = DATA(data_path, split=split, w_vectorizer=w_vectorizer, window_size=window_size, step_size=15, num_points=8000, args=args)
117
+ else:
118
+ dataset = DATA(split=split, num_frames=num_frames)
119
+ return dataset
120
+
121
+
122
+ def get_dataset_only(name, batch_size, num_frames, split='train', hml_mode='train', args=None):
123
+ dataset = get_dataset(name, num_frames, split, hml_mode, args=args)
124
+ return dataset
125
+
126
+ # python -m train.train_mdm --save_dir save/my_humanml_trans_enc_512 --dataset motion_ours
127
+ def get_dataset_loader(name, batch_size, num_frames, split='train', hml_mode='train', args=None):
128
+ dataset = get_dataset(name, num_frames, split, hml_mode, args=args)
129
+ collate = get_collate_fn(name, hml_mode, args=args)
130
+
131
+ if args is not None and name in ["motion_ours"] and len(args.single_seq_path) > 0:
132
+ shuffle_loader = False
133
+ drop_last = False
134
+ else:
135
+ shuffle_loader = True
136
+ drop_last = True
137
+
138
+ num_workers = 8  # default number of dataloader workers
139
+ num_workers = 16  # overrides the default above; use 16 workers
140
+ ### ==== create dataloader here ==== ###
141
+
142
+ loader = DataLoader( # tag for each sequence
143
+ dataset, batch_size=batch_size, shuffle=shuffle_loader,
144
+ num_workers=num_workers, drop_last=drop_last, collate_fn=collate
145
+ )
146
+
147
+ return loader
148
+
149
+
150
+ # python -m train.train_mdm --save_dir save/my_humanml_trans_enc_512 --dataset motion_ours
151
+ def get_dataset_loader_dist(name, batch_size, num_frames, split='train', hml_mode='train', args=None):
152
+ dataset = get_dataset(name, num_frames, split, hml_mode, args=args)
153
+ collate = get_collate_fn(name, hml_mode, args=args)
154
+
155
+ if args is not None and name in ["motion_ours"] and len(args.single_seq_path) > 0:
156
+ # shuffle_loader = False
157
+ drop_last = False
158
+ else:
159
+ # shuffle_loader = True
160
+ drop_last = True
161
+
162
+ num_workers = 8  # default number of dataloader workers
163
+ num_workers = 16  # overrides the default above; use 16 workers
164
+ ### ==== create dataloader here ==== ###
165
+
166
+
167
+ ''' dist sampler and loader '''
168
+ sampler = torch.utils.data.distributed.DistributedSampler(dataset)
169
+ loader = DataLoader(dataset, batch_size=batch_size,
170
+ sampler=sampler, num_workers=num_workers, drop_last=drop_last, collate_fn=collate)
171
+
172
+
173
+ # loader = DataLoader( # tag for each sequence
174
+ # dataset, batch_size=batch_size, shuffle=shuffle_loader,
175
+ # num_workers=num_workers, drop_last=drop_last, collate_fn=collate
176
+ # )
177
+
178
+ return loader
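A minimal usage sketch (annotation only, not part of this commit): how a training script might call `get_dataset_loader` above for the `motion_ours` dataset. The flag names mirror the attributes read in `get_dataset_class` and `get_collate_fn`; the concrete values are illustrative assumptions, and the loader still expects the processed GRAB data at the hard-coded `/data1/sim/GRAB_processed` path.

from argparse import Namespace
from data_loaders.get_data import get_dataset_loader

# Illustrative flag values; real runs pass the project's full argument parser.
args = Namespace(
    single_seq_path="",            # empty -> full-dataset training branch
    use_predicted_infos=False,
    use_interpolated_infos=False,
    use_arctic=False,
    use_vox_data=False,
    rep_type="obj_base_rel_dist_we",
    window_size=30,
)
loader = get_dataset_loader(
    name="motion_ours", batch_size=32, num_frames=60,
    split="train", hml_mode="train", args=args,
)
for batch in loader:
    ...  # batches are collated by motion_ours_obj_base_rel_dist_collate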
data_loaders/humanml/.DS_Store ADDED
Binary file (6.15 kB). View file
 
data_loaders/humanml/README.md ADDED
@@ -0,0 +1 @@
1
+ This code is based on https://github.com/EricGuo5513/text-to-motion.git
data_loaders/humanml/common/__pycache__/quaternion.cpython-38.pyc ADDED
Binary file (11.6 kB). View file
 
data_loaders/humanml/common/__pycache__/skeleton.cpython-38.pyc ADDED
Binary file (6.15 kB). View file
 
data_loaders/humanml/common/quaternion.py ADDED
@@ -0,0 +1,423 @@
1
+ # Copyright (c) 2018-present, Facebook, Inc.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+ #
7
+
8
+ import torch
9
+ import numpy as np
10
+
11
+ _EPS4 = np.finfo(float).eps * 4.0
12
+
13
+ _FLOAT_EPS = np.finfo(float).eps  # the np.float alias was removed in NumPy 1.24; use the builtin float
14
+
15
+ # PyTorch-backed implementations
16
+ def qinv(q):
17
+ assert q.shape[-1] == 4, 'q must be a tensor of shape (*, 4)'
18
+ mask = torch.ones_like(q)
19
+ mask[..., 1:] = -mask[..., 1:]
20
+ return q * mask
21
+
22
+
23
+ def qinv_np(q):
24
+ assert q.shape[-1] == 4, 'q must be a tensor of shape (*, 4)'
25
+ return qinv(torch.from_numpy(q).float()).numpy()
26
+
27
+
28
+ def qnormalize(q):
29
+ assert q.shape[-1] == 4, 'q must be a tensor of shape (*, 4)'
30
+ return q / torch.norm(q, dim=-1, keepdim=True)
31
+
32
+
33
+ def qmul(q, r):
34
+ """
35
+ Multiply quaternion(s) q with quaternion(s) r.
36
+ Expects two equally-sized tensors of shape (*, 4), where * denotes any number of dimensions.
37
+ Returns q*r as a tensor of shape (*, 4).
38
+ """
39
+ assert q.shape[-1] == 4
40
+ assert r.shape[-1] == 4
41
+
42
+ original_shape = q.shape
43
+
44
+ # Compute outer product
45
+ terms = torch.bmm(r.view(-1, 4, 1), q.view(-1, 1, 4))
46
+
47
+ w = terms[:, 0, 0] - terms[:, 1, 1] - terms[:, 2, 2] - terms[:, 3, 3]
48
+ x = terms[:, 0, 1] + terms[:, 1, 0] - terms[:, 2, 3] + terms[:, 3, 2]
49
+ y = terms[:, 0, 2] + terms[:, 1, 3] + terms[:, 2, 0] - terms[:, 3, 1]
50
+ z = terms[:, 0, 3] - terms[:, 1, 2] + terms[:, 2, 1] + terms[:, 3, 0]
51
+ return torch.stack((w, x, y, z), dim=1).view(original_shape)
52
+
53
+
54
+ def qrot(q, v):
55
+ """
56
+ Rotate vector(s) v about the rotation described by quaternion(s) q.
57
+ Expects a tensor of shape (*, 4) for q and a tensor of shape (*, 3) for v,
58
+ where * denotes any number of dimensions.
59
+ Returns a tensor of shape (*, 3).
60
+ """
61
+ assert q.shape[-1] == 4
62
+ assert v.shape[-1] == 3
63
+ assert q.shape[:-1] == v.shape[:-1]
64
+
65
+ original_shape = list(v.shape)
66
+ # print(q.shape)
67
+ q = q.contiguous().view(-1, 4)
68
+ v = v.contiguous().view(-1, 3)
69
+
70
+ qvec = q[:, 1:]
71
+ uv = torch.cross(qvec, v, dim=1)
72
+ uuv = torch.cross(qvec, uv, dim=1)
73
+ return (v + 2 * (q[:, :1] * uv + uuv)).view(original_shape)
74
+
75
+
76
+ def qeuler(q, order, epsilon=0, deg=True):
77
+ """
78
+ Convert quaternion(s) q to Euler angles.
79
+ Expects a tensor of shape (*, 4), where * denotes any number of dimensions.
80
+ Returns a tensor of shape (*, 3).
81
+ """
82
+ assert q.shape[-1] == 4
83
+
84
+ original_shape = list(q.shape)
85
+ original_shape[-1] = 3
86
+ q = q.view(-1, 4)
87
+
88
+ q0 = q[:, 0]
89
+ q1 = q[:, 1]
90
+ q2 = q[:, 2]
91
+ q3 = q[:, 3]
92
+
93
+ if order == 'xyz':
94
+ x = torch.atan2(2 * (q0 * q1 - q2 * q3), 1 - 2 * (q1 * q1 + q2 * q2))
95
+ y = torch.asin(torch.clamp(2 * (q1 * q3 + q0 * q2), -1 + epsilon, 1 - epsilon))
96
+ z = torch.atan2(2 * (q0 * q3 - q1 * q2), 1 - 2 * (q2 * q2 + q3 * q3))
97
+ elif order == 'yzx':
98
+ x = torch.atan2(2 * (q0 * q1 - q2 * q3), 1 - 2 * (q1 * q1 + q3 * q3))
99
+ y = torch.atan2(2 * (q0 * q2 - q1 * q3), 1 - 2 * (q2 * q2 + q3 * q3))
100
+ z = torch.asin(torch.clamp(2 * (q1 * q2 + q0 * q3), -1 + epsilon, 1 - epsilon))
101
+ elif order == 'zxy':
102
+ x = torch.asin(torch.clamp(2 * (q0 * q1 + q2 * q3), -1 + epsilon, 1 - epsilon))
103
+ y = torch.atan2(2 * (q0 * q2 - q1 * q3), 1 - 2 * (q1 * q1 + q2 * q2))
104
+ z = torch.atan2(2 * (q0 * q3 - q1 * q2), 1 - 2 * (q1 * q1 + q3 * q3))
105
+ elif order == 'xzy':
106
+ x = torch.atan2(2 * (q0 * q1 + q2 * q3), 1 - 2 * (q1 * q1 + q3 * q3))
107
+ y = torch.atan2(2 * (q0 * q2 + q1 * q3), 1 - 2 * (q2 * q2 + q3 * q3))
108
+ z = torch.asin(torch.clamp(2 * (q0 * q3 - q1 * q2), -1 + epsilon, 1 - epsilon))
109
+ elif order == 'yxz':
110
+ x = torch.asin(torch.clamp(2 * (q0 * q1 - q2 * q3), -1 + epsilon, 1 - epsilon))
111
+ y = torch.atan2(2 * (q1 * q3 + q0 * q2), 1 - 2 * (q1 * q1 + q2 * q2))
112
+ z = torch.atan2(2 * (q1 * q2 + q0 * q3), 1 - 2 * (q1 * q1 + q3 * q3))
113
+ elif order == 'zyx':
114
+ x = torch.atan2(2 * (q0 * q1 + q2 * q3), 1 - 2 * (q1 * q1 + q2 * q2))
115
+ y = torch.asin(torch.clamp(2 * (q0 * q2 - q1 * q3), -1 + epsilon, 1 - epsilon))
116
+ z = torch.atan2(2 * (q0 * q3 + q1 * q2), 1 - 2 * (q2 * q2 + q3 * q3))
117
+ else:
118
+ raise ValueError(f"Unsupported Euler angle order: {order}")
119
+
120
+ if deg:
121
+ return torch.stack((x, y, z), dim=1).view(original_shape) * 180 / np.pi
122
+ else:
123
+ return torch.stack((x, y, z), dim=1).view(original_shape)
124
+
125
+
126
+ # Numpy-backed implementations
127
+
128
+ def qmul_np(q, r):
129
+ q = torch.from_numpy(q).contiguous().float()
130
+ r = torch.from_numpy(r).contiguous().float()
131
+ return qmul(q, r).numpy()
132
+
133
+
134
+ def qrot_np(q, v):
135
+ q = torch.from_numpy(q).contiguous().float()
136
+ v = torch.from_numpy(v).contiguous().float()
137
+ return qrot(q, v).numpy()
138
+
139
+
140
+ def qeuler_np(q, order, epsilon=0, use_gpu=False):
141
+ if use_gpu:
142
+ q = torch.from_numpy(q).cuda().float()
143
+ return qeuler(q, order, epsilon).cpu().numpy()
144
+ else:
145
+ q = torch.from_numpy(q).contiguous().float()
146
+ return qeuler(q, order, epsilon).numpy()
147
+
148
+
149
+ def qfix(q):
150
+ """
151
+ Enforce quaternion continuity across the time dimension by selecting
152
+ the representation (q or -q) with minimal distance (or, equivalently, maximal dot product)
153
+ between two consecutive frames.
154
+
155
+ Expects a tensor of shape (L, J, 4), where L is the sequence length and J is the number of joints.
156
+ Returns a tensor of the same shape.
157
+ """
158
+ assert len(q.shape) == 3
159
+ assert q.shape[-1] == 4
160
+
161
+ result = q.copy()
162
+ dot_products = np.sum(q[1:] * q[:-1], axis=2)
163
+ mask = dot_products < 0
164
+ mask = (np.cumsum(mask, axis=0) % 2).astype(bool)
165
+ result[1:][mask] *= -1
166
+ return result
167
+
168
+
169
+ def euler2quat(e, order, deg=True):
170
+ """
171
+ Convert Euler angles to quaternions.
172
+ """
173
+ assert e.shape[-1] == 3
174
+
175
+ original_shape = list(e.shape)
176
+ original_shape[-1] = 4
177
+
178
+ e = e.view(-1, 3)
179
+
180
+ ## if euler angles in degrees
181
+ if deg:
182
+ e = e * np.pi / 180.
183
+
184
+ x = e[:, 0]
185
+ y = e[:, 1]
186
+ z = e[:, 2]
187
+
188
+ rx = torch.stack((torch.cos(x / 2), torch.sin(x / 2), torch.zeros_like(x), torch.zeros_like(x)), dim=1)
189
+ ry = torch.stack((torch.cos(y / 2), torch.zeros_like(y), torch.sin(y / 2), torch.zeros_like(y)), dim=1)
190
+ rz = torch.stack((torch.cos(z / 2), torch.zeros_like(z), torch.zeros_like(z), torch.sin(z / 2)), dim=1)
191
+
192
+ result = None
193
+ for coord in order:
194
+ if coord == 'x':
195
+ r = rx
196
+ elif coord == 'y':
197
+ r = ry
198
+ elif coord == 'z':
199
+ r = rz
200
+ else:
201
+ raise ValueError(f"Unsupported axis in order: {coord}")
202
+ if result is None:
203
+ result = r
204
+ else:
205
+ result = qmul(result, r)
206
+
207
+ # Reverse antipodal representation to have a non-negative "w"
208
+ if order in ['xyz', 'yzx', 'zxy']:
209
+ result *= -1
210
+
211
+ return result.view(original_shape)
212
+
213
+
214
+ def expmap_to_quaternion(e):
215
+ """
216
+ Convert axis-angle rotations (aka exponential maps) to quaternions.
217
+ Stable formula from "Practical Parameterization of Rotations Using the Exponential Map".
218
+ Expects a tensor of shape (*, 3), where * denotes any number of dimensions.
219
+ Returns a tensor of shape (*, 4).
220
+ """
221
+ assert e.shape[-1] == 3
222
+
223
+ original_shape = list(e.shape)
224
+ original_shape[-1] = 4
225
+ e = e.reshape(-1, 3)
226
+
227
+ theta = np.linalg.norm(e, axis=1).reshape(-1, 1)
228
+ w = np.cos(0.5 * theta).reshape(-1, 1)
229
+ xyz = 0.5 * np.sinc(0.5 * theta / np.pi) * e
230
+ return np.concatenate((w, xyz), axis=1).reshape(original_shape)
231
+
232
+
233
+ def euler_to_quaternion(e, order):
234
+ """
235
+ Convert Euler angles to quaternions.
236
+ """
237
+ assert e.shape[-1] == 3
238
+
239
+ original_shape = list(e.shape)
240
+ original_shape[-1] = 4
241
+
242
+ e = e.reshape(-1, 3)
243
+
244
+ x = e[:, 0]
245
+ y = e[:, 1]
246
+ z = e[:, 2]
247
+
248
+ rx = np.stack((np.cos(x / 2), np.sin(x / 2), np.zeros_like(x), np.zeros_like(x)), axis=1)
249
+ ry = np.stack((np.cos(y / 2), np.zeros_like(y), np.sin(y / 2), np.zeros_like(y)), axis=1)
250
+ rz = np.stack((np.cos(z / 2), np.zeros_like(z), np.zeros_like(z), np.sin(z / 2)), axis=1)
251
+
252
+ result = None
253
+ for coord in order:
254
+ if coord == 'x':
255
+ r = rx
256
+ elif coord == 'y':
257
+ r = ry
258
+ elif coord == 'z':
259
+ r = rz
260
+ else:
261
+ raise ValueError(f"Unsupported axis in order: {coord}")
262
+ if result is None:
263
+ result = r
264
+ else:
265
+ result = qmul_np(result, r)
266
+
267
+ # Reverse antipodal representation to have a non-negative "w"
268
+ if order in ['xyz', 'yzx', 'zxy']:
269
+ result *= -1
270
+
271
+ return result.reshape(original_shape)
272
+
273
+
274
+ def quaternion_to_matrix(quaternions):
275
+ """
276
+ Convert rotations given as quaternions to rotation matrices.
277
+ Args:
278
+ quaternions: quaternions with real part first,
279
+ as tensor of shape (..., 4).
280
+ Returns:
281
+ Rotation matrices as tensor of shape (..., 3, 3).
282
+ """
283
+ r, i, j, k = torch.unbind(quaternions, -1)
284
+ two_s = 2.0 / (quaternions * quaternions).sum(-1)
285
+
286
+ o = torch.stack(
287
+ (
288
+ 1 - two_s * (j * j + k * k),
289
+ two_s * (i * j - k * r),
290
+ two_s * (i * k + j * r),
291
+ two_s * (i * j + k * r),
292
+ 1 - two_s * (i * i + k * k),
293
+ two_s * (j * k - i * r),
294
+ two_s * (i * k - j * r),
295
+ two_s * (j * k + i * r),
296
+ 1 - two_s * (i * i + j * j),
297
+ ),
298
+ -1,
299
+ )
300
+ return o.reshape(quaternions.shape[:-1] + (3, 3))
301
+
302
+
303
+ def quaternion_to_matrix_np(quaternions):
304
+ q = torch.from_numpy(quaternions).contiguous().float()
305
+ return quaternion_to_matrix(q).numpy()
306
+
307
+
308
+ def quaternion_to_cont6d_np(quaternions):
309
+ rotation_mat = quaternion_to_matrix_np(quaternions)
310
+ cont_6d = np.concatenate([rotation_mat[..., 0], rotation_mat[..., 1]], axis=-1)
311
+ return cont_6d
312
+
313
+
314
+ def quaternion_to_cont6d(quaternions):
315
+ rotation_mat = quaternion_to_matrix(quaternions)
316
+ cont_6d = torch.cat([rotation_mat[..., 0], rotation_mat[..., 1]], dim=-1)
317
+ return cont_6d
318
+
319
+
320
+ def cont6d_to_matrix(cont6d):
321
+ assert cont6d.shape[-1] == 6, "The last dimension must be 6"
322
+ x_raw = cont6d[..., 0:3]
323
+ y_raw = cont6d[..., 3:6]
324
+
325
+ x = x_raw / torch.norm(x_raw, dim=-1, keepdim=True)
326
+ z = torch.cross(x, y_raw, dim=-1)
327
+ z = z / torch.norm(z, dim=-1, keepdim=True)
328
+
329
+ y = torch.cross(z, x, dim=-1)
330
+
331
+ x = x[..., None]
332
+ y = y[..., None]
333
+ z = z[..., None]
334
+
335
+ mat = torch.cat([x, y, z], dim=-1)
336
+ return mat
337
+
338
+
339
+ def cont6d_to_matrix_np(cont6d):
340
+ q = torch.from_numpy(cont6d).contiguous().float()
341
+ return cont6d_to_matrix(q).numpy()
342
+
343
+
344
+ def qpow(q0, t, dtype=torch.float):
345
+ ''' q0 : tensor of quaternions
346
+ t: tensor of powers
347
+ '''
348
+ q0 = qnormalize(q0)
349
+ theta0 = torch.acos(q0[..., 0])
350
+
351
+ ## if theta0 is close to zero, add epsilon to avoid NaNs
352
+ mask = (theta0 <= 10e-10) * (theta0 >= -10e-10)
353
+ theta0 = (1 - mask) * theta0 + mask * 10e-10
354
+ v0 = q0[..., 1:] / torch.sin(theta0).view(-1, 1)
355
+
356
+ if isinstance(t, torch.Tensor):
357
+ q = torch.zeros(t.shape + q0.shape)
358
+ theta = t.view(-1, 1) * theta0.view(1, -1)
359
+ else: ## if t is a number
360
+ q = torch.zeros(q0.shape)
361
+ theta = t * theta0
362
+
363
+ q[..., 0] = torch.cos(theta)
364
+ q[..., 1:] = v0 * torch.sin(theta).unsqueeze(-1)
365
+
366
+ return q.to(dtype)
367
+
368
+
369
+ def qslerp(q0, q1, t):
370
+ '''
371
+ q0: starting quaternion
372
+ q1: ending quaternion
373
+ t: array of points along the way
374
+
375
+ Returns:
376
+ Tensor of Slerps: t.shape + q0.shape
377
+ '''
378
+
379
+ q0 = qnormalize(q0)
380
+ q1 = qnormalize(q1)
381
+ q_ = qpow(qmul(q1, qinv(q0)), t)
382
+
383
+ return qmul(q_,
384
+ q0.contiguous().view(torch.Size([1] * len(t.shape)) + q0.shape).expand(t.shape + q0.shape).contiguous())
385
+
386
+
387
+ def qbetween(v0, v1):
388
+ '''
389
+ find the quaternion used to rotate v0 to v1
390
+ '''
391
+ assert v0.shape[-1] == 3, 'v0 must be of the shape (*, 3)'
392
+ assert v1.shape[-1] == 3, 'v1 must be of the shape (*, 3)'
393
+
394
+ v = torch.cross(v0, v1)
395
+ w = torch.sqrt((v0 ** 2).sum(dim=-1, keepdim=True) * (v1 ** 2).sum(dim=-1, keepdim=True)) + (v0 * v1).sum(dim=-1,
396
+ keepdim=True)
397
+ return qnormalize(torch.cat([w, v], dim=-1))
398
+
399
+
400
+ def qbetween_np(v0, v1):
401
+ '''
402
+ find the quaternion used to rotate v0 to v1
403
+ '''
404
+ assert v0.shape[-1] == 3, 'v0 must be of the shape (*, 3)'
405
+ assert v1.shape[-1] == 3, 'v1 must be of the shape (*, 3)'
406
+
407
+ v0 = torch.from_numpy(v0).float()
408
+ v1 = torch.from_numpy(v1).float()
409
+ return qbetween(v0, v1).numpy()
410
+
411
+
412
+ def lerp(p0, p1, t):
413
+ if not isinstance(t, torch.Tensor):
414
+ t = torch.Tensor([t])
415
+
416
+ new_shape = t.shape + p0.shape
417
+ new_view_t = t.shape + torch.Size([1] * len(p0.shape))
418
+ new_view_p = torch.Size([1] * len(t.shape)) + p0.shape
419
+ p0 = p0.view(new_view_p).expand(new_shape)
420
+ p1 = p1.view(new_view_p).expand(new_shape)
421
+ t = t.view(new_view_t).expand(new_shape)
422
+
423
+ return p0 + t * (p1 - p0)
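A short sanity-check sketch (annotation only, not part of the commit) for the quaternion helpers above: build a quaternion from Euler angles, rotate a vector with `qrot`, and undo the rotation with `qinv`.

import torch
from data_loaders.humanml.common.quaternion import euler2quat, qrot, qinv

angles = torch.tensor([[0.0, 90.0, 0.0]])      # degrees, interpreted with order 'xyz'
q = euler2quat(angles, order='xyz', deg=True)  # shape (1, 4), w-first quaternion
v = torch.tensor([[1.0, 0.0, 0.0]])
v_rot = qrot(q, v)                             # ~(0, 0, -1): x-axis rotated 90 deg about y
v_back = qrot(qinv(q), v_rot)                  # recovers (1, 0, 0) up to float error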
data_loaders/humanml/common/skeleton.py ADDED
@@ -0,0 +1,199 @@
1
+ from data_loaders.humanml.common.quaternion import *
2
+ import scipy.ndimage.filters as filters
3
+
4
+ class Skeleton(object):
5
+ def __init__(self, offset, kinematic_tree, device):
6
+ self.device = device
7
+ self._raw_offset_np = offset.numpy()
8
+ self._raw_offset = offset.clone().detach().to(device).float()
9
+ self._kinematic_tree = kinematic_tree
10
+ self._offset = None
11
+ self._parents = [0] * len(self._raw_offset)
12
+ self._parents[0] = -1
13
+ for chain in self._kinematic_tree:
14
+ for j in range(1, len(chain)):
15
+ self._parents[chain[j]] = chain[j-1]
16
+
17
+ def njoints(self):
18
+ return len(self._raw_offset)
19
+
20
+ def offset(self):
21
+ return self._offset
22
+
23
+ def set_offset(self, offsets):
24
+ self._offset = offsets.clone().detach().to(self.device).float()
25
+
26
+ def kinematic_tree(self):
27
+ return self._kinematic_tree
28
+
29
+ def parents(self):
30
+ return self._parents
31
+
32
+ # joints (batch_size, joints_num, 3)
33
+ def get_offsets_joints_batch(self, joints):
34
+ assert len(joints.shape) == 3
35
+ _offsets = self._raw_offset.expand(joints.shape[0], -1, -1).clone()
36
+ for i in range(1, self._raw_offset.shape[0]):
37
+ _offsets[:, i] = torch.norm(joints[:, i] - joints[:, self._parents[i]], p=2, dim=1)[:, None] * _offsets[:, i]
38
+
39
+ self._offset = _offsets.detach()
40
+ return _offsets
41
+
42
+ # joints (joints_num, 3)
43
+ def get_offsets_joints(self, joints):
44
+ assert len(joints.shape) == 2
45
+ _offsets = self._raw_offset.clone()
46
+ for i in range(1, self._raw_offset.shape[0]):
47
+ # print(joints.shape)
48
+ _offsets[i] = torch.norm(joints[i] - joints[self._parents[i]], p=2, dim=0) * _offsets[i]
49
+
50
+ self._offset = _offsets.detach()
51
+ return _offsets
52
+
53
+ # face_joint_idx should follow the order of right hip, left hip, right shoulder, left shoulder
54
+ # joints (batch_size, joints_num, 3)
55
+ def inverse_kinematics_np(self, joints, face_joint_idx, smooth_forward=False):
56
+ assert len(face_joint_idx) == 4
57
+ '''Get Forward Direction'''
58
+ l_hip, r_hip, sdr_r, sdr_l = face_joint_idx
59
+ across1 = joints[:, r_hip] - joints[:, l_hip]
60
+ across2 = joints[:, sdr_r] - joints[:, sdr_l]
61
+ across = across1 + across2
62
+ across = across / np.sqrt((across**2).sum(axis=-1))[:, np.newaxis]
63
+ # print(across1.shape, across2.shape)
64
+
65
+ # forward (batch_size, 3)
66
+ forward = np.cross(np.array([[0, 1, 0]]), across, axis=-1)
67
+ if smooth_forward:
68
+ forward = filters.gaussian_filter1d(forward, 20, axis=0, mode='nearest')
69
+ # forward (batch_size, 3)
70
+ forward = forward / np.sqrt((forward**2).sum(axis=-1))[..., np.newaxis]
71
+
72
+ '''Get Root Rotation'''
73
+ target = np.array([[0,0,1]]).repeat(len(forward), axis=0)
74
+ root_quat = qbetween_np(forward, target)
75
+
76
+ '''Inverse Kinematics'''
77
+ # quat_params (batch_size, joints_num, 4)
78
+ # print(joints.shape[:-1])
79
+ quat_params = np.zeros(joints.shape[:-1] + (4,))
80
+ # print(quat_params.shape)
81
+ root_quat[0] = np.array([[1.0, 0.0, 0.0, 0.0]])
82
+ quat_params[:, 0] = root_quat
83
+ # quat_params[0, 0] = np.array([[1.0, 0.0, 0.0, 0.0]])
84
+ for chain in self._kinematic_tree:
85
+ R = root_quat
86
+ for j in range(len(chain) - 1):
87
+ # (batch, 3)
88
+ u = self._raw_offset_np[chain[j+1]][np.newaxis,...].repeat(len(joints), axis=0)
89
+ # print(u.shape)
90
+ # (batch, 3)
91
+ v = joints[:, chain[j+1]] - joints[:, chain[j]]
92
+ v = v / np.sqrt((v**2).sum(axis=-1))[:, np.newaxis]
93
+ # print(u.shape, v.shape)
94
+ rot_u_v = qbetween_np(u, v)
95
+
96
+ R_loc = qmul_np(qinv_np(R), rot_u_v)
97
+
98
+ quat_params[:,chain[j + 1], :] = R_loc
99
+ R = qmul_np(R, R_loc)
100
+
101
+ return quat_params
102
+
103
+ # Be sure root joint is at the beginning of kinematic chains
104
+ def forward_kinematics(self, quat_params, root_pos, skel_joints=None, do_root_R=True):
105
+ # quat_params (batch_size, joints_num, 4)
106
+ # joints (batch_size, joints_num, 3)
107
+ # root_pos (batch_size, 3)
108
+ if skel_joints is not None:
109
+ offsets = self.get_offsets_joints_batch(skel_joints)
110
+ if len(self._offset.shape) == 2:
111
+ offsets = self._offset.expand(quat_params.shape[0], -1, -1)
112
+ joints = torch.zeros(quat_params.shape[:-1] + (3,)).to(self.device)
113
+ joints[:, 0] = root_pos
114
+ for chain in self._kinematic_tree:
115
+ if do_root_R:
116
+ R = quat_params[:, 0]
117
+ else:
118
+ R = torch.tensor([[1.0, 0.0, 0.0, 0.0]]).expand(len(quat_params), -1).detach().to(self.device)
119
+ for i in range(1, len(chain)):
120
+ R = qmul(R, quat_params[:, chain[i]])
121
+ offset_vec = offsets[:, chain[i]]
122
+ joints[:, chain[i]] = qrot(R, offset_vec) + joints[:, chain[i-1]]
123
+ return joints
124
+
125
+ # Be sure root joint is at the beginning of kinematic chains
126
+ def forward_kinematics_np(self, quat_params, root_pos, skel_joints=None, do_root_R=True):
127
+ # quat_params (batch_size, joints_num, 4)
128
+ # joints (batch_size, joints_num, 3)
129
+ # root_pos (batch_size, 3)
130
+ if skel_joints is not None:
131
+ skel_joints = torch.from_numpy(skel_joints)
132
+ offsets = self.get_offsets_joints_batch(skel_joints)
133
+ if len(self._offset.shape) == 2:
134
+ offsets = self._offset.expand(quat_params.shape[0], -1, -1)
135
+ offsets = offsets.numpy()
136
+ joints = np.zeros(quat_params.shape[:-1] + (3,))
137
+ joints[:, 0] = root_pos
138
+ for chain in self._kinematic_tree:
139
+ if do_root_R:
140
+ R = quat_params[:, 0]
141
+ else:
142
+ R = np.array([[1.0, 0.0, 0.0, 0.0]]).repeat(len(quat_params), axis=0)
143
+ for i in range(1, len(chain)):
144
+ R = qmul_np(R, quat_params[:, chain[i]])
145
+ offset_vec = offsets[:, chain[i]]
146
+ joints[:, chain[i]] = qrot_np(R, offset_vec) + joints[:, chain[i - 1]]
147
+ return joints
148
+
149
+ def forward_kinematics_cont6d_np(self, cont6d_params, root_pos, skel_joints=None, do_root_R=True):
150
+ # cont6d_params (batch_size, joints_num, 6)
151
+ # joints (batch_size, joints_num, 3)
152
+ # root_pos (batch_size, 3)
153
+ if skel_joints is not None:
154
+ skel_joints = torch.from_numpy(skel_joints)
155
+ offsets = self.get_offsets_joints_batch(skel_joints)
156
+ if len(self._offset.shape) == 2:
157
+ offsets = self._offset.expand(cont6d_params.shape[0], -1, -1)
158
+ offsets = offsets.numpy()
159
+ joints = np.zeros(cont6d_params.shape[:-1] + (3,))
160
+ joints[:, 0] = root_pos
161
+ for chain in self._kinematic_tree:
162
+ if do_root_R:
163
+ matR = cont6d_to_matrix_np(cont6d_params[:, 0])
164
+ else:
165
+ matR = np.eye(3)[np.newaxis, :].repeat(len(cont6d_params), axis=0)
166
+ for i in range(1, len(chain)):
167
+ matR = np.matmul(matR, cont6d_to_matrix_np(cont6d_params[:, chain[i]]))
168
+ offset_vec = offsets[:, chain[i]][..., np.newaxis]
169
+ # print(matR.shape, offset_vec.shape)
170
+ joints[:, chain[i]] = np.matmul(matR, offset_vec).squeeze(-1) + joints[:, chain[i-1]]
171
+ return joints
172
+
173
+ def forward_kinematics_cont6d(self, cont6d_params, root_pos, skel_joints=None, do_root_R=True):
174
+ # cont6d_params (batch_size, joints_num, 6)
175
+ # joints (batch_size, joints_num, 3)
176
+ # root_pos (batch_size, 3)
177
+ if skel_joints is not None:
178
+ # skel_joints = torch.from_numpy(skel_joints)
179
+ offsets = self.get_offsets_joints_batch(skel_joints)
180
+ if len(self._offset.shape) == 2:
181
+ offsets = self._offset.expand(cont6d_params.shape[0], -1, -1)
182
+ joints = torch.zeros(cont6d_params.shape[:-1] + (3,)).to(cont6d_params.device)
183
+ joints[..., 0, :] = root_pos
184
+ for chain in self._kinematic_tree:
185
+ if do_root_R:
186
+ matR = cont6d_to_matrix(cont6d_params[:, 0])
187
+ else:
188
+ matR = torch.eye(3).expand((len(cont6d_params), -1, -1)).detach().to(cont6d_params.device)
189
+ for i in range(1, len(chain)):
190
+ matR = torch.matmul(matR, cont6d_to_matrix(cont6d_params[:, chain[i]]))
191
+ offset_vec = offsets[:, chain[i]].unsqueeze(-1)
192
+ # print(matR.shape, offset_vec.shape)
193
+ joints[:, chain[i]] = torch.matmul(matR, offset_vec).squeeze(-1) + joints[:, chain[i-1]]
194
+ return joints
195
+
196
+
197
+
198
+
199
+
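A toy forward-kinematics sketch (annotation only; the three-joint chain is hypothetical, not a real body model) showing how the `Skeleton` class above is driven: set per-joint offsets, then run `forward_kinematics` with identity rotations.

import torch
from data_loaders.humanml.common.skeleton import Skeleton

offsets = torch.tensor([[0.0, 0.0, 0.0],    # root
                        [0.0, 1.0, 0.0],    # joint 1: unit offset along +y
                        [0.0, 1.0, 0.0]])   # joint 2: unit offset along +y
kinematic_tree = [[0, 1, 2]]                # a single root -> 1 -> 2 chain
skel = Skeleton(offsets, kinematic_tree, device='cpu')
skel.set_offset(offsets)

identity = torch.tensor([1.0, 0.0, 0.0, 0.0])
quat_params = identity.repeat(1, 3, 1)      # (batch=1, joints=3, 4): no rotation anywhere
root_pos = torch.zeros(1, 3)
joints = skel.forward_kinematics(quat_params, root_pos)
# With identity rotations the joints land at (0, 0, 0), (0, 1, 0), (0, 2, 0).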
data_loaders/humanml/data/__init__.py ADDED
File without changes
data_loaders/humanml/data/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (172 Bytes). View file
 
data_loaders/humanml/data/__pycache__/dataset.cpython-38.pyc ADDED
Binary file (19.1 kB). View file
 
data_loaders/humanml/data/__pycache__/dataset_ours.cpython-38.pyc ADDED
Binary file (73.1 kB). View file
 
data_loaders/humanml/data/__pycache__/dataset_ours_single_seq.cpython-38.pyc ADDED
Binary file (87.9 kB). View file
 
data_loaders/humanml/data/__pycache__/utils.cpython-38.pyc ADDED
Binary file (15.3 kB). View file
 
data_loaders/humanml/data/dataset.py ADDED
@@ -0,0 +1,795 @@
1
+ import torch
2
+ from torch.utils import data
3
+ import numpy as np
4
+ import os
5
+ from os.path import join as pjoin
6
+ import random
7
+ import codecs as cs
8
+ from tqdm import tqdm
9
+ import spacy
10
+
11
+ from torch.utils.data._utils.collate import default_collate
12
+ from data_loaders.humanml.utils.word_vectorizer import WordVectorizer
13
+ from data_loaders.humanml.utils.get_opt import get_opt
14
+
15
+ # import spacy
16
+
17
+ def collate_fn(batch):
18
+ batch.sort(key=lambda x: x[3], reverse=True)
19
+ return default_collate(batch)
20
+
21
+
22
+ '''For use of training text-2-motion generative model'''
23
+ class Text2MotionDataset(data.Dataset):
24
+ def __init__(self, opt, mean, std, split_file, w_vectorizer):
25
+ self.opt = opt
26
+ self.w_vectorizer = w_vectorizer
27
+ self.max_length = 20
28
+ self.pointer = 0
29
+ min_motion_len = 40 if self.opt.dataset_name =='t2m' else 24
30
+
31
+ joints_num = opt.joints_num
32
+
33
+ data_dict = {}
34
+ id_list = []
35
+ with cs.open(split_file, 'r') as f:
36
+ for line in f.readlines():
37
+ id_list.append(line.strip())
38
+
39
+ new_name_list = []
40
+ length_list = []
41
+ for name in tqdm(id_list):
42
+ try:
43
+ motion = np.load(pjoin(opt.motion_dir, name + '.npy'))
44
+ if (len(motion)) < min_motion_len or (len(motion) >= 200):
45
+ continue
46
+ text_data = []
47
+ flag = False
48
+ with cs.open(pjoin(opt.text_dir, name + '.txt')) as f:
49
+ for line in f.readlines():
50
+ text_dict = {}
51
+ line_split = line.strip().split('#')
52
+ caption = line_split[0]
53
+ tokens = line_split[1].split(' ')
54
+ f_tag = float(line_split[2])
55
+ to_tag = float(line_split[3])
56
+ f_tag = 0.0 if np.isnan(f_tag) else f_tag
57
+ to_tag = 0.0 if np.isnan(to_tag) else to_tag
58
+
59
+ text_dict['caption'] = caption
60
+ text_dict['tokens'] = tokens
61
+ if f_tag == 0.0 and to_tag == 0.0:
62
+ flag = True
63
+ text_data.append(text_dict)
64
+ else:
65
+ try:
66
+ n_motion = motion[int(f_tag*20) : int(to_tag*20)]
67
+ if (len(n_motion)) < min_motion_len or (len(n_motion) >= 200):
68
+ continue
69
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
70
+ while new_name in data_dict:
71
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
72
+ data_dict[new_name] = {'motion': n_motion,
73
+ 'length': len(n_motion),
74
+ 'text':[text_dict]}
75
+ new_name_list.append(new_name)
76
+ length_list.append(len(n_motion))
77
+ except:
78
+ print(line_split)
79
+ print(line_split[2], line_split[3], f_tag, to_tag, name)
80
+ # break
81
+
82
+ if flag:
83
+ data_dict[name] = {'motion': motion,
84
+ 'length': len(motion),
85
+ 'text':text_data}
86
+ new_name_list.append(name)
87
+ length_list.append(len(motion))
88
+ except:
89
+ # Some motion may not exist in KIT dataset
90
+ pass
91
+
92
+
93
+ name_list, length_list = zip(*sorted(zip(new_name_list, length_list), key=lambda x: x[1]))
94
+
95
+ if opt.is_train:
96
+ # root_rot_velocity (B, seq_len, 1)
97
+ std[0:1] = std[0:1] / opt.feat_bias
98
+ # root_linear_velocity (B, seq_len, 2)
99
+ std[1:3] = std[1:3] / opt.feat_bias
100
+ # root_y (B, seq_len, 1)
101
+ std[3:4] = std[3:4] / opt.feat_bias
102
+ # ric_data (B, seq_len, (joint_num - 1)*3)
103
+ std[4: 4 + (joints_num - 1) * 3] = std[4: 4 + (joints_num - 1) * 3] / 1.0
104
+ # rot_data (B, seq_len, (joint_num - 1)*6)
105
+ std[4 + (joints_num - 1) * 3: 4 + (joints_num - 1) * 9] = std[4 + (joints_num - 1) * 3: 4 + (
106
+ joints_num - 1) * 9] / 1.0
107
+ # local_velocity (B, seq_len, joint_num*3)
108
+ std[4 + (joints_num - 1) * 9: 4 + (joints_num - 1) * 9 + joints_num * 3] = std[
109
+ 4 + (joints_num - 1) * 9: 4 + (
110
+ joints_num - 1) * 9 + joints_num * 3] / 1.0
111
+ # foot contact (B, seq_len, 4)
112
+ std[4 + (joints_num - 1) * 9 + joints_num * 3:] = std[
113
+ 4 + (joints_num - 1) * 9 + joints_num * 3:] / opt.feat_bias
114
+
115
+ assert 4 + (joints_num - 1) * 9 + joints_num * 3 + 4 == mean.shape[-1]
116
+ np.save(pjoin(opt.meta_dir, 'mean.npy'), mean)
117
+ np.save(pjoin(opt.meta_dir, 'std.npy'), std)
118
+
119
+ self.mean = mean
120
+ self.std = std
121
+ self.length_arr = np.array(length_list)
122
+ self.data_dict = data_dict
123
+ self.name_list = name_list
124
+ self.reset_max_len(self.max_length)
125
+
126
+ def reset_max_len(self, length):
127
+ assert length <= self.opt.max_motion_length
128
+ self.pointer = np.searchsorted(self.length_arr, length)
129
+ print("Pointer Pointing at %d"%self.pointer)
130
+ self.max_length = length
131
+
132
+ def inv_transform(self, data):
133
+ return data * self.std + self.mean
134
+
135
+ def __len__(self):
136
+ return len(self.data_dict) - self.pointer
137
+
138
+ def __getitem__(self, item):
139
+ idx = self.pointer + item
140
+ data = self.data_dict[self.name_list[idx]]
141
+ motion, m_length, text_list = data['motion'], data['length'], data['text']
142
+ # Randomly select a caption
143
+ text_data = random.choice(text_list)
144
+ caption, tokens = text_data['caption'], text_data['tokens']
145
+
146
+ if len(tokens) < self.opt.max_text_len:
147
+ # pad with "unk"
148
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
149
+ sent_len = len(tokens)
150
+ tokens = tokens + ['unk/OTHER'] * (self.opt.max_text_len + 2 - sent_len)
151
+ else:
152
+ # crop
153
+ tokens = tokens[:self.opt.max_text_len]
154
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
155
+ sent_len = len(tokens)
156
+ pos_one_hots = []
157
+ word_embeddings = []
158
+ for token in tokens:
159
+ word_emb, pos_oh = self.w_vectorizer[token]
160
+ pos_one_hots.append(pos_oh[None, :])
161
+ word_embeddings.append(word_emb[None, :])
162
+ pos_one_hots = np.concatenate(pos_one_hots, axis=0)
163
+ word_embeddings = np.concatenate(word_embeddings, axis=0)
164
+
165
+ len_gap = (m_length - self.max_length) // self.opt.unit_length
166
+
167
+ if self.opt.is_train:
168
+ if m_length != self.max_length:
169
+ # print("Motion original length:%d_%d"%(m_length, len(motion)))
170
+ if self.opt.unit_length < 10:
171
+ coin2 = np.random.choice(['single', 'single', 'double'])
172
+ else:
173
+ coin2 = 'single'
174
+ if len_gap == 0 or (len_gap == 1 and coin2 == 'double'):
175
+ m_length = self.max_length
176
+ idx = random.randint(0, m_length - self.max_length)
177
+ motion = motion[idx:idx+self.max_length]
178
+ else:
179
+ if coin2 == 'single':
180
+ n_m_length = self.max_length + self.opt.unit_length * len_gap
181
+ else:
182
+ n_m_length = self.max_length + self.opt.unit_length * (len_gap - 1)
183
+ idx = random.randint(0, m_length - n_m_length)
184
+ motion = motion[idx:idx + self.max_length]
185
+ m_length = n_m_length
186
+ # print(len_gap, idx, coin2)
187
+ else:
188
+ if self.opt.unit_length < 10:
189
+ coin2 = np.random.choice(['single', 'single', 'double'])
190
+ else:
191
+ coin2 = 'single'
192
+
193
+ if coin2 == 'double':
194
+ m_length = (m_length // self.opt.unit_length - 1) * self.opt.unit_length
195
+ elif coin2 == 'single':
196
+ m_length = (m_length // self.opt.unit_length) * self.opt.unit_length
197
+ idx = random.randint(0, len(motion) - m_length)
198
+ motion = motion[idx:idx+m_length]
199
+
200
+ "Z Normalization"
201
+ motion = (motion - self.mean) / self.std
202
+
203
+ return word_embeddings, pos_one_hots, caption, sent_len, motion, m_length
204
+
205
+
206
+ '''For use of training text motion matching model, and evaluations'''
207
+ ## text2motions dataset v2 ##
208
+ class Text2MotionDatasetV2(data.Dataset): # text2motion dataset
209
+ def __init__(self, opt, mean, std, split_file, w_vectorizer):
210
+ self.opt = opt
211
+ self.w_vectorizer = w_vectorizer
212
+ self.max_length = 20
213
+ self.pointer = 0
214
+ self.max_motion_length = opt.max_motion_length
215
+ min_motion_len = 40 if self.opt.dataset_name =='t2m' else 24
216
+
217
+ data_dict = {}
218
+ id_list = []
219
+ with cs.open(split_file, 'r') as f:
220
+ for line in f.readlines():
221
+ id_list.append(line.strip()) ## id list ##
222
+ # id_list = id_list[:200]
223
+
224
+ new_name_list = []
225
+ length_list = []
226
+ for name in tqdm(id_list):
227
+ try:
228
+ ## motion_dir ##
229
+ motion = np.load(pjoin(opt.motion_dir, name + '.npy'))
230
+ if (len(motion)) < min_motion_len or (len(motion) >= 200):
231
+ continue
232
+ text_data = []
233
+ flag = False
234
+ # read the text annotations for this motion
235
+ with cs.open(pjoin(opt.text_dir, name + '.txt')) as f:
236
+ for line in f.readlines():
237
+ text_dict = {}
238
+ line_split = line.strip().split('#')
239
+ caption = line_split[0]
240
+ tokens = line_split[1].split(' ')
241
+ f_tag = float(line_split[2])
242
+ to_tag = float(line_split[3])
243
+ f_tag = 0.0 if np.isnan(f_tag) else f_tag
244
+ to_tag = 0.0 if np.isnan(to_tag) else to_tag
245
+
246
+ text_dict['caption'] = caption ## caption, motion, ##
247
+ text_dict['tokens'] = tokens
248
+ if f_tag == 0.0 and to_tag == 0.0:
249
+ flag = True
250
+ text_data.append(text_dict)
251
+ else:
252
+ try:
253
+ n_motion = motion[int(f_tag*20) : int(to_tag*20)]
254
+ if (len(n_motion)) < min_motion_len or (len(n_motion) >= 200):
255
+ continue
256
+ # new name for indexing #
257
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
258
+ while new_name in data_dict:
259
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
260
+ data_dict[new_name] = {'motion': n_motion,
261
+ 'length': len(n_motion), ## length of motion ##
262
+ 'text':[text_dict]}
263
+ new_name_list.append(new_name)
264
+ length_list.append(len(n_motion))
265
+ except:
266
+ print(line_split)
267
+ print(line_split[2], line_split[3], f_tag, to_tag, name)
268
+ # break
269
+
270
+ if flag:
271
+ ## motion, lenght, text ##
272
+ data_dict[name] = {'motion': motion, ## motion, length of the motion, text data
273
+ 'length': len(motion), ## motion, lenght, text ##
274
+ 'text': text_data}
275
+ new_name_list.append(name)
276
+ length_list.append(len(motion))
277
+ except:
278
+ pass
279
+
280
+ name_list, length_list = zip(*sorted(zip(new_name_list, length_list), key=lambda x: x[1]))
281
+
282
+ self.mean = mean
283
+ self.std = std
284
+ self.length_arr = np.array(length_list)
285
+ self.data_dict = data_dict
286
+ self.name_list = name_list
287
+ self.reset_max_len(self.max_length)
288
+
289
+ def reset_max_len(self, length):
290
+ assert length <= self.max_motion_length
291
+ self.pointer = np.searchsorted(self.length_arr, length)
292
+ print("Pointer Pointing at %d"%self.pointer)
293
+ self.max_length = length
294
+
295
+ def inv_transform(self, data):
296
+ return data * self.std + self.mean
297
+
298
+ def __len__(self):
299
+ return len(self.data_dict) - self.pointer
300
+
301
+ def __getitem__(self, item):
302
+ idx = self.pointer + item
303
+ data = self.data_dict[self.name_list[idx]] # data
304
+ motion, m_length, text_list = data['motion'], data['length'], data['text']
305
+ # Randomly select a caption
306
+ text_data = random.choice(text_list)
307
+ caption, tokens = text_data['caption'], text_data['tokens']
308
+
309
+ if len(tokens) < self.opt.max_text_len:
310
+ # pad with "unk"
311
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
312
+ sent_len = len(tokens)
313
+ tokens = tokens + ['unk/OTHER'] * (self.opt.max_text_len + 2 - sent_len)
314
+ else:
315
+ # crop
316
+ tokens = tokens[:self.opt.max_text_len]
317
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
318
+ sent_len = len(tokens)
319
+ pos_one_hots = [] ## pose one hots ##
320
+ word_embeddings = []
321
+ for token in tokens:
322
+ word_emb, pos_oh = self.w_vectorizer[token]
323
+ pos_one_hots.append(pos_oh[None, :])
324
+ word_embeddings.append(word_emb[None, :])
325
+ pos_one_hots = np.concatenate(pos_one_hots, axis=0)
326
+ word_embeddings = np.concatenate(word_embeddings, axis=0)
327
+
328
+ # Crop the motions into multiples of opt.unit_length and introduce small variations
329
+ if self.opt.unit_length < 10:
330
+ coin2 = np.random.choice(['single', 'single', 'double'])
331
+ else:
332
+ coin2 = 'single'
333
+
334
+ if coin2 == 'double':
335
+ m_length = (m_length // self.opt.unit_length - 1) * self.opt.unit_length
336
+ elif coin2 == 'single':
337
+ m_length = (m_length // self.opt.unit_length) * self.opt.unit_length
338
+ idx = random.randint(0, len(motion) - m_length)
339
+ motion = motion[idx:idx+m_length]
340
+
341
+ "Z Normalization"
342
+ motion = (motion - self.mean) / self.std
343
+
344
+ if m_length < self.max_motion_length:
345
+ motion = np.concatenate([motion,  # zero-pad up to max_motion_length
346
+ np.zeros((self.max_motion_length - m_length, motion.shape[1]))
347
+ ], axis=0)
348
+ # print(word_embeddings.shape, motion.shape)
349
+ # print(tokens)
350
+ return word_embeddings, pos_one_hots, caption, sent_len, motion, m_length, '_'.join(tokens)
351
+
352
+
353
+
354
+ '''For use of training baseline'''
355
+ class Text2MotionDatasetBaseline(data.Dataset):
356
+ def __init__(self, opt, mean, std, split_file, w_vectorizer):
357
+ self.opt = opt
358
+ self.w_vectorizer = w_vectorizer
359
+ self.max_length = 20
360
+ self.pointer = 0
361
+ self.max_motion_length = opt.max_motion_length
362
+ min_motion_len = 40 if self.opt.dataset_name =='t2m' else 24
363
+
364
+ data_dict = {}
365
+ id_list = []
366
+ with cs.open(split_file, 'r') as f:
367
+ for line in f.readlines():
368
+ id_list.append(line.strip())
369
+ # id_list = id_list[:200]
370
+
371
+ new_name_list = []
372
+ length_list = []
373
+ for name in tqdm(id_list):
374
+ try:
375
+ motion = np.load(pjoin(opt.motion_dir, name + '.npy'))
376
+ if (len(motion)) < min_motion_len or (len(motion) >= 200):
377
+ continue
378
+ text_data = []
379
+ flag = False
380
+ with cs.open(pjoin(opt.text_dir, name + '.txt')) as f:
381
+ for line in f.readlines():
382
+ text_dict = {}
383
+ line_split = line.strip().split('#')
384
+ caption = line_split[0]
385
+ tokens = line_split[1].split(' ')
386
+ f_tag = float(line_split[2])
387
+ to_tag = float(line_split[3])
388
+ f_tag = 0.0 if np.isnan(f_tag) else f_tag
389
+ to_tag = 0.0 if np.isnan(to_tag) else to_tag
390
+
391
+ text_dict['caption'] = caption
392
+ text_dict['tokens'] = tokens
393
+ if f_tag == 0.0 and to_tag == 0.0:
394
+ flag = True
395
+ text_data.append(text_dict)
396
+ else:
397
+ try:
398
+ n_motion = motion[int(f_tag*20) : int(to_tag*20)]
399
+ if (len(n_motion)) < min_motion_len or (len(n_motion) >= 200):
400
+ continue
401
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
402
+ while new_name in data_dict:
403
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
404
+ data_dict[new_name] = {'motion': n_motion,
405
+ 'length': len(n_motion),
406
+ 'text':[text_dict]}
407
+ new_name_list.append(new_name)
408
+ length_list.append(len(n_motion))
409
+ except:
410
+ print(line_split)
411
+ print(line_split[2], line_split[3], f_tag, to_tag, name)
412
+ # break
413
+
414
+ if flag:
415
+ data_dict[name] = {'motion': motion,
416
+ 'length': len(motion),
417
+ 'text': text_data}
418
+ new_name_list.append(name)
419
+ length_list.append(len(motion))
420
+ except:
421
+ pass
422
+
423
+ name_list, length_list = zip(*sorted(zip(new_name_list, length_list), key=lambda x: x[1]))
424
+
425
+ self.mean = mean
426
+ self.std = std
427
+ self.length_arr = np.array(length_list)
428
+ self.data_dict = data_dict
429
+ self.name_list = name_list
430
+ self.reset_max_len(self.max_length)
431
+
432
+ def reset_max_len(self, length):
433
+ assert length <= self.max_motion_length
434
+ self.pointer = np.searchsorted(self.length_arr, length)
435
+ print("Pointer Pointing at %d"%self.pointer)
436
+ self.max_length = length
437
+
438
+ def inv_transform(self, data):
439
+ return data * self.std + self.mean
440
+
441
+ def __len__(self):
442
+ return len(self.data_dict) - self.pointer
443
+
444
+ def __getitem__(self, item):
445
+ idx = self.pointer + item
446
+ data = self.data_dict[self.name_list[idx]]
447
+ motion, m_length, text_list = data['motion'], data['length'], data['text']
448
+ # Randomly select a caption
449
+ text_data = random.choice(text_list)
450
+ caption, tokens = text_data['caption'], text_data['tokens']
451
+
452
+ if len(tokens) < self.opt.max_text_len:
453
+ # pad with "unk"
454
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
455
+ sent_len = len(tokens)
456
+ tokens = tokens + ['unk/OTHER'] * (self.opt.max_text_len + 2 - sent_len)
457
+ else:
458
+ # crop
459
+ tokens = tokens[:self.opt.max_text_len]
460
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
461
+ sent_len = len(tokens)
462
+ pos_one_hots = []
463
+ word_embeddings = []
464
+ for token in tokens:
465
+ word_emb, pos_oh = self.w_vectorizer[token]
466
+ pos_one_hots.append(pos_oh[None, :])
467
+ word_embeddings.append(word_emb[None, :])
468
+ pos_one_hots = np.concatenate(pos_one_hots, axis=0)
469
+ word_embeddings = np.concatenate(word_embeddings, axis=0)
470
+
471
+ len_gap = (m_length - self.max_length) // self.opt.unit_length
472
+
473
+ if m_length != self.max_length:
474
+ # print("Motion original length:%d_%d"%(m_length, len(motion)))
475
+ if self.opt.unit_length < 10:
476
+ coin2 = np.random.choice(['single', 'single', 'double'])
477
+ else:
478
+ coin2 = 'single'
479
+ if len_gap == 0 or (len_gap == 1 and coin2 == 'double'):
480
+ m_length = self.max_length
481
+ s_idx = random.randint(0, m_length - self.max_length)
482
+ else:
483
+ if coin2 == 'single':
484
+ n_m_length = self.max_length + self.opt.unit_length * len_gap
485
+ else:
486
+ n_m_length = self.max_length + self.opt.unit_length * (len_gap - 1)
487
+ s_idx = random.randint(0, m_length - n_m_length)
488
+ m_length = n_m_length
489
+ else:
490
+ s_idx = 0
491
+
492
+ src_motion = motion[s_idx: s_idx + m_length]
493
+ tgt_motion = motion[s_idx: s_idx + self.max_length]
494
+
495
+ "Z Normalization"
496
+ src_motion = (src_motion - self.mean) / self.std
497
+ tgt_motion = (tgt_motion - self.mean) / self.std
498
+
499
+ if m_length < self.max_motion_length:
500
+ src_motion = np.concatenate([src_motion,
501
+ np.zeros((self.max_motion_length - m_length, motion.shape[1]))
502
+ ], axis=0)
503
+ # print(m_length, src_motion.shape, tgt_motion.shape)
504
+ # print(word_embeddings.shape, motion.shape)
505
+ # print(tokens)
506
+ return word_embeddings, caption, sent_len, src_motion, tgt_motion, m_length
507
+
508
+
509
+ class MotionDatasetV2(data.Dataset):
510
+ def __init__(self, opt, mean, std, split_file):
511
+ self.opt = opt
512
+ joints_num = opt.joints_num
513
+
514
+ self.data = []
515
+ self.lengths = []
516
+ id_list = []
517
+ with cs.open(split_file, 'r') as f:
518
+ for line in f.readlines():
519
+ id_list.append(line.strip())
520
+
521
+ for name in tqdm(id_list):
522
+ try:
523
+ motion = np.load(pjoin(opt.motion_dir, name + '.npy'))
524
+ if motion.shape[0] < opt.window_size:
525
+ continue
526
+ self.lengths.append(motion.shape[0] - opt.window_size)
527
+ self.data.append(motion)
528
+ except:
529
+ # Some motion may not exist in KIT dataset
530
+ pass
531
+
532
+ self.cumsum = np.cumsum([0] + self.lengths)
533
+
534
+ if opt.is_train:
535
+ # root_rot_velocity (B, seq_len, 1)
536
+ std[0:1] = std[0:1] / opt.feat_bias
537
+ # root_linear_velocity (B, seq_len, 2)
538
+ std[1:3] = std[1:3] / opt.feat_bias
539
+ # root_y (B, seq_len, 1)
540
+ std[3:4] = std[3:4] / opt.feat_bias
541
+ # ric_data (B, seq_len, (joint_num - 1)*3)
542
+ std[4: 4 + (joints_num - 1) * 3] = std[4: 4 + (joints_num - 1) * 3] / 1.0
543
+ # rot_data (B, seq_len, (joint_num - 1)*6)
544
+ std[4 + (joints_num - 1) * 3: 4 + (joints_num - 1) * 9] = std[4 + (joints_num - 1) * 3: 4 + (
545
+ joints_num - 1) * 9] / 1.0
546
+ # local_velocity (B, seq_len, joint_num*3)
547
+ std[4 + (joints_num - 1) * 9: 4 + (joints_num - 1) * 9 + joints_num * 3] = std[
548
+ 4 + (joints_num - 1) * 9: 4 + (
549
+ joints_num - 1) * 9 + joints_num * 3] / 1.0
550
+ # foot contact (B, seq_len, 4)
551
+ std[4 + (joints_num - 1) * 9 + joints_num * 3:] = std[
552
+ 4 + (joints_num - 1) * 9 + joints_num * 3:] / opt.feat_bias
553
+
554
+ assert 4 + (joints_num - 1) * 9 + joints_num * 3 + 4 == mean.shape[-1]
555
+ np.save(pjoin(opt.meta_dir, 'mean.npy'), mean)
556
+ np.save(pjoin(opt.meta_dir, 'std.npy'), std)
557
+
558
+ self.mean = mean
559
+ self.std = std
560
+ print("Total number of motions {}, snippets {}".format(len(self.data), self.cumsum[-1]))
561
+
562
+ def inv_transform(self, data):
563
+ return data * self.std + self.mean
564
+
565
+ def __len__(self):
566
+ return self.cumsum[-1]
567
+
568
+ def __getitem__(self, item):
569
+ if item != 0:
570
+ motion_id = np.searchsorted(self.cumsum, item) - 1
571
+ idx = item - self.cumsum[motion_id] - 1
572
+ else:
573
+ motion_id = 0
574
+ idx = 0
575
+ # idx + j
576
+ motion = self.data[motion_id][idx:idx+self.opt.window_size]
577
+ "Z Normalization"
578
+ motion = (motion - self.mean) / self.std
579
+
580
+ return motion
581
+
582
+
583
+ class RawTextDataset(data.Dataset):
584
+ def __init__(self, opt, mean, std, text_file, w_vectorizer):
585
+ self.mean = mean
586
+ self.std = std
587
+ self.opt = opt
588
+ self.data_dict = []
589
+ self.nlp = spacy.load('en_core_web_sm')
590
+
591
+ with cs.open(text_file) as f:
592
+ for line in f.readlines():
593
+ word_list, pos_list = self.process_text(line.strip())
594
+ tokens = ['%s/%s'%(word_list[i], pos_list[i]) for i in range(len(word_list))]
595
+ self.data_dict.append({'caption':line.strip(), "tokens":tokens})
596
+
597
+ self.w_vectorizer = w_vectorizer
598
+ print("Total number of descriptions {}".format(len(self.data_dict)))
599
+
600
+
601
+ def process_text(self, sentence):
602
+ sentence = sentence.replace('-', '')
603
+ doc = self.nlp(sentence)
604
+ word_list = []
605
+ pos_list = []
606
+ for token in doc:
607
+ word = token.text
608
+ if not word.isalpha():
609
+ continue
610
+ if (token.pos_ == 'NOUN' or token.pos_ == 'VERB') and (word != 'left'):
611
+ word_list.append(token.lemma_)
612
+ else:
613
+ word_list.append(word)
614
+ pos_list.append(token.pos_)
615
+ return word_list, pos_list
616
+
617
+ def inv_transform(self, data):
618
+ return data * self.std + self.mean
619
+
620
+ def __len__(self):
621
+ return len(self.data_dict)
622
+
623
+ def __getitem__(self, item):
624
+ data = self.data_dict[item]
625
+ caption, tokens = data['caption'], data['tokens']
626
+
627
+ if len(tokens) < self.opt.max_text_len:
628
+ # pad with "unk"
629
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
630
+ sent_len = len(tokens)
631
+ tokens = tokens + ['unk/OTHER'] * (self.opt.max_text_len + 2 - sent_len)
632
+ else:
633
+ # crop
634
+ tokens = tokens[:self.opt.max_text_len]
635
+ tokens = ['sos/OTHER'] + tokens + ['eos/OTHER']
636
+ sent_len = len(tokens)
637
+ pos_one_hots = []
638
+ word_embeddings = []
639
+ for token in tokens:
640
+ word_emb, pos_oh = self.w_vectorizer[token]
641
+ pos_one_hots.append(pos_oh[None, :])
642
+ word_embeddings.append(word_emb[None, :])
643
+ pos_one_hots = np.concatenate(pos_one_hots, axis=0)
644
+ word_embeddings = np.concatenate(word_embeddings, axis=0)
645
+
646
+ return word_embeddings, pos_one_hots, caption, sent_len
647
+
648
+ class TextOnlyDataset(data.Dataset):
649
+ def __init__(self, opt, mean, std, split_file):
650
+ self.mean = mean
651
+ self.std = std
652
+ self.opt = opt
653
+ self.data_dict = []
654
+ self.max_length = 20
655
+ self.pointer = 0
656
+ self.fixed_length = 120
657
+
658
+
659
+ data_dict = {}
660
+ id_list = []
661
+ with cs.open(split_file, 'r') as f:
662
+ for line in f.readlines():
663
+ id_list.append(line.strip())
664
+ # id_list = id_list[:200]
665
+
666
+ new_name_list = []
667
+ length_list = []
668
+ for name in tqdm(id_list):
669
+ try:
670
+ text_data = []
671
+ flag = False
672
+ with cs.open(pjoin(opt.text_dir, name + '.txt')) as f:
673
+ for line in f.readlines():
674
+ text_dict = {}
675
+ line_split = line.strip().split('#')
676
+ caption = line_split[0]
677
+ tokens = line_split[1].split(' ')
678
+ f_tag = float(line_split[2])
679
+ to_tag = float(line_split[3])
680
+ f_tag = 0.0 if np.isnan(f_tag) else f_tag
681
+ to_tag = 0.0 if np.isnan(to_tag) else to_tag
682
+
683
+ text_dict['caption'] = caption
684
+ text_dict['tokens'] = tokens
685
+ if f_tag == 0.0 and to_tag == 0.0:
686
+ flag = True
687
+ text_data.append(text_dict)
688
+ else:
689
+ try:
690
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
691
+ while new_name in data_dict:
692
+ new_name = random.choice('ABCDEFGHIJKLMNOPQRSTUVW') + '_' + name
693
+ data_dict[new_name] = {'text':[text_dict]}
694
+ new_name_list.append(new_name)
695
+ except:
696
+ print(line_split)
697
+ print(line_split[2], line_split[3], f_tag, to_tag, name)
698
+ # break
699
+
700
+ if flag:
701
+ data_dict[name] = {'text': text_data}
702
+ new_name_list.append(name)
703
+ except:
704
+ pass
705
+
706
+ self.length_arr = np.array(length_list)
707
+ self.data_dict = data_dict
708
+ self.name_list = new_name_list
709
+
710
+ def inv_transform(self, data):
711
+ return data * self.std + self.mean
712
+
713
+ def __len__(self):
714
+ return len(self.data_dict)
715
+
716
+ def __getitem__(self, item):
717
+ idx = self.pointer + item
718
+ data = self.data_dict[self.name_list[idx]]
719
+ text_list = data['text']
720
+
721
+ # Randomly select a caption
722
+ text_data = random.choice(text_list)
723
+ caption, tokens = text_data['caption'], text_data['tokens']
724
+ return None, None, caption, None, np.array([0]), self.fixed_length, None
725
+ # fixed_length can be set from outside before sampling
726
+
727
+ ## t2m original dataset
728
+ # A wrapper class for t2m original dataset for MDM purposes
729
+ # humanml 3D
730
+ class HumanML3D(data.Dataset):  # wrapper around the HumanML3D (T2M) text-to-motion dataset
731
+ def __init__(self, mode, datapath='./dataset/humanml_opt.txt', split="train", load_vectorizer=False, **kwargs):
732
+ self.mode = mode
733
+
734
+ self.dataset_name = 't2m'
735
+ self.dataname = 't2m'
736
+
737
+
738
+ # Configurations of the T2M and KIT datasets are almost the same
739
+ abs_base_path = f'.'
740
+ dataset_opt_path = pjoin(abs_base_path, datapath)
741
+ device = None # torch.device('cuda:4') # This param is not in use in this context
742
+ opt = get_opt(dataset_opt_path, device)
743
+ opt.meta_dir = pjoin(abs_base_path, opt.meta_dir)
744
+ opt.motion_dir = pjoin(abs_base_path, opt.motion_dir)
745
+ opt.text_dir = pjoin(abs_base_path, opt.text_dir)
746
+ opt.model_dir = pjoin(abs_base_path, opt.model_dir)
747
+ opt.checkpoints_dir = pjoin(abs_base_path, opt.checkpoints_dir)
748
+ opt.data_root = pjoin(abs_base_path, opt.data_root)
749
+ opt.save_root = pjoin(abs_base_path, opt.save_root)
750
+ opt.meta_dir = './dataset'
751
+ self.opt = opt
752
+ print('Loading dataset %s ...' % opt.dataset_name)
753
+
754
+ if mode == 'gt':
755
+ # used by T2M models (including evaluators)
756
+ self.mean = np.load(pjoin(opt.meta_dir, f'{opt.dataset_name}_mean.npy'))
757
+ self.std = np.load(pjoin(opt.meta_dir, f'{opt.dataset_name}_std.npy'))
758
+ elif mode in ['train', 'eval', 'text_only']:
759
+ # used by our models
760
+ self.mean = np.load(pjoin(opt.data_root, 'Mean.npy'))
761
+ self.std = np.load(pjoin(opt.data_root, 'Std.npy'))
762
+
763
+ if mode == 'eval':
764
+ # used by T2M models (including evaluators)
765
+ # this is to translate their norms to ours
766
+ self.mean_for_eval = np.load(pjoin(opt.meta_dir, f'{opt.dataset_name}_mean.npy'))
767
+ self.std_for_eval = np.load(pjoin(opt.meta_dir, f'{opt.dataset_name}_std.npy'))
768
+ print(f"dataset_name: {opt.dataset_name}")
769
+ if load_vectorizer:
770
+ self.split_file = pjoin(opt.data_root, f'train.txt')
771
+ else:
772
+ self.split_file = pjoin(opt.data_root, f'{split}.txt')
773
+ if mode == 'text_only' and (not load_vectorizer):
774
+ self.t2m_dataset = TextOnlyDataset(self.opt, self.mean, self.std, self.split_file)
775
+ else:
776
+ self.w_vectorizer = WordVectorizer(pjoin(abs_base_path, 'glove'), 'our_vab')
777
+ # text-to-motion dataset using the GloVe word vectorizer loaded above
778
+ self.t2m_dataset = Text2MotionDatasetV2(self.opt, self.mean, self.std, self.split_file, self.w_vectorizer)
779
+ self.num_actions = 1 # dummy placeholder
780
+
781
+ # assert len(self.t2m_dataset) > 1, 'You loaded an empty dataset, ' \
782
+ # 'it is probably because your data dir has only texts and no motions.\n' \
783
+ # 'To train and evaluate MDM you should get the FULL data as described ' \
784
+ # 'in the README file.'
785
+
786
+ def __getitem__(self, item):
787
+ return self.t2m_dataset.__getitem__(item)
788
+
789
+ def __len__(self):
790
+ return self.t2m_dataset.__len__()
791
+
792
+ # A wrapper class for t2m original dataset for MDM purposes
793
+ class KIT(HumanML3D):
794
+ def __init__(self, mode, datapath='./dataset/kit_opt.txt', split="train", **kwargs):
795
+ super(KIT, self).__init__(mode, datapath, split, **kwargs)
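A minimal usage sketch for the wrapper above, assuming the default ./dataset/humanml_opt.txt option file and the HumanML3D files it points to (Mean.npy, Std.npy, the split lists and the text directory) are already in place:

from data_loaders.humanml.data.dataset import HumanML3D

# 'text_only' mode builds a TextOnlyDataset, so no motion files are touched
dataset = HumanML3D(mode='text_only', split='train')
print(len(dataset))
_, _, caption, _, _, length, _ = dataset[0]   # TextOnlyDataset returns 7-tuples
print(caption, length)                        # a caption and the fixed motion length (120)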
data_loaders/humanml/data/dataset_ours.py ADDED
The diff for this file is too large to render. See raw diff
 
data_loaders/humanml/data/dataset_ours_single_seq.py ADDED
The diff for this file is too large to render. See raw diff
 
data_loaders/humanml/data/utils.py ADDED
@@ -0,0 +1,507 @@
1
+ import numpy as np
2
+ import torch
3
+ import time
4
+ from scipy.spatial.transform import Rotation as R
5
+
6
+ try:
7
+ from torch_cluster import fps
8
+ except ImportError:
9
+ pass  # optional dependency; farthest_point_sampling below requires torch_cluster
10
+ from collections import OrderedDict
11
+ import os, argparse, copy, json
+ import logging
+ logger = logging.getLogger(__name__)  # used by load_network below
12
+ import math
13
+
14
+ def sample_pcd_from_mesh(vertices, triangles, npoints=512):
15
+ arears = []
16
+ for i in range(triangles.shape[0]):
17
+ v_a, v_b, v_c = int(triangles[i, 0].item()), int(triangles[i, 1].item()), int(triangles[i, 2].item())
18
+ v_a, v_b, v_c = vertices[v_a], vertices[v_b], vertices[v_c]
19
+ ab, ac = v_b - v_a, v_c - v_a
20
+ cos_ab_ac = (np.sum(ab * ac) / np.clip(np.sqrt(np.sum(ab ** 2)) * np.sqrt(np.sum(ac ** 2)), a_min=1e-9, a_max=9999999.0)).item()
21
+ sin_ab_ac = math.sqrt(1. - cos_ab_ac ** 2)
22
+ cur_area = 0.5 * sin_ab_ac * np.sqrt(np.sum(ab ** 2)).item() * np.sqrt(np.sum(ac ** 2)).item()
23
+ arears.append(cur_area)
24
+ tot_area = sum(arears)
25
+
26
+ sampled_pcts = []
27
+ tot_indices = []
28
+ tot_factors = []
29
+ for i in range(triangles.shape[0]):
30
+
31
+ v_a, v_b, v_c = int(triangles[i, 0].item()), int(triangles[i, 1].item()), int(
32
+ triangles[i, 2].item())
33
+ v_a, v_b, v_c = vertices[v_a], vertices[v_b], vertices[v_c]
34
+ # ab, ac = v_b - v_a, v_c - v_a
35
+ # cur_sampled_pts = int(npoints * (arears[i] / tot_area))
36
+ cur_sampled_pts = math.ceil(npoints * (arears[i] / tot_area))
37
+ # if cur_sampled_pts == 0:
38
+
39
+ cur_sampled_pts = int(arears[i] * npoints)
40
+ cur_sampled_pts = 1 if cur_sampled_pts == 0 else cur_sampled_pts
41
+
42
+ tmp_x, tmp_y = np.random.uniform(0, 1., (cur_sampled_pts,)).tolist(), np.random.uniform(0., 1., (cur_sampled_pts,)).tolist()
43
+
44
+ for xx, yy in zip(tmp_x, tmp_y):
45
+ sqrt_xx, sqrt_yy = math.sqrt(xx), math.sqrt(yy)
46
+ aa = 1. - sqrt_xx
47
+ bb = sqrt_xx * (1. - yy)
48
+ cc = yy * sqrt_xx
49
+ cur_pos = v_a * aa + v_b * bb + v_c * cc
50
+ sampled_pcts.append(cur_pos)
51
+
52
+ tot_indices.append(triangles[i]) # vertex indices of the source triangle
53
+ tot_factors.append([aa, bb, cc])
54
+
55
+ tot_indices = np.array(tot_indices, dtype=np.int64)
56
+ tot_factors = np.array(tot_factors, dtype=np.float32)
57
+
58
+ sampled_pcts = np.array(sampled_pcts)
59
+ print("sampled points from surface:", sampled_pcts.shape)
60
+ # sampled_pcts = np.concatenate([sampled_pcts, vertices], axis=0)
61
+ return sampled_pcts, tot_indices, tot_factors
62
+
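A quick sketch exercising sample_pcd_from_mesh on a hypothetical two-triangle unit square (the mesh data is made up for illustration):

import numpy as np
from data_loaders.humanml.data.utils import sample_pcd_from_mesh

verts = np.array([[0., 0., 0.], [1., 0., 0.], [1., 1., 0.], [0., 1., 0.]])
tris = np.array([[0, 1, 2], [0, 2, 3]])
pts, tri_idx, bary = sample_pcd_from_mesh(verts, tris, npoints=256)
print(pts.shape, tri_idx.shape, bary.shape)   # sampled xyz points, triangle vertex indices, barycentric factors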
63
+
64
+ def read_obj_file_ours(obj_fn, sub_one=False):
65
+ vertices = []
66
+ faces = []
67
+ with open(obj_fn, "r") as rf:
68
+ for line in rf:
69
+ items = line.strip().split(" ")
70
+ if items[0] == 'v':
71
+ cur_verts = items[1:]
72
+ cur_verts = [float(vv) for vv in cur_verts]
73
+ vertices.append(cur_verts)
74
+ elif items[0] == 'f':
75
+ cur_faces = items[1:] # faces
76
+ cur_face_idxes = []
77
+ for cur_f in cur_faces:
78
+ try:
79
+ cur_f_idx = int(cur_f.split("/")[0])
80
+ except:
81
+ cur_f_idx = int(cur_f.split("//")[0])
82
+ cur_face_idxes.append(cur_f_idx if not sub_one else cur_f_idx - 1)
83
+ faces.append(cur_face_idxes)
84
+ rf.close()
85
+ vertices = np.array(vertices, dtype=np.float64)
86
+ return vertices, faces
87
+
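A small sketch of read_obj_file_ours on a throwaway OBJ file (the OBJ content here is a made-up example):

import os, tempfile
from data_loaders.humanml.data.utils import read_obj_file_ours

obj_text = "v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n"
with tempfile.NamedTemporaryFile('w', suffix='.obj', delete=False) as tmp:
    tmp.write(obj_text)
verts, faces = read_obj_file_ours(tmp.name, sub_one=True)   # sub_one converts 1-based OBJ indices to 0-based
os.remove(tmp.name)
print(verts.shape, faces)   # (3, 3) [[0, 1, 2]]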
88
+ def clamp_gradient(model, clip):
89
+ for p in model.parameters():
90
+ torch.nn.utils.clip_grad_value_(p, clip)
91
+
92
+ def clamp_gradient_norm(model, max_norm, norm_type=2):
93
+ for p in model.parameters():
94
+ torch.nn.utils.clip_grad_norm_(p, max_norm, norm_type=norm_type)
95
+
96
+
97
+ def save_network(net, directory, network_label, epoch_label=None, **kwargs):
98
+ """
99
+ save model to directory with name {network_label}_{epoch_label}.pth
100
+ Args:
101
+ net: pytorch model
102
+ directory: output directory
103
+ network_label: str
104
+ epoch_label: convertible to str
105
+ kwargs: additional values to be stored alongside the state dict
106
+ """
107
+ save_filename = "_".join((network_label, str(epoch_label))) + ".pth"
108
+ save_path = os.path.join(directory, save_filename)
109
+ merge_states = OrderedDict()
110
+ merge_states["states"] = net.cpu().state_dict()
111
+ for k in kwargs:
112
+ merge_states[k] = kwargs[k]
113
+ torch.save(merge_states, save_path)
114
+ net = net.cuda()
115
+
116
+
117
+ def load_network(net, path):
118
+ """
119
+ Load network parameters whose names exist in the pth file (strict=False).
120
+ return:
121
+ None; missing and unexpected keys are logged as warnings.
122
+ """
123
+ # warnings.DeprecationWarning("load_network is deprecated. Use module.load_state_dict(strict=False) instead.")
124
+ if isinstance(path, str):
125
+ logger.info("loading network from {}".format(path))
126
+ if path[-3:] == "pth":
127
+ loaded_state = torch.load(path)
128
+ if "states" in loaded_state:
129
+ loaded_state = loaded_state["states"]
130
+ else:
131
+ loaded_state = np.load(path).item()
132
+ if "states" in loaded_state:
133
+ loaded_state = loaded_state["states"]
134
+ elif isinstance(path, dict):
135
+ loaded_state = path
136
+
137
+ network = net.module if isinstance(
138
+ net, torch.nn.DataParallel) else net
139
+
140
+ missingkeys, unexpectedkeys = network.load_state_dict(loaded_state, strict=False)
141
+ if len(missingkeys) > 0:
142
+ logger.warning("load_network: {} missing keys\n{}".format(len(missingkeys), "\n".join(missingkeys)))
143
+ if len(unexpectedkeys) > 0:
144
+ logger.warning("load_network: {} unexpected keys\n{}".format(len(unexpectedkeys), "\n".join(unexpectedkeys)))
145
+
146
+
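A sketch of the save/load round trip, assuming a CUDA device is available (save_network moves the model back to the GPU after writing) and a throwaway temporary directory:

import os, tempfile
import torch
from data_loaders.humanml.data.utils import save_network, load_network

net = torch.nn.Linear(4, 2)
ckpt_dir = tempfile.mkdtemp()
save_network(net, ckpt_dir, "toy", epoch_label=1, step=100)   # extra kwargs are stored next to the weights
load_network(net, os.path.join(ckpt_dir, "toy_1.pth"))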
147
+
148
+ def weights_init(m):
149
+ """
150
+ initialize the weights of the network for convolutional/linear layers and batchnorm layers
151
+ """
152
+ if isinstance(m, (torch.nn.modules.conv._ConvNd, torch.nn.Linear)):
153
+ torch.nn.init.xavier_uniform_(m.weight)
154
+ if m.bias is not None:
155
+ torch.nn.init.constant_(m.bias, 0.0)
156
+ elif isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
157
+ torch.nn.init.constant_(m.bias, 0.0)
158
+ torch.nn.init.constant_(m.weight, 1.0)
159
+
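For reference, a typical way to apply the initializer above to a small, arbitrary network:

import torch
from data_loaders.humanml.data.utils import weights_init

net = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.BatchNorm1d(16), torch.nn.Linear(16, 4))
net.apply(weights_init)   # Xavier-uniform weights for Linear/Conv layers, (0, 1) bias/weight for batchnorm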
160
+ def seal(mesh_to_seal):
161
+ circle_v_id = np.array([108, 79, 78, 121, 214, 215, 279, 239, 234, 92, 38, 122, 118, 117, 119, 120], dtype = np.int32)
162
+ center = (mesh_to_seal.v[circle_v_id, :]).mean(0)
163
+
164
+ sealed_mesh = copy.copy(mesh_to_seal)
165
+ sealed_mesh.v = np.vstack([mesh_to_seal.v, center])
166
+ center_v_id = sealed_mesh.v.shape[0] - 1
167
+
168
+ for i in range(circle_v_id.shape[0]):
169
+ new_faces = [circle_v_id[i-1], circle_v_id[i], center_v_id]
170
+ sealed_mesh.f = np.vstack([sealed_mesh.f, new_faces])
171
+ return sealed_mesh
172
+
173
+ def read_pos_fr_txt(txt_fn):
174
+ pos_data = []
175
+ with open(txt_fn, "r") as rf:
176
+ for line in rf:
177
+ cur_pos = line.strip().split(" ")
178
+ cur_pos = [float(p) for p in cur_pos]
179
+ pos_data.append(cur_pos)
180
+ rf.close()
181
+ pos_data = np.array(pos_data, dtype=np.float32)
182
+ print(f"pos_data: {pos_data.shape}")
183
+ return pos_data
184
+
185
+ def read_field_data_fr_txt(field_fn):
186
+ field_data = []
187
+ with open(field_fn, "r") as rf:
188
+ for line in rf:
189
+ cur_field = line.strip().split(" ")
190
+ cur_field = [float(p) for p in cur_field]
191
+ field_data.append(cur_field)
192
+ rf.close()
193
+ field_data = np.array(field_data, dtype=np.float32)
194
+ print(f"field_data: {field_data.shape}")
195
+ return field_data
196
+
197
+ def farthest_point_sampling(pos: torch.FloatTensor, n_sampling: int):
198
+ bz, N = pos.size(0), pos.size(1)
199
+ feat_dim = pos.size(-1)
200
+ device = pos.device
201
+ sampling_ratio = float(n_sampling / N)
202
+ pos_float = pos.float()
203
+
204
+ batch = torch.arange(bz, dtype=torch.long).view(bz, 1).to(device)
205
+ mult_one = torch.ones((N,), dtype=torch.long).view(1, N).to(device)
206
+
207
+ batch = batch * mult_one
208
+ batch = batch.view(-1)
209
+ pos_float = pos_float.contiguous().view(-1, feat_dim).contiguous() # (bz x N, 3)
210
+ # sampling_ratio = torch.tensor([sampling_ratio for _ in range(bz)], dtype=torch.float).to(device)
211
+ # batch = torch.zeros((N, ), dtype=torch.long, device=device)
212
+ sampled_idx = fps(pos_float, batch, ratio=sampling_ratio, random_start=False)
213
+ # sampled_idx: flat indices into the stacked (bz * N) points
214
+ return sampled_idx
215
+
216
+
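A sketch of how the FPS helper is typically called, assuming the optional torch_cluster dependency is installed:

import torch
from data_loaders.humanml.data.utils import farthest_point_sampling

pos = torch.rand(2, 1024, 3)                          # two point clouds of 1024 points
idx = farthest_point_sampling(pos, n_sampling=256)    # flat indices into the stacked (2 * 1024) points
sampled = pos.reshape(-1, 3)[idx].reshape(2, 256, 3)  # assumes fps returns exactly 256 indices per cloud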
217
+ def batched_index_select_ours(values, indices, dim = 1):
218
+ value_dims = values.shape[(dim + 1):]
219
+ values_shape, indices_shape = map(lambda t: list(t.shape), (values, indices))
220
+ indices = indices[(..., *((None,) * len(value_dims)))]
221
+ indices = indices.expand(*((-1,) * len(indices_shape)), *value_dims)
222
+ value_expand_len = len(indices_shape) - (dim + 1)
223
+ values = values[(*((slice(None),) * dim), *((None,) * value_expand_len), ...)]
224
+
225
+ value_expand_shape = [-1] * len(values.shape)
226
+ expand_slice = slice(dim, (dim + value_expand_len))
227
+ value_expand_shape[expand_slice] = indices.shape[expand_slice]
228
+ values = values.expand(*value_expand_shape)
229
+
230
+ dim += value_expand_len
231
+ return values.gather(dim, indices)
232
+
233
+ def compute_nearest(query, verts):
234
+ # query: bsz x nn_q x 3
235
+ # verts: bsz x nn_v x 3
236
+ dists = torch.sum((query.unsqueeze(2) - verts.unsqueeze(1)) ** 2, dim=-1)
237
+ minn_dists, minn_dists_idx = torch.min(dists, dim=-1) # bsz x nn_q
238
+ minn_pts_pos = batched_index_select_ours(values=verts, indices=minn_dists_idx, dim=1)
239
+ minn_pts_pos = minn_pts_pos.unsqueeze(2)
240
+ minn_dists_idx = minn_dists_idx.unsqueeze(2)
241
+ return minn_dists, minn_dists_idx, minn_pts_pos
242
+
243
+
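A shape sketch for compute_nearest on random tensors (purely illustrative):

import torch
from data_loaders.humanml.data.utils import compute_nearest

query = torch.rand(1, 5, 3)
verts = torch.rand(1, 100, 3)
d2, idx, nearest = compute_nearest(query, verts)
print(d2.shape, idx.shape, nearest.shape)   # (1, 5) squared distances, (1, 5, 1) indices, (1, 5, 1, 3) points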
244
+ def batched_index_select(t, dim, inds):
245
+ """
246
+ Helper function to extract batch-varying indices along an array
247
+ :param t: array to select from
248
+ :param dim: dimension to select along
249
+ :param inds: batch-varying indices
250
+ :return:
251
+ """
252
+ dummy = inds.unsqueeze(2).expand(inds.size(0), inds.size(1), t.size(2))
253
+ out = t.gather(dim, dummy) # b x e x f
254
+ return out
255
+
256
+
257
+ def batched_get_rot_mtx_fr_vecs(normal_vecs):
258
+ # normal_vecs: nn_pts x 3 #
259
+ #
260
+ normal_vecs = normal_vecs / torch.clamp(torch.norm(normal_vecs, p=2, dim=-1, keepdim=True), min=1e-5)
261
+ sin_theta = normal_vecs[..., 0]
262
+ cos_theta = torch.sqrt(1. - sin_theta ** 2)
263
+ sin_phi = normal_vecs[..., 1] / torch.clamp(cos_theta, min=1e-5)
264
+ # cos_phi = torch.sqrt(1. - sin_phi ** 2)
265
+ cos_phi = normal_vecs[..., 2] / torch.clamp(cos_theta, min=1e-5)
266
+
267
+ sin_phi[cos_theta < 1e-5] = 1.
268
+ cos_phi[cos_theta < 1e-5] = 0.
269
+
270
+ #
271
+ y_rot_mtx = torch.stack(
272
+ [
273
+ torch.stack([cos_theta, torch.zeros_like(cos_theta), -sin_theta], dim=-1),
274
+ torch.stack([torch.zeros_like(cos_theta), torch.ones_like(cos_theta), torch.zeros_like(cos_theta)], dim=-1),
275
+ torch.stack([sin_theta, torch.zeros_like(cos_theta), cos_theta], dim=-1)
276
+ ], dim=-1
277
+ )
278
+ x_rot_mtx = torch.stack(
279
+ [
280
+ torch.stack([torch.ones_like(cos_theta), torch.zeros_like(cos_theta), torch.zeros_like(cos_theta)], dim=-1),
281
+ torch.stack([torch.zeros_like(cos_phi), cos_phi, -sin_phi], dim=-1),
282
+ torch.stack([torch.zeros_like(cos_phi), sin_phi, cos_phi], dim=-1)
283
+ ], dim=-1
284
+ )
285
+ rot_mtx = torch.matmul(x_rot_mtx, y_rot_mtx)
286
+ return rot_mtx
287
+
288
+
289
+ def batched_get_rot_mtx_fr_vecs_v2(normal_vecs):
290
+ # normal_vecs: nn_pts x 3 #
291
+ #
292
+ normal_vecs = normal_vecs / torch.clamp(torch.norm(normal_vecs, p=2, dim=-1, keepdim=True), min=1e-5)
293
+ sin_theta = normal_vecs[..., 0]
294
+ cos_theta = torch.sqrt(1. - sin_theta ** 2)
295
+ sin_phi = normal_vecs[..., 1] / torch.clamp(cos_theta, min=1e-5)
296
+ # cos_phi = torch.sqrt(1. - sin_phi ** 2)
297
+ cos_phi = normal_vecs[..., 2] / torch.clamp(cos_theta, min=1e-5)
298
+
299
+ sin_phi[cos_theta < 1e-5] = 1.
300
+ cos_phi[cos_theta < 1e-5] = 0.
301
+
302
+ # o: nn_pts x 3 #
303
+ o = torch.stack(
304
+ [torch.zeros_like(cos_phi), cos_phi, -sin_phi], dim=-1
305
+ )
306
+ nxo = torch.cross(o, normal_vecs)
307
+ # rot_mtx: nn_pts x 3 x 3 #
308
+ rot_mtx = torch.stack(
309
+ [nxo, o, normal_vecs], dim=-1
310
+ )
311
+ return rot_mtx
312
+
313
+
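A small sanity check of the frame construction above: the third column of each returned matrix is the normalized input normal, so rotating the z-axis should reproduce it.

import torch
from data_loaders.humanml.data.utils import batched_get_rot_mtx_fr_vecs_v2

normals = torch.nn.functional.normalize(torch.randn(16, 3), dim=-1)
R = batched_get_rot_mtx_fr_vecs_v2(normals)      # (16, 3, 3)
z = torch.tensor([0., 0., 1.])
print(torch.allclose(R @ z, normals, atol=1e-4))  # True (away from the cos_theta ~ 0 degenerate case)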
314
+ def batched_get_orientation_matrices(rot_vec):
315
+ rot_matrices = []
316
+ for i_w in range(rot_vec.shape[0]):
317
+ cur_rot_vec = rot_vec[i_w]
318
+ cur_rot_mtx = R.from_rotvec(cur_rot_vec).as_matrix()
319
+ rot_matrices.append(cur_rot_mtx)
320
+ rot_matrices = np.stack(rot_matrices, axis=0)
321
+ return rot_matrices
322
+
323
+ def batched_get_minn_dist_corresponding_pts(tips, obj_pcs):
324
+ dist_tips_to_obj_pc_minn_idx = np.argmin(
325
+ ((tips.reshape(tips.shape[0], tips.shape[1], 1, 3) - obj_pcs.reshape(obj_pcs.shape[0], 1, obj_pcs.shape[1], 3)) ** 2).sum(axis=-1), axis=-1
326
+ )
327
+ obj_pcs_th = torch.from_numpy(obj_pcs).float()
328
+ dist_tips_to_obj_pc_minn_idx_th = torch.from_numpy(dist_tips_to_obj_pc_minn_idx).long()
329
+ nearest_pc_th = batched_index_select(obj_pcs_th, 1, dist_tips_to_obj_pc_minn_idx_th)
330
+ return nearest_pc_th, dist_tips_to_obj_pc_minn_idx_th
331
+
332
+ def get_affinity_fr_dist(dist, s=0.02):
333
+ ### affinity scores ###
334
+ k = 0.5 * torch.cos(torch.pi / s * torch.abs(dist)) + 0.5
335
+ return k
336
+
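The affinity above maps a distance of 0 to 1.0 and a distance of s to 0.0; a tiny check with illustrative values:

import torch
from data_loaders.humanml.data.utils import get_affinity_fr_dist

d = torch.tensor([0.0, 0.01, 0.02])
print(get_affinity_fr_dist(d, s=0.02))   # ~[1.0, 0.5, 0.0]; the cosine is periodic, so callers presumably keep |dist| within s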
337
+ def batched_reverse_transform(rot, transl, t_pc, trans=True):
338
+ # t_pc: ws x nn_obj x 3
339
+ # rot; ws x 3 x 3
340
+ # transl: ws x 1 x 3
341
+ if trans:
342
+ reverse_trans_pc = t_pc - transl
343
+ else:
344
+ reverse_trans_pc = t_pc
345
+ reverse_trans_pc = np.matmul(np.transpose(rot, (0, 2, 1)), np.transpose(reverse_trans_pc, (0, 2, 1)))
346
+ reverse_trans_pc = np.transpose(reverse_trans_pc, (0, 2, 1))
347
+ return reverse_trans_pc
348
+
349
+
350
+ def capsule_sdf(mesh_verts, mesh_normals, query_points, query_normals, caps_rad, caps_top, caps_bot, foreach_on_mesh):
351
+ # if caps on hand: mesh_verts = hand vert
352
+ """
353
+ Find the SDF of query points to mesh verts
354
+ Capsule SDF formulation from https://iquilezles.org/www/articles/distfunctions/distfunctions.htm
355
+
356
+ :param mesh_verts: (batch, V, 3)
357
+ :param mesh_normals: (batch, V, 3)
358
+ :param query_points: (batch, Q, 3)
359
+ :param caps_rad: scalar, radius of capsules
360
+ :param caps_top: scalar, distance from mesh to top of capsule
361
+ :param caps_bot: scalar, distance from mesh to bottom of capsule
362
+ :param foreach_on_mesh: boolean; for each point on the mesh find the closest query (V), or for each query find the closest mesh point (Q)
364
+ :return: normalized SDF (batch, V or Q), normal dot product (or None), nearest verts, nearest normals
364
+ """
365
+ # TODO implement normal check?
366
+ if foreach_on_mesh: # Foreach mesh vert, find closest query point
367
+ # knn_dists, nearest_idx, nearest_pos = pytorch3d.ops.knn_points(mesh_verts, query_points, K=1, return_nn=True) # TODO should attract capsule middle?
368
+ # knn_dists, nearest_idx, nearest_pos = compute_nearest(query_points, mesh_verts)
369
+ knn_dists, nearest_idx, nearest_pos = compute_nearest(mesh_verts, query_points)
370
+
371
+ capsule_tops = mesh_verts + mesh_normals * caps_top
372
+ capsule_bots = mesh_verts + mesh_normals * caps_bot
373
+ delta_top = nearest_pos[:, :, 0, :] - capsule_tops
374
+ normal_dot = torch.sum(mesh_normals * batched_index_select(query_normals, 1, nearest_idx.squeeze(2)), dim=2)
375
+
376
+ rt_nearest_verts = mesh_verts
377
+ rt_nearest_normals = mesh_normals
378
+
379
+ else: # Foreach query vert, find closest mesh point
380
+ # knn_dists, nearest_idx, nearest_pos = pytorch3d.ops.knn_points(query_points, mesh_verts, K=1, return_nn=True) # TODO should attract capsule middle?
381
+ st_time = time.time()
382
+ knn_dists, nearest_idx, nearest_pos = compute_nearest(query_points, mesh_verts)
383
+ ed_time = time.time()
384
+ # print(f"Time for computing nearest: {ed_time - st_time}")
385
+
386
+ closest_mesh_verts = batched_index_select(mesh_verts, 1, nearest_idx.squeeze(2)) # Shape (batch, V, 3)
387
+ closest_mesh_normals = batched_index_select(mesh_normals, 1, nearest_idx.squeeze(2)) # Shape (batch, V, 3)
388
+
389
+ capsule_tops = closest_mesh_verts + closest_mesh_normals * caps_top # Coordinates of the top focii of the capsules (batch, V, 3)
390
+ capsule_bots = closest_mesh_verts + closest_mesh_normals * caps_bot
391
+ delta_top = query_points - capsule_tops
392
+ # normal_dot = torch.sum(query_normals * closest_mesh_normals, dim=2)
393
+ normal_dot = None
394
+
395
+ rt_nearest_verts = closest_mesh_verts
396
+ rt_nearest_normals = closest_mesh_normals
397
+
398
+ # (top -> bot) #!!#
399
+ bot_to_top = capsule_bots - capsule_tops # Vector from capsule bottom to top
400
+ along_axis = torch.sum(delta_top * bot_to_top, dim=2) # Dot product
401
+ top_to_bot_square = torch.sum(bot_to_top * bot_to_top, dim=2)
402
+
403
+ # print(f"top_to_bot_square: {top_to_bot_square[..., :10]}")
404
+ h = torch.clamp(along_axis / top_to_bot_square, 0, 1) # Could avoid NaNs with offset in division here
405
+ dist_to_axis = torch.norm(delta_top - bot_to_top * h.unsqueeze(2), dim=2) # Distance to capsule centerline
406
+
407
+ # two endpoints; edge of the capsule #
408
+ return dist_to_axis / caps_rad, normal_dot, rt_nearest_verts, rt_nearest_normals # normalized SDF: 0 on the capsule axis, 1 on the capsule surface
409
+
410
+
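A shape-level sketch of the query-side branch (foreach_on_mesh=False) with made-up tensors; query normals are unused in this branch, so None is passed:

import torch
from data_loaders.humanml.data.utils import capsule_sdf

mesh_v = torch.rand(1, 10, 3)
mesh_n = torch.zeros(1, 10, 3); mesh_n[..., 2] = 1.0   # dummy +z normals
query = torch.rand(1, 4, 3)
sdf, normal_dot, nearest_v, nearest_n = capsule_sdf(
    mesh_v, mesh_n, query, None, caps_rad=0.01, caps_top=0.005, caps_bot=-0.005, foreach_on_mesh=False)
print(sdf.shape)   # (1, 4): distance to the capsule axis divided by caps_rad; normal_dot is None here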
411
+
412
+ def reparameterize_gaussian(mean, logvar):
413
+ std = torch.exp(0.5 * logvar)
414
+ eps = torch.randn(std.size()).to(mean.device)
415
+ return mean + std * eps
416
+
417
+
418
+ def gaussian_entropy(logvar):
419
+ const = 0.5 * float(logvar.size(1)) * (1. + np.log(np.pi * 2))
420
+ ent = 0.5 * logvar.sum(dim=1, keepdim=False) + const
421
+ return ent
422
+
423
+
424
+ def standard_normal_logprob(z): # feature dim
425
+ dim = z.size(-1)
426
+ log_z = -0.5 * dim * np.log(2 * np.pi)
427
+ return log_z - z.pow(2) / 2
428
+
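The three helpers above are the usual VAE pieces (reparameterization trick, Gaussian entropy, standard-normal log-density); a shape sketch:

import torch
from data_loaders.humanml.data.utils import (
    reparameterize_gaussian, gaussian_entropy, standard_normal_logprob)

mean, logvar = torch.zeros(4, 8), torch.zeros(4, 8)
z = reparameterize_gaussian(mean, logvar)        # (4, 8) sample, here from a standard normal
print(gaussian_entropy(logvar).shape)            # (4,)  per-sample entropy
print(standard_normal_logprob(z).shape)          # (4, 8) elementwise log-prob terms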
429
+
430
+ def truncated_normal_(tensor, mean=0, std=1, trunc_std=2):
431
+ """
432
+ Taken from https://discuss.pytorch.org/t/implementing-truncated-normal-initializer/4778/15
433
+ """
434
+ size = tensor.shape
435
+ tmp = tensor.new_empty(size + (4,)).normal_()
436
+ valid = (tmp < trunc_std) & (tmp > -trunc_std)
437
+ ind = valid.max(-1, keepdim=True)[1]
438
+ tensor.data.copy_(tmp.gather(-1, ind).squeeze(-1))
439
+ tensor.data.mul_(std).add_(mean)
440
+ return tensor
441
+
442
+
443
+ def makepath(desired_path, isfile = False):
444
+ '''
445
+ if the path does not exist make it
446
+ :param desired_path: can be path to a file or a folder name
447
+ :return:
448
+ '''
449
+ import os
450
+ if isfile:
451
+ if not os.path.exists(os.path.dirname(desired_path)):os.makedirs(os.path.dirname(desired_path))
452
+ else:
453
+ if not os.path.exists(desired_path): os.makedirs(desired_path)
454
+ return desired_path
455
+
456
+
457
+ def batch_gather(arr, ind):
458
+ """
459
+ :param arr: B x N x D
460
+ :param ind: B x M
461
+ :return: B x M x D
462
+ """
463
+ dummy = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), arr.size(2))
464
+ out = torch.gather(arr, 1, dummy)
465
+ return out
466
+
467
+
468
+ def random_rotate_np(x):
469
+ aa = np.random.randn(3)
470
+ theta = np.sqrt(np.sum(aa**2))
471
+ k = aa / np.maximum(theta, 1e-6)
472
+ K = np.array([[0, -k[2], k[1]],
473
+ [k[2], 0, -k[0]],
474
+ [-k[1], k[0], 0]])
475
+ R = np.eye(3) + np.sin(theta)*K + (1-np.cos(theta))*np.matmul(K, K)
476
+ R = R.astype(np.float32)
477
+ return np.matmul(x, R), R
478
+
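random_rotate_np draws a random axis-angle rotation via the Rodrigues formula; a quick check that the returned matrix is a proper rotation:

import numpy as np
from data_loaders.humanml.data.utils import random_rotate_np

x = np.random.randn(100, 3).astype(np.float32)
x_rot, R = random_rotate_np(x)
print(np.allclose(R @ R.T, np.eye(3), atol=1e-5), np.isclose(np.linalg.det(R), 1.0, atol=1e-5))   # True True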
479
+
480
+ def rotate_x(x, rad):
481
+ rad = -rad
482
+ rotmat = np.array([
483
+ [1, 0, 0],
484
+ [0, np.cos(rad), -np.sin(rad)],
485
+ [0, np.sin(rad), np.cos(rad)]
486
+ ])
487
+ return np.dot(x, rotmat)
488
+
489
+ def rotate_y(x, rad):
490
+ rad = -rad
491
+ rotmat = np.array([
492
+ [np.cos(rad), 0, np.sin(rad)],
493
+ [0, 1, 0],
494
+ [-np.sin(rad), 0, np.cos(rad)]
495
+ ])
496
+ return np.dot(x, rotmat)
497
+
498
+ def rotate_z(x, rad):
499
+ rad = -rad
500
+ rotmat = np.array([
501
+ [np.cos(rad), -np.sin(rad), 0],
502
+ [np.sin(rad), np.cos(rad), 0],
503
+ [0, 0, 1]
504
+ ])
505
+ return np.dot(x, rotmat)
506
+
507
+
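The three axis rotations above preserve point norms, which gives a cheap sanity check of their composition:

import numpy as np
from data_loaders.humanml.data.utils import rotate_x, rotate_y, rotate_z

pts = np.random.randn(64, 3)
pts_r = rotate_z(rotate_y(rotate_x(pts, np.pi / 2), np.pi / 4), np.pi)
print(np.allclose(np.linalg.norm(pts_r, axis=-1), np.linalg.norm(pts, axis=-1)))   # True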
data_loaders/humanml/motion_loaders/__init__.py ADDED
File without changes
data_loaders/humanml/motion_loaders/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (182 Bytes). View file