Add pipeline tag and improve model card documentation
Hi! I'm Niels from the Hugging Face community team. I've updated the model card to include the `image-to-3d` pipeline tag and to add more context about the paper, installation, and usage, based on the official GitHub repository. This should help researchers and developers discover and use your work more easily!
**README.md** (changed):
---
license: mit
pipeline_tag: image-to-3d
---

# Quantized Visual Geometry Grounded Transformer

[Paper](https://arxiv.org/abs/2509.21302)
[Code](https://github.com/wlfeng0509/QuantVGGT)

This repository contains the weights and calibration data for **QuantVGGT**, presented in the paper [Quantized Visual Geometry Grounded Transformer](https://arxiv.org/abs/2509.21302).

QuantVGGT is the first quantization framework specifically designed for Visual Geometry Grounded Transformers (VGGTs). It addresses unique challenges in compressing billion-scale 3D reconstruction models, such as heavy-tailed activation distributions and multi-view calibration instability.
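
To see why heavy-tailed activations are hard to compress, consider a toy sketch of symmetric uniform 4-bit fake-quantization (an illustration only, not QuantVGGT's actual algorithm): a single outlier sets the quantization scale, so the many small activations all collapse onto the zero level.

```python
# Toy sketch (NOT QuantVGGT's actual algorithm): symmetric uniform 4-bit
# fake-quantization, showing how a heavy tail destroys small values.
def quantize_dequantize(values, bits=4):
    """Quantize floats to a signed uniform integer grid, then map back."""
    qmax = 2 ** (bits - 1) - 1                  # 7 positive levels for 4-bit
    scale = max(abs(v) for v in values) / qmax  # one outlier sets the scale
    deq = []
    for v in values:
        q = round(v / scale)                    # nearest integer level
        q = max(-qmax - 1, min(qmax, q))        # clamp to the 4-bit range
        deq.append(q * scale)
    return deq

# One large outlier (the "heavy tail") inflates the scale, so the
# small activations all round to the zero level and become 0.0.
acts = [0.01, -0.02, 0.03, 0.015, 8.0]
deq = quantize_dequantize(acts)
```

This is the motivation for outlier-aware schemes (such as the rotation-based `quarot_w4a4` setting used below), which reshape the distribution before quantizing.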

## Installation

To get started, clone the official repository and install the dependencies:

```bash
git clone https://github.com/wlfeng0509/QuantVGGT.git
cd QuantVGGT
pip install -r requirements.txt
pip install -r requirements_demo.txt
```

## Quick Start

You can use the provided scripts for inference and calibration. For example, to generate filtered Co3D calibration data:
```bash
python Quant_VGGT/vggt/evaluation/make_calibation.py \
    --model_path VGGT-1B/model_tracker_fixed_e20.pt \
    --co3d_dir co3d_datasets/ \
    --co3d_anno_dir co3d_v2_annotations/ \
    --seed 0 \
    --cache_path all_calib_data.pt \
    --save_path calib_data.pt \
    --class_mode all \
    --kmeans_n 6 \
    --kmeans_m 7
```
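
The `--kmeans_n`/`--kmeans_m` flags suggest that calibration frames are selected by clustering. As a simplified, hypothetical illustration of that idea (a toy 1-D k-means over per-frame scores, not the repository's actual selection logic):

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Toy 1-D k-means: returns (centroids, cluster assignment per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # assign every point to its nearest centroid
        assign = [min(range(k), key=lambda c: abs(p - centroids[c]))
                  for p in points]
        # move each centroid to the mean of its cluster
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assign

# Cluster hypothetical per-frame scores; keeping one frame per cluster
# yields a small calibration set that still covers distinct data modes.
scores = [0.10, 0.12, 0.50, 0.52, 0.90, 0.91]
centroids, assign = kmeans_1d(scores, k=3)
```

Picking one representative per cluster keeps the calibration set small while spanning the diversity of the multi-view data, which is one way to address the calibration instability mentioned above.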

To quantize, calibrate, and evaluate on Co3D:
```bash
python Quant_VGGT/vggt/evaluation/run_co3d.py \
    --model_path Quant_VGGT/VGGT-1B/model_tracker_fixed_e20.pt \
    --co3d_dir co3d_datasets/ \
    --co3d_anno_dir co3d_v2_annotations/ \
    --dtype quarot_w4a4 \
    --seed 0 \
    --lac \
    --lwc \
    --cache_path calib_data.pt \
    --class_mode all \
    --exp_name a44_uqant \
    --resume_qs
```

## Citation

If you find QuantVGGT useful for your work, please cite the following paper:
```bibtex
@article{feng2025quantized,
  title={Quantized Visual Geometry Grounded Transformer},
  author={Feng, Weilun and Qin, Haotong and Wu, Mingqiang and Yang, Chuanguang and Li, Yuqi and Li, Xiangqi and An, Zhulin and Huang, Libo and Zhang, Yulun and Magno, Michele and others},
  journal={arXiv preprint arXiv:2509.21302},
  year={2025}
}
```