Multi-3DLLM Checkpoints
This repository hosts the released BeyondSingleObject checkpoints:
multi-3dllm/: MO3D, Shape Mating, and Change Captioning
multi-3dllm-classification/: ModelNet40 zero-shot classification
Use the code and scripts from:
https://github.com/KohsukeIde/BeyondSingleObject
Download
huggingface-cli download idekoh/Multi-3DLLM \
--local-dir checkpoints \
--include "multi-3dllm/**" "multi-3dllm-classification/**"
Expected local layout:
checkpoints/
├── multi-3dllm/
└── multi-3dllm-classification/
data/
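Before running the evaluation scripts, it can help to confirm that the download produced the layout listed above. The following is a minimal sketch, not part of the released code: `check_layout` is a hypothetical helper, and it assumes `data/` sits next to `checkpoints/` as in the listing. It checks only that the directories exist, not what they contain.

```shell
#!/bin/sh
# Hypothetical helper (not from the BeyondSingleObject repository).
# Reports which of the expected directories exist under a root directory.
# Returns 0 when all are present and 1 when at least one is missing.
check_layout() {
  root="${1:-.}"
  missing=0
  for dir in checkpoints/multi-3dllm checkpoints/multi-3dllm-classification data; do
    if [ -d "$root/$dir" ]; then
      echo "ok       $dir"
    else
      echo "missing  $dir"
      missing=1
    fi
  done
  return "$missing"
}

# Check the current working directory.
check_layout . || echo "Some expected directories are missing; see the list above."
```

Run it from the directory that contains `checkpoints/`, or pass another root as the first argument to `check_layout`.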
Usage
Example inference and LLM-based evaluation:
MODEL_PATH=checkpoints/multi-3dllm \
OUTPUT_DIR=outputs/infer \
scripts/eval/infer.sh
ModelNet40 classification:
MODEL_PATH=checkpoints/multi-3dllm-classification \
OUTPUT_DIR=outputs/modelnet40_eval \
LIMIT=0 \
PROMPT_MODE=paper \
NUM_OBJECTS=1 \
TARGET_POSITION=1 \
scripts/eval/eval_modelnet.sh
Repeat the command with (NUM_OBJECTS, TARGET_POSITION) set to (1,1), (2,1), (2,2), (3,1), (3,2), and (3,3) to produce the full table.
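The six settings can be run in one loop. The sketch below is an illustration rather than a released script: `sweep_modelnet`, the `DRY_RUN` switch, and the per-setting output directory names are assumptions added here, while the environment variables and the script path are the ones shown above. By default it only prints each command; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
set -eu

# DRY_RUN=1 (the default in this sketch) prints the commands without running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper around scripts/eval/eval_modelnet.sh.
sweep_modelnet() {
  for pair in "1 1" "2 1" "2 2" "3 1" "3 2" "3 3"; do
    # Split "N T" into NUM_OBJECTS and TARGET_POSITION.
    n="${pair% *}"
    t="${pair#* }"
    # Assumption: one output directory per setting, so runs do not overwrite each other.
    out_dir="outputs/modelnet40_eval/n${n}_t${t}"
    if [ "$DRY_RUN" = "1" ]; then
      echo "NUM_OBJECTS=${n} TARGET_POSITION=${t} OUTPUT_DIR=${out_dir} scripts/eval/eval_modelnet.sh"
    else
      MODEL_PATH=checkpoints/multi-3dllm-classification \
      OUTPUT_DIR="$out_dir" \
      LIMIT=0 \
      PROMPT_MODE=paper \
      NUM_OBJECTS="$n" \
      TARGET_POSITION="$t" \
      scripts/eval/eval_modelnet.sh
    fi
  done
}

sweep_modelnet
```

Whether the evaluation script accepts a nested `OUTPUT_DIR` like this is not confirmed by the source; adjust the path if it expects a different layout.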
Notes
The LLM-judged metrics for reasoning and delta-caption quality depend on the judge model and prompt configuration. Use the released evaluation scripts for reproducible comparisons, and report the exact judge configuration together with the checkpoint.
License
These checkpoints are built with the BeyondSingleObject codebase and use PointLLM-style initialization and data. They may inherit terms from upstream model, code, and dataset components, including PointLLM, Vicuna/Llama, Objaverse/Cap3D, ShapeTalk, Thingi10K, Neural Shape Mating, and ModelNet40. Please check the corresponding upstream licenses before redistribution or commercial use.
Citation
@inproceedings{ide2026beyondsingleobject,
title={BeyondSingleObject: Learning 3D Relations with Large Language Models},
author={Ide, Kohsuke and Yamada, Ryousuke and Qiu, Yue and Ma, Xianzheng and Fukuhara, Yoshihiro and Kataoka, Hirokatsu and Satoh, Yutaka},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
year={2026}
}