MarcusLoren committed ddd8656 (1 parent: edf8d63)

Update README.md

Files changed (1): README.md (+65, -3)
---
license: apache-2.0
---


### MeshGPT-alpha-preview

MeshGPT is a text-to-3D model built from an autoencoder (the tokenizer) and a transformer that generates the tokens.
The autoencoder's purpose is to translate 3D models into tokens, which its decoder part can then convert back into a 3D mesh.<br/>
For all intents and purposes, the autoencoder is the **world's first** published **3D model tokenizer**! (correct me if I'm wrong!)

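To make the tokenizer idea concrete, here is a minimal round-trip sketch. The `tokenize` and `decode_from_codes_to_faces` calls, their return values, and the dummy tensor shapes are my assumptions based on the meshgpt-pytorch library, not something stated in this card:

```
import torch
from meshgpt_pytorch import MeshAutoencoder

autoencoder = MeshAutoencoder(num_discrete_coors = 128)

# dummy mesh: coordinates in [-1, 1], 64 triangles over 192 vertices
vertices = torch.rand(1, 192, 3) * 2 - 1     # (batch, num vertices, xyz)
faces = torch.arange(192).reshape(1, 64, 3)  # (batch, num faces, vertex indices)

codes = autoencoder.tokenize(vertices = vertices, faces = faces)        # mesh -> tokens
face_coords, face_mask = autoencoder.decode_from_codes_to_faces(codes)  # tokens -> mesh
```
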
## Model Details
The autoencoder (tokenizer) is a relatively small model with 50M parameters; the transformer uses 184M parameters and its core is based on GPT2-small.
Due to hardware constraints it was trained with a codebook/vocabulary size of 2048.<br/>
Developed by: Me (with credit for the MeshGPT codebase to [Phil Wang](https://github.com/lucidrains))

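For a rough sense of what that configuration could look like, here is a hypothetical instantiation; the constructor arguments (`codebook_size`, `dim`, `max_seq_len`, `condition_on_text`) are drawn from the meshgpt-pytorch library and the values are illustrative, not the checkpoint's exact settings:

```
from meshgpt_pytorch import MeshAutoencoder, MeshTransformer

autoencoder = MeshAutoencoder(
    num_discrete_coors = 128,  # resolution of coordinate quantization
    codebook_size = 2048       # the vocabulary size mentioned above
)

transformer = MeshTransformer(
    autoencoder,
    dim = 768,                # GPT2-small-like width (illustrative)
    max_seq_len = 1500,       # illustrative cap on the token sequence length
    condition_on_text = True  # enables the text -> 3D conditioning
)
```
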
### Warning:
This model was created without any sponsors or rented GPU hardware, so it is very limited in what it can generate.
It handles simple single objects such as 'chair' or 'table' fine, but more complex objects require more training (see the training dataset section).

### Usage:

Install:

```
pip install git+https://github.com/MarcusLoppe/meshgpt-pytorch.git
```
```
import torch

from meshgpt_pytorch import (
    MeshAutoencoder,
    MeshTransformer,
    mesh_render
)

device = "cuda" if torch.cuda.is_available() else "cpu"
transformer = MeshTransformer.from_pretrained("MarcusLoren/MeshGPT_tiny_alpha").to(device)

output = []
for text in ['bed', 'chair']:
    # face_coords: (batch, num faces, vertices (3), coordinates (3)); face_mask: (batch, num faces)
    face_coords, face_mask = transformer.generate(texts = [text], temperature = 0.0)
    output.append(face_coords)

# write all generated meshes into a single .obj file
mesh_render.combind_mesh('./render.obj', output)
```
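
A note on the parameters above: `temperature = 0.0` should make generation greedy and deterministic, so raising it slightly is a way to trade consistency for variety, and prompts are best kept close to the short object labels the model was trained on (see the training dataset section below).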

## Training dataset
I've only had access to the free-tier GPU on Kaggle, so this model was trained on just 4k models with at most 250 triangles each.
The dataset contains a total of 800 text labels, so what it can generate is limited.
The 3D models were sourced from [objaverse](https://huggingface.co/datasets/allenai/objaverse), [shapenet](https://huggingface.co/datasets/ShapeNet/shapenetcore-gltf) and [ModelNet40](https://www.kaggle.com/datasets/balraj98/modelnet40-princeton-3d-object-dataset/data).

## How it works:
MeshGPT uses an autoencoder which takes a 3D mesh (the codebase has support for quads, but that is not implemented in this model) and quantizes it into a codebook whose entries can be used as tokens.
The second part of MeshGPT is the transformer, which trains on the tokens generated by the autoencoder while cross-attending to a text embedding.

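In sketch form, those two training stages could look like the following; the forward signatures and the dummy data are assumptions drawn from the meshgpt-pytorch library, not a verbatim training script for this checkpoint:

```
import torch
from meshgpt_pytorch import MeshAutoencoder, MeshTransformer

# dummy mesh batch: coordinates in [-1, 1], 64 triangles over 192 vertices
vertices = torch.rand(1, 192, 3) * 2 - 1
faces = torch.arange(192).reshape(1, 64, 3)

# stage 1: train the autoencoder to reconstruct meshes (this learns the codebook)
autoencoder = MeshAutoencoder(num_discrete_coors = 128)
loss = autoencoder(vertices = vertices, faces = faces)
loss.backward()

# stage 2: train the transformer on the autoencoder's tokens,
# cross-attending to an embedding of the text label
transformer = MeshTransformer(
    autoencoder,
    dim = 512,
    max_seq_len = 768,
    condition_on_text = True
)
loss = transformer(vertices = vertices, faces = faces, texts = ['chair'])
loss.backward()
```
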
The final product is a tokenizer and a transformer that take a text embedding and then autoregressively generate a 3D model based on the text input.
The tokens generated by the transformer can then be converted into a 3D mesh using the autoencoder.

## Credits
The idea for MeshGPT came from the paper (https://arxiv.org/abs/2311.15475), but its creators didn't release any code or model.
Phil Wang (https://github.com/lucidrains) drew inspiration from the paper, made a ton of improvements over the paper's implementation, and created the repo: https://github.com/lucidrains/meshgpt-pytorch
My goal has been to figure out how to train MeshGPT and turn it into reality.<br/>
See my GitHub repo for a notebook on how to get started training your own MeshGPT! [MarcusLoppe/meshgpt-pytorch](https://github.com/MarcusLoppe/meshgpt-pytorch/)