---
license: apache-2.0
datasets:
- THEODOROS/Architext_v1
language:
- en
pipeline_tag: text-generation
tags:
- architecture
- design
---
# Architext GPT-J 6B

# Model Description
Architext GPT-J-6B is a transformer model trained with Ben Wang's Mesh Transformer JAX on the Pile and finetuned on a synthetically generated dataset of architectural layouts of apartments. From just a natural language prompt, it can generate a wide diversity of designs in a convenient geometric representation that can be used downstream in different design workflows.

# Training data
GPT-J 6B was pre-trained on the Pile, a large-scale curated dataset created by EleutherAI. It was then finetuned on synthetic data procedurally generated with the Rhinoceros/Grasshopper software suite. The model was finetuned for 1.25 billion tokens over 11,500 steps on a TPU v3-8. It was trained as an autoregressive language model, using cross-entropy loss to maximize the likelihood of predicting the next token correctly.
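The objective above can be illustrated in a few lines: the per-step loss is the negative log-probability the model assigned to the true next token (a toy illustration, not the actual training code):

```python
import math

def next_token_loss(probs, target_index):
    # Cross-entropy for a single step of an autoregressive language model:
    # the negative log-probability assigned to the correct next token.
    return -math.log(probs[target_index])

# Toy distribution over a 4-token vocabulary; the true next token is index 1.
loss = next_token_loss([0.1, 0.7, 0.1, 0.1], target_index=1)
```

Averaging this quantity over every position in the training sequences gives the loss that was minimized during finetuning.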

# Intended Use and Limitations
Architext models learn an inner representation of architectural design that can be used to generate a wide diversity of geometric designs, useful for many downstream design workflows and tasks. While the model could be adapted to many different design outputs, it is best at generating residential floor plans given a natural language prompt.

# How to use
This model can be easily loaded using the `AutoModelForCausalLM` functionality:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("architext/Architext-gptj-6B")
model = AutoModelForCausalLM.from_pretrained("architext/Architext-gptj-6B")
```
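Generation then follows the standard `transformers` sampling API; a minimal sketch (the sampling settings below are illustrative defaults, not values documented by the authors):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "architext/Architext-gptj-6B"

# Illustrative sampling settings: sampling (rather than greedy decoding)
# is what lets the model produce a diversity of layouts for one prompt.
GEN_KWARGS = dict(do_sample=True, top_p=0.95, temperature=0.8, max_new_tokens=256)

def generate_design(prompt: str) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, **GEN_KWARGS,
                            pad_token_id=tokenizer.eos_token_id)
    # Strip the prompt tokens and return only the generated continuation.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_design("a house with two bedrooms and one bathroom"))
```

Note that the full-precision 6B-parameter checkpoint requires substantial memory; loading in reduced precision or on accelerated hardware may be necessary in practice.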

# Limitations and Biases
The core functionality of Architext is taking a string of text and generating a design output by autoregressively predicting the next token. While language models are widely used for other tasks, many unknowns remain about this approach in a design context. Depending on the prompt description it is given, Architext will often generate a design that is not semantically correct, although it almost always generates designs that are valid (non-intersecting spaces, no orphan rooms). It is also limited to a narrow range of natural language prompts, specifically prompts that describe:

* typology: "a house with two bedrooms and three bathrooms" or "a house with six rooms"
* adjacency: "the bedroom is adjacent to the living room" or "the kitchen is not adjacent to the bathroom"
* location: "the bedroom is in the north side of the house" or "a bedroom is in the south east side of the house"

Of course, the generated designs are conceptual, and one should never depend on Architext to directly produce accurate construction documentation.
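A downstream workflow might screen generated layouts for the validity properties mentioned above. A minimal sketch of an overlap check, assuming rooms are reduced to axis-aligned bounding rectangles (an illustrative simplification, not the model's actual output format):

```python
from typing import NamedTuple

class Room(NamedTuple):
    name: str
    x0: float  # left
    y0: float  # bottom
    x1: float  # right
    y1: float  # top

def rooms_intersect(a: Room, b: Room) -> bool:
    # Two rectangles overlap iff they overlap on both axes;
    # rooms that merely share a wall do not count as intersecting.
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1

def layout_is_valid(rooms: list) -> bool:
    # A valid layout has no two rooms with overlapping interiors.
    return not any(rooms_intersect(a, b)
                   for i, a in enumerate(rooms)
                   for b in rooms[i + 1:])
```

A real pipeline would parse the model's textual output into geometry first; this check only covers the "non-intersecting spaces" property, not orphan-room detection.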

# Citation and Related Information
## BibTeX entry
To cite this model:

```
@article{galanos2023architext,
  title={Architext: Language-Driven Generative Architecture Design},
  author={Galanos, Theodoros and Liapis, Antonios and Yannakakis, Georgios N},
  journal={arXiv preprint arXiv:2303.07519},
  year={2023}
}
```

To cite the codebase that trained this model:

```
@misc{mesh-transformer-jax,
  author = {Wang, Ben},
  title = {{Mesh-Transformer-JAX: Model-Parallel Implementation of Transformer Language Model with JAX}},
  howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
  year = 2021,
  month = May
}
```

# Acknowledgements
This project would not have been possible without compute generously provided by Google through the TPU Research Cloud, which provided access to the Cloud TPU VMs used to finetune this model.