naoto0804 committed on
Commit
c964e3a
1 Parent(s): 8588b3e

Update README.md

Files changed (1)
  1. README.md +91 -51
README.md CHANGED
@@ -7,6 +7,7 @@ tags: []
7
 
8
  <!-- Provide a quick summary of what the model is/does. -->
9
 
 
10
 
11
 
12
  ## Model Details
@@ -17,61 +18,68 @@ tags: []
17
 
18
  This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
 
 
20
  - **Developed by:** [More Information Needed]
21
  - **Funded by [optional]:** [More Information Needed]
22
  - **Shared by [optional]:** [More Information Needed]
23
  - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
 
 
 
 
27
 
28
  ### Model Sources [optional]
29
 
30
  <!-- Provide the basic links for the model. -->
31
 
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
 
36
  ## Uses
37
 
38
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
 
40
- ### Direct Use
 
 
41
 
42
  <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
 
44
- [More Information Needed]
45
 
46
- ### Downstream Use [optional]
47
 
48
  <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
 
50
- [More Information Needed]
51
 
52
- ### Out-of-Scope Use
53
 
54
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
 
56
- [More Information Needed]
57
 
58
- ## Bias, Risks, and Limitations
59
 
60
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
 
62
- [More Information Needed]
63
 
64
- ### Recommendations
65
 
66
  <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
 
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
 
70
- ## How to Get Started with the Model
71
 
72
- Use the code below to get started with the model.
73
 
74
- [More Information Needed]
75
 
76
  ## Training Details
77
 
@@ -79,69 +87,92 @@ Use the code below to get started with the model.
79
 
80
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
 
82
- [More Information Needed]
83
-
84
- ### Training Procedure
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
85
 
86
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
 
88
- #### Preprocessing [optional]
89
 
90
- [More Information Needed]
91
 
 
92
 
93
- #### Training Hyperparameters
94
 
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
 
97
- #### Speeds, Sizes, Times [optional]
98
 
99
  <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
 
101
- [More Information Needed]
102
 
103
- ## Evaluation
104
 
105
  <!-- This section describes the evaluation protocols and provides the results. -->
106
 
107
- ### Testing Data, Factors & Metrics
108
 
109
  #### Testing Data
110
 
111
  <!-- This should link to a Dataset Card if possible. -->
112
 
113
- [More Information Needed]
114
 
115
- #### Factors
116
 
117
  <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
 
119
- [More Information Needed]
120
 
121
- #### Metrics
122
 
123
  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
 
125
- [More Information Needed]
126
 
127
- ### Results
128
 
129
- [More Information Needed]
130
 
131
- #### Summary
132
 
133
 
134
 
135
- ## Model Examination [optional]
136
 
137
  <!-- Relevant interpretability work for the model goes here -->
138
 
139
- [More Information Needed]
140
 
141
- ## Environmental Impact
142
 
143
  <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
 
 
 
145
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
 
147
  - **Hardware Type:** [More Information Needed]
@@ -168,32 +199,41 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
168
 
169
  [More Information Needed]
170
 
171
- ## Citation [optional]
172
 
173
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
 
175
- **BibTeX:**
176
 
177
- [More Information Needed]
 
 
 
 
 
 
 
178
 
 
179
  **APA:**
180
 
181
  [More Information Needed]
182
 
183
  ## Glossary [optional]
 
184
 
185
  <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
 
187
- [More Information Needed]
188
 
189
- ## More Information [optional]
190
 
191
- [More Information Needed]
192
 
193
- ## Model Card Authors [optional]
194
 
195
- [More Information Needed]
196
 
197
  ## Model Card Contact
198
 
199
- [More Information Needed]
 
7
 
8
  <!-- Provide a quick summary of what the model is/does. -->
9
 
10
+ This model is based on [LLaVA1.5-7b](https://huggingface.co/liuhaotian/llava-v1.5-7b) and is fine-tuned with LoRA on the [OpenCOLE1.0 dataset](naoto0804/opencole) to generate text layouts.
11
 
12
 
13
  ## Model Details
 
18
 
19
  This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
20
 
21
+ <!--
22
  - **Developed by:** [More Information Needed]
23
  - **Funded by [optional]:** [More Information Needed]
24
  - **Shared by [optional]:** [More Information Needed]
25
  - **Model type:** [More Information Needed]
26
+ -->
27
+
28
+ - **Language(s) (NLP):** English
29
+ - **License:**
30
+ Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
31
+
32
+ - **Finetuned from model:** [LLaVA1.5-7b](https://huggingface.co/liuhaotian/llava-v1.5-7b)
33
 
34
  ### Model Sources [optional]
35
 
36
  <!-- Provide the basic links for the model. -->
37
 
38
+ - **Repository:** [CyberAgentAILab/OpenCOLE](https://github.com/CyberAgentAILab/OpenCOLE)
39
+ - **Paper:** [OpenCOLE: Towards Reproducible Automatic Graphic Design Generation]()
40
+ <!-- **Demo [optional]:** [More Information Needed] -->
41
 
42
  ## Uses
43
 
44
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
45
 
46
+ Please refer to [OpenCOLE](https://github.com/CyberAgentAILab/OpenCOLE).
47
+
48
+ <!-- ### Direct Use -->
49
 
50
  <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
51
 
52
+ <!-- [More Information Needed] -->
53
 
54
+ <!-- ### Downstream Use [optional] -->
55
 
56
  <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
57
 
58
+ <!-- [More Information Needed] -->
59
 
60
+ <!-- ### Out-of-Scope Use -->
61
 
62
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
63
 
64
+ <!-- [More Information Needed] -->
65
 
66
+ <!-- ## Bias, Risks, and Limitations -->
67
 
68
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
69
 
70
+ <!-- [More Information Needed] -->
71
 
72
+ <!-- ### Recommendations -->
73
 
74
  <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
75
 
76
+ <!-- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. -->
77
 
78
+ <!--## How to Get Started with the Model -->
79
 
80
+ <!--Use the code below to get started with the model. -->
81
 
82
+ <!--[More Information Needed] -->
83
 
84
  ## Training Details
85
 
 
87
 
88
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
89
 
90
+ - About 18k image-text pairs extracted automatically from OpenCOLE
91
+
92
+ Below is an example.
93
+ ```
94
+ [
95
+ {
96
+ "id": "592d203395a7a863ddcd9df1",
97
+ "image": "images/592/592d203395a7a863ddcd9df1.png",
98
+ "conversations": [
99
+ {
100
+ "from": "human",
101
+ "value": "<image>\nGiven an image and text input including set of keywords to be placed on the image and its properties (optional), plan the layout of the texts. The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}\nthe object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{\"properties\": {\"elements\": {\"title\": \"Elements\", \"default\": [], \"type\": \"array\", \"items\": {\"$ref\": \"#/definitions/Element\"}}}, \"definitions\": {\"Element\": {\"title\": \"Element\", \"type\": \"object\", \"properties\": {\"text\": {\"title\": \"Text\", \"description\": \"Dummy text\", \"type\": \"string\"}, \"width\": {\"title\": \"Width\", \"description\": \"range: 0 <= width <= 127\", \"type\": \"integer\"}, \"height\": {\"title\": \"Height\", \"description\": \"range: 0 <= height <= 127\", \"type\": \"integer\"}, \"left\": {\"title\": \"Left\", \"description\": \"range: 0 <= left <= 127\", \"type\": \"integer\"}, \"top\": {\"title\": \"Top\", \"description\": \"range: 0 <= top <= 127\", \"type\": \"integer\"}, \"font\": {\"title\": \"Font\", \"type\": \"string\"}, \"color\": {\"title\": \"Color\", \"description\": \"range: 0 <= color <= 127\", \"type\": \"integer\"}, \"text_align\": {\"title\": \"Text Align\", \"description\": \"choices: \\\"\\\", \\\"left\\\", \\\"center\\\", \\\"right\\\"\", \"type\": \"string\"}, \"capitalize\": {\"title\": \"Capitalize\", \"description\": \"choices: \\\"false\\\", \\\"true\\\"\", \"type\": \"string\"}, \"font_size\": {\"title\": \"Font Size\", \"description\": \"range: 0 <= font_size <= 127\", \"type\": \"integer\"}, \"angle\": {\"title\": \"Angle\", \"description\": \"range: 0 <= angle <= 127\", \"type\": \"integer\"}, \"letter_spacing\": {\"title\": \"Letter Spacing\", \"description\": \"range: 0 <= letter_spacing <= 127\", \"type\": \"integer\"}, \"line_height\": {\"title\": \"Line Height\", \"description\": \"range: 0 <= line_height <= 127\", \"type\": \"integer\"}}, \"required\": [\"text\", \"width\", \"height\", \"left\", \"top\", \"font\", \"color\", \"text_align\", \"capitalize\", \"font_size\", \"angle\", \"letter_spacing\", \"line_height\"]}}}\n``` Input: [\"WE DON'T HAVE\\nANOTHER PLANET\", \"GREEN\", \"GO\"]"
102
+ },
103
+ {
104
+ "from": "gpt",
105
+ "value": "{\"elements\": [{\"text\": \"GO\", \"width\": 62, \"height\": 40, \"left\": 11, \"top\": 43, \"font\": \"Cormorant Infant\", \"color\": 38, \"text_align\": \"center\", \"capitalize\": \"false\", \"font_size\": 79, \"angle\": 0, \"letter_spacing\": 61, \"line_height\": 27}, {\"text\": \"GREEN\", \"width\": 69, \"height\": 30, \"left\": 6, \"top\": 60, \"font\": \"Cormorant Infant\", \"color\": 56, \"text_align\": \"center\", \"capitalize\": \"false\", \"font_size\": 67, \"angle\": 0, \"letter_spacing\": 50, \"line_height\": 27}, {\"text\": \"WE DON'T HAVE\\nANOTHER PLANET\", \"width\": 71, \"height\": 37, \"left\": 3, \"top\": 74, \"font\": \"Cormorant Infant\", \"color\": 56, \"text_align\": \"center\", \"capitalize\": \"false\", \"font_size\": 39, \"angle\": 0, \"letter_spacing\": 29, \"line_height\": 47}]}"
106
+ }
107
+ ]
108
+ },
109
+ ...
110
+ ```
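
The `"gpt"` turn in the record above is itself a JSON string, so a consumer has to decode it a second time before using the layout. The following is a minimal sketch of that step, not code from the OpenCOLE repository: the names `parse_layout` and `QUANTIZED_FIELDS` are hypothetical, and the 0..127 check simply mirrors the ranges stated in the schema embedded in the prompt.

```python
import json

# Integer fields that the schema in the example prompt constrains to 0..127.
# (Hypothetical helper constant; not part of OpenCOLE.)
QUANTIZED_FIELDS = (
    "width", "height", "left", "top", "color",
    "font_size", "angle", "letter_spacing", "line_height",
)


def parse_layout(response: str) -> list[dict]:
    """Decode a response string and return its list of text elements.

    Raises ValueError if a quantized field is not an integer in 0..127.
    """
    elements = json.loads(response).get("elements", [])
    for element in elements:
        for field in QUANTIZED_FIELDS:
            value = element[field]
            if not isinstance(value, int) or not 0 <= value <= 127:
                raise ValueError(f"{field}={value!r} is outside 0..127")
    return elements


# Shortened copy of the "gpt" turn from the example record (first element only).
response = (
    '{"elements": [{"text": "GO", "width": 62, "height": 40, "left": 11, '
    '"top": 43, "font": "Cormorant Infant", "color": 38, '
    '"text_align": "center", "capitalize": "false", "font_size": 79, '
    '"angle": 0, "letter_spacing": 61, "line_height": 27}]}'
)

for element in parse_layout(response):
    print(element["text"], element["left"], element["top"],
          element["width"], element["height"])
```

The values stay in the quantized 0..127 space here; how they map back to pixel coordinates, colors, or font sizes is not described in this card, so that conversion is left to the OpenCOLE repository.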
111
+
112
+ <!-- ### Training Procedure -->
113
 
114
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
115
 
116
+ <!-- #### Preprocessing [optional] -->
117
 
118
+ <!-- [More Information Needed] -->
119
 
120
+ <!-- #### Training Hyperparameters -->
121
 
122
+ <!-- - **Training regime:** [More Information Needed] -->
123
 
124
+ - <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
125
 
126
+ <!-- #### Speeds, Sizes, Times [optional] -->
127
 
128
  <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
129
 
130
+ <!-- [More Information Needed] -->
131
 
132
+ <!-- ## Evaluation -->
133
 
134
  <!-- This section describes the evaluation protocols and provides the results. -->
135
 
136
+ <!-- ### Testing Data, Factors & Metrics ->
137
 
138
  #### Testing Data
139
 
140
  <!-- This should link to a Dataset Card if possible. -->
141
 
142
+ <!-- [More Information Needed] -->
143
 
144
+ <!-- #### Factors -->
145
 
146
  <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
147
 
148
+ <!-- [More Information Needed] -->
149
 
150
+ <!-- #### Metrics -->
151
 
152
  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
153
 
154
+ <!-- [More Information Needed] -->
155
 
156
+ <!-- ### Results -->
157
 
158
+ <!-- [More Information Needed] -->
159
 
160
+ <!-- #### Summary -->
161
 
162
 
163
 
164
+ <!-- ## Model Examination [optional] -->
165
 
166
  <!-- Relevant interpretability work for the model goes here -->
167
 
168
+ <!-- [More Information Needed] -->
169
 
170
+ <!-- ## Environmental Impact -->
171
 
172
  <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
173
 
174
+ <!--
175
+
176
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
177
 
178
  - **Hardware Type:** [More Information Needed]
 
199
 
200
  [More Information Needed]
201
 
202
+ -->
203
 
204
+ ## Citation
205
 
206
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
207
 
208
+ ```
209
+ @inproceedings{inoue2024opencole,
210
+ title={{OpenCOLE: Towards Reproducible Automatic Graphic Design Generation}},
211
+ author={Naoto Inoue and Kento Masui and Wataru Shimoda and Kota Yamaguchi},
212
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
213
+ year={2024},
214
+ }
215
+ ```
216
 
217
+ <!--
218
  **APA:**
219
 
220
  [More Information Needed]
221
 
222
  ## Glossary [optional]
223
+ -->
224
 
225
  <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
226
 
227
+ <!-- [More Information Needed] -->
228
 
229
+ <!-- ## More Information [optional] -->
230
 
231
+ <!-- [More Information Needed] -->
232
 
233
+ <!-- ## Model Card Authors [optional] -->
234
 
235
+ <!-- [More Information Needed] -->
236
 
237
  ## Model Card Contact
238
 
239
+ [Naoto Inoue](https://github.com/naoto0804)