victormiller commited on
Commit
670d19f
·
verified ·
1 Parent(s): 56cb2bd

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +15 -2
README.md CHANGED
@@ -16,8 +16,21 @@ datasets:
16
 
17
  ## Model Description
18
 
19
- CrystalChat-7B based multi-modal large language model (MLLM) mimics the training recipe used for Vicuna-7B based [LLaVa-v1.5](https://huggingface.co/docs/transformers/main/model_doc/llava). CrystalChat-7B based MLLMs models are entirely transparent, having open-sourced all materials, including code, data, model checkpoint, intermediate results, and more at [TODO: Add paper link](). CrystalChat-7B-Web2Code MLLM is specialized in webpage images-to-html code generation.
20
-
 
 
 
 
 
 
 
 
 
 
 
 
 
21
 
22
  ## Evaluations
23
 
 
16
 
17
  ## Model Description
18
 
19
+ CrystalChat-7B based multi-modal large language model (MLLM) mimics the training recipe used for Vicuna-7B based [LLaVa-v1.5](https://huggingface.co/docs/transformers/main/model_doc/llava). CrystalChat-7B based MLLMs models are entirely transparent, having open-sourced all materials, including code, data, model checkpoint, intermediate results, and more at [Web2Code: A Large-scale Webpage-to-Code Dataset
20
+ and Evaluation Framework for Multimodal LLMs](https://arxiv.org/pdf/2406.20098). CrystalChat-7B-Web2Code MLLM is specialized in webpage images-to-html code generation.
21
+
22
+ ## Web2Code Dataset
23
+ Our Web2Code instruction tuning dataset construction and instruction generation process
24
+ involves four key components:
25
+ 1. Creation of new webpage image-code pair data: We generated
26
+ high-quality HTML webpage-code pairs following the CodeAlpaca prompt [6] using GPT-3.5 and
27
+ convert them into instruction-following data. (2) Refinement of existing webpage code generation
28
+ data: We transform existing datasets including WebSight [ 22 ] and Pix2Code [ 4] into an instruction-
29
+ following data format similar to LLaVA data [33 ], so they can be used as instruction-following data
30
+ to train MLLMs. (3) Creation of a new text question-answer pair data: We generated a new question-
31
+ answer pair dataset utilizing our new GPT-3.5 generated data from (1) for webpage understanding.
32
+ (4) Refinement of existing webpage understanding data: We refine the WebSRC [ 10] question-answer
33
+ data to improve its quality using the GPT-4.
34
 
35
  ## Evaluations
36