---
license: apache-2.0
datasets:
- aldente0630/LLaVA-Pretrain-Ko
- tabtoyou/KoLLaVA-v1.5-Instruct-581k
language:
- ko
- en
base_model:
- jjhsnail0822/danube-ko-1.8b-base
- openai/clip-vit-large-patch14-336
tags:
- h2o-danube2
- korean
- sLLM
- llm
- multimodal
- LLaVA
---

## Model Details

llava-danube-ko-1.8b-instruct is a Korean multi-modal language model based on [jjhsnail0822/danube-ko-1.8b-base](https://huggingface.co/jjhsnail0822/danube-ko-1.8b-base).

## Model Developers

Jinhong Jeong, Ungsang Yoon

## Model Architecture

We used the [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT) technique to add multi-modal capabilities to the model through two-stage visual instruction tuning. [jjhsnail0822/danube-ko-1.8b-base](https://huggingface.co/jjhsnail0822/danube-ko-1.8b-base) was used as the base LLM, and the vision encoder was fine-tuned from [openai/clip-vit-large-patch14-336](https://huggingface.co/openai/clip-vit-large-patch14-336). The model has a sequence length of 2048.
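
Since the checkpoint follows the LLaVA recipe, inference with the `transformers` LLaVA-NeXT classes might look like the minimal sketch below. This is an untested sketch: the repository id, the prompt template, and compatibility with the `LlavaNext*` classes are assumptions, not guarantees from this card.

```python
# Minimal inference sketch -- untested; assumptions are flagged in comments.
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

# Assumption: this is the published repo id for the instruct checkpoint.
model_id = "jjhsnail0822/llava-danube-ko-1.8b-instruct"

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 1.8B model fits on a single consumer GPU
    device_map="auto",
)

image = Image.open("example.jpg")
# Assumption: the prompt format; check the model's actual chat template.
prompt = "<image>\n이 이미지를 설명해 주세요."  # "Please describe this image."

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```

If the weights were released only in the original LLaVA-NeXT codebase format rather than as a `transformers` checkpoint, loading would instead go through that repository's own scripts.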
## Training Datasets

In the multi-modal pretraining stage, images filtered from the LAION/CC/SBU dataset were used. For the visual instruction tuning stage, we prepared the training data from the COCO, GQA, and Visual Genome datasets, together with the EKVQA dataset from AI-Hub. About 90GB of compressed image data was used for the whole training process.
## Model Benchmark

TBA
## Disclaimer

The model can generate content that is biased, discriminatory, or otherwise socially inappropriate, and it can also produce inaccurate information. Use of the model is at your own risk, and the developers are not responsible for the information it generates.