Update README.md
*Visual instruction tuning towards building large language and vision models with GPT-4 level capabilities in the biomedicine space.*

[[Paper, NeurIPS 2023 Datasets and Benchmarks Track (Spotlight)](https://arxiv.org/abs/2306.00890)]

[LLaVA-Med Github Repository](https://github.com/microsoft/LLaVA-Med)

[Chunyuan Li*](https://chunyuan.li/), [Cliff Wong*](https://scholar.google.com/citations?user=Sl05ifcAAAAJ&hl=en), [Sheng Zhang*](https://scholar.google.com/citations?user=-LVEXQ8AAAAJ&hl=en), [Naoto Usuyama](https://www.microsoft.com/en-us/research/people/naotous/), [Haotian Liu](https://hliu.cc), [Jianwei Yang](https://jwyang.github.io/), [Tristan Naumann](https://scholar.google.com/citations?user=cjlSeqwAAAAJ&hl=en), [Hoifung Poon](https://scholar.google.com/citations?user=yqqmVbkAAAAJ&hl=en), [Jianfeng Gao](https://scholar.google.com/citations?user=CQ1cqKkAAAAJ&hl=en) (*Equal Contribution)

Further, this model was developed in part using the [PMC-15M](https://aka.ms/biomedclip-paper) dataset. The figure-caption pairs that make up this dataset may contain biases reflecting the current practice of academic publication. For example, the corresponding papers may be enriched for positive findings, contain examples of extreme cases, and otherwise reflect distributions that are not representative of other sources of biomedical data.

## Install

1. Clone the [LLaVA-Med Github repository](https://github.com/microsoft/LLaVA-Med) and navigate to the LLaVA-Med folder
```bash
git clone https://github.com/microsoft/LLaVA-Med.git
cd LLaVA-Med
```

2. Install Package: Create conda environment
```Shell
conda create -n llava-med python=3.10 -y
conda activate llava-med
pip install --upgrade pip  # enable PEP 660 support
```

3. Install additional packages for training cases
```Shell
pip uninstall torch torchvision -y
pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu117
pip install openai==0.27.8
pip uninstall transformers -y
pip install git+https://github.com/huggingface/transformers@cae78c46
pip install -e .
```
```Shell
pip install einops ninja open-clip-torch
pip install flash-attn --no-build-isolation
```
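
After the steps above, a quick sanity check can confirm that the pinned PyTorch build, CUDA support, and the pinned `transformers` revision are visible inside the `llava-med` environment. This is only a suggested check, not part of the official instructions:

```Shell
# Optional sanity check (suggested, not from the official README):
# the torch version should report the cu117 build installed in step 3.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import transformers; print(transformers.__version__)"
```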

## Serving

The model weights above are *delta* weights. The usage of LLaVA-Med checkpoints should comply with the base LLM's model license: [LLaMA](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md).
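
As a sketch of what merging the delta looks like, the upstream LLaVA codebase exposes an `apply_delta` entry point; the module path, flag names, and paths below are assumptions based on that convention rather than a confirmed LLaVA-Med command:

```Shell
# Hypothetical example of applying delta weights onto a base LLaMA checkpoint
# (module path and flags assumed from the upstream LLaVA convention; all paths are placeholders).
python -m llava.model.apply_delta \
    --base /path/to/llama-7b \
    --target /path/to/output/llava-med-7b \
    --delta /path/to/llava-med-7b-delta
```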