SFXX committed
Commit 67c4199
1 Parent(s): 2873e87

Update README.md

Files changed (1)
  1. README.md +8 -7
README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: image-text-to-text
 
 # Model description
 
- `BLIP3` is a series of foundational vision-language models (VLMs) developed by Salesforce AI Research. \
+ `BLIP3` is a series of foundational Large Multimodal Models (LMMs) developed by Salesforce AI Research. \
 These models have been trained at scale on high-quality image caption datasets and interleaved image-text data. BLIP3 highlights a few features below,
 
 * The **pretrained** foundation model, `blip3-phi3-mini-base-r-v1`, achieves state-of-the-art performance under 5b parameters and demonstrates strong in-context learning capabilities.
@@ -49,12 +49,6 @@ More technical details will come with a technical report soon.
 | **blip3-phi3-mini-instruct-r-v1 (Ours)** | **72.1** | **74.1** | **1827** | 1467 | **360** | **44.6** | 39.8 | **45.1** | **39.3** | **74.2** | 87.2 | **75.8** | |
 
 
- # Bias, Risks, Limitations, and Ethical Considerations
- We removed Laion from our training data due to known CSAM concerns.
- The other main data sources are from the internet, including webpages,
- image stock sites, and curated datasets released by the research community.
- The model may be subject to bias from the original data source, as well as bias from LLMs and commercial APIs.
- We strongly recommend users conduct an assessment of safety and fairness before applying to downstream applications.
 # How to use
 
 > We require the use of the development version (`"4.41.0.dev0"`) of the `transformers` library. To get it, as of 05/07/2024, one can use `pip uninstall -y transformers && pip install git+https://github.com/huggingface/transformers.`
@@ -116,6 +110,13 @@ More comprehensive examples can be found in the [notebook](demo.ipynb).
 Our SFT evaluation is based on the VLMEvalKit, in which we fixed some inconsistencies with the official benchmarks (e.g., LLM judge API). During our development, we noticed that the raw resolution of the input image would noticeably affect the model output in some cases.
 
 
+ # Bias, Risks, Limitations, and Ethical Considerations
+ The main data sources are from the internet, including webpages,
+ image stock sites, and curated datasets released by the research community. We have excluded certain data, such as LAION, due to known CSAM concerns.
+ The model may be subject to bias from the original data source, as well as bias from LLMs and commercial APIs.
+ We strongly recommend users assess safety and fairness before applying to downstream applications.
+
+
 # License
 
 Our code and weights are released under the Creative Commons Attribution Non Commercial 4.0 [LICENSE](LICENSE.txt). Please fill out a form at [here](https://forms.gle/ffPc9oZC2ZGeJ1N68) to consult the commercial use of model weights.
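The README above pins a development build of `transformers` (`"4.41.0.dev0"`). Below is a minimal sketch, added here for illustration and not part of the commit, that checks the installed version against that pin; `packaging` is assumed to be available (it is a dependency of `transformers`).

```python
# Minimal sketch (illustration only, not from the commit): verify the
# development-build requirement quoted in the README above.
import transformers
from packaging import version

REQUIRED = version.parse("4.41.0.dev0")
installed = version.parse(transformers.__version__)

if installed < REQUIRED:
    raise RuntimeError(
        f"transformers {installed} found, but the README asks for {REQUIRED} or newer. "
        "Install it with: pip uninstall -y transformers && "
        "pip install git+https://github.com/huggingface/transformers"
    )
print(f"transformers {installed} satisfies the README requirement.")
```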
 
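The "How to use" section defers to the repository's `demo.ipynb` for complete examples. As a rough loading sketch only: the Hub repo id and the `Auto*` entry points below are assumptions made for illustration, not details confirmed by this commit, so treat `demo.ipynb` as the authoritative reference.

```python
# Hedged sketch: load the instruct checkpoint through transformers remote code.
# The repo id and the Auto* classes are assumptions for illustration only;
# the repository's demo.ipynb is the maintained example.
from transformers import AutoImageProcessor, AutoModelForVision2Seq, AutoTokenizer

repo_id = "Salesforce/blip3-phi3-mini-instruct-r-v1"  # assumed Hub id

model = AutoModelForVision2Seq.from_pretrained(repo_id, trust_remote_code=True).eval()
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True, use_fast=False)
image_processor = AutoImageProcessor.from_pretrained(repo_id, trust_remote_code=True)
```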
 
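The evaluation note above mentions that the raw resolution of the input image can noticeably affect the output. Here is a small, hedged preprocessing sketch for keeping resolution consistent before the image processor runs; the 768-pixel longest side is an arbitrary placeholder, not a value recommended by the authors.

```python
# Hedged sketch: resize to a consistent longest side before preprocessing,
# since the README notes that raw resolution can affect outputs in some cases.
# The 768-pixel target is an arbitrary illustration, not an author recommendation.
from PIL import Image


def load_with_fixed_longest_side(path: str, longest_side: int = 768) -> Image.Image:
    img = Image.open(path).convert("RGB")
    scale = longest_side / max(img.size)
    new_size = (max(1, round(img.width * scale)), max(1, round(img.height * scale)))
    return img.resize(new_size, Image.Resampling.LANCZOS)
```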