psamal committed on
Commit 66a3823 · verified · 1 Parent(s): 3b2a985

Update README: Qwen model link update

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -6,7 +6,7 @@ language:
  license: other
  license_name: webai-non-commercial-license-v1.0
  license_link: https://huggingface.co/webAI-Official/webAI-ColVec1-9b/blob/main/LICENSE.md
- base_model: Qwen/Qwen3.5-4B
+ base_model: Qwen/Qwen3.5-9B
  tags:
  - text
  - image
@@ -22,7 +22,7 @@ tags:
 
  ## ⚡ Summary
 
- **webAI-Official/webAI-ColVec1-9b** is a state-of-the-art [ColBERT](https://arxiv.org/abs/2407.01449)-style multimodal embedding model based on *[Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B)*. It maps text queries, visual documents (images, PDFs) into aligned multi-vector embeddings.
+ **webAI-Official/webAI-ColVec1-9b** is a state-of-the-art [ColBERT](https://arxiv.org/abs/2407.01449)-style multimodal embedding model based on *[Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B)*. It maps text queries, visual documents (images, PDFs) into aligned multi-vector embeddings.
 
  The model has been fine-tuned on a **merged multimodal dataset** of ~2M question-image pairs, including [DocVQA](https://huggingface.co/datasets/lmms-lab/DocVQA), [PubTables-1M](https://huggingface.co/datasets/bsmock/pubtables-1m), [TAT-QA](https://huggingface.co/datasets/next-tat/TAT-QA), [ViDoRe-ColPali-Training](https://huggingface.co/datasets/vidore/colpali_train_set), [VDR Multilingual](https://huggingface.co/datasets/llamaindex/vdr-multilingual-train), [VisRAG-Ret-Train-In-domain-data](https://huggingface.co/datasets/openbmb/VisRAG-Ret-Train-In-domain-data), [VisRAG-Ret-Train-Synthetic-data](https://huggingface.co/datasets/openbmb/VisRAG-Ret-Train-Synthetic-data) and proprietary domain-specific synthetic data
 
@@ -33,7 +33,7 @@ The datasets were filtered, balanced, and merged to produce a comprehensive trai
 
  | Feature | Detail |
  | --------------------- | ------------------------------------------------------------------------- |
- | **Architecture** | Qwen3.5-4B Vision-Language Model (VLM) + `2560 dim` Linear Projection Head |
+ | **Architecture** | Qwen3.5-9B Vision-Language Model (VLM) + `2560 dim` Linear Projection Head |
  | **Methodology** | ColBERT-style Late Interaction (MaxSim scoring) |
  | **Output** | Multi-vector (Seq_Len × *2560*), L2-normalized |
  | **Modalities** | Text Queries, Images (Documents) |
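
Note on the Methodology and Output rows in the table above: the README names ColBERT-style late interaction with MaxSim scoring over L2-normalized multi-vector embeddings. The sketch below is not taken from the model's code; it is a minimal pure-Python illustration of how MaxSim is commonly computed. The function names `l2_normalize` and `maxsim_score` are hypothetical, and the 2-dimensional vectors are toy stand-ins for the 2560-dimensional outputs listed in the table.

```python
import math
from typing import List

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def maxsim_score(query: List[Vector], doc: List[Vector]) -> float:
    """Late-interaction score (hypothetical helper, not the model's API).

    For each query vector, take the largest dot product against any
    document vector, then sum those maxima over all query vectors.
    """
    q = [l2_normalize(v) for v in query]
    d = [l2_normalize(v) for v in doc]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy example: 2 query vectors and 3 document vectors, 2 dimensions each.
query_vecs = [[1.0, 0.0], [0.0, 2.0]]
doc_vecs = [[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score = maxsim_score(query_vecs, doc_vecs)
```

Because every vector is unit length after normalization, each per-query maximum is a cosine similarity, so the score is bounded above by the number of query vectors.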