Update README: Qwen model link update
README.md CHANGED

```diff
@@ -6,7 +6,7 @@ language:
 license: other
 license_name: webai-non-commercial-license-v1.0
 license_link: https://huggingface.co/webAI-Official/webAI-ColVec1-9b/blob/main/LICENSE.md
-base_model: Qwen/Qwen3.5-
+base_model: Qwen/Qwen3.5-9B
 tags:
 - text
 - image
@@ -22,7 +22,7 @@ tags:
 
 ## ⚡ Summary
 
-**webAI-Official/webAI-ColVec1-9b** is a state-of-the-art [ColBERT](https://arxiv.org/abs/2407.01449)-style multimodal embedding model based on *[Qwen/Qwen3.5-
+**webAI-Official/webAI-ColVec1-9b** is a state-of-the-art [ColBERT](https://arxiv.org/abs/2407.01449)-style multimodal embedding model based on *[Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B)*. It maps text queries and visual documents (images, PDFs) into aligned multi-vector embeddings.
 
 The model has been fine-tuned on a **merged multimodal dataset** of ~2M question-image pairs, including [DocVQA](https://huggingface.co/datasets/lmms-lab/DocVQA), [PubTables-1M](https://huggingface.co/datasets/bsmock/pubtables-1m), [TAT-QA](https://huggingface.co/datasets/next-tat/TAT-QA), [ViDoRe-ColPali-Training](https://huggingface.co/datasets/vidore/colpali_train_set), [VDR Multilingual](https://huggingface.co/datasets/llamaindex/vdr-multilingual-train), [VisRAG-Ret-Train-In-domain-data](https://huggingface.co/datasets/openbmb/VisRAG-Ret-Train-In-domain-data), [VisRAG-Ret-Train-Synthetic-data](https://huggingface.co/datasets/openbmb/VisRAG-Ret-Train-Synthetic-data), and proprietary domain-specific synthetic data.
 
@@ -33,7 +33,7 @@ The datasets were filtered, balanced, and merged to produce a comprehensive trai
 
 | Feature               | Detail                                                                     |
 | --------------------- | -------------------------------------------------------------------------- |
-| **Architecture**      | Qwen3.5-
+| **Architecture**      | Qwen3.5-9B Vision-Language Model (VLM) + `2560 dim` Linear Projection Head |
 | **Methodology**       | ColBERT-style Late Interaction (MaxSim scoring)                            |
 | **Output**            | Multi-vector (Seq_Len × *2560*), L2-normalized                             |
 | **Modalities**        | Text Queries, Images (Documents)                                           |
```