alexchen4ai
committed
Update README.md
README.md CHANGED
@@ -2,7 +2,7 @@
 license: cc-by-nc-4.0
 base_model: Qwen/Qwen2-7B-Instruct
 model-index:
-- name: Dolphin
+- name: Squid
   results: []
 tags:
 - RAG
@@ -14,7 +14,7 @@ spaces: false
 language:
 - en
 ---
-# Dolphin: Long Context as a New Modality for on-device RAG
+# Squid: Long Context as a New Modality for on-device RAG
 
 <p align="center">
 - <a href="https://www.nexaai.com/models" target="_blank">Nexa Model Hub</a>
@@ -26,7 +26,7 @@ language:
 </p>
 
 ## Overview
-Dolphin is a novel approach that accelerates language model inference by treating long context as a new modality, analogous to the image, audio, and video modalities in vision-language models. It uses a language encoder to compress context information into embeddings, applying multimodal-model techniques to make language model inference more efficient. Model highlights:
+Squid is a novel approach that accelerates language model inference by treating long context as a new modality, analogous to the image, audio, and video modalities in vision-language models. It uses a language encoder to compress context information into embeddings, applying multimodal-model techniques to make language model inference more efficient. Model highlights:
 - 🧠 Context as a distinct modality
 - 🗜️ Language encoder for context compression
 - 🔗 Multimodal techniques applied to language processing
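As a rough illustration of the compression claim above (not code from this repo): once a long context is distilled into a fixed, much smaller set of embeddings, the main decoder attends over the compressed length rather than the raw token count. A toy sketch with mean-pooling standing in for Squid's learned encoder, and all shapes assumed:

```python
import torch

# Illustrative numbers only: 8192 context tokens, a 3584-dim hidden size
# (Qwen2-7B's), and 64-token chunks pooled into one embedding each.
T, D, CHUNK = 8192, 3584, 64
token_emb = torch.randn(1, T, D)   # stand-in for the context's token embeddings

# Mean-pool each chunk into a single vector: the main model now attends over
# T // CHUNK = 128 "context embeddings" instead of 8192 tokens.
ctx_emb = token_emb.view(1, T // CHUNK, CHUNK, D).mean(dim=2)
print(ctx_emb.shape)               # torch.Size([1, 128, 3584])
```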
@@ -34,7 +34,7 @@ Dolphin is a novel approach that accelerates language model inference by treating long context as a new modality
 - 📜 Specialized for long context understanding
 
 ## Model Architecture
-Dolphin employs a decoder-decoder framework with two main components:
+Squid employs a decoder-decoder framework with two main components:
 1. A smaller decoder (0.5B parameters) that transforms information from extensive contexts
 2. A larger decoder (7B parameters) that comprehends and generates responses to the current query
 The architecture also includes a projector to align embeddings between the text encoder and the main decoder.
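To make the data flow of these pieces concrete, here is a minimal PyTorch sketch, assuming Hugging Face `transformers`-style models and the Qwen2-0.5B/7B hidden sizes (896 and 3584). `ContextProjector` and `answer` are illustrative names, not this repo's actual API, and the real encoder/projector weights come from Squid's training:

```python
import torch
import torch.nn as nn

ENC_DIM, DEC_DIM = 896, 3584  # hidden sizes of Qwen2-0.5B and Qwen2-7B

class ContextProjector(nn.Module):
    """Aligns the small decoder's context embeddings with the 7B decoder's
    embedding space (a two-layer MLP is an assumption, not the repo's design)."""
    def __init__(self, enc_dim: int = ENC_DIM, dec_dim: int = DEC_DIM):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(enc_dim, dec_dim),
            nn.GELU(),
            nn.Linear(dec_dim, dec_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

def answer(query_ids, context_ids, encoder, projector, decoder):
    """Encode the long context once with the 0.5B model, project it, and
    prepend it to the query embeddings consumed by the 7B model."""
    with torch.no_grad():
        # Last hidden states of the small decoder serve as context embeddings.
        ctx_states = encoder(context_ids, output_hidden_states=True).hidden_states[-1]
    ctx_emb = projector(ctx_states)                    # (B, T_ctx, DEC_DIM)
    q_emb = decoder.get_input_embeddings()(query_ids)  # (B, T_q, DEC_DIM)
    inputs = torch.cat([ctx_emb, q_emb], dim=1)        # context as a "modality"
    return decoder(inputs_embeds=inputs).logits
```

The sketch only shows wiring; in practice the context is also compressed to far fewer positions than the raw token count, which is where the inference speedup comes from.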
@@ -131,7 +131,7 @@ If you use Dolphin in your research, please cite our paper:
 
 ```bibtex
 @article{chen2024dolphinlongcontextnew,
-  title={Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models},
+  title={Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models},
   author={Wei Chen and Zhiyuan Li and Shuo Xin and Yihao Wang},
   year={2024},
   eprint={2408.15518},