How to use gomyk/me5s-me5s_compressed_v3_distilled with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("gomyk/me5s-me5s_compressed_v3_distilled")
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Compact multilingual sentence encoder compressed from intfloat/multilingual-e5-small (9x compression).

| Property | Value |
|---|---|
| Base model | intfloat/multilingual-e5-small |
| Architecture | bert (encoder) |
| Hidden dim | 384 (unchanged from base) |
| Layers | 4 (from 12) |
| Intermediate | 1536 |
| Attention heads | 12 |
| Vocab size | 15,424 (from 250,037) |
| Parameters | ~13.2M |
| Model size (FP32) | 51.0MB |
| Compression | 9x |
| Distilled | Yes |
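
As a rough sanity check on the table, the parameter count can be estimated from the listed dimensions. The sketch below assumes a standard BERT encoder layout and two values that this card does not state, 512 position embeddings and 2 token-type embeddings (both are assumptions). It also leaves out any pooler layer, and `bert_param_count` is an illustrative helper, not part of any library.

```python
def bert_param_count(vocab, hidden, layers, intermediate,
                     max_positions=512, type_vocab=2):
    """Estimate the parameters of a standard BERT encoder (no pooler).

    max_positions and type_vocab are assumed defaults, not values from the card.
    """
    layer_norm = 2 * hidden  # weight + bias
    embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm
    # Q, K, V and output projections (weights + biases), then LayerNorm
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Up projection, down projection (weights + biases), then LayerNorm
    feed_forward = (hidden * intermediate + intermediate
                    + intermediate * hidden + hidden
                    + layer_norm)
    return embeddings + layers * (attention + feed_forward)


params = bert_param_count(vocab=15_424, hidden=384, layers=4, intermediate=1_536)
fp32_mib = params * 4 / 2**20  # 4 bytes per FP32 weight
print(f"{params / 1e6:.1f}M parameters, ~{fp32_mib:.0f} MiB in FP32")
```

Under these assumptions the estimate comes to about 13.2M parameters, in line with the table. The token embedding matrix alone (15,424 x 384) accounts for roughly 5.9M of them, which is why the vocabulary size matters so much for the final model size.
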
from sentence_transformers import SentenceTransformer
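# "me5s_compressed_v3_distilled" is a local directory name;
# to load from the Hub, use "gomyk/me5s-me5s_compressed_v3_distilled" instead.
model = SentenceTransformer("me5s_compressed_v3_distilled", trust_remote_code=True)
sentences = [
    "Hello, how are you?",
    "안녕하세요, 잘 지내세요?",
    "こんにちは、元気ですか?",
    "你好,你好吗?",
]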
embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 384)

Overall Average: 54.35%

| Task Group | Average |
|---|---|
| Classification | 58.21% |
| Clustering | 30.92% |
| STS | 70.01% |

Classification:

| Task | Average | Details |
|---|---|---|
| AmazonCounterfactualClassification | 68.96% | de: 72.24%, en-ext: 71.78%, en: 70.88%, ja: 60.94% |
| Banking77Classification | 67.0% | default: 67.0% |
| ImdbClassification | 60.0% | default: 60.0% |
| MTOPDomainClassification | 81.87% | en: 86.66%, es: 84.13%, hi: 81.63%, th: 80.64%, de: 79.96% |
| MassiveIntentClassification | 33.14% | en: 60.73%, ja: 56.7%, zh-CN: 55.96%, pt: 55.5%, it: 54.43% |
| MassiveScenarioClassification | 40.53% | en: 67.02%, zh-CN: 65.25%, ja: 64.48%, de: 62.75%, ko: 62.43% |
| ToxicConversationsClassification | 54.24% | default: 54.24% |
| TweetSentimentExtractionClassification | 59.93% | default: 59.93% |

Clustering:

| Task | Average | Details |
|---|---|---|
| ArXivHierarchicalClusteringP2P | 49.09% | default: 49.09% |
| ArXivHierarchicalClusteringS2S | 45.73% | default: 45.73% |
| BiorxivClusteringP2P.v2 | 19.77% | default: 19.77% |
| MedrxivClusteringP2P.v2 | 24.87% | default: 24.87% |
| MedrxivClusteringS2S.v2 | 21.53% | default: 21.53% |
| StackExchangeClustering.v2 | 39.58% | default: 39.58% |
| StackExchangeClusteringP2P.v2 | 31.91% | default: 31.91% |
| TwentyNewsgroupsClustering.v2 | 14.86% | default: 14.86% |

STS:

| Task | Average | Details |
|---|---|---|
| BIOSSES | 72.19% | default: 72.19% |
| SICK-R | 74.61% | default: 74.61% |
| STS12 | 73.56% | default: 73.56% |
| STS13 | 73.22% | default: 73.22% |
| STS14 | 73.27% | default: 73.27% |
| STS15 | 82.2% | default: 82.2% |
| STS17 | 58.93% | en-en: 84.37%, es-es: 79.99%, ko-ko: 71.8%, ar-ar: 67.21%, fr-en: 64.15% |
| STS22.v2 | 45.36% | fr: 67.64%, es: 64.13%, es-en: 61.91%, en: 60.63%, it: 60.07% |
| STSBenchmark | 77.67% | default: 77.67% |
| STSBenchmarkMultilingualSTS | 69.05% | en: 77.67%, es: 73.78%, fr: 73.75%, pt: 71.23%, it: 70.45% |
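
The group and overall averages reported above appear to be plain unweighted means of the per-task averages, with the overall figure taken over all 26 tasks rather than over the three group averages. A short check using only the numbers from the tables (the variable names are illustrative):

```python
from statistics import mean

# Per-task averages (%) copied from the three task tables above.
scores = {
    "Classification": [68.96, 67.0, 60.0, 81.87, 33.14, 40.53, 54.24, 59.93],
    "Clustering": [49.09, 45.73, 19.77, 24.87, 21.53, 39.58, 31.91, 14.86],
    "STS": [72.19, 74.61, 73.56, 73.22, 73.27, 82.2, 58.93, 45.36, 77.67, 69.05],
}

# Unweighted mean within each task group
group_avg = {group: round(mean(vals), 2) for group, vals in scores.items()}

# Unweighted mean over every task, regardless of group
all_tasks = [v for vals in scores.values() for v in vals]
overall = round(mean(all_tasks), 2)

print(group_avg)
print(overall)
```

This reproduces 58.21, 30.92 and 70.01 for the groups and 54.35 overall. Averaging the three group figures instead would give about 53.05, so the overall score leans toward STS, which contributes the most tasks.
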
- Base model: intfloat/multilingual-e5-small (12L, 384d)
- `byte_fallback=true`
- Byte tokens (`<0x00>`~`<0xFF>`) added to vocab
- No `<unk>` tokens for any Unicode input

16 languages x 50 parallel sentences from the MASSIVE dataset. Ideal: same-meaning sentences cluster together across languages.

Languages: ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, pl
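
The `byte_fallback=true` note can be illustrated without the tokenizer itself. In SentencePiece-style byte fallback, a character missing from the vocabulary is emitted as one `<0xNN>` token per UTF-8 byte, so the 256 byte tokens `<0x00>`~`<0xFF>` are enough to cover any Unicode input. The sketch below is a simplified stand-in, not this model's tokenizer: `toy_vocab` and `encode_with_byte_fallback` are hypothetical, and the helper works per character rather than per subword.

```python
def encode_with_byte_fallback(text, vocab):
    """Toy per-character encoder: in-vocab characters pass through,
    anything else becomes one <0xNN> token per UTF-8 byte."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens


toy_vocab = set("abcdefghijklmnopqrstuvwxyz ")  # hypothetical pruned vocabulary
print(encode_with_byte_fallback("café", toy_vocab))
# ['c', 'a', 'f', '<0xC3>', '<0xA9>']
```

The trade-off is sequence length: a character dropped from the pruned vocabulary costs two to four tokens instead of one, but it is never collapsed into an unknown token.
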