SportsBERT Small Embeddings

This is a SportsBERT Small model fine-tuned using sentence-transformers. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.

The training dataset was generated using a random sample of Wikipedia articles labeled as sports.

The model was trained by distilling embeddings from the larger DenseOn model using EmbedDistillLoss over the generated training dataset.
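
As a rough illustration of what distilling embeddings means, the sketch below computes a mean squared error between student and teacher vectors for the same sentences. This is a generic objective and is not taken from EmbedDistillLoss, whose exact formulation may differ. The function name and the toy vectors are hypothetical, and the sketch assumes the teacher vectors have already been reduced or projected to the student's dimension.

```python
def distill_mse(student, teacher):
    """Mean squared error between paired student and teacher embeddings."""
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student, teacher):
        if len(s_vec) != len(t_vec):
            raise ValueError("student and teacher vectors must have the same dimension")
        total += sum((s - t) ** 2 for s, t in zip(s_vec, t_vec))
        count += len(s_vec)
    return total / count if count else 0.0

# Toy 3-dimensional vectors standing in for real embeddings (assumption)
teacher = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
student = [[0.0, 0.5, 0.0], [1.0, 0.0, 0.0]]
loss = distill_mse(student, teacher)
print(loss)
```

Training on an objective of this kind pushes the value toward zero, so the small model learns to reproduce the larger model's embedding space.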

As noted in the paper "Well-Read Students Learn Better: On the Importance of Pre-training Compact Models", it's important that the base model is pretrained on a large corpus of relevant documents prior to distillation.

Usage (txtai)

This model can be used to build embeddings databases with txtai for semantic search and/or as a knowledge source for retrieval augmented generation (RAG).

import txtai

embeddings = txtai.Embeddings(path="neuml/sportsbert-small-embeddings", content=True)
# documents() is a placeholder for an iterable of texts to index
embeddings.index(documents())

# Run a query
embeddings.search("query to run")

Usage (Sentence-Transformers)

Alternatively, the model can be loaded with sentence-transformers.

from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer("neuml/sportsbert-small-embeddings")
embeddings = model.encode(sentences)
print(embeddings)

Usage (Hugging Face Transformers)

The model can also be used directly with Transformers.

from transformers import AutoTokenizer, AutoModel
import torch

# Mean Pooling - Take attention mask into account for correct averaging
def meanpooling(output, mask):
    embeddings = output[0] # First element of output contains all token embeddings
    mask = mask.unsqueeze(-1).expand(embeddings.size()).float()
    return torch.sum(embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("neuml/sportsbert-small-embeddings")
model = AutoModel.from_pretrained("neuml/sportsbert-small-embeddings")

# Tokenize sentences
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    output = model(**inputs)

# Perform pooling. In this case, mean pooling.
embeddings = meanpooling(output, inputs['attention_mask'])

print("Sentence embeddings:")
print(embeddings)
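
Whichever loading method is used, semantic search comes down to comparing a query vector with document vectors. The sketch below ranks documents by cosine similarity. The helper names `cosine` and `rank` are hypothetical, and the 2-dimensional toy vectors stand in for the 384-dimensional vectors the model produces.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, documents):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real model output (assumption)
docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
results = rank([0.0, 1.0], docs)
print(results)
```

Because the functions only iterate over their inputs, they should also accept the rows of the array returned by `model.encode`, although a vectorized implementation would be faster for large collections.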

Evaluation Results

A BEIR-compatible dataset was generated to facilitate the evaluation process. This is a separate random sample of Wikipedia articles alongside generated user queries.

Evaluation results are shown below. NDCG is used as the evaluation metric.
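
For readers unfamiliar with the metric, here is a minimal sketch of NDCG at a cutoff k, using linear gain and a log2 rank discount. The cutoff of 10, the function names, and the toy relevance judgments are assumptions for illustration; the evaluation above does not state which cutoff or tooling it used.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(ranked_ids, qrels, k=10):
    """NDCG@k for one query.

    ranked_ids: document ids in the order returned by the search
    qrels: mapping of document id to graded relevance for this query
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy example: one relevant document, returned first or second
perfect = ndcg_at_k(["a", "b"], {"a": 1})
second = ndcg_at_k(["b", "a"], {"a": 1})
print(perfect, second)
```

A score for a whole dataset is typically the mean of this per-query value, often reported on a 0 to 100 scale as in the table.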

| Model | Parameters | NDCG | Index Time | Search Time | Disk |
|---|---|---|---|---|---|
| SportsBERT Small Embeddings | 22.7M | 47.68 | 3.02s | 0.33s | 16 MB |
| all-MiniLM-L6-v2 | 22.7M | 41.23 | 3.40s | 0.35s | 16 MB |
| DenseOn | 149M | 48.86 | 16.40s | 0.71s | 31 MB |
| EmbeddingGemma | 300M | 50.20 | 23.65s | 1.49s | 31 MB |
| Qwen3-Embedding-0.6B | 600M | 44.92 | 28.61s | 2.06s | 41 MB |
| Qwen3-Embedding-4B | 4000M | 49.42 | 138.28s | 9.63s | 103 MB |

This model is a solid performer at a small size. It beats the same-sized all-MiniLM-L6-v2 model by more than 6 NDCG points (47.68 vs 41.23). It also beats the 600M parameter Qwen3-Embedding-0.6B model, which is over 25x larger. It scores slightly lower than the model it's distilled from, DenseOn (47.68 vs 48.86).
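
The comparisons in this paragraph can be checked with a few lines of arithmetic. The figures below are copied from the results table; the variable names are only illustrative.

```python
# (parameters in millions, NDCG) copied from the results table
results = {
    "SportsBERT Small Embeddings": (22.7, 47.68),
    "all-MiniLM-L6-v2": (22.7, 41.23),
    "DenseOn": (149, 48.86),
    "Qwen3-Embedding-0.6B": (600, 44.92),
}

params, score = results["SportsBERT Small Embeddings"]

# How many times larger the 600M Qwen3 model is
size_ratio = results["Qwen3-Embedding-0.6B"][0] / params
# Absolute NDCG gain over the same-sized baseline
gain_over_minilm = score - results["all-MiniLM-L6-v2"][1]
# Fraction of the teacher model's NDCG that the distilled model keeps
retained = score / results["DenseOn"][1]

print(f"Qwen3-Embedding-0.6B is {size_ratio:.1f}x larger")
print(f"NDCG gain over all-MiniLM-L6-v2: {gain_over_minilm:.2f}")
print(f"Share of DenseOn NDCG retained: {retained:.1%}")
```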

This model works well in CPU-only setups without trading off much accuracy. It shows how small models can excel in specialized domains while requiring less compute and disk space.

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)

More Information

Read more about the model in this article.
