metadata
license: apache-2.0
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: sentence-transformers
pipeline_tag: sentence-similarity
HAI - HelpingAI Semantic Similarity Model
This is a custom Sentence Transformer model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. Designed as part of the HelpingAI ecosystem, it enhances semantic similarity and contextual understanding, with an emphasis on emotionally intelligent responses.
Model Highlights
- Base Model: sentence-transformers/all-MiniLM-L6-v2
Model Details
Features:
- Maximum Sequence Length: Handles up to 256 tokens per input; longer inputs are truncated.
- Output Dimensionality: 384-dimensional dense embeddings.
- Similarity Metric: Cosine Similarity, fine-tuned for nuanced semantic and emotional comparisons.
Full Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False})
(1): Pooling({'pooling_mode_mean_tokens': True})
(2): Normalize()
)
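The pooling and normalization stages listed above can be illustrated without the library. The following is a minimal pure-Python sketch, not the model's actual implementation (which runs on PyTorch tensors): the helper names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions made for illustration, whereas the real model produces 384-dimensional vectors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length, which is the role of a Normalize() stage."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input (hypothetical values): three token vectors of width 4.
# The last token stands in for padding and is masked out.
tokens = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 2.0, 0.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # mean of the two unmasked tokens
embedding = l2_normalize(pooled)      # unit-length sentence embedding
print(pooled)
print(embedding)
```

Because the final stage normalizes every embedding to unit length, the dot product of two embeddings equals their cosine similarity.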
Training Overview
Dataset:
- Size: 75897 samples
- Structure:
<sentence_0, sentence_1, similarity_score>
- Labels: Float values between 0 (no similarity) and 1 (high similarity).
Training Method:
- Loss Function: Cosine Similarity Loss
- Batch Size: 16
- Epochs: 20
- Optimization: AdamW optimizer with a learning rate of 5e-5.
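To make the training objective concrete, here is a minimal pure-Python sketch of a cosine similarity loss. It assumes the common formulation used by default in sentence-transformers' `CosineSimilarityLoss`: the mean squared error between the cosine similarity of the two sentence embeddings and the gold similarity score. The function names and the toy 2-dimensional embeddings are hypothetical; the actual training code for this model is not shown in this card.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(pairs):
    """Mean squared error between predicted cosine and the gold score."""
    errors = [(cosine(u, v) - label) ** 2 for u, v, label in pairs]
    return sum(errors) / len(errors)

# Each item mirrors the <sentence_0, sentence_1, similarity_score> structure,
# with toy embeddings standing in for the encoded sentences.
batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, label 1.0 -> squared error 0
    ([1.0, 0.0], [0.0, 1.0], 0.5),  # orthogonal (cosine 0), label 0.5 -> squared error 0.25
]

loss = cosine_similarity_loss(batch)
print(loss)  # 0.125
```

Minimizing this quantity pushes the cosine similarity of each sentence pair toward its labeled score.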
Getting Started
Installation
Ensure you have the sentence-transformers library installed:
pip install -U sentence-transformers
Quick Start
Load and use the model in your Python environment:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load the HelpingAI semantic similarity model
model = SentenceTransformer("HelpingAI/HAI")

# Encode sentences
sentences = [
    "A woman is slicing a pepper.",
    "A girl is styling her hair.",
    "The sun is shining brightly today.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # Output: (3, 384)

# Compare the first sentence against the other two
similarity_scores = cosine_similarity([embeddings[0]], embeddings[1:])
print(similarity_scores)  # array of shape (1, 2)
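Since the model's last module is Normalize(), its embeddings have unit length, so a plain dot product gives the same value as cosine similarity. The sketch below shows this with hypothetical toy vectors standing in for real embeddings and ranks candidates against a query; the names `normalize`, `dot`, `query`, and `candidates` are illustrative and not part of any library.

```python
import math

def normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector]

def dot(u, v):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(a * b for a, b in zip(u, v))

# Toy 2-dimensional stand-ins for normalized sentence embeddings.
query = normalize([3.0, 4.0])
candidates = {
    "same_direction": normalize([6.0, 8.0]),
    "orthogonal": normalize([4.0, -3.0]),
}

# Score every candidate against the query and sort from most to least similar.
scores = {name: dot(query, vector) for name, vector in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
print(scores)
```

With real embeddings from `model.encode`, the same ranking logic applies to 384-dimensional vectors.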
The model showed high accuracy in sentiment-informed response tests.
Citation
If you use the HAI model, please cite the original Sentence-BERT paper:
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}