Abhaykoul committed on
Commit
e625b2f
1 Parent(s): b34d15c

Create README.md

Files changed (1)
  1. README.md +90 -0
README.md ADDED
@@ -0,0 +1,90 @@
---
license: apache-2.0
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: sentence-transformers
pipeline_tag: sentence-similarity
---
# HAI - HelpingAI Semantic Similarity Model

This is a **custom Sentence Transformer model** fine-tuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It is part of the **HelpingAI ecosystem** and targets **semantic similarity and contextual understanding**, with an emphasis on **emotionally intelligent responses**.

## Model Highlights

- **Base Model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)

## Model Details

### Features:
- **Maximum Input Length:** Up to 256 tokens per input.
- **Output Dimensionality:** 384-dimensional dense embeddings.
- **Similarity Metric:** Cosine similarity; the model is fine-tuned for nuanced semantic and emotional comparisons.

### Full Architecture
```python
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False})
  (1): Pooling({'pooling_mode_mean_tokens': True})
  (2): Normalize()
)
```
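
The three modules above can be read as a pipeline: the transformer produces one vector per token, mean pooling averages those vectors while skipping padding, and normalization scales the result to unit length. The following is a minimal pure-Python sketch of the last two steps under those assumptions, not the library's implementation. The function names `mean_pool` and `l2_normalize` and the toy 3-dimensional vectors are illustrative only (the real model produces 384-dimensional vectors).

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 2.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)    # average of the first two vectors
embedding = l2_normalize(pooled)    # same direction, unit length
print(pooled)
print(embedding)
```

Because the final vectors have unit length, the cosine similarity between two embeddings equals their dot product.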

## Training Overview

### Dataset:
- **Size:** 75,897 samples
- **Structure:** `<sentence_0, sentence_1, similarity_score>`
- **Labels:** Float values between 0 (no similarity) and 1 (high similarity).

### Training Method:
- **Loss Function:** Cosine Similarity Loss
- **Batch Size:** 16
- **Epochs:** 20
- **Optimization:** AdamW optimizer with a learning rate of `5e-5`.
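
To make the loss concrete: in `sentence-transformers`, `CosineSimilarityLoss` by default compares the cosine similarity of the two sentence embeddings with the labeled score using mean squared error. The sketch below reproduces that computation in plain Python for a tiny batch. The helper names and toy 2-dimensional vectors are illustrative assumptions, and the actual training script for this model is not shown in the source.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs, labels):
    """Mean squared error between predicted cosine similarities and labels."""
    errors = [
        (cosine_similarity(u, v) - label) ** 2
        for (u, v), label in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


# Toy batch (hypothetical embeddings standing in for sentence_0 / sentence_1).
batch = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction: cosine similarity 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal: cosine similarity 0.0
]
labels = [1.0, 0.5]  # similarity_score values in [0, 1]

loss = cosine_similarity_loss(batch, labels)
print(loss)
```

The first pair matches its label exactly and contributes nothing; the second pair is penalized because its predicted similarity is below the labeled score.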

## Getting Started

### Installation
Ensure you have the `sentence-transformers` library installed:
```bash
pip install -U sentence-transformers
```

### Quick Start
Load and use the model in your Python environment:
```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load the HelpingAI semantic similarity model
model = SentenceTransformer("HelpingAI/HAI")

# Encode sentences into dense embeddings
sentences = [
    "A woman is slicing a pepper.",
    "A girl is styling her hair.",
    "The sun is shining brightly today.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # Output: (3, 384)

# Compare the first sentence against the other two
similarity_scores = cosine_similarity([embeddings[0]], embeddings[1:])
print(similarity_scores)
```
high accuracy in sentiment-informed response tests.

## Citation

If you use the HAI model, please cite the original Sentence-BERT paper:

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```