---
language:
- en
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:11002
- loss:MultipleNegativesRankingLoss
base_model: jinaai/jina-embeddings-v2-base-en
widget:
- source_sentence: Man jumps alone on a desert road with mountains in the background.
  sentences:
  - A man jumps on the desert road
  - A man plays a silver electric guitar.
  - A man doesnt jump on the desert road
- source_sentence: Players from two teams tangle together in pursuit of a flying rugby
    ball.
  sentences:
  - Two teams playing.
  - Two teams not playing.
  - Men are dancing in the street.
- source_sentence: The team won the game in the final minute.
  sentences:
  - In the final minute, the team won the game.
  - The team lost the game in the final minute.
  - For their anniversary, they took a hike through the mountains, enjoying the peace
    and quiet of nature.
- source_sentence: He finished reading the book in one sitting.
  sentences:
  - He struggled to finish the book and took a week to read it.
  - In one sitting, he finished reading the book.
  - jazz players create spontaneous superior orchestra
- source_sentence: Paint preserves wood
  sentences:
  - Coating protects timber
  - timber coating protects
  - Single cell life came before complex creatures
datasets:
- bwang0911/word-orders-triplet
- jinaai/negation-dataset
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# SentenceTransformer based on jinaai/jina-embeddings-v2-base-en
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [jinaai/jina-embeddings-v2-base-en](https://huggingface.co/jinaai/jina-embeddings-v2-base-en) on the [word_orders](https://huggingface.co/datasets/bwang0911/word-orders-triplet) and [negation_dataset](https://huggingface.co/datasets/jinaai/negation-dataset) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [jinaai/jina-embeddings-v2-base-en](https://huggingface.co/jinaai/jina-embeddings-v2-base-en)
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
    - [word_orders](https://huggingface.co/datasets/bwang0911/word-orders-triplet)
    - [negation_dataset](https://huggingface.co/datasets/jinaai/negation-dataset)
- **Language:** en
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: JinaBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
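The `Pooling` and `Normalize` modules listed above turn per-token vectors into one unit-length sentence vector. The following is a minimal pure-Python sketch of that idea, assuming mean pooling that skips padded positions via an attention mask. The helper names `mean_pool` and `l2_normalize` and the 4-dimensional toy vectors are illustrative assumptions, not part of the library (the real model uses 768 dimensions).

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions whose mask value is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    return [value / max(count, 1) for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length so that dot products equal cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input (hypothetical values): three token positions, the last one is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the output is normalized, the cosine similarity between two such embeddings reduces to a plain dot product.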
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bwang0911/word-order-jina")

# Run inference
sentences = [
    'Paint preserves wood',
    'Coating protects timber',
    'timber coating protects',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
## Training Details
### Training Datasets
#### word_orders
* Dataset: [word_orders](https://huggingface.co/datasets/bwang0911/word-orders-triplet) at [99609ac](https://huggingface.co/datasets/bwang0911/word-orders-triplet/tree/99609ac84ce5ad127591d7e722564a064cf80a76)
* Size: 1,002 training samples
* Columns: `anchor`, `pos`, and `neg`
* Approximate statistics based on the first 1000 samples:
  |      | anchor | pos    | neg    |
  |:-----|:-------|:-------|:-------|
  | type | string | string | string |
* Samples:
  | anchor | pos | neg |
  |:-------|:----|:----|
  | The river flows from the mountains to the sea | Water travels from mountain peaks to ocean | The river flows from the sea to the mountains |
  | Train departs London for Paris | Railway journey from London heading to Paris | Train departs Paris for London |
  | Cargo ship sails from Shanghai to Singapore | Maritime route Shanghai to Singapore | Cargo ship sails from Singapore to Shanghai |
* Loss: [`MultipleNegativesRankingLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
    "scale": 20,
    "similarity_fct": "cos_sim"
}
```
#### negation_dataset
* Dataset: [negation_dataset](https://huggingface.co/datasets/jinaai/negation-dataset) at [cd02256](https://huggingface.co/datasets/jinaai/negation-dataset/tree/cd02256426cc566d176285a987e5436f1cd01382)
* Size: 10,000 training samples
* Columns: `anchor`, `entailment`, and `negative`
* Approximate statistics based on the first 1000 samples:
  |      | anchor | entailment | negative |
  |:-----|:-------|:-----------|:---------|
  | type | string | string     | string   |
* Samples:
  | anchor | entailment | negative |
  |:-------|:-----------|:---------|
  | Two young girls are playing outside in a non-urban environment. | Two girls are playing outside. | Two girls are not playing outside. |
  | A man with a red shirt is watching another man who is standing on top of a attached cart filled to the top. | A man is standing on top of a cart. | A man is not standing on top of a cart. |
  | A man in a blue shirt driving a Segway type vehicle. | A person is riding a motorized vehicle. | A person is not riding a motorized vehicle. |
* Loss: [`MultipleNegativesRankingLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
    "scale": 20,
    "similarity_fct": "cos_sim"
}
```
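Both datasets are trained with the same loss settings (`scale` of 20, cosine similarity). As a rough illustration of what those parameters do, the sketch below computes a multiple-negatives ranking loss over a toy batch of triplets in pure Python. It assumes the common formulation in which every positive and every negative in the batch is a candidate for each anchor and the anchor's own positive is the target of a softmax cross-entropy. The function names and the 2-dimensional vectors are hypothetical, and this is not the library's implementation.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must rank positive i above all other candidates."""
    candidates = positives + negatives  # in-batch positives plus the explicit negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_correct = mnr_loss(anchors, positives, negatives)
loss_swapped = mnr_loss(anchors, negatives, positives)  # positives and negatives exchanged
```

The loss is small when each anchor is closest to its own positive and grows when a negative scores higher. A larger `scale` sharpens the softmax, so small similarity gaps count for more.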
### Training Hyperparameters
#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 128
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates
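
The `no_duplicates` batch sampler matters for an in-batch-negatives loss: if the same text appears twice in a batch, one copy can end up acting as a false negative for the other. The sketch below shows the general idea with a greedy grouping in pure Python. It is a simplified illustration under that assumption, not the library's sampler (which also handles shuffling and other details), and `no_duplicate_batches` is a hypothetical name.

```python
def no_duplicate_batches(samples, batch_size):
    """Greedily group sample indices so that no text appears twice within one batch."""
    remaining = list(range(len(samples)))
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for index in remaining:
            texts = set(samples[index])
            if len(batch) < batch_size and not (texts & seen):
                batch.append(index)
                seen |= texts
            else:
                # Either the batch is full or a text would repeat: try again later.
                deferred.append(index)
        batches.append(batch)
        remaining = deferred
    return batches


# Toy triplets (hypothetical): samples 0 and 1 share "a", samples 0 and 3 share "b".
samples = [
    ("a", "b", "c"),
    ("a", "d", "e"),
    ("f", "g", "h"),
    ("i", "b", "j"),
]
batches = no_duplicate_batches(samples, batch_size=2)
```

Here the samples that share a text land in different batches, so no anchor sees its own text as an in-batch negative.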
#### All Hyperparameters