---
language:
- de
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dataset_size:10K<n<100K
---

# German_Semantic_V3

**Features:**
- **Sequence length:** Embed sequences of up to 8192 tokens, thanks to the ALiBi implementation of the Jina team!
- **Matryoshka Embeddings:** The model is trained for embedding sizes from 1024 down to 64, allowing you to store much smaller embeddings with little loss in quality (a post-hoc truncation sketch follows the usage example below).
- **License:** Apache 2.0
- **German only:** This model is German-only, which lets it learn more efficiently and deal better with short queries.
- **Flexibility:** Trained with flexible sequence lengths and embedding truncation, flexibility is a core feature of the model, while still improving on V2's performance.

## Usage:

```python
from sentence_transformers import SentenceTransformer

matryoshka_dim = 1024  # How big your embeddings should be; choose from: 64, 128, 256, 512, 1024

model = SentenceTransformer("aari1995/German_Semantic_V3", trust_remote_code=True, truncate_dim=matryoshka_dim)
# model.truncate_dim = 64  # the truncation dimension can also be changed after loading
# model.max_seq_length = 512  # optionally, lower the maximum sequence length if your hardware is limited

# Run inference
sentences = [
    'Eine Flagge weht.',
    'Die Flagge bewegte sich in der Luft.',
    'Zwei Personen beobachten das Wasser.',
]
embeddings = model.encode(sentences)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
```
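If you already store full 1024-dimensional embeddings, you can also truncate them after encoding instead of reloading the model with a different `truncate_dim`. The snippet below is a minimal sketch (using NumPy; not part of the original card's examples) that cuts stored embeddings down to 256 Matryoshka dimensions and re-normalizes them, which is required for cosine similarity to stay meaningful:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aari1995/German_Semantic_V3", trust_remote_code=True)

sentences = ['Eine Flagge weht.', 'Die Flagge bewegte sich in der Luft.']
full = model.encode(sentences)  # shape: (2, 1024)

# Keep only the first 256 Matryoshka dimensions, then re-normalize to unit length
dim = 256
truncated = full[:, :dim]
truncated = truncated / np.linalg.norm(truncated, axis=1, keepdims=True)

# Cosine similarity at full size vs. after truncation should stay close
cos_full = float(full[0] @ full[1] / (np.linalg.norm(full[0]) * np.linalg.norm(full[1])))
cos_trunc = float(truncated[0] @ truncated[1])
print(f"cosine @ 1024 dims: {cos_full:.4f} | cosine @ {dim} dims: {cos_trunc:.4f}")
```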
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** gbert-large (with ALiBi applied)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:** multiple German datasets
- **Languages:** de

### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: JinaBertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("aari1995/German_Semantic_V3", trust_remote_code=True)

# Run inference
sentences = [
    'Eine Flagge weht.',
    'Die Flagge bewegte sich in der Luft.',
    'Zwei Personen beobachten das Wasser.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```

## Evaluation

### Metrics

#### Semantic Similarity

* Dataset: `sts-test-1024`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8539     |
| **spearman_cosine** | **0.8623** |
| pearson_manhattan   | 0.8555     |
| spearman_manhattan  | 0.8633     |
| pearson_euclidean   | 0.8554     |
| spearman_euclidean  | 0.8631     |
| pearson_dot         | 0.817      |
| spearman_dot        | 0.815      |
| pearson_max         | 0.8555     |
| spearman_max        | 0.8633     |

#### Semantic Similarity

* Dataset: `sts-test-768`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8538     |
| **spearman_cosine** | **0.8632** |
| pearson_manhattan   | 0.8559     |
| spearman_manhattan  | 0.8638     |
| pearson_euclidean   | 0.8559     |
| spearman_euclidean  | 0.8634     |
| pearson_dot         | 0.8169     |
| spearman_dot        | 0.8157     |
| pearson_max         | 0.8559     |
| spearman_max        | 0.8638     |

#### Semantic Similarity

* Dataset: `sts-test-512`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8502     |
| **spearman_cosine** | **0.8624** |
| pearson_manhattan   | 0.8547     |
| spearman_manhattan  | 0.8629     |
| pearson_euclidean   | 0.8546     |
| spearman_euclidean  | 0.8625     |
| pearson_dot         | 0.8108     |
| spearman_dot        | 0.8103     |
| pearson_max         | 0.8547     |
| spearman_max        | 0.8629     |

#### Semantic Similarity

* Dataset: `sts-test-256`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8441     |
| **spearman_cosine** | **0.8583** |
| pearson_manhattan   | 0.8517     |
| spearman_manhattan  | 0.8592     |
| pearson_euclidean   | 0.8517     |
| spearman_euclidean  | 0.8592     |
| pearson_dot         | 0.7902     |
| spearman_dot        | 0.7891     |
| pearson_max         | 0.8517     |
| spearman_max        | 0.8592     |

#### Semantic Similarity

* Dataset: `sts-test-128`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8369     |
| **spearman_cosine** | **0.8546** |
| pearson_manhattan   | 0.8474     |
| spearman_manhattan  | 0.8547     |
| pearson_euclidean   | 0.8478     |
| spearman_euclidean  | 0.855      |
| pearson_dot         | 0.7733     |
| spearman_dot        | 0.7721     |
| pearson_max         | 0.8478     |
| spearman_max        | 0.855      |

#### Semantic Similarity

* Dataset: `sts-test-64`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8282     |
| **spearman_cosine** | **0.8507** |
| pearson_manhattan   | 0.8405     |
| spearman_manhattan  | 0.8483     |
| pearson_euclidean   | 0.8426     |
| spearman_euclidean  | 0.8499     |
| pearson_dot         | 0.7519     |
| spearman_dot        | 0.7518     |
| pearson_max         | 0.8426     |
| spearman_max        | 0.8507     |

## Training Details

* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:

```json
{
    "loss": "ContrastiveLoss",
    "matryoshka_dims": [1024, 768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
```
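The original training script is not included in this card. Purely as an illustration, the following is a minimal sketch of how a MatryoshkaLoss wrapped around ContrastiveLoss with these parameters is set up in Sentence Transformers v3; the two-pair toy dataset is a hypothetical stand-in for the actual German training data:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

# Hypothetical stand-in for the real training data: sentence pairs with binary
# labels, the input format ContrastiveLoss expects (1 = similar, 0 = dissimilar)
train_dataset = Dataset.from_dict({
    "sentence1": ["Eine Flagge weht.", "Ein Hund bellt."],
    "sentence2": ["Die Flagge bewegte sich in der Luft.", "Eine Katze schläft."],
    "label": [1, 0],
})

model = SentenceTransformer("aari1995/German_Semantic_V3", trust_remote_code=True)

# ContrastiveLoss is the inner loss, applied at every Matryoshka dimension
inner_loss = losses.ContrastiveLoss(model=model)
loss = losses.MatryoshkaLoss(
    model=model,
    loss=inner_loss,
    matryoshka_dims=[1024, 768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all listed dimensions at every step
)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```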
## License / Credits and Special Thanks

- to [Jina AI](https://huggingface.co/jinaai) for the model architecture, especially their ALiBi implementation
- to [deepset](https://huggingface.co/deepset) for gbert-large, which is imho still the greatest German model

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### ContrastiveLoss

```bibtex
@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}
```