all-mpnet-base-v2-embedding-all

This model is a fine-tuned version of all-mpnet-base-v2 on the following datasets: squad, newsqa, LLukas22/cqadupstack, LLukas22/fiqa, LLukas22/scidocs, deepset/germanquad, LLukas22/nq.

Usage (Sentence-Transformers)

Using this model becomes easy when you have sentence-transformers installed:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer('LLukas22/all-mpnet-base-v2-embedding-all')
# Encode the sentences into one embedding vector per sentence
embeddings = model.encode(sentences)
print(embeddings)
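
Since the model was fine-tuned on question-answering datasets, a typical use is ranking candidate passages against a query. The following is a minimal sketch, assuming cosine similarity as the scoring function (the card does not state which similarity was used). The names `rank_passages` and `demo`, and the example texts, are hypothetical and chosen only for illustration.

```python
import numpy as np


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted by descending cosine similarity, plus the scores."""
    q = np.asarray(query_vec, dtype=np.float32)
    p = np.asarray(passage_vecs, dtype=np.float32)
    # Normalize so that the dot product equals cosine similarity.
    q = q / np.linalg.norm(q)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    scores = p @ q
    order = np.argsort(-scores)
    return order, scores


def demo():
    """Illustrative only: requires sentence-transformers and downloads the model."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("LLukas22/all-mpnet-base-v2-embedding-all")
    query = "How do I reset my password?"  # made-up example text
    passages = [
        "Click 'Forgot password' on the login page to reset it.",
        "Our office is open Monday to Friday.",
    ]
    order, scores = rank_passages(model.encode(query), model.encode(passages))
    for i in order:
        print(f"{scores[i]:.3f}  {passages[i]}")
```

Calling `demo()` runs the full example. `rank_passages` works on any embedding arrays, so it can also be reused with precomputed vectors.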

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1E+00
  • per device batch size: 60
  • effective batch size: 180
  • seed: 42
  • optimizer: AdamW with betas (0.9,0.999) and eps 1E-08
  • weight decay: 2E-02
  • D-Adaptation: True
  • Warmup: True
  • number of epochs: 15
  • mixed_precision_training: bf16
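
The effective batch size listed above is three times the per-device batch size. The sketch below only works through that arithmetic. It assumes the factor comes from the number of devices, gradient accumulation steps, or a combination of both, since the card does not say which. The function name is hypothetical.

```python
def batch_multiplier(per_device_batch_size: int, effective_batch_size: int) -> int:
    """Return the (devices x accumulation steps) factor implied by the two sizes."""
    if effective_batch_size % per_device_batch_size != 0:
        raise ValueError("effective batch size must be a multiple of the per-device size")
    return effective_batch_size // per_device_batch_size


# Values taken from the hyperparameter list.
multiplier = batch_multiplier(per_device_batch_size=60, effective_batch_size=180)
print(multiplier)  # 3
```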

Training results

Epoch  Train Loss  Validation Loss
0      0.0554      0.047
1      0.044       0.0472
2      0.0374      0.0425
3      0.0322      0.041
4      0.0278      0.0403
5      0.0246      0.0389
6      0.0215      0.0389
7      0.0192      0.0388
8      0.017       0.0379
9      0.0154      0.0375
10     0.0142      0.0381
11     0.0132      0.0372
12     0.0126      0.0377
13     0.012       0.0377
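
The training loss keeps falling through every epoch, while the validation loss levels off at around 0.038 from epoch 8 onward. As a small sketch for reading the table, the snippet below finds the epoch with the lowest validation loss and the gap between the two losses. The numbers are copied from the table and the variable names are illustrative.

```python
# Losses per epoch, copied from the training results table.
train_loss = [0.0554, 0.044, 0.0374, 0.0322, 0.0278, 0.0246, 0.0215,
              0.0192, 0.017, 0.0154, 0.0142, 0.0132, 0.0126, 0.012]
val_loss = [0.047, 0.0472, 0.0425, 0.041, 0.0403, 0.0389, 0.0389,
            0.0388, 0.0379, 0.0375, 0.0381, 0.0372, 0.0377, 0.0377]

# Epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Generalization gap (validation minus training loss) per epoch.
gap = [round(v - t, 4) for t, v in zip(train_loss, val_loss)]

print(best_epoch, val_loss[best_epoch])  # 11 0.0372
print(gap[0], gap[-1])
```

The widening gap suggests that later epochs mostly improve the fit to the training data, which is one reason to pick a checkpoint by validation metrics and not by training loss.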

Evaluation results

Epoch  top_1  top_3  top_5  top_10  top_25
0      0.373  0.476  0.509  0.544   0.573
1      0.362  0.466  0.501  0.537   0.568
2      0.371  0.476  0.511  0.546   0.576
3      0.369  0.473  0.506  0.54    0.569
4      0.373  0.478  0.512  0.547   0.578
5      0.378  0.483  0.517  0.552   0.58
6      0.371  0.475  0.509  0.543   0.571
7      0.379  0.484  0.517  0.55    0.578
8      0.378  0.482  0.515  0.548   0.575
9      0.383  0.489  0.523  0.556   0.584
10     0.38   0.483  0.517  0.549   0.575
11     0.38   0.485  0.518  0.551   0.577
12     0.383  0.489  0.522  0.556   0.582
13     0.385  0.49   0.523  0.555   0.581
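
The card does not define the top_k columns. A common reading, assumed here, is the fraction of queries whose relevant passage appears among the k highest-scoring candidates. The sketch below illustrates that definition for one relevant passage per query. The function name and input layout are hypothetical and may differ from the evaluation code actually used.

```python
import numpy as np


def top_k_accuracy(scores, gold, ks=(1, 3, 5, 10, 25)):
    """Fraction of queries whose gold passage is among the k highest-scoring ones.

    scores: (n_queries, n_passages) similarity matrix.
    gold:   (n_queries,) index of the relevant passage for each query.
    """
    scores = np.asarray(scores)
    gold = np.asarray(gold)
    # Rank of the gold passage = number of passages scoring strictly higher.
    gold_scores = scores[np.arange(len(gold)), gold]
    ranks = (scores > gold_scores[:, None]).sum(axis=1)
    return {f"top_{k}": float((ranks < k).mean()) for k in ks}


# Toy example: 3 queries, 3 candidate passages each.
toy_scores = [[0.9, 0.1, 0.5],
              [0.2, 0.3, 0.8],
              [0.1, 0.7, 0.6]]
toy_gold = [0, 0, 2]
result = top_k_accuracy(toy_scores, toy_gold, ks=(1, 2, 3))
print(result)
```

Under this definition the values can only grow with k, which matches the left-to-right increase in every row of the table.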

Framework versions

  • Transformers: 4.25.1
  • PyTorch: 2.0.0.dev20230210+cu118
  • PyTorch Lightning: 1.8.6
  • Datasets: 2.7.1
  • Tokenizers: 0.13.1
  • Sentence Transformers: 2.2.2
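
To compare a local environment against the versions listed above, a small check along these lines may help. This is a sketch and not part of the original training code. It assumes the usual PyPI distribution names for each package, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card; keys are the assumed PyPI distribution names.
CARD_VERSIONS = {
    "transformers": "4.25.1",
    "torch": "2.0.0.dev20230210+cu118",
    "pytorch-lightning": "1.8.6",
    "datasets": "2.7.1",
    "tokenizers": "0.13.1",
    "sentence-transformers": "2.2.2",
}


def compare_versions(expected=CARD_VERSIONS):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions().items():
    print(f"{name}: card={wanted} installed={installed}")
```

The PyTorch entry is a nightly development build, so an exact match is unlikely on a fresh install. A later stable release is the practical substitute.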

Additional Information

This model was trained as part of my Master's Thesis 'Evaluation of transformer based language models for use in service information systems'. The source code is available on GitHub.

Model size: 109M params (Safetensors; tensor types: I64, F32)