---
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - feature-extraction
  - sentence-similarity
  - mteb
  - financial
  - fiqa
  - finance
  - retrieval
  - rag
  - esg
  - fixed-income
  - equity
model-index:
  - name: fin-mpnet-base-v0.1
    results:
      - task:
          type: Classification
        dataset:
          type: mteb/amazon_reviews_multi
          name: MTEB AmazonReviewsClassification (en)
          config: en
          split: test
          revision: 1399c76144fd37290681b995c656ef9b2e06e26d
        metrics:
          - type: accuracy
            value: 29.128
          - type: f1
            value: 28.657401543151707
      - task:
          type: Retrieval
        dataset:
          type: arguana
          name: MTEB ArguAna
          config: default
          split: test
          revision: None
        metrics:
          - type: map_at_1
            value: 24.111
          - type: map_at_10
            value: 40.083
          - type: map_at_100
            value: 41.201
          - type: map_at_1000
            value: 41.215
          - type: map_at_3
            value: 35.325
          - type: map_at_5
            value: 37.796
          - type: mrr_at_1
            value: 25.036
          - type: mrr_at_10
            value: 40.436
          - type: mrr_at_100
            value: 41.554
          - type: mrr_at_1000
            value: 41.568
          - type: mrr_at_3
            value: 35.644999999999996
          - type: mrr_at_5
            value: 38.141000000000005
          - type: ndcg_at_1
            value: 24.111
          - type: ndcg_at_10
            value: 49.112
          - type: ndcg_at_100
            value: 53.669999999999995
          - type: ndcg_at_1000
            value: 53.944
          - type: ndcg_at_3
            value: 39.035
          - type: ndcg_at_5
            value: 43.503
          - type: precision_at_1
            value: 24.111
          - type: precision_at_10
            value: 7.817
          - type: precision_at_100
            value: 0.976
          - type: precision_at_1000
            value: 0.1
          - type: precision_at_3
            value: 16.596
          - type: precision_at_5
            value: 12.134
          - type: recall_at_1
            value: 24.111
          - type: recall_at_10
            value: 78.16499999999999
          - type: recall_at_100
            value: 97.58200000000001
          - type: recall_at_1000
            value: 99.57300000000001
          - type: recall_at_3
            value: 49.787
          - type: recall_at_5
            value: 60.669
      - task:
          type: Classification
        dataset:
          type: mteb/banking77
          name: MTEB Banking77Classification
          config: default
          split: test
          revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
        metrics:
          - type: accuracy
            value: 80.25
          - type: f1
            value: 79.64999520103544
      - task:
          type: Retrieval
        dataset:
          type: fiqa
          name: MTEB FiQA2018
          config: default
          split: test
          revision: None
        metrics:
          - type: map_at_1
            value: 37.747
          - type: map_at_10
            value: 72.223
          - type: map_at_100
            value: 73.802
          - type: map_at_1000
            value: 73.80499999999999
          - type: map_at_3
            value: 61.617999999999995
          - type: map_at_5
            value: 67.92200000000001
          - type: mrr_at_1
            value: 71.914
          - type: mrr_at_10
            value: 80.71000000000001
          - type: mrr_at_100
            value: 80.901
          - type: mrr_at_1000
            value: 80.901
          - type: mrr_at_3
            value: 78.935
          - type: mrr_at_5
            value: 80.193
          - type: ndcg_at_1
            value: 71.914
          - type: ndcg_at_10
            value: 79.912
          - type: ndcg_at_100
            value: 82.675
          - type: ndcg_at_1000
            value: 82.702
          - type: ndcg_at_3
            value: 73.252
          - type: ndcg_at_5
            value: 76.36
          - type: precision_at_1
            value: 71.914
          - type: precision_at_10
            value: 23.071
          - type: precision_at_100
            value: 2.62
          - type: precision_at_1000
            value: 0.263
          - type: precision_at_3
            value: 51.235
          - type: precision_at_5
            value: 38.117000000000004
          - type: recall_at_1
            value: 37.747
          - type: recall_at_10
            value: 91.346
          - type: recall_at_100
            value: 99.776
          - type: recall_at_1000
            value: 99.897
          - type: recall_at_3
            value: 68.691
          - type: recall_at_5
            value: 80.742
      - task:
          type: Retrieval
        dataset:
          type: nfcorpus
          name: MTEB NFCorpus
          config: default
          split: test
          revision: None
        metrics:
          - type: map_at_1
            value: 4.124
          - type: map_at_10
            value: 10.206999999999999
          - type: map_at_100
            value: 13.181000000000001
          - type: map_at_1000
            value: 14.568
          - type: map_at_3
            value: 7.2620000000000005
          - type: map_at_5
            value: 8.622
          - type: mrr_at_1
            value: 39.009
          - type: mrr_at_10
            value: 48.144
          - type: mrr_at_100
            value: 48.746
          - type: mrr_at_1000
            value: 48.789
          - type: mrr_at_3
            value: 45.356
          - type: mrr_at_5
            value: 47.152
          - type: ndcg_at_1
            value: 36.533
          - type: ndcg_at_10
            value: 29.643000000000004
          - type: ndcg_at_100
            value: 27.893
          - type: ndcg_at_1000
            value: 37.307
          - type: ndcg_at_3
            value: 33.357
          - type: ndcg_at_5
            value: 32.25
          - type: precision_at_1
            value: 38.7
          - type: precision_at_10
            value: 22.941
          - type: precision_at_100
            value: 7.303
          - type: precision_at_1000
            value: 2.028
          - type: precision_at_3
            value: 31.889
          - type: precision_at_5
            value: 29.04
          - type: recall_at_1
            value: 4.124
          - type: recall_at_10
            value: 14.443
          - type: recall_at_100
            value: 29.765000000000004
          - type: recall_at_1000
            value: 63.074
          - type: recall_at_3
            value: 8.516
          - type: recall_at_5
            value: 10.979
      - task:
          type: Retrieval
        dataset:
          type: scifact
          name: MTEB SciFact
          config: default
          split: test
          revision: None
        metrics:
          - type: map_at_1
            value: 49.010999999999996
          - type: map_at_10
            value: 60.094
          - type: map_at_100
            value: 60.79900000000001
          - type: map_at_1000
            value: 60.828
          - type: map_at_3
            value: 57.175
          - type: map_at_5
            value: 58.748
          - type: mrr_at_1
            value: 51.666999999999994
          - type: mrr_at_10
            value: 61.312
          - type: mrr_at_100
            value: 61.821000000000005
          - type: mrr_at_1000
            value: 61.85000000000001
          - type: mrr_at_3
            value: 59
          - type: mrr_at_5
            value: 60.199999999999996
          - type: ndcg_at_1
            value: 51.666999999999994
          - type: ndcg_at_10
            value: 65.402
          - type: ndcg_at_100
            value: 68.377
          - type: ndcg_at_1000
            value: 69.094
          - type: ndcg_at_3
            value: 60.153999999999996
          - type: ndcg_at_5
            value: 62.455000000000005
          - type: precision_at_1
            value: 51.666999999999994
          - type: precision_at_10
            value: 9.067
          - type: precision_at_100
            value: 1.0670000000000002
          - type: precision_at_1000
            value: 0.11199999999999999
          - type: precision_at_3
            value: 24
          - type: precision_at_5
            value: 15.933
          - type: recall_at_1
            value: 49.010999999999996
          - type: recall_at_10
            value: 80.511
          - type: recall_at_100
            value: 94
          - type: recall_at_1000
            value: 99.5
          - type: recall_at_3
            value: 66.2
          - type: recall_at_5
            value: 71.944
---

Note: full evaluation is not yet complete.

Fin-MPNET-Base (v0.1)

This is a fine-tuned sentence-transformers model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.

This model aims to be strong on financial document retrieval tasks while preserving as much general-purpose performance as possible.

Model               FiQA   SciFact  AmazonReviews  OnlineBankingIntent  ArguAna
fin-mpnet-base      79.91  65.40    29.12          80.25                49.11
all-mpnet-base-v2   49.96  65.57    31.92          81.86                46.52
previous SoTA       56.59  -        -              -                    -

Retrieval scores (FiQA, SciFact, ArguAna) are nDCG@10; classification scores (AmazonReviews, OnlineBankingIntent) are accuracy.

v0.1 achieves state-of-the-art results on the FiQA test set, while the non-financial benchmarks drop by only a few percentage points and improve in some cases (e.g. ArguAna).

Usage (Sentence-Transformers)

Using this model is straightforward once sentence-transformers is installed:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

# Load the model and encode the sentences into 768-dimensional embeddings
model = SentenceTransformer('mukaj/fin-mpnet-base')
embeddings = model.encode(sentences)
print(embeddings)
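
For retrieval-style use (e.g. RAG over financial documents), embeddings are typically ranked by cosine similarity. Below is a minimal sketch; the query and passages are illustrative placeholders, not taken from the training data:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('mukaj/fin-mpnet-base')

# Illustrative financial query and candidate passages
query = "What are the main risks of high-yield corporate bonds?"
documents = [
    "High-yield bonds carry greater default and liquidity risk than investment-grade debt.",
    "The company reported 12% year-over-year revenue growth in the latest quarter.",
]

# Encode, then rank passages by cosine similarity to the query
query_emb = model.encode(query, convert_to_tensor=True)
doc_embs = model.encode(documents, convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_embs)[0]

for doc, score in sorted(zip(documents, scores.tolist()), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {doc}")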

Evaluation Results

During training the model was evaluated only on the new finance QA examples; as a result, only finance-relevant benchmarks were evaluated for v0.1 (FiQA2018, Banking77Classification).

The model currently shows the highest FiQA retrieval score on the test set on the MTEB Leaderboard (https://huggingface.co/spaces/mteb/leaderboard).

The model has likely lost some performance on other benchmarks; for example, Banking77Classification dropped from 81.86 to 80.25. This will be addressed for v0.2, when a full evaluation on all sets will be run.
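
For reference, here is a sketch of how the finance-relevant tasks could be re-run with the mteb package (the task names follow the MTEB registry; the output folder is an arbitrary example, and the exact API may vary between mteb versions):

from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('mukaj/fin-mpnet-base')

# Evaluate only the finance-relevant tasks used for v0.1
evaluation = MTEB(tasks=["FiQA2018", "Banking77Classification"])
evaluation.run(model, output_folder="results/fin-mpnet-base")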

Training

"sentence-transformers/all-mpnet-base-v2" was fine-tuned on 150k+ financial document QA examples using MNR Loss.