---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:185814
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-base-en-v1.5
widget:
- source_sentence: ' The passage suggests that the more utility an item has, the more value human beings assign to it, making utility synonymous with subjective human value.'
sentences:
- |2
In the given passage about compound interest, how does the interest earned on a Series EE bond affect its value over time?
- |2
What forms does Section 16 require insiders to file and when is Form 3 typically submitted?
- |2
What is the relationship between an item's utility and its subjective human value according to the passage?
- source_sentence: ' The price per share is determined when a company goes public by giving a valuation to the company with the input of an investment bank. This value is then divided by the total number of shares to be issued.'
sentences:
- |2
How is the price per share determined when a company goes public and involves an investment bank?
- |2
What percentage decrease have Fisker shares experienced in the past year despite a 30% increase on Feb. 27?
- |2
What factors contributed to the strong performance and outstanding returns of residential construction stocks since the March lows as mentioned in the passage?
- source_sentence: ' Municipal bonds are discussed as this type of investment.'
sentences:
- |2
What is the benefit and process of using an income-driven repayment (IDR) plan for federal student loans?
- |2
Which luxury watch and jewelry brands are owned by Swatch?
- |2
What type of investment is discussed as a way to potentially increase after-tax returns by avoiding federal taxes, and is often chosen for its relative safety and steady return?
- source_sentence: ' The rally could potentially fill the Sept. 18 gap between $145 and $150, reaching the .618 sell-off retracement level. The on-balance volume indicator suggests that Roku stock is unlikely to test the September high at this time.'
sentences:
- |2
What rewards and perks can Navy Federal Visa Signature Flagship Rewards credit card users receive upon opening an account and within the first 90 days?
- |2
What aspects contribute to VeriFirst's high marks for comprehensiveness in their services?
- |2
What levels could the rally potentially fill after buyers buy the dip into $120, and what does the on-balance volume indicator suggest about Roku stock's likelihood to test the September high at this time?
- source_sentence: ' Home Depot''s stock closed at $135.39 while being above a "golden cross" on January 19, 2017.'
sentences:
- |2
In the given text passage, when did Home Depot's stock close at $135.39 while being above a "golden cross"?
- |2
What term does JPMorgan use to refer to net interest margin in its financial materials, and what was their net interest margin in FY 2019 before the pandemic started?
- |2
According to Maley, where might the funds from potentially declining sectors like FANGs be directed towards?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: Regulatory Financial Matryoshka
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.6027025718021989
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7349251707269822
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.7675691383736136
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8058313556448878
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6027025718021989
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.24497505690899404
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1535138276747227
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.0805831355644888
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6027025718021989
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7349251707269822
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.7675691383736136
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.8058313556448878
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7073258915973659
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6754839282543154
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6796950515367028
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.5988763500750715
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7302271516443066
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.7651474790526469
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8033612631375018
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.5988763500750715
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.2434090505481022
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.15302949581052935
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.08033612631375019
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.5988763500750715
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7302271516443066
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.7651474790526469
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.8033612631375018
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.703957174859045
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6718512470776807
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6760676798344978
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.5881241826899791
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7223325422579552
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.756768537802102
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.7946917227684409
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.5881241826899791
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.24077751408598502
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1513537075604204
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.07946917227684411
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.5881241826899791
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7223325422579552
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.756768537802102
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.7946917227684409
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.694546619247058
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6621607466706108
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6665568671650335
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.5718990652395021
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7057683925025428
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.7405918535380442
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.7818569283673172
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.5718990652395021
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.2352561308341809
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.14811837070760883
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.07818569283673171
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.5718990652395021
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7057683925025428
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.7405918535380442
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.7818569283673172
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6793129712551184
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6462535008352889
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6508176832915368
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.5443405821669007
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.6759335496682327
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.7138567346345716
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.7583183997675207
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.5443405821669007
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.22531118322274424
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.14277134692691432
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.07583183997675207
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.5443405821669007
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.6759335496682327
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.7138567346345716
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.7583183997675207
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6522706632460163
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6182239473662035
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6229041572175256
name: Cosine Map@100
---
Regulatory Financial Matryoshka
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset:
- json
- Language: en
- License: apache-2.0
Model Sources
- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
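The trailing Normalize() module gives every embedding unit L2 norm, so the cosine similarity used throughout this card reduces to a plain dot product. A minimal sketch to check this, assuming the model ID from the Usage section below:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Minimal sketch: the trailing Normalize() module gives every embedding
# unit L2 norm, so cosine similarity reduces to a dot product.
model = SentenceTransformer("hshashank06/regulatory-model")
emb = model.encode(["What is the BVPS and how is it calculated?"])
print(np.linalg.norm(emb[0]))  # ~1.0
```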
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("hshashank06/regulatory-model")
# Run inference
sentences = [
' Home Depot\'s stock closed at $135.39 while being above a "golden cross" on January 19, 2017.',
' In the given text passage, when did Home Depot\'s stock close at $135.39 while being above a "golden cross"? \n',
' According to Maley, where might the funds from potentially declining sectors like FANGs be directed towards? \n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
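Because the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64, its embeddings can also be truncated at load time. A sketch using the truncate_dim argument of SentenceTransformer; the choice of 256 here is illustrative:

```python
from sentence_transformers import SentenceTransformer

# Load with embeddings truncated to one of the trained Matryoshka
# dimensions; 256 trades a little accuracy for smaller vectors.
model = SentenceTransformer("hshashank06/regulatory-model", truncate_dim=256)

embeddings = model.encode([
    "Municipal bonds are discussed as this type of investment.",
    "Which luxury watch and jewelry brands are owned by Swatch?",
])
print(embeddings.shape)
# (2, 256)
```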
Evaluation
Metrics
Information Retrieval
- Datasets: dim_768, dim_512, dim_256, dim_128 and dim_64
- Evaluated with InformationRetrievalEvaluator
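The table below was produced by running the evaluator once per Matryoshka dimension. A hedged sketch of such a run; the queries, corpus and relevant_docs here are illustrative placeholders, not the actual evaluation split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("hshashank06/regulatory-model", truncate_dim=768)

# Placeholder data; the reported metrics come from the held-out split.
queries = {"q1": "What is the BVPS and how is it calculated?"}
corpus = {"d1": "The BVPS is calculated by dividing a company's common equity "
                "value by its total number of shares outstanding."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_768")
results = evaluator(model)  # dict of cosine_accuracy@k, ndcg@10, mrr@10, map@100, ...
```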
Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|
cosine_accuracy@1 | 0.6027 | 0.5989 | 0.5881 | 0.5719 | 0.5443 |
cosine_accuracy@3 | 0.7349 | 0.7302 | 0.7223 | 0.7058 | 0.6759 |
cosine_accuracy@5 | 0.7676 | 0.7651 | 0.7568 | 0.7406 | 0.7139 |
cosine_accuracy@10 | 0.8058 | 0.8034 | 0.7947 | 0.7819 | 0.7583 |
cosine_precision@1 | 0.6027 | 0.5989 | 0.5881 | 0.5719 | 0.5443 |
cosine_precision@3 | 0.245 | 0.2434 | 0.2408 | 0.2353 | 0.2253 |
cosine_precision@5 | 0.1535 | 0.153 | 0.1514 | 0.1481 | 0.1428 |
cosine_precision@10 | 0.0806 | 0.0803 | 0.0795 | 0.0782 | 0.0758 |
cosine_recall@1 | 0.6027 | 0.5989 | 0.5881 | 0.5719 | 0.5443 |
cosine_recall@3 | 0.7349 | 0.7302 | 0.7223 | 0.7058 | 0.6759 |
cosine_recall@5 | 0.7676 | 0.7651 | 0.7568 | 0.7406 | 0.7139 |
cosine_recall@10 | 0.8058 | 0.8034 | 0.7947 | 0.7819 | 0.7583 |
cosine_ndcg@10 | 0.7073 | 0.704 | 0.6945 | 0.6793 | 0.6523 |
cosine_mrr@10 | 0.6755 | 0.6719 | 0.6622 | 0.6463 | 0.6182 |
cosine_map@100 | 0.6797 | 0.6761 | 0.6666 | 0.6508 | 0.6229 |
Training Details
Training Dataset
json
- Dataset: json
- Size: 185,814 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:
| | positive | anchor |
|---|---|---|
| type | string | string |
| details | min: 3 tokens, mean: 43.18 tokens, max: 200 tokens | min: 10 tokens, mean: 23.08 tokens, max: 63 tokens |
- Samples:
| positive | anchor |
|---|---|
| The BVPS (Book Value Per Share) is calculated by dividing a company's common equity value by its total number of shares outstanding. In the given example, if a company has a common equity value of $100 million and 10 million shares outstanding, its BVPS would be $10 ($100 million / 10 million). You can calculate a company's BVPS using Microsoft Excel by entering the values of common stock, retained earnings, and additional paid-in capital into cells A1 through A3. | What is the BVPS and how is it calculated? |
| They facilitate commodities trading using their resources, can take delivery of commodities if needed, provide advisory services for clients, and act as market makers by buying and selling futures contracts to add liquidity to the marketplace. The passage uses the example of a commercial baking firm to demonstrate how their impact can be seen in the market. | What role do eligible commercial entities play in commodities trading and market liquidity? |
| Naive diversification is a type of diversification strategy where an investor randomly selects different securities, hoping to lower the risk of the portfolio due to the varied nature of the chosen securities. It is less sophisticated than diversification methods using statistical modeling, but when guided by experience, careful security examination, and common sense, it remains an effective strategy for reducing portfolio risk. | What is the concept of naive diversification in investing and how does it compare to more sophisticated diversification methods? |
- Loss: MatryoshkaLoss with these parameters:
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [768, 512, 256, 128, 64],
      "matryoshka_weights": [1, 1, 1, 1, 1],
      "n_dims_per_step": -1
  }
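A minimal sketch of how this loss configuration is constructed in sentence-transformers, mirroring the parameters above:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Wrap MultipleNegativesRankingLoss so it is applied at every Matryoshka
# dimension, each weighted equally, as in the parameters above.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```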
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
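A hedged sketch of how these non-default values map onto SentenceTransformerTrainingArguments; the output_dir is a placeholder:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

# Sketch mapping the non-default hyperparameters above onto training
# arguments; "output/regulatory-model" is a placeholder path.
args = SentenceTransformerTrainingArguments(
    output_dir="output/regulatory-model",
    eval_strategy="epoch",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```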
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
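Putting the pieces together, a hedged sketch of the full fine-tuning loop with SentenceTransformerTrainer; the dataset file name is an assumption, since the card only describes a local json dataset with anchor and positive columns:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

# Assumed filename: the card only says the data is a local "json" dataset
# with "positive" and "anchor" columns.
train_dataset = load_dataset("json", data_files="train.json", split="train")

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,  # the SentenceTransformerTrainingArguments sketched above
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```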
Training Logs
Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|---|
0.0276 | 10 | 43.573 | - | - | - | - | - |
0.0551 | 20 | 42.1758 | - | - | - | - | - |
0.0827 | 30 | 37.6368 | - | - | - | - | - |
0.1102 | 40 | 34.5743 | - | - | - | - | - |
0.1378 | 50 | 29.5956 | - | - | - | - | - |
0.1653 | 60 | 23.4468 | - | - | - | - | - |
0.1929 | 70 | 19.7425 | - | - | - | - | - |
0.2204 | 80 | 16.9744 | - | - | - | - | - |
0.2480 | 90 | 15.2437 | - | - | - | - | - |
0.2755 | 100 | 13.9444 | - | - | - | - | - |
0.3031 | 110 | 12.067 | - | - | - | - | - |
0.3306 | 120 | 11.1149 | - | - | - | - | - |
0.3582 | 130 | 10.4083 | - | - | - | - | - |
0.3857 | 140 | 8.915 | - | - | - | - | - |
0.4133 | 150 | 9.4964 | - | - | - | - | - |
0.4408 | 160 | 8.0434 | - | - | - | - | - |
0.4684 | 170 | 8.1963 | - | - | - | - | - |
0.4960 | 180 | 8.5704 | - | - | - | - | - |
0.5235 | 190 | 7.711 | - | - | - | - | - |
0.5511 | 200 | 7.6676 | - | - | - | - | - |
0.5786 | 210 | 6.9899 | - | - | - | - | - |
0.6062 | 220 | 7.6195 | - | - | - | - | - |
0.6337 | 230 | 7.0456 | - | - | - | - | - |
0.6613 | 240 | 7.5541 | - | - | - | - | - |
0.6888 | 250 | 6.6543 | - | - | - | - | - |
0.7164 | 260 | 6.8849 | - | - | - | - | - |
0.7439 | 270 | 7.6635 | - | - | - | - | - |
0.7715 | 280 | 7.2155 | - | - | - | - | - |
0.7990 | 290 | 6.3284 | - | - | - | - | - |
0.8266 | 300 | 6.577 | - | - | - | - | - |
0.8541 | 310 | 5.0835 | - | - | - | - | - |
0.8817 | 320 | 6.1866 | - | - | - | - | - |
0.9092 | 330 | 5.9467 | - | - | - | - | - |
0.9368 | 340 | 5.663 | - | - | - | - | - |
0.9644 | 350 | 5.417 | - | - | - | - | - |
0.9919 | 360 | 6.0331 | - | - | - | - | - |
0.9974 | 362 | - | 0.6940 | 0.6900 | 0.6791 | 0.6603 | 0.6273 |
1.0220 | 370 | 5.5374 | - | - | - | - | - |
1.0496 | 380 | 4.5917 | - | - | - | - | - |
1.0771 | 390 | 4.6483 | - | - | - | - | - |
1.1047 | 400 | 4.96 | - | - | - | - | - |
1.1323 | 410 | 4.6808 | - | - | - | - | - |
1.1598 | 420 | 5.2396 | - | - | - | - | - |
1.1874 | 430 | 4.651 | - | - | - | - | - |
1.2149 | 440 | 4.4875 | - | - | - | - | - |
1.2425 | 450 | 4.6877 | - | - | - | - | - |
1.2700 | 460 | 4.2209 | - | - | - | - | - |
1.2976 | 470 | 4.678 | - | - | - | - | - |
1.3251 | 480 | 4.6774 | - | - | - | - | - |
1.3527 | 490 | 4.4409 | - | - | - | - | - |
1.3802 | 500 | 4.4464 | - | - | - | - | - |
1.4078 | 510 | 4.2724 | - | - | - | - | - |
1.4353 | 520 | 4.5017 | - | - | - | - | - |
1.4629 | 530 | 4.3469 | - | - | - | - | - |
1.4904 | 540 | 4.4925 | - | - | - | - | - |
1.5180 | 550 | 3.922 | - | - | - | - | - |
1.5455 | 560 | 4.6949 | - | - | - | - | - |
1.5731 | 570 | 4.0364 | - | - | - | - | - |
1.6007 | 580 | 4.3846 | - | - | - | - | - |
1.6282 | 590 | 3.7526 | - | - | - | - | - |
1.6558 | 600 | 4.0508 | - | - | - | - | - |
1.6833 | 610 | 4.6315 | - | - | - | - | - |
1.7109 | 620 | 3.7683 | - | - | - | - | - |
1.7384 | 630 | 4.6994 | - | - | - | - | - |
1.7660 | 640 | 4.1994 | - | - | - | - | - |
1.7935 | 650 | 4.3915 | - | - | - | - | - |
1.8211 | 660 | 4.2947 | - | - | - | - | - |
1.8486 | 670 | 4.6972 | - | - | - | - | - |
1.8762 | 680 | 4.1664 | - | - | - | - | - |
1.9037 | 690 | 4.1861 | - | - | - | - | - |
1.9313 | 700 | 3.6879 | - | - | - | - | - |
1.9588 | 710 | 4.3767 | - | - | - | - | - |
1.9864 | 720 | 4.48 | - | - | - | - | - |
1.9974 | 724 | - | 0.7013 | 0.6971 | 0.6885 | 0.6716 | 0.6414 |
2.0165 | 730 | 3.6164 | - | - | - | - | - |
2.0441 | 740 | 3.3361 | - | - | - | - | - |
2.0716 | 750 | 3.4175 | - | - | - | - | - |
2.0992 | 760 | 3.9006 | - | - | - | - | - |
2.1267 | 770 | 3.0823 | - | - | - | - | - |
2.1543 | 780 | 3.029 | - | - | - | - | - |
2.1818 | 790 | 3.8081 | - | - | - | - | - |
2.2094 | 800 | 3.4486 | - | - | - | - | - |
2.2370 | 810 | 3.6064 | - | - | - | - | - |
2.2645 | 820 | 3.0896 | - | - | - | - | - |
2.2921 | 830 | 3.3233 | - | - | - | - | - |
2.3196 | 840 | 2.9528 | - | - | - | - | - |
2.3472 | 850 | 3.0482 | - | - | - | - | - |
2.3747 | 860 | 3.2795 | - | - | - | - | - |
2.4023 | 870 | 2.9218 | - | - | - | - | - |
2.4298 | 880 | 3.4518 | - | - | - | - | - |
2.4574 | 890 | 3.6095 | - | - | - | - | - |
2.4849 | 900 | 3.2002 | - | - | - | - | - |
2.5125 | 910 | 3.368 | - | - | - | - | - |
2.5400 | 920 | 3.0623 | - | - | - | - | - |
2.5676 | 930 | 3.3495 | - | - | - | - | - |
2.5951 | 940 | 3.7123 | - | - | - | - | - |
2.6227 | 950 | 3.7795 | - | - | - | - | - |
2.6502 | 960 | 3.5567 | - | - | - | - | - |
2.6778 | 970 | 3.3498 | - | - | - | - | - |
2.7054 | 980 | 3.3141 | - | - | - | - | - |
2.7329 | 990 | 2.9425 | - | - | - | - | - |
2.7605 | 1000 | 2.9978 | - | - | - | - | - |
2.7880 | 1010 | 3.2468 | - | - | - | - | - |
2.8156 | 1020 | 2.5252 | - | - | - | - | - |
2.8431 | 1030 | 3.3108 | - | - | - | - | - |
2.8707 | 1040 | 3.195 | - | - | - | - | - |
2.8982 | 1050 | 3.1019 | - | - | - | - | - |
2.9258 | 1060 | 3.7059 | - | - | - | - | - |
2.9533 | 1070 | 3.1952 | - | - | - | - | - |
2.9809 | 1080 | 3.2454 | - | - | - | - | - |
2.9974 | 1086 | - | 0.7056 | 0.7030 | 0.6939 | 0.6779 | 0.6505 |
3.0110 | 1090 | 3.3788 | - | - | - | - | - |
3.0386 | 1100 | 2.9617 | - | - | - | - | - |
3.0661 | 1110 | 3.4313 | - | - | - | - | - |
3.0937 | 1120 | 2.5883 | - | - | - | - | - |
3.1212 | 1130 | 2.8836 | - | - | - | - | - |
3.1488 | 1140 | 2.3895 | - | - | - | - | - |
3.1763 | 1150 | 2.5155 | - | - | - | - | - |
3.2039 | 1160 | 3.3168 | - | - | - | - | - |
3.2314 | 1170 | 3.0286 | - | - | - | - | - |
3.2590 | 1180 | 3.1494 | - | - | - | - | - |
3.2866 | 1190 | 2.87 | - | - | - | - | - |
3.3141 | 1200 | 2.591 | - | - | - | - | - |
3.3417 | 1210 | 2.8437 | - | - | - | - | - |
3.3692 | 1220 | 3.0344 | - | - | - | - | - |
3.3968 | 1230 | 3.0685 | - | - | - | - | - |
3.4243 | 1240 | 3.4623 | - | - | - | - | - |
3.4519 | 1250 | 3.4256 | - | - | - | - | - |
3.4794 | 1260 | 2.7349 | - | - | - | - | - |
3.5070 | 1270 | 2.8587 | - | - | - | - | - |
3.5345 | 1280 | 2.729 | - | - | - | - | - |
3.5621 | 1290 | 3.0288 | - | - | - | - | - |
3.5896 | 1300 | 2.6599 | - | - | - | - | - |
3.6172 | 1310 | 2.4755 | - | - | - | - | - |
3.6447 | 1320 | 3.0501 | - | - | - | - | - |
3.6723 | 1330 | 2.545 | - | - | - | - | - |
3.6998 | 1340 | 2.5919 | - | - | - | - | - |
3.7274 | 1350 | 2.9026 | - | - | - | - | - |
3.7550 | 1360 | 2.7362 | - | - | - | - | - |
3.7825 | 1370 | 3.3311 | - | - | - | - | - |
3.8101 | 1380 | 2.8415 | - | - | - | - | - |
3.8376 | 1390 | 3.2033 | - | - | - | - | - |
3.8652 | 1400 | 2.7483 | - | - | - | - | - |
3.8927 | 1410 | 3.0403 | - | - | - | - | - |
3.9203 | 1420 | 3.0724 | - | - | - | - | - |
3.9478 | 1430 | 2.9797 | - | - | - | - | - |
3.9754 | 1440 | 2.6779 | - | - | - | - | - |
**3.9974** | **1448** | **-** | **0.7073** | **0.704** | **0.6945** | **0.6793** | **0.6523** |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.3.0
- Datasets: 3.4.1
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}