
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
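
Because the final Normalize() module L2-normalizes every embedding, cosine similarity between two sentences reduces to a plain dot product. A minimal sketch of that equivalence (the model id is the same placeholder used in the usage example below):

import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder model id, as in the usage example below
model = SentenceTransformer("sentence_transformers_model_id")

embeddings = model.encode(["first sentence", "second sentence"])

# Each vector has (approximately) unit norm thanks to the Normalize() module ...
print(np.linalg.norm(embeddings, axis=1))  # ~[1. 1.]

# ... so a dot product already gives the cosine similarity
print(float(embeddings[0] @ embeddings[1]))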

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    # The first two sentences describe the same event and are near-paraphrases;
    # the third is an unrelated passage.
    '九月辛未太祖曾孙舒国公从式进封安定郡王',
    '九月初二太祖曾孙舒国公从式进封安定郡王',
    '杨难当在汉中大肆烧杀抢劫然后率众离开了汉中向西返回仇池留下赵温据守梁州又派他的魏兴太守薛健屯驻黄金山',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
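
The same embeddings also work for semantic search over a corpus. A minimal sketch using util.semantic_search (the corpus and query strings below are placeholders):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")

corpus = ["corpus sentence one", "corpus sentence two", "corpus sentence three"]
query = "a query sentence"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode(query)

# For each query, returns the top_k corpus entries ranked by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], hit["score"])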

Training Details

Training Dataset

Unnamed Dataset

  • Size: 756,057 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string (min: 4 tokens, mean: 20.76 tokens, max: 199 tokens)
    positive: string (min: 4 tokens, mean: 31.27 tokens, max: 384 tokens)
  • Samples:
    anchor: 虏怀兼弱之威挟广地之计强兵大众亲自凌殄旍鼓弥年矢石不息
    positive: 魏人怀有兼并弱小的威严胸藏拓展土地的计谋强人的军队亲自出征侵逼消灭旌旗战鼓连年出动战事不停息
    anchor: 孟子曰 以善服人者未有能服人者也以善养人然后能服天下
    positive: 孟子说 用自己的善良使人们服从的人没有能使人服从的用善良影响教导人们才能使天下的人们都信服
    anchor: 开庆初大元兵渡江理宗议迁都平江庆元后谏不可恐摇动民心乃止
    positive: 开庆初年大元朝部队渡过长江理宗打算迁都到平江庆元皇后劝谏不可迁都深恐动摇民心理宗才作罢
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
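
A minimal sketch of how this loss is typically constructed for an anchor/positive dataset (the dataset rows below are placeholders; scale=20.0 and cos_sim are the values reported above):

from datasets import Dataset
from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Placeholder rows; the real dataset holds 756,057 anchor/positive pairs
train_dataset = Dataset.from_dict({
    "anchor": ["classical sentence 1", "classical sentence 2"],
    "positive": ["modern paraphrase 1", "modern paraphrase 2"],
})

# In-batch negatives: every other positive in a batch serves as a negative for a given anchor
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)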
    

Evaluation Dataset

Unnamed Dataset

  • Size: 84,007 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string (min: 4 tokens, mean: 20.23 tokens, max: 138 tokens)
    positive: string (min: 4 tokens, mean: 31.42 tokens, max: 384 tokens)
  • Samples:
    anchor: 雒阳户五万二千八百三十九
    positive: 雒阳有五万二千八百三十九户
    anchor: 拜南青州刺史在任有政绩
    positive: 任南青州刺史很有政绩
    anchor: 第六品以下加不得服金钅奠绫锦锦绣七缘绮貂豽裘金叉环铒及以金校饰器物张绛帐
    positive: 官位在第六品以下的官员再增加不得穿用金钿绫锦锦绣七缘绮貂钠皮衣金叉缳饵以及用金装饰的器物张绛帐等衣服物品
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
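
These non-default values map onto SentenceTransformerTrainingArguments roughly as follows. A self-contained sketch (output_dir, eval_steps/save_steps, and the tiny placeholder datasets are assumptions; the real run used the 756,057 / 84,007 pair datasets described above, and the Training Logs suggest evaluation every 5000 steps):

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
loss = MultipleNegativesRankingLoss(model)  # scale=20.0, cos_sim by default

# Placeholder datasets standing in for the real anchor/positive pairs
train_dataset = Dataset.from_dict({"anchor": ["a1", "a2"], "positive": ["p1", "p2"]})
eval_dataset = Dataset.from_dict({"anchor": ["a3"], "positive": ["p3"]})

args = SentenceTransformerTrainingArguments(
    output_dir="output",           # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    eval_steps=5000,               # matches the evaluation cadence in the Training Logs
    save_strategy="steps",
    save_steps=5000,               # assumed to mirror eval_steps
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()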

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0021 100 0.6475 -
0.0042 200 0.5193 -
0.0063 300 0.4132 -
0.0085 400 0.3981 -
0.0106 500 0.4032 -
0.0127 600 0.3627 -
0.0148 700 0.3821 -
0.0169 800 0.3767 -
0.0190 900 0.3731 -
0.0212 1000 0.3744 -
0.0233 1100 0.3115 -
0.0254 1200 0.3998 -
0.0275 1300 0.3103 -
0.0296 1400 0.3251 -
0.0317 1500 0.2833 -
0.0339 1600 0.3335 -
0.0360 1700 0.3281 -
0.0381 1800 0.423 -
0.0402 1900 0.3687 -
0.0423 2000 0.3452 -
0.0444 2100 0.8643 -
0.0466 2200 0.4279 -
0.0487 2300 0.4188 -
0.0508 2400 0.3676 -
0.0529 2500 0.3279 -
0.0550 2600 0.3415 -
0.0571 2700 1.5834 -
0.0593 2800 2.7778 -
0.0614 2900 2.7734 -
0.0635 3000 2.7732 -
0.0656 3100 2.7751 -
0.0677 3200 2.7731 -
0.0698 3300 2.773 -
0.0720 3400 2.7727 -
0.0741 3500 2.7534 -
0.0762 3600 2.2219 -
0.0783 3700 0.5137 -
0.0804 3800 0.4143 -
0.0825 3900 0.4002 -
0.0846 4000 0.368 -
0.0868 4100 0.3879 -
0.0889 4200 0.3519 -
0.0910 4300 0.364 -
0.0931 4400 0.3618 -
0.0952 4500 0.3545 -
0.0973 4600 0.379 -
0.0995 4700 0.3837 -
0.1016 4800 0.3553 -
0.1037 4900 0.3519 -
0.1058 5000 0.3416 0.3487
0.1079 5100 0.3763 -
0.1100 5200 0.3748 -
0.1122 5300 0.3564 -
0.1143 5400 0.336 -
0.1164 5500 0.3601 -
0.1185 5600 0.3521 -
0.1206 5700 0.376 -
0.1227 5800 0.3011 -
0.1249 5900 0.345 -
0.1270 6000 0.3211 -
0.1291 6100 0.3673 -
0.1312 6200 0.3762 -
0.1333 6300 0.3562 -
0.1354 6400 0.2761 -
0.1376 6500 0.3186 -
0.1397 6600 0.3582 -
0.1418 6700 0.3454 -
0.1439 6800 0.3429 -
0.1460 6900 0.2932 -
0.1481 7000 0.3357 -
0.1503 7100 0.2979 -
0.1524 7200 0.313 -
0.1545 7300 0.3364 -
0.1566 7400 0.3459 -
0.1587 7500 0.279 -
0.1608 7600 0.3274 -
0.1629 7700 0.3367 -
0.1651 7800 0.2935 -
0.1672 7900 0.3415 -
0.1693 8000 0.2838 -
0.1714 8100 0.2667 -
0.1735 8200 0.3051 -
0.1756 8300 0.3197 -
0.1778 8400 0.3086 -
0.1799 8500 0.3186 -
0.1820 8600 0.3063 -
0.1841 8700 0.2967 -
0.1862 8800 0.3069 -
0.1883 8900 0.3391 -
0.1905 9000 0.335 -
0.1926 9100 0.3115 -
0.1947 9200 0.3214 -
0.1968 9300 0.278 -
0.1989 9400 0.2833 -
0.2010 9500 0.303 -
0.2032 9600 0.3238 -
0.2053 9700 0.2622 -
0.2074 9800 0.3295 -
0.2095 9900 0.2699 -
0.2116 10000 0.2426 0.2962
0.2137 10100 0.262 -
0.2159 10200 0.3199 -
0.2180 10300 0.3677 -
0.2201 10400 0.2423 -
0.2222 10500 0.3446 -
0.2243 10600 0.3002 -
0.2264 10700 0.2863 -
0.2286 10800 0.2692 -
0.2307 10900 0.3157 -
0.2328 11000 0.3172 -
0.2349 11100 0.3622 -
0.2370 11200 0.3019 -
0.2391 11300 0.2789 -
0.2412 11400 0.2872 -
0.2434 11500 0.2823 -
0.2455 11600 0.3017 -
0.2476 11700 0.2573 -
0.2497 11800 0.3104 -
0.2518 11900 0.2857 -
0.2539 12000 0.2898 -
0.2561 12100 0.2389 -
0.2582 12200 0.3137 -
0.2603 12300 0.3029 -
0.2624 12400 0.2894 -
0.2645 12500 0.2665 -
0.2666 12600 0.2705 -
0.2688 12700 0.2673 -
0.2709 12800 0.248 -
0.2730 12900 0.2417 -
0.2751 13000 0.2852 -
0.2772 13100 0.2619 -
0.2793 13200 0.3157 -
0.2815 13300 0.2464 -
0.2836 13400 0.2837 -
0.2857 13500 0.3202 -
0.2878 13600 0.2618 -
0.2899 13700 0.2823 -
0.2920 13800 0.2634 -
0.2942 13900 0.2747 -
0.2963 14000 0.2835 -
0.2984 14100 0.2594 -
0.3005 14200 0.2744 -
0.3026 14300 0.2722 -
0.3047 14400 0.2514 -
0.3069 14500 0.2809 -
0.3090 14600 0.2949 -
0.3111 14700 0.2687 -
0.3132 14800 0.3 -
0.3153 14900 0.2684 -
0.3174 15000 0.2894 0.2790
0.3195 15100 0.2676 -
0.3217 15200 0.2519 -
0.3238 15300 0.2698 -
0.3259 15400 0.2898 -
0.3280 15500 0.2359 -
0.3301 15600 0.2866 -
0.3322 15700 0.3098 -
0.3344 15800 0.2809 -
0.3365 15900 0.3081 -
0.3386 16000 0.266 -
0.3407 16100 0.2523 -
0.3428 16200 0.3215 -
0.3449 16300 0.2883 -
0.3471 16400 0.2897 -
0.3492 16500 0.3174 -
0.3513 16600 0.2878 -
0.3534 16700 0.267 -
0.3555 16800 0.2452 -
0.3576 16900 0.2429 -
0.3598 17000 0.2178 -
0.3619 17100 0.2798 -
0.3640 17200 0.2367 -
0.3661 17300 0.2554 -
0.3682 17400 0.2883 -
0.3703 17500 0.2567 -
0.3725 17600 0.27 -
0.3746 17700 0.2837 -
0.3767 17800 0.2783 -
0.3788 17900 0.2517 -
0.3809 18000 0.2545 -
0.3830 18100 0.2632 -
0.3852 18200 0.2074 -
0.3873 18300 0.2276 -
0.3894 18400 0.3022 -
0.3915 18500 0.2381 -
0.3936 18600 0.2552 -
0.3957 18700 0.2579 -
0.3978 18800 0.2655 -
0.4000 18900 0.252 -
0.4021 19000 0.2876 -
0.4042 19100 0.2037 -
0.4063 19200 0.251 -
0.4084 19300 0.2588 -
0.4105 19400 0.201 -
0.4127 19500 0.2828 -
0.4148 19600 0.2637 -
0.4169 19700 0.3233 -
0.4190 19800 0.2475 -
0.4211 19900 0.2618 -
0.4232 20000 0.3272 0.2519
0.4254 20100 0.3074 -
0.4275 20200 0.2994 -
0.4296 20300 0.2624 -
0.4317 20400 0.2389 -
0.4338 20500 0.2809 -
0.4359 20600 0.2659 -
0.4381 20700 0.2508 -
0.4402 20800 0.2542 -
0.4423 20900 0.2525 -
0.4444 21000 0.257 -
0.4465 21100 0.2242 -
0.4486 21200 0.2307 -
0.4508 21300 0.2721 -
0.4529 21400 0.2489 -
0.4550 21500 0.2933 -
0.4571 21600 0.2448 -
0.4592 21700 0.2619 -
0.4613 21800 0.2488 -
0.4635 21900 0.2411 -
0.4656 22000 0.2964 -
0.4677 22100 0.2062 -
0.4698 22200 0.2665 -
0.4719 22300 0.263 -
0.4740 22400 0.2418 -
0.4762 22500 0.2879 -
0.4783 22600 0.2406 -
0.4804 22700 0.2448 -
0.4825 22800 0.243 -
0.4846 22900 0.2863 -
0.4867 23000 0.2833 -
0.4888 23100 0.2784 -
0.4910 23200 0.2789 -
0.4931 23300 0.2495 -
0.4952 23400 0.2872 -
0.4973 23500 0.2487 -
0.4994 23600 0.2669 -
0.5015 23700 0.2748 -
0.5037 23800 0.246 -
0.5058 23900 0.2512 -
0.5079 24000 0.222 -
0.5100 24100 0.2662 -
0.5121 24200 0.2238 -
0.5142 24300 0.2399 -
0.5164 24400 0.2595 -
0.5185 24500 0.3002 -
0.5206 24600 0.2553 -
0.5227 24700 0.226 -
0.5248 24800 0.2823 -
0.5269 24900 0.2737 -
0.5291 25000 0.2237 0.2492
0.5312 25100 0.2642 -
0.5333 25200 0.2486 -
0.5354 25300 0.2527 -
0.5375 25400 0.2363 -
0.5396 25500 0.2443 -
0.5418 25600 0.2485 -
0.5439 25700 0.2434 -
0.5460 25800 0.2631 -
0.5481 25900 0.284 -
0.5502 26000 0.217 -
0.5523 26100 0.2246 -
0.5545 26200 0.2614 -
0.5566 26300 0.2722 -
0.5587 26400 0.2114 -
0.5608 26500 0.2623 -
0.5629 26600 0.2475 -
0.5650 26700 0.2449 -
0.5671 26800 0.2423 -
0.5693 26900 0.2435 -
0.5714 27000 0.2446 -
0.5735 27100 0.2248 -
0.5756 27200 0.2159 -
0.5777 27300 0.2415 -
0.5798 27400 0.2257 -
0.5820 27500 0.2775 -
0.5841 27600 0.2533 -
0.5862 27700 0.2893 -
0.5883 27800 0.2095 -
0.5904 27900 0.2156 -
0.5925 28000 0.2315 -
0.5947 28100 0.2865 -
0.5968 28200 0.262 -
0.5989 28300 0.2506 -
0.6010 28400 0.2472 -
0.6031 28500 0.2395 -
0.6052 28600 0.2269 -
0.6074 28700 0.2639 -
0.6095 28800 0.2674 -
0.6116 28900 0.2521 -
0.6137 29000 0.2553 -
0.6158 29100 0.2526 -
0.6179 29200 0.231 -
0.6201 29300 0.2622 -
0.6222 29400 0.237 -
0.6243 29500 0.2475 -
0.6264 29600 0.2435 -
0.6285 29700 0.2109 -
0.6306 29800 0.2376 -
0.6328 29900 0.2202 -
0.6349 30000 0.2147 0.2370
0.6370 30100 0.2306 -
0.6391 30200 0.2249 -
0.6412 30300 0.3027 -
0.6433 30400 0.2115 -
0.6454 30500 0.2597 -
0.6476 30600 0.2483 -
0.6497 30700 0.2719 -
0.6518 30800 0.2162 -
0.6539 30900 0.2947 -
0.6560 31000 0.2144 -
0.6581 31100 0.2391 -
0.6603 31200 0.2572 -
0.6624 31300 0.1977 -
0.6645 31400 0.2678 -
0.6666 31500 0.2353 -
0.6687 31600 0.1911 -
0.6708 31700 0.2844 -
0.6730 31800 0.2689 -
0.6751 31900 0.2491 -
0.6772 32000 0.2259 -
0.6793 32100 0.2248 -
0.6814 32200 0.2462 -
0.6835 32300 0.2135 -
0.6857 32400 0.2085 -
0.6878 32500 0.227 -
0.6899 32600 0.2488 -
0.6920 32700 0.2614 -
0.6941 32800 0.2274 -
0.6962 32900 0.2389 -
0.6984 33000 0.2573 -
0.7005 33100 0.245 -
0.7026 33200 0.21 -
0.7047 33300 0.2196 -
0.7068 33400 0.2218 -
0.7089 33500 0.2092 -
0.7111 33600 0.2526 -
0.7132 33700 0.2275 -
0.7153 33800 0.2622 -
0.7174 33900 0.2469 -
0.7195 34000 0.2157 -
0.7216 34100 0.2326 -
0.7237 34200 0.268 -
0.7259 34300 0.2628 -
0.7280 34400 0.2503 -
0.7301 34500 0.2101 -
0.7322 34600 0.237 -
0.7343 34700 0.233 -
0.7364 34800 0.2077 -
0.7386 34900 0.259 -
0.7407 35000 0.2312 0.2284
0.7428 35100 0.287 -
0.7449 35200 0.2278 -
0.7470 35300 0.2618 -
0.7491 35400 0.2298 -
0.7513 35500 0.195 -
0.7534 35600 0.2248 -
0.7555 35700 0.2234 -
0.7576 35800 0.2218 -
0.7597 35900 0.2002 -
0.7618 36000 0.2158 -
0.7640 36100 0.1919 -
0.7661 36200 0.2972 -
0.7682 36300 0.2665 -
0.7703 36400 0.2114 -
0.7724 36500 0.1879 -
0.7745 36600 0.2137 -
0.7767 36700 0.2847 -
0.7788 36800 0.2372 -
0.7809 36900 0.2058 -
0.7830 37000 0.2205 -
0.7851 37100 0.2012 -
0.7872 37200 0.2057 -
0.7894 37300 0.1932 -
0.7915 37400 0.2261 -
0.7936 37500 0.2633 -
0.7957 37600 0.1558 -
0.7978 37700 0.2064 -
0.7999 37800 0.2166 -
0.8020 37900 0.2249 -
0.8042 38000 0.2626 -
0.8063 38100 0.1945 -
0.8084 38200 0.2611 -
0.8105 38300 0.199 -
0.8126 38400 0.2004 -
0.8147 38500 0.2506 -
0.8169 38600 0.1722 -
0.8190 38700 0.1959 -
0.8211 38800 0.2505 -
0.8232 38900 0.2343 -
0.8253 39000 0.2353 -
0.8274 39100 0.22 -
0.8296 39200 0.2089 -
0.8317 39300 0.2416 -
0.8338 39400 0.1916 -
0.8359 39500 0.2387 -
0.8380 39600 0.2475 -
0.8401 39700 0.2189 -
0.8423 39800 0.2141 -
0.8444 39900 0.2008 -
0.8465 40000 0.2489 0.2253
0.8486 40100 0.2258 -
0.8507 40200 0.2341 -
0.8528 40300 0.2377 -
0.8550 40400 0.194 -
0.8571 40500 0.2144 -
0.8592 40600 0.2605 -
0.8613 40700 0.2517 -
0.8634 40800 0.2044 -
0.8655 40900 0.2259 -
0.8677 41000 0.2141 -
0.8698 41100 0.1895 -
0.8719 41200 0.2361 -
0.8740 41300 0.1978 -
0.8761 41400 0.2089 -
0.8782 41500 0.2258 -
0.8803 41600 0.2368 -
0.8825 41700 0.2473 -
0.8846 41800 0.2185 -
0.8867 41900 0.212 -
0.8888 42000 0.2469 -
0.8909 42100 0.1817 -
0.8930 42200 0.1884 -
0.8952 42300 0.207 -
0.8973 42400 0.2422 -
0.8994 42500 0.2606 -
0.9015 42600 0.2266 -
0.9036 42700 0.2103 -
0.9057 42800 0.2712 -
0.9079 42900 0.1944 -
0.9100 43000 0.2003 -
0.9121 43100 0.1991 -
0.9142 43200 0.2129 -
0.9163 43300 0.2465 -
0.9184 43400 0.1764 -
0.9206 43500 0.2365 -
0.9227 43600 0.2054 -
0.9248 43700 0.2551 -
0.9269 43800 0.2322 -
0.9290 43900 0.2213 -
0.9311 44000 0.1962 -
0.9333 44100 0.1988 -
0.9354 44200 0.1982 -
0.9375 44300 0.2193 -
0.9396 44400 0.2378 -
0.9417 44500 0.2244 -
0.9438 44600 0.2296 -
0.9460 44700 0.2446 -
0.9481 44800 0.2206 -
0.9502 44900 0.1815 -
0.9523 45000 0.2385 0.22
0.9544 45100 0.2106 -
0.9565 45200 0.1929 -
0.9586 45300 0.181 -
0.9608 45400 0.1908 -
0.9629 45500 0.1926 -
0.9650 45600 0.1922 -
0.9671 45700 0.2003 -
0.9692 45800 0.2377 -
0.9713 45900 0.2069 -
0.9735 46000 0.2024 -
0.9756 46100 0.1795 -
0.9777 46200 0.2372 -
0.9798 46300 0.2135 -
0.9819 46400 0.2396 -
0.9840 46500 0.2295 -
0.9862 46600 0.2235 -
0.9883 46700 0.2427 -
0.9904 46800 0.2145 -
0.9925 46900 0.2231 -
0.9946 47000 0.2401 -
0.9967 47100 0.1764 -
0.9989 47200 0.1943 -
  • The saved checkpoint (load_best_model_at_end) corresponds to the row with the lowest validation loss: step 45000, loss 0.22.

Framework Versions

  • Python: 3.12.4
  • Sentence Transformers: 3.1.0.dev0
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cpu
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}