SentenceTransformer based on Snowflake/snowflake-arctic-embed-l-v2.0

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l-v2.0 on the json dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l-v2.0
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

Model Sources

  • Documentation: Sentence Transformers Documentation
  • Repository: Sentence Transformers on GitHub
  • Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
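
The three modules above map one-to-one onto what SentenceTransformer loads at inference time. As a quick sketch (using the model id from the Usage section below), the key properties can be checked like this:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("coffeepowered/arctic-l-enhanced")

# Module (0): XLM-RoBERTa transformer; inputs are truncated at 8192 tokens.
print(model.max_seq_length)                      # 8192
# Module (1): CLS-token pooling that produces the sentence embedding.
print(model.get_sentence_embedding_dimension())  # 1024
# Module (2): L2 normalization, so cosine similarity reduces to a dot product.
print([m.__class__.__name__ for m in model])     # ['Transformer', 'Pooling', 'Normalize']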

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("coffeepowered/arctic-l-enhanced")
# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
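
Beyond pairwise similarity, the same calls cover the semantic-search use case mentioned in the introduction. A minimal retrieval sketch with a made-up corpus and query (the card does not state whether this finetune keeps the base model's "query" prompt, so that option is only noted in a comment):

import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("coffeepowered/arctic-l-enhanced")

# Made-up corpus and query, for illustration only.
corpus = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
query = "How is the weather?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])
# If the base model's "query" prompt carried over to this finetune,
# model.encode([query], prompt_name="query") may retrieve better; unverified here.

# Embeddings are L2-normalized (Normalize module), so cosine similarity ranks the corpus.
scores = model.similarity(query_embedding, corpus_embeddings)[0]
top = torch.topk(scores, k=2)
for score, idx in zip(top.values, top.indices):
    print(f"{float(score):.3f}  {corpus[int(idx)]}")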

Training Details

Training Dataset

json

  • Dataset: json
  • Columns: sentence1, sentence2, score, and split
  • Loss: CosineSimilarityLoss with these parameters (a construction sketch follows after this list):
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
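
The column layout and loss above can be reconstructed with standard sentence-transformers building blocks. A minimal sketch with made-up rows (the original data also carries a split column, omitted here):

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CosineSimilarityLoss

# Base model that this finetune started from.
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-l-v2.0")

# Made-up rows in the sentence1 / sentence2 / score layout described above.
train_dataset = Dataset.from_dict({
    "sentence1": ["The weather is lovely today.", "He drove to the stadium."],
    "sentence2": ["It's so sunny outside!", "She walked to the park."],
    "score": [0.9, 0.2],
})

# CosineSimilarityLoss fits cos(embed(sentence1), embed(sentence2)) to the gold score;
# its default loss_fct is torch.nn.MSELoss, matching the parameters listed above.
loss = CosineSimilarityLoss(model)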
    

Evaluation Dataset

json

  • Dataset: json
  • Columns: sentence1, sentence2, score, and split (an evaluation sketch follows after this list)
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
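
The same columns drive evaluation. A sketch using sentence-transformers' built-in STS-style evaluator with made-up validation pairs:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("coffeepowered/arctic-l-enhanced")

# Made-up validation pairs in the sentence1 / sentence2 / score layout.
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["The weather is lovely today.", "He drove to the stadium."],
    sentences2=["It's so sunny outside!", "She walked to the park."],
    scores=[0.9, 0.2],
    name="json-dev",  # hypothetical name, not from the original run
)
print(evaluator(model))  # Pearson/Spearman correlations of predicted vs. gold scores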
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • learning_rate: 4e-06
  • max_steps: 13938
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
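
These non-default values map directly onto SentenceTransformerTrainingArguments. A rough reconstruction (output_dir and anything not listed in this card are assumptions, not the values of the original run):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="arctic-l-enhanced",  # hypothetical path
    eval_strategy="steps",
    learning_rate=4e-6,
    max_steps=13938,
    warmup_ratio=0.1,
    fp16=True,
    load_best_model_at_end=True,
)

Together with the dataset, loss, and evaluator sketched in the sections above, these arguments would be passed to SentenceTransformerTrainer to run the finetuning.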

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 4e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: 13938
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0007 10 0.1359 -
0.0014 20 0.1202 -
0.0022 30 0.1314 -
0.0029 40 0.1302 -
0.0036 50 0.1158 -
0.0043 60 0.1158 -
0.0050 70 0.1114 -
0.0057 80 0.1316 -
0.0065 90 0.133 -
0.0072 100 0.0958 -
0.0079 110 0.0973 -
0.0086 120 0.0949 -
0.0093 130 0.0892 -
0.0100 140 0.0867 -
0.0108 150 0.0717 0.0712
0.0115 160 0.0762 -
0.0122 170 0.0828 -
0.0129 180 0.0775 -
0.0136 190 0.0614 -
0.0143 200 0.0748 -
0.0151 210 0.0545 -
0.0158 220 0.0725 -
0.0165 230 0.0627 -
0.0172 240 0.0612 -
0.0179 250 0.0508 -
0.0187 260 0.0592 -
0.0194 270 0.0489 -
0.0201 280 0.0545 -
0.0208 290 0.0598 -
0.0215 300 0.0641 0.0566
0.0222 310 0.0554 -
0.0230 320 0.0523 -
0.0237 330 0.0568 -
0.0244 340 0.0594 -
0.0251 350 0.0488 -
0.0258 360 0.06 -
0.0265 370 0.0737 -
0.0273 380 0.0565 -
0.0280 390 0.0456 -
0.0287 400 0.0489 -
0.0294 410 0.0478 -
0.0301 420 0.0481 -
0.0309 430 0.0541 -
0.0316 440 0.0581 -
0.0323 450 0.0549 0.0578
0.0330 460 0.042 -
0.0337 470 0.0451 -
0.0344 480 0.0537 -
0.0352 490 0.0487 -
0.0359 500 0.0444 -
0.0366 510 0.0443 -
0.0373 520 0.0418 -
0.0380 530 0.0447 -
0.0387 540 0.0453 -
0.0395 550 0.0465 -
0.0402 560 0.0623 -
0.0409 570 0.0507 -
0.0416 580 0.0428 -
0.0423 590 0.0494 -
0.0430 600 0.0407 0.0519
0.0438 610 0.0452 -
0.0445 620 0.0439 -
0.0452 630 0.0534 -
0.0459 640 0.0422 -
0.0466 650 0.0514 -
0.0474 660 0.0406 -
0.0481 670 0.0428 -
0.0488 680 0.0346 -
0.0495 690 0.0485 -
0.0502 700 0.0549 -
0.0509 710 0.0399 -
0.0517 720 0.0362 -
0.0524 730 0.0507 -
0.0531 740 0.0434 -
0.0538 750 0.0363 0.0553
0.0545 760 0.0426 -
0.0552 770 0.0375 -
0.0560 780 0.0501 -
0.0567 790 0.043 -
0.0574 800 0.0476 -
0.0581 810 0.037 -
0.0588 820 0.0317 -
0.0595 830 0.0387 -
0.0603 840 0.0348 -
0.0610 850 0.0379 -
0.0617 860 0.0506 -
0.0624 870 0.0419 -
0.0631 880 0.0431 -
0.0639 890 0.0478 -
0.0646 900 0.0394 0.0571
0.0653 910 0.0334 -
0.0660 920 0.0384 -
0.0667 930 0.0381 -
0.0674 940 0.0343 -
0.0682 950 0.0355 -
0.0689 960 0.0384 -
0.0696 970 0.0352 -
0.0703 980 0.0479 -
0.0710 990 0.0362 -
0.0717 1000 0.0357 -
0.0725 1010 0.0414 -
0.0732 1020 0.0346 -
0.0739 1030 0.039 -
0.0746 1040 0.0317 -
0.0753 1050 0.0502 0.0523
0.0761 1060 0.0407 -
0.0768 1070 0.0377 -
0.0775 1080 0.0404 -
0.0782 1090 0.0297 -
0.0789 1100 0.0344 -
0.0796 1110 0.0277 -
0.0804 1120 0.0453 -
0.0811 1130 0.0469 -
0.0818 1140 0.044 -
0.0825 1150 0.0339 -
0.0832 1160 0.0331 -
0.0839 1170 0.0327 -
0.0847 1180 0.0445 -
0.0854 1190 0.0392 -
0.0861 1200 0.048 0.0531
0.0868 1210 0.0375 -
0.0875 1220 0.0311 -
0.0882 1230 0.037 -
0.0890 1240 0.0369 -
0.0897 1250 0.0398 -
0.0904 1260 0.0363 -
0.0911 1270 0.0425 -
0.0918 1280 0.0355 -
0.0926 1290 0.0362 -
0.0933 1300 0.0354 -
0.0940 1310 0.0252 -
0.0947 1320 0.03 -
0.0954 1330 0.03 -
0.0961 1340 0.0218 -
0.0969 1350 0.0248 0.0448
0.0976 1360 0.0287 -
0.0983 1370 0.0322 -
0.0990 1380 0.0359 -
0.0997 1390 0.0287 -
0.1004 1400 0.034 -
0.1012 1410 0.0305 -
0.1019 1420 0.0241 -
0.1026 1430 0.0227 -
0.1033 1440 0.0279 -
0.1040 1450 0.0298 -
0.1047 1460 0.0382 -
0.1055 1470 0.0269 -
0.1062 1480 0.0243 -
0.1069 1490 0.0266 -
0.1076 1500 0.016 0.0456
0.1083 1510 0.0233 -
0.1091 1520 0.0247 -
0.1098 1530 0.0233 -
0.1105 1540 0.0214 -
0.1112 1550 0.0229 -
0.1119 1560 0.0145 -
0.1126 1570 0.0187 -
0.1134 1580 0.0231 -
0.1141 1590 0.0268 -
0.1148 1600 0.027 -
0.1155 1610 0.0354 -
0.1162 1620 0.0398 -
0.1169 1630 0.0431 -
0.1177 1640 0.0391 -
0.1184 1650 0.056 0.0422
0.1191 1660 0.0394 -
0.1198 1670 0.0338 -
0.1205 1680 0.0417 -
0.1213 1690 0.0479 -
0.1220 1700 0.0348 -
0.1227 1710 0.0366 -
0.1234 1720 0.0335 -
0.1241 1730 0.0367 -
0.1248 1740 0.0355 -
0.1256 1750 0.0319 -
0.1263 1760 0.0324 -
0.1270 1770 0.0345 -
0.1277 1780 0.0352 -
0.1284 1790 0.0362 -
0.1291 1800 0.0319 0.0388
0.1299 1810 0.0381 -
0.1306 1820 0.0368 -
0.1313 1830 0.0318 -
0.1320 1840 0.0296 -
0.1327 1850 0.0331 -
0.1334 1860 0.0381 -
0.1342 1870 0.0237 -
0.1349 1880 0.0284 -
0.1356 1890 0.0353 -
0.1363 1900 0.0408 -
0.1370 1910 0.0277 -
0.1378 1920 0.0268 -
0.1385 1930 0.0348 -
0.1392 1940 0.0348 -
0.1399 1950 0.0269 0.0351
0.1406 1960 0.0317 -
0.1413 1970 0.0375 -
0.1421 1980 0.0321 -
0.1428 1990 0.0302 -
0.1435 2000 0.0411 -
0.1442 2010 0.0329 -
0.1449 2020 0.0246 -
0.1456 2030 0.0288 -
0.1464 2040 0.0297 -
0.1471 2050 0.0285 -
0.1478 2060 0.0326 -
0.1485 2070 0.035 -
0.1492 2080 0.0208 -
0.1499 2090 0.0209 -
0.1507 2100 0.0303 0.0364
0.1514 2110 0.028 -
0.1521 2120 0.0366 -
0.1528 2130 0.033 -
0.1535 2140 0.0353 -
0.1543 2150 0.0292 -
0.1550 2160 0.0202 -
0.1557 2170 0.0233 -
0.1564 2180 0.0346 -
0.1571 2190 0.0309 -
0.1578 2200 0.0241 -
0.1586 2210 0.0317 -
0.1593 2220 0.0331 -
0.1600 2230 0.0273 -
0.1607 2240 0.0219 -
0.1614 2250 0.0256 0.0338
0.1621 2260 0.024 -
0.1629 2270 0.0216 -
0.1636 2280 0.0253 -
0.1643 2290 0.0271 -
0.1650 2300 0.0336 -
0.1657 2310 0.0273 -
0.1665 2320 0.0267 -
0.1672 2330 0.0292 -
0.1679 2340 0.0269 -
0.1686 2350 0.0342 -
0.1693 2360 0.0361 -
0.1700 2370 0.0224 -
0.1708 2380 0.016 -
0.1715 2390 0.0209 -
0.1722 2400 0.0162 0.0329
0.1729 2410 0.019 -
0.1736 2420 0.0181 -
0.1743 2430 0.0332 -
0.1751 2440 0.0189 -
0.1758 2450 0.0292 -
0.1765 2460 0.0193 -
0.1772 2470 0.024 -
0.1779 2480 0.0245 -
0.1786 2490 0.02 -
0.1794 2500 0.0187 -
0.1801 2510 0.0224 -
0.1808 2520 0.028 -
0.1815 2530 0.0219 -
0.1822 2540 0.0212 -
0.1830 2550 0.0208 0.0354
0.1837 2560 0.0299 -
0.1844 2570 0.0208 -
0.1851 2580 0.0271 -
0.1858 2590 0.0146 -
0.1865 2600 0.0163 -
0.1873 2610 0.0149 -
0.1880 2620 0.0145 -
0.1887 2630 0.0196 -
0.1894 2640 0.0252 -
0.1901 2650 0.0173 -
0.1908 2660 0.0202 -
0.1916 2670 0.0171 -
0.1923 2680 0.023 -
0.1930 2690 0.02 -
0.1937 2700 0.0171 0.0387
0.1944 2710 0.0239 -
0.1951 2720 0.0261 -
0.1959 2730 0.0197 -
0.1966 2740 0.0195 -
0.1973 2750 0.0132 -
0.1980 2760 0.0226 -
0.1987 2770 0.0172 -
0.1995 2780 0.0214 -
0.2002 2790 0.0173 -
0.2009 2800 0.0179 -
0.2016 2810 0.0184 -
0.2023 2820 0.0162 -
0.2030 2830 0.0126 -
0.2038 2840 0.0188 -
0.2045 2850 0.0177 0.0385
0.2052 2860 0.0153 -
0.2059 2870 0.0137 -
0.2066 2880 0.0174 -
0.2073 2890 0.0148 -
0.2081 2900 0.0194 -
0.2088 2910 0.0133 -
0.2095 2920 0.0152 -
0.2102 2930 0.0179 -
0.2109 2940 0.0187 -
0.2117 2950 0.0161 -
0.2124 2960 0.0185 -
0.2131 2970 0.0151 -
0.2138 2980 0.0194 -
0.2145 2990 0.0179 -
0.2152 3000 0.0115 0.0434
0.2160 3010 0.0183 -
0.2167 3020 0.0168 -
0.2174 3030 0.0137 -
0.2181 3040 0.0196 -
0.2188 3050 0.0145 -
0.2195 3060 0.0164 -
0.2203 3070 0.0132 -
0.2210 3080 0.0133 -
0.2217 3090 0.0154 -
0.2224 3100 0.0145 -
0.2231 3110 0.0135 -
0.2238 3120 0.0143 -
0.2246 3130 0.0197 -
0.2253 3140 0.0163 -
0.2260 3150 0.017 0.0409
0.2267 3160 0.0257 -
0.2274 3170 0.0165 -
0.2282 3180 0.0189 -
0.2289 3190 0.0207 -
0.2296 3200 0.0179 -
0.2303 3210 0.0152 -
0.2310 3220 0.0179 -
0.2317 3230 0.0187 -
0.2325 3240 0.0189 -
0.2332 3250 0.0151 -
0.2339 3260 0.0137 -
0.2346 3270 0.0122 -
0.2353 3280 0.0134 -
0.2360 3290 0.0134 -
0.2368 3300 0.0162 0.0455
0.2375 3310 0.0142 -
0.2382 3320 0.0203 -
0.2389 3330 0.0148 -
0.2396 3340 0.0181 -
0.2404 3350 0.0178 -
0.2411 3360 0.018 -
0.2418 3370 0.012 -
0.2425 3380 0.0201 -
0.2432 3390 0.0193 -
0.2439 3400 0.013 -
0.2447 3410 0.0114 -
0.2454 3420 0.0184 -
0.2461 3430 0.0153 -
0.2468 3440 0.0172 -
0.2475 3450 0.0141 0.0458
0.2482 3460 0.0165 -
0.2490 3470 0.0174 -
0.2497 3480 0.0109 -
0.2504 3490 0.0177 -
0.2511 3500 0.0178 -
0.2518 3510 0.0101 -
0.2525 3520 0.0154 -
0.2533 3530 0.0204 -
0.2540 3540 0.0119 -
0.2547 3550 0.0158 -
0.2554 3560 0.0161 -
0.2561 3570 0.0116 -
0.2569 3580 0.0183 -
0.2576 3590 0.0125 -
0.2583 3600 0.0148 0.0475
0.2590 3610 0.0184 -
0.2597 3620 0.0169 -
0.2604 3630 0.0135 -
0.2612 3640 0.0138 -
0.2619 3650 0.0133 -
0.2626 3660 0.0148 -
0.2633 3670 0.0121 -
0.2640 3680 0.0119 -
0.2647 3690 0.0183 -
0.2655 3700 0.0178 -
0.2662 3710 0.0139 -
0.2669 3720 0.0171 -
0.2676 3730 0.0155 -
0.2683 3740 0.0173 -
0.2690 3750 0.0158 0.0488
0.2698 3760 0.0125 -
0.2705 3770 0.0137 -
0.2712 3780 0.0184 -
0.2719 3790 0.0117 -
0.2726 3800 0.0189 -
0.2734 3810 0.0163 -
0.2741 3820 0.0163 -
0.2748 3830 0.0212 -
0.2755 3840 0.0133 -
0.2762 3850 0.0168 -
0.2769 3860 0.0197 -
0.2777 3870 0.0184 -
0.2784 3880 0.017 -
0.2791 3890 0.0173 -
0.2798 3900 0.0165 0.0511
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.2.2
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0
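
To reproduce this environment, the versions above can be pinned in a single install command; newer releases will usually work as well but are not covered by this card:

pip install "sentence-transformers==3.4.1" "transformers==4.49.0" "torch==2.2.2" "accelerate==1.4.0" "datasets==3.3.2" "tokenizers==0.21.0"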

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}