SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
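
The last two modules in the listing above are simple enough to sketch without the model itself. The following is a minimal NumPy illustration, assuming a BERT-style layout in which position 0 of each sequence holds the [CLS] token; the random array is only a stand-in for the Transformer output, and `cls_pool_and_normalize` is a hypothetical helper name, not part of the library.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Mimic modules (1) and (2): take the [CLS] vector, then L2-normalize it.

    token_embeddings has shape (batch, seq_len, hidden); position 0 is assumed
    to hold the [CLS] token, as in BERT-style encoders.
    """
    cls_vectors = token_embeddings[:, 0, :]            # pooling_mode_cls_token=True
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)   # Normalize()

# Toy stand-in for the Transformer output: 2 sequences, 4 tokens, 1024 dims.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(2, 4, 1024))
sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)
print(sentence_embeddings.shape)
```

Because every output vector has unit length, the dot product of two embeddings equals their cosine similarity.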

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("philocifer/banner-flip-arctic-embed-l")
# Run inference
sentences = [
    'How many full-time staff members are employed at the 54 Royal Market?',
    "STORE ANALYSIS: 54 Royal Market (728885)\nLocation: 817 9th Ave, New York\n\nStore 728885 - 54 Royal Market operates as a Open Store Superette establishment at 817 9th Ave, New York, NY 100194401 (FIPS 36-61). Geographically precise at coordinates 40.7662,-73.9872 (Geocoded to specific address), this location generates $1,768,000 in annual sales ($1,500,001 to $2,000,000) from its 2000.0 square foot space. The operation employs 11 full-time staff across 3 checkout lanes, yielding a sales density of $884.00/sqft. Owned by Independent (Family ID: 99999) as part of a 1 Store-location network, the store sources inventory through Small Supplier (Supplier ID: 888888, Family ID: 88888) to maintain its position in the Grocery sector's Superette segment.",
    "STORE ANALYSIS: 1683 Jimmy Deli Grocery (1816263)\nLocation: 1683 Woodbine St, Ridgewood\n\nStore 1816263 - 1683 Jimmy Deli Grocery operates as a Open Store Superette establishment at 1683 Woodbine St, Ridgewood, NY 113853546 (FIPS 36-81). Geographically precise at coordinates 40.7012,-73.9083 (Geocoded to specific address), this location generates $1,456,000 in annual sales ($1,000,001 to $1,500,000) from its 1000.0 square foot space. The operation employs 8 full-time staff across 2 checkout lanes, yielding a sales density of $1,456.00/sqft. Owned by Independent (Family ID: 99999) as part of a 1 Store-location network, the store sources inventory through Small Supplier (Supplier ID: 888888, Family ID: 88888) to maintain its position in the Grocery sector's Superette segment.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
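
For semantic search, the similarity scores are typically turned into a ranking of documents for a query. The sketch below shows that step with NumPy only; the three-dimensional vectors are hypothetical stand-ins for real embeddings, and `rank_by_cosine` is an illustrative helper, not a library function.

```python
import numpy as np

def rank_by_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """Return document indices sorted from most to least similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                 # cosine similarity of each document to the query
    return np.argsort(-scores)     # highest score first

# Hypothetical toy vectors standing in for a query and two document embeddings.
query = np.array([1.0, 0.0, 0.0])
docs = np.array([
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
])
order = rank_by_cosine(query, docs)
print(order)
```

With the model loaded as above, the same helper could presumably be called as `rank_by_cosine(embeddings[0], embeddings[1:])` to rank the two store descriptions against the question.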

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.85
cosine_accuracy@3 0.93
cosine_accuracy@5 0.96
cosine_accuracy@10 0.98
cosine_precision@1 0.85
cosine_precision@3 0.31
cosine_precision@5 0.192
cosine_precision@10 0.098
cosine_recall@1 0.85
cosine_recall@3 0.93
cosine_recall@5 0.96
cosine_recall@10 0.98
cosine_ndcg@10 0.9172
cosine_mrr@10 0.8968
cosine_map@100 0.8976
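
In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.93 / 3 = 0.31). That pattern is what one would expect if every query has exactly one relevant document, although the card does not state this. The sketch below computes the metrics under that assumption; the function name and the toy rankings are illustrative only.

```python
def metrics_at_k(rankings, relevant, k):
    """Average accuracy, precision, recall and MRR at cutoff k.

    rankings: list of ranked document-id lists, one per query.
    relevant: the single relevant document id for each query
              (an assumption of this sketch, not something the card states).
    """
    n = len(rankings)
    hits = 0
    reciprocal_ranks = 0.0
    for ranked, target in zip(rankings, relevant):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            reciprocal_ranks += 1.0 / (top_k.index(target) + 1)
    accuracy = hits / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,   # at most one relevant document in the top k
        "recall": accuracy,          # recall per query is either 0 or 1
        "mrr": reciprocal_ranks / n,
    }

# Hypothetical toy rankings for three queries whose relevant document is "a".
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
relevant = ["a", "a", "a"]
result = metrics_at_k(rankings, relevant, k=2)
print(result)
```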

Training Details

Training Dataset

Unnamed Dataset

  • Size: 300 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 300 samples:
    • sentence_0 (type: string): min 13 tokens, mean 19.23 tokens, max 29 tokens
    • sentence_1 (type: string): min 200 tokens, mean 215.01 tokens, max 232 tokens
  • Samples:
    sentence_0 sentence_1
    sentence_0: How many full-time staff members are employed at 3 Rivers Grocery Market?
    sentence_1: STORE ANALYSIS: 3 Rivers Grocery Market (432489)
    Location: 9400 US Highway 60 W, Kevil

    Store 432489 - 3 Rivers Grocery Market operates as a Open Store Supermarket-Conventional establishment at 9400 US Highway 60 W, Kevil, KY 420539678 (FIPS 21-145). Geographically precise at coordinates 37.0624,-88.8028 (Geocoded to specific address), this location generates $4,160,000 in annual sales ($4,000,001 to $6,000,000) from its 13000.0 square foot space. The operation employs 18 full-time staff across 4 checkout lanes, yielding a sales density of $320.00/sqft. Owned by Independent (Family ID: 99999) as part of a 1 Store-location network, the store sources inventory through Assoc Wholesale/Nashville Div (Supplier ID: 12115, Family ID: 4110) to maintain its position in the Grocery sector's Supermarket-Conventional segment.
    sentence_0: How many full-time staff members are employed at the 28th Street Supermarket?
    sentence_1: STORE ANALYSIS: 28th Street Supermarket (737932)
    Location: 2747 Cedar Ave, Cleveland

    Store 737932 - 28th Street Supermarket operates as a Open Store Superette establishment at 2747 Cedar Ave, Cleveland, OH 441152908 (FIPS 39-35). Geographically precise at coordinates 41.4988,-81.6687 (Geocoded to specific address), this location generates $1,404,000 in annual sales ($1,000,001 to $1,500,000) from its 1000.0 square foot space. The operation employs 4 full-time staff across 3 checkout lanes, yielding a sales density of $1,404.00/sqft. Owned by Independent (Family ID: 99999) as part of a 1 Store-location network, the store sources inventory through H T Hackney Co/Dist Ctr (Supplier ID: 36166, Family ID: 41880) to maintain its position in the Grocery sector's Superette segment.
    sentence_0: How many full-time staff members are employed at the 4th Street Market?
    sentence_1: STORE ANALYSIS: 4th Street Market (772013)
    Location: 301 4th St, Richmond

    Store 772013 - 4th Street Market operates as a Open Store Superette establishment at 301 4th St, Richmond, CA 948013001 (FIPS 6-13). Geographically precise at coordinates 37.9362,-122.3657 (Geocoded to specific address), this location generates $1,560,000 in annual sales ($1,500,001 to $2,000,000) from its 1000.0 square foot space. The operation employs 7 full-time staff across 2 checkout lanes, yielding a sales density of $1,560.00/sqft. Owned by Independent (Family ID: 99999) as part of a 1 Store-location network, the store sources inventory through Small Supplier (Supplier ID: 888888, Family ID: 88888) to maintain its position in the Grocery sector's Superette segment.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
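
To make the parameters above concrete, here is a minimal NumPy sketch of the idea: an in-batch-negatives ranking loss is evaluated on the first 768, 512, 256, 128 and 64 dimensions of each embedding, and the weighted results are summed. This is an illustration, not the library's implementation; the scale of 20 and the use of cosine similarity are assumptions, and the random arrays stand in for real (sentence_0, sentence_1) embeddings.

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives ranking loss: cross-entropy over scaled cosine scores.

    Row i of `positives` is the match for row i of `anchors`; every other row
    in the batch acts as a negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives, dims=(768, 512, 256, 128, 64), weights=None):
    """Weighted sum of the base loss on the first d dimensions, for each d in dims."""
    weights = weights or [1.0] * len(dims)
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

# Toy batch: positives are slightly perturbed copies of the anchors.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 1024))
positives = anchors + 0.01 * rng.normal(size=(4, 1024))
matched = matryoshka_loss(anchors, positives)
mismatched = matryoshka_loss(anchors, np.roll(positives, 1, axis=0))
print(matched, mismatched)
```

The loss is much lower when each anchor is paired with its own positive than when the pairs are shifted by one row. Training against truncated prefixes is what lets shorter slices of the embedding remain usable on their own.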
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step cosine_ndcg@10
1.0 30 0.9111
1.6667 50 0.9106
2.0 60 0.9058
3.0 90 0.9149
3.3333 100 0.9199
4.0 120 0.9185
5.0 150 0.9208
6.0 180 0.9172
6.6667 200 0.9172
7.0 210 0.9172
8.0 240 0.9172
8.3333 250 0.9172
9.0 270 0.9172
10.0 300 0.9172
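
The Epoch and Step columns follow from the dataset size and batch size reported earlier. The short calculation below reproduces the fractional epochs, assuming a single device so that the effective batch size equals per_device_train_batch_size.

```python
import math

TRAIN_SAMPLES = 300   # from "Size: 300 training samples"
BATCH_SIZE = 10       # per_device_train_batch_size, assuming a single device
NUM_EPOCHS = 10       # num_train_epochs

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS

def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch, total_steps)
fractional_epochs = [epoch_at(s) for s in (50, 100, 200, 250)]
print(fractional_epochs)
```

Rows at multiples of 30 steps therefore fall at epoch boundaries, while the rows at steps 50, 100, 200 and 250 look like periodic step-based evaluations.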

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 334M params (Safetensors, tensor type F32)