SentenceTransformer based on xmanii/maux-gte-persian-v3

This is a sentence-transformers model fine-tuned from xmanii/maux-gte-persian-v3. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: xmanii/maux-gte-persian-v3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NewModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
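
The three modules above can be read as a pipeline: the Transformer produces one vector per token, Pooling keeps only the first ([CLS]) token because pooling_mode_cls_token is True, and Normalize() rescales that vector to unit length. The following is a minimal sketch of the last two steps on random stand-in data. The function name and the fake input are illustrative assumptions and are not part of the library.

```python
import math
import random


def cls_pool_and_normalize(token_embeddings):
    """Sketch of modules (1) and (2): CLS pooling, then L2 normalization.

    token_embeddings: list of sequences, each a list of per-token vectors.
    """
    sentence_embeddings = []
    for sequence in token_embeddings:
        cls_vector = sequence[0]  # keep only the first ([CLS]) token vector
        norm = math.sqrt(sum(x * x for x in cls_vector)) or 1.0  # avoid dividing by zero
        sentence_embeddings.append([x / norm for x in cls_vector])
    return sentence_embeddings


# Random stand-in for the Transformer output: 2 sequences, 5 tokens, 768 dimensions.
random.seed(0)
fake_output = [
    [[random.gauss(0.0, 1.0) for _ in range(768)] for _ in range(5)]
    for _ in range(2)
]
embeddings = cls_pool_and_normalize(fake_output)
print(len(embeddings), len(embeddings[0]))
```

Because every output vector has unit length, the cosine similarity between two embeddings equals their dot product.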

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("axiomepic/gte-persian-seo-keyword-embedding")
# Run inference
sentences = [
    'چراغ قوه\u200c ضد انفجار',
    'دودی روشن',
    'ففدول',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7218, 0.5096],
#         [0.7218, 1.0000, 0.5915],
#         [0.5096, 0.5915, 1.0000]])
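
Since the model ends with a Normalize() module, ranking candidate keywords against a query reduces to sorting by dot product. The sketch below shows that ranking step on toy unit vectors so it runs without downloading the model. The helper name rank_by_similarity is hypothetical; with the real model you would presumably pass rows of model.encode(...) instead.

```python
def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted by descending dot product.

    Assumes all vectors are already L2-normalized, so the dot product
    equals the cosine similarity.
    """
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional unit vectors standing in for real 768-dimensional embeddings.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]
ranking = rank_by_similarity(query, candidates)
print(ranking)
```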

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,721 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    column     type    details
    sentence1  string  min: 3 tokens, mean: 6.51 tokens, max: 14 tokens
    sentence2  string  min: 3 tokens, mean: 6.19 tokens, max: 18 tokens
    label      int     0: ~35.90%, 1: ~64.10%
  • Samples:
    sentence1 sentence2 label
    قیمت لباس بچه بهترین سایت لباس کودک 1
    قیمت رژلب قیمت ژل ابرو 1
    سیگار برگ‌ سیگاری 1
  • Loss: OnlineContrastiveLoss
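
As a rough illustration of what OnlineContrastiveLoss optimizes, the sketch below computes the loss for one batch from precomputed pair distances. It follows my understanding of the sentence-transformers implementation: only "hard" pairs contribute, meaning positives that are farther apart than the closest negative and negatives that are closer than the farthest positive. The margin of 0.5 and the use of cosine distance (1 minus cosine similarity) are the library defaults as far as I know; the card does not state which values were used here, so treat both as assumptions.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Sketch of an online contrastive loss over one batch.

    distances: per-pair distances (assumed: 1 - cosine similarity).
    labels: 1 for similar pairs, 0 for dissimilar pairs.
    Assumes the batch contains at least one pair of each label.
    """
    positives = [d for d, y in zip(distances, labels) if y == 1]
    negatives = [d for d, y in zip(distances, labels) if y == 0]

    # Hard negatives are closer than the farthest positive; hard positives
    # are farther than the closest negative. With a single pair on the
    # other side, the mean is used as the cut-off instead.
    neg_cutoff = max(positives) if len(positives) > 1 else sum(negatives) / len(negatives)
    pos_cutoff = min(negatives) if len(negatives) > 1 else sum(positives) / len(positives)
    hard_negatives = [d for d in negatives if d < neg_cutoff]
    hard_positives = [d for d in positives if d > pos_cutoff]

    positive_loss = sum(d ** 2 for d in hard_positives)
    negative_loss = sum(max(margin - d, 0.0) ** 2 for d in hard_negatives)
    return positive_loss + negative_loss


# Made-up distances for four pairs: two labeled similar, two labeled dissimilar.
loss = online_contrastive_loss([0.1, 0.6, 0.3, 0.8], [1, 1, 0, 0])
print(round(loss, 4))
```

In this toy batch only the positive at distance 0.6 and the negative at distance 0.3 count as hard pairs; the easy pairs at 0.1 and 0.8 contribute nothing.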

Evaluation Dataset

Unnamed Dataset

  • Size: 969 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 969 samples:
    column     type    details
    sentence1  string  min: 3 tokens, mean: 6.49 tokens, max: 15 tokens
    sentence2  string  min: 3 tokens, mean: 6.18 tokens, max: 19 tokens
    label      int     0: ~33.54%, 1: ~66.46%
  • Samples:
    sentence1 sentence2 label
    خواص دنبه خواص روغن دنبه 1
    چای ماچا چای نعناع 1
    اشتراک اسپاتیفای خرید اکانت پرمیوم اسپاتیفای 1
  • Loss: OnlineContrastiveLoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 64
  • gradient_accumulation_steps: 2
  • learning_rate: 1e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.05
  • log_level_replica: passive
  • log_on_each_node: False
  • logging_nan_inf_filter: False
  • load_best_model_at_end: True
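
The hyperparameters above determine the effective batch size and the number of optimizer steps. The arithmetic below is a small sketch that assumes a single training device and no dropped last batch (dataloader_drop_last is False in this card). The warmup figure additionally assumes the trainer rounds warmup_ratio times the total steps upward, which is my understanding of the Transformers behavior rather than something the card states.

```python
import math

# Values taken from this card.
train_samples = 8721
per_device_train_batch_size = 16
gradient_accumulation_steps = 2
num_train_epochs = 10
warmup_ratio = 0.05

# Assumption: one device, last partial batch kept.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
batches_per_epoch = math.ceil(train_samples / per_device_train_batch_size)
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs

# Assumption: warmup steps are rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

The 273 optimizer steps per epoch agree with the training logs, where step 273 coincides with epoch 1.0.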

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: passive
  • log_on_each_node: False
  • logging_nan_inf_filter: False
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0037 1 0.5054 -
0.0073 2 0.5856 -
0.0110 3 0.5052 -
0.0147 4 1.1225 -
0.0183 5 0.9045 -
0.0220 6 0.9862 -
0.0256 7 1.1367 -
0.0293 8 1.032 -
0.0330 9 1.1067 -
0.0366 10 0.5211 -
0.0403 11 1.3348 -
0.0440 12 0.7714 -
0.0476 13 0.5516 -
0.0513 14 0.8319 -
0.0549 15 0.8509 -
0.0586 16 1.1887 -
0.0623 17 1.4326 -
0.0659 18 0.8932 -
0.0696 19 0.7684 -
0.0733 20 0.6201 -
0.0769 21 1.6244 -
0.0806 22 0.786 -
0.0842 23 0.77 -
0.0879 24 0.8046 -
0.0916 25 0.7242 -
0.0952 26 0.8478 -
0.0989 27 1.0491 -
0.1026 28 0.4904 -
0.1062 29 1.277 -
0.1099 30 1.0306 -
0.1136 31 0.8367 -
0.1172 32 0.9737 -
0.1209 33 0.6155 -
0.1245 34 0.2685 -
0.1282 35 0.8825 -
0.1319 36 1.0767 -
0.1355 37 1.1151 -
0.1392 38 0.9151 -
0.1429 39 1.1762 -
0.1465 40 0.6362 -
0.1502 41 0.8624 -
0.1538 42 0.7831 -
0.1575 43 0.5686 -
0.1612 44 0.8203 -
0.1648 45 0.6298 -
0.1685 46 0.7037 -
0.1722 47 0.7446 -
0.1758 48 0.8063 -
0.1795 49 0.918 -
0.1832 50 1.2139 -
0.1868 51 0.7787 -
0.1905 52 0.4508 -
0.1941 53 0.8636 -
0.1978 54 0.8607 -
0.2015 55 1.1511 -
0.2051 56 0.7653 -
0.2088 57 0.441 -
0.2125 58 0.6974 -
0.2161 59 0.7481 -
0.2198 60 0.727 -
0.2234 61 0.8182 -
0.2271 62 0.4998 -
0.2308 63 0.949 -
0.2344 64 0.5796 -
0.2381 65 0.7822 -
0.2418 66 1.1591 -
0.2454 67 0.7478 -
0.2491 68 0.8698 -
0.2527 69 0.5906 -
0.2564 70 0.9387 -
0.2601 71 0.5571 -
0.2637 72 0.3962 -
0.2674 73 0.7237 -
0.2711 74 0.6404 -
0.2747 75 0.8193 -
0.2784 76 0.5669 -
0.2821 77 0.75 -
0.2857 78 0.6934 -
0.2894 79 0.6464 -
0.2930 80 0.6564 -
0.2967 81 0.6188 -
0.3004 82 0.6652 -
0.3040 83 0.4678 -
0.3077 84 0.7576 -
0.3114 85 0.5472 -
0.3150 86 0.7941 -
0.3187 87 0.6895 -
0.3223 88 0.6192 -
0.3260 89 0.6249 -
0.3297 90 0.6652 -
0.3333 91 0.8822 -
0.3370 92 1.0119 -
0.3407 93 0.8161 -
0.3443 94 0.6366 -
0.3480 95 0.7421 -
0.3516 96 0.8887 -
0.3553 97 0.8511 -
0.3590 98 0.5364 -
0.3626 99 0.7103 -
0.3663 100 0.4809 -
0.3700 101 0.4972 -
0.3736 102 0.711 -
0.3773 103 0.4176 -
0.3810 104 0.6275 -
0.3846 105 0.4639 -
0.3883 106 0.6699 -
0.3919 107 0.8027 -
0.3956 108 0.7053 -
0.3993 109 0.097 -
0.4029 110 0.7775 -
0.4066 111 0.4949 -
0.4103 112 0.7027 -
0.4139 113 0.3667 -
0.4176 114 0.6542 -
0.4212 115 0.5256 -
0.4249 116 0.5562 -
0.4286 117 0.365 -
0.4322 118 0.5834 -
0.4359 119 0.6584 -
0.4396 120 0.6638 -
0.4432 121 0.489 -
0.4469 122 0.5541 -
0.4505 123 0.1923 -
0.4542 124 0.4627 -
0.4579 125 0.4169 -
0.4615 126 0.3824 -
0.4652 127 0.5774 -
0.4689 128 0.3938 -
0.4725 129 0.5052 -
0.4762 130 0.6401 -
0.4799 131 0.5691 -
0.4835 132 0.5058 -
0.4872 133 0.5309 -
0.4908 134 0.4821 -
0.4945 135 0.5954 -
0.4982 136 0.3729 -
0.5018 137 0.6607 -
0.5055 138 0.5283 -
0.5092 139 0.6103 -
0.5128 140 0.456 -
0.5165 141 0.7122 -
0.5201 142 0.6458 -
0.5238 143 0.4434 -
0.5275 144 0.6982 -
0.5311 145 0.7074 -
0.5348 146 0.6441 -
0.5385 147 0.1969 -
0.5421 148 0.2974 -
0.5458 149 0.3946 -
0.5495 150 0.4603 -
0.5531 151 0.6021 -
0.5568 152 0.3643 -
0.5604 153 0.2497 -
0.5641 154 0.4532 -
0.5678 155 0.5185 -
0.5714 156 0.457 -
0.5751 157 0.4512 -
0.5788 158 0.48 -
0.5824 159 0.2682 -
0.5861 160 0.594 -
0.5897 161 0.6727 -
0.5934 162 0.7087 -
0.5971 163 0.4186 -
0.6007 164 0.4273 -
0.6044 165 0.5857 -
0.6081 166 0.2617 -
0.6117 167 0.4383 -
0.6154 168 0.4867 -
0.6190 169 0.4619 -
0.6227 170 0.1319 -
0.6264 171 0.2212 -
0.6300 172 0.5229 -
0.6337 173 0.6967 -
0.6374 174 0.338 -
0.6410 175 0.1651 -
0.6447 176 0.2449 -
0.6484 177 0.3473 -
0.6520 178 0.2902 -
0.6557 179 0.4093 -
0.6593 180 0.4406 -
0.6630 181 0.443 -
0.6667 182 0.4409 -
0.6703 183 0.7087 -
0.6740 184 0.4577 -
0.6777 185 0.3511 -
0.6813 186 0.3783 -
0.6850 187 0.5639 -
0.6886 188 0.4599 -
0.6923 189 0.4282 -
0.6960 190 0.242 -
0.6996 191 0.587 -
0.7033 192 0.67 -
0.7070 193 0.2562 -
0.7106 194 0.5278 -
0.7143 195 0.2321 -
0.7179 196 0.745 -
0.7216 197 0.6735 -
0.7253 198 0.4361 -
0.7289 199 0.3047 -
0.7326 200 0.3714 -
0.7363 201 0.8609 -
0.7399 202 0.4459 -
0.7436 203 0.1546 -
0.7473 204 0.4546 -
0.7509 205 0.4743 -
0.7546 206 0.3223 -
0.7582 207 0.4644 -
0.7619 208 0.6073 -
0.7656 209 0.5021 -
0.7692 210 0.5722 -
0.7729 211 0.237 -
0.7766 212 0.3782 -
0.7802 213 0.4302 -
0.7839 214 0.5929 -
0.7875 215 0.0646 -
0.7912 216 0.3934 -
0.7949 217 0.3317 -
0.7985 218 0.5997 -
0.8022 219 0.511 -
0.8059 220 0.384 -
0.8095 221 0.3319 -
0.8132 222 0.4738 -
0.8168 223 0.2536 -
0.8205 224 0.3429 -
0.8242 225 0.5208 -
0.8278 226 0.3044 -
0.8315 227 0.5025 -
0.8352 228 0.2541 -
0.8388 229 0.4347 -
0.8425 230 0.5067 -
0.8462 231 0.3975 -
0.8498 232 0.3168 -
0.8535 233 0.4299 -
0.8571 234 0.3067 -
0.8608 235 0.1385 -
0.8645 236 0.45 -
0.8681 237 0.7386 -
0.8718 238 0.4154 -
0.8755 239 0.287 -
0.8791 240 0.3703 -
0.8828 241 0.5419 -
0.8864 242 0.3498 -
0.8901 243 0.3481 -
0.8938 244 0.7203 -
0.8974 245 0.4363 -
0.9011 246 0.2272 -
0.9048 247 0.6132 -
0.9084 248 0.5764 -
0.9121 249 0.4819 -
0.9158 250 0.3273 -
0.9194 251 0.4039 -
0.9231 252 0.5303 -
0.9267 253 0.6131 -
0.9304 254 0.448 -
0.9341 255 0.0888 -
0.9377 256 0.4092 -
0.9414 257 0.196 -
0.9451 258 0.6282 -
0.9487 259 0.6653 -
0.9524 260 0.4198 -
0.9560 261 0.4985 -
0.9597 262 0.0 -
0.9634 263 0.2706 -
0.9670 264 0.5704 -
0.9707 265 0.4269 -
0.9744 266 0.2325 -
0.9780 267 0.4256 -
0.9817 268 0.4286 -
0.9853 269 0.3987 -
0.9890 270 0.4431 -
0.9927 271 0.578 -
0.9963 272 0.2845 -
1.0 273 0.1293 1.9891

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.1.2
  • Transformers: 4.53.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.9.0
  • Datasets: 4.4.1
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}