
SentenceTransformer based on Qwen/Qwen2-1.5B-instruct

This is a sentence-transformers model finetuned from Qwen/Qwen2-1.5B-instruct. It maps sentences & paragraphs to a 1536-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Qwen/Qwen2-1.5B-instruct
  • Maximum Sequence Length: 32768 tokens
  • Output Dimensionality: 1536 dimensions
  • Similarity Function: Cosine Similarity
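
These properties can also be checked programmatically once the model is loaded (installation is covered under Usage below); a minimal sketch:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("asbabiy/crm-mail-embedder-cosent")
print(model.max_seq_length)                      # 32768
print(model.get_sentence_embedding_dimension())  # 1536
print(model.similarity_fn_name)                  # "cosine"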

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 32768, 'do_lower_case': False}) with Transformer model: Qwen2Model 
  (1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
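
The Pooling module uses last-token pooling (pooling_mode_lasttoken: True), and the final Normalize module rescales each embedding to unit length, so cosine similarity reduces to a dot product. A simplified sketch of those two steps, assuming right-padded inputs and omitting the library's exact implementation details:

import torch

def last_token_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Pick the hidden state of the last non-padding token in each sequence
    # (a simplified stand-in for pooling_mode_lasttoken=True).
    last_idx = attention_mask.sum(dim=1) - 1            # index of the last real token
    batch_idx = torch.arange(token_embeddings.size(0))
    pooled = token_embeddings[batch_idx, last_idx]      # shape: (batch, 1536)
    # Normalize() scales each vector to unit length.
    return torch.nn.functional.normalize(pooled, p=2, dim=1)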

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("asbabiy/crm-mail-embedder-cosent")
# Run inference
sentences = [
    'Mail Queue: ratehawk-b2b\nMail From: aa3b09f5a33cf090e29667bf72936a77@travelclub.ae\nMail To: support@ratehawk.com\n\nMail Subject: Ticket Closed - URGENT : Reconfirmation & HCN for ATS160057 : 139201464/Check-in date - 12 Mar 2024\n\nMail Body:\n"""\nDear Support,  Your ticket - URGENT : Reconfirmation & HCN for ATS160057 : 139201464/Check-in date - 12 Mar 2024 -  has been closed.  We hope that the ticket was resolved to your satisfaction. If you feel that the ticket should not be closed or if the ticket has not been resolved, please reply to this email.  Sincerely, Travelclub Support Team https://blue7tech-help.freshdesk.com/helpdesk/tickets/63824\n"""',
    "Email category: 'TPP -- Auto template'. Email category description: 'This is an automated email from the supplier acknowledging receipt of a previous communication or providing a status update on a pending request without any specific update on the request. It solely includes a phrase indicating that the request has been acknowledged. Such emails may contain messages such as: information that the request has been taken or in process; that the ticket for the request has been created; that it is a holiday and the office hours have changed; that the company's working hours have been adjusted; that a number has been assigned to the request and updates will be provided once available; that the information has been received and transffered to the guest or hotel; or that they will contact us shortly. Also this can be message from any of our supplier stating that our account recently attempted to log in from New Browser. The purpose of this email is to let you know that your message has been received and is being handled.Email lacks personalized details specific to the recipient's situation or references to a unique order or request, which may indicate it is a generic automated response. Auto-emails are often rich with html formatting, tabular data and have a lot of tags or links.'",
    "Email category: 'TPP -- Additional request of arrival time'. Email category description: 'A request from the supplier asking for the client to provide the exact or approximate check-in/arrival time as this is requested by the hotel due to different reasons. For example, the hotel does not have 24 hour reception and for this reason is asking for the arrival time. Information about the check-in helps the hotel better prepare for the guest's arrival and plan the schedule of the hotel staff.'",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1536]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
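
Because the similarity function is cosine over normalized embeddings, a natural follow-up is to rank the candidate category descriptions against the email. A small sketch reusing the sentences and embeddings from the snippet above (the two category texts are only the examples shown here, not a full label set):

# The email is sentences[0]; the category descriptions are sentences[1:].
query_embedding = embeddings[0:1]
candidate_embeddings = embeddings[1:]
scores = model.similarity(query_embedding, candidate_embeddings)  # shape: [1, 2]
best = int(scores.argmax())
print(f"Best-matching category: sentences[{best + 1}] (score {scores[0, best].item():.4f})")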

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • gradient_accumulation_steps: 16
  • learning_rate: 1e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
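
A hedged sketch of how these non-default hyperparameters could map onto a Sentence Transformers v3 training run with the CoSENTLoss cited below. The module setup mirrors the architecture above, but the dataset files and column layout are placeholders, not this model's actual training data:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    models,
)
from sentence_transformers.losses import CoSENTLoss

# Rebuild the module stack shown under "Full Model Architecture".
transformer = models.Transformer("Qwen/Qwen2-1.5B-instruct", max_seq_length=32768)
pooling = models.Pooling(transformer.get_word_embedding_dimension(), pooling_mode="lasttoken")
model = SentenceTransformer(modules=[transformer, pooling, models.Normalize()])

# CoSENTLoss scores sentence pairs against a float similarity label.
loss = CoSENTLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/crm-mail-embedder-cosent",
    eval_strategy="steps",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,
    learning_rate=1e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    bf16=True,
    load_best_model_at_end=True,
)

# Hypothetical dataset files with "sentence1", "sentence2", "score" columns.
dataset = load_dataset("csv", data_files={"train": "train.csv", "eval": "eval.csv"})

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["eval"],
    loss=loss,
)
trainer.train()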

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Click to expand
Epoch   Step   Training Loss   Validation Loss
0.0031 5 1.8139 -
0.0062 10 1.699 -
0.0093 15 1.6467 -
0.0124 20 1.7853 -
0.0155 25 1.7918 -
0.0186 30 1.9042 -
0.0217 35 1.7087 -
0.0248 40 1.7143 -
0.0279 45 1.7357 -
0.0310 50 1.5956 1.6129
0.0341 55 1.7191 -
0.0372 60 1.5434 -
0.0403 65 1.6527 -
0.0434 70 1.6267 -
0.0465 75 1.5512 -
0.0497 80 1.4611 -
0.0528 85 1.49 -
0.0559 90 1.4336 -
0.0590 95 1.3646 -
0.0621 100 1.5523 1.4122
0.0652 105 1.4359 -
0.0683 110 1.4459 -
0.0714 115 1.4872 -
0.0745 120 1.3775 -
0.0776 125 1.3807 -
0.0807 130 1.3692 -
0.0838 135 1.3156 -
0.0869 140 1.328 -
0.0900 145 1.5123 -
0.0931 150 1.4037 1.3554
0.0962 155 1.4797 -
0.0993 160 1.4434 -
0.1024 165 1.3876 -
0.1055 170 1.3611 -
0.1086 175 1.3986 -
0.1117 180 1.3135 -
0.1148 185 1.3268 -
0.1179 190 1.2853 -
0.1210 195 1.3606 -
0.1241 200 1.4254 1.3225
0.1272 205 1.3152 -
0.1303 210 1.3482 -
0.1334 215 1.347 -
0.1365 220 1.3722 -
0.1396 225 1.3877 -
0.1428 230 1.3635 -
0.1459 235 1.4738 -
0.1490 240 1.4063 -
0.1521 245 1.3481 -
0.1552 250 1.3221 1.2848
0.1583 255 1.1117 -
0.1614 260 1.33 -
0.1645 265 1.3461 -
0.1676 270 1.2067 -
0.1707 275 1.3238 -
0.1738 280 1.4214 -
0.1769 285 1.3172 -
0.1800 290 1.2829 -
0.1831 295 1.3561 -
0.1862 300 1.2153 1.2869
0.1893 305 1.3482 -
0.1924 310 1.4491 -
0.1955 315 1.296 -
0.1986 320 1.5481 -
0.2017 325 1.3483 -
0.2048 330 1.2984 -
0.2079 335 1.2619 -
0.2110 340 1.2424 -
0.2141 345 1.3138 -
0.2172 350 1.4771 1.2831
0.2203 355 1.4589 -
0.2234 360 1.2647 -
0.2265 365 1.3268 -
0.2296 370 1.2185 -
0.2327 375 1.2264 -
0.2359 380 1.4256 -
0.2390 385 1.5409 -
0.2421 390 1.3106 -
0.2452 395 1.3129 -
0.2483 400 1.4063 1.2688
0.2514 405 1.1013 -
0.2545 410 1.3415 -
0.2576 415 1.4586 -
0.2607 420 1.2412 -
0.2638 425 1.3019 -
0.2669 430 1.2388 -
0.2700 435 1.3902 -
0.2731 440 1.3822 -
0.2762 445 1.2138 -
0.2793 450 1.4039 1.2490
0.2824 455 1.1758 -
0.2855 460 1.306 -
0.2886 465 1.4698 -
0.2917 470 1.2116 -
0.2948 475 1.2531 -
0.2979 480 1.3357 -
0.3010 485 1.1919 -
0.3041 490 1.3818 -
0.3072 495 1.2979 -
0.3103 500 1.2832 1.2466
0.3134 505 1.1689 -
0.3165 510 1.2198 -
0.3196 515 1.2775 -
0.3227 520 1.1344 -
0.3258 525 1.4492 -
0.3289 530 1.2328 -
0.3321 535 1.3306 -
0.3352 540 1.1076 -
0.3383 545 1.285 -
0.3414 550 1.2523 1.2435
0.3445 555 1.1712 -
0.3476 560 1.4021 -
0.3507 565 1.3476 -
0.3538 570 1.1485 -
0.3569 575 1.2621 -
0.3600 580 1.2829 -
0.3631 585 1.274 -
0.3662 590 1.2649 -
0.3693 595 1.2262 -
0.3724 600 1.1743 1.2378
0.3755 605 1.1773 -
0.3786 610 1.1977 -
0.3817 615 1.3976 -
0.3848 620 1.1817 -
0.3879 625 1.1928 -
0.3910 630 1.2338 -
0.3941 635 1.1803 -
0.3972 640 1.3811 -
0.4003 645 1.3125 -
0.4034 650 1.1878 1.2311
0.4065 655 1.4805 -
0.4096 660 1.1262 -
0.4127 665 1.1919 -
0.4158 670 1.2076 -
0.4189 675 1.2401 -
0.4220 680 1.3019 -
0.4252 685 1.3285 -
0.4283 690 1.1257 -
0.4314 695 1.2628 -
0.4345 700 1.1846 1.2354
0.4376 705 1.0939 -
0.4407 710 1.2502 -
0.4438 715 1.3645 -
0.4469 720 1.2408 -
0.4500 725 1.3127 -
0.4531 730 1.2795 -
0.4562 735 1.3127 -
0.4593 740 1.2164 -
0.4624 745 1.2942 -
0.4655 750 1.1968 1.2342
0.4686 755 1.2426 -
0.4717 760 1.2269 -
0.4748 765 1.3602 -
0.4779 770 1.2335 -
0.4810 775 1.3015 -
0.4841 780 1.1144 -
0.4872 785 1.3083 -
0.4903 790 1.273 -
0.4934 795 1.1784 -
0.4965 800 1.204 1.2348

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.0
  • PyTorch: 2.2.0+cu121
  • Accelerate: 0.33.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}