---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:498970
- loss:BPRLoss
base_model: answerdotai/ModernBERT-large
widget:
- source_sentence: lang last name
  sentences:
  - Lang is a moderately common surname in the United States. When the United States Census was taken in 2010, there were about 61,529 individuals with the last name Lang, ranking it number 545 for all surnames. Historically, the name has been most prevalent in the Midwest, especially in North Dakota. Lang is least common in the southeastern states.
  - Flood Warning ...The National Weather Service in Houston/Galveston has issued a flood warning for the following rivers... Long King Creek At Livingston affecting the following counties in Texas... Polk...San Jacinto For the Long King Creek, at Livingston, Minor flooding is occuring and is expected to continue.
  - "Langston Name Meaning. English (mainly West Midlands): habitational name from any of various places, for example Langstone in Devon and Hampshire, named with Old English lang ‘long’, ‘tall’ + stan ‘stone’, i.e. a menhir."
- source_sentence: average salary of a program manager in healthcare
  sentences:
  - 'What is the average annual salary for Compliance Manager-Healthcare? The annual salary for someone with the job title Compliance Manager-Healthcare may vary depending on a number of factors including industry, company size, location, years of experience and level of education.or example the median expected annual pay for a typical Compliance Manager-Healthcare in the United States is $92,278 so 50% of the people who perform the job of Compliance Manager-Healthcare in the United States are expected to make less than $92,278. Source: HR Reported data as of October 2015.'
  - Average Program Manager Healthcare Salaries. The average salary for program manager healthcare jobs is $62,000. Average program manager healthcare salaries can vary greatly due to company, location, industry, experience and benefits. This salary was calculated using the average salary for all jobs with the term program manager healthcare anywhere in the job listing.
  - 'To apply for your IDNYC card, please follow these simple steps: Confirm you have the correct documents to apply. The IDNYC program uses a point system to determine if applicants are able to prove identity and residency in New York City. You will need three points worth of documents to prove your identity and a one point document to prove your residency.'
- source_sentence: when did brad paisley she's everything to me come out
  sentences:
  - 'Jump to: Overview (3) | Mini Bio (1) | Spouse (1) | Trivia (16) | Personal Quotes (59) Brad Paisley was born on October 28, 1972 in Glen Dale, West Virginia, USA as Brad Douglas Paisley. He has been married to Kimberly Williams-Paisley since March 15, 2003. They have two children.'
  - A parasitic disease is an infectious disease caused or transmitted by a parasite. Many parasites do not cause diseases. Parasitic diseases can affect practically all living organisms, including plants and mammals. The study of parasitic diseases is called parasitology.erminology [edit]. Although organisms such as bacteria function as parasites, the usage of the term parasitic disease is usually more restricted. The three main types of organisms causing these conditions are protozoa (causing protozoan infection), helminths (helminthiasis), and ectoparasites.
  - She's Everything. She's Everything is a song co-written and recorded by American country music artist Brad Paisley. It reached the top of the Billboard Hot Country Songs Chart. It was released in August 2006 as the fourth and final single from Paisley's album Time Well Wasted. It was Paisley's seventh number one single.
- source_sentence: who did lynda carter voice in elder scrolls
  sentences:
  - 'By Wade Steel. Bethesda Softworks announced today that actress Lynda Carter will join the voice cast for to its upcoming epic RPG The Elder Scrolls IV: Oblivion. The actress, best known for her television role as Wonder Woman, had previously provided her vocal talents for Elder Scrolls III: Morrowind and its Bloodmoon expansion.'
  - "revise verb (STUDY). B1 [I or T] UK (US review) to study again something you have already learned, in preparation for an exam: We're revising (algebra) for the test tomorrow. (Definition of revise from the Cambridge Advanced Learner’s Dictionary & Thesaurus © Cambridge University Press)."
  - Lynda Carter (born Linda Jean Córdova Carter; July 24, 1951) is an American actress, singer, songwriter and beauty pageant titleholder who was crowned Miss World America 1972 and also the star of the TV series Wonder Woman from 1975 to 1979.
- source_sentence: what county is phillips wi
  sentences:
  - 'Motto: It''s not what you show, it''s what you grow.. Location within Phillips County and Colorado. Holyoke is the Home Rule Municipality that is the county seat and the most populous municipality of Phillips County, Colorado, United States. The city population was 2,313 at the 2010 census.'
  - "Phillips is a city in Price County, Wisconsin, United States. The population was 1,675 at the 2000 census. It is the county seat of Price County. Phillips is located at 45°41′30″N 90°24′7″W / 45.69167°N 90.40194°W / 45.69167; -90.40194 (45.691560, -90.401915). It is on highway SR 13, 77 miles north of Marshfield, and 74 miles south of Ashland."
  - Various spellings from the numerous languages for Miller include Mueller, Mahler, Millar, Molenaar, Mills, Moeller, and Mullar. In Italian the surname is spelled Molinaro and in Spanish it is Molinero. The surname of Miller is most common in England, Scotland, United States, Germany, Spain and Italy. In the United States the name is seventh most common surname in the country.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on answerdotai/ModernBERT-large

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("BlackBeenie/ModernBERT-large-msmarco-v3-bpr")
# Run inference
sentences = [
    'what county is phillips wi',
    'Phillips is a city in Price County, Wisconsin, United States. The population was 1,675 at the 2000 census. It is the county seat of Price County. Phillips is located at 45°41′30″N 90°24′7″W / 45.69167°N 90.40194°W / 45.69167; -90.40194 (45.691560, -90.401915). It is on highway SR 13, 77 miles north of Marshfield, and 74 miles south of Ashland.',
    "Motto: It's not what you show, it's what you grow.. Location within Phillips County and Colorado. Holyoke is the Home Rule Municipality that is the county seat and the most populous municipality of Phillips County, Colorado, United States. The city population was 2,313 at the 2010 census.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 498,970 training samples
* Columns: `sentence_0`, `sentence_1`, and `sentence_2`
* Approximate statistics based on the first 1000 samples:

  |         | sentence_0 | sentence_1 | sentence_2 |
  |:--------|:-----------|:-----------|:-----------|
  | type    | string     | string     | string     |
  | details |            |            |            |

* Samples:

  | sentence_0 | sentence_1 | sentence_2 |
  |:-----------|:-----------|:-----------|
  | what is tongkat ali | Tongkat Ali is a very powerful herb that acts as a sex enhancer by naturally increasing the testosterone levels, and revitalizing sexual impotence, performance and pleasure. Tongkat Ali is also effective in building muscular volume & strength resulting to a healthy physique. | However, unlike tongkat ali extract, tongkat ali chipped root and root powder are not sterile. Thus, the raw consumption of root powder is not recommended. The traditional preparation in Indonesia and Malaysia is to boil chipped roots as a tea. |
  | cost to install engineered hardwood flooring | Burton says his customers typically spend about $8 per square foot for engineered hardwood flooring; add an additional $2 per square foot for installation. Minion says consumers should expect to pay $7 to $12 per square foot for quality hardwood flooring. “If the homeowner buys the wood and you need somebody to install it, usually an installation goes for about $2 a square foot,” Bill LeBeau, owner of LeBeau’s Hardwood Floors of Huntersville, North Carolina, says. | Engineered Wood Flooring Installation - Average Cost Per Square Foot. Expect to pay in the higher end of the price range for a licensed, insured and reputable pro - and for complex or rush projects. To lower Engineered Wood Flooring Installation costs: combine related projects, minimize options/extras and be flexible about project scheduling. |
  | define pollute | pollutes; polluted; polluting. Learner's definition of POLLUTE. [+ object] : to make (land, water, air, etc.) dirty and not safe or suitable to use. Waste from the factory had polluted [=contaminated] the river. Miles of beaches were polluted by the oil spill. Car exhaust pollutes the air. | Chemical water pollution. Industrial and agricultural work involves the use of many different chemicals that can run-off into water and pollute it.1 Metals and solvents from industrial work can pollute rivers and lakes.2 These are poisonous to many forms of aquatic life and may slow their development, make them infertile or even result in death.ndustrial and agricultural work involves the use of many different chemicals that can run-off into water and pollute it. 1 Metals and solvents from industrial work can pollute rivers and lakes. |

* Loss: `beir.losses.bpr_loss.BPRLoss`

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `num_train_epochs`: 6
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 6
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
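
As a rough cross-check of the settings listed in this section, the step counts reported in the training logs can be derived from the dataset size, the batch size, and the epoch count. The sketch below is illustrative only: it assumes training ran on a single device (the card does not say), and the variable and function names are hypothetical rather than part of any library.

```python
import math

# Values taken from this model card.
num_samples = 498_970
per_device_train_batch_size = 64
gradient_accumulation_steps = 1
num_train_epochs = 6
learning_rate = 5e-05

# Assumption (not stated in the card): training ran on one device.
num_devices = 1

effective_batch = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# dataloader_drop_last is False, so the final partial batch is still a step.
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch


def linear_lr(step: int) -> float:
    """Linear decay from learning_rate to 0, with no warmup (warmup_steps = 0)."""
    return learning_rate * max(0.0, (total_steps - step) / total_steps)


print(steps_per_epoch)          # 7797
print(total_steps)              # 46782
print(round(epoch_at(500), 4))  # 0.0641
print(linear_lr(total_steps // 2))
```

Under the single-device assumption, these figures line up with the epoch boundaries in the training logs (step 7797 at epoch 1.0, step 46782 at epoch 6.0) and with the epoch value 0.0641 logged at step 500.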

### Training Logs

| Epoch  | Step  | Training Loss |
|:------:|:-----:|:-------------:|
| 0.0641 | 500   | 1.4036        |
| 0.1283 | 1000  | 0.36          |
| 0.1924 | 1500  | 0.3305        |
| 0.2565 | 2000  | 0.2874        |
| 0.3206 | 2500  | 0.2732        |
| 0.3848 | 3000  | 0.2446        |
| 0.4489 | 3500  | 0.2399        |
| 0.5130 | 4000  | 0.2302        |
| 0.5771 | 4500  | 0.231         |
| 0.6413 | 5000  | 0.2217        |
| 0.7054 | 5500  | 0.2192        |
| 0.7695 | 6000  | 0.2087        |
| 0.8337 | 6500  | 0.2104        |
| 0.8978 | 7000  | 0.2069        |
| 0.9619 | 7500  | 0.2071        |
| 1.0    | 7797  | -             |
| 1.0260 | 8000  | 0.1663        |
| 1.0902 | 8500  | 0.1213        |
| 1.1543 | 9000  | 0.1266        |
| 1.2184 | 9500  | 0.1217        |
| 1.2825 | 10000 | 0.1193        |
| 1.3467 | 10500 | 0.1198        |
| 1.4108 | 11000 | 0.1258        |
| 1.4749 | 11500 | 0.1266        |
| 1.5391 | 12000 | 0.1334        |
| 1.6032 | 12500 | 0.1337        |
| 1.6673 | 13000 | 0.1258        |
| 1.7314 | 13500 | 0.1268        |
| 1.7956 | 14000 | 0.1249        |
| 1.8597 | 14500 | 0.1256        |
| 1.9238 | 15000 | 0.1238        |
| 1.9879 | 15500 | 0.1274        |
| 2.0    | 15594 | -             |
| 2.0521 | 16000 | 0.0776        |
| 2.1162 | 16500 | 0.0615        |
| 2.1803 | 17000 | 0.0647        |
| 2.2445 | 17500 | 0.0651        |
| 2.3086 | 18000 | 0.0695        |
| 2.3727 | 18500 | 0.0685        |
| 2.4368 | 19000 | 0.0685        |
| 2.5010 | 19500 | 0.0707        |
| 2.5651 | 20000 | 0.073         |
| 2.6292 | 20500 | 0.0696        |
| 2.6933 | 21000 | 0.0694        |
| 2.7575 | 21500 | 0.0701        |
| 2.8216 | 22000 | 0.0668        |
| 2.8857 | 22500 | 0.07          |
| 2.9499 | 23000 | 0.0649        |
| 3.0    | 23391 | -             |
| 3.0140 | 23500 | 0.0589        |
| 3.0781 | 24000 | 0.0316        |
| 3.1422 | 24500 | 0.0377        |
| 3.2064 | 25000 | 0.039         |
| 3.2705 | 25500 | 0.0335        |
| 3.3346 | 26000 | 0.0387        |
| 3.3987 | 26500 | 0.0367        |
| 3.4629 | 27000 | 0.0383        |
| 3.5270 | 27500 | 0.0407        |
| 3.5911 | 28000 | 0.0372        |
| 3.6553 | 28500 | 0.0378        |
| 3.7194 | 29000 | 0.0359        |
| 3.7835 | 29500 | 0.0394        |
| 3.8476 | 30000 | 0.0388        |
| 3.9118 | 30500 | 0.0422        |
| 3.9759 | 31000 | 0.0391        |
| 4.0    | 31188 | -             |
| 4.0400 | 31500 | 0.0251        |
| 4.1041 | 32000 | 0.0199        |
| 4.1683 | 32500 | 0.0261        |
| 4.2324 | 33000 | 0.021         |
| 4.2965 | 33500 | 0.0196        |
| 4.3607 | 34000 | 0.0181        |
| 4.4248 | 34500 | 0.0228        |
| 4.4889 | 35000 | 0.0195        |
| 4.5530 | 35500 | 0.02          |
| 4.6172 | 36000 | 0.0251        |
| 4.6813 | 36500 | 0.0213        |
| 4.7454 | 37000 | 0.0208        |
| 4.8095 | 37500 | 0.0192        |
| 4.8737 | 38000 | 0.0204        |
| 4.9378 | 38500 | 0.0176        |
| 5.0    | 38985 | -             |
| 5.0019 | 39000 | 0.0184        |
| 5.0661 | 39500 | 0.0136        |
| 5.1302 | 40000 | 0.0102        |
| 5.1943 | 40500 | 0.0122        |
| 5.2584 | 41000 | 0.0124        |
| 5.3226 | 41500 | 0.013         |
| 5.3867 | 42000 | 0.0105        |
| 5.4508 | 42500 | 0.0135        |
| 5.5149 | 43000 | 0.0158        |
| 5.5791 | 43500 | 0.015         |
| 5.6432 | 44000 | 0.0128        |
| 5.7073 | 44500 | 0.0105        |
| 5.7715 | 45000 | 0.014         |
| 5.8356 | 45500 | 0.0125        |
| 5.8997 | 46000 | 0.0139        |
| 5.9638 | 46500 | 0.0137        |
| 6.0    | 46782 | -             |

### Framework Versions

- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.48.0.dev0
- PyTorch: 2.5.1+cu121
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```