metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:498970
- loss:BPRLoss
base_model: answerdotai/ModernBERT-large
widget:
- source_sentence: lang last name
sentences:
- >-
Lang is a moderately common surname in the United States. When the
United States Census was taken in 2010, there were about 61,529
individuals with the last name Lang, ranking it number 545 for all
surnames. Historically, the name has been most prevalent in the Midwest,
especially in North Dakota. Lang is least common in the southeastern
states.
- >-
Flood Warning ...The National Weather Service in Houston/Galveston has
issued a flood warning for the following rivers... Long King Creek At
Livingston affecting the following counties in Texas... Polk...San
Jacinto For the Long King Creek, at Livingston, Minor flooding is
occurring and is expected to continue.
- "Langston Name Meaning. English (mainly West Midlands): habitational name from any of various places, for example Langstone in Devon and Hampshire, named with Old English lang ‘long’, ‘tall’ + stan ‘stone’, i.e. a menhir."
- source_sentence: average salary of a program manager in healthcare
sentences:
- >-
What is the average annual salary for Compliance Manager-Healthcare? The
annual salary for someone with the job title Compliance
Manager-Healthcare may vary depending on a number of factors including
industry, company size, location, years of experience and level of
education. For example the median expected annual pay for a typical
Compliance Manager-Healthcare in the United States is $92,278 so 50% of
the people who perform the job of Compliance Manager-Healthcare in the
United States are expected to make less than $92,278. Source: HR
Reported data as of October 2015.
- >-
Average Program Manager Healthcare Salaries. The average salary for
program manager healthcare jobs is $62,000. Average program manager
healthcare salaries can vary greatly due to company, location, industry,
experience and benefits. This salary was calculated using the average
salary for all jobs with the term program manager healthcare anywhere in
the job listing.
- >-
To apply for your IDNYC card, please follow these simple steps: Confirm
you have the correct documents to apply. The IDNYC program uses a point
system to determine if applicants are able to prove identity and
residency in New York City. You will need three points worth of
documents to prove your identity and a one point document to prove your
residency.
- source_sentence: when did brad paisley she's everything to me come out
sentences:
- >-
Jump to: Overview (3) | Mini Bio (1) | Spouse (1) | Trivia (16) |
Personal Quotes (59) Brad Paisley was born on October 28, 1972 in Glen
Dale, West Virginia, USA as Brad Douglas Paisley. He has been married to
Kimberly Williams-Paisley since March 15, 2003. They have two children.
- >-
A parasitic disease is an infectious disease caused or transmitted by a
parasite. Many parasites do not cause diseases. Parasitic diseases can
affect practically all living organisms, including plants and mammals.
The study of parasitic diseases is called parasitology. Although
organisms such as bacteria function as parasites, the
usage of the term parasitic disease is usually more restricted. The
three main types of organisms causing these conditions are protozoa
(causing protozoan infection), helminths (helminthiasis), and
ectoparasites.
- >-
She's Everything. She's Everything is a song co-written and recorded by
American country music artist Brad Paisley. It reached the top of the
Billboard Hot Country Songs Chart. It was released in August 2006 as the
fourth and final single from Paisley's album Time Well Wasted. It was
Paisley's seventh number one single.
- source_sentence: who did lynda carter voice in elder scrolls
sentences:
- >-
By Wade Steel. Bethesda Softworks announced today that actress Lynda
Carter will join the voice cast for its upcoming epic RPG The Elder
Scrolls IV: Oblivion. The actress, best known for her television role as
Wonder Woman, had previously provided her vocal talents for Elder
Scrolls III: Morrowind and its Bloodmoon expansion.
- "revise verb (STUDY). B1 [I or T] UK (US review) to study again something you have already learned, in preparation for an exam: We're revising (algebra) for the test tomorrow. (Definition of revise from the Cambridge Advanced Learner’s Dictionary & Thesaurus © Cambridge University Press)."
- >-
Lynda Carter (born Linda Jean Córdova Carter; July 24, 1951) is an
American actress, singer, songwriter and beauty pageant titleholder who
was crowned Miss World America 1972 and also the star of the TV series
Wonder Woman from 1975 to 1979.
- source_sentence: what county is phillips wi
sentences:
- >-
Motto: It's not what you show, it's what you grow.. Location within
Phillips County and Colorado. Holyoke is the Home Rule Municipality that
is the county seat and the most populous municipality of Phillips
County, Colorado, United States. The city population was 2,313 at the
2010 census.
- "Phillips is a city in Price County, Wisconsin, United States. The population was 1,675 at the 2000 census. It is the county seat of Price County. Phillips is located at 45°41′30″N 90°24′7″W / 45.69167°N 90.40194°W / 45.69167; -90.40194 (45.691560, -90.401915). It is on highway SR 13, 77 miles north of Marshfield, and 74 miles south of Ashland."
- >-
Various spellings from the numerous languages for Miller include
Mueller, Mahler, Millar, Molenaar, Mills, Moeller, and Mullar. In
Italian the surname is spelled Molinaro and in Spanish it is Molinero.
The surname of Miller is most common in England, Scotland, United
States, Germany, Spain and Italy. In the United States the name is
seventh most common surname in the country.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on answerdotai/ModernBERT-large
This is a sentence-transformers model finetuned from answerdotai/ModernBERT-large. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: answerdotai/ModernBERT-large
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
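The Pooling module above has only `pooling_mode_mean_tokens` enabled, so a sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding tokens are excluded through an attention mask; the function name `mean_pool` and the toy vectors are illustrative and not part of the library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens (mask == 1)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: three tokens with 2-dimensional embeddings; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 1024 dimensions, and the averaging runs on batched tensors rather than Python lists.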
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("BlackBeenie/ModernBERT-large-msmarco-v3-bpr")
# Run inference
sentences = [
'what county is phillips wi',
'Phillips is a city in Price County, Wisconsin, United States. The population was 1,675 at the 2000 census. It is the county seat of Price County. Phillips is located at 45°41′30″N 90°24′7″W / 45.69167°N 90.40194°W / 45.69167; -90.40194 (45.691560, -90.401915). It is on highway SR 13, 77 miles north of Marshfield, and 74 miles south of Ashland.',
"Motto: It's not what you show, it's what you grow.. Location within Phillips County and Colorado. Holyoke is the Home Rule Municipality that is the county seat and the most populous municipality of Phillips County, Colorado, United States. The city population was 2,313 at the 2010 census.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
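For retrieval, the similarity scores are typically used to rank passages against a query. The sketch below shows that step in pure Python with cosine similarity, the similarity function listed for this model. The helper names `cosine` and `rank_passages` and the 2-dimensional toy vectors are assumptions made for illustration; in practice the vectors would be rows of the `embeddings` array returned by `model.encode`.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_passages(query_vec: Sequence[float],
                  passage_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for the query and three passage embeddings.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_passages(query, passages)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Because cosine similarity ignores vector length, passage 1 scores highest even though it is twice as long as the query vector.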
Training Details
Training Dataset
Unnamed Dataset
- Size: 498,970 training samples
- Columns: sentence_0, sentence_1, and sentence_2
- Approximate statistics based on the first 1000 samples:
Statistic | sentence_0 | sentence_1 | sentence_2 |
---|---|---|---|
type | string | string | string |
min | 4 tokens | 23 tokens | 16 tokens |
mean | 9.24 tokens | 83.71 tokens | 80.18 tokens |
max | 27 tokens | 279 tokens | 262 tokens |
- Samples:
sentence_0 | sentence_1 | sentence_2 |
---|---|---|
what is tongkat ali | Tongkat Ali is a very powerful herb that acts as a sex enhancer by naturally increasing the testosterone levels, and revitalizing sexual impotence, performance and pleasure. Tongkat Ali is also effective in building muscular volume & strength resulting to a healthy physique. | However, unlike tongkat ali extract, tongkat ali chipped root and root powder are not sterile. Thus, the raw consumption of root powder is not recommended. The traditional preparation in Indonesia and Malaysia is to boil chipped roots as a tea. |
cost to install engineered hardwood flooring | Burton says his customers typically spend about $8 per square foot for engineered hardwood flooring; add an additional $2 per square foot for installation. Minion says consumers should expect to pay $7 to $12 per square foot for quality hardwood flooring. “If the homeowner buys the wood and you need somebody to install it, usually an installation goes for about $2 a square foot,” Bill LeBeau, owner of LeBeau’s Hardwood Floors of Huntersville, North Carolina, says. | Engineered Wood Flooring Installation - Average Cost Per Square Foot. Expect to pay in the higher end of the price range for a licensed, insured and reputable pro - and for complex or rush projects. To lower Engineered Wood Flooring Installation costs: combine related projects, minimize options/extras and be flexible about project scheduling. |
define pollute | pollutes; polluted; polluting. Learner's definition of POLLUTE. [+ object] : to make (land, water, air, etc.) dirty and not safe or suitable to use. Waste from the factory had polluted [=contaminated] the river. Miles of beaches were polluted by the oil spill. Car exhaust pollutes the air. | Chemical water pollution. Industrial and agricultural work involves the use of many different chemicals that can run-off into water and pollute it.1 Metals and solvents from industrial work can pollute rivers and lakes.2 These are poisonous to many forms of aquatic life and may slow their development, make them infertile or even result in death. Industrial and agricultural work involves the use of many different chemicals that can run-off into water and pollute it. 1 Metals and solvents from industrial work can pollute rivers and lakes. |
- Loss: beir.losses.bpr_loss.BPRLoss
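The card does not describe the loss beyond its class name. Assuming BPR here stands for Binary Passage Retriever, the sketch below illustrates a two-term objective of that kind for a single (sentence_0, sentence_1, sentence_2) triplet: a margin ranking term on tanh-relaxed binary codes plus a cross-entropy term that should prefer the positive passage. It is not the BEIR implementation; the function names, the margin of 2.0, the scale, and the choice of which vectors are binarized are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def soft_binary(vec: Sequence[float], scale: float = 1.0) -> List[float]:
    """Differentiable stand-in for sign(): tanh squashes each component into (-1, 1)."""
    return [math.tanh(scale * x) for x in vec]


def bpr_style_loss(query: Sequence[float], positive: Sequence[float],
                   negative: Sequence[float], margin: float = 2.0,
                   scale: float = 1.0) -> float:
    """Hypothetical per-triplet objective: ranking term plus reranking term."""
    h_q = soft_binary(query, scale)
    h_pos = soft_binary(positive, scale)
    h_neg = soft_binary(negative, scale)

    # Term 1 (assumed): hinge loss asking the relaxed binary score of the
    # positive to exceed that of the negative by at least `margin`.
    ranking = max(0.0, margin - (dot(h_q, h_pos) - dot(h_q, h_neg)))

    # Term 2 (assumed): softmax cross-entropy over dense-query scores,
    # with the positive passage as the target class.
    s_pos, s_neg = dot(query, h_pos), dot(query, h_neg)
    top = max(s_pos, s_neg)  # subtract the max for numerical stability
    log_z = top + math.log(math.exp(s_pos - top) + math.exp(s_neg - top))
    rerank = log_z - s_pos

    return ranking + rerank


good = bpr_style_loss([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0])
bad = bpr_style_loss([1.0, 0.0], [-1.0, 0.0], [1.0, 0.0])
print(good < bad)  # True
```

A triplet whose positive passage aligns with the query yields a smaller loss than the same triplet with positive and negative swapped, which is the behavior a ranking loss needs.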
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- num_train_epochs: 6
- multi_dataset_batch_sampler: round_robin
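These settings determine the length of the training run. The short calculation below, a sketch that assumes a single training device (the card does not state the device count), reproduces the step counts that appear in the training logs from the dataset size, batch size, and epoch count. `gradient_accumulation_steps` is 1 and `dataloader_drop_last` is False in the full hyperparameter list, so the last partial batch counts as a step.

```python
import math

num_samples = 498_970          # training set size from the card
per_device_batch_size = 64     # per_device_train_batch_size
num_epochs = 6                 # num_train_epochs
num_devices = 1                # assumption: not stated in the card
grad_accum_steps = 1           # gradient_accumulation_steps

effective_batch = per_device_batch_size * num_devices * grad_accum_steps
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 7797 46782


def epoch_at(step: int) -> float:
    """Fractional epoch for a given optimizer step, rounded as in the logs."""
    return round(step / steps_per_epoch, 4)


print(epoch_at(500))  # 0.0641
```

The results agree with the logged values: epoch 1.0 at step 7797, epoch 6.0 at step 46782, and epoch 0.0641 at step 500.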
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 6
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
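With `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and no warmup, the learning rate falls in a straight line from its initial value to zero over the run. The sketch below illustrates that schedule, assuming the usual linear-decay-to-zero rule and the total of 46782 steps reported in the training logs; the function name `linear_lr` is illustrative and not a library call.

```python
def linear_lr(step: int, base_lr: float = 5e-5,
              total_steps: int = 46_782, warmup_steps: int = 0) -> float:
    """Learning rate at a given step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused here because warmup_steps is 0).
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Start of training, end of epoch 3, and end of epoch 6.
for step in (0, 23_391, 46_782):
    print(step, linear_lr(step))
```

Under these assumptions the rate is 5e-05 at step 0, half of that at the midpoint (step 23391, the end of epoch 3), and zero at the final step.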
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0641 | 500 | 1.4036 |
0.1283 | 1000 | 0.36 |
0.1924 | 1500 | 0.3305 |
0.2565 | 2000 | 0.2874 |
0.3206 | 2500 | 0.2732 |
0.3848 | 3000 | 0.2446 |
0.4489 | 3500 | 0.2399 |
0.5130 | 4000 | 0.2302 |
0.5771 | 4500 | 0.231 |
0.6413 | 5000 | 0.2217 |
0.7054 | 5500 | 0.2192 |
0.7695 | 6000 | 0.2087 |
0.8337 | 6500 | 0.2104 |
0.8978 | 7000 | 0.2069 |
0.9619 | 7500 | 0.2071 |
1.0 | 7797 | - |
1.0260 | 8000 | 0.1663 |
1.0902 | 8500 | 0.1213 |
1.1543 | 9000 | 0.1266 |
1.2184 | 9500 | 0.1217 |
1.2825 | 10000 | 0.1193 |
1.3467 | 10500 | 0.1198 |
1.4108 | 11000 | 0.1258 |
1.4749 | 11500 | 0.1266 |
1.5391 | 12000 | 0.1334 |
1.6032 | 12500 | 0.1337 |
1.6673 | 13000 | 0.1258 |
1.7314 | 13500 | 0.1268 |
1.7956 | 14000 | 0.1249 |
1.8597 | 14500 | 0.1256 |
1.9238 | 15000 | 0.1238 |
1.9879 | 15500 | 0.1274 |
2.0 | 15594 | - |
2.0521 | 16000 | 0.0776 |
2.1162 | 16500 | 0.0615 |
2.1803 | 17000 | 0.0647 |
2.2445 | 17500 | 0.0651 |
2.3086 | 18000 | 0.0695 |
2.3727 | 18500 | 0.0685 |
2.4368 | 19000 | 0.0685 |
2.5010 | 19500 | 0.0707 |
2.5651 | 20000 | 0.073 |
2.6292 | 20500 | 0.0696 |
2.6933 | 21000 | 0.0694 |
2.7575 | 21500 | 0.0701 |
2.8216 | 22000 | 0.0668 |
2.8857 | 22500 | 0.07 |
2.9499 | 23000 | 0.0649 |
3.0 | 23391 | - |
3.0140 | 23500 | 0.0589 |
3.0781 | 24000 | 0.0316 |
3.1422 | 24500 | 0.0377 |
3.2064 | 25000 | 0.039 |
3.2705 | 25500 | 0.0335 |
3.3346 | 26000 | 0.0387 |
3.3987 | 26500 | 0.0367 |
3.4629 | 27000 | 0.0383 |
3.5270 | 27500 | 0.0407 |
3.5911 | 28000 | 0.0372 |
3.6553 | 28500 | 0.0378 |
3.7194 | 29000 | 0.0359 |
3.7835 | 29500 | 0.0394 |
3.8476 | 30000 | 0.0388 |
3.9118 | 30500 | 0.0422 |
3.9759 | 31000 | 0.0391 |
4.0 | 31188 | - |
4.0400 | 31500 | 0.0251 |
4.1041 | 32000 | 0.0199 |
4.1683 | 32500 | 0.0261 |
4.2324 | 33000 | 0.021 |
4.2965 | 33500 | 0.0196 |
4.3607 | 34000 | 0.0181 |
4.4248 | 34500 | 0.0228 |
4.4889 | 35000 | 0.0195 |
4.5530 | 35500 | 0.02 |
4.6172 | 36000 | 0.0251 |
4.6813 | 36500 | 0.0213 |
4.7454 | 37000 | 0.0208 |
4.8095 | 37500 | 0.0192 |
4.8737 | 38000 | 0.0204 |
4.9378 | 38500 | 0.0176 |
5.0 | 38985 | - |
5.0019 | 39000 | 0.0184 |
5.0661 | 39500 | 0.0136 |
5.1302 | 40000 | 0.0102 |
5.1943 | 40500 | 0.0122 |
5.2584 | 41000 | 0.0124 |
5.3226 | 41500 | 0.013 |
5.3867 | 42000 | 0.0105 |
5.4508 | 42500 | 0.0135 |
5.5149 | 43000 | 0.0158 |
5.5791 | 43500 | 0.015 |
5.6432 | 44000 | 0.0128 |
5.7073 | 44500 | 0.0105 |
5.7715 | 45000 | 0.014 |
5.8356 | 45500 | 0.0125 |
5.8997 | 46000 | 0.0139 |
5.9638 | 46500 | 0.0137 |
6.0 | 46782 | - |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.48.0.dev0
- PyTorch: 2.5.1+cu121
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}