SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset:
- json
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
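The Pooling and Normalize stages listed above can be illustrated with a small NumPy sketch. This is not the library's implementation; `mean_pool_and_normalize` is a hypothetical helper, and the random array stands in for the token embeddings that the MPNet transformer would produce.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Masked mean over tokens, then L2-normalize each sentence vector."""
    # Expand the mask to (batch, seq, 1) so padding tokens contribute nothing.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)      # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)      # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input (assumption): 2 sentences, 4 token positions, 768 dimensions.
# The second sentence has two padding positions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 1], [1, 1, 0, 0]])

sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)  # (2, 768)
```

Because the output vectors have unit length, the cosine similarity between two of them reduces to a dot product.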
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("stephenhib/all-mpnet-base-v2-patabs-1epoc-batch32-100000")
# Run inference
sentences = [
'<p id="pa01" num="0001">An application apparatus (100) includes: an application needle (24) that applies, to a target, an application material having its viscosity changing under shear; a drive unit (90) that moves the application needle (24) up and down; and a controller (80) that controls the drive unit (90) to move the application needle such that shear is applied to the application material at a shear speed depending on a type of the application material and depending on a target application amount or a target application diameter.<img id="iaf01" file="imgaf001.tif" wi="78" he="56" img-content="drawing" img-format="tif"/></p>',
'COATING APPARATUS AND COATING METHOD',
'Electric motor',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
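For semantic search, the embeddings are typically ranked by cosine similarity against a query. The sketch below shows that ranking step with NumPy only; `rank_by_cosine` is a hypothetical helper, and the three-dimensional toy vectors are stand-ins for what `model.encode` would return, so the example runs without downloading the model.

```python
import numpy as np

def rank_by_cosine(query_vec, corpus_vecs, top_k=3):
    """Return (index, score) pairs for the top_k corpus rows most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q                         # cosine similarity per corpus row
    order = np.argsort(-scores)[:top_k]    # highest score first
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins (assumption): in real use these would be model.encode(...) outputs.
corpus = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
query = np.array([0.9, 0.1, 0.0])

ranking = rank_by_cosine(query, corpus, top_k=2)
print([index for index, _ in ranking])  # [0, 2]
```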
Evaluation
Metrics
Information Retrieval
- Dataset: sentence-transformers/all-mpnet-base-v2
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.592 |
cosine_accuracy@3 | 0.711 |
cosine_accuracy@5 | 0.751 |
cosine_accuracy@10 | 0.814 |
cosine_precision@1 | 0.592 |
cosine_precision@3 | 0.237 |
cosine_precision@5 | 0.1502 |
cosine_precision@10 | 0.0814 |
cosine_recall@1 | 0.592 |
cosine_recall@3 | 0.711 |
cosine_recall@5 | 0.751 |
cosine_recall@10 | 0.814 |
cosine_ndcg@10 | 0.6988 |
cosine_mrr@10 | 0.6625 |
cosine_map@100 | 0.6665 |
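In the table above, precision@k equals accuracy@k divided by k, and recall@k equals accuracy@k. That pattern is what one would expect if every query has exactly one relevant document; the card does not state this, so treat it as an assumption. Under that assumption, the metrics can be sketched from the rank of the single relevant document per query. `ir_metrics` and the toy ranks are hypothetical and are not the InformationRetrievalEvaluator implementation.

```python
def ir_metrics(first_relevant_ranks, k):
    """Compute accuracy/precision/recall/MRR at k.

    Assumes exactly one relevant document per query. Ranks are 1-based;
    None means the relevant document was not retrieved at all.
    """
    n = len(first_relevant_ranks)
    hit_ranks = [r for r in first_relevant_ranks if r is not None and r <= k]
    hits = len(hit_ranks)
    return {
        "accuracy": hits / n,
        "precision": hits / (k * n),   # at most one relevant item among k results
        "recall": hits / n,            # one relevant item per query
        "mrr": sum(1.0 / r for r in hit_ranks) / n,
    }

# Toy ranks for five queries (assumption, not taken from the evaluation data).
toy = ir_metrics([1, 2, 4, None, 1], k=3)
print(toy["accuracy"], toy["mrr"])  # 0.6 0.5

# Consistency check against the reported table values.
print(round(0.711 / 3, 4), round(0.814 / 10, 4))  # 0.237 0.0814
```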
Training Details
Training Dataset
json
- Dataset: json
- Size: 807,656 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:
 | positive | anchor |
---|---|---|
type | string | string |
details | min: 45 tokens, mean: 237.14 tokens, max: 384 tokens | min: 3 tokens, mean: 12.34 tokens, max: 101 tokens |
- Samples:
positive | anchor |
---|---|
The invention relates to an image fusion method and device, which includes: obtaining a first short-focus image and a first long-focus image acquired by a short-focus sensor and a long-focus sensor at the same time; according to the focal lengths of a short-focus lens and a long-focus lens, calculating a reduction coefficient corresponding to the first long-focus image when the sizes of the same target in the first long-focus image and the first short-focus image are matched; performing a reduction processing on the first long-focus image according to the reduction coefficient to obtain a second long-focus image; according to a relative angle of the current long-focus lens and short-focus lens, calculating a position of the second long-focus image in the first short-focus image when the positions of the same target in the second long-focus image and the first short-focus image are matched; and according to the position of the second long-focus image in the first short-focus image, covering the first short-focus image by the second long-focus image to obtain a fused image. According to embodiments of the present application, on the premise of considering both the monitoring range and the definition, the monitoring cost is reduced, and the monitoring efficiency is improved. | IMAGE FUSION METHOD AND DEVICE |
The present invention discloses an ex vivo method for the diagnostic and/or prognostic assessment of the acute-on-chronic liver failure (ACLF) syndrome in a patient with a liver disorder characterized in that it comprises the steps of: (a) measuring a panel of metabolites related with acylcarnitines-sialic acid-acetylated amino acids and/or sugar alcohols and derivatives-tryptophan metabolism-catecholamines derivatives in a biological sample of said patient; and (b) comparing the level of said metabolites in the sample with the level of said metabolites in healthy patients; and wherein an increase of at least 1.2 times of the level of said metabolites is indicative of ACLF syndrome. | METHOD FOR THE DIAGNOSTIC AND/OR PROGNOSTIC ASSESSMENT OF ACUTE-ON-CHRONIC LIVER FAILURE SYNDROME IN PATIENTS WITH LIVER DISORDERS |
A valve housing receives a spool 34 and the spool has a regulating chamber 52 selectively communicating a supply line to a return line. The spool 34 is biased in one direction by a spring force and there is a second force biasing the spool in an opposed direction whith the second bias force being provided by a fluid pressure within a hydraulic system associated which the pressure regulating valve. The amount of communication between the supply port 111 and the return port 99 is regulated by a position of the spool 34 as the bias force from the fluid pressure change. Damper chambers are provided on opposed sides of the spool and serve to dampen a speed of movement of the spool and a supply line for supplying fluid into the damper chambers through check valves 44, 64. The supply line serves to assist in purging air outwardly of the damper chambers. | Air purging pressure regulating valve |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
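To make the loss parameters concrete, here is a minimal NumPy sketch of the idea behind MultipleNegativesRankingLoss with `scale=20.0` and cosine similarity: each anchor is scored against every positive in the batch, and the other positives act as in-batch negatives. The function below is a hypothetical re-implementation for illustration, not the library's code, and the toy embeddings are assumptions.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchor_emb, positive_emb, scale=20.0):
    """Cross-entropy over scaled cosine similarities; the matching pair sits on the diagonal."""
    a = anchor_emb / np.linalg.norm(anchor_emb, axis=1, keepdims=True)
    p = positive_emb / np.linalg.norm(positive_emb, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                              # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy batch of three pairs (assumption): orthogonal unit vectors.
anchors = np.eye(3)
aligned_loss = multiple_negatives_ranking_loss(anchors, np.eye(3))
shuffled_loss = multiple_negatives_ranking_loss(anchors, np.roll(np.eye(3), 1, axis=0))
print(aligned_loss < shuffled_loss)  # True
```

The loss is near zero when each anchor is closest to its own positive and grows when another item in the batch scores higher, which is why avoiding duplicate texts within a batch matters for this objective.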
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 2
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
- batch_sampler: no_duplicates
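As a rough illustration of how `learning_rate: 2e-05` and `warmup_ratio: 0.1` interact with a linear scheduler, the sketch below computes the learning rate at a given optimizer step. The helper `linear_schedule_with_warmup` is hypothetical, and the total of 3125 steps is an assumption inferred from the training logs (step 100 corresponds to epoch 0.032), not a value stated in the card.

```python
import math

def linear_schedule_with_warmup(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear ramp from 0 to base_lr during warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 3125  # assumption: 100 steps / 0.032 epochs, from the training logs

# Learning rate at the start, mid-warmup, end of warmup, midpoint, and final step.
for step in (0, 156, 313, 1719, 3125):
    print(step, linear_schedule_with_warmup(step, TOTAL_STEPS))
```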
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 2
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | sentence-transformers/all-mpnet-base-v2_cosine_map@100 |
---|---|---|---|
0.032 | 100 | 0.1433 | 0.6217 |
0.064 | 200 | 0.0953 | 0.6447 |
0.096 | 300 | 0.1084 | 0.6612 |
0.128 | 400 | 0.0817 | 0.6546 |
0.16 | 500 | 0.0768 | 0.6512 |
0.192 | 600 | 0.0779 | 0.6466 |
0.224 | 700 | 0.0709 | 0.6594 |
0.256 | 800 | 0.0813 | 0.6441 |
0.288 | 900 | 0.0597 | 0.6454 |
0.32 | 1000 | 0.0744 | 0.6496 |
0.352 | 1100 | 0.0669 | 0.6608 |
0.384 | 1200 | 0.0657 | 0.6566 |
0.416 | 1300 | 0.0489 | 0.6660 |
0.448 | 1400 | 0.0643 | 0.6597 |
0.48 | 1500 | 0.0593 | 0.6587 |
0.512 | 1600 | 0.0598 | 0.6613 |
0.544 | 1700 | 0.0737 | 0.6570 |
0.576 | 1800 | 0.0661 | 0.6655 |
0.608 | 1900 | 0.0499 | 0.6613 |
0.64 | 2000 | 0.0641 | 0.6616 |
0.672 | 2100 | 0.0679 | 0.6662 |
0.704 | 2200 | 0.0521 | 0.6715 |
0.736 | 2300 | 0.0569 | 0.6651 |
0.768 | 2400 | 0.0507 | 0.6679 |
0.8 | 2500 | 0.0405 | 0.6678 |
0.832 | 2600 | 0.0548 | 0.6690 |
0.864 | 2700 | 0.0403 | 0.6692 |
0.896 | 2800 | 0.0613 | 0.6649 |
0.928 | 2900 | 0.0485 | 0.6673 |
0.96 | 3000 | 0.0495 | 0.6674 |
0.992 | 3100 | 0.0546 | 0.6665 |
Framework Versions
- Python: 3.11.9
- Sentence Transformers: 3.2.1
- Transformers: 4.45.2
- PyTorch: 2.3.1.post300
- Accelerate: 1.0.1
- Datasets: 3.0.1
- Tokenizers: 0.20.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}