SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2 on the skill_sentence and skill_skill datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 96 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Datasets:
  - skill_sentence
  - skill_skill
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 96, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): SmartTokenPooling({'word_embedding_dimension': 768, 'window_size': -1})
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("jensjorisdecorte/ConTeXT-Skill-Extraction-base")
# Run inference
sentences = [
    'Must have the ability to read and interpret schematics and effectively install and calibrate lift governors to ensure compliance with safety standards. The ideal candidate must have an ear for identifying music with commercial potential and understand the current market trends.',
    'install lift governor',
    'skill_sentence',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
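The similarity scores above are cosine similarities, so ranking candidate skills for a sentence comes down to sorting by that score. The following is a minimal sketch of the idea in plain Python, not the library's implementation. The helper names `cosine_similarity` and `rank_skills` are hypothetical, and the 3-dimensional vectors are toy stand-ins for real 768-dimensional embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_skills(sentence_embedding, skill_embeddings, skill_names):
    """Return (skill name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(sentence_embedding, embedding))
        for name, embedding in zip(skill_names, skill_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): in practice these would come from model.encode(...)
sentence_vec = [0.9, 0.1, 0.0]
skill_vecs = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
skill_names = ["identify music with commercial potential", "install lift governor"]

ranking = rank_skills(sentence_vec, skill_vecs, skill_names)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

With real embeddings, the same sort applied to one row of the similarity matrix would give the closest skills for that sentence.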
Training Details
Training Datasets
skill_sentence
- Dataset: skill_sentence
- Size: 138,260 training samples
- Columns: anchor, positive, and type
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | type |
|---|---|---|---|
| type | string | string | string |
| details | min: 9 tokens, mean: 35.67 tokens, max: 63 tokens | min: 3 tokens, mean: 6.12 tokens, max: 15 tokens | min: 5 tokens, mean: 5.0 tokens, max: 5 tokens |
- Samples:

| anchor | positive | type |
|---|---|---|
| duties for this role will include conducting water chemistry analysis and managing the laboratory. seeking a seasoned print manufacturing manager with knowledge of printing materials, processes and equipment. | water chemistry analysis | skill_sentence |
| divers must understand how to calculate dive times and limits to ensure they return safely. We are searching for a multimedia software expert with experience in sound, lighting and recording software. | comply with the planned time for the depth of the dive | skill_sentence |
| A successful candidate will possess the ability to calibrate laboratory equipment according to industry standards. we are seeking a candidate with experience in preparing government funding dossiers | prepare government funding dossiers | skill_sentence |
- Loss: custom_losses.HardMultipleNegativesRankingLoss with these parameters: { "scale": 20, "similarity_fct": "<lambda>" }
skill_skill
- Dataset: skill_skill
- Size: 13,891 training samples
- Columns: anchor, positive, and type
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | type |
|---|---|---|---|
| type | string | string | string |
| details | min: 6 tokens, mean: 29.09 tokens, max: 96 tokens | min: 3 tokens, mean: 6.24 tokens, max: 16 tokens | min: 5 tokens, mean: 5.0 tokens, max: 5 tokens |
- Samples:

| anchor | positive | type |
|---|---|---|
| Adapt and move set pieces during rehearsals and live performances. | adapt sets | skill_skill |
| Prepare bread and bread products such as sandwiches for consumption. | prepare bread products | skill_skill |
| The strategies, methods and techniques that increase the organisation's capacity to protect and sustain the services and operations that fulfil the organisational mission and create lasting values by effectively addressing the combined issues of security, preparedness, risk and disaster recovery. | organisational resilience | skill_skill |
- Loss: CachedMultipleNegativesSymmetricRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "mini_batch_size": 64 }
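To make the loss parameters concrete, here is a rough sketch of a symmetric in-batch-negatives ranking objective with `scale` 20.0 and cosine similarity. It is an illustration in plain Python, not the Sentence Transformers implementation. It does not model the caching or the `mini_batch_size` of 64, which are assumed here to affect memory use rather than the loss value. The function names are hypothetical.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cross_entropy(logits, target):
    """Negative log-softmax of the target entry (numerically stabilised)."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def symmetric_ranking_loss(anchors, positives, scale=20.0):
    """Average of anchor->positive and positive->anchor cross-entropy.

    Pair i is the correct match; every other item in the batch is a negative.
    """
    n = len(anchors)
    scores = [[scale * cos_sim(a, p) for p in positives] for a in anchors]
    # Forward direction: each anchor must pick its own positive (row-wise)
    forward = sum(cross_entropy(scores[i], i) for i in range(n)) / n
    # Backward direction: each positive must pick its own anchor (column-wise)
    backward = sum(
        cross_entropy([scores[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return (forward + backward) / 2


# Toy 2-dimensional embeddings (assumption, for illustration only)
aligned = symmetric_ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

The loss is near zero when each anchor is closest to its own positive and grows roughly with `scale` when the pairs are mismatched, which is why a larger batch (more in-batch negatives) makes the task harder.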
Training Hyperparameters
Non-Default Hyperparameters
- overwrite_output_dir: True
- eval_strategy: steps
- per_device_train_batch_size: 4096
- per_device_eval_batch_size: 4096
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
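As an illustration of what `warmup_ratio: 0.1` means in combination with the linear scheduler and the 5e-05 learning rate listed under All Hyperparameters, the sketch below computes the learning rate at a given step. It is a plain-Python approximation of a linear warmup and decay schedule, not the trainer's own code. The total of 38 optimizer steps is an assumption inferred from the training log (step 4 corresponds to epoch 0.1053), and `linear_schedule_lr` is a hypothetical helper name.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate under linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 38  # assumption: inferred from the training log, not stated in the card

for step in (0, 2, 4, 21, 38):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under these assumptions the warmup covers the first 4 steps, after which the rate falls linearly to zero by the end of the single epoch.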
All Hyperparameters
- overwrite_output_dir: True
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4096
- per_device_eval_batch_size: 4096
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step |
---|---|
0.1053 | 4 |
0.2105 | 8 |
0.3158 | 12 |
0.4211 | 16 |
0.5263 | 20 |
0.6316 | 24 |
0.7368 | 28 |
0.8421 | 32 |
0.9474 | 36 |
- The bold row denotes the saved checkpoint.
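The Epoch column can be reproduced from the dataset sizes and batch size reported above. The sketch below is a back-of-the-envelope check, assuming a single device, that each dataset is batched separately, and that the last partial batch is kept (`dataloader_drop_last: False`). `batches_per_epoch` is a hypothetical helper, not a library function.

```python
import math


def batches_per_epoch(dataset_sizes, batch_size, drop_last=False):
    """Number of batches when each dataset is split into batches on its own."""
    if drop_last:
        return sum(size // batch_size for size in dataset_sizes)
    return sum(math.ceil(size / batch_size) for size in dataset_sizes)


# Sizes of skill_sentence and skill_skill, and per_device_train_batch_size
steps_per_epoch = batches_per_epoch([138_260, 13_891], 4096)

# Logged steps are 4, 8, ..., 36; epoch fraction = step / steps per epoch
epoch_fractions = [round(step / steps_per_epoch, 4) for step in range(4, 37, 4)]

print(steps_per_epoch)
print(epoch_fractions)
```

Under these assumptions one epoch is 34 + 4 = 38 steps, and dividing each logged step by 38 gives the epoch values in the table.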
Framework Versions
- Python: 3.9.19
- Sentence Transformers: 3.1.0
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu118
- Accelerate: 0.34.2
- Datasets: 3.0.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}