SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0
This is a sentence-transformers model finetuned from avsolatorio/GIST-small-Embedding-v0. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: avsolatorio/GIST-small-Embedding-v0
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
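The pipeline above takes the [CLS] token embedding (`pooling_mode_cls_token: True`) and then L2-normalizes it. The following is a minimal sketch of those two steps on toy data, not the library's implementation; the helper names `cls_pool` and `l2_normalize` and the 4-dimensional toy vectors are illustrative assumptions (the real model produces 384 dimensions).

```python
import math


def cls_pool(token_embeddings):
    """Return a copy of the first token's embedding, i.e. the [CLS] position."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy input: 3 tokens with 4-dimensional embeddings (hypothetical values).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.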
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split")
# Run inference
sentences = [
'When administrators are released from their tasks, all personal administration identifiers assigned to them must be blocked. It must be checked which passwords the outgoing employees still know. Such passwords must be changed. Furthermore, it must be checked whether the outgoing employees have been appointed as contact persons to third parties, e.g. in contracts or as an admin-C entry at Internet-Domains. In this case, new contact persons must be identified and the interested third parties informed. Users of the affected IT systems and applications must be informed that the previous administrator has left.',
'The full life cycle of identities shall be managed.',
'A.5.16',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
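Since the card lists cosine similarity as the similarity function, the pairwise score matrix can be illustrated without the library. This is a sketch on toy 2-dimensional vectors; `cosine_similarity`, `similarity_matrix`, and `best_match` are hypothetical helper names, and the vectors are made up rather than real model output.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities, shaped like the [n, n] matrix above."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


def best_match(query, candidates):
    """Index of the candidate most similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy embeddings standing in for three encoded texts.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(score, 3) for score in row])

print(best_match([0.9, 0.2], toy_embeddings))  # 0
```

For a requirement-to-control lookup, the query would be the encoded requirement text and the candidates the encoded control descriptions.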
Evaluation
Metrics
Triplet
- Dataset: GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split_robustness_43_eval
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.8701 |
dot_accuracy | 0.1037 |
manhattan_accuracy | 0.8745 |
euclidean_accuracy | 0.8701 |
max_accuracy | 0.8745 |
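A triplet accuracy of this kind is the share of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative. The sketch below shows the cosine variant on made-up 2-dimensional vectors. It is an illustration of the idea only: `triplet_cosine_accuracy` is a hypothetical helper, and details of TripletEvaluator such as tie handling are not reproduced.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Toy triplets (hypothetical embeddings): (anchor, positive, negative).
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),   # negative is closer: incorrect
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.0]),  # positive is closer: correct
    ([1.0, 0.0], [0.5, 0.5], [0.0, -1.0]),  # positive is closer: correct
]

print(triplet_cosine_accuracy(triplets))  # 0.75
```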
Training Details
Training Dataset
Unnamed Dataset
- Size: 3,435 training samples
- Columns: anchor, positive, ISO_ID, and negative
- Approximate statistics based on the first 1000 samples:
Column | Type | Min | Mean | Max |
---|---|---|---|---|
anchor | string | 10 tokens | 78.38 tokens | 512 tokens |
positive | string | 10 tokens | 23.38 tokens | 192 tokens |
ISO_ID | string | 5 tokens | 6.99 tokens | 7 tokens |
negative | string | 10 tokens | 22.45 tokens | 192 tokens |
- Samples:
Sample 1
anchor: The Cloud Service Provider applies appropriate measures to check the cloud service for vulnerabilities which might have been integrated into the cloud service during the software development process.
The procedures for identifying such vulnerabilities are part of the software development process and, depending on a risk assessment, include the following activities:
• Static Application Security Testing;
• Dynamic Application Security Testing;
• Code reviews by the Cloud Service Provider's subject matter experts; and
• Obtaining information about confirmed vulnerabilities in software libraries provided by third parties and used in their own cloud service.
The severity of identified vulnerabilities is assessed according to defined criteria and measures are taken to immediately eliminate or mitigate them.
positive: Information about technical vulnerabilities of information systems in use shall be obtained, the organization’s exposure to such vulnerabilities shall be evaluated and appropriate measures shall be taken.
ISO_ID: A.8.8
negative: Backup copies of information, software and systems shall be maintained and regularly tested in accordance with the agreed topic-specific policy on backup.
Sample 2
anchor: Policies and instructions for planning and conducting audits are documented, communicated and made available in accordance with SP-01 and address the following aspects:
• Restriction to read-only access to system components in accordance with the agreed audit plan and as necessary to perform the activities;
• Activities that may result in malfunctions to the cloud service or breaches of contractual requirements are performed during scheduled maintenance windows or outside peak periods; and
• Logging and monitoring of activities.
positive: Audit tests and other assurance activities involving assessment of operational systems shall be planned and agreed between the tester and appropriate management.
ISO_ID: A.8.34
negative: The organization shall provide a mechanism for personnel to report observed or suspected information security events through appropriate channels in a timely manner.
Sample 3
anchor: System components in the Cloud Service Provider's area of responsibility that are used to provide the cloud service, authenticate users of the Cloud Service Provider's internal and external employees as well as system components that are involved in the Cloud Service Provider's automated authorisation processes. Access to the production environment requires two-factor or multi-factor authentication. Within the production environment, user authentication takes place through passwords, digitally signed certificates or procedures that achieve at least an equivalent level of security. If digitally signed certificates are used, administration is carried out in accordance with the Guideline for Key Management (cf. CRY-01). The password requirements are derived from a risk assessment and documented, communicated and provided in a password policy according to SP-01. Compliance with the requirements is enforced by the configuration of the system components, as far as technically possible.
positive: Allocation and management of authentication information shall be controlled by a management process, including advising personnel on appropriate handling of authentication information.
ISO_ID: A.5.17
negative: Networks and network devices shall be secured, managed and controlled to protect information in systems and applications.
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
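To make the two loss parameters concrete, here is a plain-Python sketch of a multiple-negatives ranking objective: each anchor's cosine similarities to all candidates in the batch are multiplied by `scale` and fed to a softmax cross-entropy whose target is the anchor's own positive. The function name `mnr_loss` and the toy vectors are assumptions for illustration. The sketch treats only the positive and negative columns as candidates; how the ISO_ID column was handled during training is not stated in this card.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors ("cos_sim")."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy over scaled similarities with in-batch negatives.

    Candidates are every positive in the batch followed by every negative;
    the correct candidate for anchor i is positives[i].
    """
    candidates = positives + negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine_similarity(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch of two triplets (hypothetical 2-dimensional embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = mnr_loss(anchors, positives, negatives)
swapped = mnr_loss(anchors, positives[::-1], negatives)
print(aligned < swapped)  # True
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity produce larger differences in loss.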
Evaluation Dataset
Unnamed Dataset
- Size: 916 evaluation samples
- Columns: anchor, positive, ISO_ID, and negative
- Approximate statistics based on the first 916 samples:
Column | Type | Min | Mean | Max |
---|---|---|---|---|
anchor | string | 10 tokens | 76.12 tokens | 512 tokens |
positive | string | 10 tokens | 43.39 tokens | 495 tokens |
ISO_ID | string | 5 tokens | 6.91 tokens | 7 tokens |
negative | string | 10 tokens | 42.2 tokens | 495 tokens |
- Samples:
Sample 1
anchor: The Cloud Service Provider provides a training program for regular, target group-oriented security training and awareness for internal and external employees on standards and methods of secure software development and provision as well as on how to use the tools used for this purpose. The program is regularly reviewed and updated with regard to the applicable policies and instructions, the assigned roles and responsibilities and the tools used.
positive: The organization shall:
a) determine the necessary competence of person(s) doing work under its control that affects its information security performance;
b) ensure that these persons are competent on the basis of appropriate education, training, or experience;
c) where applicable, take actions to acquire the necessary competence, and evaluate the effectiveness of the actions taken; and
d) retain appropriate documented information as evidence of competence.
NOTE Applicable actions can include, for example: the provision of training to, the mentoring of, or the re-assignment of current employees; or the hiring or contracting of competent persons.
ISO_ID: 7.2
negative: Knowledge gained from information security incidents shall be used to strengthen and improve the information security controls.
Sample 2
anchor: The Cloud Service Provider provides a training program for regular, target group-oriented security training and awareness for internal and external employees on standards and methods of secure software development and provision as well as on how to use the tools used for this purpose. The program is regularly reviewed and updated with regard to the applicable policies and instructions, the assigned roles and responsibilities and the tools used.
positive: Personnel of the organization and relevant interested parties shall receive appropriate information security awareness, education and training and regular updates of the organization's information security policy, topic-specific policies and procedures, as relevant for their job function.
ISO_ID: A.6.3
negative: Rules for the effective use of cryptography, including cryptographic key management, shall be defined and implemented.
Sample 3
anchor: The Cloud Service Provider provides a training program for regular, target group-oriented security training and awareness for internal and external employees on standards and methods of secure software development and provision as well as on how to use the tools used for this purpose. The program is regularly reviewed and updated with regard to the applicable policies and instructions, the assigned roles and responsibilities and the tools used.
positive: Changes to information processing facilities and information systems shall be subject to change management procedures.
ISO_ID: A.8.32
negative: Security perimeters shall be defined and used to protect areas that contain information and other associated assets.
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- num_train_epochs: 10
- warmup_ratio: 0.1
- bf16: True
- ddp_find_unused_parameters: True
- batch_sampler: no_duplicates
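For readers who want to reuse these settings, the sketch below collects the non-default values in a dictionary and shows one way they might be passed to `SentenceTransformerTrainingArguments`. The helper `build_training_args` and its `output_dir` argument are hypothetical, the original training script is not part of this card, and `bf16=True` assumes hardware with bfloat16 support.

```python
# Non-default hyperparameters, copied from the list above.
NON_DEFAULT_ARGS = {
    "eval_strategy": "epoch",
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "num_train_epochs": 10,
    "warmup_ratio": 0.1,
    "bf16": True,
    "ddp_find_unused_parameters": True,
    "batch_sampler": "no_duplicates",
}


def build_training_args(output_dir):
    """Build training arguments from the values above (hypothetical helper).

    The import is deferred so the dictionary can be inspected without
    sentence-transformers installed.
    """
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(
        output_dir=output_dir,  # assumption: any writable directory
        **NON_DEFAULT_ARGS,
    )


for name, value in NON_DEFAULT_ARGS.items():
    print(f"{name}: {value}")
```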
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: True
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: True
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split_robustness_43_eval_cosine_accuracy |
---|---|---|---|---|
1.0 | 429 | 1.78 | 1.3904 | 0.7980 |
2.0 | 858 | 1.2868 | 1.0807 | 0.8406 |
3.0 | 1287 | 1.0504 | 1.0073 | 0.8570 |
4.0 | 1716 | 0.9558 | 0.9815 | 0.8657 |
5.0 | 2145 | 0.8899 | 0.9706 | 0.8701 |
6.0 | 2574 | 0.8561 | 0.9637 | 0.8723 |
7.0 | 3003 | 0.8405 | 0.9612 | 0.8712 |
8.0 | 3432 | 0.8302 | 0.9603 | 0.8690 |
9.0 | 3861 | 0.8271 | 0.9598 | 0.8701 |
10.0 | 4290 | 0.8107 | 0.9600 | 0.8701 |
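The step counts in the log can be roughly reproduced from the reported hyperparameters. The arithmetic below is a sketch under one loud assumption: `NUM_DEVICES = 2` is inferred from the 429 steps per epoch and is not stated anywhere in this card. The `no_duplicates` batch sampler can also change the number of batches, so treat the result as an approximation. `linear_lr` is a hypothetical helper that mirrors a linear schedule with warmup.

```python
TRAIN_SAMPLES = 3435      # training set size from this card
PER_DEVICE_BATCH = 4      # per_device_train_batch_size
NUM_DEVICES = 2           # ASSUMPTION: inferred from 429 steps per epoch
EPOCHS = 10               # num_train_epochs
WARMUP_RATIO = 0.1        # warmup_ratio
PEAK_LR = 5e-05           # learning_rate

# dataloader_drop_last is True, so an incomplete final batch is discarded.
steps_per_epoch = TRAIN_SAMPLES // (PER_DEVICE_BATCH * NUM_DEVICES)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return PEAK_LR * max(0.0, remaining)


print(steps_per_epoch, total_steps, warmup_steps)  # 429 4290 429
```

Under these assumptions the warmup ends exactly at the close of the first epoch, after which the learning rate decays linearly until step 4290.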
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.1.0
- Transformers: 4.45.1
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.20.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for Behrni/GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split_robustness_43
Base model: avsolatorio/GIST-small-Embedding-v0