---
base_model: google/flan-t5-base
datasets:
- PiC/phrase_similarity
language:
- en
library_name: sentence-transformers
metrics:
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:7004
- loss:SoftmaxLoss
widget:
- source_sentence: >-
    The valve will open 100% when the set point is reached and will remain
    open until a certain blow down factor is reached.
  sentences:
  - >-
    Having raised $17,000,000 in a standard matter, one of the first
    speculative IPOs, Tucker needed more money to continue development of
    the car.
  - >-
    The valve will open 100% when the tennis scoring protocol is reached and
    will remain open until a certain blow down factor is reached.
  - >-
    But the government of PML (N) gave it the complete exponential of a
    Tehsil.
- source_sentence: >-
    Java BluePrints was the first source to promote Model View Controller
    (MVC) and Data Access Object (DAO) for Java EE application development.
  sentences:
  - >-
    Java BluePrints was the pioneer authority to promote Model View
    Controller (MVC) and Data Access Object (DAO) for Java EE application
    development.
  - >-
    One of the primary job of IIUG is to publish news through a monthly
    newsletter ("The Insider").
  - >-
    Opera Dragonfly must be downloaded on original practice, and functions
    offline thereafter.
- source_sentence: It also appears immediately after the first shower of the monsoon.
  sentences:
  - >-
    The latter can be minimised by meticulous precision to the wheel
    bearings, tyre sizes and pressures, and brakes (to avoid parasitic brake
    drag).
  - It also appears immediately after the initial rain of the monsoon.
  - >-
    McCullough filed a second appeal that could not be denied without a
    hearing from the State Attorney's Office.
- source_sentence: >-
    This type places the shifters closer to the hand positions, but still
    offer a simple reliable system, especially for touring cyclist.
  sentences:
  - >-
    This type places the shifters closer to the palm placement, but still
    offer a simple reliable system, especially for touring cyclist.
  - >-
    All square dancers learn standard "definitions" of calls, which they
    recall and use when the caller issues a certain directive.
  - >-
    Mainos-TV operated by leasing atmospheric duration from Yleisradio,
    broadcasting in reserved blocks between Yleisradio's own programming on
    its two channels.
- source_sentence: >-
    He also played with the Turkish 2nd Division team Pertevniyal, which was
    at the time the farm team of Efes, via a dual license.
  sentences:
  - >-
    The group is still active, producing a monthly action points on the
    women, peace, and authentication blocks affecting countries on Council's
    agenda.
  - >-
    Storage/centre tracks are found in the vicinity of the following
    stations:
    Other song highlights.
  - >-
    He also played with the Turkish 2nd Division team Pertevniyal, which was
    at the time the farm team of Efes, via a two-part authorization.
model-index:
- name: SentenceTransformer based on google/flan-t5-base
  results:
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: quora duplicates dev
      type: quora-duplicates-dev
    metrics:
    - type: cosine_accuracy
      value: 0.614
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.964033842086792
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 0.6810035842293908
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.9199645519256592
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 0.5307262569832403
      name: Cosine Precision
    - type: cosine_recall
      value: 0.95
      name: Cosine Recall
    - type: cosine_ap
      value: 0.6300846409755155
      name: Cosine Ap
    - type: dot_accuracy
      value: 0.53
      name: Dot Accuracy
    - type: dot_accuracy_threshold
      value: 3.121588945388794
      name: Dot Accuracy Threshold
    - type: dot_f1
      value: 0.6666666666666666
      name: Dot F1
    - type: dot_f1_threshold
      value: 1.7629711627960205
      name: Dot F1 Threshold
    - type: dot_precision
      value: 0.5010060362173038
      name: Dot Precision
    - type: dot_recall
      value: 0.996
      name: Dot Recall
    - type: dot_ap
      value: 0.4998742415317971
      name: Dot Ap
    - type: manhattan_accuracy
      value: 0.616
      name: Manhattan Accuracy
    - type: manhattan_accuracy_threshold
      value: 9.075502395629883
      name: Manhattan Accuracy Threshold
    - type: manhattan_f1
      value: 0.6818181818181819
      name: Manhattan F1
    - type: manhattan_f1_threshold
      value: 16.287639617919922
      name: Manhattan F1 Threshold
    - type: manhattan_precision
      value: 0.5286343612334802
      name: Manhattan Precision
    - type: manhattan_recall
      value: 0.96
      name: Manhattan Recall
    - type: manhattan_ap
      value: 0.6295013501048787
      name: Manhattan Ap
    - type: euclidean_accuracy
      value: 0.614
      name: Euclidean Accuracy
    - type: euclidean_accuracy_threshold
      value: 0.41224515438079834
      name: Euclidean Accuracy Threshold
    - type: euclidean_f1
      value: 0.6818181818181819
      name: Euclidean F1
    - type: euclidean_f1_threshold
      value: 0.7474431991577148
      name: Euclidean F1 Threshold
    - type: euclidean_precision
      value: 0.5286343612334802
      name: Euclidean Precision
    - type: euclidean_recall
      value: 0.96
      name: Euclidean Recall
    - type: euclidean_ap
      value: 0.6282929125909658
      name: Euclidean Ap
    - type: max_accuracy
      value: 0.616
      name: Max Accuracy
    - type: max_accuracy_threshold
      value: 9.075502395629883
      name: Max Accuracy Threshold
    - type: max_f1
      value: 0.6818181818181819
      name: Max F1
    - type: max_f1_threshold
      value: 16.287639617919922
      name: Max F1 Threshold
    - type: max_precision
      value: 0.5307262569832403
      name: Max Precision
    - type: max_recall
      value: 0.996
      name: Max Recall
    - type: max_ap
      value: 0.6300846409755155
      name: Max Ap
---
# SentenceTransformer based on google/flan-t5-base
This is a [sentence-transformers](https://www.sbert.net) model finetuned from [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the [PiC/phrase_similarity](https://huggingface.co/datasets/PiC/phrase_similarity) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [google/flan-t5-base](https://huggingface.co/google/flan-t5-base)
- **Maximum Sequence Length:** None (no explicit limit configured)
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:** [PiC/phrase_similarity](https://huggingface.co/datasets/PiC/phrase_similarity)
- **Language:** en
### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': None, 'do_lower_case': False}) with Transformer model: T5EncoderModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
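Loading the published checkpoint (shown under Usage below) is the normal route; for illustration only, here is a hedged sketch of how an equivalent two-module stack could be assembled from the standard `sentence_transformers` building blocks:

```python
# Minimal sketch: assembling the same Transformer + mean-pooling stack by hand.
# For T5-family checkpoints, models.Transformer loads only the encoder
# (T5EncoderModel), matching the architecture printout above.
from sentence_transformers import SentenceTransformer, models

word_embedding_model = models.Transformer("google/flan-t5-base")
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768 for flan-t5-base
    pooling_mode_mean_tokens=True,  # mean pooling, as in module (1) above
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```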
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Deehan1866/finetuned-flan-t5-base")
# Run inference
sentences = [
    'He also played with the Turkish 2nd Division team Pertevniyal, which was at the time the farm team of Efes, via a dual license.',
    'He also played with the Turkish 2nd Division team Pertevniyal, which was at the time the farm team of Efes, via a two-part authorization.',
    'Storage/centre tracks are found in the vicinity of the following stations:\nOther song highlights.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
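The similarity matrix can be turned into a binary same-meaning decision by thresholding. As an illustrative (not prescriptive) example, using the `cosine_f1_threshold` of roughly 0.92 reported under Evaluation below:

```python
# Illustrative only: binarize the cosine similarity of the first two sentences
# with the cosine_f1_threshold (~0.92) from the Evaluation section.
score = similarities[0, 1].item()
print("paraphrase" if score >= 0.92 else "not paraphrase")
```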
## Evaluation

### Metrics

#### Binary Classification

- Dataset: `quora-duplicates-dev`
- Evaluated with `BinaryClassificationEvaluator`
| Metric                       | Value   |
|:-----------------------------|:--------|
| cosine_accuracy              | 0.614   |
| cosine_accuracy_threshold    | 0.964   |
| cosine_f1                    | 0.681   |
| cosine_f1_threshold          | 0.92    |
| cosine_precision             | 0.5307  |
| cosine_recall                | 0.95    |
| cosine_ap                    | 0.6301  |
| dot_accuracy                 | 0.53    |
| dot_accuracy_threshold       | 3.1216  |
| dot_f1                       | 0.6667  |
| dot_f1_threshold             | 1.763   |
| dot_precision                | 0.501   |
| dot_recall                   | 0.996   |
| dot_ap                       | 0.4999  |
| manhattan_accuracy           | 0.616   |
| manhattan_accuracy_threshold | 9.0755  |
| manhattan_f1                 | 0.6818  |
| manhattan_f1_threshold       | 16.2876 |
| manhattan_precision          | 0.5286  |
| manhattan_recall             | 0.96    |
| manhattan_ap                 | 0.6295  |
| euclidean_accuracy           | 0.614   |
| euclidean_accuracy_threshold | 0.4122  |
| euclidean_f1                 | 0.6818  |
| euclidean_f1_threshold       | 0.7474  |
| euclidean_precision          | 0.5286  |
| euclidean_recall             | 0.96    |
| euclidean_ap                 | 0.6283  |
| max_accuracy                 | 0.616   |
| max_accuracy_threshold       | 9.0755  |
| max_f1                       | 0.6818  |
| max_f1_threshold             | 16.2876 |
| max_precision                | 0.5307  |
| max_recall                   | 0.996   |
| max_ap                       | 0.6301  |
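These numbers come from `BinaryClassificationEvaluator`. As a hedged sketch of how they could be recomputed (the single pair and label below are placeholders; the actual run used the full labeled evaluation split):

```python
# Hedged sketch: recomputing binary-classification metrics for this model.
# The pair and label below are placeholders, not the real evaluation data.
from sentence_transformers.evaluation import BinaryClassificationEvaluator

evaluator = BinaryClassificationEvaluator(
    sentences1=["It also appears immediately after the first shower of the monsoon."],
    sentences2=["It also appears immediately after the initial rain of the monsoon."],
    labels=[1],
    name="quora-duplicates-dev",
)
results = evaluator(model)  # dict of accuracy/F1/precision/recall/AP per similarity function
```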
## Training Details

### Training Dataset

#### PiC/phrase_similarity

- Dataset: [PiC/phrase_similarity](https://huggingface.co/datasets/PiC/phrase_similarity) at fc67ce7
- Size: 7,004 training samples
- Columns: `sentence1`, `sentence2`, and `label`
- Approximate statistics based on the first 1000 samples:

  |         | sentence1                                                                         | sentence2                                                                          | label                                           |
  |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-------------------------------------------------|
  | type    | string                                                                            | string                                                                             | int                                             |
  | details | <ul><li>min: 11 tokens</li><li>mean: 28.1 tokens</li><li>max: 73 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 28.82 tokens</li><li>max: 74 tokens</li></ul> | <ul><li>0: ~48.80%</li><li>1: ~51.20%</li></ul> |
- Samples:

  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>newly formed camp is released from the membrane and diffuses across the intracellular space where it serves to activate pka.</code> | <code>recently made encampment is released from the membrane and diffuses across the intracellular space where it serves to activate pka.</code> | <code>0</code> |
  | <code>According to one data, in 1910, on others – in 1915, the mansion became Natalya Dmitriyevna Shchuchkina's property.</code> | <code>According to a particular statistic, in 1910, on others – in 1915, the mansion became Natalya Dmitriyevna Shchuchkina's property.</code> | <code>1</code> |
  | <code>Note that Fact 1 does not assume any particular structure on the set formula_65.</code> | <code>Note that Fact 1 does not assume any specific edifice on the set formula_65.</code> | <code>0</code> |
- Loss: `SoftmaxLoss`
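As a minimal sketch, `SoftmaxLoss` for this 2-label pair-classification setup would typically be instantiated as follows (assuming `model` is the loaded SentenceTransformer):

```python
# Hedged sketch: SoftmaxLoss classifies concatenated pair embeddings into the
# 0/1 labels used by PiC/phrase_similarity.
from sentence_transformers import losses

train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),  # 768
    num_labels=2,
)
```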
### Evaluation Dataset

#### PiC/phrase_similarity

- Dataset: [PiC/phrase_similarity](https://huggingface.co/datasets/PiC/phrase_similarity) at fc67ce7
- Size: 1,000 evaluation samples
- Columns: `sentence1`, `sentence2`, and `label`
- Approximate statistics based on the first 1000 samples:

  |         | sentence1                                                                         | sentence2                                                                          | label                                           |
  |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-------------------------------------------------|
  | type    | string                                                                            | string                                                                             | int                                             |
  | details | <ul><li>min: 9 tokens</li><li>mean: 27.86 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 28.62 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>0: ~50.00%</li><li>1: ~50.00%</li></ul> |
- Samples:

  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>after theo's apparent death, she decides to leave first colony and ends up traveling with the apostles.</code> | <code>after theo's apparent death, she decides to leave original settlement and ends up traveling with the apostles.</code> | <code>0</code> |
  | <code>The guard assigned to Vivian leaves her to prevent the robbery, allowing her to connect to the bank's network.</code> | <code>The guard assigned to Vivian leaves her to prevent the robbery, allowing her to connect to the bank's locations.</code> | <code>0</code> |
  | <code>Two days later Louis XVI banished Necker by a "lettre de cachet" for his very public exchange of pamphlets.</code> | <code>Two days later Louis XVI banished Necker by a "lettre de cachet" for his very free forum of pamphlets.</code> | <code>0</code> |
- Loss: `SoftmaxLoss`
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 5
- `warmup_ratio`: 0.1
- `load_best_model_at_end`: True
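Expressed as code, these non-default values map onto `SentenceTransformerTrainingArguments` roughly as below; `output_dir` is a placeholder, not the path used for this run:

```python
# Hedged sketch: the non-default hyperparameters above as training arguments.
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-flan-t5-base",  # placeholder output path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=5,
    warmup_ratio=0.1,
    load_best_model_at_end=True,
)
```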
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

</details>
### Training Logs

| Epoch      | Step     | Training Loss | loss       | quora-duplicates-dev_max_ap |
|:----------:|:--------:|:-------------:|:----------:|:---------------------------:|
| 0          | 0        | -             | -          | 0.6114                      |
| 0.2283     | 100      | -             | 0.6937     | 0.6118                      |
| 0.4566     | 200      | -             | 0.6934     | 0.6120                      |
| 0.6849     | 300      | -             | 0.6933     | 0.6118                      |
| 0.9132     | 400      | -             | 0.6934     | 0.6123                      |
| 1.1416     | 500      | 0.6931        | 0.6933     | 0.6117                      |
| 1.3699     | 600      | -             | 0.6933     | 0.6117                      |
| 1.5982     | 700      | -             | 0.6933     | 0.6118                      |
| 1.8265     | 800      | -             | 0.6933     | 0.6130                      |
| 2.0548     | 900      | -             | 0.6932     | 0.6130                      |
| 2.2831     | 1000     | 0.6922        | 0.6931     | 0.6137                      |
| 2.5114     | 1100     | -             | 0.6931     | 0.6129                      |
| 2.7397     | 1200     | -             | 0.6930     | 0.6143                      |
| 2.9680     | 1300     | -             | 0.6928     | 0.6165                      |
| 3.1963     | 1400     | -             | 0.6926     | 0.6174                      |
| 3.4247     | 1500     | 0.6907        | 0.6924     | 0.6193                      |
| 3.6530     | 1600     | -             | 0.6920     | 0.6228                      |
| 3.8813     | 1700     | -             | 0.6918     | 0.6238                      |
| 4.1096     | 1800     | -             | 0.6915     | 0.6256                      |
| 4.3379     | 1900     | -             | 0.6912     | 0.6273                      |
| 4.5662     | 2000     | 0.6888        | 0.6910     | 0.6292                      |
| **4.7945** | **2100** | **-**         | **0.6908** | **0.6301**                  |
| 5.0        | 2190     | -             | -          | 0.6301                      |

- The bold row denotes the saved checkpoint.
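Putting the previous sections together, here is a hedged end-to-end sketch of a training loop that would produce logs like the table above. It reuses the illustrative `model`, `args`, `train_loss`, and `evaluator` objects from the earlier sketches, and the split/column handling is an assumption that may need adjusting to the dataset's actual schema:

```python
# Hedged sketch: end-to-end fine-tuning, reusing model/args/train_loss/evaluator
# from the sketches above. Column names follow this card; extra dataset columns
# may need dropping or renaming in practice.
from datasets import load_dataset
from sentence_transformers import SentenceTransformerTrainer

dataset = load_dataset("PiC/phrase_similarity")
train_ds = dataset["train"].select_columns(["sentence1", "sentence2", "label"])
eval_ds = dataset["validation"].select_columns(["sentence1", "sentence2", "label"])

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    loss=train_loss,
    evaluator=evaluator,
)
trainer.train()
```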
### Framework Versions
- Python: 3.10.10
- Sentence Transformers: 3.0.1
- Transformers: 4.42.3
- PyTorch: 2.2.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1
## Citation

### BibTeX

#### Sentence Transformers and SoftmaxLoss

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```