metadata
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1440
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: nomic-ai/modernbert-embed-base
widget:
- source_sentence: What section of the Code of Federal Regulations is quoted?
sentences:
- >-
and other legal relations of any interested party seeking such
declaration.” 28 U.S.C. § 2201(a).
This statute “is not an independent source of federal jurisdiction”;
rather, “the availability of
such relief presupposes the existence of a judicially remediable
right.” Schilling v. Rogers, 363
U.S. 666, 677 (1960). The Court independently has jurisdiction here
under the mandamus
- >-
appropriate only when the nature of the work is sporadic and
unpredictable so that a tour of duty
cannot be regularly scheduled in advance.” Pl.’s Mem. at 18 (quoting 5
C.F.R. § 340.403(a)).
This regulation explicitly distinguishes “intermittent” status from
“part-time” status, as it says
that “[w]hen an agency is able to schedule work in advance on a regular
basis, it has an
- >-
its discretion, a reviewing court looks to the trial court’s “stated
justification for refusing to
modify” the order. Skolnick, 191 Ill. 2d at 226.
In the case at bar, the one-sentence April 25 order did not provide any
reasons at all. The
losing party drafted the order without any stated reasons, although a
lack of stated reasons may
- source_sentence: Which office was determined to be an agency in the Soucie case?
sentences:
- >-
inquiry”); Doe v. Skyline Automobiles, Inc., 375 F. Supp. 3d 401, 405-06
(S.D.N.Y. 2019)
(“other factors must be taken into consideration and analyzed in
comparison to the public’s
interest and the interests of the opposing parties”).
Illinois has taken steps to protect individuals’ private information.
Examples include the
- >-
Aside from whether the Department’s “approach to artificial intelligence
development and
implementation” should be considered “critical infrastructure,” the
Department’s affidavit is
5
deficient in showing that its withholdings qualify as “critical
infrastructure security information”
in other ways. For example, the affidavit fails to explain how the
disclosure of the withheld infor-
- >-
whether an entity wields “substantial independent authority”:
investigative power and authority
to make final and binding decisions.
Consider first Soucie. The Circuit held that the Office of Science and
Technology
(“OST”) was an agency because, beyond advising the President, it had the
“independent function
- source_sentence: What is the appellant's burden on appeal?
sentences:
- >-
Defs.’ Reply at 7–8, 8 n.1. It cites Judicial Watch, Inc. v. Department
of Energy, 412 F.3d 125
(D.C. Cir. 2005), which dealt with the records of employees that the
Department of Energy
(“DOE”) had detailed to the National Energy Policy Development Group
(“NEPDG”). Id. at
132. The Government quotes the court’s statement that “the records
those employees created or
- >-
records available for inspection and copying is a violation of 5 U.S.C.
app. 2 § 10(b) and
constitutes a failure to perform a duty owed to EPIC within the meaning
of 28 U.S.C. § 1361.”
Id. . Both counts seek “a writ of mandamus” compelling the Commission
and its officers to
comply with FACA. Id. , 139. These counts make clear that EPIC seeks
mandamus relief
- >-
counsel now cannot fairly contend that the trial court did not consider
all the facts, especially
when [d]efendant’s counsel offers no court transcript to show
otherwise.” On appeal, it is
generally the appellant’s burden to provide the reviewing court with a
sufficient record to
establish the error that he complains of. Webster v. Hartman, 195 Ill.
2d 426, 436 (2001). “[A]
- source_sentence: What does the text refer to as a 'statutory distinction'?
sentences:
- >-
inconsistency in deeming the same entity an advisory committee and an
agency.” Defs.’ Reply
at 8. The problem, according to the Government, is that FACA generally
requires disclosure of
records, yet Exemption 5 would shield a portion of these records from
public view, which would
undermine FACA’s “purpose.” Id. at 8–9. Gates, Wolfe, and the 1988
OLC opinion echo this
- >-
agencies are operating arms of government characterized by ‘substantial
independent authority in
the exercise of specific functions.’” Disclosure of Advisory Comm.
Deliberative Materials, 12
Op. O.L.C. 73, 81 (1988). This “statutory distinction,” it concludes,
signifies that “advisory
committees are not agencies.” Id.
- |-
the Hon. Israel A. Desierto, Judge, presiding.
Judgment
Affirmed.
Counsel on
Appeal
Victor P. Henderson and Colin Quinn Commito, of Henderson Parks,
LLC, of Chicago, for appellant.
Tamara N. Holder, Law Firm of Tamara N. Holder LLC, of Chicago,
for appellee.
Panel
PRESIDING JUSTICE ODEN JOHNSON delivered the judgment of
the court, with opinion.
- source_sentence: >-
What do the newly enacted laws prohibit hospitals from doing regarding
sexual assault victims?
sentences:
- >-
exclusion for committees “composed wholly of . . . permanent part-time .
. . employees.” 5
U.S.C. app. 2 § 3(2).
32
A second, independent reason why the Commission does not fall within
this exclusion is
that its members are not “part-time” federal employees. Instead, they
are “intermittent”
employees. EPIC points to a regulation stating that “[a]n intermittent
work schedule is
- >-
committee, board, commission, council, conference, panel, task force, or
other similar group, or
any subcommittee or other subgroup thereof.” Id. § 3(2). Second, it
must be “established by
statute or reorganization plan,” “established or utilized by the
President,” or “established or
utilized by one or more agencies.” Id. Third, it must be “established”
or “utilized” “in the
- >-
confidential advisors (735 ILCS 5/8-804(c) (West 2022)) and prohibit
hospitals treating sexual
assault victims from directly billing the victims for the services,
communicating with victims
about a bill, or referring overdue bills to collection agencies or
credit reporting agencies. 410
ILCS 70/7.5(a)(1)-(4) (West 2022). These recently enacted laws encourage
victims to report
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.51875
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.69375
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.75
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.83125
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.51875
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.23125
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.14999999999999997
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.08312499999999999
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.51875
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.69375
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.75
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.83125
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.671534966140965
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6211160714285715
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6261949467277568
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.49375
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.73125
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.825
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.49375
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.2333333333333333
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.14625
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.08249999999999999
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.49375
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.73125
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.825
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6607544642083831
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6085367063492064
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6146313607229802
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.4375
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.6875
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.725
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.79375
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.4375
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.22916666666666666
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.145
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.079375
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.4375
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.6875
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.725
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.79375
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6224957341997419
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.566939484126984
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5740997074969412
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.40625
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.625
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.69375
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.775
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.40625
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.20833333333333331
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.13874999999999998
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.07749999999999999
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.40625
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.625
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.69375
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.775
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.5931742895464828
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.5348859126984128
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5417826806767716
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.30625
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.4875
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.6
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.6875
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.30625
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.16249999999999998
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.12
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.06875
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.30625
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.4875
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.6
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6875
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.4854299754851493
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.42175347222222237
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.4326739799760461
name: Cosine Map@100
Fine-tuned with QuicKB
This is a sentence-transformers model fine-tuned from nomic-ai/modernbert-embed-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: nomic-ai/modernbert-embed-base
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
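The Pooling and Normalize stages listed above can be illustrated without any ML library. The following is a minimal sketch, assuming mean pooling over non-padding tokens followed by L2 normalization; the function names and the toy 4-dimensional token vectors are hypothetical and are not part of the model.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [t / count for t in total]


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


# Toy input: 3 token positions, 4 dimensions; the last position is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final stage normalizes to unit length, the dot product of two output embeddings equals their cosine similarity.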
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("AdamLucek/modernbert-embed-quickb-video")
# Run inference
sentences = [
'What do the newly enacted laws prohibit hospitals from doing regarding sexual assault victims?',
'confidential advisors (735 ILCS 5/8-804(c) (West 2022)) and prohibit hospitals treating sexual \nassault victims from directly billing the victims for the services, communicating with victims \nabout a bill, or referring overdue bills to collection agencies or credit reporting agencies. 410 \nILCS 70/7.5(a)(1)-(4) (West 2022). These recently enacted laws encourage victims to report',
'exclusion for committees “composed wholly of . . . permanent part-time . . . employees.” 5 \nU.S.C. app. 2 § 3(2). \n32 \nA second, independent reason why the Commission does not fall within this exclusion is \nthat its members are not “part-time” federal employees. Instead, they are “intermittent” \nemployees. EPIC points to a regulation stating that “[a]n intermittent work schedule is',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
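Since the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128, and 64, the leading components of an embedding can be used on their own to save storage. The sketch below shows only the arithmetic (truncate, re-normalize, compare) on made-up 8-dimensional vectors; the helper names are hypothetical. In practice, the Sentence Transformers `truncate_dim` constructor argument can be used to truncate real embeddings.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0
    return [v / norm for v in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for a query and a passage embedding.
query = [0.5, 0.4, 0.3, 0.2, 0.1, 0.0, -0.1, -0.2]
passage = [0.4, 0.5, 0.2, 0.3, 0.0, 0.1, -0.2, -0.1]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(dim, round(cosine(q, p), 4))
```

Smaller dimensions trade retrieval quality for storage and speed; the Evaluation section reports how much quality each dimension retains for this model.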
Evaluation
Metrics
Information Retrieval
- Datasets: dim_768, dim_512, dim_256, dim_128, and dim_64
- Evaluated with InformationRetrievalEvaluator
Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|
cosine_accuracy@1 | 0.5188 | 0.4938 | 0.4375 | 0.4062 | 0.3063 |
cosine_accuracy@3 | 0.6937 | 0.7 | 0.6875 | 0.625 | 0.4875 |
cosine_accuracy@5 | 0.75 | 0.7312 | 0.725 | 0.6937 | 0.6 |
cosine_accuracy@10 | 0.8313 | 0.825 | 0.7937 | 0.775 | 0.6875 |
cosine_precision@1 | 0.5188 | 0.4938 | 0.4375 | 0.4062 | 0.3063 |
cosine_precision@3 | 0.2313 | 0.2333 | 0.2292 | 0.2083 | 0.1625 |
cosine_precision@5 | 0.15 | 0.1462 | 0.145 | 0.1387 | 0.12 |
cosine_precision@10 | 0.0831 | 0.0825 | 0.0794 | 0.0775 | 0.0688 |
cosine_recall@1 | 0.5188 | 0.4938 | 0.4375 | 0.4062 | 0.3063 |
cosine_recall@3 | 0.6937 | 0.7 | 0.6875 | 0.625 | 0.4875 |
cosine_recall@5 | 0.75 | 0.7312 | 0.725 | 0.6937 | 0.6 |
cosine_recall@10 | 0.8313 | 0.825 | 0.7937 | 0.775 | 0.6875 |
cosine_ndcg@10 | 0.6715 | 0.6608 | 0.6225 | 0.5932 | 0.4854 |
cosine_mrr@10 | 0.6211 | 0.6085 | 0.5669 | 0.5349 | 0.4218 |
cosine_map@100 | 0.6262 | 0.6146 | 0.5741 | 0.5418 | 0.4327 |
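In the table above, accuracy@k and recall@k are identical, and precision@k equals accuracy@k divided by k (for example, 0.6937 / 3 ≈ 0.2313). That pattern is consistent with each query having exactly one relevant passage. The sketch below computes the per-query metrics under that assumption; the document IDs and rankings are invented for illustration, and this is not the InformationRetrievalEvaluator implementation.

```python
def retrieval_metrics(ranked_ids, relevant_id, k):
    """Metrics for one query that has exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    accuracy = 1.0 if hit else 0.0
    return {
        "accuracy": accuracy,        # was the relevant doc in the top k?
        "precision": accuracy / k,   # at most 1 of the k results is relevant
        "recall": accuracy,          # 1 relevant doc in total
        "mrr": 1.0 / (top_k.index(relevant_id) + 1) if hit else 0.0,
    }


def average(metric_dicts):
    """Average each metric over all queries."""
    n = len(metric_dicts)
    return {key: sum(m[key] for m in metric_dicts) / n for key in metric_dicts[0]}


# Hypothetical rankings: (retrieved IDs in rank order, the single relevant ID).
queries = [
    (["d3", "d1", "d7"], "d3"),  # relevant doc at rank 1
    (["d2", "d9", "d4"], "d4"),  # relevant doc at rank 3
    (["d5", "d6", "d8"], "d1"),  # relevant doc not retrieved
    (["d1", "d2", "d3"], "d2"),  # relevant doc at rank 2
]

scores = average([retrieval_metrics(ranked, rel, k=3) for ranked, rel in queries])
print(scores)
```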
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,440 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
Field | anchor | positive |
---|---|---|
type | string | string |
details | min: 7 tokens, mean: 15.14 tokens, max: 29 tokens | min: 57 tokens, mean: 97.82 tokens, max: 161 tokens |
- Samples:
anchor | positive |
---|---|
What must the advisory committee make available for public inspection? | advisory committee shall be available for public inspection and copying . . . until the advisory committee ceases to exist.” Id. § 10(b). Unlike FOIA, this provision looks forward. It requires committees to take affirmative steps to make their records are public, even absent a request. FACA’s definition of “advisory committee” has four parts. First, it includes “any |
What did the landlords fail to alert the court about? | court documents containing fake citations, we conclude that imposing monetary sanctions or dismissing this appeal would be disproportionate to Al-Hamim’s violation of the Appellate Rules. 23 Further, in their answer brief, the landlords failed to alert this court to the hallucinations in Al-Hamim’s opening brief and did not request an award of attorney fees against Al-Hamim. Under the |
On what date was the motion served on the plaintiff’s counsel? | also alleged (1) that plaintiff violated section 2-401(e) and (2) that she lacked good cause to file anonymously because she signed an affidavit in her own name in another case with similar allegations. The April 13 motion contains a “Certificate of Service” stating that it was served on plaintiff’s counsel by e-mail on April 13. |
- Loss: MatryoshkaLoss with these parameters:
{
  "loss": "MultipleNegativesRankingLoss",
  "matryoshka_dims": [768, 512, 256, 128, 64],
  "matryoshka_weights": [1, 1, 1, 1, 1],
  "n_dims_per_step": -1
}
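To make the loss configuration more concrete, here is a minimal pure-Python sketch of how a Matryoshka-style loss can wrap an in-batch-negatives ranking loss: the ranking loss is evaluated on each truncated prefix of the embeddings and the weighted results are summed. The function names, the toy 4-dimensional vectors, and the similarity scale of 20.0 are assumptions made for illustration; this is not the Sentence Transformers implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positive i is the target for anchor i and every
    other positive in the batch acts as a negative. `scale` is an assumed value."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        total += weight * mnr_loss(truncated_anchors, truncated_positives)
    return total


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

With equal weights, as in the parameters above, every prefix length contributes equally, which pushes the leading dimensions to carry most of the ranking signal.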
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
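Two of these settings interact: the per-device batch size and the gradient accumulation steps together determine the effective batch size, and the warmup ratio and cosine scheduler together determine the learning rate at each optimizer step. The sketch below works through that arithmetic. It assumes a single training device and takes the total of 8 optimizer steps from the final row of the Training Logs table; the `cosine_lr` helper is a hypothetical re-implementation of a linear-warmup cosine schedule, not the trainer's own code.

```python
import math

per_device_train_batch_size = 32
gradient_accumulation_steps = 16
learning_rate = 2e-05
warmup_ratio = 0.1

# Assumption: training ran on a single device.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps


def cosine_lr(step, total_steps, base_lr=learning_rate, ratio=warmup_ratio):
    """Linear warmup followed by cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 8  # assumption: the last step shown in the training log
schedule = [cosine_lr(step, total_steps) for step in range(total_steps + 1)]

print(effective_batch_size)
for step, lr in enumerate(schedule):
    print(step, f"{lr:.2e}")
```

An effective batch of 512 pairs over 1,440 training samples gives roughly three optimizer steps per epoch, which is consistent with the log reporting step 3 at epoch 1.0. A large effective batch also matters for MultipleNegativesRankingLoss, since more in-batch pairs mean more negatives per anchor.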
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|
1.0 | 3 | 0.6493 | 0.6372 | 0.5987 | 0.5536 | 0.4520 |
2.0 | 6 | 0.6685 | 0.6514 | 0.6208 | 0.5916 | 0.4859 |
2.7111 | 8 | 0.6715 | 0.6608 | 0.6225 | 0.5932 | 0.4854 |
- The bold row denotes the saved checkpoint. Bold formatting is not shown here; the step 8 row matches the metrics reported under Evaluation.
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.4.0
- Transformers: 4.48.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}