SentenceTransformer based on allenai/specter

This is a sentence-transformers model fine-tuned from allenai/specter. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: allenai/specter
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
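
The Pooling module above has only pooling_mode_mean_tokens enabled, so a text's embedding is the average of its token vectors. A minimal pure-Python sketch of that idea follows; the function name mean_pool and the toy inputs are illustrative assumptions, and the real module works on batched tensors rather than lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask == 0).

    Hypothetical helper for illustration; not part of sentence-transformers.
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


# Three token vectors; the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```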

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("m7n/discipline-tuned_specter_1_001")
# Run inference
sentences = [
    'To deal with the theme of the "unrepresented" it is necessary to clarify what we mean by representation. Depending on whether a formal, substantial, descriptive or symbolic concept of representation is adopted, in fact, the answer to the question: "Who are the unrepresented?" changes. Based on a formal concept, the unrepresented are those formally excluded from political rights. On the basis of a substantial conception, instead, they too can be considered represented, if there is someone who pursues their interests in the institutions. According to a descriptive concept, an assembly selected through the draw must be considered representative. The same can be said of a leader in whom a community identifies itself symbolically. The author claims that the adoption of exclusively substantial, descriptive or symbolic conceptions of representation involves many problems from the point of view of democratic theory, and therefore adopts a formal perspective. According to it, the unrepresented can be divided into three categories: a) who has not the right to elect representatives; b) who has this right, but fails to elect his or her own representative; c) who has this right but doesn\'t exercise it. The first category includes foreign residents without citizenship in democratic countries. The author argues that discrimination against them is not rationally justifiable, because it cannot be based on any of the classic arguments developed to limit political rights (such as lack of capacity, independence, or interest). The second category includes those who vote, but don\'t contribute to the election of anyone representing them. The existence of this category raises the problem of distorting electoral laws, and the issue of the size of representative assemblies. The third category includes those who don\'t exercise their political rights. A worrying sign that the vote by many is no longer perceived as a vehicle for change.',
    'The constitutional State is under attack. If we separate Rule of Law and democratic sovereignty, civil rights and social rights, the holding of pluralist democracies is jeopardized. The sunset of the Rule of Law risks being one of the most dangerous consequences of neoliberal globalism and its crisis. The demolition of the welfare State and the technocratic depletion of politics have in fact generated a distortion of constitutional democracies, which can open the way for the questioning of the Rule of law. The opposing ideological narratives on the Rule of Law can be grouped according to two visions: an optimistic one, which sees in neo-liberal globalization the opportunity for its generalized diffusion; a radical-maximalist, which completely liquidates its regulatory framework and inheritance. The essay analyzes these two trends, to focus then on the emergency paradigm as a challenge to the "Rule of law".',
    'Human attachment relationships are considered to be foundational to psychological well-being (Fonagy, ; Warren, Huston, Egeland, & Sroufe, ) and, by extension, attachment to God is often considered foundational to psychological well-being amongst Christian believers (Kirkpatrick, ; Miner, ). However, studies of psychological need satisfaction by different attachment figures (La Guardia, Ryan, Couchman, & Deci, ) suggest that experiences in which basic psychological needs are satisfied are conducive to more secure attachment relationships, and thus, to enhanced psychological well-being. This paper tests two contrasting models of attachment to God, need satisfaction, and well-being: the Attachment Security Primacy Model which holds that attachment security facilitates experiences of psychological need satisfaction and thence increased well-being; and the Need Satisfaction Primacy Model which holds that experiences of psychological need satisfaction facilitate attachment security and thence increased well-being. Using self-report data from Australian Christian participants, Structural Equation Modeling indicated that the Need Satisfaction Primacy Model fit the data better than competing models. Implications for augmenting theories of attachment to God and providing contexts in which people can experience God as meeting basic needs are discussed.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
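
For intuition, the cosine similarity that this model is configured to use can be written out in plain Python. This is only a sketch on toy vectors; cosine_similarity and similarity_matrix are hypothetical helper names, not library functions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 2-dimensional "embeddings" standing in for the 768-dimensional ones.
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = similarity_matrix(toy)
for row in matrix:
    print([round(value, 3) for value in row])
```

The diagonal is 1 because every vector is maximally similar to itself, and vector length does not affect the score.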

Evaluation

Metrics

Triplet

Datasets: specter_og and modernBERT_disciplines
Evaluated with TripletEvaluator

Metric	specter_og	modernBERT_disciplines
cosine_accuracy	0.984	0.9847
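
As a rough illustration of what cosine_accuracy measures: it is understood here to be the fraction of triplets whose anchor is more similar (by cosine) to the positive than to the negative. The sketch below assumes that definition; the helper names and toy triplets are made up for the example.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def triplet_cosine_accuracy(triplets):
    """Share of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Toy triplets: the second one is deliberately ranked the wrong way round.
toy_triplets = [
    ([1.0, 0.0], [1.0, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),
    ([0.0, 1.0], [0.2, 1.0], [1.0, 0.0]),
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.0]),
]
print(triplet_cosine_accuracy(toy_triplets))  # 0.75
```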

Training Details

Training Dataset

Unnamed Dataset

Size: 7,828 training samples
Columns: anchor, positive, and negative

Approximate statistics based on the first 1000 samples:

	anchor	positive	negative
type	string	string	string
details	min: 88 tokens, mean: 245.68 tokens, max: 512 tokens	min: 85 tokens, mean: 243.1 tokens, max: 512 tokens	min: 77 tokens, mean: 242.04 tokens, max: 512 tokens

Samples:

anchor	positive	negative
ChemInformVolume , Issue Reviews ChemInform Abstract: -Chloro- -aza- -propeniminium Units as Versatile Building Blocks in Organic Synthesis J. LIEBSCHER, J. LIEBSCHER Sekt. Chem., Humboldt-Univ., DDR- BerlinSearch for more papers by this author J. LIEBSCHER, J. LIEBSCHER Sekt. Chem., Humboldt-Univ., DDR- BerlinSearch for more papers by this author First published: January , the full textAboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinked InRedditWechat No abstract is available for this article. Volume00, Issue0January , RelatedInformation	ChemInformVolume , Issue Heterocyclic Compounds ChemInform Abstract: Anhydrous CeCl0-Catalyzed C0-Selective Propargylation of Indoles with Tertiary Alcohols. Claudio C. Silveira, Claudio C. Silveira Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorSamuel R. Mendes, Samuel R. Mendes Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorLucas Wolf, Lucas Wolf Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorGuilherme M. Martins, Guilherme M. Martins Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this author Claudio C. Silveira, Claudio C. Silveira Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorSamuel R. Mendes, Samuel R. Mendes Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, ...	INSIGHTVolume , Issue p. - Special Feature Highway Infrastructure Michael E. Krueger, Michael E. Krueger Search for more papers by this author Michael E. Krueger, Michael E. Krueger Search for more papers by this author First published: June 0AboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat Citing Literature Volume0, Issue0April 0000Pages - RelatedInformation
Background: When determining the duration of an acute bout of physical activity (PA) in an experiment, it is important for researchers to consider associations between duration and the target outcome, as well as how amenable participants will be to enrolling, and whether they will be capable of completing the study. Researchers must strike a suitable balance when working with populations that are commonly inactive, such as people with schizophrenia. Conceptually, a participant's task self-efficacy might indicate their willingness to participate in a study and their confidence in completing a PA protocol. To inform a future protocol, this study examined the self-efficacy of individuals with schizophrenia to complete PA bouts of differing durations. Methods: A secondary analysis on reliability data from a Health Action Process Approach inventory for PA in schizophrenia (n= ) was performed. Task self-efficacy was measured using -items. Participants rated how confident they were in their p...	Involvement opportunities (IOs) are perceived benefits that are only present through continued sport involvement (Weiss & Amorose, ). Knowing which IOs are poignant in different segments of a population may be important in explaining participants' sport commitment, their behaviours, and purchase intentions (Casper & Stellino, ; Young, Bennett & Seguin, ). This study examined how Masters swimmers judged IOs, as a function of age group ( - , - , - , +), sex, prior participation length (< , + yrs), and probability (low, high) of attending a World championship event. Participants reported information on demographics, sport involvement, intentions, and responded to a survey (Bennett & Young, ) assessing different IOs. A series of MANOVAs identified differences according to sample segments, all ps < . All age cohorts highly recognized opportunities for 'enjoyment', 'health and fitness', 'social', 'stress relief' and 'personal testing and assessment', though the youngest group viewed the latt...	The article analyzes the current state with anti-monopolistic regulation with regards to the transactions of mergers and acquisitions in Russia, describes the recent changes in legislation related to it, and analyzes the major trends in state regulation over merges and acquisitions in the post-crisis period. The acquisition of TNK-BP by Rosneft' is the central example discussed in the article. The author presents recommendation towards improving the mechanisms of evaluation of the effect produced by the transaction of merges and acquisitions upon Russian economy
In academe, there is a great bifurcation in the understanding of such things, i.e., the main thought of German ideology and its status in the history of Marxist philosophy. By exploring the historical background and the purpose of German ideology, and with the help of the direct explanation of Marx and Engels in this book, the author thinks that the main thought and basic content of German ideology is the theory on individuals. After that, the author elucidates the relation between the theory on individuals of Marxism and historical materialism, and the history of Marxist philosophy as well.	After the Second Opium War, the government of the Qing Dynasty signed Treaty of Tianjing with America and British in succession, starting from which Shantou opened its seaport to the world, and the foreigners had enjoyed the rights to rent estate to build warehouse, church, hospital and graveyards. Being different with other cities such as Shanghai, Tianjin, and Hankou, where designed some special areas to rent, the expanding situation in Shantou is more complicated. Based on the analysis of British Public Archives Files, this paper focuses on the two recorded disputes on estate involving Chinese and foreigners in Shantou, in order to help us to gain a deep understanding of the complicate situation in the expanding progression of Chinese coastal ports, by presenting the transition of the strategies adopted by foreigners to rent or buy estate, and the reactions taken by Chinese officials during the formation of Shantou City in the end of the Qing Dynasty.	We complete the determination of the \ell -block distribution of characters for quasi-simple exceptional groups of Lie type up to some minor ambiguities relating to non-uniqueness of Jordan decomposition. For this, we first determine the \ell -block distribution for finite reductive groups whose ambient algebraic group defined in characteristic different from \ell has connected centre. As a consequence we derive a compatibility between \ell -blocks, e -Harish-Chandra series and Jordan decomposition. Further we apply our results to complete the proof of Robinson's conjecture on defects of characters.

Loss: TripletLoss with these parameters:

{
    "distance_metric": "TripletDistanceMetric.COSINE",
    "triplet_margin": 0.3
}
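
The loss configured above can be sketched per triplet as max(d(a, p) - d(a, n) + margin, 0), assuming cosine distance means 1 minus cosine similarity. The code below is an illustrative pure-Python version with hypothetical helper names; the actual training code averages this quantity over a batch of tensors.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity (assumed meaning of TripletDistanceMetric.COSINE)."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the gap between positive and negative distances."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Well-separated triplet: the negative is already more than 0.3 farther away.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# Same vectors with positive and negative swapped: heavily penalised.
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
print(easy, hard)
```

A triplet contributes zero loss once the negative is at least the margin (0.3) farther from the anchor than the positive, so training focuses on triplets that are still confused.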

Evaluation Dataset

Unnamed Dataset

Size: 391 evaluation samples
Columns: anchor, positive, and negative

Approximate statistics based on the first 391 samples:

	anchor	positive	negative
type	string	string	string
details	min: 88 tokens, mean: 241.2 tokens, max: 512 tokens	min: 91 tokens, mean: 242.12 tokens, max: 512 tokens	min: 92 tokens, mean: 244.54 tokens, max: 512 tokens

Samples:

anchor	positive	negative
Some two component catalysts supported on inorganic oxides were prepared by wet impregnation and their catalytic performances for the direct synthesis of dimethyl carbonate from carbon dioxide, propylene oxide and methanol were studied. The influences of reaction temperature, amount of catalyst, reaction pressure and size of support on the synthetic reaction were investigated. The results showed that two component catalyst supported on ZnO had good catalytic activity and optimum reaction temperature was . The highest yield of DMC was obtained over a catalyst with % active component. The influence of reaction pressure was not obvious, and the decrease of the size of support favored the formation of DMC.	-Dihydroxybenzoic acid was prepared by carboxylation of resorcinol in solvent under under atmospheric pressure was studied.The mechanism of Kolbe-Schmitt carboxylation was analyzed and the sutable solvent for the carboxylation reaction was selected.The optimum process parameters were determined by orthogonal test.Under the optimized parameters,i.e.,resorcinol to potassium carbonate molar ratio of ,reaction temperature of - ,reaction time of hours- hours,and dimethyl acetamide as solvent,the yield of -dihydroxybenzoic acid was up to %	Background This paper uses a SEIR(D) model to analyse the time-varying transmission dynamics of the COVID- epidemic in Korea throughout its multiple stages of development. This multi-stage estimation of the model parameters offers a better model fit compared to the whole period analysis and shows how the COVID- 's infection patterns change over time, primarily depending on the effectiveness of the public health authority's non-pharmaceutical interventions (NPIs).Methods This paper uses the SEIR(D) compartment model to simulate and estimate the parameters for three distinctive stages of the COVID- epidemic in Korea, using a manually compiled COVID- epidemic dataset for the period between February and February . The paper identifies three major stages of the COVID- epidemic, conducts multi-stage estimations of the SEIR(D) model parameters, and carefully infers context-dependent meaning of the estimation results to help better understand the unique patterns of the transmission of the nove...
Clinical Pharmacology & TherapeuticsVolume , Issue p. - USAN Council List No. First published: July ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinked InRedditWechat Abstract Clinical Pharmacology and Therapeutics ( ) , ; doi: /clpt.0000.000 Volume00, Issue0July 0000Pages - RelatedInformation	Clinical Pharmacology & TherapeuticsVolume , Issue p. - FDA papers FDA papers II Walter Modell M.D., Walter Modell M.D.Search for more papers by this authorC. E. Healy M.D., C. E. Healy M.D. Evansville, Ind.Search for more papers by this author Walter Modell M.D., Walter Modell M.D.Search for more papers by this authorC. E. Healy M.D., C. E. Healy M.D. Evansville, Ind.Search for more papers by this author First published: March 0AboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat Citing Literature Volume0, Issue0March 0000Pages - Relat...	Preeclampsia is characterized by reduced placental perfusion with placental ischemia and hypertension during pregnancy. Preeclamptic women also exhibit a heightened inflammatory state and greater number of neutrophils in the vasculature compared to normal pregnancy. Since neutrophils are associated with tissue injury and inflammation, we hypothesized that neutrophils are critical to placental ischemia-induced hypertension and fetal demise. Using the reduced uteroplacental perfusion pressure (RUPP) model of placental ischemia-induced hypertension in the rat, we determined the effect of neutrophil depletion on blood pressure and fetal resorptions. Neutrophils were depleted with repeated injections of polyclonal rabbit anti-rat polymorphonuclear leukocyte (PMN) antibody (antiPMN). Rats received either antiPMN or normal rabbit serum (Control) on , , , and days post conception (dpc). On dpc, rats underwent either Sham surgery or clip placement on ovarian arteries and abdominal aorta to redu...
Prior to the start of the LHC Run , the US ATLAS Software and Computing operations program established three shared Tier Analysis Facilities (AFs). The newest AF was established at the University of Chicago in the past year, joining the existing AFs at Brookhaven National Lab and SLAC National Accelerator Lab. In this paper, we will describe both the common and unique aspects of these three AFs, and the resulting distributed facility from the user's perspective, including how we monitor and measure the AFs. The common elements include enabling easy access via Federated ID, file sharing via EOS, provisioning of similar Jupyter environments using common Jupyter kernels and containerization, and efforts to centralize documentation and user support channels. The unique components we will cover are driven in turn by the requirements, expertise and resources at each individual site. Finally, we will highlight how the US AFs are collaborating with other ATLAS and LHC wide (IRIS-HEP and HSF) u...	Network traffic optimisation is difficult as the load is by nature dynamic and seemingly unpredictable. However, the increased usage of file transfer services may help the detection of future loads and the prediction of their expected duration. The NOTED project seeks to do exactly this and to dynamically adapt network topology to deliver improved bandwidth for users of such services. This article introduces, and explains the features of, the two main components of NOTED, the Transfer Broker and the Network Intelligence component. The Transfer Broker analyses all queued and on-going FTS transfers, producing a traffic report which can be used by network controllers. Based on this report and its knowledge of the network topology and routing, the Network Intelligence (NI) component makes decisions as to when a network reconfiguration could be beneficial. Any Software Defined Network controller can then apply these decision to the network, so optimising transfer execution time and reducing...	Human Geophagia, a phenomenon widely practised especially in Africa, is the craving and deliberate ingestion of clayey soils. It is frequently practised by women and children to relieve hunger, supply nutritional deficiencies or as folk medicine. Geophagic individuals are very selective in the type of clayey soil they consume, where it is obtained, and its physical state; as well as its colour, smell and texture. Though clayey soils are medicinal, they could equally be risky and hazardous to human health. Reports have associated geophagia with iron deficiency leading to anaemia, infestation with geohelminths, and abrasion of the gastro-intestines. This overview brings awareness on clayey soils consumed and throws light on the human health associated effects.

Loss: TripletLoss with these parameters:

{
    "distance_metric": "TripletDistanceMetric.COSINE",
    "triplet_margin": 0.3
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 4
per_device_eval_batch_size: 4
learning_rate: 1e-05
weight_decay: 0.01
num_train_epochs: 2
warmup_ratio: 0.1
batch_sampler: no_duplicates
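
To make the learning_rate and warmup_ratio settings concrete, here is a sketch of a linear schedule with warmup, which is what the linear scheduler type is assumed to mean. The total of 3,914 optimizer steps is an assumption derived from 7,828 samples at batch size 4 for 2 epochs on a single device without gradient accumulation; the function name is hypothetical.

```python
import math


def linear_warmup_lr(step, base_lr=1e-05, total_steps=3914, warmup_ratio=0.1):
    """Learning rate at a given optimizer step (illustrative sketch).

    Ramps linearly from 0 to base_lr over the warmup steps, then decays
    linearly back to 0 at total_steps.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 392 with the defaults
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 196, 392, 2000, 3914):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the peak rate of 1e-05 is reached at step 392, roughly 10% of the way through training, and the rate then falls steadily to zero.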

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 4
per_device_eval_batch_size: 4
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 1e-05
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	Validation Loss	specter_og_cosine_accuracy	modernBERT_disciplines_cosine_accuracy
0	0	-	-	0.9579	-
0.0511	100	0.1007	0.0612	0.9649	-
0.1022	200	0.0442	0.0423	0.9687	-
0.1533	300	0.0372	0.0342	0.9725	-
0.2044	400	0.0319	0.0274	0.9725	-
0.2555	500	0.0307	0.0282	0.9738	-
0.3066	600	0.0318	0.0268	0.9789	-
0.3577	700	0.0278	0.0251	0.9770	-
0.4088	800	0.0266	0.0282	0.9757	-
0.4599	900	0.0274	0.0252	0.9745	-
0.5110	1000	0.0317	0.0263	0.9770	-
0.5621	1100	0.024	0.0249	0.9770	-
0.6132	1200	0.0201	0.0236	0.9770	-
0.6643	1300	0.0202	0.0225	0.9757	-
0.7154	1400	0.0284	0.0228	0.9777	-
0.7665	1500	0.0229	0.0236	0.9777	-
0.8176	1600	0.0299	0.0219	0.9789	-
0.8687	1700	0.0315	0.0197	0.9808	-
0.9198	1800	0.0222	0.0193	0.9840	-
0.9709	1900	0.0251	0.0197	0.9821	-
1.0220	2000	0.0283	0.0190	0.9789	-
1.0731	2100	0.017	0.0198	0.9770	-
1.1242	2200	0.0154	0.0189	0.9821	-
1.1753	2300	0.0079	0.0192	0.9840	-
1.2264	2400	0.0042	0.0191	0.9834	-
1.2775	2500	0.0065	0.0197	0.9808	-
1.3286	2600	0.0066	0.0198	0.9796	-
1.3797	2700	0.0058	0.0196	0.9821	-
1.4308	2800	0.0084	0.0196	0.9828	-
1.4819	2900	0.009	0.0199	0.9847	-
1.5330	3000	0.0053	0.0193	0.9828	-
1.5841	3100	0.0075	0.0185	0.9821	-
1.6352	3200	0.0045	0.0188	0.9840	-
1.6863	3300	0.0051	0.0185	0.9821	-
1.7374	3400	0.008	0.0189	0.9821	-
1.7885	3500	0.0097	0.0187	0.9834	-
1.8396	3600	0.0083	0.0186	0.9840	-
1.8906	3700	0.007	0.0183	0.9847	-
1.9417	3800	0.0072	0.0180	0.9840	-
1.9673	3850	-	-	-	0.9847

Framework Versions

Python: 3.10.12
Sentence Transformers: 3.3.1
Transformers: 4.48.0.dev0
PyTorch: 2.5.1+cu121
Accelerate: 1.2.1
Datasets: 3.2.0
Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
