SentenceTransformer based on allenai/specter

This is a sentence-transformers model fine-tuned from allenai/specter. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: allenai/specter
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
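
The Pooling module above has only pooling_mode_mean_tokens enabled, so each sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask. The function name mean_pool and the toy vectors are illustrative and are not part of the library.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy example: three real tokens and one padding token, 2 dimensions
# (the real model uses 768 dimensions).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [3.0, 4.0]
```

The padding vector [9.0, 9.0] does not affect the result because its mask entry is 0.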

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("m7n/discipline-tuned_specter_1_001")
# Run inference
sentences = [
    'To deal with the theme of the "unrepresented" it is necessary to clarify what we mean by representation. Depending on whether a formal, substantial, descriptive or symbolic concept of representation is adopted, in fact, the answer to the question: "Who are the unrepresented?" changes. Based on a formal concept, the unrepresented are those formally excluded from political rights. On the basis of a substantial conception, instead, they too can be considered represented, if there is someone who pursues their interests in the institutions. According to a descriptive concept, an assembly selected through the draw must be considered representative. The same can be said of a leader in whom a community identifies itself symbolically. The author claims that the adoption of exclusively substantial, descriptive or symbolic conceptions of representation involves many problems from the point of view of democratic theory, and therefore adopts a formal perspective. According to it, the unrepresented can be divided into three categories: a) who has not the right to elect representatives; b) who has this right, but fails to elect his or her own representative; c) who has this right but doesn\'t exercise it. The first category includes foreign residents without citizenship in democratic countries. The author argues that discrimination against them is not rationally justifiable, because it cannot be based on any of the classic arguments developed to limit political rights (such as lack of capacity, independence, or interest). The second category includes those who vote, but don\'t contribute to the election of anyone representing them. The existence of this category raises the problem of distorting electoral laws, and the issue of the size of representative assemblies. The third category includes those who don\'t exercise their political rights. A worrying sign that the vote by many is no longer perceived as a vehicle for change.',
    'The constitutional State is under attack. If we separate Rule of Law and democratic sovereignty, civil rights and social rights, the holding of pluralist democracies is jeopardized. The sunset of the Rule of Law risks being one of the most dangerous consequences of neoliberal globalism and its crisis. The demolition of the welfare State and the technocratic depletion of politics have in fact generated a distortion of constitutional democracies, which can open the way for the questioning of the Rule of law. The opposing ideological narratives on the Rule of Law can be grouped according to two visions: an optimistic one, which sees in neo-liberal globalization the opportunity for its generalized diffusion; a radical-maximalist, which completely liquidates its regulatory framework and inheritance. The essay analyzes these two trends, to focus then on the emergency paradigm as a challenge to the "Rule of law".',
    'Human attachment relationships are considered to be foundational to psychological well-being (Fonagy, ; Warren, Huston, Egeland, & Sroufe, ) and, by extension, attachment to God is often considered foundational to psychological well-being amongst Christian believers (Kirkpatrick, ; Miner, ). However, studies of psychological need satisfaction by different attachment figures (La Guardia, Ryan, Couchman, & Deci, ) suggest that experiences in which basic psychological needs are satisfied are conducive to more secure attachment relationships, and thus, to enhanced psychological well-being. This paper tests two contrasting models of attachment to God, need satisfaction, and well-being: the Attachment Security Primacy Model which holds that attachment security facilitates experiences of psychological need satisfaction and thence increased well-being; and the Need Satisfaction Primacy Model which holds that experiences of psychological need satisfaction facilitate attachment security and thence increased well-being. Using self-report data from Australian Christian participants, Structural Equation Modeling indicated that the Need Satisfaction Primacy Model fit the data better than competing models. Implications for augmenting theories of attachment to God and providing contexts in which people can experience God as meeting basic needs are discussed.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
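
Because the model's similarity function is cosine similarity, the 3 x 3 matrix above holds the cosine of the angle between each pair of embeddings. As a rough illustration of that computation (not the library's implementation), here is a stdlib-only sketch on toy 2-dimensional vectors. The helper names cosine_similarity and similarity_matrix are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities, one row per embedding."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Toy stand-ins for model.encode(...) output; real embeddings have 768 dimensions.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 3) for value in row])
```

The diagonal is 1 (each vector compared with itself) and the matrix is symmetric, which is also what to expect from the real similarity output.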

Evaluation

Metrics

Triplet

  • Datasets: specter_og and modernBERT_disciplines
  • Evaluated with TripletEvaluator
    Metric            specter_og   modernBERT_disciplines
    cosine_accuracy   0.984        0.9847
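
To make the cosine_accuracy metric concrete, the sketch below assumes that a triplet counts as correct when the anchor is more similar to the positive than to the negative under cosine similarity, and that accuracy is the fraction of correct triplets. The function names and toy vectors are illustrative. This is not the TripletEvaluator source.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Four toy triplets; the second one is deliberately ranked the wrong way round.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.0]),
    ([1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]),
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.75
```

Under this reading, the reported 0.984 means that roughly 98 of every 100 evaluation triplets place the positive closer to the anchor than the negative.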

Training Details

Training Dataset

Unnamed Dataset

  • Size: 7,828 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor          positive        negative
    type      string          string          string
    min       88 tokens       85 tokens       77 tokens
    mean      245.68 tokens   243.1 tokens    242.04 tokens
    max       512 tokens      512 tokens      512 tokens
  • Samples:
    anchor positive negative
    ChemInformVolume , Issue Reviews ChemInform Abstract: -Chloro- -aza- -propeniminium Units as Versatile Building Blocks in Organic Synthesis J. LIEBSCHER, J. LIEBSCHER Sekt. Chem., Humboldt-Univ., DDR- BerlinSearch for more papers by this author J. LIEBSCHER, J. LIEBSCHER Sekt. Chem., Humboldt-Univ., DDR- BerlinSearch for more papers by this author First published: January , the full textAboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinked InRedditWechat No abstract is available for this article. Volume00, Issue0January , RelatedInformation ChemInformVolume , Issue Heterocyclic Compounds ChemInform Abstract: Anhydrous CeCl0-Catalyzed C0-Selective Propargylation of Indoles with Tertiary Alcohols. Claudio C. Silveira, Claudio C. Silveira Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorSamuel R. Mendes, Samuel R. Mendes Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorLucas Wolf, Lucas Wolf Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorGuilherme M. Martins, Guilherme M. Martins Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this author Claudio C. Silveira, Claudio C. Silveira Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, BrazilSearch for more papers by this authorSamuel R. Mendes, Samuel R. Mendes Dep. Quim., Univ. Fed. Santa Maria, Santa Maria, Rio Grande do Sul, ... 
INSIGHTVolume , Issue p. - Special Feature Highway Infrastructure Michael E. Krueger, Michael E. Krueger Search for more papers by this author Michael E. Krueger, Michael E. Krueger Search for more papers by this author First published: June 0AboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat Citing Literature Volume0, Issue0April 0000Pages - RelatedInformation
    Background: When determining the duration of an acute bout of physical activity (PA) in an experiment, it is important for researchers to consider associations between duration and the target outcome, as well as how amenable participants will be to enrolling, and whether they will be capable of completing the study. Researchers must strike a suitable balance when working with populations that are commonly inactive, such as people with schizophrenia. Conceptually, a participant's task self-efficacy might indicate their willingness to participate in a study and their confidence in completing a PA protocol. To inform a future protocol, this study examined the self-efficacy of individuals with schizophrenia to complete PA bouts of differing durations. Methods: A secondary analysis on reliability data from a Health Action Process Approach inventory for PA in schizophrenia (n= ) was performed. Task self-efficacy was measured using -items. Participants rated how confident they were in their p... Involvement opportunities (IOs) are perceived benefits that are only present through continued sport involvement (Weiss & Amorose, ). Knowing which IOs are poignant in different segments of a population may be important in explaining participants' sport commitment, their behaviours, and purchase intentions (Casper & Stellino, ; Young, Bennett & Seguin, ). This study examined how Masters swimmers judged IOs, as a function of age group ( - , - , - , +), sex, prior participation length (< , + yrs), and probability (low, high) of attending a World championship event. Participants reported information on demographics, sport involvement, intentions, and responded to a survey (Bennett & Young, ) assessing different IOs. A series of MANOVAs identified differences according to sample segments, all ps < . 
All age cohorts highly recognized opportunities for 'enjoyment', 'health and fitness', 'social', 'stress relief' and 'personal testing and assessment', though the youngest group viewed the latt... The article analyzes the current state with anti-monopolistic regulation with regards to the transactions of mergers and acquisitions in Russia, describes the recent changes in legislation related to it, and analyzes the major trends in state regulation over merges and acquisitions in the post-crisis period. The acquisition of TNK-BP by Rosneft' is the central example discussed in the article. The author presents recommendation towards improving the mechanisms of evaluation of the effect produced by the transaction of merges and acquisitions upon Russian economy
    In academe, there is a great bifurcation in the understanding of such things, i.e., the main thought of German ideology and its status in the history of Marxist philosophy. By exploring the historical background and the purpose of German ideology, and with the help of the direct explanation of Marx and Engels in this book, the author thinks that the main thought and basic content of German ideology is the theory on individuals. After that, the author elucidates the relation between the theory on individuals of Marxism and historical materialism, and the history of Marxist philosophy as well. After the Second Opium War, the government of the Qing Dynasty signed Treaty of Tianjing with America and British in succession, starting from which Shantou opened its seaport to the world, and the foreigners had enjoyed the rights to rent estate to build warehouse, church, hospital and graveyards. Being different with other cities such as Shanghai, Tianjin, and Hankou, where designed some special areas to rent, the expanding situation in Shantou is more complicated. Based on the analysis of British Public Archives Files, this paper focuses on the two recorded disputes on estate involving Chinese and foreigners in Shantou, in order to help us to gain a deep understanding of the complicate situation in the expanding progression of Chinese coastal ports, by presenting the transition of the strategies adopted by foreigners to rent or buy estate, and the reactions taken by Chinese officials during the formation of Shantou City in the end of the Qing Dynasty. We complete the determination of the \ell -block distribution of characters for quasi-simple exceptional groups of Lie type up to some minor ambiguities relating to non-uniqueness of Jordan decomposition. For this, we first determine the \ell -block distribution for finite reductive groups whose ambient algebraic group defined in characteristic different from \ell has connected centre. 
As a consequence we derive a compatibility between \ell -blocks, e -Harish-Chandra series and Jordan decomposition. Further we apply our results to complete the proof of Robinson's conjecture on defects of characters.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.3
    }
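
The parameters above correspond to the usual triplet objective: with cosine distance d(x, y) = 1 - cos(x, y), the per-triplet loss is max(d(anchor, positive) - d(anchor, negative) + margin, 0). The following is a small stdlib-only sketch of that formula with the margin of 0.3 from this card. The function names are illustrative, and the real loss is averaged over a batch of model embeddings.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the gap between positive and negative distances."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(gap + margin, 0.0)


anchor = [1.0, 0.0]
# Easy triplet: the positive matches the anchor and the negative is orthogonal.
easy = triplet_loss(anchor, [1.0, 0.0], [0.0, 1.0])
# Hard triplet: positive and negative are swapped, so the loss is large.
hard = triplet_loss(anchor, [0.0, 1.0], [1.0, 0.0])
# Tie: positive and negative are equally distant, so only the margin remains.
tie = triplet_loss(anchor, [1.0, 1.0], [1.0, 1.0])
print(easy, hard, tie)
```

A triplet stops contributing to the loss once the negative is at least 0.3 farther from the anchor (in cosine distance) than the positive.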
    

Evaluation Dataset

Unnamed Dataset

  • Size: 391 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 391 samples:
              anchor          positive        negative
    type      string          string          string
    min       88 tokens       91 tokens       92 tokens
    mean      241.2 tokens    242.12 tokens   244.54 tokens
    max       512 tokens      512 tokens      512 tokens
  • Samples:
    anchor positive negative
    Some two component catalysts supported on inorganic oxides were prepared by wet impregnation and their catalytic performances for the direct synthesis of dimethyl carbonate from carbon dioxide, propylene oxide and methanol were studied. The influences of reaction temperature, amount of catalyst, reaction pressure and size of support on the synthetic reaction were investigated. The results showed that two component catalyst supported on ZnO had good catalytic activity and optimum reaction temperature was . The highest yield of DMC was obtained over a catalyst with % active component. The influence of reaction pressure was not obvious, and the decrease of the size of support favored the formation of DMC. -Dihydroxybenzoic acid was prepared by carboxylation of resorcinol in solvent under under atmospheric pressure was studied.The mechanism of Kolbe-Schmitt carboxylation was analyzed and the sutable solvent for the carboxylation reaction was selected.The optimum process parameters were determined by orthogonal test.Under the optimized parameters,i.e.,resorcinol to potassium carbonate molar ratio of ,reaction temperature of - ,reaction time of hours- hours,and dimethyl acetamide as solvent,the yield of -dihydroxybenzoic acid was up to % Background This paper uses a SEIR(D) model to analyse the time-varying transmission dynamics of the COVID- epidemic in Korea throughout its multiple stages of development. This multi-stage estimation of the model parameters offers a better model fit compared to the whole period analysis and shows how the COVID- 's infection patterns change over time, primarily depending on the effectiveness of the public health authority's non-pharmaceutical interventions (NPIs).Methods This paper uses the SEIR(D) compartment model to simulate and estimate the parameters for three distinctive stages of the COVID- epidemic in Korea, using a manually compiled COVID- epidemic dataset for the period between February and February . 
The paper identifies three major stages of the COVID- epidemic, conducts multi-stage estimations of the SEIR(D) model parameters, and carefully infers context-dependent meaning of the estimation results to help better understand the unique patterns of the transmission of the nove...
    Clinical Pharmacology & TherapeuticsVolume , Issue p. - USAN Council List No. First published: July ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinked InRedditWechat Abstract Clinical Pharmacology and Therapeutics ( ) , ; doi: /clpt.0000.000 Volume00, Issue0July 0000Pages - RelatedInformation Clinical Pharmacology & TherapeuticsVolume , Issue p. - FDA papers FDA papers II Walter Modell M.D., Walter Modell M.D.Search for more papers by this authorC. E. Healy M.D., C. E. Healy M.D. Evansville, Ind.Search for more papers by this author Walter Modell M.D., Walter Modell M.D.Search for more papers by this authorC. E. Healy M.D., C. E. Healy M.D. Evansville, Ind.Search for more papers by this author First published: March 0AboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat Citing Literature Volume0, Issue0March 0000Pages - Relat... Preeclampsia is characterized by reduced placental perfusion with placental ischemia and hypertension during pregnancy. 
Preeclamptic women also exhibit a heightened inflammatory state and greater number of neutrophils in the vasculature compared to normal pregnancy. Since neutrophils are associated with tissue injury and inflammation, we hypothesized that neutrophils are critical to placental ischemia-induced hypertension and fetal demise. Using the reduced uteroplacental perfusion pressure (RUPP) model of placental ischemia-induced hypertension in the rat, we determined the effect of neutrophil depletion on blood pressure and fetal resorptions. Neutrophils were depleted with repeated injections of polyclonal rabbit anti-rat polymorphonuclear leukocyte (PMN) antibody (antiPMN). Rats received either antiPMN or normal rabbit serum (Control) on , , , and days post conception (dpc). On dpc, rats underwent either Sham surgery or clip placement on ovarian arteries and abdominal aorta to redu...
    Prior to the start of the LHC Run , the US ATLAS Software and Computing operations program established three shared Tier Analysis Facilities (AFs). The newest AF was established at the University of Chicago in the past year, joining the existing AFs at Brookhaven National Lab and SLAC National Accelerator Lab. In this paper, we will describe both the common and unique aspects of these three AFs, and the resulting distributed facility from the user's perspective, including how we monitor and measure the AFs. The common elements include enabling easy access via Federated ID, file sharing via EOS, provisioning of similar Jupyter environments using common Jupyter kernels and containerization, and efforts to centralize documentation and user support channels. The unique components we will cover are driven in turn by the requirements, expertise and resources at each individual site. Finally, we will highlight how the US AFs are collaborating with other ATLAS and LHC wide (IRIS-HEP and HSF) u... Network traffic optimisation is difficult as the load is by nature dynamic and seemingly unpredictable. However, the increased usage of file transfer services may help the detection of future loads and the prediction of their expected duration. The NOTED project seeks to do exactly this and to dynamically adapt network topology to deliver improved bandwidth for users of such services. This article introduces, and explains the features of, the two main components of NOTED, the Transfer Broker and the Network Intelligence component. The Transfer Broker analyses all queued and on-going FTS transfers, producing a traffic report which can be used by network controllers. Based on this report and its knowledge of the network topology and routing, the Network Intelligence (NI) component makes decisions as to when a network reconfiguration could be beneficial. 
Any Software Defined Network controller can then apply these decision to the network, so optimising transfer execution time and reducing... Human Geophagia, a phenomenon widely practised especially in Africa, is the craving and deliberate ingestion of clayey soils. It is frequently practised by women and children to relieve hunger, supply nutritional deficiencies or as folk medicine. Geophagic individuals are very selective in the type of clayey soil they consume, where it is obtained, and its physical state; as well as its colour, smell and texture. Though clayey soils are medicinal, they could equally be risky and hazardous to human health. Reports have associated geophagia with iron deficiency leading to anaemia, infestation with geohelminths, and abrasion of the gastro-intestines. This overview brings awareness on clayey soils consumed and throws light on the human health associated effects.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.3
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
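
As a worked example of how these settings interact, the sketch below derives the step count and the learning-rate curve. It assumes a single device and no gradient accumulation, and it uses the 7,828 training samples and the linear scheduler listed elsewhere in this card. The function linear_schedule_lr is a hypothetical re-implementation of linear warmup followed by linear decay, not the trainer's own code.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=1e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 7,828 samples with per_device_train_batch_size 4, for num_train_epochs 2.
steps_per_epoch = math.ceil(7828 / 4)
total_steps = steps_per_epoch * 2
print(steps_per_epoch, total_steps)  # 1957 3914

# Learning rate at the start, mid-warmup, the peak, and the final step.
for step in (0, 196, 392, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

The figure of 1,957 steps per epoch is consistent with the training logs, where step 100 corresponds to epoch 0.0511.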

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss specter_og_cosine_accuracy modernBERT_disciplines_cosine_accuracy
0 0 - - 0.9579 -
0.0511 100 0.1007 0.0612 0.9649 -
0.1022 200 0.0442 0.0423 0.9687 -
0.1533 300 0.0372 0.0342 0.9725 -
0.2044 400 0.0319 0.0274 0.9725 -
0.2555 500 0.0307 0.0282 0.9738 -
0.3066 600 0.0318 0.0268 0.9789 -
0.3577 700 0.0278 0.0251 0.9770 -
0.4088 800 0.0266 0.0282 0.9757 -
0.4599 900 0.0274 0.0252 0.9745 -
0.5110 1000 0.0317 0.0263 0.9770 -
0.5621 1100 0.024 0.0249 0.9770 -
0.6132 1200 0.0201 0.0236 0.9770 -
0.6643 1300 0.0202 0.0225 0.9757 -
0.7154 1400 0.0284 0.0228 0.9777 -
0.7665 1500 0.0229 0.0236 0.9777 -
0.8176 1600 0.0299 0.0219 0.9789 -
0.8687 1700 0.0315 0.0197 0.9808 -
0.9198 1800 0.0222 0.0193 0.9840 -
0.9709 1900 0.0251 0.0197 0.9821 -
1.0220 2000 0.0283 0.0190 0.9789 -
1.0731 2100 0.017 0.0198 0.9770 -
1.1242 2200 0.0154 0.0189 0.9821 -
1.1753 2300 0.0079 0.0192 0.9840 -
1.2264 2400 0.0042 0.0191 0.9834 -
1.2775 2500 0.0065 0.0197 0.9808 -
1.3286 2600 0.0066 0.0198 0.9796 -
1.3797 2700 0.0058 0.0196 0.9821 -
1.4308 2800 0.0084 0.0196 0.9828 -
1.4819 2900 0.009 0.0199 0.9847 -
1.5330 3000 0.0053 0.0193 0.9828 -
1.5841 3100 0.0075 0.0185 0.9821 -
1.6352 3200 0.0045 0.0188 0.9840 -
1.6863 3300 0.0051 0.0185 0.9821 -
1.7374 3400 0.008 0.0189 0.9821 -
1.7885 3500 0.0097 0.0187 0.9834 -
1.8396 3600 0.0083 0.0186 0.9840 -
1.8906 3700 0.007 0.0183 0.9847 -
1.9417 3800 0.0072 0.0180 0.9840 -
1.9673 3850 - - - 0.9847

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0.dev0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Model size: 110M params (Safetensors, tensor type F32)