SentenceTransformer based on answerdotai/ModernBERT-large
This is a sentence-transformers model fine-tuned from answerdotai/ModernBERT-large. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: answerdotai/ModernBERT-large
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
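The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy inputs are illustrative assumptions, and padding is assumed to be excluded through an attention mask.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]


# Toy input: three tokens of dimension 2; the last token is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

In the real model the token vectors have 1024 dimensions and come from ModernBertModel; the toy dimension of 2 is only for readability.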
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("m7n/discipline-bert-modern-large_v02")
# Run inference
sentences = [
"The social sciences have long shown that health is not born of pure biology, empirically (re)centred the social and material causes of disease, and affirmed the subjective experiences of disease. Disputed both in popular and academic discourses, social health has variously attempted to stress the social aspects of health. Existing conceptions remain analytically limited as they are predominantly used as descriptors for populational health. This article theorises social health as an analytical lens for making sense of the relations, affects and events where health unfolds and comes into expression. Drawing on social practice theory, feminist care ethics and posthumanism this conceptual paper re-imagines how social health might be conceived as lived social practices anchored in care. Care within our framework acknowledges the unavoidable interdependency foundational to the existence of beings and stresses the 'know how' and embodied practices of care in the mundane in order to emphasise that care itself is absolutely integral to the maintenance of social health. The article argues that health needs to be understood as a verb intrinsically (re)made in and through social contexts and structures and comprised of meaningful, human-human and human-non-human interactions. Ultimately, in theorising social health through mundane care practices, we hope to open up research to making sense of how the doing of health unfolds inside often banal, patterned forms of social activity. Such taken-for-granted social practices exemplify the often overlooked lived realities that comprise our health. To understand health in its own right, we argue, these everyday practices need to be interrogated.",
"Care has been theorised in relationship to eating disorders as a central consideration across diagnoses. In the context of avoidant restrictive food intake disorder (ARFID) specifically, there is room to further develop the nuances around layers of care involved in working towards well-being. In this paper, we engage with the stories of caregivers of people with ARFID, exploring their pathways to care (or lack thereof) through the healthcare system in Aotearoa New Zealand. We explore the material, affective and relational aspects of care and care-seeking, engaging with the power and politics of care as it flows through care-seeking assemblages. Using postqualitative methods of analysis, we discuss how while participants were seeking care, they received (or, at times, did not receive) treatment, and unpack how care and treatment are not always synonymous. We work up extracts from parents' stories surrounding their caring for their children and how their actions were, at times, interpreted in ways that made them feel blame and shame rather than care. Participants' stories also offer glimmers of care within a resource-strapped healthcare system, which invite us to consider the potentiality of a relational ethics of care as an assemblage-shifting moment.",
'A dB dynamic range and cm spatial resolution tunable photon-counting optical time-domain reflectometer (PC-OTDR) is presented along with a Field Programmable Gate Array (FPGA)-based detection management system that allows several regions of the fiber to be interrogated by the same optical pulse, increasing the data acquisition rate when compared to previous solutions. The optical pulse generation is implemented by a tunable figure- passive mode-locked laser providing pulses with the desired bandwidth and center wavelength for WDM applications in the C-band. The acquisition rate is limited by the afterpulse effect and dead time of the employed gated avalanche single-photon detectors. The devised acquisition system not only allows for centimeter-resolution monitoring of fiber links as long as km in under minutes but is also readily adapted to any other photon-counting strategy for increased acquisition rate. The system provides a -fold decrease in acquisition times when compared with state-of-the-art solutions, allowing affordable times in centimeter-resolution long-distance fiber measurements.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
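Because the card lists Cosine Similarity as the similarity function, the pairwise scores in the usage example can be read as cosine similarities between embedding vectors. Here is a minimal pure-Python sketch of that computation on toy 2-dimensional vectors; the helper names are illustrative assumptions and this is not how the library computes it internally.

```python
import math
from typing import List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for three embeddings (the real ones have 1024 dimensions).
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
scores = similarity_matrix(toy)
for row in scores:
    print([round(x, 4) for x in row])
```

The matrix is symmetric with values close to 1.0 on the diagonal, which is the same layout as the 3 x 3 result in the usage example.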
Evaluation
Metrics
Triplet
- Datasets: modernBERT and modernBERT_disciplines
- Evaluated with TripletEvaluator
Metric | modernBERT | modernBERT_disciplines |
---|---|---|
cosine_accuracy | 0.9847 | 0.9789 |
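As a rough illustration of what cosine_accuracy measures, the sketch below counts the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. It assumes this is the criterion applied by the evaluator; the function names and toy vectors are hypothetical and stand in for real embeddings.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def triplet_cosine_accuracy(triplets: List[Tuple[Vector, Vector, Vector]]) -> float:
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Four toy triplets; the second one is deliberately ranked the wrong way round.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),
    ([1.0, 1.0], [2.0, 2.0], [1.0, -1.0]),
    ([1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]),
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.75
```

Read this way, the reported 0.9847 would mean that roughly 98.5% of the evaluation triplets are ordered correctly.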
Training Details
Training Dataset
Unnamed Dataset
- Size: 7,828 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
Column | type | min | mean | max |
---|---|---|---|---|
anchor | string | 86 tokens | 240.32 tokens | 633 tokens |
positive | string | 84 tokens | 243.66 tokens | 668 tokens |
negative | string | 88 tokens | 237.15 tokens | 681 tokens |
- Samples:
Each sample is a triplet; its three abstracts appear on consecutive lines in the order anchor, positive, negative.
Flash memory devices are investigated to confirm their application as physically unclonable functions (PUFs). Inherent fluctuations in the characteristics of flash memory devices, even with identical fabrication processes, produce different outputs, which are useful for device fingerprints. A difference in programming/erasing efficiency arises from a widely distributed threshold voltage. However, statistical fluctuations in the threshold voltage represent an advantage for PUF applications. The characteristics of PUFs, such as their unclonability, uncontrollability, unpredictability, and robustness, are investigated using fabricated flash memory devices. A simulation study is performed to support the experimental results and to show that the unpredictability is induced by variations in the gate dielectric thickness.
Ternary Content Addressable Memory (TCAM) is used in applications that require a low power dissipation and fast data retrieval. This paper presents a domain wall-based spintronic TCAM cell. The proposed design exploits the resistive behavior of this nonvolatile memory, reduces total power dissipation by reducing the voltage swing at the match line, and minimizes delay by employing a tiny sensing unit within each cell. Our experimental evaluation on nm technology for a -bit word-size TCAM at an V supply voltage and mV sense margin show that the delay is less than ps. The per-bit search energy is approximately fJ. Experimental evaluation on benchmark applications on the AMD Southern Islands GPU reveal that the GPU always dissipates less power when enhanced with the proposed TCAM design. Furthermore, the proposed method consumes at least % less energy when compared to state-of-the-art TCAM designs.
Abstract. The main focus of the paper is to present a flood and landslide early warning system, named HEWS (Hydrohazards Early Warning System), specifically developed for the Civil Protection Department of Sicily, based on the combined use of rainfall thresholds, soil moisture modelling and quantitative precipitation forecast (QPF). The warning system is referred to different Alert Zones in which Sicily has been divided into and based on a threshold system of three different increasing critical levels: ordinary, moderate and high. In this system, for early flood warning, a Soil Moisture Accounting (SMA) model provides daily soil moisture conditions, which allow to select a specific set of three rainfall thresholds, one for each critical level considered, to be used for issue the alert bulletin. Wetness indexes, representative of the soil moisture conditions of a catchment, are calculated using a simple, spatially-lumped rainfallstreamflow model, based on the SCS-CN method, and on the u...
A new method for the determination of trace levels of bromates by selective membrane collection is presented. Various membranes containing a few micrograms of different complexing reagents in a poly(vinyl chloride) matrix were tested. These membranes were produced on the surface of quartz glass (reflectors), and they were immersed in solutions containing bromate and bromide ions. At the first stage the prepared membranes collected both bromate and bromide ions, so different bromide masking agents were put in the analyzed solutions to avoid bromide collection. By the end of the equilibration time, the reflectors were left to dry, and they were analyzed by total reflection X-ray fluorescence (TXRF). The poly(vinyl chloride) with aliquat- membrane and o-dianisidin complexing agent gave the best results. The minimum detection limit was equal to ng/mL for ultrapure water and ng/mL for drinking water.
ADVERTISEMENT RETURN TO ISSUEPREVArticleNEXTVoltammetric anion responsive sensors based on modulation of ion permeability through Langmuir-Blodgett films containing synthetic anion receptorsShinobu. Nagase, Masamitsu. Kataoka, Ryuichi. Naganawa, Ryoko. Komatsu, Kazunori. Odashima, and Yoshio. UmezawaCite this: Anal. Chem. , , , 00000000Publication Date (Print):July , 0000Publication History Published online0 May 0000Published inissue July 0000https://pubs.acs.org/doi/ /ac00000a000https://doi.org/ /ac00000a000research-articleACS PublicationsRequest reuse permissionsArticle Views000Altmetric-Citations00LEARN ABOUT THESE METRICSArticle Views are the COUNTER-compliant sum of full text article downloads since November (both PDF and HTML) across all institutions and individuals. These metrics are regularly updated to reflect usage leading up to the last few days.Citations are the number of other articles citing this article, calculated by Crossref and updated daily. Find more information abo...
This study investigated whether performance of an interceptive skill requires an intact visual-perception-action cycle. Eleven skilled male Australian rules football athletes (M age = , SD = ) were recruited from an elite developmental pathway squad for a within-subject study. Participants were required to kick a ball directly at a goal from a -meter distance while wearing a pair of stroboscopic glasses. The glasses were used to create four vision conditions. Condition one kept intact the visual-perception-action cycle with uninterrupted vision of the motor skill. Three other conditions included stroboscopic vision that presented temporal samples of vision, which interrupted the perception-action cycle through progressive increases to intermittent vision occlusion of the motor skill. Goal kick error of ball position relative to a central target line within the goal and number of successful goals kicked were measured. Written report of internal and external focus of attention was also m...
The study aimed to determine the effectiveness of Contextual Teaching and Learning (CTL) in reducing and improving learning outcomes and math anxiety among students at a private elementary school in Indonesia. The research utilized a one-group control pre-posttest design with a sample of 0th-grade students. The study used a combination of pre-test and post-test and a closed-ended questionnaire as the data collection instruments. The independent variable in the study was CTL, while the dependent variables were learning outcomes and math anxiety. The paired t-test showed a significant increase in the students' average learning outcomes and a decrease in the average math anxiety levels. The findings suggest that implementing CTL is a practical approach to reducing math anxiety and improving student learning outcomes.
This study aims to determine the problem-solving ability of field independent (FI) and field dependent (FD) students in solving HOTS story problems. This type of research is qualitative research. The research strategy used is a descriptive model. This research was carried out at a junior school in Malang, Indonesia. The respondent was tenth-grade students. Data collection methods in this study include tests and interviews. Data analysis techniques include data collection, reduction, presentation, and concluding. The results of this study show that FI and FD students understand the problem. There is no difference between the two; FI and FD students are good at understanding the problem. FI students plan solutions well and can correctly create mathematical models, while FD students have difficulty developing mathematical models. In getting answers, FI and FD students have something in common: they are not quite right in the final solution.
The recently proposed recursive least-squares (RLS) algorithm for trilinear forms, namely RLS-TF, was designed for the identification of third-order tensors of rank one. In this context, a high-dimension system identification problem can be efficiently addressed (gaining in terms of both performance and complexity) based on tensor decompositions and modelling. In this paper, following the framework of the RLS-TF, we propose a regularized version of this algorithm, where the regularization terms are incorporated within the cost functions. Furthermore, the optimal regularization parameters are derived, aiming at attenuating the effects of the system noise. Simulation results support the performance features of the proposed algorithm, especially in terms of its robustness in noisy environments.
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.COSINE", "triplet_margin": 0.05 }
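With a cosine distance metric and a margin of 0.05, the per-triplet objective can be sketched as max(d(a, p) - d(a, n) + margin, 0), where d = 1 - cosine similarity. The pure-Python version below is an illustration under that assumption, using toy vectors; the function names are hypothetical and it does not reproduce the library's batched implementation.

```python
import math
from typing import List

Vector = List[float]


def cosine_distance(u: Vector, v: Vector) -> float:
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector, margin: float = 0.05) -> float:
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(gap + margin, 0.0)


# Well-separated triplet: positive equals the anchor, negative is orthogonal.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# Inverted triplet: the "positive" is orthogonal and the "negative" equals the anchor.
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
print(easy)            # 0.0
print(round(hard, 4))  # 1.05
```

A loss of zero means the negative is already at least 0.05 farther from the anchor than the positive, which is consistent with the small training-loss values in the logs.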
Evaluation Dataset
Unnamed Dataset
- Size: 391 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 391 samples:
Column | type | min | mean | max |
---|---|---|---|---|
anchor | string | 85 tokens | 237.84 tokens | 629 tokens |
positive | string | 93 tokens | 239.31 tokens | 610 tokens |
negative | string | 83 tokens | 234.79 tokens | 499 tokens |
- Samples:
Each sample is a triplet; its three abstracts appear on consecutive lines in the order anchor, positive, negative.
The aim of the study was to determine the relationship between emotional intelligence and cohesion in a sports team of girls engaged in synchronized figure skating. The following psychological tests were used in the study: the Emotional Intelligence test by D.V. Lyusin, a test to determine the index of group cohesion of the Sisor. The study was conducted on the basis of the sports school "Yunost" in Yekaterinburg. Two teams of different age groups took part in the experiment: athletes performing in the category of "novices" ( years old), girls performing in the team of "CMS" ( - years old). Testing was conducted twice: at the beginning of the season and after the competitive season. The study revealed positive dynamics of the development of cohesion in both teams. It also revealed reliable relationships between interpersonal emotional intelligence and the level of cohesion in the team. Further research may be aimed at developing a strategy to increase emotional intelligence as a factor...
Recreational swimming can be used as a reliable preventive measure for those diseases that are widespread among students. The purpose of the research is to study the effect of swimming on the functional state of students. The study involved male students who are selfemployed in swimming and male students who are professionally engaged in the swimming section. The research methods used samples of Martinet-Kushelevsky, Rufier, Stange and Genchi, as well as chest excursions. It was revealed that students, who practice swimming in the section, have more favorable conditions for a comprehensive effect on the body than students who swim independently, due to a greater load during training and their systematic nature.
Mg Ni + x% Ti . Mn . V . ( x = ,00,and ) composites were prepared by hydriding combustion synthesis( HCS) and the HCS products were mechanically milled( MM) to obtain Mg-based hydrogen-storage composites. The dehydriding properties,phase structure,surface morphology,and particle composition were studied by pressure-composition-temperature( pcT),X-ray diffraction( XRD) and scanning electron microscopy( SEM). Results showed that addition of %( mass fraction) Ti . Mn . V . exhibited the best desorption property for the HCS + MM product of Mg Ni , which could completely desorb . % H in s at K. The apparent dehydrogenation activation energy of the system was decreased to . kJ / mol from . kJ / mol of Mg Ni . The improvement of the desorption property could be attributed to the enhancement of diffusion and the hydrogen pumpingof Ti . Mn . V . .
This article has been retracted: please see Elsevier Policy on Article Withdrawal ( ). This article has been retracted at the request of authors due to scientific errors reported by authors. The author reported errors are: : In the " Case Description" section, Fig. A0 (wind and PV output power) is the input data for the simulation calculation. The authors report that, due to an oversight, they did not use real wind and PV output power data, which would lead to inaccurate results for the system simulation calculations. : For the " Model solving algorithm", the authors found that it is incorrect to use the properties of Gaussian functions to improve the CDE algorithm because Gaussian functions do not have the properties of concave functions. This is evidenced in the literature "DOI: : Fig. (Iterative Convergence Curve of Rastrigin Function) is tested using the benchmark test function (Rastrigin function) in order to demonstrate the feasibility of the GCDE algorithm. However, it is clear ...
Energy accessibility especially electrical energy is considered as one of the most appealing factors to achieve energy sustainability. The purpose of this study is to investigate energy sustainability using renewable energies for two high potential cities in the south-east of Iran until the year . In this regard, Homer software is used to evaluate economic and technical analyses of PV-wind-diesel hybrid system for the two cities by the data gathering which was collected from Iran's meteorological organization. Therefore, the average of solar radiation per month for Zabol and Zahak were about and (h/d). Also, mean wind speeds are calculated m/s and m/s for Zabol and Zahak respectively which proposed that these cities have high potential in order to electrical production by a hybrid system. Furthermore, the amount of electricity production by PV array for Zabol and Zahak were (kWh/yr) and (kWh/yr) respectively, and the amount of electricity production by wind turbine were (kWh/yr) and (k...
The philosophy that built by German Idealism is obtained and never neglected religion, this is not about the religious dogmas or the fantasy and legendary nature of religion, but it is about the spirit and the crux of religion. Nevertheless, there is always struggled to deprive it from fantasies and rebuilt by philosophical ideas. These ideal philosophers are asserted to reconstruct the stories and imaginary schemes of religion into philosophical and rational thinking. There is a change in the result of this process which is religion is retreated and the metaphysics is slightly appeared. In other word, this change is directed from revelation to metaphysical views. In the light of this, the German Idealism is taking two different ways toward religion: the negative direction; which is involved to the critical studies of the basis and construction of religion, and the positive direction; this direction is returned to religion, but this return is happened after reconstruct religion by the ...
In this paper measurements of momentum and current transport caused by current driven tearing instability are reported. The measurements are done in the Madison Symmetric Torus reversed-field pinch [R. N. Dexter, D. W. Kerst, T. W. Lovell, S. C. Prager, and J. C. Sprott, Fusion Technol. , ( )] in a regime with repetitive bursts of tearing instability causing magnetic field reconnection. It is established that the plasma parallel momentum profile flattens during these reconnection events: The flow decreases in the core and increases at the edge. The momentum relaxation phenomenon is similar in nature to the well established relaxation of the parallel electrical current and could be a general feature of self-organized systems. The measured fluctuation-induced Maxwell and Reynolds stresses, which govern the dynamics of plasma flow, are large and almost balance each other such that their difference is approximately equal to the rate of change of plasma momentum. The Hall dynamo, which is d...
We present measurements of magnetic fields generated in laser-driven coil targets irradiated by laser pulses of nanosecond duration, m wavelength, J energy, and W/cm0 intensity, at the LULI0000 facility. Using two perpendicular probing axes, proton deflectometry is used to characterize the coil current and static charge at different times. Results reveal various deflection features that can be unambiguously linked to a looping quasi-steady current of well-understood polarity or to a static charging of the coil surface. Measured currents are broadly consistent with predictions from a laser-driven diode-current source and lumped circuit model, supporting the quasi-steady assessment of the discharges. Peak magnetic fields of T at the center of -m-diameter coils, obtained at the moderate laser intensity, open up the use of such laser-driven coil targets at facilities worldwide to study numerous phenomena in magnetized high-energy-density plasmas, and its potential applications.
EU , , , , , . . . , . . , . , - -EU . .In August , the UK launched a new export strategy to increase UK total exports as a proportion of gross domestic product (GDP) to % and to build trading relationships around the world after Brexit. And the government aims to strengthen UK's position as one of the 00st century's great trading nations and to expand the export of traders by setting the five principle. These principles are a business-led approach, doing what only government can do, joining up across government with local partners and the private sector, digital by design and value for money. This paper examines the background, purpose and main contents of the UK new export strategy in UK and the countermeasures for the new UK export strategy. First of all, we should prepare a scenarios based on directions of Brexit. Second, it is necessary to discuss the redefinition of relationship with Korea-UK and Korea-EU. And finally, Korean companies should enter the UK by utilizing the e-comme...
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.COSINE", "triplet_margin": 0.05 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- learning_rate: 1e-05
- weight_decay: 0.01
- num_train_epochs: 2
- warmup_ratio: 0.1
- batch_sampler: no_duplicates
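Together with the 7,828 training samples, the batch size, epoch count and warmup ratio fix the approximate length of the training schedule. The arithmetic sketch below assumes a single device (not stated in the card) and gradient_accumulation_steps of 1, and ignores any batches the no_duplicates sampler might drop; all variable names are illustrative.

```python
import math

train_samples = 7828          # from the Training Dataset section
per_device_batch_size = 4
num_train_epochs = 2
warmup_ratio = 0.1
num_devices = 1               # assumption: not stated in the card
gradient_accumulation = 1

# Optimizer steps needed to see every training sample once.
steps_per_epoch = math.ceil(
    train_samples / (per_device_batch_size * num_devices * gradient_accumulation)
)
total_steps = steps_per_epoch * num_train_epochs
approx_warmup_steps = warmup_ratio * total_steps  # exact rounding depends on the trainer

print(steps_per_epoch)                      # 1957
print(total_steps)                          # 3914
print(round(100 / steps_per_epoch, 4))      # 0.0511
print(round(2300 / steps_per_epoch, 4))     # 1.1753
```

The last two values match the epoch fractions logged at steps 100 and 2300 in the training logs, which supports the single-device assumption. Under the same assumption, the linear warmup covers roughly the first 390 steps.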
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | modernBERT_cosine_accuracy | modernBERT_disciplines_cosine_accuracy |
---|---|---|---|---|---|
0 | 0 | - | - | 0.8951 | - |
0.0511 | 100 | 0.0064 | 0.0049 | 0.9616 | - |
0.1022 | 200 | 0.002 | 0.0071 | 0.9565 | - |
0.1533 | 300 | 0.0076 | 0.0034 | 0.9795 | - |
0.2044 | 400 | 0.0074 | 0.0039 | 0.9668 | - |
0.2555 | 500 | 0.0036 | 0.0036 | 0.9693 | - |
0.3066 | 600 | 0.0035 | 0.0029 | 0.9770 | - |
0.3577 | 700 | 0.004 | 0.0035 | 0.9693 | - |
0.4088 | 800 | 0.0027 | 0.0034 | 0.9770 | - |
0.4599 | 900 | 0.0044 | 0.0032 | 0.9719 | - |
0.5110 | 1000 | 0.0037 | 0.0053 | 0.9565 | - |
0.5621 | 1100 | 0.0048 | 0.0029 | 0.9795 | - |
0.6132 | 1200 | 0.0032 | 0.0031 | 0.9744 | - |
0.6643 | 1300 | 0.0023 | 0.0036 | 0.9744 | - |
0.7154 | 1400 | 0.0044 | 0.0029 | 0.9821 | - |
0.7665 | 1500 | 0.0022 | 0.0032 | 0.9795 | - |
0.8176 | 1600 | 0.0036 | 0.0034 | 0.9770 | - |
0.8687 | 1700 | 0.0022 | 0.0031 | 0.9821 | - |
0.9198 | 1800 | 0.0028 | 0.0025 | 0.9821 | - |
0.9709 | 1900 | 0.0054 | 0.0025 | 0.9821 | - |
1.0220 | 2000 | 0.003 | 0.0029 | 0.9770 | - |
1.0731 | 2100 | 0.0018 | 0.0026 | 0.9795 | - |
1.1242 | 2200 | 0.0021 | 0.0024 | 0.9847 | - |
1.1753 | 2300 | 0.0015 | - | - | 0.9789 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.48.0.dev0
- PyTorch: 2.5.1+cu121
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}