SetFit with jinaai/jina-embeddings-v2-base-en
This is a SetFit model for text classification. It uses jinaai/jina-embeddings-v2-base-en as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
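The first step above relies on sentence pairs built from the labelled examples. The following is a minimal sketch of that idea, not SetFit's actual sampler: it enumerates every pair, whereas SetFit samples pairs according to its training arguments. The function name `make_contrastive_pairs` and the placeholder sentences are hypothetical; only the label names come from this card.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples.

    Pairs sharing a label get target 1.0 (should be similar);
    pairs with different labels get target 0.0 (should be dissimilar).
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Hypothetical toy data: placeholder sentences with label names from this card.
texts = ["sentence a", "sentence b", "sentence c", "sentence d"]
labels = ["ccro:Discuss", "ccro:Discuss", "ccro:Compare", "ccro:Compare"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

In the second step, the fine-tuned Sentence Transformer would embed each training sentence and the classification head would be fitted on those embeddings. That part is omitted here to keep the sketch dependency-free.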
Model Details
Model Labels
Label | Examples
ccro:BasedOn |
- 'The axiomatizations presented in Quesada (2010, 2011) also dispense with strong monotonicity.'
|
ccro:Basedon |
- 'A formal mathematical description of the h-index introduced by Hirsch (2005)'
- 'Woeginger (2008a, b) and Quesada (2009, 2010) have already suggested characterizations of the Hirsch index'
|
ccro:Compare |
- 'Instead, a variety of studies [8, 9] have shown that the h index by and large agrees with other objective and subjective measures of scientific quality in a variety of different disciplines (10–15),'
|
ccro:Contrast |
- 'Hirsch (2005) argues that two individuals with similar Hirsch-index are comparable in terms of their overall scientific impact, even if their total number of papers or their total number of citations is very different.'
- 'The three differ from Woeginger’s (2008a) characterization in requiring fewer axioms (three instead of five)'
- 'Marchant (2009), instead of characterizing the index itself, characterizes the ranking that the Hirsch index induces on outputs.'
|
ccro:Criticize |
- 'The h-index does not take into account that some papers may have extraordinarily many citations, and the g-index tries to compensate for this; see also Egghe (2006b) and Tol (2008).'
- 'Woeginger (2008a, p. 227) stresses that his axioms should be interpreted within the context of MON.'
|
ccro:Discuss |
- 'The relation between N and h will depend on the detailed form of the particular distribution (HI0501-01)'
- 'As discussed by Redner (HI0501-03), most papers earn their citations over a limited period of popularity and then they are no longer cited.'
- 'It is also possible that papers "drop out" and then later come back into the h count, as would occur for the kind of papers termed "sleeping beauties" (HI0501-04).'
|
ccro:Extend |
- 'In [3] the analogous formula for the g-index has been proved'
|
ccro:Incorporate |
- 'In this paper, we provide an axiomatic characterization of the Hirsch-index, in very much the same spirit as Arrow (1950, 1951), May (1952), and Moulin (1988) did for numerous other problems in mathematical decision making.'
|
ccro:Negate |
- 'Recently, Lehmann et al. (2, 3) have argued that the mean number of citations per paper (nc = Nc/Np) is a superior indicator.'
- 'If one chose instead to use as indicator of scientific achievement the mean number of citations per paper [following Lehmann et al. (2, 3)], our results suggest that (as in the stock market) ‘‘past performance is not predictive of future performance.’’'
- 'It has been argued in the literature that one drawback of the h index is that it does not give enough ‘‘credit’’ to very highly cited papers, and various modifications have been proposed to correct this, in particular, Egghe’s g index (4), Jin et al.’s AR index (5), and Komulski’s H(2) index (6).'
|
Evaluation
Metrics
Label | Accuracy
all | 0.6667
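For reference, accuracy is the fraction of evaluation sentences whose predicted label matches the expected one. The sketch below shows the calculation only. The card does not state the size or content of the evaluation set, so the three-item lists are hypothetical and merely produce a value of the same form.

```python
def accuracy(predicted, expected):
    """Return the share of positions where predicted and expected labels agree."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical labels: two of the three predictions are correct.
expected = ["ccro:Discuss", "ccro:Compare", "ccro:Incorporate"]
predicted = ["ccro:Discuss", "ccro:Compare", "ccro:Discuss"]

score = accuracy(predicted, expected)
print(round(score, 4))  # 0.6667
```

Because the training labels are heavily imbalanced, a single accuracy figure can hide weak performance on the rare classes. Per-label metrics would give a fuller picture.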
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel

# Download the model from the Hugging Face Hub
model = SetFitModel.from_pretrained("Corran/CCRO2")
# Run inference on a single citation sentence
preds = model("One of the referees recommends mentioning Quesada (2008) as another characterization of the Hirsch index relying as well on monotonicity.")
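To label several sentences at once, the model can be called with a list. The helper below is a sketch under that assumption: `classify_citations` is a hypothetical name, and it accepts any callable that maps a list of sentences to one prediction per sentence. A stub stands in for the real model so the example runs without downloading anything.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def classify_citations(
    model: Callable[[List[str]], Sequence],
    sentences: Iterable[str],
) -> List[Tuple[str, str]]:
    """Run the model on a batch and pair each sentence with its predicted label."""
    batch = list(sentences)
    if not batch:
        return []
    predictions = model(batch)
    return [(sentence, str(label)) for sentence, label in zip(batch, predictions)]


def stub_model(batch: List[str]) -> List[str]:
    """Stand-in for the real model: always predicts the majority training label."""
    return ["ccro:Discuss" for _ in batch]


result = classify_citations(
    stub_model,
    [
        "In [3] the analogous formula for the g-index has been proved",
        "A formal mathematical description of the h-index introduced by Hirsch (2005)",
    ],
)
for sentence, label in result:
    print(label, "<-", sentence)
```

With the real model, pass the object returned by `SetFitModel.from_pretrained("Corran/CCRO2")` in place of `stub_model`. Whether predictions come back as label strings or as class indices depends on how the model was saved, so check the output before relying on it.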
Training Details
Training Set Metrics
Training set | Min | Mean | Max
Word count | 6 | 25.7812 | 53
Label | Training Sample Count
ccro:BasedOn | 1
ccro:Basedon | 11
ccro:Compare | 21
ccro:Contrast | 3
ccro:Criticize | 4
ccro:Discuss | 37
ccro:Extend | 1
ccro:Incorporate | 14
ccro:Negate | 4
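The sample counts above are strongly imbalanced, and two labels differ only in casing. The sketch below totals the counts, finds the largest class, and shows what the distribution would look like if `ccro:BasedOn` and `ccro:Basedon` were treated as one class. That merge is an assumption made for illustration; the card does not say whether the two labels are meant to be the same.

```python
from collections import Counter

# Training sample counts as listed in this card.
sample_counts = {
    "ccro:BasedOn": 1,
    "ccro:Basedon": 11,
    "ccro:Compare": 21,
    "ccro:Contrast": 3,
    "ccro:Criticize": 4,
    "ccro:Discuss": 37,
    "ccro:Extend": 1,
    "ccro:Incorporate": 14,
    "ccro:Negate": 4,
}

total = sum(sample_counts.values())
shares = {label: count / total for label, count in sample_counts.items()}
largest = max(sample_counts, key=sample_counts.get)
print(total, largest, round(shares[largest], 4))

# Assumption: labels that differ only in casing denote the same class.
merged = Counter()
for label, count in sample_counts.items():
    merged[label.lower()] += count
print(len(merged), merged["ccro:basedon"])
```

Under that assumption the model would have eight classes rather than nine, with 12 examples for the merged label.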
Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 100
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
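These settings can be related to the step count reported under Training Results. The sketch assumes that `num_iterations` drives pair generation, with each training sample yielding `num_iterations` positive and `num_iterations` negative pairs. That reading is an assumption about the sampler rather than something the card states. The figure of 96 samples is the sum of the Training Sample Count column, and `contrastive_steps` is a hypothetical helper name.

```python
import math


def contrastive_steps(n_samples, num_iterations, batch_size, num_epochs=1):
    """Estimate the pair count and optimizer steps for the contrastive phase.

    Assumption: every sample contributes num_iterations positive pairs
    and num_iterations negative pairs.
    """
    n_pairs = 2 * num_iterations * n_samples
    steps_per_epoch = math.ceil(n_pairs / batch_size)
    return n_pairs, steps_per_epoch * num_epochs


n_pairs, total_steps = contrastive_steps(
    n_samples=96,        # sum of the per-label training sample counts
    num_iterations=100,  # from the hyperparameters above
    batch_size=32,       # embedding-phase batch size
    num_epochs=1,
)
print(n_pairs, total_steps)  # 19200 600
```

The estimate of 600 steps agrees with the final step logged in the training results, which suggests the assumption holds for this run.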
Training Results
Epoch | Step | Training Loss | Validation Loss
0.0017 | 1 | 0.311 | -
0.0833 | 50 | 0.1338 | -
0.1667 | 100 | 0.0054 | -
0.25 | 150 | 0.0017 | -
0.3333 | 200 | 0.0065 | -
0.4167 | 250 | 0.0003 | -
0.5 | 300 | 0.0003 | -
0.5833 | 350 | 0.0005 | -
0.6667 | 400 | 0.0004 | -
0.75 | 450 | 0.0002 | -
0.8333 | 500 | 0.0002 | -
0.9167 | 550 | 0.0002 | -
1.0 | 600 | 0.0002 | -
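The training loss reported here is the CosineSimilarityLoss named in the hyperparameters. As a rough illustration, it can be read as the mean squared error between the cosine similarity of two sentence embeddings and the pair's target (1.0 for the same label, 0.0 for different labels). The pure-Python sketch below uses hypothetical 2-d vectors in place of real embeddings and is not the library implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between each pair's cosine similarity and its target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Hypothetical embeddings: same-label pairs aligned, different-label pairs orthogonal.
well_separated = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),
    ([1.0, 0.0], [0.0, 1.0], 0.0),
]
# The reverse situation: same-label pairs orthogonal, different-label pairs aligned.
poorly_separated = [
    ([1.0, 0.0], [0.0, 1.0], 1.0),
    ([1.0, 0.0], [1.0, 0.0], 0.0),
]

print(cosine_similarity_loss(well_separated))    # 0.0
print(cosine_similarity_loss(poorly_separated))  # 1.0
```

A loss near zero therefore means the embedding body already separates the training pairs almost perfectly. With only 96 training sentences that can also indicate overfitting, and no validation loss was recorded to check it.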
Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.2.2
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.16.1
- Tokenizers: 0.15.0
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}