diff --git "a/README.md" "b/README.md"
new file mode 100644
--- /dev/null
+++ "b/README.md"
@@ -0,0 +1,367 @@
+---
+library_name: setfit
+tags:
+- setfit
+- sentence-transformers
+- text-classification
+- generated_from_setfit_trainer
+metrics:
+- accuracy
+widget:
+- text: This is a significant result since such information cannot be deduced from
+    the study of a static and global modification using the area-difference elasticity
+    model.
+- text: Our study reported no significant difference in the results of samples taken
+    from the posterior fornix of vagina and those taken from the endometrial cavity
+    (P value = 0.853).
+- text: This is an important issue, especially for the smaller facilities.
+- text: 'Responses to this question included: "Gives specific probes/presses to elicit
+    responses which are often delayed/impaired in children with autism", "It provides
+    stimuli/presses which tend to bring out some of those behaviors associated with
+    ASD that may not be obvious (or observed) under the more structured circumstances
+    of a cognitive or educational assessment".'
+- text: The objective of the present study was to determine the degree of instability
+    of cardiovascular responses to postural challenge in normotensive and hypertensive
+    subjects.
+pipeline_tag: text-classification
+inference: true
+base_model: jinaai/jina-embeddings-v2-base-en
+model-index:
+- name: SetFit with jinaai/jina-embeddings-v2-base-en
+  results:
+  - task:
+      type: text-classification
+      name: Text Classification
+    dataset:
+      name: Unknown
+      type: unknown
+      split: test
+    metrics:
+    - type: accuracy
+      value: 0.931758530183727
+      name: Accuracy
+---
+
+# SetFit with jinaai/jina-embeddings-v2-base-en
+
+This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [jinaai/jina-embeddings-v2-base-en](https://huggingface.co/jinaai/jina-embeddings-v2-base-en) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+The model has been trained using an efficient few-shot learning technique that involves:
+
+1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+## Model Details
+
+### Model Description
+- **Model Type:** SetFit
+- **Sentence Transformer body:** [jinaai/jina-embeddings-v2-base-en](https://huggingface.co/jinaai/jina-embeddings-v2-base-en)
+- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+- **Maximum Sequence Length:** 8192 tokens
+- **Number of Classes:** 77 classes
+
+
+
+
+### Model Sources
+
+- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+### Model Labels
+| Label | Examples |
+|:------|:---------|
+| 1     |          |
+| 2     |          |
+| 3     |          |
+| 4     |          |
+| 5     |          |
+| 6     |          |
+| 7     |          |
+| 8     |          |
+| 9     |          |
+| 10    |          |
+| 11    |          |
+| 12    |          |
+| 13    |          |
+| 14    |          |
+| 16    |          |
+| 17    |          |
+| 18    |          |
+| 19    |          |
+| 20    |          |
+| 21    |          |
+| 22    |          |
+| 23    |          |
+| 24    |          |
+| 25    |          |
+| 26    |          |
+| 27    |          |
+| 28    |          |
+| 30    |          |
+| 31    |          |
+| 32    |          |
+| 33    |          |
+| 34    |          |
+| 35    |          |
+| 36    |          |
+| 37    |          |
+| 38    |          |
+| 40    |          |
+| 41    |          |
+| 42    |          |
+| 43    |          |
+| 44    |          |
+| 45    |          |
+| 46    |          |
+| 47    |          |
+| 48    |          |
+| 49    |          |
+| 50    |          |
+| 51    |          |
+| 52    |          |
+| 53    |          |
+| 54    |          |
+| 55    |          |
+| 56    |          |
+| 57    |          |
+| 58    |          |
+| 59    |          |
+| 60    |          |
+| 61    |          |
+| 62    |          |
+| 63    |          |
+| 64    |          |
+| 65    |          |
+| 66    |          |
+| 67    |          |
+| 68    |          |
+| 69    |          |
+| 70    |          |
+| 71    |          |
+| 72    |          |
+| 73    |          |
+| 74    |          |
+| 75    |          |
+| 76    |          |
+| 77    |          |
+| 78    |          |
+| 79    |          |
+| 80    |          |
+
+## Evaluation
+
+### Metrics
+| Label   | Accuracy |
+|:--------|:---------|
+| **all** | 0.9318   |
+
+## Uses
+
+### Direct Use for Inference
+
+First install the SetFit library:
+
+```bash
+pip install setfit
+```
+
+Then you can load this model and run inference.
+
+```python
+from setfit import SetFitModel
+
+# Download from the 🤗 Hub
+model = SetFitModel.from_pretrained("Corran/Jina_Sci")
+# Run inference
+preds = model("This is an important issue, especially for the smaller facilities.")
+```
+
+
+
+
+
+
+
+
+
+## Training Details
+
+### Training Set Metrics
+| Training set | Min | Median  | Max |
+|:-------------|:----|:--------|:----|
+| Word count   | 5   | 26.5274 | 153 |
+
+| Label | Training Sample Count |
+|:------|:----------------------|
+| 1     | 12                    |
+| 2     | 12                    |
+| 3     | 12                    |
+| 4     | 12                    |
+| 5     | 12                    |
+| 6     | 12                    |
+| 7     | 12                    |
+| 8     | 12                    |
+| 9     | 12                    |
+| 10    | 12                    |
+| 11    | 12                    |
+| 12    | 12                    |
+| 13    | 12                    |
+| 14    | 12                    |
+| 16    | 12                    |
+| 17    | 12                    |
+| 18    | 12                    |
+| 19    | 12                    |
+| 20    | 12                    |
+| 21    | 12                    |
+| 22    | 12                    |
+| 23    | 10                    |
+| 24    | 12                    |
+| 25    | 12                    |
+| 26    | 12                    |
+| 27    | 4                     |
+| 28    | 12                    |
+| 30    | 12                    |
+| 31    | 2                     |
+| 32    | 12                    |
+| 33    | 12                    |
+| 34    | 12                    |
+| 35    | 12                    |
+| 36    | 12                    |
+| 37    | 12                    |
+| 38    | 12                    |
+| 40    | 12                    |
+| 41    | 12                    |
+| 42    | 12                    |
+| 43    | 12                    |
+| 44    | 12                    |
+| 45    | 12                    |
+| 46    | 12                    |
+| 47    | 12                    |
+| 48    | 12                    |
+| 49    | 12                    |
+| 50    | 12                    |
+| 51    | 12                    |
+| 52    | 9                     |
+| 53    | 12                    |
+| 54    | 12                    |
+| 55    | 12                    |
+| 56    | 12                    |
+| 57    | 12                    |
+| 58    | 6                     |
+| 59    | 12                    |
+| 60    | 12                    |
+| 61    | 12                    |
+| 62    | 12                    |
+| 63    | 12                    |
+| 64    | 12                    |
+| 65    | 12                    |
+| 66    | 12                    |
+| 67    | 12                    |
+| 68    | 12                    |
+| 69    | 12                    |
+| 70    | 12                    |
+| 71    | 12                    |
+| 72    | 12                    |
+| 73    | 12                    |
+| 74    | 12                    |
+| 75    | 12                    |
+| 76    | 12                    |
+| 77    | 12                    |
+| 78    | 12                    |
+| 79    | 12                    |
+| 80    | 12                    |
+
+### Training Hyperparameters
+- batch_size: (16, 16)
+- num_epochs: (1, 1)
+- max_steps: -1
+- sampling_strategy: oversampling
+- num_iterations: 5
+- body_learning_rate: (2e-05, 2e-05)
+- head_learning_rate: 2e-05
+- loss: CosineSimilarityLoss
+- distance_metric: cosine_distance
+- margin: 0.25
+- end_to_end: False
+- use_amp: False
+- warmup_proportion: 0.1
+- seed: 42
+- eval_max_steps: -1
+- load_best_model_at_end: False
+
+### Training Results
+| Epoch  | Step | Training Loss | Validation Loss |
+|:------:|:----:|:-------------:|:---------------:|
+| 0.0012 | 1    | 0.2396        | -               |
+| 0.0893 | 50   | 0.2144        | -               |
+| 0.1786 | 100  | 0.1351        | -               |
+| 0.2679 | 150  | 0.1429        | -               |
+| 0.3571 | 200  | 0.1853        | -               |
+| 0.4464 | 250  | 0.0647        | -               |
+| 0.5357 | 300  | 0.0376        | -               |
+| 0.625  | 350  | 0.0555        | -               |
+| 0.7143 | 400  | 0.036         | -               |
+| 0.8036 | 450  | 0.0382        | -               |
+| 0.8929 | 500  | 0.0647        | -               |
+| 0.9821 | 550  | 0.0271        | -               |
+
+### Framework Versions
+- Python: 3.10.12
+- SetFit: 1.0.1
+- Sentence Transformers: 2.2.2
+- Transformers: 4.35.2
+- PyTorch: 2.1.0+cu121
+- Datasets: 2.15.0
+- Tokenizers: 0.15.0
+
+## Citation
+
+### BibTeX
+```bibtex
+@article{https://doi.org/10.48550/arxiv.2209.11055,
+    doi = {10.48550/ARXIV.2209.11055},
+    url = {https://arxiv.org/abs/2209.11055},
+    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+    title = {Efficient Few-Shot Learning Without Prompts},
+    publisher = {arXiv},
+    year = {2022},
+    copyright = {Creative Commons Attribution 4.0 International}
+}
+```
+
+
+
+
+
+
\ No newline at end of file