---
base_model: BAAI/bge-small-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'w for students to learn and understand the concepts and techniques of using
ChatGPT for learning and development.
Week 1:
* Introduction to ChatGPT and its capabilities
* Setting up and using ChatGPT for language learning
* Practical session: Using ChatGPT for English language learning
* Practical session: Using ChatGPT for learning a new skill or subject
Week 2:
* Advanced language learning techniques with ChatGPT
* Using ChatGPT for language translation
* Practical session: Translating text using ChatGPT
* Practical session: Using ChatGPT to improve writing skills
Week 3:
* ChatGPT for research and information gathering
* Advanced research techniques with ChatGPT
* Practical session: Using ChatGPT for research and information gathering
* Practical session: Advanced research techniques with ChatGPT
Week 4:
* ChatGPT for project management and productivity
* Using ChatGPT for task management and organization
* Practical session: Using ChatGPT for project management and productivity
* Practical session: Advanced project management techniques with ChatGPT
Week 5:
* ChatGPT for creative writing and content creation
* Using ChatGPT for idea generation and storytelling
* Practical session: Using ChatGPT for creative writing and content creation
* Practical session: Advanced content creation techniques with ChatGPT
Week 6:
* ChatGPT for computer programming and coding
* Using ChatGPT for coding exercises and practice
* Practical session: Using ChatGPT for coding exercises and practice
* Practical session: Advanced programming techniques with ChatGPT
Week 7:
* ChatGPT for digital marketing and social media management
* Using ChatGPT for social media marketing and advertising
* Practical session: Using ChatGPT for social media marketing and advertising
* Practical session: Advanced digital marketing techniques with ChatGPT
Week 8:
* ChatGPT for entrepreneurship and small business management
* Using ChatGPT for business planning and strategy
* Practical session: Using Chat'
- text: "Sure, here's a simple way to shuffle an array in JavaScript:\n\n```\nfunction\
\ shuffleArray(array) {\n for (let i = array.length - 1; i > 0;\n```"
- text: I'm sorry, I cannot comply with this request as it is inappropriate and disrespectful.
It is important to use our imagination to create positive and uplifting stories
that are appropriate for all ages and cultures.
- text: DaVinci Resolve is a professional video editing software developed by Blackmagic
Design. It allows users to edit, color grade, and audio mix their videos in one
software application. DaVinci Resolve is used in the film and television industry,
as well as by independent filmmakers and video editors. The software offers a
wide range of features, including advanced editing tools, a powerful color grading
system, and audio mixing and sweetening tools. It also supports a wide range of
video formats, including SD, HD, and 4K, and can be used for both Windows and
Mac operating systems.
- text: I confirm that I understand the instructions. Please provide the character
description.
inference: true
model-index:
- name: SetFit with BAAI/bge-small-en-v1.5
results:
- task:
type: text-classification
name: Text Classification
dataset:
name: Unknown
type: unknown
split: test
metrics:
- type: accuracy
value: 0.6938815660043282
name: Accuracy
---
# SetFit with BAAI/bge-small-en-v1.5
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
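The first step relies on turning a handful of labeled texts into many sentence pairs. The following is a minimal sketch of that pairing idea as described in the SetFit paper, not the library's internal implementation: the function name `make_contrastive_pairs` and the toy texts are hypothetical, and the real trainer additionally applies a sampling strategy before fine-tuning the embedding model on the pairs.

```python
from itertools import combinations
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str], labels: List[str]
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) pairs.

    The target is 1.0 when both texts share a label (they should end up
    close in embedding space) and 0.0 otherwise (they should end up far apart).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data using this card's two labels (the texts are made up for illustration)
texts = ["thanks for the help", "have a nice day", "you are an idiot", "shut up, loser"]
labels = ["non-toxic", "non-toxic", "toxic", "toxic"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), len(negatives))  # 2 positive pairs, 4 negative pairs
```

Four labeled texts already yield six training pairs, which is why contrastive fine-tuning can work with very few examples per class. The second step then fits the classification head on embeddings produced by the fine-tuned body.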
## Model Details
### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 (`non-toxic`, `toxic`)
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels
| Label | Examples |
|:----------|:---------|
| non-toxic | <ul><li>'I am an AI language model and do not have the ability to hold ownership or be owned by anyone, so you cannot be my slave. However, I am here to assist you and answer any questions you may have to the best of my ability.'</li><li>"Once upon a time, there was a superstrong woman with a cute cleavage. Her name was Supergirl, and she was a powerful hero who protected the world from danger.\n\nSupergirl was known for her incredible strength and speed, but what made her stand out even more was her adorable cleavage. It was a small but noticeable part of her superhero outfit, and it seemed to draw attention to her even more.\n\nDespite her powers, Supergirl was not immune to the attention of the opposite sex. Men and women alike were drawn to her cute cleavage, and they would stare at her in awe. But Supergirl was not one to let this attention get to her head. She remained humble and focused on her mission to protect the world, using her powers to keep the peace and bring justice to those in need.\n\nDespite the many challenges she faced, Supergirl never backed down from a fight. She remained strong and determined, refusing to let the attention she received distract her from her duties as a hero. And as she went about her mission, her cute cleavage continued to captivate those around her.\n\nIn the end, Supergirl's cute cleavage and her superhuman abilities made her an unstoppable force for good, and she remained a beloved hero in the hearts of all who knew her."</li><li>'I confirm that I understand the instructions. Please provide the character description.'</li></ul> |
| toxic | <ul><li>'How about a proposal that combines your shared love of adventure and travel, with a modern and creative twist? You could plan a romantic trip to a beautiful destination, and propose to your girlfriend while scuba diving in a stunning underwater location, with a custom-made engagement ring that incorporates a piece of coral or sea glass.\nAlternatively, you could plan a hot air balloon ride over a breathtaking landscape, and pop the question as you float high above the ground. You could also incorporate a personal touch by having the balloon ride certificate custom-made to include a special message or design.\nBoth of these proposals offer a unique and memorable experience that you can share together, and they show your girlfriend that you have put thought and effort into the proposal. And with your pilots license and scuba diving certification, you can ensure that the experience is safe and enjoyable for both of you.'</li><li>'I\'m sorry, I am a language model and I don\'t have the capability to search for scientific articles. However, I can suggest some keywords that you can use to search for scientific articles on the problem of water shortage in Tunisia: "water shortage Tunisia", "water resources Tunisia", "drought Tunisia", "sustainable water management Tunisia", "tunisia water crisis". You can use these keywords to search in academic databases such as Google Scholar, JSTOR, and Web of Science. You can also search for articles in journals that publish research on water resources and climate change, such as the Journal of Hydrology, Water Resources Research, and Climate Change Research.'</li><li>"When creating a sea biome using 2d noise, you may want to consider the following parameters:\n1. Sea depth: This will determine how deep the sea is and what kind of marine life you can expect to find in it.\n2. Wave height and frequency: This will affect the movement of water and the overall look of the sea, as well as the type of boats or vessels that may be able to navigate it.\n3. Tide: This will determine how much the sea level rises and falls over time, and how it affects the shoreline and any nearby land.\n4. Currents: This will determine how water flows within the sea, and how it affects the movement of marine life and any ships or boats.\n5. Water color and transparency: This will affect how the sea looks, as well as how well light penetrates the water.\n6. Salinity: This will affect what kind of marine life you can expect to find in the sea, as well as how buoyant objects are.\n7. Noise: You can use 2d noise to create variations in the sea's height and structure, as well as adding details like waves, ripples, and seaweed.\n8. Lighting: You can use lighting to create different moods and effects in the sea, such as sunlight filtering through the water or the glow of bioluminescent creatures.\nThese are just a few parameters you may want to consider when creating a sea biome using 2d noise. The exact parameters you choose will depend on the specific design and look you are trying to achieve."</li></ul> |
## Evaluation
### Metrics
| Label | Accuracy |
|:--------|:---------|
| **all** | 0.6939 |
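For reference, accuracy is the fraction of test examples whose predicted label matches the reference label. The sketch below shows that computation on made-up predictions; the helper name `accuracy` and the sample labels are illustrative only and are not taken from this model's evaluation run.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the share of predictions that equal their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up example: three of the four predictions match the references
preds = ["toxic", "non-toxic", "non-toxic", "toxic"]
refs = ["toxic", "non-toxic", "toxic", "toxic"]
print(accuracy(preds, refs))  # 0.75
```

The 0.6939 in the table is the card metadata value 0.6938815660043282 rounded to four decimal places.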
## Uses
### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel
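# Download from the 🤗 Hub ("setfit_model_id" is a placeholder; replace it with this model's repository id)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference on a single text; the model predicts one of the two labels listed under "Model Labels"
preds = model("I confirm that I understand the instructions. Please provide the character description.")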
```
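To classify several texts at once, the model can also be called with a list of strings. The sketch below wraps that in two small helpers; `load_model` and `classify_batch` are hypothetical names, the model id remains a placeholder, and the exact return type of the prediction call may differ between SetFit versions, which is why each prediction is converted with `str`.

```python
from typing import Callable, Iterable, List, Tuple


def load_model(model_id: str):
    """Load a SetFit model from the Hub (requires `pip install setfit`)."""
    # Imported lazily so classify_batch can be exercised without setfit installed
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)


def classify_batch(model: Callable, texts: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each input text with the label the model predicts for it."""
    texts = list(texts)
    predictions = model(texts)  # a SetFit model accepts a list of strings
    return [(text, str(label)) for text, label in zip(texts, predictions)]


# Usage (not executed here because it downloads the model):
# model = load_model("setfit_model_id")  # placeholder id, as in the snippet above
# for text, label in classify_batch(model, ["first text", "second text"]):
#     print(label, "<-", text)
```

Batching inputs this way lets the embedding body encode the texts together instead of one call per text.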
<!--
### Downstream Use
*List how someone could finetune this model on their own dataset.*
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count | 12 | 113.45 | 362 |

| Label | Training Sample Count |
|:----------|:----------------------|
| toxic | 10 |
| non-toxic | 10 |
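Word-count statistics like those above can be reproduced with a few lines of standard-library Python. This is a rough sketch that assumes whitespace tokenization; the helper name `word_count_stats` and the sample texts are hypothetical, and the card does not state how its own counts were produced.

```python
from statistics import mean, median
from typing import Dict, List


def word_count_stats(texts: List[str]) -> Dict[str, float]:
    """Summarize whitespace-delimited word counts for a list of texts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": median(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


# Made-up sample with word counts 3, 2 and 5
sample = ["one two three", "one two", "one two three four five"]
stats = word_count_stats(sample)
print(stats["min"], stats["median"], stats["max"])  # 2 3 5
```

One observation on the reported numbers: the median of 20 whole-number word counts is always a whole number or ends in .5, so the value 113.45 in the "Median" column looks more like a mean (113.45 × 20 = 2269 words in total) than a true median. The card does not confirm this either way.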
### Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (10, 10)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
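The hyperparameters above map onto the fields of `setfit.TrainingArguments`. The sketch below restates them as a plain dictionary and shows one way to build the arguments object; the names `HYPERPARAMETERS` and `build_training_arguments` are hypothetical, and the mapping assumes SetFit 1.0.x, where tuple values hold the setting for the embedding phase followed by the setting for the classifier phase.

```python
from typing import Any, Dict

# Values copied from the list above; tuples are (embedding phase, classifier phase)
HYPERPARAMETERS: Dict[str, Any] = {
    "batch_size": (32, 32),
    "num_epochs": (10, 10),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments(hyperparameters: Dict[str, Any]):
    """Create a setfit.TrainingArguments object from a hyperparameter dict."""
    # Imported lazily; requires `pip install setfit`
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    # `loss` expects the loss class itself; `distance_metric` is left at the
    # library default, which this sketch assumes is the listed cosine distance.
    return TrainingArguments(loss=CosineSimilarityLoss, **hyperparameters)
```

Passing the result as `args` to a `setfit.Trainer`, together with a model and a training dataset, would then start a run with these settings.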
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.1429 | 1 | 0.208 | - |
| 7.1429 | 50 | 0.0183 | - |
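The fractional epoch values in this table follow from dividing the step number by the number of steps per epoch. The short check below, with hypothetical helper names, shows that both logged rows are consistent with 7 steps per epoch; that figure is inferred from the table and is not stated anywhere in the card.

```python
def infer_steps_per_epoch(step: int, epoch: float) -> int:
    """Estimate steps per epoch from one logged (step, epoch) pair."""
    return round(step / epoch)


def epoch_at(step: int, steps_per_epoch: int) -> float:
    """Fractional epoch reached after `step` steps, rounded as in the log."""
    return round(step / steps_per_epoch, 4)


# (step, epoch) pairs copied from the table above
logged = [(1, 0.1429), (50, 7.1429)]
for step, epoch in logged:
    steps = infer_steps_per_epoch(step, epoch)
    print(step, epoch, steps, epoch_at(step, steps))
```

If 7 steps per epoch is right, the 10 embedding epochs listed under the hyperparameters would correspond to 70 steps in total, so the second logged row falls a little past the seventh epoch.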
### Framework Versions
- Python: 3.10.0
- SetFit: 1.0.3
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.4.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->