---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
base_model: Omar-Nasr/setfitmodel
metrics:
- accuracy
widget:
- text: ' Like jumping into a pool, or starting an assignment'
- text: ' That''s kind of the nature of my volunteer work, but you could volunteer
with a food bank or boys and girls club, which would involve more social interaction
Just breaking that cycle by going for a short walk around the neighbourhood is
a good idea'
- text: ' And you will have the confidence inside of you, so you wont have to worry
about the outside so much'
- text: ' Do your make up then, get out of that hotel room and take your notes with
you! Go for a walk, try to focus on your senses (the smells, the sounds, the winds
and the temperature, possible the sun burning your skin)'
- text: I would disagree as I usually read people well and could see that he was not
comfortable talking with me, in the first lunch break he left after 5 minutes
and said he wanted to take a walk around the building
pipeline_tag: text-classification
inference: true
model-index:
- name: SetFit with Omar-Nasr/setfitmodel
results:
- task:
type: text-classification
name: Text Classification
dataset:
name: Unknown
type: unknown
split: test
metrics:
- type: accuracy
value: 0.4666666666666667
name: Accuracy
---
# SetFit with Omar-Nasr/setfitmodel
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [Omar-Nasr/setfitmodel](https://huggingface.co/Omar-Nasr/setfitmodel) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
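
The sketch below illustrates these two steps with the `setfit` `Trainer`. The few labeled examples are hypothetical stand-ins (the actual training data is not included in this card), and the base checkpoint is the one named above; `trainer.train()` runs both phases in sequence.
```python
from datasets import Dataset
from setfit import SetFitModel, Trainer

# Hypothetical few-shot data using the same 4 label ids as this model (0.0-3.0)
train_ds = Dataset.from_dict({
    "text": [
        "Go out for a short walk once a day",
        "Start doing sport, either outdoors or at a gym",
        "Try playing soccer haha",
        "I didn't go outside too much",
        "I don't think I have enough courage to go outside alone",
        "I've been doing my usual; not going outside",
        "I would do this too but then I run the risk of meeting someone I know",
        "Nobody is going to pay attention to you",
    ],
    "label": [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 0.0, 0.0],
})

model = SetFitModel.from_pretrained("Omar-Nasr/setfitmodel")
trainer = Trainer(model=model, train_dataset=train_ds)

# train() runs both steps: contrastive fine-tuning of the Sentence Transformer body,
# then fitting the LogisticRegression head on embeddings of the training texts
trainer.train()
```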
## Model Details
### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [Omar-Nasr/setfitmodel](https://huggingface.co/Omar-Nasr/setfitmodel)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 256 tokens
- **Number of Classes:** 4 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels
| Label | Examples |
|:------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1.0 | <ul><li>' Go out for a walk once a day additionally and slowly start increasing the time you spend outside Go out for a walk once a day additionally and slowly start increasing the time you spend outside Start doing sport, either outdoors or at a gym If you can, try to take your dog to a dog park or something like that'</li><li>" Now I'm not saying to go to a party on the spot, just go out, shop, take a walk in the park, that kind of thing Now I'm not saying to go to a party on the spot, just go out, shop, take a walk in the park, that kind of thing"</li><li>' antidepressants, therapy, meditation, walking in nature (these work for me idk about other people) antidepressants, therapy, meditation, walking in nature (these work for me idk about other people) '</li></ul> |
| 2.0 | <ul><li>'Try playing soccer haha'</li><li>" I didn't go outside too much"</li><li>" But we can't sleep in a hotel, we have go camping"</li></ul> |
| 3.0 | <ul><li>" I don't know anyone in the city, I don't think I have enough courage to go outside alone"</li><li>" But even then, I didn't have any other problems outside school I still had no friends at European school, I haven't had any walks which I had constantly with my friends back in Ukraine"</li><li>" For the past two months I've been doing my usual; not going outside just watching South Park and only going out when someone drags me For the past two months I've been doing my usual; not going outside just watching South Park and only going out when someone drags me"</li></ul> |
| 0.0 | <ul><li>' I would do this too but then I run the risk of meeting someone I know lol'</li><li>" Does it matter you are alone? NO Does it matter you can't run the full marathon at top speed? NO Are you useless? NO Seriously nobody is going to pay attention to you and think you are a lonely loser"</li><li>' If anything you should be thinking about wearing sun screen so you retain your good skin as it becomes your ally as you age outside'</li></ul> |
## Evaluation
### Metrics
| Label | Accuracy |
|:--------|:---------|
| **all** | 0.4667 |
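
The test split itself is not published with this card, but accuracy can be recomputed on any labeled data in the same format. A minimal sketch, using hypothetical held-out examples:
```python
from datasets import Dataset
from setfit import SetFitModel

model = SetFitModel.from_pretrained("Omar-Nasr/setfitmodel")

# Hypothetical held-out examples; the actual test split is not included in this card
test_ds = Dataset.from_dict({
    "text": ["Try playing soccer haha", " I didn't go outside too much"],
    "label": [2.0, 2.0],
})

# Predict label ids and compare against the reference labels
preds = model.predict(test_ds["text"])
accuracy = sum(float(p) == y for p, y in zip(preds, test_ds["label"])) / len(test_ds)
print(f"accuracy: {accuracy:.4f}")
```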
## Uses
### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Omar-Nasr/setfitmodel")
# Run inference
preds = model(" Like jumping into a pool, or starting an assignment")
```
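`SetFitModel` also accepts a list of texts, and `predict_proba` returns class probabilities, which can be useful for thresholding. A small sketch (the meaning of the label ids 0.0–3.0 is not documented here, so interpret them against the examples table above):
```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("Omar-Nasr/setfitmodel")

texts = [
    " Like jumping into a pool, or starting an assignment",
    " I didn't go outside too much",
]

# Hard label ids (0.0-3.0), one per input text
preds = model.predict(texts)
print(preds)

# Probability distribution over the 4 classes for each text
probs = model.predict_proba(texts)
print(probs)
```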
<!--
### Downstream Use
*List how someone could finetune this model on their own dataset.*
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:-----|
| Word count | 4 | 49.85 | 1083 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0.0 | 20 |
| 1.0 | 20 |
| 2.0 | 20 |
| 3.0 | 20 |
### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
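
These values map onto `setfit.TrainingArguments` roughly as in the sketch below. This is a reconstruction from the list above, not the original training script; `distance_metric` and `margin` are left at their defaults (cosine distance, 0.25), since they only affect triplet-style losses.
```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

# Reconstructed from the hyperparameter list above
args = TrainingArguments(
    batch_size=(8, 8),                  # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),  # (body during fine-tuning, body during head training)
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```
These arguments would then be passed to a `Trainer` together with a few-shot dataset, as in the training sketch earlier in this card.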
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0017 | 1 | 0.0931 | - |
| 0.0833 | 50 | 0.001 | - |
| 0.1667 | 100 | 0.0002 | - |
| 0.25 | 150 | 0.0 | - |
| 0.3333 | 200 | 0.0 | - |
| 0.4167 | 250 | 0.0 | - |
| 0.5 | 300 | 0.0 | - |
| 0.5833 | 350 | 0.0 | - |
| 0.6667 | 400 | 0.0 | - |
| 0.75 | 450 | 0.0 | - |
| 0.8333 | 500 | 0.0 | - |
| 0.9167 | 550 | 0.0 | - |
| 1.0 | 600 | 0.0 | - |
### Framework Versions
- Python: 3.10.13
- SetFit: 1.0.3
- Sentence Transformers: 2.7.0
- Transformers: 4.39.3
- PyTorch: 2.1.2
- Datasets: 2.18.0
- Tokenizers: 0.15.2
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->