---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
- precision
- recall
- f1
widget:
- text: 'I''m trying to take a dataframe and convert them to tensors to train a model in keras. I think it''s being triggered when I am converting my Y label to a tensor: I''m getting the following error when casting y_train to tensor from slices: In the tutorials this seems to work but I think those tutorials are doing multiclass classifications whereas I''m doing a regression so y_train is a series not multiple columns. Any suggestions of what I can do?'
- text: My weights are defined as I want to use the weights decay so I add, for example, the argument to the tf.get_variable. Now I'm wondering if during the evaluation phase this is still correct or maybe I have to set the regularizer factor to 0. There is also another argument trainable. The documentation says If True also add the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES. which is not clear to me. Should I use it? Can someone explain to me if the weights decay effects in a sort of wrong way the evaluation step? How can I solve in that case?
- text: 'Maybe I''m confused about what "inner" and "outer" tensor dimensions are, but the documentation for tf.matmul puzzles me: Isn''t it the case that R-rank arguments need to have matching (or no) R-2 outer dimensions, and that (as in normal matrix multiplication) the Rth, inner dimension of the first argument must match the R-1st dimension of the second. That is, in The outer dimensions a, ..., z must be identical to a'', ..., z'' (or not exist), and x and x'' must match (while p and q can be anything). Or put another way, shouldn''t the docs say:'
- text: 'I am using tf.data with reinitializable iterator to handle training and dev set data. For each epoch, I initialize the training data set. The official documentation has similar structure. I think this is not efficient especially if the training set is large. Some of the resources I found online has sess.run(train_init_op, feed_dict={X: X_train, Y: Y_train}) before the for loop to avoid this issue. But then we can''t process the dev set after each epoch; we can only process it after we are done iterating over epochs. Is there a way to efficiently process the dev set after each epoch?'
- text: 'Why is the pred variable being calculated before any of the training iterations occur? I would expect that a pred would be generated (through the RNN() function) during each pass through of the data for every iteration? There must be something I am missing. Is pred something like a function object? I have looked at the docs for tf.matmul() and that returns a tensor, not a function. Full source: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py Here is the code:'
pipeline_tag: text-classification
inference: true
base_model: flax-sentence-embeddings/stackoverflow_mpnet-base
model-index:
- name: SetFit with flax-sentence-embeddings/stackoverflow_mpnet-base
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.81875
      name: Accuracy
    - type: precision
      value: 0.8248924988055423
      name: Precision
    - type: recall
      value: 0.81875
      name: Recall
    - type: f1
      value: 0.8178892421209625
      name: F1
---

# SetFit with flax-sentence-embeddings/stackoverflow_mpnet-base

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [flax-sentence-embeddings/stackoverflow_mpnet-base](https://huggingface.co/flax-sentence-embeddings/stackoverflow_mpnet-base) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer, as sketched below.
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [flax-sentence-embeddings/stackoverflow_mpnet-base](https://huggingface.co/flax-sentence-embeddings/stackoverflow_mpnet-base)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes

### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 1     |          |
| 0     |          |

## Evaluation

### Metrics
| Label   | Accuracy | Precision | Recall | F1     |
|:--------|:---------|:----------|:-------|:-------|
| **all** | 0.8187   | 0.8249    | 0.8187 | 0.8179 |
## Uses

### Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("sharukat/so_mpnet-base_question_classifier")
# Run inference
preds = model("I'm trying to take a dataframe and convert them to tensors to train a model in keras. I think it's being triggered when I am converting my Y label to a tensor: I'm getting the following error when casting y_train to tensor from slices: In the tutorials this seems to work but I think those tutorials are doing multiclass classifications whereas I'm doing a regression so y_train is a series not multiple columns. Any suggestions of what I can do?")
```

## Training Details

### Training Set Metrics
| Training set | Min | Median   | Max |
|:-------------|:----|:---------|:----|
| Word count   | 12  | 128.0219 | 907 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 320                   |
| 1     | 320                   |

### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 16)
- max_steps: -1
- sampling_strategy: unique
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- max_length: 256
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True

### Training Results
| Epoch   | Step      | Training Loss | Validation Loss |
|:-------:|:---------:|:-------------:|:---------------:|
| 0.0000  | 1         | 0.3266        | -               |
| **1.0** | **25640** | **0.0**       | **0.2863**      |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.13
- SetFit: 1.0.3
- Sentence Transformers: 2.5.1
- Transformers: 4.38.1
- PyTorch: 2.1.2
- Datasets: 2.18.0
- Tokenizers: 0.15.2

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```