Add information on how to fine-tune the model (`README.md`)

Below, you can see the F1 score on several text classification datasets.

| [Comprehendo (184M)](https://huggingface.co/knowledgator/comprehend_it-base) | 0.90 | 0.7982 | 0.5660 |
| SetFit [BAAI/bge-small-en-v1.5 (33.4M)](https://huggingface.co/BAAI/bge-small-en-v1.5) | 0.86 | 0.5636 | 0.5754 |
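For context on where these zero-shot scores come from: as a cross-encoder, the model scores each candidate label as an NLI hypothesis against the input text and ranks labels by normalized entailment score. The sketch below shows only that ranking step, with made-up logits standing in for real forward passes (the label names and numbers are illustrative, not from the model).

```python
import math

# Zero-shot classification with an NLI cross-encoder: each candidate label is
# turned into a hypothesis (e.g. "This example is sports."), the model scores
# premise/hypothesis entailment, and labels are ranked by normalized score.
def rank_labels(entailment_logits):
    # Softmax-normalize the per-label entailment logits, then sort descending.
    exps = {label: math.exp(logit) for label, logit in entailment_logits.items()}
    total = sum(exps.values())
    return sorted(((label, e / total) for label, e in exps.items()),
                  key=lambda kv: kv[1], reverse=True)

# Made-up logits in place of one real forward pass per candidate label.
ranked = rank_labels({"politics": 0.2, "sports": 2.5, "business": -1.0})
print(ranked[0][0])  # prints "sports", the highest-scoring label
```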
### Few-shot learning

You can effectively fine-tune the model using 💧[LiqFit](https://github.com/Knowledgator/LiqFit). LiqFit is an easy-to-use framework for few-shot learning of cross-encoder models.

Download and install `LiqFit` by running:

```bash
pip install liqfit
```

For the most up-to-date version, you can build from source by executing:

```bash
pip install git+https://github.com/knowledgator/LiqFit.git
```

To fine-tune, you need to process a dataset, initialize a model, choose a loss function, and set the training arguments. Read more in the quick-start section of the [documentation](https://docs.knowledgator.com/docs/frameworks/liqfit/quick-start).

```python
from liqfit.modeling import LiqFitModel
from liqfit.losses import FocalLoss
from liqfit.collators import NLICollator
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TrainingArguments, Trainer)

backbone_model = AutoModelForSequenceClassification.from_pretrained('knowledgator/comprehend_it-base')
tokenizer = AutoTokenizer.from_pretrained('knowledgator/comprehend_it-base')

# Focal loss over multiple NLI targets.
loss_func = FocalLoss(multi_target=True)

model = LiqFitModel(backbone_model.config, backbone_model, loss_func=loss_func)

data_collator = NLICollator(tokenizer, max_length=128, padding=True, truncation=True)

training_args = TrainingArguments(
    output_dir='comprehendo',
    learning_rate=3e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    num_train_epochs=9,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_steps=5000,
    save_total_limit=3,
    remove_unused_columns=False,
)

# nli_train_dataset and nli_test_dataset are your NLI-formatted splits;
# see the quick-start documentation for how to build them.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=nli_train_dataset,
    eval_dataset=nli_test_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()
```
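The trainer above expects NLI-style (premise, hypothesis, label) examples. As a rough sketch of how a handful of labeled texts can be expanded into such pairs: the helper below is illustrative, not part of LiqFit, and the hypothesis template and MNLI-style label ids are assumptions — check the dataset format in the LiqFit quick-start and the model's `config.id2label` before relying on them.

```python
# Illustrative helper (not a LiqFit API): expand labeled texts into NLI-style
# triples, pairing each text with its true class as "entailment" and every
# other class as "contradiction".
def to_nli_examples(texts, labels, classes, template="This example is {}."):
    examples = []
    for text, label in zip(texts, labels):
        for cls in classes:
            examples.append({
                "premise": text,
                "hypothesis": template.format(cls),
                # Assumed label ids: 0 = entailment, 2 = contradiction.
                "label": 0 if cls == label else 2,
            })
    return examples

nli_examples = to_nli_examples(
    texts=["I love this movie!", "The service was terrible."],
    labels=["positive", "negative"],
    classes=["positive", "negative"],
)
print(len(nli_examples))  # prints 4: two texts x two candidate classes
```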

### Benchmarks

| Model & examples per label | Emotion | AgNews | SST5 |
|-|-|-|-|
| Comprehend-it/0 | 56.60 | 79.82 | 37.9 |
| Comprehend-it/8 | 63.38 | 85.9 | 46.67 |
| Comprehend-it/64 | 80.7 | 88 | 47 |
| SetFit/0 | 57.54 | 56.36 | 24.11 |
| SetFit/8 | 56.81 | 64.93 | 33.61 |
| SetFit/64 | 79.03 | 88 | 45.38 |

### Alternative usage
Besides text classification, the model can be used for many other information extraction tasks.
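For instance, question answering can be recast as classification over candidate answers. The helper below is an illustrative sketch (not an API of this model or of `transformers`): it simply packs a question/passage pair and candidate answers into the arguments that a `zero-shot-classification` pipeline accepts.

```python
# Hypothetical helper: frame question answering as zero-shot classification.
# The (question, passage) pair becomes the sequence to classify, and each
# candidate answer becomes one candidate label for the model to rank.
def qa_as_classification(question, passage, candidate_answers):
    sequence = f"{question} {passage}"
    return {"sequences": sequence, "candidate_labels": candidate_answers}

inputs = qa_as_classification(
    question="What is the capital of France?",
    passage="Paris is the capital and largest city of France.",
    candidate_answers=["Paris", "Lyon", "Marseille"],
)
# `inputs` can then be passed to a transformers zero-shot-classification
# pipeline built on this model, e.g. classifier(**inputs).
```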