Ihor committed on
Commit
6cda4e1
1 Parent(s): d7441ad

Add information on how to fine-tune the model

Files changed (1): README.md +66 -0
README.md CHANGED
@@ -103,6 +103,72 @@ Below, you can see the F1 score on several text classification datasets. All tes
| [Comprehendo (184M)](https://huggingface.co/knowledgator/comprehend_it-base) | 0.90 | 0.7982 | 0.5660 |
| SetFit [BAAI/bge-small-en-v1.5 (33.4M)](https://huggingface.co/BAAI/bge-small-en-v1.5) | 0.86 | 0.5636 | 0.5754 |
### Few-shot learning
You can effectively fine-tune the model with 💧[LiqFit](https://github.com/Knowledgator/LiqFit), an easy-to-use framework for few-shot learning with cross-encoder models.

Download and install `LiqFit` by running:

```bash
pip install liqfit
```

For the most up-to-date version, you can build from source by running:

```bash
pip install git+https://github.com/knowledgator/LiqFit.git
```

You need to process the dataset, initialize the model, choose a loss function, and set the training arguments. Read more in the [quick-start section](https://docs.knowledgator.com/docs/frameworks/liqfit/quick-start) of the documentation.

```python
from liqfit.modeling import LiqFitModel
from liqfit.losses import FocalLoss
from liqfit.collators import NLICollator
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TrainingArguments, Trainer)

model_path = 'knowledgator/comprehend_it-base'
tokenizer = AutoTokenizer.from_pretrained(model_path)
backbone_model = AutoModelForSequenceClassification.from_pretrained(model_path)

# Focal loss focuses training on hard examples; multi_target fits the NLI setup
loss_func = FocalLoss(multi_target=True)

model = LiqFitModel(backbone_model.config, backbone_model, loss_func=loss_func)

data_collator = NLICollator(tokenizer, max_length=128, padding=True, truncation=True)

training_args = TrainingArguments(
    output_dir='comprehendo',
    learning_rate=3e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    num_train_epochs=9,
    weight_decay=0.01,
    evaluation_strategy='epoch',
    save_steps=5000,
    save_total_limit=3,
    remove_unused_columns=False,
)

# nli_train_dataset and nli_test_dataset are NLI-formatted datasets
# prepared as described in the quick-start guide
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=nli_train_dataset,
    eval_dataset=nli_test_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()
```
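
At inference time, an NLI cross-encoder scores each candidate label by pairing the input text with a label hypothesis and reading off the entailment logit. The sketch below illustrates only the final scoring step with made-up logits; it is not LiqFit's API, just the softmax-over-labels idea:

```python
import math

def zero_shot_scores(entailment_logits, labels):
    """Turn per-label entailment logits into label probabilities via softmax.
    In practice the logits come from running the cross-encoder on
    (text, hypothesis) pairs such as "This example is {label}."."""
    m = max(entailment_logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in entailment_logits]
    total = sum(exps)
    return {lab: e / total for lab, e in zip(labels, exps)}

scores = zero_shot_scores([2.1, -0.3, 0.4], ["sports", "politics", "science"])
best = max(scores, key=scores.get)
print(best)  # sports
```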

### Benchmarks
F1 scores by number of training examples per label (the suffix after "/"):

| Model & examples per label | Emotion | AgNews | SST5 |
|-|-|-|-|
| Comprehend-it/0 | 56.60 | 79.82 | 37.9 |
| Comprehend-it/8 | 63.38 | 85.9 | 46.67 |
| Comprehend-it/64 | 80.7 | 88 | 47 |
| SetFit/0 | 57.54 | 56.36 | 24.11 |
| SetFit/8 | 56.81 | 64.93 | 33.61 |
| SetFit/64 | 79.03 | 88 | 45.38 |
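
Few-shot settings such as Comprehend-it/8 train on k labeled examples per class. A minimal sketch of drawing such a subset (a hypothetical helper, not part of LiqFit):

```python
import random
from collections import defaultdict

def sample_few_shot(examples, k, seed=42):
    """Draw up to k examples per label from a labeled dataset."""
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex["label"]].append(ex)
    rng = random.Random(seed)
    subset = []
    for items in by_label.values():
        rng.shuffle(items)
        subset.extend(items[:k])
    return subset

# 30 toy examples over 3 labels -> 8 per label = 24 sampled
data = [{"text": f"text {i}", "label": i % 3} for i in range(30)]
subset = sample_few_shot(data, k=8)
print(len(subset))  # 24
```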

### Alternative usage
Besides text classification, the model can be used for many other information extraction tasks.