cocorooxinnn/optim_finetuned_bart_inclusion

This is a facebook/bart-large-mnli model finetuned on a dataset created from the EU taxonomy. The model is exported to ONNX format and optimized at the O4 level, i.e. with fp16 mixed precision; this optimization level is GPU-only and requires --device cuda.
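
For reference, an O4-level ONNX optimization can be produced with optimum's ORTOptimizer. The sketch below is illustrative only (not necessarily how this checkpoint was created); the model id and output directory are placeholders:

from optimum.onnxruntime import ORTModelForSequenceClassification, ORTOptimizer, AutoOptimizationConfig

# Export a sequence-classification checkpoint to ONNX (placeholder model id)
onnx_model = ORTModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli", export=True)

# O4 = all graph optimizations plus fp16 mixed precision; GPU-only
optimizer = ORTOptimizer.from_pretrained(onnx_model)
optimizer.optimize(save_dir="onnx_o4", optimization_config=AutoOptimizationConfig.O4())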

The model has a single output neuron for binary classification: it predicts whether a keyword is semantically contained in, i.e. relevant to, a given topic. The topics can come from any domain, for example:

Tasks       Positive example                 Negative example
Topic       Fashion industry                 Healthcare
Keyword     Pollution of textile production  Football athlete
Prediction  Relevant                         Irrelevant

How to load the model locally and run inference:

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import BartTokenizerFast, BartConfig
import torch
from torch import nn

# The O4-optimized ONNX model requires a CUDA device
assert torch.cuda.is_available(), "Error: CUDA device not available. Inference is optimized to run on a GPU."

config = BartConfig(num_labels=1)
model_checkpoint = "cocorooxinnn/optim_finetuned_bart_inclusion"
ort_model = ORTModelForSequenceClassification.from_pretrained(
    model_checkpoint, config=config, provider="CUDAExecutionProvider",
)
tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-large-mnli")

m = nn.Sigmoid()
# Simple NLI template for premise and hypothesis
template = "They are talking about "
premise, hypothesis = (template + "Fashion industry", template + "Pollution of textile production")
x = tokenizer(premise, hypothesis, return_tensors="pt").to(ort_model.device)
prediction = m(ort_model(**x).logits).squeeze() > 0.5  # logits shape: (batch_size, 1)
print(prediction)  # True
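
For convenience, the same steps can be wrapped in a small helper (a hypothetical function, not part of this repository) that scores one keyword against one topic:

def is_relevant(topic: str, keyword: str, threshold: float = 0.5) -> bool:
    # Build the NLI-style premise/hypothesis pair and score it with the ONNX model
    inputs = tokenizer(template + topic, template + keyword, return_tensors="pt").to(ort_model.device)
    prob = m(ort_model(**inputs).logits).squeeze()
    return bool(prob > threshold)

print(is_relevant("Fashion industry", "Pollution of textile production"))  # expected: True
print(is_relevant("Healthcare", "Football athlete"))                       # expected: False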

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-4
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 2
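
These settings correspond roughly to the following transformers TrainingArguments. This is a sketch assuming the standard Trainer API was used; the output directory is a placeholder:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bart_inclusion_finetuning",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,       # the card lists Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_steps=20,
    num_train_epochs=2,
)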

Training results (the F1 score is calculated on the validation set)

Step Training Loss Validation Loss F1 Precision Recall Accuracy
50 0.441800 0.285273 0.882523 0.892977 0.872311 0.883930
100 0.328400 0.259236 0.892380 0.903460 0.881568 0.893727
150 0.287400 0.257763 0.908518 0.877091 0.942282 0.905157
200 0.286300 0.243944 0.909042 0.898643 0.919684 0.908015
250 0.275500 0.239769 0.894515 0.925225 0.865777 0.897945
300 0.271800 0.222483 0.912901 0.912653 0.913150 0.912913
350 0.245500 0.221774 0.916039 0.905560 0.926763 0.915090
400 0.250700 0.219120 0.912324 0.924044 0.900898 0.913458
450 0.249000 0.211039 0.914482 0.922204 0.906888 0.915227
500 0.241500 0.207655 0.927005 0.910691 0.943915 0.925704
550 0.249900 0.197901 0.924230 0.925239 0.923224 0.924343
600 0.236000 0.202164 0.921937 0.929204 0.914784 0.922574
650 0.196100 0.192816 0.931687 0.918740 0.945004 0.930739
700 0.214800 0.206045 0.930494 0.894499 0.969507 0.927609
750 0.223600 0.186433 0.931180 0.928533 0.933842 0.931011
800 0.223900 0.189542 0.933564 0.911757 0.956439 0.931964
850 0.197500 0.191664 0.930473 0.928204 0.932753 0.930331
900 0.194600 0.185483 0.930797 0.922460 0.939287 0.930195
950 0.190200 0.183808 0.934791 0.916100 0.954261 0.933460
1000 0.189700 0.181666 0.934212 0.923404 0.945276 0.933460
1050 0.199300 0.177857 0.933693 0.924473 0.943098 0.933052

Final training output: 1072 global steps over 2.0 epochs, average training loss 0.2458, runtime 3750.36 s (18.289 samples/s, 0.286 steps/s), total FLOs ≈ 7.43e15.

Evaluation on test set

Precision Recall F-score
0.94 0.94 0.94

Framework versions

  • PEFT 0.10.0
  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
