skillner / README.md
metadata
base_model: jjzha/jobbert-base-cased
metrics:
  - accuracy
  - precision
  - recall
  - f1
model-index:
  - name: results
    results: []
widget:
  - text: You should be a skilled communicator.
  - text: You can programme in Python and CSS.

results

This model is a fine-tuned version of jjzha/jobbert-base-cased for token classification. After the final (fifth) training epoch, it achieves the following results on the evaluation set:

  • Loss: 0.1244
  • Accuracy: 0.9701
  • Precision: 0.5581
  • Recall: 0.6814
  • F1: 0.6136
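As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the two figures above. The sketch below uses only the numbers on this card; the function name `f1_score` is illustrative and not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported above for the evaluation set.
precision, recall = 0.5581, 0.6814
print(round(f1_score(precision, recall), 4))  # 0.6136
```

Accuracy is much higher than F1 here, most likely because accuracy is counted over all tokens and the majority of tokens in a job advert are labelled 'O'; the card does not state how each metric was computed.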

Model description

The base model (jjzha/jobbert-base-cased) is a BERT transformer pretrained with a Masked Language Modelling (MLM) objective on a corpus of ~3.2 million sentences from job adverts. A token classification head is added on top of the model to predict a label for every token in a given sequence. Here the sequence is a job description, and each token is labelled 'B-SKILL' (beginning of a skill), 'I-SKILL' (inside a skill) or 'O' (not a skill).
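To illustrate how the three labels turn into skill phrases, here is a minimal sketch of decoding a BIO-tagged sequence. It is not the model's own post-processing: the helper `extract_skills` is hypothetical, the labels are hand-written rather than real model output, and whole words are used instead of BERT word-piece tokens to keep the example short.

```python
from typing import List


def extract_skills(tokens: List[str], labels: List[str]) -> List[str]:
    """Group tokens tagged B-SKILL / I-SKILL into skill phrases."""
    skills: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B-SKILL":
            # A new skill starts; close any skill that was still open.
            if current:
                skills.append(" ".join(current))
            current = [token]
        elif label == "I-SKILL" and current:
            # Continue the skill that is currently open.
            current.append(token)
        else:
            # 'O' (or an I-SKILL with no preceding B-SKILL, ignored in this sketch).
            if current:
                skills.append(" ".join(current))
            current = []
    if current:
        skills.append(" ".join(current))
    return skills


# Hand-written labels for one of the widget sentences (an assumption, not model output).
tokens = ["You", "can", "programme", "in", "Python", "and", "CSS", "."]
labels = ["O", "O", "O", "O", "B-SKILL", "O", "B-SKILL", "O"]
print(extract_skills(tokens, labels))  # ['Python', 'CSS']
```

A multi-word skill would be tagged with one 'B-SKILL' followed by one or more 'I-SKILL' labels and would come out as a single phrase.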

Training and evaluation data

The model was trained on 4112 job advert sentences.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
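The hyperparameters above determine the number of optimizer steps and the learning rate at each step. The sketch below works through that arithmetic in plain Python. It assumes a single device, no gradient accumulation and zero warmup steps, none of which are stated on this card; `linear_lr` is an illustrative helper, not a library function.

```python
import math

NUM_SENTENCES = 4112   # training sentences reported on this card
BATCH_SIZE = 16        # train_batch_size
NUM_EPOCHS = 5         # num_epochs
BASE_LR = 5e-05        # learning_rate

# One optimizer step per batch (assumes no gradient accumulation).
steps_per_epoch = math.ceil(NUM_SENTENCES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps)  # 257 1285
for epoch in range(NUM_EPOCHS + 1):
    step = epoch * steps_per_epoch
    print(f"step {step:4d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the step counts agree with the 257 steps per epoch and 1285 total steps logged during training.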

Training results

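| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|---------------|-------|------|-----------------|----------|-----------|--------|--------|
| No log        | 1.0   | 257  | 0.0769          | 0.9725   | 0.5578    | 0.7003 | 0.6210 |
| 0.0816        | 2.0   | 514  | 0.1051          | 0.9653   | 0.5086    | 0.7445 | 0.6044 |
| 0.0816        | 3.0   | 771  | 0.0986          | 0.9709   | 0.5761    | 0.7161 | 0.6385 |
| 0.0262        | 4.0   | 1028 | 0.1140          | 0.9703   | 0.5627    | 0.6940 | 0.6215 |
| 0.0262        | 5.0   | 1285 | 0.1244          | 0.9701   | 0.5581    | 0.6814 | 0.6136 |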
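The per-epoch figures can be compared programmatically to see which epoch scores best on each metric. This is a minimal sketch using only the validation loss and F1 values from the training results; the card does not say which checkpoint was published, so it makes no claim about how the final model was selected.

```python
# Validation loss and F1 per epoch, copied from the training results.
results = [
    {"epoch": 1, "val_loss": 0.0769, "f1": 0.6210},
    {"epoch": 2, "val_loss": 0.1051, "f1": 0.6044},
    {"epoch": 3, "val_loss": 0.0986, "f1": 0.6385},
    {"epoch": 4, "val_loss": 0.1140, "f1": 0.6215},
    {"epoch": 5, "val_loss": 0.1244, "f1": 0.6136},
]

best_by_f1 = max(results, key=lambda row: row["f1"])
best_by_loss = min(results, key=lambda row: row["val_loss"])

print("best F1:", best_by_f1)                  # epoch 3, F1 0.6385
print("lowest validation loss:", best_by_loss)  # epoch 1, loss 0.0769
```

Validation loss rises after the first epoch while training loss keeps falling, which is a common sign of overfitting. The headline results match the epoch 5 row.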

Framework versions

  • Transformers 4.34.1
  • PyTorch 2.1.0+cu118
  • Datasets 2.14.6
  • Tokenizers 0.14.1