license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: bert-base-cased-PLANE-ood-2
    results: []
language:
  - en
pipeline_tag: text-classification
widget:
  - text: A fake smile is a smile

BERT for PLANE classification

This model is a fine-tuned version of bert-base-cased on one of the PLANE dataset splits (no. 2), introduced in Bertolini et al. (COLING 2022). It achieves the following results on the evaluation set:

  • Accuracy: 0.9043

Model description

The model is trained to perform a sequence classification task over phrase-level adjective-noun inferences (e.g., "A red car is a vehicle").
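A minimal inference sketch with the transformers text-classification pipeline; the repo id `lorenzoscottb/bert-base-cased-PLANE-ood-2` is an assumption based on the card's model name and author, and the example sentence is taken from the widget above.

```python
from transformers import pipeline

# Assumed repo id, inferred from the model name on this card
classifier = pipeline(
    "text-classification",
    model="lorenzoscottb/bert-base-cased-PLANE-ood-2",
)

# Phrase-level adjective-noun inference, as in the widget example
result = classifier("A fake smile is a smile")
print(result)
```

The pipeline returns a list with one dict per input, containing a predicted `label` and a `score`.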

Intended uses & limitations

More information needed

Training and evaluation data

The data used for training and testing, as well as the other splits used in the experiments, are available on the paper's git page here. The reported accuracy refers to out-of-distribution evaluation: the model performs the same classification task, but on adjectives and nouns unseen during training.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.12.1
  • Datasets 2.5.1
  • Tokenizers 0.12.1

Cite

If you use the model or data in your work, please also cite the paper:

@inproceedings{bertolini-etal-2022-testing,
    title = "Testing Large Language Models on Compositionality and Inference with Phrase-Level Adjective-Noun Entailment",
    author = "Bertolini, Lorenzo  and
      Weeds, Julie  and
      Weir, David",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics",
    url = "https://aclanthology.org/2022.coling-1.359",
    pages = "4084--4100",
}