aliosm/ComVE-gpt2



Contributed by aliosm (Ali Hamdi Ali Fadel)

How to use this model directly from the 🤗/transformers library:

from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("aliosm/ComVE-gpt2")
model = AutoModelWithLMHead.from_pretrained("aliosm/ComVE-gpt2")


Model description

A GPT-2 model fine-tuned on the Commonsense Validation and Explanation (ComVE) dataset, introduced in SemEval-2020 Task 4, using a causal language modeling (CLM) objective. Given a natural language statement that is against commonsense, the model generates a reason why.

Intended uses & limitations

You can use the raw model to generate reasons why natural language statements are against commonsense.

How to use

You can use this model directly, with a script, to generate reasons why a given statement is against commonsense.

Note: make sure that you are using version 2.4.1 of the transformers package. Newer versions have an issue in text generation that causes the model to repeat the last generated token again and again.
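
A minimal sketch of what generation might look like, assuming the prompt is the input statement followed by the <|continue|> separator used during fine-tuning. The helper names (build_prompt, truncate_at_eos, generate_reason), the single space before the separator, greedy decoding, and the 128-token limit are illustrative assumptions and are not taken from the author's script.

```python
SEPARATOR = "<|continue|>"  # separator named in the training procedure
EOS = "<|endoftext|>"       # GPT-2's end-of-text token


def build_prompt(statement):
    """Statement followed by the separator (the exact spacing is an assumption)."""
    return f"{statement.strip()} {SEPARATOR}"


def truncate_at_eos(text):
    """Keep only the text before the first end-of-text marker."""
    return text.split(EOS, 1)[0].strip()


def generate_reason(statement, max_length=128):
    """Hypothetical helper: generate one reason for an against-commonsense statement."""
    # Imported lazily so the pure-Python helpers above work without these libraries.
    import torch
    from transformers import AutoTokenizer, AutoModelWithLMHead

    tokenizer = AutoTokenizer.from_pretrained("aliosm/ComVE-gpt2")
    model = AutoModelWithLMHead.from_pretrained("aliosm/ComVE-gpt2")
    model.eval()

    prompt_ids = tokenizer.encode(build_prompt(statement))
    input_ids = torch.tensor([prompt_ids])
    with torch.no_grad():
        # Greedy decoding is an assumption; the author's script may sample instead.
        output = model.generate(
            input_ids=input_ids, max_length=max_length, do_sample=False
        )

    # Flatten to (sequences, length) so indexing works whether or not the
    # installed version returns an extra dimension.
    sequence = output.view(-1, output.size(-1))[0].tolist()
    # Decode only the newly generated tokens, not the prompt.
    continuation = tokenizer.decode(sequence[len(prompt_ids):])
    return truncate_at_eos(continuation)
```

Calling generate_reason("He put an elephant into the fridge.") downloads the model on first use. The output depends on the installed transformers version, as the note above explains.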

Limitations and bias

The model is biased toward simply negating the input sentence rather than producing a factual reason.

Training data

The model is initialized from the gpt2 model and fine-tuned on the ComVE dataset, which contains 10K against-commonsense sentences, each paired with three reference reasons.

Training procedure

Each natural language statement that is against commonsense is concatenated with its reference reason, with <|continue|> as a separator, and the model is then fine-tuned using the CLM objective. The model was trained on an Nvidia Tesla P100 GPU on the Google Colab platform with a learning rate of 5e-5, 5 epochs, a maximum sequence length of 128, and a batch size of 64.
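
The concatenation step can be sketched as follows. This is an illustration under stated assumptions, not the author's training code: the function name build_training_texts, the single spaces around the separator, and the choice to emit one training string per (statement, reason) pair are all assumptions, and the example sentences are made up for demonstration. The hyperparameter values are the ones reported above.

```python
SEPARATOR = "<|continue|>"

# Hyperparameters as reported in the model card.
HYPERPARAMETERS = {
    "learning_rate": 5e-5,
    "num_epochs": 5,
    "max_sequence_length": 128,
    "batch_size": 64,
}


def build_training_texts(statement, reasons):
    """Return one training string per (statement, reason) pair.

    Pairing the statement with each reference reason separately is an
    assumption about how the three reasons were used.
    """
    return [
        f"{statement.strip()} {SEPARATOR} {reason.strip()}"
        for reason in reasons
    ]


# Made-up example in the style of the dataset (not an actual dataset entry).
example = build_training_texts(
    "He put an elephant into the fridge.",
    [
        "An elephant is much bigger than a fridge.",
        "A fridge is too small to hold an elephant.",
        "Elephants cannot fit inside a fridge.",
    ],
)

for text in example:
    print(text)
```

Under this pairing assumption, 10K statements with three reasons each would give roughly 30K training strings, each truncated or padded to the 128-token maximum during tokenization.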

Eval results

The model achieved BLEU scores of 14.0547 and 13.6534 on the development and test sets, respectively, of SemEval-2020 Task 4: Commonsense Validation and Explanation.

BibTeX entry and citation info

  title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
  author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},