---
language:
  - en
  - cs
license: cc-by-4.0
metrics:
  - bleurt
  - bleu
  - bertscore
pipeline_tag: text-classification
---

# AlignScoreCS

A multi-task, multilingual model for assessing factual consistency in various NLU tasks in Czech and English. We followed the approach of the original AlignScore paper (https://arxiv.org/abs/2305.16739). The model uses the shared architecture of the xlm-roberta-large checkpoint (https://huggingface.co/FacebookAI/xlm-roberta-large), extended with three linear heads: one for regression, one for binary classification, and one for ternary classification.
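The head layout described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the shared encoder's pooled output is stubbed with random vectors, NumPy stands in for the training framework, and the hidden size of 1024 is taken from xlm-roberta-large.

```python
import numpy as np

HIDDEN = 1024  # hidden size of xlm-roberta-large

rng = np.random.default_rng(0)

# Stub for the shared encoder's pooled output (batch of 2 examples).
pooled = rng.standard_normal((2, HIDDEN))

# Three task-specific linear heads on top of the shared encoder.
w_reg = rng.standard_normal((HIDDEN, 1))  # regression
w_bin = rng.standard_normal((HIDDEN, 2))  # binary classification
w_tri = rng.standard_normal((HIDDEN, 3))  # ternary classification

reg_out = pooled @ w_reg  # one score per example
bin_out = pooled @ w_bin  # two logits per example
tri_out = pooled @ w_tri  # three logits per example

print(reg_out.shape, bin_out.shape, tri_out.shape)
```

All three heads read the same encoder representation, which is what makes the multi-task training share parameters across the 3-way, 2-way, and regression datasets listed below.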

## Usage

```python
# Assuming you copied the attached Files_and_versions/AlignScore.py file
# for ease of use with transformers.
from AlignScoreCS import AlignScoreCS

alignScoreCS = AlignScoreCS.from_pretrained("krotima1/AlignScoreCS")
# Move the model to CUDA to accelerate inference.
print(alignScoreCS.score(context="This is context", claim="This is claim"))
```
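For context, the AlignScore paper scores long inputs by splitting the context into chunks and the claim into sentences, scoring each sentence against every chunk, then taking the maximum over chunks and the mean over sentences. A minimal sketch of that aggregation (the `aggregate` helper is hypothetical, not part of this repository's API):

```python
def aggregate(scores):
    """AlignScore-style aggregation.

    scores[i][j] is the alignment score of claim sentence i
    against context chunk j: take the max over chunks, then
    the mean over sentences.
    """
    per_sentence = [max(row) for row in scores]
    return sum(per_sentence) / len(per_sentence)

# Two claim sentences scored against two context chunks.
print(aggregate([[0.2, 0.9], [0.4, 0.6]]))  # → 0.75
```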

## Results

## Training datasets

The following table shows the datasets used to train the model. We translated the English datasets to Czech using SeamlessM4T.

| NLP Task | Dataset | Training Task | Context (n words) | Claim (n words) | Samples (Cs) | Samples (En) |
|---|---|---|---|---|---|---|
| NLI | SNLI | 3-way | 10 | 13 | 500k | 550k |
| NLI | MultiNLI | 3-way | 16 | 20 | 393k | 393k |
| NLI | Adversarial NLI | 3-way | 48 | 54 | 163k | 163k |
| NLI | DocNLI | 2-way | 97 | 285 | 200k | 942k |
| Fact Verification | NLI-style FEVER | 3-way | 48 | 50 | 208k | 208k |
| Fact Verification | Vitamin C | 3-way | 23 | 25 | 371k | 371k |
| Paraphrase | QQP | 2-way | 9 | 11 | 162k | 364k |
| Paraphrase | PAWS | 2-way | - | 18 | - | 707k |
| Paraphrase | PAWS labeled | 2-way | 18 | - | 49k | - |
| Paraphrase | PAWS unlabeled | 2-way | 18 | - | 487k | - |
| STS | SICK | reg | - | 10 | - | 4k |
| STS | STS Benchmark | reg | - | 10 | - | 6k |
| STS | Free-N1 | reg | 18 | - | 20k | - |
| QA | SQuAD v2 | 2-way | 105 | 119 | 130k | 130k |
| QA | RACE | 2-way | 266 | 273 | 200k | 351k |
| Information Retrieval | MS MARCO | 2-way | 49 | 56 | 200k | 5M |
| Summarization | WikiHow | 2-way | 434 | 508 | 157k | 157k |
| Summarization | SumAug | 2-way | - | - | - | - |