
LM-Combiner

All of the code and model weights have been released at this link. Thank you for your patience!

Model Weights

  • cbart_large.zip
    • Weights of the BART baseline model.
  • lm_combiner.zip
    • Weights of LM-Combiner for the BART baseline on the FCGEC dataset.
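
If the archives unpack into standard Hugging Face checkpoint directories (an assumption; check the archive contents), the weights can be loaded directly with transformers, e.g. for the baseline:

```python
# Minimal loading sketch; "./cbart_large" is assumed to be the unzipped
# checkpoint directory and may differ from the actual archive layout.
from transformers import BertTokenizer, BartForConditionalGeneration

tokenizer = BertTokenizer.from_pretrained("./cbart_large")
model = BartForConditionalGeneration.from_pretrained("./cbart_large")
```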

Requirements

The models are implemented with the Hugging Face framework, and the required environment is as follows:

  • Python
  • torch
  • transformers
  • datasets
  • tqdm
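
For example, a typical setup (versions unpinned; adjust to your Python and CUDA versions):

pip install torch transformers datasets tqdm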

For evaluation, please refer to the environment configuration of ChERRANT.

Training Stage

Preprocessing

Baseline Model

  • First, we train a baseline model (Chinese-BART-large) for LM-Combiner on the FCGEC dataset in the standard Seq2Seq format.
sh ./script/run_bart_baseline.sh
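
As a rough illustration of the Seq2Seq format (a sketch only; the checkpoint name and example pair below are assumptions, and the real training logic lives in the script above):

```python
# Sketch: one training step of a Chinese BART GEC baseline in Seq2Seq format.
from transformers import BertTokenizer, BartForConditionalGeneration

tokenizer = BertTokenizer.from_pretrained("fnlp/bart-large-chinese")
model = BartForConditionalGeneration.from_pretrained("fnlp/bart-large-chinese")

src = "他昨天去了图书馆借书了。"  # ungrammatical source sentence
tgt = "他昨天去图书馆借了书。"    # corrected reference

batch = tokenizer(src, text_target=tgt, return_tensors="pt")
loss = model(input_ids=batch["input_ids"],
             attention_mask=batch["attention_mask"],
             labels=batch["labels"]).loss  # cross-entropy against the target
loss.backward()
```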

Candidate Datasets

  1. Candidate Sentence Generation
  • We use the baseline model to generate candidate sentences for the training and test sets.
  • On tasks where the model fits the training data closely (e.g., spelling correction), we recommend the K-fold cross-inference from the paper to generate candidate sentences for the training set separately; a sketch follows after this step's command.
python ./src/predict_bl_tsv.py
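
A sketch of the K-fold cross-inference idea: candidates for each training fold are generated by a baseline trained on the remaining folds, so no sentence is decoded by a model that saw it during training. train_baseline and generate_candidate are hypothetical placeholders for the repo's actual training and inference entry points:

```python
# K-fold cross-inference sketch (illustrative, not the repo's implementation).
def train_baseline(pairs):                # placeholder for run_bart_baseline.sh
    raise NotImplementedError

def generate_candidate(model, sentence):  # placeholder for predict_bl_tsv.py
    raise NotImplementedError

def kfold_candidates(pairs, k=5):
    """pairs: list of (source, reference); returns one candidate per source."""
    folds = [list(range(i, len(pairs), k)) for i in range(k)]
    candidates = [None] * len(pairs)
    for held_out in folds:
        held = set(held_out)
        fold_model = train_baseline([p for j, p in enumerate(pairs) if j not in held])
        for j in held_out:
            candidates[j] = generate_candidate(fold_model, pairs[j][0])
    return candidates
```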
  2. Golden Label Merging
  • We use the ChERRANT tool to merge the correct (golden) labels into the candidates, fully decoupling the error correction task from the rewriting task; a simplified sketch follows after the command below.
python ./scorer_wapper/golden_label_merging.py
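
One plausible reading of this step, sketched with simplified span edits (the real script relies on ChERRANT alignments; treat the logic below as an assumption, not the repo's implementation): keep only the candidate edits that also appear in the gold edit set, so the rewriting target is reachable from the candidate purely by dropping over-corrections.

```python
# Simplified golden-label merging sketch. Edits are (start, end, replacement)
# spans over the source; ChERRANT extracts them properly in the real pipeline.
def apply_edits(src, edits):
    out, pos = [], 0
    for start, end, rep in sorted(edits):  # assumes non-overlapping spans
        out.append(src[pos:start])
        out.append(rep)
        pos = end
    out.append(src[pos:])
    return "".join(out)

src = "他昨天去了图书馆借书了。"
gold_edits = {(4, 5, ""), (9, 9, "了")}    # toy gold edits
cand_edits = {(4, 5, ""), (1, 3, "前天")}  # one correct edit, one over-correction
target = apply_edits(src, cand_edits & gold_edits)  # keeps only verified edits
```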

LM-Combiner (GPT-2)

  • Subsequently, we train LM-Combiner on the constructed candidate dataset.
  • In particular, we supplement the GPT-2 vocabulary (mainly with double quotes) to better fit the FCGEC dataset; see ./pt_model/gpt2-base/vocab.txt for details.
sh ./script/run_lm_combiner.sh
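
The vocabulary extension can also be reproduced programmatically. A sketch with transformers, assuming ./pt_model/gpt2-base is a Hugging Face-format checkpoint (Chinese GPT-2 checkpoints typically ship a BERT-style vocab.txt and tokenizer):

```python
# Sketch: supplementing the GPT-2 vocab with tokens missing from the base
# vocabulary (e.g. Chinese double quotes) and resizing the embeddings to match.
from transformers import BertTokenizer, GPT2LMHeadModel

tokenizer = BertTokenizer.from_pretrained("./pt_model/gpt2-base")
model = GPT2LMHeadModel.from_pretrained("./pt_model/gpt2-base")

num_added = tokenizer.add_tokens(["“", "”"])
if num_added:
    model.resize_token_embeddings(len(tokenizer))
```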

Evaluation

  • We evaluate the model on FCGEC-dev with the official ChERRANT scripts.
sh ./script/compute_score.sh
| method        | Prec  | Rec   | F0.5  |
|---------------|-------|-------|-------|
| bart_baseline | 28.88 | 38.95 | 30.46 |
| +lm_combiner  | 52.15 | 37.41 | 48.34 |
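
As a sanity check, F0.5 is fully determined by precision and recall. Recomputing from the rounded scores above reproduces 48.34 for +lm_combiner exactly and gives ≈30.45 for the baseline, consistent with the corrected baseline F0.5 of 30.46 (the card originally printed 40.46, which cannot follow from its own precision and recall):

```python
# F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta = 0.5
def f_beta(p, r, beta=0.5):
    return (1 + beta**2) * p * r / (beta**2 * p + r)

print(round(f_beta(28.88, 38.95), 2))  # 30.45 (bart_baseline, up to P/R rounding)
print(round(f_beta(52.15, 37.41), 2))  # 48.34 (+lm_combiner)
```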

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{wang-etal-2024-lm-combiner,
    title = "{LM}-Combiner: A Contextual Rewriting Model for {C}hinese Grammatical Error Correction",
    author = "Wang, Yixuan  and
      Wang, Baoxin  and
      Liu, Yijun  and
      Wu, Dayong  and
      Che, Wanxiang",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.934",
    pages = "10675--10685",
}