Machine-generated text-detection by fine-tuning of language models

This project is related to a bachelor's thesis with the title "Turning Poachers into Gamekeepers: Detecting Machine-Generated Text in Academia using Large Language Models" (see here) written by Nicolai Thorer Sivesind and Andreas Bentzen Winje at the Department of Computer Science at the Norwegian University of Science and Technology.

It contains text classification models trained to distinguish human-written text from text generated by language models like ChatGPT and GPT-3. The best models achieved an accuracy of 100% on real and GPT-3-generated Wikipedia articles (4500 samples), and an accuracy of 98.4% on real and ChatGPT-generated research abstracts (3000 samples).

The dataset card for the dataset that was created in relation to this project can be found here.

NOTE: the hosted inference on this site only works for the RoBERTa-models, and not for the Bloomz-models. The Bloomz-models can also produce wrong predictions if the attention mask from the tokenizer is not explicitly passed to the model at inference. To be safe, use the pipeline-library, which seems to produce the most consistent results.
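The two inference routes mentioned in the note can be sketched as follows. This is a minimal sketch, assuming the `transformers` and `torch` packages are installed. The model id is taken from the citation URL on this page, and the function names are hypothetical. Label names are read from each model's own config, so none are hard-coded here.

```python
MODEL_ID = "andreas122001/roberta-academic-detector"  # id from the citation URL


def classify_with_pipeline(texts, model_id=MODEL_ID):
    """Classify texts via the pipeline API, which handles tokenization and masks."""
    from transformers import pipeline  # imported lazily: heavy dependency

    detector = pipeline("text-classification", model=model_id)
    return detector(list(texts), truncation=True)


def classify_with_explicit_mask(texts, model_id=MODEL_ID):
    """Classify texts manually, passing the tokenizer's attention mask to the model."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # The tokenizer returns both input_ids and attention_mask.
    inputs = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],  # explicit, as the note advises
        ).logits

    probs = logits.softmax(dim=-1)
    best = probs.argmax(dim=-1)
    return [
        {"label": model.config.id2label[int(i)], "score": float(p[int(i)])}
        for i, p in zip(best, probs)
    ]
```

Calling `classify_with_pipeline(["Some abstract text ..."])` downloads the model on first use and returns one label/score dictionary per input text.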

Fine-tuned detectors

This project includes 12 fine-tuned models, based on the RoBERTa-base model and three sizes of the Bloomz-models.

Base-model	RoBERTa-base	Bloomz-560m	Bloomz-1b7	Bloomz-3b
Wiki	roberta-wiki	Bloomz-560m-wiki	Bloomz-1b7-wiki	Bloomz-3b-wiki
Academic	roberta-academic	Bloomz-560m-academic	Bloomz-1b7-academic	Bloomz-3b-academic
Mixed	roberta-mixed	Bloomz-560m-mixed	Bloomz-1b7-mixed	Bloomz-3b-mixed
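The grid above (four base models by three domains) can be enumerated programmatically. The helper below is a hypothetical sketch: it assumes every repository id follows the pattern `andreas122001/<base>-<domain>-detector`, which matches the RoBERTa academic id in the citation URL on this page but is an unverified assumption for the Bloomz variants, including their lower-casing.

```python
BASE_MODELS = ("roberta", "bloomz-560m", "bloomz-1b7", "bloomz-3b")
DOMAINS = ("wiki", "academic", "mixed")


def detector_repo_id(base, domain, owner="andreas122001"):
    """Build a repository id under the ASSUMED naming pattern described above."""
    base, domain = base.lower(), domain.lower()
    if base not in BASE_MODELS:
        raise ValueError(f"unknown base model: {base!r}")
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    return f"{owner}/{base}-{domain}-detector"


# All 12 combinations from the table, in row-major order of the base models.
ALL_DETECTORS = [detector_repo_id(b, d) for b in BASE_MODELS for d in DOMAINS]
```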

Datasets

The models were trained on selections from the GPT-wiki-intros and ChatGPT-Research-Abstracts datasets, and are separated into three types according to their training data: wiki-detectors, academic-detectors and mixed-detectors.

Wiki-detectors:
- Trained on 30'000 datapoints (10%) of GPT-wiki-intros.
- Best model (in-domain) is Bloomz-3b-wiki, with an accuracy of 100%.
Academic-detectors:
- Trained on 20'000 datapoints (100%) of ChatGPT-Research-Abstracts.
- Best model (in-domain) is Bloomz-3b-academic, with an accuracy of 98.4%.
Mixed-detectors:
- Trained on 15'000 datapoints (5%) of GPT-wiki-intros and 10'000 datapoints (50%) of ChatGPT-Research-Abstracts.
- Best model (in-domain) is RoBERTa-mixed, with an F1-score of 99.3%.
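As a quick consistency check on the figures above, the stated counts and percentages imply a total size for each source dataset. The sketch below only does that arithmetic; the function and variable names are hypothetical, and the totals are derived from the numbers in the list rather than read from the datasets themselves.

```python
# (datapoints selected, percent of source dataset), as stated in the list above.
SELECTIONS = {
    "wiki": {"GPT-wiki-intros": (30_000, 10)},
    "academic": {"ChatGPT-Research-Abstracts": (20_000, 100)},
    "mixed": {
        "GPT-wiki-intros": (15_000, 5),
        "ChatGPT-Research-Abstracts": (10_000, 50),
    },
}


def implied_total(n_selected, percent):
    """Total dataset size implied by 'n_selected is percent% of the dataset'."""
    return round(n_selected * 100 / percent)


def implied_dataset_sizes(selections=SELECTIONS):
    """Collect every implied total per dataset; one value per set means consistent."""
    sizes = {}
    for parts in selections.values():
        for dataset, (n_selected, percent) in parts.items():
            sizes.setdefault(dataset, set()).add(implied_total(n_selected, percent))
    return sizes


def mixed_training_size(selections=SELECTIONS):
    """Number of datapoints used for the mixed-detectors."""
    return sum(n for n, _ in selections["mixed"].values())
```

Each dataset maps to a single implied total, so the percentages quoted for the wiki- and mixed-detectors agree with each other.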

Hyperparameters

All models were trained using the same hyperparameters:

{
 "num_train_epochs": 1,
 "adam_beta1": 0.9,
 "adam_beta2": 0.999,
 "batch_size": 8,
 "adam_epsilon": 1e-08,
 "optim": "adamw_torch", # the optimizer (AdamW)
 "learning_rate": 5e-05, # (LR)
 "lr_scheduler_type": "linear", # scheduler type for LR
 "seed": 42, # seed for the PyTorch RNG
}
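To illustrate what the "linear" scheduler type does with a learning rate of 5e-05, the sketch below computes the learning rate at a given optimizer step. It is a stand-alone approximation, not the training code used for these models: it assumes zero warmup steps by default (no warmup setting is listed above), and the total step count in the usage note is a hypothetical value.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an ASSUMPTION; the hyperparameters above list no warmup.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With a hypothetical run of 1000 steps, `linear_lr(0, 1000)` gives the full 5e-05, `linear_lr(500, 1000)` gives half of it, and `linear_lr(1000, 1000)` gives 0.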

Metrics

Metrics can be found at https://wandb.ai/idatt2900-072/IDATT2900-072.

In-domain performance of wiki-detectors (in all tables, * marks the best score in each column):

Base model	Accuracy	Precision	Recall	F1-score
Bloomz-560m	0.973	*1.000	0.945	0.972
Bloomz-1b7	0.972	*1.000	0.945	0.972
Bloomz-3b	*1.000	*1.000	*1.000	*1.000
RoBERTa	0.998	0.999	0.997	0.998

In-domain performance of academic-detectors:

Base model	Accuracy	Precision	Recall	F1-score
Bloomz-560m	0.964	0.963	0.965	0.964
Bloomz-1b7	0.946	0.941	0.951	0.946
Bloomz-3b	*0.984	*0.983	0.985	*0.984
RoBERTa	0.982	0.968	*0.997	0.982
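The F1-scores in the two tables above are the harmonic mean of the precision and recall columns. The sketch below shows the standard definitions of all four metrics; the function names are hypothetical, and treating one class as the positive class in `classification_metrics` is an assumption, since the tables do not say which class counts as positive.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }
```

For example, `round(f1_score(1.000, 0.945), 3)` reproduces the 0.972 reported for Bloomz-560m on the wiki data, and `round(f1_score(0.968, 0.997), 3)` reproduces the 0.982 reported for RoBERTa on the academic data.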

F1-scores of the mixed-detectors on all three datasets (CRA = ChatGPT-Research-Abstracts):

Base model	Mixed	Wiki	CRA
Bloomz-560m	0.948	0.972	*0.848
Bloomz-1b7	0.929	0.964	0.816
Bloomz-3b	0.988	0.996	0.772
RoBERTa	*0.993	*0.997	0.829

Credits

GPT-wiki-intro, by Aaditya Bhat
arxiv-abstracts-2021, by Giancarlo
Bloomz, by BigScience
RoBERTa, by Liu et al.

Citation

Please use the following citation:

@misc {sivesind_2023,
    author       = { {Nicolai Thorer Sivesind} and {Andreas Bentzen Winje} },
    title        = { Machine-generated text-detection by fine-tuning of language models },
    url          = { https://huggingface.co/andreas122001/roberta-academic-detector },
    year         = 2023,
    publisher    = { Hugging Face }
}
