DistilBERT LLM Generated Text Detector
This is a fine-tuned DistilBERT-base model that is able to detect whether an essay was written by a large language model (LLM) or a student.
Model Details
Model Description
The model is a fine-tuned DistilBERT-base model that can detect whether an essay was written by an LLM or a student. The model was trained on data drawn from 8 sources containing a diverse mix of student-written and LLM-written essays. The full dataset contains 4985 prompts; 1200 prompts were used for training and 1200 for validation. Since some prompts had more essays than others, the training set came to around 40000 essays and the validation set to around 5000 essays. To test generalization, the model was evaluated on the LLM - Detect AI Generated Text Kaggle competition. Since the competition test data exhibits some engineered noise, the model was further tested on a collection of 5000 stories (2500 human-written and 2500 LLM-generated).
When an essay is fed into the model, the model returns the probability that the essay was written by an LLM. However, experiments showed that a 50% decision boundary performs poorly. Thus, I recommend using a decision boundary of around 70% if you intend to use this model.
Application
You can use the model by simply using these lines:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("JinalShah2002/distilbert-detector")
model = AutoModelForSequenceClassification.from_pretrained("JinalShah2002/distilbert-detector")
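For end-to-end inference, a sketch like the following could work. Note one assumption not stated in this card: I'm assuming logit index 1 corresponds to the "LLM-written" class (verify via model.config.id2label before relying on it). The 0.70 threshold is the decision boundary recommended above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "JinalShah2002/distilbert-detector"

def load_detector():
    # Download the fine-tuned tokenizer and model from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    return tokenizer, model

def llm_probability(essay: str, tokenizer, model) -> float:
    # Tokenize (truncating to the model's max length) and run a forward pass
    inputs = tokenizer(essay, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumption: index 1 is the "LLM-written" class; check model.config.id2label
    return torch.softmax(logits, dim=-1)[0, 1].item()

def decide(prob_llm: float, threshold: float = 0.70) -> str:
    # Apply the ~70% decision boundary recommended above
    return "LLM" if prob_llm >= threshold else "student"
```

Usage would then be `decide(llm_probability(text, *load_detector()))`.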
Preprocessing
Simple preprocessing was done for the model. Specifically, I lowercased each essay, replaced all "\n" and "\t" characters with empty strings, and replaced '\xa0' (a non-breaking space) with a regular space. These decisions were made so that the model bases its predictions on sentence structure and content rather than on formatting characters like "\n". For reference, this is the preprocessing function I used:
import re

def preprocess(essay: str) -> str:
    preprocessed_essay = essay.lower()
    # Subbing out \n and \t
    preprocessed_essay = re.sub("\n", "", preprocessed_essay)
    preprocessed_essay = re.sub("\t", "", preprocessed_essay)
    # Replacing \xa0 (non-breaking space in Latin-1) with a regular space
    preprocessed_essay = preprocessed_essay.replace(u'\xa0', u' ')
    return preprocessed_essay
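Applied to a sample string, the function behaves as below (the function is restated so the snippet is self-contained; the sample text is purely illustrative):

```python
import re

def preprocess(essay: str) -> str:
    preprocessed_essay = essay.lower()
    # Remove newlines and tabs, then normalize non-breaking spaces
    preprocessed_essay = re.sub("\n", "", preprocessed_essay)
    preprocessed_essay = re.sub("\t", "", preprocessed_essay)
    preprocessed_essay = preprocessed_essay.replace(u'\xa0', u' ')
    return preprocessed_essay

print(preprocess("Dear Senator,\n\tThe Electoral\xa0College..."))
# -> dear senator,the electoral college...
```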
Evaluation
For evaluation, I utilized the ROC AUC metric. On the training set, the model achieved a ROC AUC of approximately 1. On validation, the ROC AUC was approximately 0.97. When I tested the model on 5000 stories (2500 generated and 2500 human-written), the ROC AUC was again approximately 0.97. However, I should note that on the Kaggle competition the model achieved a weighted ROC AUC of 0.7728. Even accounting for the engineered noise in the competition test data, this result shows that the model can be susceptible to such noise, so one should keep this limitation in mind.
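For reference, ROC AUC (the metric used above) can be computed with scikit-learn. The labels and scores here are toy values for illustration, not the actual evaluation data:

```python
from sklearn.metrics import roc_auc_score

# 1 = LLM-written, 0 = student-written (toy labels and model scores)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# 3 of the 4 (negative, positive) pairs are ranked correctly -> AUC = 0.75
print(roc_auc_score(y_true, y_score))  # -> 0.75
```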
More Information
I built an application that allows users to get predictions immediately! Feel free to check it out at https://verifyai.streamlit.app/.
Model Card Authors
Model Card Contact
If you have any questions about the model, feel free to reach out to me on LinkedIn!