RoBERTa-base AI Text Detector

A fine-tuned RoBERTa-base model for detecting AI-generated English text.

See FakespotAILabs/ApolloDFT for more details and a technical report on the model and the experiments we conducted.

How to use

You can use this model directly with a transformers pipeline.

For better performance, apply the clean_text function in utils.py to each text before classifying it.

from transformers import pipeline
from utils import clean_text

classifier = pipeline(
    "text-classification",
    model="fakespotailabs/roberta-base-ai-text-detection-v1"
)

# single text
text = "text 1"
result = classifier(clean_text(text))
# result is a list with one dict:
# [{'label': str, 'score': float}]

# list of texts
texts = ["text 1", "text 2"]
results = classifier([clean_text(t) for t in texts])
# results is a list with one dict per input text:
# [{'label': str, 'score': float}, {'label': str, 'score': float}]
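The pipeline returns one dict per input, in input order, so results can be paired back with their texts. The following is a minimal sketch of that pairing. The helper name `summarize` and the label strings "AI" and "Human" are hypothetical, since the model card does not list the label names; the predictions are made-up values shaped like the pipeline output above.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def summarize(texts: List[str], predictions: List[Prediction]) -> List[str]:
    """Pair each input text with its predicted label and score."""
    if len(texts) != len(predictions):
        raise ValueError("texts and predictions must have the same length")
    lines = []
    for text, pred in zip(texts, predictions):
        # Shorten long inputs so each summary line stays readable.
        preview = text if len(text) <= 40 else text[:37] + "..."
        lines.append(f"{pred['label']} ({pred['score']:.2f}): {preview}")
    return lines


# Hypothetical predictions; real label names and scores come from the model.
example_predictions = [
    {"label": "AI", "score": 0.97},
    {"label": "Human", "score": 0.88},
]
example_summary = summarize(["text 1", "text 2"], example_predictions)
for line in example_summary:
    print(line)
```

With a real pipeline, `predictions` would be the list returned by `classifier([clean_text(t) for t in texts])`.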

Disclaimer

  • The model's score estimates how likely the input text is to be AI-generated or human-written. It does not indicate what proportion of the text is AI-generated or human-written.
  • The accuracy and performance of the model generally improve with longer text inputs.
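One way to act on both points is to turn each prediction into a single "likelihood of AI" value and to withhold a verdict on very short inputs. This is only a sketch under stated assumptions: it assumes the model has exactly two labels, that the AI label is named "AI" (check the model's config for the real name), and it uses an arbitrary 50-word cutoff that does not come from the model card. The function name `ai_likelihood` is hypothetical.

```python
from typing import Dict, Optional, Union

AI_LABEL = "AI"   # assumption: replace with the label name in the model config
MIN_WORDS = 50    # assumption: illustrative cutoff, not a documented threshold


def ai_likelihood(
    prediction: Dict[str, Union[str, float]],
    text: str,
    ai_label: str = AI_LABEL,
    min_words: int = MIN_WORDS,
) -> Optional[float]:
    """Return the estimated likelihood that `text` is AI-generated.

    Returns None for inputs shorter than `min_words`, where the model is
    expected to be less reliable. The value is a likelihood for the whole
    text, not the fraction of the text that is AI-generated.
    """
    if len(text.split()) < min_words:
        return None
    score = float(prediction["score"])
    # With two labels, the score of the other label is 1 - score.
    return score if prediction["label"] == ai_label else 1.0 - score


sample_text = " ".join(["word"] * 60)
print(ai_likelihood({"label": "AI", "score": 0.9}, sample_text))
print(ai_likelihood({"label": "AI", "score": 0.9}, "too short"))
```

Calling the pipeline with `top_k=None` returns a score for every label, which would avoid the `1 - score` step.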