HAID-109M (fine-tuned from bert-base-uncased)

This is a 109M-parameter model, fine-tuned from bert-base-uncased, that detects whether a text was written by a human or by GPT. It outputs a score ranging from 0 (GPT-written) to 1 (human-written); occasionally the score can fall slightly outside that range, e.g. 1.01 or -0.013. The weights are published in Safetensors format (F32).
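Because of the occasional overshoot, downstream code may want to clamp the score before treating it as a probability. A minimal sketch, where the variable name score is illustrative and stands for the raw model output:

score = max(0.0, min(1.0, score))  # clamp the raw output into [0, 1]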

Example Inference Code

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model_name = "qingy2024/HAID-109M"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()  # make sure the model is in inference mode

# Example text to classify
text = (
    "Quantum mechanics is a fundamental branch of physics that describes the "
    "behavior of particles on very small scales, such as atoms and subatomic "
    "particles. It differs significantly from classical mechanics, which governs "
    "macroscopic objects, because it introduces concepts like wave-particle "
    "duality, uncertainty, and probabilistic outcomes."
)

# Tokenize the text (BERT accepts at most 512 tokens, so truncate longer inputs)
inputs = tokenizer(text, truncation=True, max_length=512, padding=True, return_tensors="pt")

# Perform inference
with torch.no_grad():  # disable gradient tracking for inference
    outputs = model(**inputs)
    prediction = outputs.logits.item()  # single regression output: the human-vs-GPT score

# Interpret the result
print(f"Prediction score: {prediction:.3f}")
if prediction >= 0.5:
    print("Text is likely human-written.")
else:
    print("Text is likely GPT-written.")

Output

Prediction score: 0.145
Text is likely GPT-written.
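To classify several texts at once, the tokenizer and model both accept batches. The sketch below is not part of the original card and the helper name classify_batch is illustrative; it pads the batch to a common length, squeezes the (batch_size, 1) logits down to one score per text, and clamps each score into [0, 1]:

def classify_batch(texts):
    # Tokenize all texts together, padding to the longest sequence in the batch
    inputs = tokenizer(texts, truncation=True, max_length=512, padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze(-1)  # shape: (batch_size,)
    # Clamp scores into [0, 1] since the model can overshoot slightly
    return [max(0.0, min(1.0, s)) for s in logits.tolist()]

scores = classify_batch(["First text to check.", "Second text to check."])
for s in scores:
    label = "human-written" if s >= 0.5 else "GPT-written"
    print(f"{s:.3f} -> likely {label}")

Note that truncation means the model only ever sees the first 512 tokens of each input; for longer documents, one option is to split the text into chunks, score each chunk, and average the scores.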