Text Readability Grade Predictor
This model predicts the reading grade level of English text. It is a ModernBERT model fine-tuned on texts with grade-level annotations, and it can be used to estimate the educational reading level of a text, from elementary school to college level.
Model Details
- Model Type: ModernBERT fine-tuned for regression
- Language: English
- Task: Text Readability Assessment (Regression)
- Framework: PyTorch
- Base Model: answerdotai/ModernBERT-base
- Training Data: CLEAR dataset
- Performance:
  - RMSE: 1.414
  - R²: 0.813
- Output: Predicted grade level (0-12)
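For readers unfamiliar with the two metrics listed above, the following is a minimal sketch of how RMSE and R² are conventionally computed for a regression model. The grade values are made-up illustrations, not data from this model's evaluation, and the function names `rmse` and `r_squared` are hypothetical helpers rather than part of the model's code.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, expressed in grade levels."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values only (not taken from the model's evaluation data).
true_grades = [2.0, 5.0, 8.0, 11.0]
predicted_grades = [3.0, 4.0, 9.0, 10.0]

print(f"RMSE: {rmse(true_grades, predicted_grades):.2f}")
print(f"R2: {r_squared(true_grades, predicted_grades):.2f}")
```

Read this way, the reported RMSE says how far, in grade levels, predictions tend to fall from the annotated grade, and the reported R² says what share of the variance in annotated grades the predictions account for.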
Usage
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("kiddom/modernbert-readability-grade-predictor")
tokenizer = AutoTokenizer.from_pretrained("kiddom/modernbert-readability-grade-predictor")
# Prepare text
text = "Your text goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# Run inference
with torch.no_grad():
    outputs = model(**inputs)
# Get prediction (ensure it's between 0 and 12)
pred_grade = outputs.logits.item()
pred_grade = max(0, min(pred_grade, 12.0))
print(f"Predicted grade level: {pred_grade:.1f}")
Reading Level Categories
The predicted grade levels correspond to these educational categories:
- < 1.0: Pre-Kindergarten
- 1.0 - 2.9: Early Elementary
- 3.0 - 5.9: Elementary
- 6.0 - 8.9: Middle School
- 9.0 - 11.9: High School
- 12.0+: College Level
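The category bands above can be applied to a prediction with a small lookup function. This is a minimal sketch: `grade_to_category` is a hypothetical helper, not part of the model, and it assumes each band is half-open (for example, 2.95 is treated as Early Elementary), since the list does not say how values between bands are handled.

```python
def grade_to_category(grade: float) -> str:
    """Map a predicted grade level to the category bands listed above.

    Assumption: bands are half-open intervals, e.g. [1.0, 3.0) is Early Elementary.
    """
    if grade < 1.0:
        return "Pre-Kindergarten"
    if grade < 3.0:
        return "Early Elementary"
    if grade < 6.0:
        return "Elementary"
    if grade < 9.0:
        return "Middle School"
    if grade < 12.0:
        return "High School"
    return "College Level"

# Example grade values chosen for illustration.
for grade in (1.2, 8.9, 11.6):
    print(f"{grade:.1f} -> {grade_to_category(grade)}")
```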
Example Predictions
Example: Early Elementary
The cat sat on the mat. It was happy. The sun was shining.
Predicted Grade Level: 1.2
Example: Middle School
The water cycle is a continuous process that includes evaporation, condensation, and precipitation. ...
Predicted Grade Level: 8.9
Example: High School
The quantum mechanical model of atomic structure provides a theoretical framework for understanding ...
Predicted Grade Level: 11.6
Limitations
- The model is trained on English text only
- Performance may vary for specialized or technical content
- Very short texts (fewer than 10 words) may not yield accurate predictions
- The model is calibrated for US educational grade levels
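One way to act on the short-text limitation is to check the word count before trusting a prediction. The sketch below is an assumption about how a caller might do this: `is_long_enough` is a hypothetical helper, and it counts whitespace-separated words, which may differ from how the model's tokenizer splits text.

```python
def is_long_enough(text: str, min_words: int = 10) -> bool:
    """Return True if the text has at least `min_words` whitespace-separated words."""
    return len(text.split()) >= min_words

samples = [
    "Your text goes here.",
    "The cat sat on the mat. It was happy. The sun was shining.",
]

for sample in samples:
    if is_long_enough(sample):
        print(f"OK to score: {sample!r}")
    else:
        print(f"Too short for a reliable prediction: {sample!r}")
```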
Training
This model was fine-tuned on a custom dataset created by augmenting texts from various grade levels. The training process involved:
- Collecting texts with known Lexile measures and Flesch-Kincaid Grade Levels
- Augmenting the dataset through text chunking
- Averaging grade level metrics for a more reliable target
- Fine-tuning ModernBERT with a regression head
- Optimizing for minimum RMSE and maximum R²
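The chunking and target-averaging steps above might look roughly like the sketch below. It is not the authors' pipeline: the chunk size, the word-based splitting, the choice to give every chunk its parent text's target, and the helper names are all assumptions made for illustration. It also assumes each metric has already been expressed as a grade level, since the card does not describe how Lexile measures were converted.

```python
def chunk_words(text, chunk_size=200):
    """Split text into consecutive chunks of at most `chunk_size` words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]

def average_grade(grade_estimates):
    """Average several grade-level estimates into one regression target."""
    return sum(grade_estimates) / len(grade_estimates)

def build_examples(text, grade_estimates, chunk_size=200):
    """Pair every chunk of a text with the averaged grade of the whole text."""
    target = average_grade(grade_estimates)
    return [(chunk, target) for chunk in chunk_words(text, chunk_size)]

# Hypothetical document with two grade-level estimates (illustrative values).
document = "one two three four five"
examples = build_examples(document, grade_estimates=[4.0, 6.0], chunk_size=2)
for chunk, target in examples:
    print(f"{target:.1f}\t{chunk}")
```

Each resulting pair is one regression training example: the chunk is the input text and the averaged grade is the label.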