The Sentence Dating Model is a fine-tuned RoBERTa-large transformer that predicts the decade in which a given sentence was written. It is trained on historical text data to classify sentences into decades spanning 1700 to 2021, which makes it useful for historical linguistics, text dating, and semantic change studies.
This model is based on the work described in:
Sense-specific Historical Word Usage Generation
Pierluigi Cassotti, Nina Tahmasebi
University of Gothenburg
[Link to Paper]
Base model: roberta-large
Tokenizer: AutoTokenizer.from_pretrained("roberta-large")
The model is trained on a dataset derived from historical text corpora, including examples extracted from the Oxford English Dictionary (OED).
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ChangeIsKey/text-dating")
model = AutoModelForSequenceClassification.from_pretrained("ChangeIsKey/text-dating")
# Example text
text = "He put the phone back in the cradle and turned toward the kitchen."
# Tokenize input
inputs = tokenizer(text, return_tensors="pt")
# Predict
with torch.no_grad():
    outputs = model(**inputs)
predicted_label = torch.argmax(outputs.logits, dim=1).item()
print(f"Predicted decade: {1700 + predicted_label * 10}")
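The snippet above reports only the single most likely decade. A minimal sketch of how the raw logits could be turned into a ranked list of decades with probabilities is given below. It assumes, as the snippet above does, that label 0 corresponds to the 1700s and each subsequent label advances by ten years. The helper names (`label_to_decade`, `softmax`, `top_decades`) and the example logits are hypothetical and are not part of the model or the library.

```python
import math

START_DECADE = 1700  # assumption: label 0 is the 1700s, as in the snippet above
DECADE_SPAN = 10     # assumption: one label per decade


def label_to_decade(label: int) -> int:
    """Map a class index to the first year of its decade."""
    return START_DECADE + label * DECADE_SPAN


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_decades(logits, k=3):
    """Return the k most probable (decade, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(label_to_decade(i), p) for i, p in ranked[:k]]


# Made-up logits for illustration only; with the real model, pass
# outputs.logits[0].tolist() instead.
example_logits = [0.1, 2.0, 0.5, 1.5]
for decade, prob in top_decades(example_logits, k=2):
    print(f"{decade}s: {prob:.2f}")
```

Inspecting the top few decades rather than only the argmax can help when neighbouring decades receive similar scores, which is plausible for sentences with little period-specific vocabulary.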
If you use this model, please cite:
@article{cassotti2025,
  author = {Cassotti, Pierluigi and Tahmasebi, Nina},
  title = {Sense-specific Historical Word Usage Generation},
  journal = {TACL},
  year = {2025}
}
MIT License
Base model: FacebookAI/roberta-large