How to use TheSon2202/Temporal-MoEs-RoBERTa with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="TheSon2202/Temporal-MoEs-RoBERTa")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("TheSon2202/Temporal-MoEs-RoBERTa")
model = AutoModelForSequenceClassification.from_pretrained("TheSon2202/Temporal-MoEs-RoBERTa")
```
Temporal-MoEs-RoBERTa
Overview
Temporal-MoEs-RoBERTa is a fine-tuned model based on cardiffnlp/twitter-roberta-base-sentiment-latest, augmented with a Temporal Mixture-of-Experts (MoEs) architecture. This model was developed by the CITD@UIT research team for the SemEval-2026 Task 2: Subtask 2A (State Change Detection).
Architecture
The model integrates a standard RoBERTa backbone with a specialized Temporal MoE layer designed to capture sequential dependencies and state transition patterns in sentiment-labeled text data.
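This card does not describe how the Temporal MoE layer routes or combines its experts. As a rough illustration of the general mixture-of-experts idea only, the sketch below shows dense (soft) gating in plain Python: a gate produces softmax weights over the experts, and the output is the weighted sum of the expert outputs. The dimensions, the linear experts, the random initialization, and all function names are assumptions made for this example, not the team's implementation; only the expert count of 4 comes from the training configuration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Compute W x + b, where weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def make_linear(rng, out_dim, in_dim):
    """Create a small randomly initialized (weights, bias) pair (illustrative only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def moe_forward(x, gate, experts):
    """Soft-gated mixture of experts: weighted sum of all expert outputs."""
    gate_weights = softmax(linear(*gate, x))
    expert_outputs = [linear(w, b, x) for w, b in experts]
    out_dim = len(expert_outputs[0])
    mixed = [
        sum(g * out[i] for g, out in zip(gate_weights, expert_outputs))
        for i in range(out_dim)
    ]
    return mixed, gate_weights


rng = random.Random(0)
HIDDEN, OUT, N_EXPERT = 8, 2, 4  # HIDDEN and OUT are hypothetical; N_EXPERT = 4 as in the config
gate = make_linear(rng, N_EXPERT, HIDDEN)
experts = [make_linear(rng, OUT, HIDDEN) for _ in range(N_EXPERT)]
x = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)]  # stand-in for a pooled encoder vector

y, g = moe_forward(x, gate, experts)
print("gate weights:", g)
print("mixed output:", y)
```

In a real model the input vector would come from the RoBERTa encoder and the experts would be trained jointly with the gate; the temporal component of the actual layer is not captured here.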
Performance
The model placed 5th (excluding baselines) in SemEval-2026 Task 2, Subtask 2A.
Training Configuration
The model was trained using the following hyperparameters:
| Parameter | Value |
|---|---|
| Learning Rate | 2e-5 |
| Batch Size | 16 |
| Epochs | 8 |
| Weight Decay | 0.08 |
| LR Scheduler | Cosine |
| Warmup Ratio | 0.1 |
| Optimizer | AdamW (Torch) |
| Max Sequence Length | 512 |
| N_Expert | 4 |
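To make the learning-rate settings concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the learning rate (2e-5) and warmup ratio (0.1) from the table. The helper name `lr_at_step` and the total of 1000 steps are assumptions for illustration; the real step count depends on the dataset size, which this card does not state, and the exact warmup rounding in the training code may differ.

```python
import math


def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then half-cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; not taken from the actual training run
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

The rate rises linearly to its peak at the end of warmup (step 100 here), falls to about half the peak midway through the decay phase, and reaches zero at the final step.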
SemEval-2026 Subtask 2A Official Ranking
Top 6 results from the official leaderboard. Our entry, based on Temporal-MoEs-RoBERTa, is marked with an asterisk (*):
| Rank | Team | Valence (r) | Arousal (r) | V&A Average |
|---|---|---|---|---|
| 1 | UKP_Psycontrol | 0.675 | 0.683 | 0.679 |
| 2 | YNU | 0.692 | 0.647 | 0.669 |
| 3 | UAlberta | 0.615 | 0.674 | 0.645 |
| 4 | Ajman University | 0.615 | 0.670 | 0.642 |
| 5 | CITD@UIT* | 0.629 | 0.633 | 0.631 |
| 6 | CSIRO-LT | 0.621 | 0.477 | 0.549 |
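The sketch below shows how scores of this kind can be computed, assuming that `r` denotes the Pearson correlation between gold and predicted values and that the last column is the arithmetic mean of the valence and arousal correlations. Both are assumptions; the card does not define the metric, although the CITD@UIT row is consistent with the averaging: (0.629 + 0.633) / 2 = 0.631. The gold and predicted values in the example are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def va_average(valence_r, arousal_r):
    """Arithmetic mean of the valence and arousal correlations."""
    return (valence_r + arousal_r) / 2


# Hypothetical gold and predicted state changes, for illustration only.
gold_valence = [0.5, -1.0, 0.0, 1.5, -0.5]
pred_valence = [0.4, -0.6, 0.2, 1.0, -0.1]
gold_arousal = [1.0, 0.0, -0.5, 0.5, -1.0]
pred_arousal = [0.7, 0.1, -0.2, 0.2, -0.9]

r_valence = pearson_r(gold_valence, pred_valence)
r_arousal = pearson_r(gold_arousal, pred_arousal)
print("valence r:", r_valence)
print("arousal r:", r_arousal)
print("V&A average:", va_average(r_valence, r_arousal))
```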
Reproducibility
Code and training pipeline: https://github.com/PTSown0222/SemEval-2026-Task-2
Model Weights: https://huggingface.co/TheSon2202/Temporal-MoEs-RoBERTa
Citation
If you use this model or our approach, please cite our paper:
@inproceedings{phuong-etal-2026-citd-uit,
author = {Son The Phuong and My Thuy-Tra Ngo and Tri Minh Dao and Duc-Vu Nguyen},
title = {CITD@UIT at SemEval-2026 Task 2: Temporal Mixture-of-Experts for Longitudinal Valence and Arousal Prediction from Ecological Essays},
booktitle = {Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)},
year = {2026},
note = {To appear}
}