RAL-RAG Concept-Map Proposition Scorer
Frank–Hall ordinal ensemble (XGBoost + LightGBM) for automated scoring of student concept-map propositions, compared under three conditions: No-RAG, RAG-Standard, and RAL (Rubric-Aligned Retrieval-Augmented).
Performance summary (5-fold GroupKFold by student UID, out-of-fold)
| Condition | QWK (mean) | QWK (sd) | Accuracy (mean) | MAE (mean) | RMSE (mean) |
|---|---|---|---|---|---|
| No-RAG | 0.4514 | 0.0199 | 0.6193 | 0.4964 | 0.8814 |
| RAG-Standard | 0.6333 | 0.0357 | 0.6903 | 0.3777 | 0.7325 |
| RAL | 0.7098 | 0.0413 | 0.7149 | 0.3242 | 0.6413 |
- Hierarchy check (RAL > RAG-Standard > No-RAG): MET
- Target QWK(RAL) >= 0.81: NOT YET MET (actual: 0.7098)
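For readers who want to reproduce the table's metrics on their own predictions, the following is a minimal sketch of quadratic weighted kappa (QWK), MAE, and RMSE for integer ordinal labels. The function names and the four-class example are illustrative assumptions, not code taken from the source notebook, and the QWK helper assumes at least two distinct labels occur.

```python
import math
from collections import Counter


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic weights for labels 0..n_classes-1."""
    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # joint counts of (true, predicted)
    true_hist = Counter(y_true)
    pred_hist = Counter(y_pred)
    observed_cost = 0.0  # weighted disagreement actually observed
    expected_cost = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            observed_cost += weight * observed[(i, j)] / n
            expected_cost += weight * true_hist[i] * pred_hist[j] / (n * n)
    return 1.0 - observed_cost / expected_cost


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Illustrative labels only (not data from this repository).
y_true = [0, 1, 2, 3, 2, 1]
y_pred = [0, 1, 2, 2, 2, 0]
print(quadratic_weighted_kappa(y_true, y_pred, n_classes=4))
print(mean_absolute_error(y_true, y_pred))
print(root_mean_squared_error(y_true, y_pred))
```

Because the quadratic weights penalize a prediction by the squared distance from the true score, QWK rewards near misses more than plain accuracy does, which is why it is the headline metric for ordinal scoring.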
Model architecture
- Decomposition: Frank–Hall ordinal decomposition (N_CLASSES - 1 binary "greater-than-cutoff" classifiers)
- Base learners per cutoff: XGBoost + LightGBM, blended via validation-tuned weight, isotonic-calibrated
- Feature selection: mutual-information + max-correlation redundancy filter (max_corr=0.85) for RAL-only features
- Cross-validation: GroupKFold (5 folds) grouped by student UID — prevents student-level leakage
- Retrieval stack (for RAG-Standard/RAL features): SBERT dense embeddings + FAISS, BM25 sparse, CrossEncoder reranking
- LLM-judge cascade (RAL only): multi-provider (OpenRouter / Groq / HuggingFace) ordinal judge features
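The Frank–Hall decomposition listed above can be sketched as follows: each cutoff k has a binary classifier estimating P(y > k), the two base learners' estimates are blended with a weight, and class probabilities are recovered as differences of adjacent cutoff probabilities. This is a minimal sketch under stated assumptions. The helper names `blend` and `frank_hall_class_probs` are hypothetical, the monotonic clipping step is one common way to handle inconsistent cutoff estimates and may differ from the notebook's `cutoffs_to_class_proba`, and the numbers are made up.

```python
def blend(p_xgb, p_lgbm, weight):
    """Convex blend of two probability estimates (weight applies to XGBoost)."""
    return weight * p_xgb + (1.0 - weight) * p_lgbm


def frank_hall_class_probs(cutoff_probs):
    """Turn [P(y > 0), ..., P(y > K-2)] into K class probabilities."""
    # Assumption: force the cutoff probabilities to be non-increasing so that
    # no class receives a negative probability.
    monotone = []
    upper = 1.0
    for p in cutoff_probs:
        upper = min(upper, max(0.0, p))
        monotone.append(upper)
    # P(y = k) = P(y > k-1) - P(y > k), with P(y > -1) = 1 and P(y > K-1) = 0.
    bounds = [1.0] + monotone + [0.0]
    probs = [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]
    total = sum(probs)  # equals 1 up to floating-point rounding
    return [p / total for p in probs]


# Illustrative values for a 4-class scale (3 cutoffs), not real model output.
xgb_cutoffs = [0.92, 0.62, 0.18]
lgbm_cutoffs = [0.88, 0.58, 0.22]
blended = [blend(a, b, weight=0.5) for a, b in zip(xgb_cutoffs, lgbm_cutoffs)]
class_probs = frank_hall_class_probs(blended)
predicted_class = max(range(len(class_probs)), key=class_probs.__getitem__)
print(class_probs, predicted_class)
```

In the actual pipeline the blend weight is tuned on validation data and the probabilities are isotonic-calibrated, neither of which this sketch reproduces.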
Files in this repo
- ral_ensemble_final.pkl: final RAL production model (cutoff-wise XGBoost+LightGBM dict), refit on all data
- results/fold_results.csv: per-fold metrics for all three conditions
- results/scoring_summary.csv: aggregated mean/std metrics per condition
- results/predictions_store.pkl: out-of-fold predictions & class probabilities per condition
- results/feature_matrices.pkl: final feature matrices (X_norag, X_rag, X_ral) + labels/folds/groups
- results/*.png: publication figures (performance bars, SHAP summaries, master figure, etc.)
- results/publication_manifest.txt: list of all generated artifacts
Intended use
Research artifact accompanying a Scopus Q2 manuscript on rubric-aligned RAG for automated concept-map proposition scoring. Not validated for high-stakes grading without further review.
How to load the model
import pickle
from huggingface_hub import hf_hub_download

# Download the pickled artifact from the Hub (cached locally after the first call).
path = hf_hub_download(repo_id="Maskur1109/ral-rag-concept-map-scorer", filename="ral_ensemble_final.pkl")
with open(path, "rb") as f:
    artifact = pickle.load(f)

cutoff_models = artifact["cutoff_models"]      # cutoff-wise XGBoost+LightGBM dict
feature_columns = artifact["feature_columns"]  # feature column names stored with the model
# Use predict_proba_cutoffs_v16 + cutoffs_to_class_proba from the source notebook
# to score new feature rows built with the same RAL feature pipeline.
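Before passing new rows to the notebook's scoring functions, it may help to check that each row carries every column named in `feature_columns`, in the same order. The helper below is a hypothetical sketch (`align_features` is not part of the repository), and the column names in the example are invented for illustration.

```python
def align_features(row, feature_columns):
    """Return the row's values ordered like feature_columns; fail on gaps."""
    missing = [name for name in feature_columns if name not in row]
    if missing:
        raise KeyError(f"missing feature columns: {missing}")
    # Extra keys in the row are ignored; only the stored columns are kept.
    return [float(row[name]) for name in feature_columns]


# Hypothetical column names; the real ones come from artifact["feature_columns"].
example_columns = ["sbert_cosine", "bm25_score", "judge_ordinal"]
example_row = {"bm25_score": 7.5, "judge_ordinal": 2, "sbert_cosine": 0.81, "unused": 1}
print(align_features(example_row, example_columns))
```

Failing loudly on a missing column is a deliberate choice here: silently filling defaults would hide a mismatch between the new features and the RAL feature pipeline used in training.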