RAL-RAG Concept-Map Proposition Scorer

Frank–Hall ordinal ensemble (XGBoost + LightGBM) for automated scoring of student concept-map propositions, compared under three conditions: No-RAG, RAG-Standard, and RAL (Rubric-Aligned Retrieval-Augmented).

Performance summary (5-fold GroupKFold by student UID, out-of-fold)

Condition	QWK (mean)	QWK (sd)	Accuracy (mean)	MAE (mean)	RMSE (mean)
No-RAG	0.4514	0.0199	0.6193	0.4964	0.8814
RAG-Standard	0.6333	0.0357	0.6903	0.3777	0.7325
RAL	0.7098	0.0413	0.7149	0.3242	0.6413

Hierarchy check (RAL > RAG-Standard > No-RAG): MET
Target QWK(RAL) >= 0.81: NOT YET MET (actual: 0.7098)
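
For readers unfamiliar with the headline metric, the following is a minimal pure-Python sketch of quadratic weighted kappa (QWK) for integer ordinal labels. It is an illustration only, not the evaluation code used to produce the table above; the function name `quadratic_weighted_kappa` and the assumption that labels run from 0 to n_classes - 1 are ours.

```python
from collections import Counter


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic weights.

    Assumes integer labels in 0..n_classes-1 (an assumption of this sketch).
    Returns NaN when the expected disagreement is zero (kappa is undefined).
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # joint counts of (true, predicted)
    true_counts = Counter(y_true)            # marginal counts of true labels
    pred_counts = Counter(y_pred)            # marginal counts of predictions
    max_dist = (n_classes - 1) ** 2

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / max_dist
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * true_counts[i] * pred_counts[j] / (n * n)

    if expected_disagreement == 0.0:
        return float("nan")
    return 1.0 - observed_disagreement / expected_disagreement


# Perfect agreement gives 1.0; chance-level agreement gives 0.0.
perfect = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], n_classes=4)
chance = quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], n_classes=2)
```

Because the weights grow with the squared distance between true and predicted class, a prediction that is two levels off is penalised four times as much as one that is one level off, which is why QWK suits ordinal rubric scores better than plain accuracy.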

Model architecture

Decomposition: Frank–Hall ordinal decomposition (N_CLASSES - 1 binary "greater-than-cutoff" classifiers)
Base learners per cutoff: XGBoost + LightGBM, blended via a validation-tuned weight and isotonic-calibrated
Feature selection: mutual-information + max-correlation redundancy filter (max_corr=0.85) for RAL-only features
Cross-validation: GroupKFold (5 folds) grouped by student UID, so that no student appears in both the training and validation split of a fold (prevents student-level leakage)
Retrieval stack (for RAG-Standard/RAL features): SBERT dense embeddings + FAISS, BM25 sparse, CrossEncoder reranking
LLM-judge cascade (RAL only): multi-provider (OpenRouter / Groq / HuggingFace) ordinal judge features
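
The Frank–Hall decomposition listed above can be sketched as follows: each of the N_CLASSES - 1 binary classifiers estimates P(y > k), and class probabilities are recovered as differences of adjacent cutoff probabilities. This is a minimal illustration, not the notebook's implementation; the function name `cutoff_probs_to_class_probs` is hypothetical, the four-class example is illustrative, and the clip-and-renormalise step for non-monotone cutoff probabilities is an assumption of this sketch.

```python
def cutoff_probs_to_class_probs(cutoff_probs):
    """Turn K-1 cutoff probabilities P(y > k) into K class probabilities.

    P(y = 0)     = 1 - P(y > 0)
    P(y = k)     = P(y > k-1) - P(y > k)   for 0 < k < K-1
    P(y = K-1)   = P(y > K-2)

    Independently trained cutoff classifiers need not be monotone, so
    negative differences are clipped to zero and the result is
    renormalised (an assumption of this sketch).
    """
    probs = list(cutoff_probs)
    if not probs:
        raise ValueError("need at least one cutoff probability")
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise ValueError("cutoff probabilities must lie in [0, 1]")

    upper = [1.0] + probs   # P(y > k-1), with P(y > -1) = 1
    lower = probs + [0.0]   # P(y > k),   with P(y > K-1) = 0
    raw = [max(u - l, 0.0) for u, l in zip(upper, lower)]
    total = sum(raw)
    return [r / total for r in raw]


# Illustrative 4-class example: three cutoff probabilities for one proposition.
class_probs = cutoff_probs_to_class_probs([0.9, 0.6, 0.2])
predicted_class = max(range(len(class_probs)), key=class_probs.__getitem__)
```

Here the class probabilities come out close to 0.1, 0.3, 0.4, and 0.2, so the predicted class is the one with index 2.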

Files in this repo

ral_ensemble_final.pkl — final RAL production model (cutoff-wise XGBoost+LightGBM dict), refit on all data
results/fold_results.csv — per-fold metrics for all three conditions
results/scoring_summary.csv — aggregated mean/std metrics per condition
results/predictions_store.pkl — out-of-fold predictions & class probabilities per condition
results/feature_matrices.pkl — final feature matrices (X_norag, X_rag, X_ral) + labels/folds/groups
results/*.png — publication figures (performance bars, SHAP summaries, master figure, etc.)
results/publication_manifest.txt — list of all generated artifacts

Intended use

Research artifact accompanying a Scopus Q2 manuscript on rubric-aligned RAG for automated concept-map proposition scoring. Not validated for high-stakes grading without further review.

How to load the model

import pickle
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="Maskur1109/ral-rag-concept-map-scorer", filename="ral_ensemble_final.pkl")
with open(path, "rb") as f:
    artifact = pickle.load(f)

cutoff_models = artifact["cutoff_models"]
feature_columns = artifact["feature_columns"]
# Use predict_proba_cutoffs_v16 + cutoffs_to_class_proba from the source notebook
# to score new feature rows built with the same RAL feature pipeline.
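
Before scoring, each new feature row has to be laid out in the same column order the model was fitted on. The helper below is a minimal sketch of that alignment step, assuming `feature_columns` is a list of column names and that a new row is available as a dict; the name `align_feature_row` is hypothetical and is not part of the source notebook.

```python
def align_feature_row(row, feature_columns):
    """Return the row's values ordered as in feature_columns.

    Extra keys in `row` are ignored; a missing feature raises KeyError
    rather than being silently filled, since a default value could
    distort the score.
    """
    missing = [name for name in feature_columns if name not in row]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(row[name]) for name in feature_columns]


# Hypothetical feature names, for illustration only.
example_columns = ["feat_a", "feat_b"]
example_row = {"feat_b": 2, "feat_a": 1, "unused": 9}
aligned = align_feature_row(example_row, example_columns)
```

The aligned list can then be passed, one row per proposition, to the scoring functions named in the comment above.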
