---
license: apache-2.0
---

# Model Card for XLM-Roberta-large-reflective-conf4

This is a reflectivity classification model trained to distinguish different types of reflectivity in the written reports of student teachers.

It was evaluated in a cross-lingual setting and was found to work well in languages other than English as well -- see the results in the referenced paper.

## Model Details

### Usage

To match the training format, it is best to use the prepared wrapper below, which formats the classified sentence and its surrounding context in the expected way:

```python
from typing import Optional

import torch
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


class NeuralClassifier:

    def __init__(self, model_path: str, uses_context: bool, device: str):
        self.config = AutoConfig.from_pretrained(model_path)
        self.device = device
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.uses_context = uses_context

    def predict_sentence(self, sentence: str, context: Optional[str] = None) -> str:
        if context is None and self.uses_context:
            raise ValueError("You need to pass in the context argument, including the sentence itself")

        # The sentence and its context are encoded as a standard sequence pair;
        # with context=None, the sentence is encoded alone.
        features = self.tokenizer(sentence, text_pair=context,
                                  padding="max_length", truncation=True, return_tensors="pt")
        with torch.no_grad():  # inference only, no gradients needed
            outputs = self.model(**features.to(self.device), return_dict=True)
        # pick the highest-scoring class and map it back to its string label
        argmax = outputs.logits.argmax(dim=-1).cpu().tolist()[0]

        return LABELS[argmax]
```

The wrapper can be used as follows:

```python
from tqdm import tqdm  # needed for the progress bar below

classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                              uses_context=False,
                              device="cpu")

test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
                  "I felt really well!",
                  "gfagdhj gjfdjgh dg"]

y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]

print(y_pred)
# ['Intention', 'Feeling', 'Other']
```
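
If you work with a context-aware variant of the model (`uses_context=True`), pass the surrounding context alongside the sentence; as the wrapper enforces, the context string must include the classified sentence itself. Here is a minimal sketch, reusing the checkpoint above purely to illustrate the interface (the context sentences below are made up; see the referenced paper for which checkpoints were trained with context):

```python
context_classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                                      uses_context=True,
                                      device="cpu")

sentence = "I felt really well!"
# note that the context includes the classified sentence itself
context = "The lesson went smoothly. I felt really well! The children were engaged."

print(context_classifier.predict_sentence(sentence, context=context))
```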

## Training Data

The model was trained on the CEReD dataset and aims for the best possible performance in cross-lingual settings (i.e., on languages unseen in training).

See the reproducible training script in the project directory: https://github.com/EduMUNI/reflection-classification
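
If you want to sanity-check the model on your own annotated sentences, here is a minimal evaluation sketch (the example sentences and gold labels are hypothetical, for illustration only; see the linked repository for the actual training and evaluation code):

```python
from sklearn.metrics import classification_report

# hypothetical gold-annotated examples, for illustration only
eval_sentences = ["I felt really well!",
                  "Next time I will prepare more examples for the children."]
gold_labels = ["Feeling", "Intention"]

predictions = [classifier.predict_sentence(s) for s in eval_sentences]
print(classification_report(gold_labels, predictions, labels=LABELS, zero_division=0))
```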

## Citation

If you use the model in scientific work, please cite it as follows:

```bibtex
@Article{Nehyba2022applications,
  author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal},
  title={Applications of deep language models for reflective writings},
  journal={Education and Information Technologies},
  year={2022},
  month={Sep},
  day={05},
  issn={1573-7608},
  doi={10.1007/s10639-022-11254-7},
  url={https://doi.org/10.1007/s10639-022-11254-7}
}
```