---
license: apache-2.0
---

# Model Card for XLM-Roberta-large-reflective-conf4

This is a reflectivity classification model trained to distinguish different types of reflectivity in the written reports of student teachers.

It was evaluated in a cross-lingual setting and was found to work well in languages other than English as well -- see the results in the referenced paper.

## Model Details

### Usage

To match the training format, it is best to use the prepared wrapper below, which formats the classified sentence and its surrounding context in the expected way:

```python
from typing import Optional

import torch
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


class NeuralClassifier:

    def __init__(self, model_path: str, uses_context: bool, device: str):
        self.config = AutoConfig.from_pretrained(model_path)
        self.device = device
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.uses_context = uses_context

    def predict_sentence(self, sentence: str, context: Optional[str] = None) -> str:
        if context is None and self.uses_context:
            raise ValueError("You need to pass in the context argument, including the sentence itself")

        # The sentence and its context are encoded as a standard sequence pair;
        # with context=None, the sentence is encoded alone.
        features = self.tokenizer(sentence, text_pair=context,
                                  padding="max_length", truncation=True, return_tensors="pt")
        with torch.no_grad():  # inference only, no gradients needed
            outputs = self.model(**features.to(self.device), return_dict=True)
        # pick the highest-scoring class and map it back to its string label
        argmax = outputs.logits.argmax(dim=-1).cpu().tolist()[0]

        return LABELS[argmax]
```

The wrapper can be used as follows:

```python
from tqdm import tqdm  # needed for the progress bar below

classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                              uses_context=False,
                              device="cpu")

test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
                  "I felt really well!",
                  "gfagdhj gjfdjgh dg"]

y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]

print(y_pred)
# ['Intention', 'Feeling', 'Other']
```
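
If you work with a context-aware variant of the model (`uses_context=True`), pass the surrounding context alongside the sentence; as the wrapper enforces, the context string must include the classified sentence itself. Here is a minimal sketch, reusing the checkpoint above purely to illustrate the interface (the context sentences below are made up; see the referenced paper for which checkpoints were trained with context):

```python
context_classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                                      uses_context=True,
                                      device="cpu")

sentence = "I felt really well!"
# note that the context includes the classified sentence itself
context = "The lesson went smoothly. I felt really well! The children were engaged."

print(context_classifier.predict_sentence(sentence, context=context))
```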

## Training Data

The model was trained on the CEReD dataset and aims for the best possible performance in cross-lingual settings (i.e., on languages unseen in training).

See the reproducible training script in the project directory: https://github.com/EduMUNI/reflection-classification
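
If you want to sanity-check the model on your own annotated sentences, here is a minimal evaluation sketch (the example sentences and gold labels are hypothetical, for illustration only; see the linked repository for the actual training and evaluation code):

```python
from sklearn.metrics import classification_report

# hypothetical gold-annotated examples, for illustration only
eval_sentences = ["I felt really well!",
                  "Next time I will prepare more examples for the children."]
gold_labels = ["Feeling", "Intention"]

predictions = [classifier.predict_sentence(s) for s in eval_sentences]
print(classification_report(gold_labels, predictions, labels=LABELS, zero_division=0))
```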

## Citation

If you use the model in scientific work, please cite it as follows:

```bibtex
@Article{Nehyba2022applications,
  author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal},
  title={Applications of deep language models for reflective writings},
  journal={Education and Information Technologies},
  year={2022},
  month={Sep},
  day={05},
  issn={1573-7608},
  doi={10.1007/s10639-022-11254-7},
  url={https://doi.org/10.1007/s10639-022-11254-7}
}
```