---
license: apache-2.0

widget:
- text: "One day I will be a real teacher and I will try to do the best I can for the children."
  example_title: "Classification (without context)"
---

# Model Card for XLM-Roberta-large-reflective-conf4

This is a reflectivity classification model trained to distinguish different types of reflectivity in teaching students' written reports.

It was evaluated in a cross-lingual setting and was found to also work well in languages other than English -- see the results in the referenced paper.

## Model Details

- **Repository:** https://github.com/EduMUNI/reflection-classification
- **Paper:** https://link.springer.com/article/10.1007/s10639-022-11254-7

- **Developed by:** Michal Stefanik & Jan Nehyba, Masaryk University
- **Model type:** XLM-RoBERTa (large)
- **Finetuned from model:** [XLM-R-large](https://huggingface.co/xlm-roberta-large)

## Usage

To match the training format, it is best to use the prepared wrapper below, which formats the classified sentence and its surrounding context the way the model expects:

```python
from typing import Optional

import torch
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]

class NeuralClassifier:

    def __init__(self, model_path: str, uses_context: bool, device: str):
        self.config = AutoConfig.from_pretrained(model_path)
        self.device = device
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.uses_context = uses_context

    def predict_sentence(self, sentence: str, context: Optional[str] = None) -> str:
        if context is None and self.uses_context:
            raise ValueError("You need to pass in the context argument, including the sentence")

        # Encode the sentence and, optionally, its surrounding context as the second segment
        features = self.tokenizer(sentence, text_pair=context,
                                  padding="max_length", truncation=True, return_tensors='pt')
        with torch.no_grad():
            outputs = self.model(**features.to(self.device), return_dict=True)
        # Pick the label with the highest logit
        argmax = outputs.logits.argmax(dim=-1).cpu().tolist()[0]

        return LABELS[argmax]
```

The wrapper can be used as follows:
```python
from tqdm import tqdm

classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                              uses_context=False,
                              device="cpu")

test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
                  "I felt really well!",
                  "gfagdhj gjfdjgh dg"]

y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]

print(y_pred)
# ['Intention', 'Feeling', 'Other']
```
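The wrapper returns only the most likely label. If you also need a per-class confidence, the raw logits can be converted into probabilities with a softmax before taking the argmax. A minimal sketch of that mapping, using made-up logits in place of `outputs.logits` (in practice they come from the model):

```python
import math

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]

def logits_to_label(logits):
    """Map a list of raw logits (one per label) to (label, probability)."""
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = probs.index(max(probs))
    return LABELS[best], probs[best]

# Hypothetical logits for a single sentence (9 values, one per label)
example_logits = [0.1, -1.2, 0.3, 2.5, 0.0, 1.1, -0.4, 0.2, -0.9]
label, prob = logits_to_label(example_logits)
print(label)  # prints "Feeling" -- index 3 has the highest logit
```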

### Training Data

The model was trained on the [CEReD dataset](http://hdl.handle.net/11372/LRT-3573) and is optimized for the best possible performance in cross-lingual settings (i.e. on unseen languages).

See the reproducible training script in the project repository: https://github.com/EduMUNI/reflection-classification

## Citation

If you use the model in scientific work, please acknowledge our work as follows.

```bibtex
@Article{Nehyba2022applications,
  author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal},
  title={Applications of deep language models for reflective writings},
  journal={Education and Information Technologies},
  year={2022},
  month={Sep},
  day={05},
  issn={1573-7608},
  doi={10.1007/s10639-022-11254-7},
  url={https://doi.org/10.1007/s10639-022-11254-7}
}
```