	---
	license: apache-2.0

	widget:
	- text: "One day I will be a real teacher and I will try to do the best I can for the children."
	  example_title: "Classification (without context)"
	---

	# Model Card for XLM-Roberta-large-reflective-conf4

	This is a classification model trained to distinguish different types of reflectivity in sentences from the reports of teaching students.

	It was evaluated in cross-lingual settings and was found to also work well in languages other than English; see the results in the referenced paper.

	## Model Details

	- Repository: https://github.com/EduMUNI/reflection-classification
	- Paper: https://link.springer.com/article/10.1007/s10639-022-11254-7

	- Developed by: Michal Stefanik & Jan Nehyba, Masaryk University
	- Model type: XLM-RoBERTa-large
	- Finetuned from model: [XLM-R-large](https://huggingface.co/xlm-roberta-large)

	## Usage

	To match the training format, use the wrapper below, which passes the classified sentence and, optionally, its surrounding context to the model in the expected format:

	```python
	from typing import Optional

	import torch
	from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

	# Class names, indexed by the position of the corresponding output logit
	LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
	          "Reflection", "Difficulty", "Intention", "Learning"]


	class NeuralClassifier:

	    def __init__(self, model_path: str, uses_context: bool, device: str):
	        self.config = AutoConfig.from_pretrained(model_path)
	        self.device = device
	        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
	        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
	        self.uses_context = uses_context

	    def predict_sentence(self, sentence: str, context: Optional[str] = None) -> str:
	        if context is None and self.uses_context:
	            raise ValueError("You need to pass in context argument, including the sentence")

	        # Encode the sentence; if a context is given, it becomes the second segment of the pair
	        features = self.tokenizer(sentence, text_pair=context,
	                                  padding="max_length", truncation=True, return_tensors='pt')
	        # Gradients are not needed for inference
	        with torch.no_grad():
	            outputs = self.model(**features.to(self.device), return_dict=True)
	        # Index of the highest-scoring class for the single input in the batch
	        argmax = outputs.logits.argmax(dim=-1).detach().cpu().tolist()[0]
	        label = LABELS[argmax]

	        return label
	```
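For model variants that use context (`uses_context=True`), the error message in the wrapper indicates that the `context` argument should include the classified sentence itself. The following is a minimal sketch of assembling such a context from a list of report sentences. The helper name `build_context`, the window size, and joining with single spaces are illustrative assumptions, not the format used by the training script.

```python
from typing import List, Tuple


def build_context(sentences: List[str], index: int, window: int = 2) -> Tuple[str, str]:
    """Return (sentence, context) for the sentence at position `index`.

    Hypothetical helper: the context is the target sentence together with up to
    `window` neighbouring sentences on each side, joined by single spaces.
    """
    if not 0 <= index < len(sentences):
        raise IndexError("index is outside the list of sentences")
    start = max(0, index - window)
    end = min(len(sentences), index + window + 1)
    return sentences[index], " ".join(sentences[start:end])


# Made-up report, used only to illustrate the call
report = ["I taught my first lesson today.",
          "I felt really well!",
          "One day I will be a real teacher."]
sentence, context = build_context(report, index=1, window=1)
```

The resulting pair could then be passed to the wrapper as `classifier.predict_sentence(sentence, context)`.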

	The wrapper can be used as follows:
	```python
	from tqdm import tqdm

	classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
	                              uses_context=False,
	                              device="cpu")

	test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
	                  "I felt really well!",
	                  "gfagdhj gjfdjgh dg"]

	y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]

	print(y_pred)
	# ['Intention', 'Feeling', 'Other']
	```
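The wrapper returns only the top label. If a confidence score is also of interest, the logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python, assuming a list of nine raw logits in the same order as `LABELS`. The function name `logits_to_prediction` and the example logits are made up for illustration and are not outputs of the model.

```python
import math
from typing import List, Tuple

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


def logits_to_prediction(logits: List[float]) -> Tuple[str, float]:
    """Return the most probable label and its softmax probability."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    # Subtract the maximum before exponentiating for numerical stability
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Made-up logits, for illustration only
example_logits = [0.1, -1.2, 0.3, 4.0, 0.5, -0.7, 0.0, 1.1, -0.4]
label, confidence = logits_to_prediction(example_logits)
print(label, round(confidence, 3))
```

With the wrapper above, such a list could be obtained from `outputs.logits[0].tolist()` inside `predict_sentence`.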

	## Training Data

	The model was trained on the [CEReD dataset](http://hdl.handle.net/11372/LRT-3573) and aims for the best possible performance in cross-lingual settings (on languages not seen during training).

	See the reproducible training script in the project repository: https://github.com/EduMUNI/reflection-classification

	## Citation

	If you use the model in scientific work, please cite it as follows:

	```bibtex
	@Article{Nehyba2022applications,
	  author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal},
	  title={Applications of deep language models for reflective writings},
	  journal={Education and Information Technologies},
	  year={2022},
	  month={Sep},
	  day={05},
	  issn={1573-7608},
	  doi={10.1007/s10639-022-11254-7},
	  url={https://doi.org/10.1007/s10639-022-11254-7}
	}
	```