Automated Motive Coder with intfloat/multilingual-e5-large as base

This is a SetFit model that can be used for Text Classification. It uses a fine-tuned version of intfloat/multilingual-e5-large as the Sentence Transformer embedding model and a OneVsRestClassifier instance with SGDClassifier estimators as the classification head.

The model has been trained using an efficient few-shot learning technique that involves two steps, sketched in code after this list:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
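
In SetFit terms, the two steps above could look roughly like the following sketch. This is not the training script used for this model: the dataset, labels, and hyperparameters are placeholders, and only the overall structure (contrastive fine-tuning of the embedding model, then fitting a One-vs-Rest SGD head) mirrors the description above.

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from setfit import SetFitModel, Trainer, TrainingArguments
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier

# Step 1 component: the embedding model, to be fine-tuned with contrastive learning
body = SentenceTransformer("intfloat/multilingual-e5-large")
# Step 2 component: a One-vs-Rest head with SGDClassifier estimators
head = OneVsRestClassifier(SGDClassifier(loss="log_loss"))
model = SetFitModel(model_body=body, model_head=head)

# Placeholder few-shot data with multi-hot labels (one column per class)
train_dataset = Dataset.from_dict({
    "text": ["example sentence one", "example sentence two"],
    "label": [[1, 0, 0], [0, 1, 0]],
})

args = TrainingArguments(batch_size=16, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # fine-tunes the body, then fits the head on its embeddings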

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("automatedMotiveCoder/setfit")
# Run inference on an example text (real inputs are Picture Story Exercise sentences)
preds = model("I loved the spiderman movie!")

When using the predict_proba() method, the predicted probabilities for all classes might not sum to 1. This is because the model uses a One-vs-Rest classification approach, which treats each class as an independent binary classification problem. As a result, the per-class probabilities are independent, and their sum may exceed or fall below 1.
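
As a small illustration of this behavior (the input sentence is a placeholder, not a real Picture Story Exercise response):

import numpy as np

from setfit import SetFitModel

model = SetFitModel.from_pretrained("automatedMotiveCoder/setfit")
# predict_proba returns one independent binary probability per class
probs = np.asarray(model.predict_proba(["example sentence"]))
print(probs)               # shape: (n_inputs, n_classes)
print(probs.sum(axis=-1))  # may be above or below 1.0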

For more details on how to use this model, see our blog post.

Versioning

The most recent version is the one submitted to the NLPSI workshop at ICWSM 2025 in Copenhagen.

For all versions, see the table below:

Commit hash   SemVer   Comment
738833d       1.0.0    Version at submission

To load a specific version of the model, pass the revision argument when loading it:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("automatedMotiveCoder/setfit",
                                    revision="738833d") # Version 1.0.0

Training Details

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.3.1
  • Transformers: 4.46.3
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.21.0
  • Tokenizers: 0.20.3
  • Scikit-learn: 1.6.1
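
To approximate this environment, the pinned versions above can be installed directly (a convenience snippet, not an official requirements file; the +cu121 PyTorch build additionally requires the matching CUDA wheel index):

pip install setfit==1.0.3 sentence-transformers==3.3.1 transformers==4.46.3 torch==2.3.1 datasets==2.21.0 tokenizers==0.20.3 scikit-learn==1.6.1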

Citation

BibTeX

@inproceedings{bredeAutomaticallyCodingImplicit2025,
  title = {Automatically {{Coding Implicit Motives}} in {{Picture Story Exercises}}: {{The Automated Motive Coder}}},
  shorttitle = {Automatically {{Coding Implicit Motives}} in {{Picture Story Exercises}}},
  author = {Brede, Max and Sch{\"o}nbrodt, Felix and Hagemeyer, Birk and Lerche, Veronika},
  year = {2025},
  month = jun,
  publisher = {ICWSM},
  address = {US},
  urldate = {2025-06-23},
  abstract = {The Picture Story Exercise (PSE) is a projective measure in personality psychology where individuals create narratives based on ambiguous images. Traditionally, the coding of these narratives has been labor-intensive. We introduce the Automated Motive Coder (AMC), which employs recent advances in natural language processing and machine learning to automate the coding of PSE narratives. Trained on an extensive dataset, the AMC demonstrates accuracy comparable to expert coders for both original and translated texts. The model offers support for multiple languages that were absent in prior methods while improving in accuracy and speed. To illustrate its effectiveness, we tested and successfully replicated the established psychological effect of gender difference in the affiliation motive. The AMC can be utilized through established machine learning tools, offering a pragmatic and reliable method for coding across several languages. This tool provides an option to reduce the workload involved in PSE coding, promoting efficiency and consistency in motive assessment.},
  langid = {english}
}