
Model Card for dejanseo/CTR

This model is fine-tuned from the microsoft/deberta-v3-large model to classify the impact of textual inputs on click-through rate (CTR) in a binary classification setup. It was developed to help marketers, SEO specialists, and content creators predict how different titles and descriptions may affect the CTR of online content.

Engage Our Team

Interested in using this model in an automated pipeline for bulk CTR prediction?

Please book an appointment to discuss your needs.

Model Details

Model Description

The dejanseo/CTR model is a fine-tuned version of Microsoft's DeBERTa v3 large architecture, adapted to predict the impact of textual content on click-through rates (CTR). It takes pre-tokenized titles and descriptions as input and outputs a binary label estimating whether a given piece of content is likely to have a high or low impact on CTR.

  • Developed by: DEJAN MARKETING
  • Model type: DebertaV2ForSequenceClassification
  • Model size: 435M parameters (F32 tensors)
  • Language(s) (NLP): Primarily English (due to the dataset used for training)
  • Finetuned from model: microsoft/deberta-v3-large
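A minimal loading-and-inference sketch is below, assuming the model and tokenizer are hosted on the Hugging Face Hub under dejanseo/CTR; the 0/1 label mapping (low/high CTR impact) is an assumption, not a documented convention.

```python
# Minimal inference sketch. The repo id matches this card; the 0/1 label
# mapping (low/high CTR impact) is an assumption.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("dejanseo/CTR")
model = AutoModelForSequenceClassification.from_pretrained("dejanseo/CTR")
model.eval()

inputs = tokenizer(
    "10 Proven Ways to Boost Your Organic Traffic",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(dim=-1).item()  # 0 = low, 1 = high (assumed)
print(predicted_class)
```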


Uses

Direct Use

This model is designed to be applied directly to predict the CTR impact of textual content, helping to optimize titles and descriptions for marketing content, SEO, and online advertising.
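For bulk scoring of candidate titles, one option is the transformers pipeline API, sketched below; the label names returned depend on the model's config and may differ from those shown here.

```python
# Bulk title scoring via the text-classification pipeline. Label names
# come from the model's config and are not documented in this card.
from transformers import pipeline

classifier = pipeline("text-classification", model="dejanseo/CTR")

candidate_titles = [
    "How to Improve Your Click-Through Rate in 2024",
    "CTR tips",
]
for title, result in zip(candidate_titles, classifier(candidate_titles)):
    print(f"{title!r} -> {result['label']} ({result['score']:.3f})")
```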

Out-of-Scope Use

The model is not intended for use cases significantly divergent from CTR prediction, such as sentiment analysis or content generation. Its performance on non-English languages or significantly different domains (e.g., legal documents) has not been tested and is considered out of scope.

Bias, Risks, and Limitations

The model's predictions are based on patterns learned from its training data, which may reflect biases present in the source material. Users should be cautious of unintended biases in predictions, especially when applied to diverse or sensitive content areas.

Recommendations

We recommend continuous monitoring and validation of the model's predictions against a diverse set of content to mitigate potential biases. Further research and fine-tuning may be required to adapt the model for specific use cases or demographics.

Training Details

Training Data

The model was trained on a dataset of content titles and descriptions paired with binary labels indicating their impact on CTR. The data was split, with the majority used for training and a small portion reserved for validation.
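The exact split ratio and field layout are not documented; the sketch below illustrates one plausible shape of the data, where the title and description are joined into a single input and a 90/10 split is used, all of which are assumptions.

```python
# Illustrative data shape and train/validation split. The title/description
# pairing, separator, and 90/10 ratio are assumptions, not documented values.
from sklearn.model_selection import train_test_split

samples = [
    ("10 SEO Tips That Actually Work", "Practical advice for small sites.", 1),
    ("Homepage", "Welcome to our website.", 0),
    ("Boost Your CTR in One Week", "A step-by-step optimization guide.", 1),
    ("About Us", "Company history and team.", 0),
]
texts = [f"{title} {description}" for title, description, _ in samples]
labels = [label for _, _, label in samples]

train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.1, random_state=42
)
```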

Training Procedure

Preprocessing

The training data underwent pre-tokenization using DebertaV2Tokenizer from the transformers library, with a maximum sequence length of 128 tokens.
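Below is a pre-tokenization sketch matching the stated settings (DebertaV2Tokenizer, maximum sequence length 128); padding to the maximum length is an assumption.

```python
# Pre-tokenization with the settings stated above. The padding strategy
# is an assumption; truncation and max_length=128 match the card.
from transformers import DebertaV2Tokenizer

tokenizer = DebertaV2Tokenizer.from_pretrained("microsoft/deberta-v3-large")

encoded = tokenizer(
    "10 Proven Ways to Boost Your Organic Traffic",
    truncation=True,
    max_length=128,
    padding="max_length",
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([1, 128])
```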

Training Hyperparameters

  • Learning rate: 1e-5
  • Batch size: 24
  • Epochs: 10
  • Warmup steps: 100
  • Optimizer: AdamW
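Expressed as transformers TrainingArguments, these settings might look like the sketch below; unspecified options (weight decay, scheduler, evaluation strategy) are left at library defaults, and the output path is hypothetical.

```python
# The listed hyperparameters as TrainingArguments. transformers' Trainer
# uses AdamW by default, matching the optimizer above. The output dir is
# hypothetical; all unlisted settings stay at library defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ctr-deberta-v3-large",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    num_train_epochs=10,
    warmup_steps=100,
)
```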