Model Card for dejanseo/CTR
This model is fine-tuned from microsoft/deberta-v3-large to classify, as a binary label, the impact of titles and descriptions on click-through rate (CTR). It is intended to help marketers, SEO specialists, and content creators predict how different titles and descriptions may affect the CTR of online content.
Engage Our Team
Interested in using this in an automated pipeline for bulk CTR prediction?
Please book an appointment to discuss your needs.
Model Details
Model Description
The dejanseo/CTR model is a fine-tuned version of Microsoft's DeBERTa (v3 large) architecture, adapted to predict the impact of textual content on click-through rate (CTR). It takes tokenized titles and descriptions as input and outputs a binary classification estimating whether a piece of content is likely to have a high or low impact on CTR.
- Developed by: DEJAN MARKETING
- Model type: DebertaV2ForSequenceClassification
- Language(s) (NLP): Primarily English (due to the dataset used for training)
- Finetuned from model: microsoft/deberta-v3-large
Training Data:
Uses
Direct Use
This model is designed to be directly applied to predict the CTR impact of textual content, aiding in optimizing titles and descriptions for marketing content, SEO, and online advertisements.
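A minimal inference sketch follows, assuming the checkpoint loads through the standard transformers auto classes and that the repository ships its own tokenizer files. How the title and description are combined is not stated in this card, so encoding them as a sentence pair is an assumption. The card also does not say which class index means "high", so the sketch reads label names from the checkpoint's config rather than hard-coding them. The helper names `interpret_logits` and `predict_ctr` are illustrative.

```python
import math

MODEL_ID = "dejanseo/CTR"  # repository name from this card
MAX_LENGTH = 128           # maximum sequence length stated under Preprocessing


def interpret_logits(logits, id2label):
    """Convert raw class logits to probabilities and pick the likelier class."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    label_id = probs.index(max(probs))
    return {
        "label_id": label_id,
        "label": id2label.get(label_id, str(label_id)),
        "probabilities": probs,
    }


def predict_ctr(title, description):
    """Load the model and classify one title/description pair."""
    # Imported lazily so interpret_logits works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    # ASSUMPTION: title and description are encoded as a sentence pair.
    inputs = tokenizer(
        title,
        description,
        truncation=True,
        max_length=MAX_LENGTH,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # ASSUMPTION: the class names stored in the config are meaningful;
    # the card does not document which index corresponds to "high" CTR.
    return interpret_logits(logits, model.config.id2label)
```

Calling `predict_ctr("Some title", "Some description")` returns the predicted class index, its configured name, and both class probabilities. Check the label mapping against a few known examples before relying on it.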
Out-of-Scope Use
The model is not intended for tasks other than CTR prediction, such as sentiment analysis or content generation. Its performance on non-English text and on substantially different domains (e.g., legal documents) has not been tested, so those uses are considered out of scope.
Bias, Risks, and Limitations
The model's predictions are based on patterns learned from its training data, which may reflect biases present in the source material. Users should be cautious of unintended biases in predictions, especially when applied to diverse or sensitive content areas.
Recommendations
We recommend continuous monitoring and validation of the model's predictions against a diverse set of content to mitigate potential biases. Further research and fine-tuning may be required to adapt the model for specific use cases or demographics.
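One way to act on this recommendation is to track accuracy separately for each content segment, so that a segment where the model underperforms is not hidden by the overall average. The sketch below is illustrative only. The segment names and the record layout are hypothetical, and observed labels must come from your own CTR measurements.

```python
from collections import defaultdict


def accuracy_by_segment(records):
    """Compute per-segment accuracy.

    records: iterable of (segment, predicted_label, observed_label) tuples.
    Returns a dict mapping each segment to the fraction of correct predictions.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for segment, predicted, observed in records:
        totals[segment] += 1
        hits[segment] += int(predicted == observed)
    return {segment: hits[segment] / totals[segment] for segment in totals}


# Hypothetical monitoring data: (segment, model prediction, observed label)
sample = [
    ("news", 1, 1),
    ("news", 0, 1),
    ("retail", 1, 1),
]
report = accuracy_by_segment(sample)
```

A large gap between segments in `report` suggests the model may need further validation or fine-tuning for the weaker segment.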
Training Details
Training Data
The model was trained on a dataset consisting of content titles and descriptions, each paired with a binary label for its impact on CTR. The data was split so that the majority was used for training and a small portion was reserved for validation.
Training Procedure
Preprocessing
The training data was pre-tokenized using DebertaV2Tokenizer from the transformers library, with a maximum sequence length of 128 tokens.
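The preprocessing step might look roughly like the sketch below. The tokenizer class and the 128-token limit come from this card. Passing title and description as a sentence pair, and padding every example to the full length, are assumptions. The helper names `encode_pairs` and `load_tokenizer` are illustrative, and `DebertaV2Tokenizer` needs the sentencepiece package installed.

```python
MAX_LENGTH = 128  # maximum sequence length stated in this card


def encode_pairs(tokenizer, titles, descriptions):
    """Tokenize aligned lists of titles and descriptions."""
    if len(titles) != len(descriptions):
        raise ValueError("titles and descriptions must have the same length")
    return tokenizer(
        titles,
        descriptions,          # ASSUMPTION: encoded as the second segment of a pair
        truncation=True,
        max_length=MAX_LENGTH,
        padding="max_length",  # ASSUMPTION: padding strategy is not stated in the card
    )


def load_tokenizer(name="microsoft/deberta-v3-large"):
    """Load the base model's tokenizer (imported lazily; downloads on first use)."""
    from transformers import DebertaV2Tokenizer

    return DebertaV2Tokenizer.from_pretrained(name)
```

With a real tokenizer, `encode_pairs(load_tokenizer(), titles, descriptions)` returns `input_ids`, `token_type_ids`, and `attention_mask` lists of 128 entries per example.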
Training Hyperparameters
- Learning rate: 1e-5
- Batch size: 24
- Epochs: 10
- Warmup steps: 100
- Optimizer: AdamW
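For readers who want to reproduce a similar setup, the values above can be expressed as transformers `TrainingArguments`. This is a sketch, not the authors' training script. It assumes the batch size of 24 is per device, that "AdamW" refers to the PyTorch implementation, and that there is a single device with no gradient accumulation. The output directory name is hypothetical.

```python
import math

HYPERPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 24,  # ASSUMPTION: card's batch size is per device
    "num_train_epochs": 10,
    "warmup_steps": 100,
    "optim": "adamw_torch",             # ASSUMPTION: PyTorch AdamW
}


def total_optimizer_steps(num_examples):
    """Optimizer steps for a run on one device without gradient accumulation."""
    batch_size = HYPERPARAMS["per_device_train_batch_size"]
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * HYPERPARAMS["num_train_epochs"]


def build_training_args(output_dir="ctr-finetune"):
    """Build TrainingArguments from the values above (imported lazily)."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

For a hypothetical training set of 2,400 examples, `total_optimizer_steps(2400)` gives 1,000 steps, so the 100 warmup steps would cover the first 10% of training. The card does not state the actual dataset size.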