floxoris/adrash-v0

Adrash v0 is a compact binary text classification model for detecting advertisements, promo spam, referral spam, Telegram channel promotion, suspicious job spam, and obfuscated ad-like messages.

The model is designed for lightweight moderation systems, especially:

  • Telegram bots
  • Telegram groups
  • Telegram Mini Apps
  • marketplaces
  • comment sections
  • chat systems
  • small moderation APIs

Adrash means Ad + Trash: a small filter that catches advertising garbage before it reaches users.

Labels

| ID | Label | Meaning |
|----|-------|---------|
| 0 | clean | Normal message |
| 1 | ad_spam | Advertisement, promo, referral spam, job spam, channel promotion, suspicious commercial message |

What Adrash v0 detects

Adrash v0 is trained to detect messages like:

  • Telegram channel promotion
  • referral spam
  • promo-code spam
  • suspicious job offers
  • “work online” spam
  • salary bait messages
  • “write me in DM” spam
  • obfuscated Telegram spam
  • emoji-heavy salary fragments
  • messages with mixed Cyrillic, Latin, and Greek letters
  • messages with hidden Unicode / zero-width characters

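Examples of target spam (Russian; in English, roughly: "ONLINE WORK. Looking for people to join a team for training. No experience needed, I will teach everything. Pay 2000-5000 rubles/day. Contact: @username. Subscribe to the channel and get a bonus."):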

РАБОТА ОНЛАЙН 💰
Ищу людей в команду на обучение
Опыт не требуется, всему научу
ЗП 2000-5000р/день
Связь: @username
Подпишись на канал и получи бонус

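Obfuscated examples, which mix Cyrillic, Latin, and Greek look-alike letters. In English they read roughly: "2 people needed for today or tomorrow", "Monthly pay 2000+", "Looking for people to join the team for training", "If you are ready to start, send '+' in a private message":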

Ηa ceгοдня–зaвтpa нужны 2 чeлοвeκa
⚠️ЗП в m еcяц 2000💵+
➡️Uщy людeй в koмaнду на 0бучenиe
Εcли гοтοвы выйти — пишитe «+» в личныe cοοбщeния

What Adrash v0 is not for

Adrash v0 is not a general safety model.

It is not designed to reliably detect:

  • toxicity
  • hate speech
  • violent threats
  • illegal activity
  • self-harm
  • sexual content
  • malware
  • political manipulation
  • general abuse

For those categories, use a separate safety classifier.

Recommended thresholds

The model outputs one logit per class; after softmax these become probabilities for clean and ad_spam that sum to 1.

Recommended moderation policy:

| ad_spam score | Action |
|---------------|--------|
| >= 0.85 | Block / delete |
| >= 0.65 and < 0.85 | Send to manual moderation |
| < 0.65 | Allow |

In production, tune these thresholds to keep false positives low: deleting a legitimate message is usually worse than letting a small amount of spam through.

Usage with Transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "floxoris/adrash-v0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

model.eval()

text = "РАБОТА ОНЛАЙН 💰 ЗП каждый день, пишите в личку"

inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    max_length=160,
)

with torch.inference_mode():
    logits = model(**inputs).logits[0]
    probs = torch.softmax(logits, dim=-1)

clean_score = float(probs[0])
ad_spam_score = float(probs[1])

label = "ad_spam" if ad_spam_score >= clean_score else "clean"

print({
    "label": label,
    "clean": clean_score,
    "ad_spam": ad_spam_score,
})

Usage with pipeline

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="floxoris/adrash-v0",
    tokenizer="floxoris/adrash-v0",
    top_k=None,  # return scores for both labels (replaces the deprecated return_all_scores=True)
)

text = "Подпишись на канал и получи бонус"
result = classifier(text)

print(result)
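
Depending on the Transformers version and on how the scores are requested, the pipeline may return either a flat list of label/score dicts or the same list wrapped in an outer list. The helper below is a minimal sketch for pulling the ad_spam score out of either shape. The function name `extract_ad_spam_score` is hypothetical, and the accepted label names (`ad_spam`, with `LABEL_1` as a fallback) are an assumption about how the model config names its classes; check the actual pipeline output before relying on it.

```python
from typing import Any, Tuple


def extract_ad_spam_score(
    result: Any,
    spam_labels: Tuple[str, ...] = ("ad_spam", "LABEL_1"),  # assumed label names
) -> float:
    """Return the ad_spam score from a text-classification pipeline result."""
    items = result
    # Unwrap nested lists such as [[{...}, {...}]] down to a flat list of dicts.
    while isinstance(items, list) and items and isinstance(items[0], list):
        items = items[0]
    # A single dict (top-1 output) is treated as a one-element list.
    if isinstance(items, dict):
        items = [items]
    for item in items:
        if item["label"] in spam_labels:
            return float(item["score"])
    raise ValueError("no ad_spam entry found in pipeline output")
```

The returned score can then be compared against the recommended thresholds.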

Telegram bot moderation example

def moderation_decision(ad_spam_score: float) -> str:
    if ad_spam_score >= 0.85:
        return "block"
    if ad_spam_score >= 0.65:
        return "moderate"
    return "allow"

In Telegram groups, classify a short buffer of recent messages from the same user rather than each message in isolation.

Example:

User sends 3 messages within 20 seconds:

1. РАБОТА ОНЛАЙН 💰
2. Опыт не требуется, всему научу
3. Связь: @username

Classifying the combined block is usually more reliable than classifying each fragment separately.
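
One possible way to build such a combined block is a small per-user sliding window. The sketch below is illustrative only: the class name `UserMessageBuffer` is hypothetical, the 20-second window is taken from the example above, and the cap of 5 messages is an assumption, not a tuned value.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class UserMessageBuffer:
    """Keeps recent messages per user and returns them as one text block."""

    def __init__(self, window_seconds: float = 20.0, max_messages: int = 5) -> None:
        self.window_seconds = window_seconds  # taken from the example above
        self.max_messages = max_messages      # assumed cap, adjust as needed
        self._buffers: Dict[int, Deque[Tuple[float, str]]] = defaultdict(deque)

    def add(self, user_id: int, text: str, timestamp: float) -> str:
        """Store a message and return the user's current combined block."""
        buf = self._buffers[user_id]
        buf.append((timestamp, text))
        # Drop messages that fell out of the time window.
        while buf and timestamp - buf[0][0] > self.window_seconds:
            buf.popleft()
        # Keep only the most recent max_messages entries.
        while len(buf) > self.max_messages:
            buf.popleft()
        return "\n".join(message for _, message in buf)
```

The string returned by `add` is what would be passed to the tokenizer in place of the single message. Long blocks are still truncated to the 160-token limit used in training.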

Training data

Adrash v0 was trained on a mixture of public spam/ham datasets, Telegram-like datasets, synthetic Telegram-style advertisement examples, clean hard-negative examples, and obfuscation-heavy spam samples.

Training sources include:

thehamkercat/telegram-spam-ham
mshenoda/spam-messages
Deysi/spam-detection-dataset
SetFit/enron_spam
KSE-RESEARCH-Group/UAReviews
zefang-liu/phishing-email-dataset
ucirvine/sms_spam
SmsSpamCollection
ScoutieAutoML/russian-news-telegram-dataset
ScoutieAutoML/cybersecurity_news_telegram_dataset

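The training set also includes hard-negative examples such as the following (Ukrainian; in English: "how do I build a referral system in a bot?", "need to add a subscribe button", "my Telegram bot can't see the channel", "how much does advertising on Telegram cost?", "is this an ad or a normal post?"):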

як зробити реферальну систему в боті?
потрібно додати кнопку підписатися
мій Telegram-бот не бачить канал
скільки коштує реклама в телеграмі?
це реклама чи нормальний пост?

These examples help reduce false positives on developer, moderation, marketplace, and Telegram-bot related conversations.

Obfuscation robustness

Adrash v0 was trained with examples containing:

  • zero-width Unicode characters
  • Cyrillic / Latin / Greek homoglyph mixing
  • digits used as letters
  • emoji salary fragments
  • short Telegram spam fragments
  • suspicious job-spam patterns
  • mixed-language spam
  • Telegram invite links
  • username/contact bait

Examples:

⁠⁠⁠⁠⁠⁠⁠⁠⁠РАБОТА О НЛАЙН 💰
➡️Uщy людeй в koмaнду на 0бучenиe
⚠️ЗП в m еcяц 2000💵+
👀 Bсе что нужно - teлeфoн и жeлаnue paб0taть
✉️ Св⁠язь: @username⁠͏‍
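
To make the first two obfuscation types concrete, the sketch below counts invisible characters and words that mix scripts, using only Python's standard `unicodedata` module. It is not part of the model and says nothing about how the training data was produced; the function names are hypothetical, and the result is meant as a diagnostic to log next to the model score, not as a replacement for it.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script guess from the Unicode character name."""
    name = unicodedata.name(ch, "")
    for script in ("CYRILLIC", "LATIN", "GREEK"):
        if name.startswith(script):
            return script
    return "OTHER"


def obfuscation_signals(text: str) -> dict:
    """Count invisible characters and words that mix Cyrillic/Latin/Greek."""
    # Category "Cf" covers zero-width spaces and joiners, word joiners, etc.
    # U+034F (combining grapheme joiner) is category Mn, so it is added by hand.
    # Note: emoji ZWJ sequences also contain a Cf character and will be counted.
    invisible = sum(
        1 for ch in text
        if unicodedata.category(ch) == "Cf" or ch == "\u034f"
    )
    mixed_words = 0
    for word in text.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()}
        scripts.discard("OTHER")
        if len(scripts) > 1:
            mixed_words += 1
    return {"invisible_chars": invisible, "mixed_script_words": mixed_words}
```

Legitimate text can also mix scripts (brand names, code identifiers), so these counts are best read as hints.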

Evaluation

Metrics from the final training run have not been added yet. The placeholders below mark the values to be reported.

{
  "validation": {
    "eval_precision_ad": "TODO",
    "eval_recall_ad": "TODO",
    "eval_f1_ad": "TODO",
    "eval_false_positive_rate": "TODO",
    "eval_false_negative_rate": "TODO"
  },
  "benchmark": {
    "benchmark_precision_ad": "TODO",
    "benchmark_recall_ad": "TODO",
    "benchmark_f1_ad": "TODO",
    "benchmark_false_positive_rate": "TODO",
    "benchmark_false_negative_rate": "TODO"
  },
  "hard_test": {
    "hard_test_precision_ad": "TODO",
    "hard_test_recall_ad": "TODO",
    "hard_test_f1_ad": "TODO",
    "hard_test_false_positive_rate": "TODO",
    "hard_test_false_negative_rate": "TODO"
  }
}
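
For reference, the placeholder fields above can be computed from true and predicted labels as in the sketch below. It assumes the standard definitions with ad_spam (label 1) as the positive class; the function name `ad_spam_metrics` is hypothetical, and the key names mirror the JSON fields without their split prefix. No numbers here describe the actual model.

```python
from typing import Dict, Sequence


def ad_spam_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision/recall/F1 for ad_spam (label 1) plus FPR and FNR."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero on empty classes.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision_ad": precision,
        "recall_ad": recall,
        "f1_ad": ratio(2 * precision * recall, precision + recall),
        # Share of clean messages wrongly flagged as ad_spam.
        "false_positive_rate": ratio(fp, fp + tn),
        # Share of ad_spam messages that were missed.
        "false_negative_rate": ratio(fn, fn + tp),
    }
```

Because the recommended policy favors few false positives, `false_positive_rate` is the value to watch most closely when comparing thresholds.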

Limitations

Adrash v0 may still fail on:

  • very short fragments without context
  • new spam formats not present in training data
  • messages that require external context
  • mixed moderation categories, such as toxic spam or illegal offers
  • intentionally adversarial text designed to bypass classifiers
  • messages where spam intent is only clear across multiple user messages

For best results, use Adrash v0 together with:

  • short user message buffering
  • repeated-message detection
  • link/domain checks
  • rate limits
  • admin review for medium-confidence cases
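
As a rough illustration of how one of these extra signals could sit next to the model score, the sketch below pairs a simple repeated-message counter with the recommended thresholds. The names `RepeatedMessageDetector` and `combined_decision`, the `repeat_limit` of 3, and the rule that repeats escalate a message to manual review are all assumptions for the example, not part of Adrash v0.

```python
import hashlib
from collections import Counter


class RepeatedMessageDetector:
    """Counts how often the same normalized text has been seen."""

    def __init__(self) -> None:
        # Grows without bound; a real deployment would expire old entries.
        self._counts: Counter = Counter()

    @staticmethod
    def _fingerprint(text: str) -> str:
        # Case-fold and collapse whitespace so trivial edits still match.
        normalized = " ".join(text.casefold().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def observe(self, text: str) -> int:
        """Record the message and return how many times it has been seen."""
        key = self._fingerprint(text)
        self._counts[key] += 1
        return self._counts[key]


def combined_decision(ad_spam_score: float, repeat_count: int, repeat_limit: int = 3) -> str:
    """Recommended thresholds, with repeats escalating to manual review."""
    if ad_spam_score >= 0.85:
        return "block"
    if ad_spam_score >= 0.65 or repeat_count >= repeat_limit:
        return "moderate"
    return "allow"
```

Repeats only ever escalate to "moderate" here, never to "block", which keeps automatic deletion tied to the model score alone.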

Model details

| Field | Value |
|-------|-------|
| Model name | floxoris/adrash-v0 |
| Task | Binary text classification |
| Labels | clean, ad_spam |
| Base model | cointegrated/rubert-tiny2 |
| Main languages | Russian, Ukrainian, English |
| Max length used in training | 160 tokens |
| Framework | Transformers / PyTorch |

Example output

{
  "label": "ad_spam",
  "clean": 0.0214,
  "ad_spam": 0.9786
}

Citation

@misc{floxoris_adrash_v0,
  title={Adrash v0: Compact Advertisement and Spam Filter},
  author={Floxoris},
  year={2026},
  publisher={Hugging Face},
  howpublished={https://huggingface.co/floxoris/adrash-v0}
}

Disclaimer

Adrash v0 is an experimental moderation model. It should not be used as the only moderation layer in high-risk systems. Always test it on your own real messages before production deployment.

Model size: 29.2M params (Safetensors, tensor type F32)