Crypto Project Classifier

This is a Hugging Face model built on FacebookAI/roberta-large, fine-tuned to classify Twitter accounts as crypto projects or non-crypto entities. The model takes as input a sentence containing the author's name and bio, and outputs a probability score and a classification (1 for a crypto project, 0 otherwise).

Model Details

Language: English
Base Model: FacebookAI/roberta-large
Task: Sequence Classification

Example Input

hi i am {author_name}, i do this {twitter_bio}

How to Use

The sample Python script below loads the model, scores the first 500 rows of a CSV file (projects.csv, with a twitter_bio column), and prints the results:

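import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, RobertaTokenizer

model_name = "yoursdevkalki/crypto_project_classifier"
tokenizer = RobertaTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

projects_df = pd.read_csv("projects.csv")
# Copy the slice so that adding columns below does not trigger
# pandas' SettingWithCopyWarning
test_df = projects_df.head(500).copy()

def process_row(description):
    # Tokenize one input text, truncating it to the model's maximum length
    inputs = tokenizer(description, return_tensors="pt", padding=True, truncation=True)

    # Inference only, so gradient tracking is not needed
    with torch.no_grad():
        outputs = model(**inputs)

    # Convert the first logit to a probability with a sigmoid
    probability = torch.sigmoid(outputs.logits)[0][0].item()

    # Label the account a crypto project when the probability is at least 0.6
    prediction = 1 if probability >= 0.6 else 0

    return probability * 100, prediction

# Note: the model expects text in the format shown under Input Format.
# If the twitter_bio column holds raw bios, build the full sentence first.
test_df["prob"], test_df["prediction"] = zip(*test_df["twitter_bio"].apply(process_row))

print(test_df)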

Input Format

The input text should be structured as:

hi i am {author_name}, i do this {twitter_bio}
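
As a minimal sketch, the template above can be filled in with a small helper before the text is passed to the tokenizer. The function name build_input and the sample name and bio are hypothetical and used only for illustration; they are not part of the model's repository.

```python
def build_input(author_name: str, twitter_bio: str) -> str:
    """Fill the documented input template with an account's name and bio."""
    return f"hi i am {author_name}, i do this {twitter_bio}"


# Hypothetical account, for illustration only
example = build_input("Example Labs", "building a decentralized exchange")
print(example)
```

The resulting string is what a function such as process_row in the script above would receive as its description argument.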

Outputs

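prob: The sigmoid of the model's output logit, expressed as a percentage (0–100%). Higher values mean the account is more likely to be a crypto project.
prediction: Classification result (1 for crypto project, 0 for non-project). The sample script assigns 1 when prob is at least 60%.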
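
The mapping from a raw logit to these two outputs can be sketched in plain Python, without loading the model. This assumes, as the sample script does, that the model emits a single logit and that 0.6 is the decision threshold; the function name score_logit is hypothetical.

```python
import math
from typing import Tuple

THRESHOLD = 0.6  # cut-off taken from the sample script


def score_logit(logit: float, threshold: float = THRESHOLD) -> Tuple[float, int]:
    """Return (prob as a percentage, prediction) for a single logit."""
    # Numerically stable sigmoid: avoids overflow for large negative logits
    if logit >= 0:
        probability = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        probability = e / (1.0 + e)

    prediction = 1 if probability >= threshold else 0
    return probability * 100, prediction


print(score_logit(2.0))   # well above the threshold, so prediction is 1
print(score_logit(0.0))   # probability is 50%, below the 60% threshold
```

A logit of 0 corresponds to a 50% probability, so with this threshold it is classified as 0 rather than 1.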

Dataset

The model was trained on a dataset of 40k samples:

20k Crypto Projects (labeled as 1)
20k Non-Crypto Entities (labeled as 0)

Metrics Achieved

F1 Score: >90%
Accuracy: >90%
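
For reference, the sketch below shows how accuracy and F1 are computed from binary labels. It assumes F1 is measured on the positive (crypto project) class, which the model card does not specify, and the label lists are made-up illustrative values, not results from this model's evaluation.

```python
def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and positive-class F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Illustrative labels only
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, f1={f1:.3f}")
```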

Donations

If you find this model useful, consider supporting its development:

Solana Address: 2oiBTZ3QvTbsns4babAW54PHcKzacYG3MXUcpAMp7LKV
Ethereum Address: 0x56a28F1Bd2CD4E2AAA386aeA1c30a24A2f854Ec4

Reach Out

Twitter: @yourdevkalki
Email: yourdevkalki@gmail.com