Crypto Project Classifier

This model is a fine-tuned version of FacebookAI/roberta-large that classifies Twitter accounts as crypto projects or non-crypto entities. The input is a single sentence containing the account's author name and bio. The output is a probability score and a binary label (1 for a crypto project, 0 otherwise).

Model Details

  • Language: English
  • Base Model: FacebookAI/roberta-large
  • Task: Sequence Classification

Example Input

hi i am {author_name}, i do this {twitter_bio}  

How to Use

Below is a sample Python script that loads the model and classifies the first 500 rows of a CSV file:

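import torch
import pandas as pd
from transformers import RobertaTokenizer, AutoModelForSequenceClassification

model_name = "yoursdevkalki/crypto_project_classifier"
tokenizer = RobertaTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Each value in the "twitter_bio" column should already follow the template
# described under "Input Format".
projects_df = pd.read_csv("projects.csv")
test_df = projects_df.head(500).copy()  # copy so new columns can be added without a pandas warning

def process_row(description):
    # Tokenize one input sentence, truncating it to the model's maximum length
    inputs = tokenizer(description, return_tensors="pt", padding=True, truncation=True)

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)

    # Convert the first logit to a probability with a sigmoid
    logits = outputs.logits
    probability = torch.sigmoid(logits)[0][0].item()

    # Label the account as a crypto project at a probability of 0.6 or higher
    prediction = 1 if probability >= 0.6 else 0

    return probability * 100, prediction

# Apply the classifier row by row and split the (prob, prediction) pairs into two columns
test_df["prob"], test_df["prediction"] = zip(*test_df["twitter_bio"].apply(process_row))

print(test_df)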

Input Format

The input text should be structured as:

hi i am {author_name}, i do this {twitter_bio}  
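
As a minimal sketch, the template above can be filled with a small helper. The name `build_input` is hypothetical and does not come from the model's repository. It only substitutes the two fields into the template and applies no other cleaning.

```python
def build_input(author_name: str, twitter_bio: str) -> str:
    """Fill the input template (hypothetical helper, not part of the model repo)."""
    # The wording and punctuation mirror the template shown in this card.
    return f"hi i am {author_name}, i do this {twitter_bio}"


if __name__ == "__main__":
    # Illustrative values only
    print(build_input("Example Labs", "building an on-chain exchange"))
```

The resulting string is what would be passed to the tokenizer.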

Outputs

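  • prob: The model's estimated probability that the account is a crypto project, expressed as a percentage (0–100).
  • prediction: Classification result (1 for crypto project, 0 for non-project). The sample script assigns 1 when the probability is at least 60%.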
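
The relationship between the two outputs can be sketched without loading the model. The snippet below assumes, as the sample script does, that the model yields a single logit that is passed through a sigmoid and compared against a 0.6 threshold. The names `sigmoid`, `to_outputs`, and `THRESHOLD` are illustrative only.

```python
import math
from typing import Tuple

THRESHOLD = 0.6  # decision threshold taken from the sample script


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def to_outputs(logit: float) -> Tuple[float, int]:
    """Map one raw logit to (prob in percent, prediction)."""
    probability = sigmoid(logit)
    prediction = 1 if probability >= THRESHOLD else 0
    return probability * 100, prediction


if __name__ == "__main__":
    for logit in (-3.0, 0.0, 2.0):
        print(logit, to_outputs(logit))
```

Because the threshold is 0.6 rather than 0.5, a logit of 0 (probability 50%) is labeled 0.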

Dataset

The model was trained on a dataset of 40k samples:

  • 20k Crypto Projects (labeled as 1)
  • 20k Non-Crypto Entities (labeled as 0)

Metrics Achieved

  • F1 Score: >90%
  • Accuracy: >90%
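
The card does not say how these metrics were computed or on which split. As a rough sketch using the standard definitions, accuracy and F1 for the positive class (label 1) could be checked on your own labeled data as follows. The function name `accuracy_and_f1` and the toy labels are assumptions for illustration.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1 for label 1) for binary labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
    return accuracy, f1


if __name__ == "__main__":
    # Toy labels, not data from the model's training or evaluation set
    print(accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0]))
```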

Donations

If you find this model useful, consider supporting its development:

  • Solana Address: 2oiBTZ3QvTbsns4babAW54PHcKzacYG3MXUcpAMp7LKV
  • Ethereum Address: 0x56a28F1Bd2CD4E2AAA386aeA1c30a24A2f854Ec4

Reach Out
