Model Description

This model was fine-tuned from the dbmdz/bert-base-turkish-128k-uncased model.

This model was created to detect gibberish sentences such as "adssnfjnfjn". It is a simple binary classifier that labels a sentence as either gibberish or real.

Usage

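import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Use the first GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")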
model = AutoModelForSequenceClassification.from_pretrained("TURKCELL/gibberish-sentence-detection-model-tr")
tokenizer = AutoTokenizer.from_pretrained("TURKCELL/gibberish-sentence-detection-model-tr", do_lower_case=True, use_fast=True)

model.to(device)

def get_result_for_one_sample(model, tokenizer, device, sample):
    # Map the predicted class index to a readable label
    d = {
        1: 'gibberish',
        0: 'real'
    }
    # Tokenize the sentence as a batch of one and move the tensors to the model's device
    test_sample = tokenizer([sample], padding=True, truncation=True, max_length=256, return_tensors='pt').to(device)
    output = model(**test_sample)
    # Pick the class with the highest logit
    y_pred = np.argmax(output.logits.detach().to('cpu').numpy(), axis=1)
    return d[int(y_pred[0])]

sentence = "nabeer rdahdaajdajdnjnjf"
result = get_result_for_one_sample(model, tokenizer, device, sentence)
print(result)
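
The function above returns only the winning label. If a confidence score is also useful, the raw logits can be turned into probabilities with a softmax. The following is a minimal sketch rather than part of the model's official API: the helper name `logits_to_predictions` is hypothetical, the label mapping is copied from the usage code above, and the example logits are made-up values used only to show the call. With the real model, you would pass `output.logits.tolist()` instead.

```python
import math

# Same label mapping as in the usage code above
ID2LABEL = {0: "real", 1: "gibberish"}

def logits_to_predictions(logits):
    """Convert a list of per-sentence logit rows into (label, probability) pairs.

    Hypothetical helper: `logits` is a list of rows, one row per sentence,
    e.g. the result of output.logits.tolist().
    """
    predictions = []
    for row in logits:
        # Subtract the row maximum before exponentiating for numerical stability
        row_max = max(row)
        exps = [math.exp(x - row_max) for x in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        # The predicted class is the index with the highest probability
        best = max(range(len(probs)), key=probs.__getitem__)
        predictions.append((ID2LABEL[best], probs[best]))
    return predictions

# Made-up logits for two sentences, for illustration only
example_logits = [[2.0, -1.0], [0.0, 3.0]]
preds = logits_to_predictions(example_logits)
for label, prob in preds:
    print(f"{label}: {prob:.3f}")
```

A probability close to 0.5 suggests the classifier is unsure, so an application might route such sentences to a manual check instead of trusting the label.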