zeynepgulhan committed on
Commit a174d1c · verified · 1 Parent(s): adf33f1

Update README.md

Files changed (1)
  1. README.md +39 -0
README.md CHANGED
@@ -1,3 +1,42 @@
  ---
  license: mit
+ language:
+ - tr
+ pipeline_tag: text-classification
+ tags:
+ - text-classification
  ---
+
+ ## Model Description
+ This model was fine-tuned from the [dbmdz/bert-base-turkish-128k-uncased](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased) model.
+
+ It was created to detect gibberish sentences such as "adssnfjnfjn".
+ It is a simple binary classifier that labels a sentence as either gibberish or real.
+
+ ## Usage
+
+ ```python
+ import numpy as np
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+ model = AutoModelForSequenceClassification.from_pretrained("TURKCELL/gibberish-detection-model-tr")
+ tokenizer = AutoTokenizer.from_pretrained("TURKCELL/gibberish-detection-model-tr", do_lower_case=True, use_fast=True)
+ model.to(device)
+
+ def get_result_for_one_sample(model, tokenizer, device, sample):
+     d = {  # maps the predicted class id to a label
+         1: 'gibberish',
+         0: 'real'
+     }
+     test_sample = tokenizer([sample], padding=True, truncation=True, max_length=256, return_tensors='pt').to(device)
+     output = model(**test_sample)
+     y_pred = np.argmax(output.logits.detach().to('cpu').numpy(), axis=1)  # index of the largest logit
+     return d[y_pred[0]]
+
+ sentence = "nabeer rdahdaajdajdnjnjf"
+ result = get_result_for_one_sample(model, tokenizer, device, sentence)
+ print(result)
+ ```
+
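The usage snippet in the README returns only a label. As a rough illustration of the step from logits to a label, the sketch below also derives a confidence score with a softmax. It assumes the same label mapping as the README (1 = gibberish, 0 = real). The helper `label_with_confidence` and the sample logits are hypothetical and are not part of the model repository or real model output.

```python
import math

# Label mapping taken from the README snippet: 1 = gibberish, 0 = real.
ID2LABEL = {0: "real", 1: "gibberish"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Hypothetical helper: return (label, probability) for one sample's logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only. With the README snippet, real values
# could be obtained with something like output.logits[0].tolist().
label, confidence = label_with_confidence([-1.2, 2.3])
print(label, round(confidence, 3))
```

A probability close to 0.5 would suggest a borderline input, so a caller might choose to apply a threshold instead of trusting the argmax alone.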