tuva committed on
Commit
cdc7003
1 Parent(s): aeb1fe4

Added logistic regression language classifier model

Files changed (4)
  1. README.md +2 -38
  2. config.json +6 -0
  3. model/language_classifier.joblib +3 -0
  4. model_card.md +11 -0
README.md CHANGED
@@ -1,40 +1,4 @@
- ---
- license: mit
- language:
- - ru
- pipeline_tag: text-classification
- tags:
- - tuvan
- - russian
- - binary classifier
- ---
- # GitHub
-
- <!-- Provide a quick summary of what the model is/does. -->
-
- TuRu - Tuvan/Russian binary classifier model [GitHub](https://github.com/tarbagan/tuvalang/tree/main/turu).
-
-
- ## How to use
-
-
-
- ```python
- from tensorflow.keras.models import load_model
-
- model = load_model('turu.h5')
-
- text_to_predict = ["""
- Президент ооң бодалы-биле алырга, регионалдыг-даа, муниципалдыг-даа деңнелде деткиир ужурлуг регионнарда спортчу инфраструктура хөгжүлдезиниң айтырыын көрген.
- Ооң келир үеде президент программазының угланыышкыны ол апаарын Владимир Путин чугаалаан.
- """]
-
- sequences = tokenizer.texts_to_sequences(text_to_predict)
- padded = pad_sequences(sequences, maxlen=10)
-
- prediction = model.predict(padded)
- print(prediction)
-
- ```

+ # Language Classifier

+ This model is trained to classify text as either Russian or Tuvan language.
config.json ADDED
@@ -0,0 +1,6 @@
+
+ {
+ "model_type": "logistic_regression",
+ "language": ["russian", "tuvan"],
+ "pipeline_tag": "text-classification"
+ }
model/language_classifier.joblib ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:93552bc0072004f7cceece81b1ffd546743d530c55d00fe1dd7703e5a35b87b6
+ size 14610753
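The three added lines are a Git LFS pointer, not the model itself: the repository stores only the object's SHA-256 and byte size, and the binary is fetched separately. As a minimal sketch (the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this commit), a downloaded copy of the file could be checked against the pointer like this:

```python
import hashlib
from pathlib import Path

# Pointer text copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:93552bc0072004f7cceece81b1ffd546743d530c55d00fe1dd7703e5a35b87b6
size 14610753
"""


def parse_lfs_pointer(text):
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and digest the pointer records."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["hash_algo"])
    with file_path.open("rb") as fh:
        # Hash in 1 MiB chunks so the whole file is never held in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
# Usage, once the real binary has been fetched:
# matches_pointer("model/language_classifier.joblib", pointer)
```

Checking the size first is a cheap way to catch the common case where the pointer file itself (a few hundred bytes) was downloaded instead of the roughly 14.6 MB binary.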
model_card.md ADDED
@@ -0,0 +1,11 @@
+
+ ---
+ tags:
+ - language-classification
+ - russian
+ - tuvan
+ ---
+
+ # Language Classifier
+
+ This model is trained to classify text as either Russian or Tuvan language. It is based on a logistic regression classifier.
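This commit removes the Keras usage example from README.md and adds no replacement for the new artifact. The following is a hedged sketch of how `model/language_classifier.joblib` might be loaded and queried. It assumes, without confirmation from the commit, that the file holds a complete scikit-learn pipeline (text vectorizer plus logistic regression) whose `predict` accepts raw strings; if only the bare classifier was saved, the matching vectorizer would be needed as well. The helper names `load_classifier` and `classify` are hypothetical.

```python
from pathlib import Path
from typing import Any, List, Sequence

# Path of the artifact as added in this commit.
MODEL_PATH = "model/language_classifier.joblib"


def load_classifier(path: str = MODEL_PATH) -> Any:
    """Load the serialized model with joblib.

    joblib is a third-party package and is imported lazily here.
    joblib.load unpickles the file, which can execute arbitrary code,
    so only load artifacts from a source you trust.
    """
    import joblib

    return joblib.load(path)


def classify(model: Any, texts: Sequence[str]) -> List[str]:
    """Return one predicted label per input text.

    ASSUMPTION: `model` is a full pipeline that takes raw strings.
    The label values it returns are not documented in the commit.
    """
    return [str(label) for label in model.predict(list(texts))]


if __name__ == "__main__" and Path(MODEL_PATH).is_file():
    clf = load_classifier()
    print(classify(clf, ["Владимир Путин чугаалаан."]))
```

Unlike the removed Keras example, this path needs no separate tokenizer or padding step, provided the pipeline assumption holds.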