julien-c HF staff committed on
Commit
e83e1ef
1 Parent(s): f2e3e95

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/sagorsarker/codeswitch-nepeng-lid-lince/README.md

Files changed (1)
  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ language:
+ - ne
+ - en
+ datasets:
+ - lince
+ license: mit
+ tags:
+ - codeswitching
+ - nepali-english
+ - language-identification
+ ---
+
+ # codeswitch-nepeng-lid-lince
+ This is a pretrained model for **language identification** of `nepali-english` code-mixed data taken from [LinCE](https://ritual.uh.edu/lince/home).
+
+ This model was trained for the repository below.
+
+ [https://github.com/sagorbrur/codeswitch](https://github.com/sagorbrur/codeswitch)
+
+ To install codeswitch:
+
+ ```
+ pip install codeswitch
+ ```
+
+ ## Identify Language
+
+ * **Method-1**
+
+ ```py
+ # Load the tokenizer and model from the Hugging Face Hub.
+ from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
+
+ tokenizer = AutoTokenizer.from_pretrained("sagorsarker/codeswitch-nepeng-lid-lince")
+
+ model = AutoModelForTokenClassification.from_pretrained("sagorsarker/codeswitch-nepeng-lid-lince")
+ lid_model = pipeline('ner', model=model, tokenizer=tokenizer)
+ # Tag the tokens of a code-mixed sentence with language labels.
+ lid_model("put any nepali english code-mixed sentence")
+
+ ```
+
+ * **Method-2**
+
+ ```py
+ from codeswitch.codeswitch import LanguageIdentification
+ lid = LanguageIdentification('nep-eng')
+ text = "" # your code-mixed sentence
+ result = lid.identify(text)
+ print(result)
+
+ ```
+
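The `'ner'` pipeline in Method-1 returns a list of dictionaries, one per token, each with `word`, `entity`, and `score` keys. If the tokenizer splits words into WordPiece pieces marked with a `##` prefix (an assumption here, not something the model card states), the pieces can be merged back into whole words. The sketch below is not part of the committed README; `merge_subwords` is a hypothetical helper, and the sample words and the `lang1`/`lang2` labels are made up for illustration rather than taken from the model.

```python
from typing import Dict, List, Tuple


def merge_subwords(tokens: List[Dict]) -> List[Tuple[str, str]]:
    """Merge WordPiece continuation pieces ("##...") into whole words.

    Each merged word keeps the label predicted for its first piece.
    """
    words: List[Tuple[str, str]] = []
    for tok in tokens:
        piece, label = tok["word"], tok["entity"]
        if piece.startswith("##") and words:
            # Continuation piece: append it to the previous word.
            prev_word, prev_label = words[-1]
            words[-1] = (prev_word + piece[2:], prev_label)
        else:
            # A new word starts here.
            words.append((piece, label))
    return words


# Hypothetical pipeline output; words and labels are invented for illustration.
sample = [
    {"word": "ma", "entity": "lang2", "score": 0.99},
    {"word": "school", "entity": "lang1", "score": 0.98},
    {"word": "ja", "entity": "lang2", "score": 0.97},
    {"word": "##dai", "entity": "lang2", "score": 0.95},
    {"word": "chu", "entity": "lang2", "score": 0.96},
]

print(merge_subwords(sample))
```

With real output, the call would be `merge_subwords(lid_model(text))`. Keeping the first piece's label is one simple choice; a majority vote over the pieces would be another.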