julien-c HF staff committed on
Commit
1d9dd26
1 Parent(s): 42efbb8

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/jannesg/takalane_zul_roberta/README.md

Files changed (1)
  1. README.md +49 -0
README.md ADDED
@@ -0,0 +1,49 @@
+ ---
+ language:
+ - zul
+ thumbnail: https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg
+ tags:
+ - zul
+ - fill-mask
+ - pytorch
+ - roberta
+ - masked-lm
+ license: MIT
+ ---
+
+ # Takalani Sesame - Zulu 🇿🇦
+
+ <img src="https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg" width="600"/>
+
+ ## Model description
+
+ Takalani Sesame (named after the South African version of Sesame Street) is a project that aims to promote the use of South African languages in NLP and, in particular, to explore techniques that help low-resource languages reach performance on par with larger languages around the world.
+
+ ## Intended uses & limitations
24
+ #### How to use
25
+
26
+ ```python
27
+ from transformers import AutoTokenizer, AutoModelWithLMHead
28
+
29
+ tokenizer = AutoTokenizer.from_pretrained("jannesg/takalane_zul_roberta")
30
+
31
+ model = AutoModelWithLMHead.from_pretrained("jannesg/takalane_zul_roberta")
32
+ ```
+
+ #### Limitations and bias
+
+ Updates will be added continuously to improve performance.
+
+ ## Training data
+
+ Data collected from [https://wortschatz.uni-leipzig.de/en](https://wortschatz.uni-leipzig.de/en) <br/>
+ **Sentences:** 410,000
+
+ ## Training procedure
+
+ No preprocessing was applied. The model was trained with the standard Hugging Face hyperparameters.
+
+ ## Author
+
+ Jannes Germishuys [website](http://jannesgg.github.io)
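
The "How to use" snippet in the model card only loads the tokenizer and the model. As a follow-up, here is a minimal sketch of how the checkpoint might be queried for masked-word predictions through the `fill-mask` pipeline, which matches the card's `fill-mask` tag. The helper names `mask_word` and `predict_masked` are hypothetical and not part of the card. The mask token is read from the tokenizer instead of being assumed, and the first call needs network access to download the weights.

```python
MODEL_ID = "jannesg/takalane_zul_roberta"  # model id taken from the model card


def mask_word(sentence: str, index: int, mask_token: str) -> str:
    """Replace the whitespace-separated word at `index` with `mask_token`."""
    words = sentence.split()
    if not 0 <= index < len(words):
        raise IndexError(f"word index {index} out of range for {len(words)} words")
    words[index] = mask_token
    return " ".join(words)


def predict_masked(sentence: str, index: int, top_k: int = 5):
    """Return (token, score) candidates for the masked word (hypothetical helper)."""
    # Imported lazily so the pure helper above works without transformers installed.
    from transformers import pipeline

    # Downloads the checkpoint from the Hugging Face Hub on first use.
    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    masked = mask_word(sentence, index, fill_mask.tokenizer.mask_token)
    # With a single mask, the pipeline returns a list of dicts for the top_k candidates.
    return [(r["token_str"], r["score"]) for r in fill_mask(masked, top_k=top_k)]
```

For example, `predict_masked("Ngiyabonga kakhulu", index=1)` would mask the second word of an illustrative Zulu phrase ("thank you very much") and return the model's candidate replacements with their scores. No output is shown here because the predictions depend on the downloaded checkpoint.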