castorini/afriteva_small


ToluClassics committed on May 24, 2022

Commit 475acc8 • 1 Parent(s): d60e552

Create README.md

Files changed (1)

README.md +51 -0

README.md ADDED

	@@ -0,0 +1,51 @@

+---
+language:
+- om
+- am
+- rw
+- rn
+- ha
+- ig
+- pcm
+- so
+- sw
+- ti
+- yo
+- multilingual
+- T5
+---
+# afriteva_small
+## Model description
+AfriTeVa small is a sequence-to-sequence model pretrained on 10 African languages.
+## Languages
+Afaan Oromoo (orm), Amharic (amh), Gahuza (gah), Hausa (hau), Igbo (igb), Nigerian Pidgin (pcm), Somali (som), Swahili (swa), Tigrinya (tig), Yoruba (yor)
+### More information on the model and dataset
+### The model
+- 64M-parameter encoder-decoder architecture (T5-like)
+- 6 layers, 8 attention heads, and a 512-token sequence length
+### The dataset
+- Multilingual: 10 African languages listed above
+- 143 million tokens (1 GB of text data)
+- Tokenizer vocabulary size: 70,000 tokens
+## Training Procedure
+For information on training procedures, please refer to the AfriTeVa [paper](#) or [repository](https://github.com/castorini/afriteva).
+## BibTeX entry and citation info
+coming soon ...
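
The card does not include a usage snippet, so the following is a minimal sketch of how a T5-like checkpoint such as this one might be loaded and run with the `transformers` library. It is not part of the commit above. It assumes the checkpoint loads through `AutoTokenizer` and `AutoModelForSeq2SeqLM` (which is typical for T5-style models but not confirmed by the card), that `transformers`, PyTorch, and `sentencepiece` are installed, and that inputs should be truncated to the 512-token sequence length the card states. The helper names `load_model` and `generate` are hypothetical.

```python
MODEL_ID = "castorini/afriteva_small"  # repository name shown on this page
MAX_INPUT_TOKENS = 512  # sequence length stated in the model card


def load_model(model_id: str = MODEL_ID):
    """Download the tokenizer and model weights from the Hugging Face Hub.

    Assumption: the checkpoint is compatible with the Auto* seq2seq classes.
    """
    # Imported lazily so the helpers can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate(text: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Run one text-to-text generation pass on a single input string."""
    # Truncate to the sequence length given in the card.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) generated sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads weights from the Hub on first call):
#   tokenizer, model = load_model()
#   print(generate("Habari ya asubuhi", tokenizer, model))
```

Because the card describes this checkpoint only as pretrained, the raw output of `generate` is unlikely to be meaningful for a downstream task without fine-tuning; the sketch is meant to show the loading and tokenization steps only.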