Quantization made by Richard Erkhov.

gpt2-small-arabic - GGUF

Model creator: https://huggingface.co/akhooli/
Original model: https://huggingface.co/akhooli/gpt2-small-arabic/

Name	Quant method	Size
gpt2-small-arabic.Q2_K.gguf	Q2_K	0.08GB
gpt2-small-arabic.IQ3_XS.gguf	IQ3_XS	0.08GB
gpt2-small-arabic.IQ3_S.gguf	IQ3_S	0.08GB
gpt2-small-arabic.Q3_K_S.gguf	Q3_K_S	0.08GB
gpt2-small-arabic.IQ3_M.gguf	IQ3_M	0.09GB
gpt2-small-arabic.Q3_K.gguf	Q3_K	0.09GB
gpt2-small-arabic.Q3_K_M.gguf	Q3_K_M	0.09GB
gpt2-small-arabic.Q3_K_L.gguf	Q3_K_L	0.1GB
gpt2-small-arabic.IQ4_XS.gguf	IQ4_XS	0.1GB
gpt2-small-arabic.Q4_0.gguf	Q4_0	0.1GB
gpt2-small-arabic.IQ4_NL.gguf	IQ4_NL	0.1GB
gpt2-small-arabic.Q4_K_S.gguf	Q4_K_S	0.1GB
gpt2-small-arabic.Q4_K.gguf	Q4_K	0.11GB
gpt2-small-arabic.Q4_K_M.gguf	Q4_K_M	0.11GB
gpt2-small-arabic.Q4_1.gguf	Q4_1	0.11GB
gpt2-small-arabic.Q5_0.gguf	Q5_0	0.11GB
gpt2-small-arabic.Q5_K_S.gguf	Q5_K_S	0.11GB
gpt2-small-arabic.Q5_K.gguf	Q5_K	0.12GB
gpt2-small-arabic.Q5_K_M.gguf	Q5_K_M	0.12GB
gpt2-small-arabic.Q5_1.gguf	Q5_1	0.12GB
gpt2-small-arabic.Q6_K.gguf	Q6_K	0.13GB

Original model description:

language: "ar"
datasets: Arabic Wikipedia
metrics: none

GPT2-Small-Arabic

Model description

A GPT2 model trained on the Arabic Wikipedia dataset, based on gpt2-small (using Fastai2).

Intended uses & limitations

How to use

An example is provided in this colab notebook. Both text and poetry (fine-tuned model) generation are included.
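For readers without access to the notebook, the following is a minimal sketch of text generation with the original (non-quantized) model. It assumes the Hugging Face `transformers` library is installed and that the model id matches the original model URL above. The helper name `generate_text` and the `max_new_tokens` default are illustrative choices, not part of the original model card.

```python
# Model id taken from the original model URL in this card.
MODEL_ID = "akhooli/gpt2-small-arabic"


def generate_text(prompt, max_new_tokens=50):
    """Generate a continuation of `prompt` (hypothetical helper).

    The import is deferred so that defining this function does not
    require transformers; the model is downloaded on the first call.
    """
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_text` with an Arabic prompt returns the prompt followed by the generated continuation. The GGUF files listed above are meant for a GGUF-compatible runtime instead and are not loaded by this sketch.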

Limitations and bias

GPT2-small-arabic (trained on Arabic Wikipedia) has several limitations in terms of coverage (Arabic Wikipedia quality, no diacritics) and training performance. Use it as a demonstration or proof of concept, not in production.

Training data

This pretrained model used the Arabic Wikipedia dump (around 900 MB).

Training procedure

Training was done with the Fastai2 library on Kaggle, using a free GPU.

Eval results

Final results: perplexity 72.19, loss 4.28, accuracy 0.307.
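The perplexity and loss figures are related: assuming the loss is the mean cross-entropy in nats (the usual convention, though not stated here), perplexity is its exponential. A short check of that relation, with the small gap attributable to the loss being rounded to two decimals:

```python
import math

loss = 4.28                   # reported loss, rounded to two decimals
reported_perplexity = 72.19   # reported perplexity

# Perplexity is exp(cross-entropy loss) when the loss is measured in nats.
perplexity = math.exp(loss)

# The recomputed value is close to, but not exactly, the reported one,
# because the reported loss has been rounded.
print(f"exp({loss}) = {perplexity:.2f} (reported: {reported_perplexity})")
```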

BibTeX entry and citation info

@inproceedings{Abed Khooli,
  year={2020}
}