A GPT-2 model for Sanskrit, one of the oldest languages in the world. Configuration: 12 attention heads, 6 layers. This is a somewhat smaller version, since I trained it on my laptop, which has a small GPU.
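
To give a rough sense of what 6 layers and 12 heads mean for model size, here is a minimal sketch that estimates the parameter count of a GPT-2 style network. Only the head and layer counts come from this card. The hidden size (768), vocabulary size (50257), and context length (1024) are the stock GPT-2 defaults and are assumed here; the Sanskrit tokenizer's real vocabulary size is not stated, so the actual number may differ. `gpt2_param_count` is a hypothetical helper written for this illustration, not part of any library.

```python
def gpt2_param_count(n_layer, n_embd=768, vocab_size=50257, n_positions=1024):
    """Estimate the parameter count of a GPT-2 style model.

    Assumes the standard GPT-2 layout: tied input/output embeddings,
    a fused QKV projection, and an MLP with a 4x hidden expansion.
    All defaults except n_layer are ASSUMED stock GPT-2 values,
    not values stated in this model card.
    """
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d),
    # and two LayerNorms (4*d), giving 12*d^2 + 13*d in total.
    per_block = 12 * n_embd ** 2 + 13 * n_embd
    # Final LayerNorm (weight and bias).
    final_ln = 2 * n_embd
    return embeddings + n_layer * per_block + final_ln


N_HEAD = 12    # from this model card
N_LAYER = 6    # from this model card
N_EMBD = 768   # ASSUMPTION: stock GPT-2 hidden size

# The hidden size must split evenly across the attention heads.
assert N_EMBD % N_HEAD == 0
head_dim = N_EMBD // N_HEAD

this_model = gpt2_param_count(N_LAYER, n_embd=N_EMBD)
gpt2_small = gpt2_param_count(12, n_embd=N_EMBD)

print(f"head dimension:      {head_dim}")
print(f"6-layer estimate:    {this_model:,}")
print(f"12-layer GPT-2 size: {gpt2_small:,}")
```

Under these assumptions the head count does not change the parameter total; it only sets the per-head dimension. Halving the layer count from 12 to 6 removes roughly a third of the parameters, because the embedding tables stay the same size.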