Kite

🎉 You are looking at Kite 5 Release Candidate 3, which changed the configuration!

Kite is a small, trained language model with about 11 million parameters.
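To give a sense of scale, the weight footprint can be estimated from the figures reported in this page's metadata (11.7M params, F32 tensors). The sketch below is only a back-of-the-envelope calculation: the parameter count is the rounded value shown on the page, the helper name `weight_footprint_mb` is hypothetical, and the result covers weights only, not activations or optimizer state.

```python
# Rough estimate of the on-disk / in-memory size of the weights alone.
# Assumptions: 11.7M is the rounded count from the page metadata, and
# every parameter is stored as a 4-byte float (F32).
PARAM_COUNT = 11_700_000
BYTES_PER_PARAM = 4


def weight_footprint_mb(param_count: int, bytes_per_param: int) -> float:
    """Return the weight size in megabytes (1 MB = 1,000,000 bytes)."""
    return param_count * bytes_per_param / 1_000_000


footprint = weight_footprint_mb(PARAM_COUNT, BYTES_PER_PARAM)
print(f"{footprint:.1f} MB")  # prints "46.8 MB"
```

At under 50 MB of weights, the model should fit comfortably on CPU-only machines.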

Training

It was trained on a tokenized version of qikp/small-data-3, a mixture of various datasets, for 1 epoch with a batch size of 32, a learning rate of 1.5e-4, and the pika 4 tokenizer.
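The hyperparameters above can be collected in a small configuration object. This is a minimal sketch, not the actual training script: the `TrainConfig` class, the `optimizer_steps` helper, and the example sequence count are all hypothetical, and the step calculation assumes no gradient accumulation and that a final partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as stated in the Training section (field names are illustrative)."""
    dataset: str = "qikp/small-data-3"
    tokenizer: str = "pika 4"
    epochs: int = 1
    batch_size: int = 32
    learning_rate: float = 1.5e-4


def optimizer_steps(num_sequences: int, cfg: TrainConfig) -> int:
    """Number of optimizer steps, assuming one step per batch and no accumulation."""
    return cfg.epochs * math.ceil(num_sequences / cfg.batch_size)


cfg = TrainConfig()
# 100_000 is a made-up sequence count; the real dataset size is not stated here.
example_steps = optimizer_steps(100_000, cfg)
print(example_steps)  # prints 3125
```

With a single epoch, the step count depends only on the dataset size and the batch size, so each training sequence is seen exactly once.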

Evaluation was also run during training on a tokenized and truncated version of byunggill/gpt-2-output.

Limitations

Because of its small size, the model is not suitable for production workloads.

Model size: 11.7M params (Safetensors, tensor type F32)

Dataset used to train qikp/kite-5-11m-rc3: qikp/small-data-3