Good performance

#1
by cnmoro - opened

As far as I've tested, the model's performance is really good! Do you have any plans to provide another version with a longer context length, such as 8k or 16k?

Hi @cnmoro ,

Thanks for your interest.

You may check out the TinyLlama/TinyLlama-1.1B-Chat-v0.3 model, which offers slightly better performance at a similar size (based on https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

You may also check microsoft/phi-1_5, which has a similar size (1.3B) but far better performance. Please note that it is not instruction fine-tuned.

I am not planning to train longer context models.

Roughly 60 GB of GPU RAM would be needed to fine-tune a 1.1B-parameter model with an 8K context using QLoRA (based on https://rahulschand.github.io/gpu_poor/).
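To illustrate why context length drives the memory cost so sharply, here is a rough back-of-envelope sketch. The hidden size and layer count come from TinyLlama-1.1B's published config; the 2-bytes-per-value figure and the linear scaling are simplifying assumptions that ignore gradients, optimizer state, and attention buffers, each of which multiplies the total several-fold:

```python
# Hedged back-of-envelope sketch: how activation memory scales with context.
# Assumptions (not exact measurements): bf16 activations at 2 bytes each,
# one saved activation tensor per layer, gradients / optimizer state /
# attention buffers ignored. Real usage is several times higher.

HIDDEN = 2048   # TinyLlama-1.1B hidden size (published config)
LAYERS = 22     # TinyLlama-1.1B layer count (published config)
BYTES = 2       # bf16 activations

def activation_gb(ctx_len, batch=1):
    """Approximate saved-activation memory (GB) for one forward pass."""
    return batch * ctx_len * HIDDEN * LAYERS * BYTES / 1e9

# Under this linear model, going from a 2K to an 8K context
# quadruples the activation footprint alone.
for ctx in (2048, 4096, 8192):
    print(ctx, round(activation_gb(ctx), 2))
```

Attention score matrices additionally grow quadratically in context length, which is part of why tools like gpu_poor report such large totals for long-context fine-tuning.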

Regards

closing...

habanoz changed discussion status to closed
