Great work on this model; the initial results are very impressive. Is there any chance of downloading the full weights of the model (70GB) so that we can run fine-tuning on a TPU? I want to fine-tune GPT-JT on a custom prompt dataset.
I'm looking to run fine-tuning following this guide: https://github.com/kingoflolz/mesh-transformer-jax
@idop11 Apparently the model cannot be fine-tuned at this time: https://huggingface.co/togethercomputer/GPT-JT-6B-v1/discussions/15
Thanks for your interest in fine-tuning our model! Unfortunately, our model was not trained using mesh-transformer-jax, and the format of the full weights (including optimizer states) might not be compatible with their code base.
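For a rough sense of why a full checkpoint with optimizer states lands near the 70GB mentioned above, here is a back-of-the-envelope sketch. The numbers are assumptions for illustration, not details confirmed in this thread: roughly 6 billion parameters, weights stored in 32-bit floats, and an Adam-style optimizer that keeps two extra values per parameter. The helper name `checkpoint_size_gb` is hypothetical.

```python
def checkpoint_size_gb(n_params: float,
                       bytes_per_param: int = 4,
                       optimizer_slots: int = 2) -> float:
    """Rough size in GB (10^9 bytes) of weights plus optimizer state.

    Assumes every parameter and every optimizer slot uses the same
    number of bytes (4 for fp32). These defaults are assumptions.
    """
    # One copy of the weights plus `optimizer_slots` extra values per parameter.
    total_bytes = n_params * bytes_per_param * (1 + optimizer_slots)
    return total_bytes / 1e9


# Assumed: about 6e9 parameters, fp32, two Adam moment buffers.
full_checkpoint = checkpoint_size_gb(6e9)
weights_only = checkpoint_size_gb(6e9, optimizer_slots=0)
print(f"weights + optimizer: {full_checkpoint:.0f} GB")
print(f"weights only:        {weights_only:.0f} GB")
```

Under these assumptions the optimizer state accounts for about two thirds of the total, which is why a "full weights" download is so much larger than an inference-only checkpoint.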
@kobalsky The model can be fine-tuned, but some adjustments are required. Check out this~