Finetuning: Loss is 0 after 1 step, Runtime error in inference

#4 opened by abipani

I finetuned the 7B Chat Int4 model on a 200-sample dataset. The loss started at 2.00 and then dropped to 0 for the rest of the steps.
I saved the model, and when I try to run inference it shows the error below:
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0

Qwen org

Could you please open an issue at https://github.com/QwenLM/Qwen/issues so that we can better track this?
We also need more context, such as the script or framework, the devices, and the software environment you used.
Thanks!
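
In the meantime, one quick check is whether the finetuned checkpoint already produces NaN/Inf logits, which would explain the multinomial error during sampling. A rough diagnostic sketch (the model path is a placeholder for your own checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: point this at your saved finetuned checkpoint.
model_dir = "path/to/finetuned-qwen-7b-chat-int4"

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, device_map="auto", trust_remote_code=True
).eval()

# Run a single forward pass and inspect the raw logits before any sampling.
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: [batch, seq_len, vocab_size]

print("NaN in logits:", torch.isnan(logits).any().item())
print("Inf in logits:", torch.isinf(logits).any().item())
```

If either check prints True, the weights were likely corrupted during finetuning (consistent with the loss collapsing to 0), and the training setup details requested above will help us narrow down the cause.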
