Fine-tuning the model on any dataset gives OOM

#1
by vibhas09 - opened

Hi @Serega6678 and Numind Team,

Thanks for releasing the model weights - they work well out of the box.
I used the earlier checkpoint (v0.1 / generic-entity-extractor) and it worked flawlessly for me.

However, I am not able to fine-tune v1.0 at all.

The training loop runs fine - but as soon as the validation loop starts, I get a GPU out-of-memory (OOM) error.
I have tried datasets ranging from 500 to 500k samples, on different GPUs (T4, A10G), and the issue occurs in every case.

I looked into it and found recommendations to use eval_accumulation_steps - but with that I get an OOM error in CPU RAM instead.
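For scale, a back-of-envelope estimate (hypothetical numbers, assuming RoBERTa-base-like dimensions and a CoNLL-style label set) of how fast accumulated evaluation outputs grow when a model returns every layer's hidden states in addition to its logits - the Trainer eval loop concatenates all batch outputs across the whole eval set:

```python
# Back-of-envelope estimate (hypothetical dimensions, RoBERTa-base-like).
# The HF Trainer eval loop accumulates every batch's outputs, so any
# extra tensors beyond the logits pile up for the entire eval set.
num_layers = 13          # embedding output + 12 transformer layers
hidden_size = 768
seq_len = 256
num_labels = 9           # e.g. CoNLL-style NER (assumption)
eval_samples = 5_000
bytes_per_float = 4      # fp32

logits_bytes = eval_samples * seq_len * num_labels * bytes_per_float
hidden_bytes = eval_samples * seq_len * hidden_size * num_layers * bytes_per_float

print(f"logits only:        {logits_bytes / 1e9:.2f} GB")
print(f"with hidden states: {hidden_bytes / 1e9:.2f} GB")
```

Under these assumptions the logits alone are well under 0.1 GB, while the hidden states run to tens of GB - enough to exhaust GPU RAM, and CPU RAM too once `eval_accumulation_steps` moves the accumulation there.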
All in all - I am not able to use this checkpoint with any dataset.

For your reference, here is a sample fine-tuning run on Google Colab, using a publicly available dataset, that reproduces the same error:
Sample Notebook Using Official HF Example
Let me know if I am missing anything - I have tried almost everything but cannot find a solution.

NuMind org

Hey @vibhas09 ! Thanks for reporting the problem!

We fixed it - you can use it now! The problem was that AutoModelForTokenClassification behaved strangely when output_hidden_states=True (which was enabled in the model's config).
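For anyone still on an older snapshot of the checkpoint, a minimal sketch of the workaround: override `output_hidden_states` before fine-tuning so the eval loop only accumulates logits. `RobertaConfig` below is a local stand-in (an assumption about the architecture) for what you would get from `AutoConfig.from_pretrained("numind/NuNER-v1.0")`:

```python
from transformers import RobertaConfig

# Stand-in for the checkpoint's shipped config, which had
# output_hidden_states=True (the cause of the eval-loop OOM).
config = RobertaConfig(output_hidden_states=True)

# Workaround: turn it off before fine-tuning. With the real checkpoint
# this would look like:
#   config = AutoConfig.from_pretrained("numind/NuNER-v1.0")
#   config.output_hidden_states = False
#   model = AutoModelForTokenClassification.from_pretrained(
#       "numind/NuNER-v1.0", config=config)
config.output_hidden_states = False

print(config.output_hidden_states)  # False
```

With the flag off, the model's forward pass returns only the logits (and loss), so evaluation memory stays proportional to the batch, not to the full eval set times the layer count.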

Also, we just released NuNER-v2.0 - an even more powerful iteration of NuNER: https://huggingface.co/numind/NuNER-v2.0

Thanks for your question, and sorry for the long response time - hopefully NuNER-v2.0 is good compensation for it 😀

Thanks a lot for looking into this - really appreciate it.

I did try v2.0 and it works like ✨.
Thanks for releasing the weights and the models.
