[Don't merge] inferentia2 workaround

#34
by philschmid HF staff - opened

This is a workaround for deploying Llama 3 on Inferentia with TGI. The new generation_config now has a list as eos_token_id, which causes the deployment to fail. This revision removes one of the entries.
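For anyone who needs to apply the same kind of change to a local copy, here is a minimal sketch of the idea, not the exact diff in this revision. It assumes the loader expects `eos_token_id` to be a single integer. The helper names `collapse_eos_token_id` and `patch_file` are hypothetical, as is the `keep_index` parameter, and which entry should be kept is something you would need to decide for your own deployment.

```python
import json
from pathlib import Path


def collapse_eos_token_id(config: dict, keep_index: int = 0) -> dict:
    """Return a copy of `config` whose eos_token_id is a single int.

    Hypothetical helper: if eos_token_id is a list, keep only the entry
    at `keep_index`; any other value is left untouched.
    """
    patched = dict(config)
    eos = patched.get("eos_token_id")
    if isinstance(eos, list):
        if not eos:
            raise ValueError("eos_token_id is an empty list")
        patched["eos_token_id"] = eos[keep_index]
    return patched


def patch_file(path: str, keep_index: int = 0) -> None:
    """Rewrite a local generation_config.json in place (hypothetical helper)."""
    p = Path(path)
    config = json.loads(p.read_text())
    patched = collapse_eos_token_id(config, keep_index)
    p.write_text(json.dumps(patched, indent=2) + "\n")
```

Usage would look like `patch_file("generation_config.json")` on a local checkout, where the path is an assumption. The original dict is not mutated, so you can compare the two versions before writing anything back.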

philschmid changed pull request title from inferentia2 workaround to [Don't merge] inferentia2 workaround
