Suggestion: Host Gemma 2 using keras_nlp instead of the transformers library for the time being

#498
by qnixsynapse - opened

Currently, Gemma 2 is broken on 🤗 chat even after adding attention softcaps (I hope the transformers library was updated there):

[screenshot: broken Gemma 2 output in HuggingChat]

It would be best to host it like this:

import keras_nlp

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_instruct_27b_en")
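
For context, a minimal usage sketch once the model is loaded (the prompt and max_length are illustrative, not from this thread):

import keras_nlp

# Load the Gemma 2 27B instruction-tuned preset (weights require Kaggle access).
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_instruct_27b_en")

# Generate a completion for a sample prompt.
output = gemma_lm.generate("Why is the sky blue?", max_length=128)
print(output)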

More information is available on their Kaggle repo.

Edit: Also set the top_k value to 50 or less and the temperature to 0.3.
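
A sketch of how those sampling settings could be applied via the KerasNLP sampler API (assuming TopKSampler accepts a temperature argument; the thread itself does not show this step):

import keras_nlp

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_instruct_27b_en")

# Apply the suggested sampling settings: top-k of 50 and a temperature of 0.3.
gemma_lm.compile(sampler=keras_nlp.samplers.TopKSampler(k=50, temperature=0.3))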

Hugging Chat org

Hi, we use TGI for hosting Gemma 2, so the fix will need to come from that side, but thanks for the suggestion!

nsarrazin changed discussion status to closed

I see. No problem.
