Question

by mrfakename - opened 26 days ago


Hi
Thanks for releasing Granite, can’t wait to try it out. If it’s based on the Llama arch, why does it need Transformers 4.41?
Thanks!
(PS: thanks for using the Llama arch instead of a custom one - makes it so much easier to tune :))

mayank-mishra

IBM Granite org 26 days ago

hi @mrfakename, the llama arch required adding a new config parameter, 'mlp_bias', which is why Transformers 4.41 is needed
PR: https://github.com/huggingface/transformers/pull/30031
the rest is the same as llama
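
For readers wondering what the flag changes: as far as I understand the Llama MLP block in Transformers, 'mlp_bias' only toggles the bias terms on the three linear projections. A sketch of the block with the flag on, where the symbol names are my own shorthand rather than anything from the PR:

```latex
% Llama-style gated MLP with mlp_bias = true.
% W_gate, W_up, W_down are the gate/up/down projection weights (shorthand, assumed naming).
% With mlp_bias = false, the terms b_gate, b_up, b_down are simply absent.
\[
  \mathrm{MLP}(x) = W_{\text{down}}\Big(\mathrm{SiLU}\big(W_{\text{gate}}\,x + b_{\text{gate}}\big)
  \odot \big(W_{\text{up}}\,x + b_{\text{up}}\big)\Big) + b_{\text{down}}
\]
```

A Transformers release without this flag has no way to load the bias terms, which is why the config option matters even though the architecture is otherwise unchanged.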

mayank-mishra

IBM Granite org 26 days ago

you can find this param in our config as well: https://huggingface.co/ibm-granite/granite-3b-code-base/blob/c2475bd7587e4e08fafb0e22223f9af7081c5c00/config.json#L14
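
A minimal sketch of the compatibility check this implies, using only the standard library. The config excerpt and the helper names (`parse_version`, `needs_newer_transformers`) are hypothetical and for illustration only; the 4.41 threshold comes from this thread, and the 'mlp_bias' value shown is an assumed one, so check the linked config.json for the real setting.

```python
import json

# Minimum Transformers version mentioned in this thread.
MIN_VERSION = (4, 41)


def parse_version(version: str) -> tuple:
    """Turn '4.41.0' or '4.41.0.dev0' into (4, 41) for a coarse comparison."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def needs_newer_transformers(config: dict, installed: str) -> bool:
    """True if the config sets mlp_bias but the installed version predates 4.41."""
    uses_mlp_bias = bool(config.get("mlp_bias", False))
    return uses_mlp_bias and parse_version(installed) < MIN_VERSION


# Illustrative excerpt only, not the full Granite config (values assumed).
config = json.loads('{"model_type": "llama", "mlp_bias": true}')

print(needs_newer_transformers(config, "4.40.2"))  # True
print(needs_newer_transformers(config, "4.41.0"))  # False
```

In practice you would pass `transformers.__version__` as the `installed` argument and the parsed config.json as `config`.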

mrfakename

25 days ago

thx for the explanation! makes sense

mrfakename changed discussion status to closed 25 days ago
