huggyllama/llama-65b

#1
by KnutJaegersberg - opened

the config file has this as file path which looks a little weird

huggyllama/llama-65b

LLM360/K2 would look better :)

KnutJaegersberg changed discussion status to closed

Hah yeah I agree, it looks funny.

Additionally, lack of gqa is an architectural choice that is puzzling to me.

LLM360 org

the config file has this as file path which looks a little weird

huggyllama/llama-65b

Yeah, thanks for spotting this. It happened because, when we did a checkpoint conversion, we loaded an existing model and then modified it, loading our own weights, etc.

Fixing them now.
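
For anyone who hits the same leftover path after a conversion, here is a minimal sketch of one way to patch it. It assumes the stale value lives under the `_name_or_path` key of `config.json`, which is where `transformers` normally records the load path; the thread itself does not name the key. The helper name `fix_name_or_path` is hypothetical.

```python
import json
from pathlib import Path


def fix_name_or_path(config_file, new_name):
    """Overwrite the `_name_or_path` entry of a config.json.

    Assumption: the leftover source path is stored under `_name_or_path`.
    Returns the previous value (or None if the key was absent).
    """
    path = Path(config_file)
    config = json.loads(path.read_text(encoding="utf-8"))
    old_value = config.get("_name_or_path")
    config["_name_or_path"] = new_name
    # Write the file back with all other keys left untouched.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return old_value
```

Calling `fix_name_or_path("config.json", "LLM360/K2")` on a local copy would replace the old entry and return whatever was there before, so the change is easy to double check.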

LLM360 org

Hah yeah I agree, it looks funny.

Additionally, lack of gqa is an architectural choice that is puzzling to me.

Maybe not the best choice, now that I am looking at it. During our initial design, we tended to favor simple options, since our goal is to make research on these models easier.
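
For context on what GQA would change, here is a rough sketch of the per-token KV cache size with and without it. The layer count, head counts, and head dimension below are illustrative assumptions, not figures stated in this thread, and the function name is hypothetical.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache bytes stored per generated token.

    The factor 2 accounts for one key and one value vector per layer
    and per KV head; bytes_per_value=2 assumes fp16/bf16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


# Illustrative (assumed) shapes: 80 layers, head_dim 128.
# Full multi-head attention keeps one KV head per query head (64 here).
mha_bytes = kv_cache_bytes_per_token(n_layers=80, n_kv_heads=64, head_dim=128)
# GQA shares each KV head across a group of query heads (8 KV heads here).
gqa_bytes = kv_cache_bytes_per_token(n_layers=80, n_kv_heads=8, head_dim=128)

print(mha_bytes, gqa_bytes, mha_bytes // gqa_bytes)
```

Under these assumed shapes the cache shrinks in proportion to the number of KV heads, which is the main practical cost of skipping GQA at inference time.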
