the argument name should be past_key_value, not past_key_values

Databricks org

Hmm, so this implementation is largely based on LlamaForCausalLM, which for some reason uses the singular past_key_value in some modules and the plural past_key_values in others.

For example, here is the official LlamaModel code in transformers: https://github.com/huggingface/transformers/blob/03732dea60fba1da78c79eb59c443ebf975c2be6/src/transformers/models/llama/modeling_llama.py#L945
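
To make the singular/plural split concrete, here is a minimal sketch of how I read the naming: the model-level forward takes the plural past_key_values, and each layer-level forward takes the singular past_key_value. ToyDecoderLayer, ToyModel, and LayerCache are hypothetical stand-ins written for illustration; they are not the transformers classes, and the "cache" is just a pair of Python lists.

```python
from typing import List, Optional, Tuple

# Hypothetical toy types for illustration only (not the transformers API).
# One layer's cache is a (keys, values) pair.
LayerCache = Tuple[List[int], List[int]]


class ToyDecoderLayer:
    """Layer-level module: takes the singular `past_key_value`."""

    def forward(
        self, token: int, past_key_value: Optional[LayerCache] = None
    ) -> LayerCache:
        keys, values = past_key_value if past_key_value is not None else ([], [])
        # Append this step's (fake) key and value to the layer's cache.
        return keys + [token], values + [token * 10]


class ToyModel:
    """Model-level module: takes the plural `past_key_values`."""

    def __init__(self, num_layers: int = 2) -> None:
        self.layers = [ToyDecoderLayer() for _ in range(num_layers)]

    def forward(
        self, token: int, past_key_values: Optional[List[LayerCache]] = None
    ) -> List[LayerCache]:
        caches: List[Optional[LayerCache]] = (
            list(past_key_values)
            if past_key_values is not None
            else [None] * len(self.layers)
        )
        # Each layer receives its own entry under the singular name.
        return [
            layer.forward(token, past_key_value=layer_cache)
            for layer, layer_cache in zip(self.layers, caches)
        ]


model = ToyModel(num_layers=2)
cache = model.forward(1)                         # first step, no cache yet
cache = model.forward(2, past_key_values=cache)  # second step reuses the cache
```

Under this reading, renaming the model-level argument to the singular form would diverge from the Llama code the implementation mirrors.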

I'll follow up with the HF folks about this, but I'd like to avoid changing the source right now since it matches what Llama does.

Databricks org

Got an official answer from the HF folks, and I believe the argument names are intentional: https://github.com/huggingface/transformers/pull/29921#issuecomment-2039903230

abhi-db changed pull request status to closed
