Bug: fix the rope_scaling

#2
by alfredplpl - opened

I ran the code:

from transformers import AutoTokenizer
import transformers

model = "NousResearch/Yarn-Llama-2-13b-128k"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    device_map="auto",
    load_in_8bit=True
)

Then, I got the error:

  File "/path/to/venv/lib/python3.8/site-packages/transformers/models/llama/configuration_llama.py", line 149, in __init__
    self._rope_scaling_validation()
  File "/path/to/venv/lib/python3.8/site-packages/transformers/models/llama/configuration_llama.py", line 167, in _rope_scaling_validation
    raise ValueError(
ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 32.0, 'original_max_position_embeddings': 4096, 'type': 'yarn', 'finetuned': True}

But the config is as follows:

{
...
  "rope_scaling": {
    "factor": 32.0,
    "original_max_position_embeddings": 4096,
    "type": "yarn",
    "finetuned": true
  },
...
}

Please fix https://huggingface.co/NousResearch/Yarn-Llama-2-13b-128k/blob/main/configuration_llama.py.

Thanks in advance.

NousResearch org

Pass trust_remote_code=True to the pipeline call -- this is needed to run the custom modeling code

emozilla changed discussion status to closed

I tried to deploy it on SageMaker and got the same error. How do I pass trust_remote_code=True?

NousResearch org

I tried to deploy it on SageMaker and got the same error. How do I pass trust_remote_code=True?

Add it right here:

from transformers import AutoTokenizer
import transformers

model = "NousResearch/Yarn-Llama-2-13b-128k"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    device_map="auto",
    load_in_8bit=True,
    trust_remote_code=True  # needed to run the repo's custom modeling code
)
