CodeLlamaTokenizer

#11
opened by pcuenq (Code Llama org)
No description provided.
osanseviero changed pull request status to merged

I'm having issues with this; AutoTokenizer doesn't seem to be able to import it:

Traceback (most recent call last):
  File "/home/federico/MultiPL-E/automodel.py", line 92, in <module>
    main()
  File "/home/federico/MultiPL-E/automodel.py", line 87, in main
    model = Model(args.name, args.revision, args.tokenizer_name, args.tokenizer_revision)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/federico/MultiPL-E/automodel.py", line 14, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name or name, padding_side="left", trust_remote_code=True)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/federico/santacoder-finetuning-lua/.env/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 731, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class CodeLlamaTokenizer does not exist or is not currently imported.

I updated the transformers library to the latest commit on git.

Code Llama org

Hmm, strange: it works with transformers @ main for me. Could you please paste the output of transformers-cli env and provide a short reproduction snippet?

Fixed! FYI: you have to uninstall tokenizers if you were on @main from before CodeLlamaTokenizer was added.
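For reference, a sketch of the sequence that resolves this kind of stale-install issue (the git URL is the official transformers repository; the exact commit you land on will vary):

```shell
# Remove the tokenizers build left over from the older source install
pip uninstall -y tokenizers

# Reinstall transformers from the main branch; this pulls in a
# compatible tokenizers version as a dependency
pip install --upgrade "git+https://github.com/huggingface/transformers.git"

# Sanity check: the class should now be importable
python -c "from transformers import CodeLlamaTokenizer; print('ok')"
```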

Code Llama org

Oh, I see! Glad you could solve it :)

Code Llama org

You shouldn't have to uninstall tokenizers; it's completely unrelated, since the class used here is CodeLlamaTokenizer and not CodeLlamaTokenizerFast.
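To see which of the two classes your local install actually exposes, a quick check like the following can help (the helper name is hypothetical; it just probes the top-level transformers namespace, and returns an empty list if transformers isn't installed at all):

```python
import importlib

def available_codellama_tokenizers():
    """Return which Code Llama tokenizer classes the installed
    transformers build exposes: the slow (sentencepiece-based)
    CodeLlamaTokenizer and/or the fast (tokenizers-based)
    CodeLlamaTokenizerFast."""
    try:
        transformers = importlib.import_module("transformers")
    except ImportError:
        # transformers not installed at all
        return []
    return [
        name
        for name in ("CodeLlamaTokenizer", "CodeLlamaTokenizerFast")
        if getattr(transformers, name, None) is not None
    ]

if __name__ == "__main__":
    print(available_codellama_tokenizers())
```

If CodeLlamaTokenizer is missing from the output, the install predates the class and needs updating; if only the fast variant is missing, the tokenizers dependency is the likely culprit.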
