Can't download the tokenizer
#15
by alerio - opened
Hi, I was trying to download the model and tokenizer via the following code:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "hivemind/gpt-j-6B-8bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,  # NOTE: loading GPT-2 in 8-bit did not work
    device_map={'': torch.cuda.current_device()},
)
But I got this error: Can't load tokenizer for 'hivemind/gpt-j-6B-8bit'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'hivemind/gpt-j-6B-8bit' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.
Any help will be appreciated!
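Not the original poster, but one possible workaround, assuming the error means the 8-bit repo simply doesn't ship tokenizer files: GPT-J uses the same GPT-2-style tokenizer as the upstream EleutherAI checkpoint, so you may be able to load the tokenizer from the original `EleutherAI/gpt-j-6B` repo instead (a sketch, not a confirmed fix from the model author):

```python
from transformers import AutoTokenizer

# The 8-bit repo may be missing tokenizer files; GPT-J's tokenizer is the
# same tokenizer shipped with the upstream EleutherAI checkpoint, so load
# it from there and reuse it with the 8-bit weights.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer.pad_token = tokenizer.eos_token
```

The model weights can still come from "hivemind/gpt-j-6B-8bit"; only the tokenizer source changes.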
Hey Alerio, I tried the following code from this example: https://colab.research.google.com/drive/1qOjXfQIAULfKvZqwCen8-MoWKGdSatZ4#scrollTo=W8tQtyjp75O
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "hivemind/gpt-j-6B-8bit"
model_8bit = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
But I still get the following error:
NameError: name 'init_empty_weights' is not defined
BTW, the following works but crashes with an OOM error:

import transformers

class GPTJForCausalLM(transformers.models.gptj.modeling_gptj.GPTJForCausalLM):
    def __init__(self, config):
        super().__init__(config)
        convert_to_int8(self)  # convert_to_int8 is defined in the notebook linked above

gpt = GPTJForCausalLM.from_pretrained("hivemind/gpt-j-6B-8bit", low_cpu_mem_usage=True)