Cannot use locally

#34
by Abhinav122 - opened

The model generates gibberish when run locally. There are also issues loading it with AutoModelForCausalLM or CohereForCausalLM; see the warnings below.
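For reference, the load is roughly along these lines (a minimal sketch; the local path comes from the log below, and the exact arguments may differ):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Local checkpoint path taken from the log below; device_map is illustrative.
model_path = "/root/azure-storage/huggingface/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
```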

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|████████████████████████████████████████| 44/44 [09:15<00:00, 12.64s/it]
Some weights of the model checkpoint at /root/azure-storage/huggingface/c4ai-command-r-plus were not used when initializing CohereForCausalLM: ['model.layers.0.self_attn.k_norm.weight', 'model.layers.0.self_attn.q_norm.weight', 'model.layers.1.self_attn.k_norm.weight', 'model.layers.1.self_attn.q_norm.weight', 'model.layers.10.self_attn.k_norm.weight',
...
... 
'model.layers.8.self_attn.q_norm.weight', 'model.layers.9.self_attn.k_norm.weight', 'model.layers.9.self_attn.q_norm.weight']
- This IS expected if you are initializing CohereForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CohereForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Try a newer transformers version. The unused q_norm/k_norm weights suggest the installed release's Cohere implementation predates the query/key-norm layers used by this model, which would also explain the gibberish output.
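A minimal sketch of what should work once transformers is upgraded (the required minimum version is not pinned here; check the model card for the officially supported release):

```python
# Upgrade first, e.g.:  pip install -U transformers
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

print(transformers.__version__)  # should be a recent release whose CohereForCausalLM includes q_norm/k_norm

model_path = "/root/azure-storage/huggingface/c4ai-command-r-plus"  # local path from the log above
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

# The model ships a chat template; formatting the prompt with it avoids
# feeding raw text the model was not tuned on.
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.3)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If the upgrade took effect, the unused-weight warning about q_norm/k_norm weights should disappear and generation should be coherent.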
