What is the difference between this and the original model?

#1
by hsuyab - opened

Hello,

  1. I wanted to understand how this sharded version is useful compared to the original model weights at EleutherAI/gpt-j-6b.
  2. Also, how did you do the sharding? Can you share a script for it?
  3. I got an error when trying to load the tokenizer:
OSError: Can't load tokenizer for 'sgugger/sharded-gpt-j-6B'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'sgugger/sharded-gpt-j-6B' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.

You can use the "sharded" branch on the official repo now. I created this one when it didn't exist.
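
For reference, here is a minimal sketch of what loading from that branch and re-sharding a checkpoint might look like with the `transformers` library. It rests on assumptions: that the branch on `EleutherAI/gpt-j-6b` is still named `sharded`, and that the tokenizer files live on the official repo's default branch, which would explain the tokenizer error above if this repo holds only weights. The helper names `load_model_and_tokenizer` and `save_sharded` are hypothetical, and this is not necessarily how this repo was produced.

```python
# Sketch only: the repo and branch names come from the thread and are assumed to still exist.
MODEL_ID = "EleutherAI/gpt-j-6b"  # official repo named in the question
REVISION = "sharded"              # branch name taken from the reply


def load_model_and_tokenizer(model_id=MODEL_ID, revision=REVISION):
    """Load sharded weights from a branch, and the tokenizer from the default branch."""
    # Imported lazily so this file can be read and imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the tokenizer files are on the official repo's default branch.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # `revision` selects a branch, tag, or commit of the Hub repo.
    model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)
    return model, tokenizer


def save_sharded(model, output_dir, max_shard_size="2GB"):
    """Write a checkpoint split into files of at most `max_shard_size` each."""
    # `save_pretrained` splits the weights into several files plus an index
    # when the checkpoint is larger than `max_shard_size`.
    model.save_pretrained(output_dir, max_shard_size=max_shard_size)
```

Calling `load_model_and_tokenizer()` downloads the full GPT-J checkpoint, so it needs substantial disk space and memory. The practical benefit of smaller shards is that the weights can be loaded one file at a time rather than as a single large file.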

Oh, okay.

hsuyab changed discussion status to closed
