Any tips on running SEA-LION locally?

#7 opened by GilGoldman

It seems like LM Studio and GPT4All can't handle SEA-LION's current architecture (the MPT architecture is not supported).

Is there a way to run this model locally for prototyping?

AI Singapore org

Hi,
Thank you for your interest in SEA-LION.
Unfortunately, at the moment SEA-LION is not available in GGUF format and is therefore not supported by LM Studio or GPT4All.

One option for local usage that allows loading the model directly from the HuggingFace Hub is text-generation-webui:
https://github.com/oobabooga/text-generation-webui

Hopefully this option fits your use case.
Raymond

I get this error when running locally:
Tokenizer class SEABPETokenizer does not exist or is not currently imported
Do I need to import an additional library?

AI Singapore org

Hi @davidramous ,
May I check if you are running it locally via the transformers code?

If yes, the SEA-LION tokenizer requires custom code execution, so the transformers package requires the trust_remote_code flag to be set to True when calling the from_pretrained methods:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

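# trust_remote_code=True lets transformers run the model's custom SEABPETokenizer code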
tokenizer = AutoTokenizer.from_pretrained("aisingapore/sealion7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("aisingapore/sealion7b", trust_remote_code=True)
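
Once the model and tokenizer load, you can try a quick generation call to confirm everything works. This is only a minimal sketch; the prompt and the max_new_tokens value are placeholders you should adjust:

# Tokenize a prompt and generate a short continuation
tokens = tokenizer("Sea lion in the sea", return_tensors="pt")
output = model.generate(tokens["input_ids"], max_new_tokens=20, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))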

Hopefully this helps.
Raymond

So sorry, I just needed to update transformers to the latest version and the error is gone.
May I check how much VRAM is needed to run this model?

AI Singapore org

Hi @davidramous ,

For sealion7b, you would need around 30GB of VRAM to run the model, and around 13GB of VRAM for sealion3b.
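
The 30GB figure corresponds to full float32 weights (roughly 7B parameters × 4 bytes). If your GPU supports half precision, loading in float16 roughly halves the footprint. A minimal sketch, assuming a CUDA GPU and that the model runs correctly in float16:

import torch
from transformers import AutoModelForCausalLM

# Loading the weights in float16 roughly halves the memory footprint versus float32
model = AutoModelForCausalLM.from_pretrained(
    "aisingapore/sealion7b",
    trust_remote_code=True,
    torch_dtype=torch.float16,
).to("cuda")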

AI Singapore org

You might also check this guide to reduce the VRAM requirement for inference. Thank you!

https://huggingface.co/docs/accelerate/en/usage_guides/big_modeling
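
As a concrete example of what that guide describes, passing device_map="auto" lets the accelerate package split the model across GPU, CPU RAM, and disk automatically. This is a sketch; it assumes accelerate is installed (pip install accelerate), and offloaded layers will run noticeably slower:

from transformers import AutoModelForCausalLM

# device_map="auto" places layers on the GPU first, spilling over to CPU RAM and disk
model = AutoModelForCausalLM.from_pretrained(
    "aisingapore/sealion7b",
    trust_remote_code=True,
    device_map="auto",
)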
