Serving the model locally on CPU

by YairF - opened

When running the model locally on my MacBook Pro with:
import os

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain.llms import HuggingFacePipeline

os.environ["HUGGINGFACEHUB_API_TOKEN"] = '***********************'

tokenizer = AutoTokenizer.from_pretrained("Arc53/DocsGPT-7B")
model = AutoModelForCausalLM.from_pretrained(
    "Arc53/DocsGPT-7B",
    trust_remote_code=True,
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=1024,
)
local_llm = HuggingFacePipeline(pipeline=pipe)
.....
I got:
the model 'MPTForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM'.....

Entering new chain...
Input length of input_ids is 1636, but max_length is set to 1024. This can lead to unexpected behavior. You should consider increasing max_new_tokens.
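Following the warning's own suggestion, I assume the pipeline call should bound only the generated continuation with max_new_tokens instead of capping prompt plus output with max_length, roughly like this (the 512 value is just a guess on my part):

# Sketch: limit only the newly generated tokens, so the 1636-token prompt
# no longer exceeds the length limit on its own.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,  # assumed value; adjust to the answer length needed
)
local_llm = HuggingFacePipeline(pipeline=pipe)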

Do I have any way of using this model with a LangChain agent on a CPU-only machine?
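For reference, the agent wiring I have in mind is roughly the sketch below, using the same older langchain.agents API as the langchain.llms import above; the llm-math tool is only a placeholder for illustration, not my real toolset:

from langchain.agents import AgentType, initialize_agent, load_tools

# Placeholder tool list; the real agent would use my own tools.
tools = load_tools(["llm-math"], llm=local_llm)

agent = initialize_agent(
    tools,
    local_llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run("What is 7 * 13?")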
