Serving the model locally on CPU
When running the model locally on my MacBook Pro with:
import os

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain.llms import HuggingFacePipeline

os.environ["HUGGINGFACEHUB_API_TOKEN"] = '***********************'

tokenizer = AutoTokenizer.from_pretrained("Arc53/DocsGPT-7B")
model = AutoModelForCausalLM.from_pretrained(
    "Arc53/DocsGPT-7B",
    trust_remote_code=True,
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=1024,
)

local_llm = HuggingFacePipeline(pipeline=pipe)
.....
I got:
The model 'MPTForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', ...]
Entering new chain...
Input length of input_ids is 1636, but max_length is set to 1024. This can lead to unexpected behavior. You should consider increasing max_new_tokens.
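Based on that warning, the change I'm planning to try is to drop max_length and pass max_new_tokens to the pipeline instead, roughly like this (an untested sketch reusing the model and tokenizer from above; the 512 value is just a guess):

pipe = pipeline(
    "text-generation",
    model=model,          # same model/tokenizer objects loaded earlier
    tokenizer=tokenizer,
    # max_new_tokens limits only the generated tokens, so the 1636-token
    # prompt no longer has to fit inside the same budget as the output
    max_new_tokens=512,
)
local_llm = HuggingFacePipeline(pipeline=pipe)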
Is there any way to use this model with a LangChain agent on a CPU-only machine?
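For context, the kind of agent setup I'm hoping to run on CPU is roughly the following (a sketch based on the LangChain docs; the "llm-math" tool and the prompt are just placeholders):

from langchain.agents import AgentType, initialize_agent, load_tools

# wrap the local HuggingFacePipeline LLM in a standard ReAct-style agent
tools = load_tools(["llm-math"], llm=local_llm)
agent = initialize_agent(
    tools,
    local_llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent.run("What is 2 to the power of 10?")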