Error raised by inference API: Model google/flan-t5-xl time out

#43
by phdykd - opened

Hi,

The following code snippet is giving this error: "ValueError: Error raised by inference API: Model google/flan-t5-xl time out".

Could you help me figure out how to solve it, please?

from langchain.indexes import VectorstoreIndexCreator
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import CharacterTextSplitter

index = VectorstoreIndexCreator(
    embedding=HuggingFaceEmbeddings(),
    text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=0),
).from_loaders(loaders)

from langchain.chains import RetrievalQA

chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=index.vectorstore.as_retriever(),
    input_key="question",
)

chain.run('Who are the authors of GPT4all technical report?')
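For context, the chain_type="stuff" option above simply concatenates the retrieved chunks into a single prompt for the LLM. A minimal sketch of that idea in plain Python (simplified, not langchain's actual implementation; the prompt template and helper name are my own):

```python
def stuff_prompt(question, retrieved_chunks):
    """Build one prompt by "stuffing" all retrieved chunks into the context.

    This mirrors the idea behind chain_type="stuff": no map-reduce or
    summarisation step -- every retrieved chunk goes into a single prompt,
    so their combined length must fit the model's context window.
    """
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = stuff_prompt(
    "Who are the authors of the GPT4all technical report?",
    ["chunk one text", "chunk two text"],
)
```

Because everything is stuffed into one prompt, large chunk_size values or many retrieved documents can overflow the model's input limit.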

note: I am on macOS 13.3.1, Apple M2 Max system.

Any updates?

I'm facing the same issue. Is it because I'm using the free tier?

I am also facing this issue, any solution yet?

I am also receiving the time out error.

I have a similar issue with my script. A potential cause: when I used the text-to-text question box on this page, https://huggingface.co/google/flan-t5-xl?text=who+won+the+1994+FIFA+World+cup, I got this response: "The model google/flan-t5-xl is too large to be loaded automatically (11GB > 10GB). For commercial use please use PRO spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints)." Maybe this model does require a subscription?

Any solution for this issue?

I am also facing the same issue, any solution can be provided?

google/flan-t5-xl times out, but google/flan-t5-xxl works :)

I can confirm that google/flan-t5-xxl works, although it requires that "temperature" be a positive value, not simply 0.

Thanks @nhyydt for the advice; google/flan-t5-xxl works for me too. But out of maybe 20-30 attempts with other models, google/flan-t5-xxl seems to be the ONLY one that works.

ValueError: Error raised by inference API: Model google/flan-t5-xl time out
I'm facing the same issue.

Use this instead; flan-t5-xxl is working perfectly fine:

llm = HuggingFaceHub(repo_id="google/flan-t5-xxl", model_kwargs={"temperature": 0.8, "max_length": 512})

I face the same issue with many different repo_ids. Is this issue specific to the model size, or is it an issue with the HuggingFaceHub default timeout setting?

I think it's with the default parameter; try changing it. You can find example code on Stack Overflow or via ChatGPT.
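Since the "time out" error often just means the model is still loading on the Hub side, one workaround is to retry the call with a backoff instead of failing on the first attempt. A minimal sketch (the run_with_retry helper and its parameters are my own, not part of langchain):

```python
import time

def run_with_retry(fn, *args, retries=3, delay=5.0, backoff=2.0):
    """Call fn(*args); on ValueError (how langchain surfaces inference-API
    errors such as 'Model ... time out'), sleep and retry with exponential
    backoff before giving up."""
    last_err = None
    for attempt in range(retries):
        try:
            return fn(*args)
        except ValueError as err:
            last_err = err
            time.sleep(delay * backoff ** attempt)
    raise last_err

# Hypothetical usage with the chain from the first post:
# answer = run_with_retry(chain.run, "Who are the authors of the GPT4all technical report?")
```

This does not fix the underlying limit (a model too large for the free inference API will keep timing out), but it helps when the model is merely cold-starting.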

I am not getting any response for the following llm. Any help would be highly appreciated.
llm = HuggingFaceHub(
repo_id="google/flan-t5-xxl",
model_kwargs={"temperature": 0.8, "max_length": 512}
)
