gtvracer's activity


Yes, probably..

OK, it came back on... HF! What happened?!

If I use the cloud version (https://jc26mwg228mkj8dw.us-east-1.aws.endpoints.huggingface.cloud), it still works, but going directly to HF, I get 401s for Meta-Llama-3-8B-Instruct, Llama-3.2-3B-Instruct, Llama-3.3-70B-Instruct, and Mixtral-8x7B-Instruct-v0.1. Is something down on HF's side for authenticating the user and token?

Exception: 401 Client Error: Unauthorized for url: https://router.huggingface.co/hf-inference/models/meta-llama/Llama-3.2-3B-Instruct/v1/chat/completions (Request ID: Root=1-67dc6b20-3a4697761ad9315c06ca928a;d914bcf1-063a-4df2-acc2-8e0170ddccb3)
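As a sanity check (a minimal sketch using huggingface_hub; the token string is a placeholder), this separates "my token is bad" from "the router is down", since whoami() raises on an invalid or expired token:

# Verify the token against the Hub auth endpoint before blaming the router.
from huggingface_hub import HfApi

api = HfApi(token="hf_xxxxxx")  # placeholder token
print(api.whoami()["name"])  # prints the username if auth succeeds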

# script...
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "distilbert/distilgpt2",
    token="xxxxxx",
)
That loads the model fine. But when it's passed to the index returned from VectorStoreIndex for Qdrant, like this:
# script...
query_engine = index_from_nodes.as_query_engine(llm=model, streaming=True)
response = query_engine.query("What is formula 1?")
response.print_response_stream()
It errors out as if the LLM were disabled:
AssertionError Traceback (most recent call last)
Cell In[34], line 1
----> 1 query_engine = index_from_nodes.as_query_engine(llm=model, streaming=True)
3 response = query_engine.query(
4 "What is formula 1?"
5 )
7 response.print_response_stream()
File ~/miniconda/lib/python3.9/site-packages/llama_index/core/indices/base.py:376, in BaseIndex.as_query_engine(self, llm, **kwargs)
370 from llama_index.core.query_engine.retriever_query_engine import (
371 RetrieverQueryEngine,
372 )
374 retriever = self.as_retriever(**kwargs)
375 llm = (
--> 376 resolve_llm(llm, callback_manager=self._callback_manager)
377 if llm
378 else Settings.llm
379 )
381 return RetrieverQueryEngine.from_args(
382 retriever,
383 llm=llm,
384 **kwargs,
385 )
File ~/miniconda/lib/python3.9/site-packages/llama_index/core/llms/utils.py:102, in resolve_llm(llm, callback_manager)
99 print("LLM is explicitly disabled. Using MockLLM.")
100 llm = MockLLM()
--> 102 assert isinstance(llm, LLM)
104 llm.callback_manager = callback_manager or Settings.callback_manager
106 return llm
AssertionError:
So why is the LLM disabled?
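Reading the traceback, resolve_llm() asserts isinstance(llm, LLM), i.e. it expects a LlamaIndex LLM object, so maybe the raw transformers model simply fails that check and needs wrapping first. A sketch of what I mean, assuming the llama-index-llms-huggingface package is installed:

# Wrap the HF model in a LlamaIndex LLM so resolve_llm()'s isinstance
# check passes; model/tokenizer names match the snippet above.
from llama_index.llms.huggingface import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="distilbert/distilgpt2",
    tokenizer_name="distilbert/distilgpt2",
)
query_engine = index_from_nodes.as_query_engine(llm=llm, streaming=True)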
Thanks!


They were all default files loaded when the space was created. So the template they're using isn't synced, or is mislabeled, for a Gradio chatbot project.
You'd expect them to compile and run out of the box, not to have to adjust them yourself for the environment you chose (ZeroGPU).

Thanks John6666! Adding the import and the @spaces.GPU decorator fixed the issue, and the base ZeroGPU Gradio app runs!
Hope Hugging Face fixes their template so others don't hit this as their first experience with the platform. It really doesn't look good when the basic project supplied by HF won't even run.
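For anyone landing here later, the fix was roughly this (a sketch; the actual handler name and body depend on the template):

import spaces  # the missing import

@spaces.GPU  # ZeroGPU only attaches a GPU while a decorated function runs
def respond(message, history):
    # hypothetical handler; the template's real inference call goes here
    return "..."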
I signed up as Pro and started a ZeroGPU space with the default Gradio chatbot project. When building the space, it won't even start the sample Gradio app. Pretty disappointing when it fails right out of the box...
Has anyone encountered this yet?
Thanks...
This is the output. Odd, since it seems to be just a warning, so why wouldn't it start?
/usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:228: UserWarning: The 'tuples' format for chatbot messages is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
warnings.warn(
* Running on local URL: http://0.0.0.0:7860, with SSR ⚡
* To create a public link, set `share=True` in `launch()`.
Stopping Node.js server...
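As an aside, the deprecation warning itself points at the fix for the tuples format; something like this (a sketch, assuming a recent Gradio version) opts into the messages format it recommends:

import gradio as gr

# type="messages" expects openai-style {"role": ..., "content": ...} dicts
chatbot = gr.Chatbot(type="messages")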
