Runtime error ZeroGPU with transformers MistralForCausalLM

#21
by ehristoforu - opened
ZeroGPU Explorers org

Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 527, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 261, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1788, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 583, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 709, in asyncgen_wrapper
response = await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/chat_interface.py", line 552, in _stream_fn
first_response = await async_iteration(generator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 583, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 576, in __anext__
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 559, in run_sync_iterator_async
return next(iterator)
File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 294, in gradio_handler
raise res.value
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":830, please report a bug to PyTorch.

Space URL: https://huggingface.co/spaces/ehristoforu/0000/

ehristoforu changed discussion title from Runtime error ZeroGPU with transformers to Runtime error ZeroGPU with transformers MistralForCausalLM
ZeroGPU Explorers org

Not sure if this will fix it, but you probably shouldn't be loading the model inside the @spaces.GPU function.

Also, to use CUDA, don't set the PyTorch default device; move the model and inputs explicitly with .to("cuda") instead.
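A minimal sketch of that layout, assuming the standard spaces ZeroGPU decorator; the model ID, generation settings, and handler name are illustrative placeholders, not taken from the Space:

```python
# Sketch: load the model once at module level, then do GPU work only
# inside the @spaces.GPU-decorated handler. Model ID and settings are
# placeholders for illustration.
import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
model.to("cuda")  # explicit placement instead of torch.set_default_device("cuda")

@spaces.GPU
def respond(message, history):
    # Only per-request work happens here; the model is already loaded above.
    inputs = tokenizer(message, return_tensors="pt").to("cuda")
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The point of the layout is that model loading runs once at Space startup, outside the GPU-scheduled function, so each queued request only pays for inference time on the shared GPU.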
