Runtime error ZeroGPU with transformers MistralForCausalLM

#21
by ehristoforu - opened
ZeroGPU Explorers org

Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 527, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 261, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1788, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 583, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 709, in asyncgen_wrapper
response = await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/chat_interface.py", line 552, in _stream_fn
first_response = await async_iteration(generator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 583, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 576, in __anext__
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 559, in run_sync_iterator_async
return next(iterator)
File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 294, in gradio_handler
raise res.value
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":830, please report a bug to PyTorch.

Space URL: https://huggingface.co/spaces/ehristoforu/0000/

ehristoforu changed discussion title from Runtime error ZeroGPU with transformers to Runtime error ZeroGPU with transformers MistralForCausalLM
ZeroGPU Explorers org

Not sure if this will fix it, but you probably shouldn't be loading the model inside the @spaces.GPU function.

Also, to use CUDA, don't set the PyTorch default device; move the model and inputs explicitly with .to("cuda") instead.
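A minimal sketch of that layout, assuming the standard spaces ZeroGPU decorator; the model ID, generation settings, and handler name are illustrative placeholders, not taken from the Space:

```python
# Sketch: load the model once at module level, then do GPU work only
# inside the @spaces.GPU-decorated handler. Model ID and settings are
# placeholders for illustration.
import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
model.to("cuda")  # explicit placement instead of torch.set_default_device("cuda")

@spaces.GPU
def respond(message, history):
    # Only per-request work happens here; the model is already loaded above.
    inputs = tokenizer(message, return_tensors="pt").to("cuda")
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The point of the layout is that model loading runs once at Space startup, outside the GPU-scheduled function, so each queued request only pays for inference time on the shared GPU.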
