Apply for community grant: Academic project (gpu)

#1
by Andy1621 - opened
OpenGVLab org

I'm applying for a community grant for VideoMamba, a pure SSM-based model for video understanding, available at https://github.com/OpenGVLab/VideoMamba. Our project aims to advance video analytics through state-of-the-art deep learning, benefiting researchers and developers.

Funding will support further development and outreach, aligning with our goal to foster innovation and collaboration in video understanding technology.

Thank you for considering VideoMamba.

Hi @Andy1621 , we've assigned ZeroGPU to this Space. Please check the compatibility and usage sections of this page so your Space can run on ZeroGPU. If ZeroGPU doesn't work for your Space, let us know. We will assign a T4 or A10G with short sleep time.
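For context, ZeroGPU Spaces attach a GPU only while a decorated function is running, via the `spaces` package. A minimal sketch of that pattern (the model and function below are placeholders, not code from this Space):

```python
import spaces
import torch

# Model is loaded at startup on CPU; ZeroGPU attaches a GPU only while
# a @spaces.GPU-decorated function is executing.
model = torch.nn.Linear(4, 2)

@spaces.GPU  # requests a GPU for the duration of this call
def predict(x):
    model.to("cuda")
    return model(torch.as_tensor(x, dtype=torch.float32).cuda()).tolist()
```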

OpenGVLab org

@hysts Thanks for your help, but I ran into some problems when using ZeroGPU. Is it suitable for compiling my packages?

Since some of my packages need a GPU toolchain to compile, I created an install.sh and call it at the beginning of app.py. But it seems that nvcc is not found; is there any problem in my code?

```
/home/user/app/mamba/setup.py:78: UserWarning: mamba_ssm was requested, but nvcc was not found.  Are you sure your environment has nvcc available?
```
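A minimal sketch of that startup pattern, assuming install.sh sits next to app.py (the subprocess call is an assumption about the code, not a quote from it):

```python
# app.py (top of file): run the build script once before heavy imports.
# NOTE: on ZeroGPU this fails because nvcc is unavailable at runtime.
import subprocess

subprocess.run(["bash", "install.sh"], check=True)
```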

@Andy1621 My understanding is that nvcc is not available in the ZeroGPU environment. Would it be possible for you to build wheels for your packages beforehand in a local environment with CUDA and install them at startup, like this?
You can build a wheel for your package by running something like this:

```
python setup.py bdist_wheel
```
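And a sketch of the install-at-startup half, with hypothetical wheel filenames (the real names depend on the package version and the Python/CUDA ABI of the build):

```python
# app.py: install prebuilt wheels before importing the compiled packages.
# The .whl paths below are placeholders for wheels built locally with
# `python setup.py bdist_wheel` and committed to the Space repo.
import subprocess
import sys

subprocess.run(
    [sys.executable, "-m", "pip", "install",
     "wheels/causal_conv1d-1.0.0-cp310-cp310-linux_x86_64.whl",
     "wheels/mamba_ssm-1.0.0-cp310-cp310-linux_x86_64.whl"],
    check=True,
)
```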
OpenGVLab org

Great! Let me have a try!

OpenGVLab org

@hysts I tried that, but hit a new problem: `ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory`, which may be caused by a missing nvcc or the wrong CUDA version...
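One quick way to narrow an error like this down is to check which CUDA version the installed `torch` was built against (these are standard PyTorch attributes; comparing them against the `libcudart` version in the error is the point of the check):

```python
import torch

# CUDA version torch was compiled with, e.g. "11.8" or "12.1"; a wheel
# built against a different major CUDA version can fail to load.
print(torch.version.cuda)
print(torch.cuda.is_available())  # whether a GPU runtime is visible
```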

@Andy1621 Thanks for checking. Hmm, I think ZeroGPU is using CUDA 12, but not sure if it works, so I just assigned a10g-small to this Space for now.
I'll look into the CUDA issue. Can you tell me where I can find the repo you need to build? (It seems that this repo has wheels for causal_conv1d and mamba_ssm, but I'd like to know the paths to the original repo.)

OpenGVLab org

@hysts Thanks for your quick response. The repo is here; I install with `pip install -e mamba` or `pip install -e causal-conv1d`.

OpenGVLab org

@hysts How can I install PyTorch built for a specific CUDA version, like CUDA 11.8? Currently, a direct `pip install -e` hits the following error:

```
The detected CUDA version (11.8) mismatches the version that was used to compile
PyTorch (12.1). Please make sure to use the same CUDA versions.
```

@Andy1621 I think you can add `--extra-index-url https://download.pytorch.org/whl/cu118` to your requirements.txt to install `torch` built with CUDA 11.8.
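For example, a requirements.txt along these lines (the exact torch pin is illustrative, not from this thread; the `+cu118` local version tag only exists on the PyTorch index, which makes the resolution unambiguous):

```
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.1.2+cu118
```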

But I'm confused. So, your packages require CUDA 11.8 and don't work with CUDA 12? I wonder why that is. Also, from the log of your Space, the current CUDA version of HF Spaces seems to be CUDA 12.2, and the `torch` installed in your Space is built with CUDA 12.1, but the build error says `The detected CUDA version (11.8) mismatches the version that was used to compile PyTorch (12.1).` I don't understand where this CUDA 11.8 came from.

(EDIT) Ah, it looks like HF Spaces use the CUDA 11.8 image as the base docker image. Sorry for the confusion.

@Andy1621 It seems that I get this error even after fixing the dependency issues, and I'm wondering why. Does this app work in your local environment?

```
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/gradio/queueing.py", line 501, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/gradio/route_utils.py", line 253, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/gradio/blocks.py", line 1695, in process_api
    result = await self.call_function(
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/gradio/blocks.py", line 1235, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/gradio/utils.py", line 692, in wrapper
    response = f(*args, **kwargs)
  File "/home/user/app/app.py", line 111, in inference_video
    prediction = model_video(inputs.to(device))
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/app/videomamba_video.py", line 357, in forward
    x = self.forward_features(x, inference_params)
  File "/home/user/app/videomamba_video.py", line 308, in forward_features
    x = x + self.pos_embed
RuntimeError: The size of tensor a (101) must match the size of tensor b (197) at non-singleton dimension 1
```
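For reference, the two sizes in that error are consistent with a ViT-style positional embedding evaluated at different input resolutions. A minimal sketch (patch size 16 and a class token are assumptions about the model config, not confirmed in this thread):

```python
# Number of positional tokens for a square input with ViT-style patching.
def num_pos_tokens(img_size: int, patch_size: int = 16, cls_token: bool = True) -> int:
    return (img_size // patch_size) ** 2 + int(cls_token)

print(num_pos_tokens(224))  # 197 -- matches pos_embed in the error above
print(num_pos_tokens(160))  # 101 -- matches the input, i.e. 160x160 frames
```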
OpenGVLab org

@hysts It's my fault; I just changed to the right resolution via this:

[screenshot of the resolution fix]

BTW, how did you fix the bug installing the packages? It looks great!

Thanks for the fix! It worked!

> how did you fix the bug installing the packages?

I'll open a PR soon.

OpenGVLab org

@hysts Great! It works.

Thanks for merging the PR! I've switched the hardware to ZeroGPU and increased the sleep time.

OpenGVLab org

Great!

Andy1621 changed discussion status to closed
