Help! How to Run an AI with an AMD GPU (RX 580 8GB)

by A2Hero

Lads, does anyone know how to do that?
Even ChatGPT-4 couldn't help me much.
I don't know if I'm doing something wrong, but I feel like there's a way to get around the lack of ROCm on Windows with the help of WSL 2 or something like that. Does anyone have a solution?

13B GPTQ models are about 14GB in size, which won't fit on 8GB cards. There is no offloading technique for GPTQ so far; you would probably need to look at FlexGen with unquantized weights.
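As a back-of-the-envelope rule of thumb (an estimate, not a measurement), the weights alone for a model with $N$ parameters quantized to $b$ bits take roughly

$$\text{weight bytes} \approx N \times \frac{b}{8}$$

so a 13B model at 4-bit is already around $13 \times 10^9 \times 0.5 \approx 6.5$ GB before the quantization scales/zero-points, the KV cache, and activations are added on top, which is what pushes it past what an 8GB card can hold.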

There is offloading in GPTQ-for-LLaMa, but it's really, really slow, and I don't know if it works in the ROCm implementations of GPTQ-for-LLaMa. ExLlama has ROCm support but no offloading, which I imagine is what you're referring to.

But it sounds like the OP is using Windows, and there's no ROCm for Windows, not even in WSL, so that's a dead end I'm afraid.

@A2Hero I would suggest you use GGML, which can work on your AMD card via OpenCL acceleration.
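To make that concrete, here's a minimal sketch of what running a GGML model with OpenCL acceleration can look like through the llama-cpp-python bindings; the model filename, prompt format, and layer count below are assumptions to adjust for whatever file you actually download:

```python
# Build the bindings with OpenCL (CLBlast) support before installing:
#   CMAKE_ARGS="-DLLAMA_CLBLAST=on" pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="orca-mini-3b.ggmlv3.q4_0.bin",  # hypothetical local GGML file
    n_gpu_layers=26,  # layers offloaded to the GPU via OpenCL; tune to fit 8GB VRAM
    n_ctx=2048,       # context window
)

output = llm("### User:\nWhy is the sky blue?\n### Response:\n", max_tokens=128)
print(output["choices"][0]["text"])
```

The nice part of this route is that OpenCL only needs generic GPU drivers, so it sidesteps the ROCm-on-Windows problem entirely; if the card runs out of memory, lower n_gpu_layers and the remaining layers run on the CPU.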

@Yhyu13 I meant any kind of AI model, even 6B or lower.

@TheBloke
Sure, I will try GGML with something like TheBloke/orca_mini_3B-GGML (my CPU is an i5 4690 and I have 16GB of RAM).
But I really hope that someday (and I hope it's soon) AMD supports ROCm on Windows, or something else comes along that can run TheBloke/wizard-mega-13B-GPTQ.
Thanks for the advice!
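For anyone following along, here's a quick sketch of fetching one of the quantized files from that repo with huggingface_hub; the exact filename is an assumption, so check the repo's file list:

```python
from huggingface_hub import hf_hub_download

# Filename assumed; browse the repo's file list for the actual quantized variants.
path = hf_hub_download(
    repo_id="TheBloke/orca_mini_3B-GGML",
    filename="orca-mini-3b.ggmlv3.q4_0.bin",
)
print(path)  # local cache path to pass as model_path above
```

A 3B model at q4_0 is only around 2GB on disk, so it fits comfortably in 16GB of system RAM even with no GPU offload at all.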
