Responses stop loading in.

#42
by Setphus - opened

Okay, I can't figure out what I am doing (or not doing) to cause this, but when I start a conversation, after a while the Vicuna model starts pausing. I can press continue, but it only continues with one word and then ends up pausing again. Is there any way that I can fix this? Thanks for the help.
[Screenshot attached: Screenshot 2023-04-27 020517.png]

Oh dang! I think my GPU is crap. I have a 3080 with only 10 GB of VRAM, and a Ryzen 5950X right now. I can hear them all now, "should have gone for Intel," I know already. Sheesh. But should I try to offload it onto my CPU anyway, or is it even worth it? Also, does this mean that I can't have long conversations, or is it just this much for each response usually? Will I have to delete previous messages in order to make new ones, or can I somehow save them in a way that the model can still see them for context? Thanks again for the help.

[UPDATE]: Alright, alright. I thought about it for a bit, and could I not, technically, allocate part of my C: drive as extra memory (a page file) so I can use it for AI purposes? Does anyone have thoughts on this, or am I an absolute nut case? IDK, maybe you could even use an NVMe SSD if you have one that can do 7 GB per second...
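
If I'm picturing it right, something like the transformers + accelerate offload setup below is roughly what I mean. This is just a sketch, and the model name, the memory caps, and the offload folder are placeholders for whatever you actually run.

```python
# Rough sketch: split a model across GPU VRAM, system RAM, and (as a last
# resort) disk, using Hugging Face transformers + accelerate.
# The model name, memory caps, and offload folder below are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.5"  # placeholder; use the Vicuna build you actually have

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",                        # let accelerate decide where each layer goes
    max_memory={0: "9GiB", "cpu": "30GiB"},   # stay under the 3080's 10 GB, spill the rest to RAM
    offload_folder="offload",                 # anything that still doesn't fit is paged to disk
    torch_dtype="auto",
)

inputs = tokenizer("USER: Hello!\nASSISTANT:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```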

I think you can kind of do that. I'm not an expert, but you can split the model between the GPU and CPU by allocating layers to each. That gives you more memory to work with, since you're not limited to VRAM alone. The trade-off is speed. I wouldn't suggest dumping to the NVMe, though; system memory is still a lot faster. Let us know how it goes, as I'm interested.
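
If you end up running a quantized build through llama-cpp-python, the layer split looks roughly like the sketch below. The model path and the layer count are placeholders; lower n_gpu_layers until it fits in your 10 GB of VRAM.

```python
# Rough sketch: layer-level GPU/CPU split with llama-cpp-python.
# The model path and layer count are placeholders; any layers not on the
# GPU run on the CPU out of system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./vicuna-13b.q4_0.bin",  # placeholder path to a quantized Vicuna
    n_gpu_layers=30,   # how many layers to keep on the GPU
    n_ctx=2048,        # context window; tokens older than this get dropped
)

out = llm("USER: Hello!\nASSISTANT:", max_tokens=256)
print(out["choices"][0]["text"])
```

The more layers you keep on the GPU, the faster it generates. And on your context question: anything older than n_ctx tokens falls out of the model's view, which is why long chats eventually lose the earlier messages.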
