Not able to use it via Run With Docker

#5
by m-ali-awan - opened

Hi there,
I have cloned the repo and done all the required installations, but I am not able to use the app on port 7860.
It keeps showing "Downloading the models..", whereas in the HF Space the dropdown instead shows:
"llava-v1.5-13b-4bit"

Kindly help me with this; I plan to finetune it and compare the results to GPT-4V.
These are my logs:

```
(env-llava-3.10.4) ubuntu@ip-172-31-9-24:~/Repos/LLaVA$ python app.py
[2023-10-18 14:46:10,636] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-10-18 14:46:11 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:10000', concurrency_count=8, model_list_mode='reload', share=False, moderate=False, embed=False)
2023-10-18 14:46:11 | INFO | gradio_web_server | Starting the controller
2023-10-18 14:46:11 | INFO | gradio_web_server | Starting the model worker for the model liuhaotian/llava-v1.5-13b
[2023-10-18 14:46:14,739] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-10-18 14:46:14,748] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-10-18 14:46:15 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:10000', model_path='liuhaotian/llava-v1.5-13b', model_base=None, model_name='llava-v1.5-13b-4bit', multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=True)
2023-10-18 14:46:15 | INFO | model_worker | Loading the model llava-v1.5-13b-4bit on worker dd86b8 ...
2023-10-18 14:46:15 | INFO | controller | args: Namespace(host='0.0.0.0', port=10000, dispatch_method='shortest_queue')
2023-10-18 14:46:15 | INFO | controller | Init controller
2023-10-18 14:46:15 | ERROR | stderr | INFO:     Started server process [21719]
2023-10-18 14:46:15 | ERROR | stderr | INFO:     Waiting for application startup.
2023-10-18 14:46:15 | ERROR | stderr | INFO:     Application startup complete.
2023-10-18 14:46:15 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:10000 (Press CTRL+C to quit)
Loading checkpoint shards:   0%|                                                                                                        | 0/3 [00:00<?, ?it/s]
2023-10-18 14:46:21 | INFO | stdout | INFO:     127.0.0.1:49682 - "POST /refresh_all_workers HTTP/1.1" 200 OK
2023-10-18 14:46:21 | INFO | stdout | INFO:     127.0.0.1:49698 - "POST /list_models HTTP/1.1" 200 OK
2023-10-18 14:46:21 | INFO | gradio_web_server | Models: []
2023-10-18 14:46:21 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2023-10-18 14:46:21 | INFO | stdout | 
2023-10-18 14:46:21 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2023-10-18 14:46:59 | INFO | gradio_web_server | load_demo. ip: 203.99.184.129
2023-10-18 14:46:59 | INFO | stdout | INFO:     127.0.0.1:44730 - "POST /refresh_all_workers HTTP/1.1" 200 OK
2023-10-18 14:46:59 | INFO | stdout | INFO:     127.0.0.1:44732 - "POST /list_models HTTP/1.1" 200 OK
2023-10-18 14:46:59 | INFO | gradio_web_server | Models: []
```

Thanks for the help

hey @m-ali-awan !
one of the processes is downloading the model in the background; that's why it keeps showing "Downloading the models..".
Your logs should display the download progress, and it takes some time for the download to start. Can you wait ~3 minutes and check whether the download progress appears in the logs?
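if you want to verify or kick off the download yourself before starting the app, here is a minimal sketch with `huggingface_hub` (an optional workaround on my part, not something app.py requires):

```python
from huggingface_hub import snapshot_download

# Downloads (or resumes) the model files into the default HF cache,
# i.e. ~/.cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b/...
local_dir = snapshot_download("liuhaotian/llava-v1.5-13b")
print(local_dir)  # path of the downloaded snapshot directory
```

once that finishes, the worker should pick the weights up from the cache instead of re-downloading them.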

Thanks @badayvedat
Can't I pass this path instead, as I think it has already downloaded the models here:
/home/ubuntu/.cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b/snapshots/d64eb781be6876a5facc160ab1899281f59ef684/pytorch_model-00003-of-00003.bin

Also, I waited for some time and I can't see any directory like "liuhaotian/llava-v1.5-13b" getting created.

And I don't know what this is, but it gets stuck on this:

```
Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]
```

that's interesting 🤔

can you try overriding the model_path variable to point to /home/ubuntu/.cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b?
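i.e. something along these lines, assuming app.py hard-codes model_path (an untested sketch; the snapshots layout is just the standard HF cache structure):

```python
from pathlib import Path

# Point model_path at the locally cached snapshot instead of the hub repo id.
cache_dir = Path.home() / ".cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b"
snapshot = next((cache_dir / "snapshots").iterdir())  # the downloaded revision dir
model_path = str(snapshot)
```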

yes, I did that as well :)

Thanks, it worked now. But previously I had tried setting

`export bit=4`

and this time I didn't do that; perhaps that's the difference.

great!

since bit=4 makes use of bitsandbytes, there might be an issue somewhere in the docker environment <=> CUDA <=> bitsandbytes chain
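if you want to debug that, here is a rough in-container sanity check (a sketch; older bitsandbytes versions print a CUDA SETUP report on import, so watch the output):

```python
import torch

# Check that the container actually sees the GPU and the CUDA runtime.
print(torch.cuda.is_available())      # should be True
print(torch.version.cuda)             # CUDA version torch was built against
print(torch.cuda.get_device_name(0))

# Importing bitsandbytes prints its CUDA setup report in older versions;
# any "CUDA SETUP" errors here point at a docker/CUDA mismatch.
import bitsandbytes as bnb
```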

Great, thanks

is it expected to have performance degradation with bit=4, as compared to bit=8?

yes, performance degradation is expected, but I'm not aware of any LLaVA research that compares the quality of the different quantization levels (4/8/16/32-bit) in detail.
maybe @liuhaotian can provide more info on that?
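if you do run that comparison, the worker's loader can be called directly with either flag. a sketch assuming the upstream LLaVA repo's `load_pretrained_model` helper (signature from memory, so double-check it against the repo):

```python
from llava.model.builder import load_pretrained_model

# Load the same checkpoint in 4-bit (flip to load_8bit=True for the 8-bit run),
# then evaluate both models on the same prompts and images.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path="liuhaotian/llava-v1.5-13b",
    model_base=None,
    model_name="llava-v1.5-13b",
    load_4bit=True,
)
```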

Sure, I will also try to document some comparisons and post them here.

For finetuning, I saw a lot of people facing issues. Is there a demo doc I can follow?
Moreover, I would be grateful if you could give me a rough estimate of how large my training dataset should be.
I want to finetune LLaVA for vehicle damage estimation, so it can describe different damages with localization, e.g. PROMPT: "Give me a damage estimation of this image of a vehicle." RESPONSE: "Visible parts: rear bumper, rear left quarter panel; one significant dent and 2 small scratches on the rear bumper", etc.
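For context, here is roughly what I understand a single training sample in the LLaVA conversation format to look like (a hypothetical sketch based on the repo's Finetune_Custom_Data.md; the damage text is made up):

```python
import json

# One hypothetical training sample in LLaVA's conversation format.
sample = {
    "id": "damage-0001",
    "image": "images/claim_0001.jpg",
    "conversations": [
        {
            "from": "human",
            "value": "<image>\nGive me a damage estimation of this vehicle.",
        },
        {
            "from": "gpt",
            "value": "Visible parts: rear bumper, rear left quarter panel. "
                     "One significant dent and two small scratches on the rear bumper.",
        },
    ],
}

# LLaVA's finetuning scripts expect a JSON list of such samples.
with open("damage_train.json", "w") as f:
    json.dump([sample], f, indent=2)
```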

What do you think, how many samples should I use? And based on your experience, are there any guidelines/tricks I should follow?

Thanks a lot for all your help
