2024-02-26 21:44:44 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40001, worker_address='http://localhost:40001', controller_address='http://localhost:10000', model_path='MBZUAI/MobiLlama-05B-Chat', revision='main', device='cuda', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=False, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, exllama_cache_8bit=False, enable_xft=False, xft_max_seq_len=4096, xft_dtype=None, model_names=None, conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None, debug=False, ssl=False)
2024-02-26 21:44:44 | INFO | model_worker | Loading the model ['MobiLlama-05B-Chat'] on worker 4d3d60b1 ...
2024-02-26 21:44:46 | INFO | model_worker | Register to controller
2024-02-26 21:44:46 | ERROR | stderr | INFO: Started server process [455481]
2024-02-26 21:44:46 | ERROR | stderr | INFO: Waiting for application startup.
2024-02-26 21:44:46 | ERROR | stderr | INFO: Application startup complete.
2024-02-26 21:44:46 | ERROR | stderr | INFO: Uvicorn running on http://0.0.0.0:40001 (Press CTRL+C to quit)
2024-02-26 21:45:04 | ERROR | stderr | INFO: Shutting down
2024-02-26 21:45:05 | ERROR | stderr | INFO: Waiting for application shutdown.
2024-02-26 21:45:05 | ERROR | stderr | INFO: Application shutdown complete.
2024-02-26 21:45:05 | ERROR | stderr | INFO: Finished server process [455481]
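The Namespace dump above appears to come from FastChat's fastchat.serve.model_worker serving MobiLlama-05B-Chat. As a minimal sketch (not the exact command used here), the following Python snippet would launch a worker with the same host, port, worker, and controller settings, assuming the fschat package is installed and the MBZUAI/MobiLlama-05B-Chat weights can be downloaded from the Hugging Face Hub:

# Sketch: spawn a FastChat model worker matching the Namespace args in the log.
# Assumes `fschat` is installed in the current environment; adjust ports/paths as needed.
import subprocess
import sys

subprocess.run(
    [
        sys.executable, "-m", "fastchat.serve.model_worker",
        "--model-path", "MBZUAI/MobiLlama-05B-Chat",
        "--host", "0.0.0.0",
        "--port", "40001",
        "--worker-address", "http://localhost:40001",
        "--controller-address", "http://localhost:10000",
    ],
    check=True,  # raise if the worker process exits with a non-zero status
)

The remaining Namespace fields (load_8bit=False, num_gpus=1, limit_worker_concurrency=5, and so on) are FastChat defaults, so they do not need to be passed explicitly. The Uvicorn "ERROR | stderr | INFO" lines that follow are normal: Uvicorn writes its own INFO messages to stderr, which the worker's logger relabels as ERROR.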