CodeShell-7B-Chat-int4 fails to run under TGI

#1
by melonrindrind - opened

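Hello, and thanks to the developers for open-sourcing this project. I ran into a problem while running the model locally with TGI. My 3090 does not have enough VRAM for the unquantized 7B model, so I tried the int4 quantized model instead, but CodeShell-7B-Chat-int4 fails to start under TGI. What is causing this?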

docker run --gpus 'all' --shm-size 20g -p 9090:80 -v /root/codeshell/model:/data \
        --env LOG_LEVEL="info,text_generation_router=debug" \
        ghcr.nju.edu.cn/huggingface/text-generation-inference:1.0.3 \
        --model-id /data/CodeShell-7B-Chat-int4 --num-shard 1 \
        --max-total-tokens 5000 --max-input-length 4096 \
        --max-stop-sequences 12 --trust-remote-code
2023-10-23T02:59:41.362133Z  INFO text_generation_launcher: Args { model_id: "/data/CodeShell-7B-Chat-int4", revision: None, validation_workers: 2, sharded: None, num_shard: Some(1), quantize: None, dtype: None, trust_remote_code: true, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 12, max_top_n_tokens: 5, max_input_length: 4096, max_total_tokens: 5000, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "f1b26d576e27", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }
2023-10-23T02:59:41.362173Z  WARN text_generation_launcher: `trust_remote_code` is set. Trusting that model `/data/CodeShell-7B-Chat-int4` do not contain malicious code.
2023-10-23T02:59:41.362290Z  INFO download: text_generation_launcher: Starting download process.
2023-10-23T02:59:44.612287Z  WARN text_generation_launcher: No safetensors weights found for model /data/CodeShell-7B-Chat-int4 at revision None. Converting PyTorch weights to safetensors.

Error: DownloadError
2023-10-23T02:59:47.269640Z ERROR download: text_generation_launcher: Download encountered an error: Traceback (most recent call last):

  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 195, in download_weights
    utils.convert_files(local_pt_files, local_st_files, discard_names)

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 106, in convert_files
    convert_file(pt_file, sf_file, discard_names)

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 68, in convert_file
    to_removes = _remove_duplicate_names(loaded, discard_names=discard_names)

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 25, in _remove_duplicate_names
    shareds = _find_shared_tensors(state_dict)

  File "/opt/conda/lib/python3.9/site-packages/safetensors/torch.py", line 72, in _find_shared_tensors
    if v.device != torch.device("meta") and storage_ptr(v) != 0 and storage_size(v) != 0:

AttributeError: 'list' object has no attribute 'device'
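
The traceback shows the safetensors conversion step reading `.device` on every value in the loaded checkpoint, and at least one value is a Python list rather than a tensor. Below is a minimal sketch for listing which entries would trip that check. It assumes the checkpoint loads as a plain dict. `find_non_tensor_entries` is a hypothetical helper (not part of TGI or safetensors), and the key names in the demo are made up for illustration.

```python
from typing import Any, Dict, Mapping


def find_non_tensor_entries(state_dict: Mapping[str, Any]) -> Dict[str, str]:
    """Return {key: type name} for every value that has no `.device` attribute.

    This mirrors the attribute access that fails in the traceback; it does not
    reproduce the rest of the conversion logic.
    """
    return {
        key: type(value).__name__
        for key, value in state_dict.items()
        if not hasattr(value, "device")
    }


if __name__ == "__main__":
    # Stand-in for a tensor so the demo runs without PyTorch installed.
    class FakeTensor:
        device = "cpu"

    # Hypothetical keys: one tensor-like value and one plain list.
    demo = {"layer.weight": FakeTensor(), "layer.quant_state": [1, 2, 3]}
    print(find_non_tensor_entries(demo))
```

To check the real checkpoint, the same function could be called on the dict returned by `torch.load(<weight file>, map_location="cpu")`. Note that `torch.load` unpickles the file, so this should only be done with weights from a trusted source.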

Following this thread. I have the same problem.
