runtime error

Exit code: 1. Reason: se, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_tokens: None, max_input_length: None, max_total_tokens: None, waiting_served_ratio: 0.3, max_batch_prefill_tokens: Some( 4096, ), max_batch_total_tokens: None, max_waiting_tokens: 20, max_batch_size: None, cuda_graphs: None, hostname: "r-sidreds06-mhcbv1-scmsmucm-72cb9-lndyu", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: None, weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-generation-inference.router", cors_allow_origin: [], api_key: None, watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, tokenizer_config_path: None, disable_grammar_support: false, env: false, max_client_batch_size: 4, lora_adapters: None, usage_stats: On, payload_limit: 2000000, enable_prefill_logprobs: false, }
2025-03-19T00:42:36.170778Z  WARN text_generation_launcher::gpu: Cannot determine GPU compute capability: ModuleNotFoundError: No module named 'torch'
2025-03-19T00:42:36.170806Z  INFO text_generation_launcher: Using attention flashinfer - Prefix caching true
2025-03-19T00:42:36.170895Z  INFO text_generation_launcher: Using default cuda graphs [1, 2, 4, 8, 16, 32]
2025-03-19T00:42:36.171020Z  INFO download: text_generation_launcher: Starting check and download process for Sidreds06/MHCV1
2025-03-19T00:42:36.179566Z ERROR download: text_generation_launcher: Permission denied (os error 13) Error: DownloadError
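The launcher dies in the download step with "Permission denied (os error 13)" and "Error: DownloadError", which means the server process cannot write the model weights into its cache directory. A minimal diagnostic sketch, assuming the default TGI container cache mount at /data (the path, and the helper name check_cache_writable, are assumptions for illustration; substitute whatever huggingface_hub_cache or volume your deployment actually uses):

```shell
# Hypothetical diagnostic: report whether the model cache directory is
# writable by the current user. EACCES ("os error 13") during download
# means the answer for the running container user is "not writable".
check_cache_writable() {
  dir="$1"
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "writable"
  else
    echo "not writable"
  fi
}

# /data is the usual TGI cache mount; adjust to your deployment's
# huggingface_hub_cache setting if you override it.
check_cache_writable "${HUGGINGFACE_HUB_CACHE:-/data}"
```

If this prints "not writable", fixing ownership or permissions on the mounted cache volume (so the container user can create files there) is the usual remedy for this DownloadError. The earlier WARN about torch only affects GPU capability detection and is not what aborts the launch.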
