runtime error

β”‚ β”‚ β”‚ ) β”‚ β”‚ 487 β”‚ β”‚ raise ValueError( β”‚ β”‚ β”‚ β”‚ /home/user/.local/lib/python3.8/site-packages/transformers/modeling_utils.py β”‚ β”‚ :2819 in from_pretrained β”‚ β”‚ β”‚ β”‚ 2816 β”‚ β”‚ β”‚ β”‚ β”‚ key: device_map[key] for key in device_map.keys() β”‚ β”‚ 2817 β”‚ β”‚ β”‚ β”‚ } β”‚ β”‚ 2818 β”‚ β”‚ β”‚ β”‚ if "cpu" in device_map_without_lm_head.values() or "d β”‚ β”‚ ❱ 2819 β”‚ β”‚ β”‚ β”‚ β”‚ raise ValueError( β”‚ β”‚ 2820 β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ """ β”‚ β”‚ 2821 β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Some modules are dispatched on the CPU or the β”‚ β”‚ 2822 β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ the quantized model. If you want to dispatch β”‚ ╰──────────────────────────────────────────────────────────────────────────────╯ ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_cl asses/quantization#offload-between-cpu-and-gpu for more details.
