---
base_model: yentinglin/Llama-3-Taiwan-70B-Instruct
language:
- zh
- en
license: llama3
model_creator: yentinglin
model_name: Llama-3-Taiwan-70B-Instruct
model_type: llama
pipeline_tag: text-generation
quantized_by: minyichen
tags:
- llama-3
---

# Llama-3-Taiwan-70B-Instruct - GPTQ
- Model creator: [Yen-Ting Lin](https://huggingface.co/yentinglin)
- Original model: [Llama-3-Taiwan-70B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct)

## Description
This repo contains GPTQ model files for [Llama-3-Taiwan-70B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct).

* [GPTQ models for GPU inference](minyichen/Llama-3-Taiwan-70B-Instruct-GPTQ)
* [Yen-Ting Lin's original unquantized model](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct)

## Quantization parameters
- Bits: 4
- Group Size: 128
- Act Order: Yes
- Damp %: 0.1
- Seq Len: 2048
- Size: 37.07 GB

Quantization took about 6.5 hours on H100.
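
For readers who want to reproduce a similar setup, the parameters listed above can be written down as a small configuration sketch. This card does not say which tool produced the files, so the use of AutoGPTQ's `BaseQuantizeConfig` here is an assumption, and the helper name `build_quantize_config` is hypothetical. The snippet falls back to a plain dictionary when `auto_gptq` is not installed.

```python
# Values copied from the parameter list in this card. The key names follow
# AutoGPTQ's BaseQuantizeConfig (assumption: the card does not name the tool).
QUANT_PARAMS = {
    "bits": 4,            # Bits
    "group_size": 128,    # Group Size
    "desc_act": True,     # Act Order: Yes
    "damp_percent": 0.1,  # Damp %
}

# "Seq Len" is not a config field; it would be applied when tokenizing
# the calibration samples.
CALIBRATION_SEQ_LEN = 2048


def build_quantize_config(params=QUANT_PARAMS):
    """Hypothetical helper: return a BaseQuantizeConfig if auto_gptq is
    available, otherwise a plain copy of the parameter dictionary."""
    try:
        from auto_gptq import BaseQuantizeConfig
    except ImportError:
        return dict(params)
    return BaseQuantizeConfig(**params)


config = build_quantize_config()
```

The resulting `config` would typically be passed to the quantization library together with a calibration dataset truncated to `CALIBRATION_SEQ_LEN` tokens. The calibration data used for this repo is not stated here.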