base_model: yentinglin/Llama-3-Taiwan-70B-Instruct
language:
  - zh
  - en
license: llama3
model_creator: yentinglin
model_name: Llama-3-Taiwan-70B-Instruct
model_type: llama
pipeline_tag: text-generation
quantized_by: minyichen
tags:
  - llama-3

Llama-3-Taiwan-70B-Instruct-fp8

Description

This repo contains fp8 model files for Llama-3-Taiwan-70B-Instruct.

Quantization parameters

  • activation_scheme : static
  • quant_method : fp8
  • ignored_layers : lm_head
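
As a rough illustration, the parameters above can be written as the kind of `quantization_config` mapping that FP8 checkpoints commonly carry in their `config.json`. This is a minimal sketch: the key layout is assumed to mirror the list above, the exact contents of this repo's `config.json` are not confirmed here, and `is_quantized` is a hypothetical helper added only to show what `ignored_layers` implies.

```python
import json

# Assumed layout: keys mirror the parameters listed above. The actual
# config.json shipped in this repo may differ.
quantization_config = {
    "quant_method": "fp8",          # weights stored in 8-bit floating point
    "activation_scheme": "static",  # activation scales fixed ahead of time
    "ignored_layers": ["lm_head"],  # layers left unquantized
}


def is_quantized(layer_name: str, config: dict) -> bool:
    """Hypothetical helper: True if the layer is not listed in ignored_layers."""
    return layer_name not in config["ignored_layers"]


# Print the mapping the way it would be nested inside a config.json file.
print(json.dumps({"quantization_config": quantization_config}, indent=2))
```

With this mapping, `is_quantized("lm_head", quantization_config)` is false, while any layer not named in `ignored_layers` is treated as FP8-quantized.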

Quantization took about 8.5 hours on H100.