shibing624/llama-3-8b-instruct-262k-chinese


shibing624 committed on Apr 29

Commit 42bd84d • 1 Parent(s): 2e1633e

Update README.md

Files changed (1)

README.md +2 -2


@@ -34,8 +34,8 @@ llama-3-8b-instruct-262k-chinese is based on [Llama-3-8B-Instruct-262k](https://hugging
 Quantization | Peak Usage for Encoding 2048 Tokens | Peak Usage for Generating 8192 Tokens
 -- | -- | --
-FP16/BF16 | 17.66GB | 22.58GB
-Int4 | 8.21GB | 13.62GB
+FP16/BF16 | 18.66GB | 24.58GB
+Int4 | 9.21GB | 14.62GB
 Drawbacks: