---
license: other
---

# Model Card for llama-13b-hf-35q_4bit-128g_WVU

## Model Description

`llama-13b-hf-35q_4bit-128g_WVU` is a model based on the Llama architecture with 13 billion parameters.
In this model, the first 35 decoder layers have been quantized with the [`gptq`](https://github.com/qwopqwop200/GPTQ-for-LLaMa) method, using 4-bit precision and a group size of 128.
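
For reference, the snippet below is a minimal sketch of 4-bit, 128-group GPTQ quantization expressed through the `transformers` `GPTQConfig` integration. This is not the exact pipeline used for this checkpoint (the card links the GPTQ-for-LLaMa repository, and only the first 35 decoder layers were quantized here); the base model id and calibration dataset are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

# Illustrative base checkpoint; the actual base model used here is an assumption.
base_model = "huggyllama/llama-13b"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# 4-bit GPTQ with a group size of 128, calibrated on the "c4" dataset.
# Note: this quantizes every decoder layer; quantizing only the first 35
# layers, as done for this model, requires the layer-wise GPTQ-for-LLaMa flow.
quant_config = GPTQConfig(bits=4, group_size=128, dataset="c4", tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=quant_config,
    device_map="auto",
)
```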

The last 5 decoder layers (1/8 of the decoder layers) and `lm_head` have then been fine-tuned on the [wizard_vicuna_70k_unfiltered dataset](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered) for 1 epoch.
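
As a rough illustration of that fine-tuning setup, the sketch below freezes everything except the last 5 decoder layers and `lm_head` in a standard Llama checkpoint loaded with `transformers`. The model id is an assumption, and the actual training loop (data loading, optimizer, 1 epoch over the dataset) is omitted.

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative model id; the real partially-quantized checkpoint is prepared differently.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b", torch_dtype=torch.float16
)

# Freeze all parameters first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze the last 5 decoder layers and lm_head for fine-tuning.
for layer in model.model.layers[-5:]:
    for param in layer.parameters():
        param.requires_grad = True
for param in model.lm_head.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```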

## Note