valeriojob/MedGPT-Llama3.1-8B-BA-v.1-GGUF

valeriojob committed on Aug 3, 2024

Commit e0fceed · verified · 1 Parent(s): 4ead056

Update README.md

Files changed (1)

README.md +5 -0

@@ -18,6 +18,11 @@ tags:

 - Version 1 (v.1) of MedGPT is the very first version of MedGPT and the training dataset has been kept simple and small with only 60 examples.
 - This repo includes the quantized models in the GGUF format.  There is a separate repo called [valeriojob/MedGPT-Llama3.1-8B-BA-v.1](https://huggingface.co/valeriojob/MedGPT-Llama3.1-8B-BA-v.1) that includes the default 16bit format of the model as well as the LoRA adapters of the model.
 - This model was quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp).
+- This model is available in the following quantization formats:
+  - BF16
+  - Q4_K_M
+  - Q5_K_M
+  - Q8_0
 ## Model description
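
To make the trade-off between the four quantization formats added in this commit more concrete, here is a minimal sketch that estimates file size per format and picks the highest-precision one that fits a memory budget. The bits-per-weight figures are rough assumptions (only the Q8_0 value follows directly from its block layout; the K-quant values are approximate), the 8e9 parameter count is inferred from the "8B" in the model name, and the helper names `estimate_size_gb` and `largest_fitting_quant` are hypothetical. Actual file sizes in the repo may differ.

```python
from typing import Optional

# Approximate bits per weight for each format listed in the README.
# These are rough assumptions for illustration, not measurements of this repo's files.
APPROX_BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights plus one 16-bit scale per block
    "Q5_K_M": 5.7,   # approximate
    "Q4_K_M": 4.85,  # approximate
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Estimate the weight size in GB (10^9 bytes) for a given format."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 1e9


def largest_fitting_quant(n_params: float, budget_gb: float) -> Optional[str]:
    """Return the highest-precision format whose estimate fits the budget, or None."""
    by_precision = sorted(
        APPROX_BITS_PER_WEIGHT, key=APPROX_BITS_PER_WEIGHT.get, reverse=True
    )
    for quant in by_precision:
        if estimate_size_gb(n_params, quant) <= budget_gb:
            return quant
    return None


if __name__ == "__main__":
    n_params = 8e9  # assumption: roughly 8 billion parameters
    for quant in APPROX_BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_size_gb(n_params, quant):.1f} GB")
    print(largest_fitting_quant(n_params, budget_gb=6.0))
```

The estimate covers weights only; context buffers and runtime overhead add to the memory actually needed at inference time.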