phanerozoic committed
Commit 94d21b4
Parent(s): 148c9d6
Update README.md

README.md CHANGED
@@ -1,8 +1,6 @@
 ---
 license: cc-by-4.0
 ---
-Repository Description:
-
 Introducing PirateTalk-13b-v1-GGUF-Q8, a CPU-optimized iteration of the original PirateTalk-13b-v1 model, built on the 13b Llama 2 Chat architecture. This version has been quantized to 8 bits with the llama.cpp utility, making it suitable for CPU inference without requiring a GPU. It needs a minimum of 16 GB of RAM, and performance depends on CPU speed, ranging from slow to reasonable. Despite the move from fp16 to 8-bit quantization, the model retains a discernible piratical flair, with only a modest loss in vernacular quality.
 
 Objective:
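The quantize-then-run workflow the description refers to can be sketched with the llama.cpp command-line tools. The file names below are illustrative assumptions, not artifacts documented by this repository, and the binary names vary by llama.cpp version (newer releases ship them as `llama-quantize` and `llama-cli`):

```shell
# Quantize a hypothetical fp16 GGUF export of the model down to 8 bits (Q8_0).
# Input/output file names are assumptions for illustration.
./quantize piratetalk-13b-v1-f16.gguf piratetalk-13b-v1-q8_0.gguf Q8_0

# Run CPU-only inference on the quantized model:
#   -m  model file    -t  CPU threads    -n  tokens to generate    -p  prompt
./main -m piratetalk-13b-v1-q8_0.gguf -t 8 -n 128 -p "Ahoy, matey!"
```

With Q8_0 the 13b weights occupy roughly 13 GB on disk and in memory, which is why the description cites 16 GB of RAM as the practical floor.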