rozek/42dot_LLM-SFT-1.3B_GGUF

rozek committed on Nov 21, 2023

Commit c3b59d9 • 1 Parent(s): 66dc1cd

Update README.md

Files changed (1)

README.md +72 -0

@@ -1,3 +1,75 @@
 ---
 license: cc-by-nc-4.0
 ---

+# 42dot_LLM-SFT-1.3B_GGUF #
+* Model Creator: [42dot](https://huggingface.co/42dot)
+* Original Model: [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)
+## Description ##
+This repository contains the GGUF conversion and the most relevant quantizations
+of 42dot's
+[42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B) model - ready
+to be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar
+applications.
+## Files ##
+To allow for fine-tuning (the model has the required LLaMA architecture),
+the original GGUF conversion has been made available:
+* [42dot_LLM-SFT-1.3B.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B.gguf)
+From this file, the following quantizations were derived:
+* [42dot_LLM-SFT-1.3B-Q4_K_M](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q4_K_M.gguf)
+* [42dot_LLM-SFT-1.3B-Q5_K_M](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q5_K_M.gguf)
+* [42dot_LLM-SFT-1.3B-Q6_K](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q6_0.gguf)
+* [42dot_LLM-SFT-1.3B-Q8_K](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q8_0.gguf)
+(tell me if you need more)
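
Should a different quantization be needed, it can be derived locally from the unquantized file. The following is only a sketch: it assumes llama.cpp's `quantize` tool sits in the current directory, and both the output naming pattern and the `DRY_RUN` switch are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch: derive a quantized GGUF file from the unquantized conversion.
# Assumptions: llama.cpp's "quantize" binary is in the current directory;
# the <base>-<type>.gguf output name is a hypothetical naming pattern.

MODEL_BASE="42dot_LLM-SFT-1.3B"

# Print the (assumed) output file name for a quantization type such as Q4_K_M.
quant_file() {
  printf '%s-%s.gguf\n' "$MODEL_BASE" "$1"
}

# Quantize the base model to the given type (argument order: input, output, type).
# With DRY_RUN=1 (hypothetical switch) the command is printed instead of executed.
derive_quant() {
  ${DRY_RUN:+echo} ./quantize "$MODEL_BASE.gguf" "$(quant_file "$1")" "$1"
}
```

Calling `derive_quant Q4_K_M` would then write the quantized model next to the input file; `DRY_RUN=1 derive_quant Q4_K_M` only shows the command.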
+## Usage Details ##
+Technical details can be found on the
+[original model card](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B).
+The most important ones for using this model are:
+* context length is 4096
+* there does not seem to be a specific prompt structure - just provide the text
+you want to be completed
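
The two points above can be reflected directly on the command line. This is a minimal sketch, assuming llama.cpp's `main` binary is in the current directory and the model file is named as in the other commands of this README; the `complete_text` function and the `DRY_RUN` switch are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: plain text completion with the context window set explicitly.
# Assumptions: llama.cpp's "main" binary is in the current directory and
# the model file name matches the one used elsewhere in this README.

MODEL="42dot_LLM-SFT-1.3B-Q8_K.gguf"
CTX_SIZE=4096   # context length taken from the list above

# Complete the text given in $1. The text is passed as-is, without any
# prompt template wrapped around it.
# With DRY_RUN=1 (hypothetical switch) the command is printed instead of executed.
complete_text() {
  ${DRY_RUN:+echo} ./main -m "$MODEL" --ctx-size "$CTX_SIZE" --prompt "$1"
}
```

For example, `complete_text "Once upon a time"` would ask the model to continue that sentence.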
+### Text Completion with llama.cpp ###
+For simple inference, use a command similar to
+```
+./main -m 42dot_LLM-SFT-1.3B-Q8_K.gguf --temp 0 --top-k 4 --prompt "who was Joseph Weizenbaum?"
+```
+### Text Tokenization with llama.cpp ###
+To get a list of tokens, use a command similar to
+```
+./tokenize 42dot_LLM-SFT-1.3B-Q8_K.gguf "who was Joseph Weizenbaum?"
+```
+### Embeddings Calculation with llama.cpp ###
+Text embeddings are calculated with a command similar to
+```
+./embedding -m 42dot_LLM-SFT-1.3B-Q8_K.gguf --prompt "who was Joseph Weizenbaum?"
+```
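
Embeddings are usually compared rather than inspected by eye. The following sketch computes the cosine similarity of two vectors with `awk`, assuming each file contains nothing but the whitespace-separated numbers of one embedding (for instance, the standard output of one `./embedding` run redirected into a file). The function name `cosine_similarity` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: cosine similarity of two embedding vectors.
# Assumption: each input file holds only whitespace-separated floats
# belonging to a single vector (possibly spread over several lines).

cosine_similarity() {
  awk '
    # Count which input file is currently being read.
    FNR == 1 { file++ }
    # Store numbers of the first file in a[], those of the second in b[].
    { for (i = 1; i <= NF; i++) { if (file == 1) a[++n] = $i; else b[++m] = $i } }
    END {
      if (n == 0 || n != m) { print "vector length mismatch" > "/dev/stderr"; exit 1 }
      for (i = 1; i <= n; i++) { dot += a[i] * b[i]; na += a[i] ^ 2; nb += b[i] ^ 2 }
      if (na == 0 || nb == 0) { print "zero vector" > "/dev/stderr"; exit 1 }
      printf "%.4f\n", dot / (sqrt(na) * sqrt(nb))
    }
  ' "$1" "$2"
}
```

With two embeddings saved as `a.txt` and `b.txt`, `cosine_similarity a.txt b.txt` prints a value between -1 and 1, where values close to 1 indicate similar texts.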
+## License ##
+The original model "_is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)_".
+So, in order to be fair and give credit where it is due:
+* the original model was created and published by [42dot](https://huggingface.co/42dot)
+* besides quantization, no changes were applied to the model itself