make readme example code more readable
README.md CHANGED
@@ -42,9 +42,9 @@ We used [200k query-document pairs](https://huggingface.co/datasets/nixiesearch/
 
 This repo has multiple versions of the model:
 
-* model-*.safetensors: FP16 checkpoint, suitable for down-stream fine-tuning
-* ggml-model-f16.gguf: F16 non-quantized llama-cpp checkpoint, for CPU inference
-* ggml-model-q4.gguf: Q4_0 quantized llama-cpp checkpoint, for fast (and less precise) CPU inference.
+* model-*.safetensors: PyTorch FP16 checkpoint, suitable for downstream fine-tuning
+* ggml-model-f16.gguf: GGUF F16 non-quantized [llama-cpp](https://github.com/ggerganov/llama.cpp) checkpoint, for CPU inference
+* ggml-model-q4.gguf: GGUF Q4_0 quantized [llama-cpp](https://github.com/ggerganov/llama.cpp) checkpoint, for fast (and less precise) CPU inference.
 
 ## Prompt formats
 
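For reference, the two GGUF files can be produced from the PyTorch checkpoint with llama.cpp's own tooling. A minimal sketch, assuming a 2023-era llama.cpp checkout where the converter lives at `convert.py` and the quantizer binary is `./quantize` (both have moved around between llama.cpp versions, so treat this as illustrative rather than the exact commands used):

```bash
# Convert the PyTorch/safetensors checkpoint to a non-quantized F16 GGUF
python convert.py ~/models/nixie-querygen-v2 --outtype f16 --outfile ggml-model-f16.gguf

# Quantize F16 -> Q4_0 for faster (and less precise) CPU inference
./quantize ggml-model-f16.gguf ggml-model-q4.gguf q4_0
```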
@@ -61,12 +61,14 @@ Some notes on format:
 
 ## Inference example
 
-With llama-cpp and Q4 model the inference can be done on a CPU:
+With [llama-cpp](https://github.com/ggerganov/llama.cpp) and the Q4 model, inference can be done on a CPU:
 
 ```bash
-$ ./main -m ~/models/nixie-querygen-v2/ggml-model-q4.gguf -p "git lfs track will begin tracking a new file or an existing file that is already checked in to your repository. When you run git lfs track and then commit that change, it will update the file, replacing it with the LFS pointer contents. short query:" -s 1
+$ ./main -m ~/models/nixie-querygen-v2/ggml-model-q4.gguf -p "git lfs track will begin tracking \
+a new file or an existing file that is already checked in to your repository. When you run git \
+lfs track and then commit that change, it will update the file, replacing it with the LFS \
+pointer contents. short query:" -s 1
 
-system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
 sampling:
 repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
 top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
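The prompt is simply the document text followed by the `short query:` suffix, so the document can also be read from a file instead of being pasted inline. A hedged variant of the call above, assuming the document lives in a hypothetical `doc.txt` (`-t`, `-n` and `--temp` are standard llama.cpp `main` flags; the values are illustrative):

```bash
# Read the document from doc.txt (hypothetical file), append the prompt suffix,
# pin 8 threads, cap generation at 32 tokens, and fix the seed for reproducibility
./main -m ~/models/nixie-querygen-v2/ggml-model-q4.gguf \
  -t 8 -n 32 --temp 0.8 -s 1 \
  -p "$(cat doc.txt) short query:"
```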
@@ -76,7 +78,9 @@ CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp
 
 generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0
 
-git lfs track will begin tracking a new file or an existing file that is already checked in to your repository. When you run git lfs track and then commit that change, it will update the file, replacing it with the LFS pointer contents. short regular query: git-lfs track [end of text]
+git lfs track will begin tracking a new file or an existing file that is already checked in to your
+repository. When you run git lfs track and then commit that change, it will update the file,
+replacing it with the LFS pointer contents. short regular query: git-lfs track [end of text]
 ```
 
 ## Training config