Tags: Text Generation · Transformers · Safetensors · GGUF · English · mistral · Inference Endpoints · text-generation-inference
shuttie committed
Commit f849654 (1 parent: 6bd6c7e)

make readme example code more readable

Files changed (1): README.md (+11 -7)
README.md CHANGED
@@ -42,9 +42,9 @@ We used [200k query-document pairs](https://huggingface.co/datasets/nixiesearch/
 
 This repo has multiple versions of the model:
 
-* model-*.safetensors: FP16 checkpoint, suitable for down-stream fine-tuning
-* ggml-model-f16.gguf: F16 non-quantized llama-cpp checkpoint, for CPU inference
-* ggml-model-q4.gguf: Q4_0 quantized llama-cpp checkpoint, for fast (and less precise) CPU inference.
+* model-*.safetensors: Pytorch FP16 checkpoint, suitable for down-stream fine-tuning
+* ggml-model-f16.gguf: GGUF F16 non-quantized [llama-cpp](https://github.com/ggerganov/llama.cpp) checkpoint, for CPU inference
+* ggml-model-q4.gguf: GGUF Q4_0 quantized [llama-cpp](https://github.com/ggerganov/llama.cpp) checkpoint, for fast (and less precise) CPU inference.
 
 ## Prompt formats
 
@@ -61,12 +61,14 @@ Some notes on format:
 
 ## Inference example
 
-With llama-cpp and Q4 model the inference can be done on a CPU:
+With [llama-cpp](https://github.com/ggerganov/llama.cpp) and Q4 model the inference can be done on a CPU:
 
 ```bash
-$ ./main -m ~/models/nixie-querygen-v2/ggml-model-q4.gguf -p "git lfs track will begin tracking a new file or an existing file that is already checked in to your repository. When you run git lfs track and then commit that change, it will update the file, replacing it with the LFS pointer contents. short query:" -s 1
+$ ./main -m ~/models/nixie-querygen-v2/ggml-model-q4.gguf -p "git lfs track will begin tracking \
+a new file or an existing file that is already checked in to your repository. When you run git \
+lfs track and then commit that change, it will update the file, replacing it with the LFS \
+pointer contents. short query:" -s 1
 
-system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
 sampling:
 repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
 top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
@@ -76,7 +78,9 @@ CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp
 generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0
 
 
-git lfs track will begin tracking a new file or an existing file that is already checked in to your repository. When you run git lfs track and then commit that change, it will update the file, replacing it with the LFS pointer contents. short regular query: git-lfs track [end of text]
+git lfs track will begin tracking a new file or an existing file that is already checked in to your
+repository. When you run git lfs track and then commit that change, it will update the file,
+replacing it with the LFS pointer contents. short regular query: git-lfs track [end of text]
 ```
 
 ## Training config
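
For the `model-*.safetensors` variant that the diff describes as suitable for down-stream fine-tuning, a minimal loading sketch might look like the following. This is an illustration, not part of the commit: it assumes the `transformers` and `torch` packages are installed, and the repo id `nixiesearch/nixie-querygen-v2` is inferred from the model directory name in the shell example, not stated in the diff itself.

```python
# Hypothetical sketch: load the FP16 safetensors checkpoint for fine-tuning.
# The default repo id is an assumption inferred from the example's model path.
def load_for_finetuning(repo_id: str = "nixiesearch/nixie-querygen-v2"):
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # torch.float16 matches the checkpoint's storage dtype, keeping memory low.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)
    return tokenizer, model
```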
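
The CPU inference command in the diff can also be reproduced from Python via the `llama-cpp-python` bindings. This is a sketch under two assumptions not made by the commit: that those bindings are installed, and that the Q4 checkpoint has been downloaded to the same path as in the shell example.

```python
from os.path import expanduser

# The prompt is the source document followed by the model's "short query:" suffix,
# exactly as in the `./main -p "..."` example.
document = (
    "git lfs track will begin tracking a new file or an existing file that is "
    "already checked in to your repository. When you run git lfs track and then "
    "commit that change, it will update the file, replacing it with the LFS "
    "pointer contents."
)
prompt = f"{document} short query:"


def generate_query(model_path: str, prompt: str, seed: int = 1) -> str:
    """CPU inference with the Q4_0 GGUF checkpoint, mirroring `./main ... -s 1`."""
    from llama_cpp import Llama  # deferred import: heavy native dependency

    llm = Llama(model_path=expanduser(model_path), n_ctx=512, seed=seed)
    out = llm(prompt, max_tokens=32)
    return out["choices"][0]["text"]
```

Calling `generate_query("~/models/nixie-querygen-v2/ggml-model-q4.gguf", prompt)` would then be expected to produce a short query along the lines of the `git-lfs track` completion shown in the diff.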