RachidAR committed on
Commit 05602b5
1 Parent(s): 8f62317

Upload README.md with huggingface_hub
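(For context: commits like this are typically pushed with the huggingface_hub tooling. A minimal sketch using its CLI follows; the local file path and flags are illustrative, not taken from this commit.)

```bash
# Illustrative sketch only: pushing a local README.md to this repo with the
# huggingface_hub CLI. Assumes `pip install huggingface_hub` and a prior
# `huggingface-cli login`; the ./README.md path is hypothetical.
huggingface-cli upload RachidAR/saiga_llama3_8b-Q6_K-GGUF ./README.md README.md \
  --commit-message "Upload README.md with huggingface_hub"
```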

Files changed (1):
  README.md +29 -14
README.md CHANGED
@@ -1,42 +1,57 @@
 ---
+base_model: IlyaGusev/saiga_llama3_8b
+datasets:
+- IlyaGusev/saiga_scored
 language:
 - ru
 license: other
+license_name: llama3
+license_link: https://llama.meta.com/llama3/license/
 tags:
 - llama-cpp
 - gguf-my-repo
-datasets:
-- IlyaGusev/saiga_scored
-license_name: llama3
-license_link: https://llama.meta.com/llama3/license/
 ---
 
 # RachidAR/saiga_llama3_8b-Q6_K-GGUF
 This model was converted to GGUF format from [`IlyaGusev/saiga_llama3_8b`](https://huggingface.co/IlyaGusev/saiga_llama3_8b) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/IlyaGusev/saiga_llama3_8b) for more details on the model.
-## Use with llama.cpp
 
-Install llama.cpp through brew.
+## Use with llama.cpp
+Install llama.cpp through brew (works on Mac and Linux).
 
 ```bash
-brew install ggerganov/ggerganov/llama.cpp
+brew install llama.cpp
+
 ```
 Invoke the llama.cpp server or the CLI.
 
-CLI:
-
+### CLI:
 ```bash
-llama-cli --hf-repo RachidAR/saiga_llama3_8b-Q6_K-GGUF --model saiga_llama3_8b.Q6_K.gguf -p "The meaning to life and the universe is"
+llama-cli --hf-repo RachidAR/saiga_llama3_8b-Q6_K-GGUF --hf-file saiga_llama3_8b-q6_k.gguf -p "The meaning to life and the universe is"
 ```
 
-Server:
-
+### Server:
 ```bash
-llama-server --hf-repo RachidAR/saiga_llama3_8b-Q6_K-GGUF --model saiga_llama3_8b.Q6_K.gguf -c 2048
+llama-server --hf-repo RachidAR/saiga_llama3_8b-Q6_K-GGUF --hf-file saiga_llama3_8b-q6_k.gguf -c 2048
 ```
 
 Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
 
+Step 1: Clone llama.cpp from GitHub.
+```
+git clone https://github.com/ggerganov/llama.cpp
+```
+
+Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
+```
+cd llama.cpp && LLAMA_CURL=1 make
+```
+
+Step 3: Run inference through the main binary.
+```
+./llama-cli --hf-repo RachidAR/saiga_llama3_8b-Q6_K-GGUF --hf-file saiga_llama3_8b-q6_k.gguf -p "The meaning to life and the universe is"
+```
+or
 ```
-git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make && ./main -m saiga_llama3_8b.Q6_K.gguf -n 128
+./llama-server --hf-repo RachidAR/saiga_llama3_8b-Q6_K-GGUF --hf-file saiga_llama3_8b-q6_k.gguf -c 2048
 ```
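
For reference, the conversion that the GGUF-my-repo space automates corresponds roughly to the following local steps. This is a sketch under assumptions: a llama.cpp checkout built as above, mid-2024 script and binary names (newer llama.cpp releases renamed them, e.g. `quantize` to `llama-quantize`), and illustrative paths.

```bash
# Sketch of a local equivalent of the GGUF-my-repo conversion (not from the card).
# Convert the original HF checkpoint to an f16 GGUF file:
python convert-hf-to-gguf.py ./saiga_llama3_8b --outtype f16 --outfile saiga_llama3_8b-f16.gguf
# Quantize the f16 GGUF down to Q6_K:
./quantize saiga_llama3_8b-f16.gguf saiga_llama3_8b-q6_k.gguf Q6_K
```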
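
If a given llama.cpp build predates the `--hf-repo`/`--hf-file` download flags, one workaround (a sketch, not from the card) is to fetch the GGUF manually with the huggingface_hub CLI and pass it with `-m`:

```bash
# Assumes huggingface-cli is installed; the filename matches the repo listing.
huggingface-cli download RachidAR/saiga_llama3_8b-Q6_K-GGUF saiga_llama3_8b-q6_k.gguf --local-dir .
./llama-cli -m saiga_llama3_8b-q6_k.gguf -p "The meaning to life and the universe is"
```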