Commit e3534b9 by andrijdavid
Parent(s): e54089f

Upload folder using huggingface_hub
Files changed:
- .gitattributes +18 -0
- MiniMA-2-3B-Q2_K.gguf +3 -0
- MiniMA-2-3B-Q3_K.gguf +3 -0
- MiniMA-2-3B-Q3_K_L.gguf +3 -0
- MiniMA-2-3B-Q3_K_M.gguf +3 -0
- MiniMA-2-3B-Q3_K_S.gguf +3 -0
- MiniMA-2-3B-Q4_0.gguf +3 -0
- MiniMA-2-3B-Q4_1.gguf +3 -0
- MiniMA-2-3B-Q4_K.gguf +3 -0
- MiniMA-2-3B-Q4_K_M.gguf +3 -0
- MiniMA-2-3B-Q4_K_S.gguf +3 -0
- MiniMA-2-3B-Q5_0.gguf +3 -0
- MiniMA-2-3B-Q5_1.gguf +3 -0
- MiniMA-2-3B-Q5_K.gguf +3 -0
- MiniMA-2-3B-Q5_K_M.gguf +3 -0
- MiniMA-2-3B-Q5_K_S.gguf +3 -0
- MiniMA-2-3B-Q6_K.gguf +3 -0
- MiniMA-2-3B-Q8_0.gguf +3 -0
- MiniMA-2-3B-f16.gguf +3 -0
- README.md +6 -6
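
The commit message says this upload was made with `huggingface_hub`'s folder upload. As a minimal sketch of the kind of call that produces a commit like this one (the local folder path is an assumption, not taken from this commit):

```python
# Minimal sketch of an upload_folder call; the local folder path is
# illustrative only, not recorded in this commit.
from huggingface_hub import HfApi

api = HfApi()  # uses the token saved by `huggingface-cli login` by default
api.upload_folder(
    folder_path="./MiniMA-2-3B-GGUF",        # assumed local dir holding the .gguf files
    repo_id="andrijdavid/MiniMA-2-3B-GGUF",  # target model repo on the Hub
    commit_message="Upload folder using huggingface_hub",
)
```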
.gitattributes CHANGED
@@ -33,3 +33,21 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q3_K.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q4_K.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q5_K.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+MiniMA-2-3B-f16.gguf filter=lfs diff=lfs merge=lfs -text
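
Each added line follows the standard Git LFS tracking rule format that `git lfs track <pattern>` writes, so every multi-gigabyte `.gguf` file is stored as a small LFS pointer rather than a raw blob. As a minimal sketch (the helper name is illustrative), such a file can be parsed to list the LFS-tracked patterns:

```python
# Minimal sketch: list LFS-tracked patterns from a .gitattributes file.
# Assumes the simple one-pattern-per-line layout used in this repo.
from pathlib import Path

def lfs_tracked_patterns(path: str = ".gitattributes") -> list[str]:
    patterns = []
    for line in Path(path).read_text().splitlines():
        parts = line.split()
        # A tracked line looks like: <pattern> filter=lfs diff=lfs merge=lfs -text
        if len(parts) > 1 and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns

if __name__ == "__main__":
    print(lfs_tracked_patterns())  # e.g. ['*.zip', ..., 'MiniMA-2-3B-f16.gguf']
```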
MiniMA-2-3B-Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9b825e5ae63583dc61ba25d6af1287beeb0f5568ebdab761a661f2f7c9fdfb32
+size 1297187936

MiniMA-2-3B-Q3_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:44ad7723ccfb80402f9e9c58f7c315ee485a201be525427e071eb24e2ca172c0
+size 1507578464

MiniMA-2-3B-Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e58fb3cc55e5f18931e3362dfc4067afdbde98ad1bf35d166ae850d7bfe2b86
+size 1631048288

MiniMA-2-3B-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:44ad7723ccfb80402f9e9c58f7c315ee485a201be525427e071eb24e2ca172c0
+size 1507578464

MiniMA-2-3B-Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e9455d900df1a24da51ee67f3bac9bcca729bdd426468ac296f712f2c784792
+size 1358549600

MiniMA-2-3B-Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bf9773553da9001bc733a2429022096f2028cdda6f0568e7dea5be8b1c54adc9
+size 1739602016

MiniMA-2-3B-Q4_1.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a39ccdbb7cb17d40924b3c4601c09885764dc9ed7dc7191eea70c149d2beb039
+size 1918920800

MiniMA-2-3B-Q4_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:65b72cfc709c53664e725bb84fa57b8f631670960a46aded0d25d2aa22aebe41
+size 1846655072

MiniMA-2-3B-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:65b72cfc709c53664e725bb84fa57b8f631670960a46aded0d25d2aa22aebe41
+size 1846655072

MiniMA-2-3B-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bf6c666583cc666413f086200bd0264f5adf8fb75abd0560cc04c50e0072baad
+size 1756903520

MiniMA-2-3B-Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d847b35f8500782418461d6f89e37beb0cea67d960629a0cfc5987f4c7a2d01
+size 2098239584

MiniMA-2-3B-Q5_1.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:528a74a27e05080ef9787431f64d6963e1facad90aea418380d2b6ba8f8c906b
+size 2277558368

MiniMA-2-3B-Q5_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ede793da2b0b652ab4e4692407c8b9a6ab3c6bf4b8b278eb1ca4a855f78d94e5
+size 2153388128

MiniMA-2-3B-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ede793da2b0b652ab4e4692407c8b9a6ab3c6bf4b8b278eb1ca4a855f78d94e5
+size 2153388128

MiniMA-2-3B-Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c3e753370aeecbcbbdd5f4bf7e38d5045de1b96328519cdc4e83f6be9fd22edf
+size 2098239584

MiniMA-2-3B-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67b29d745d386b974991a46d9ca1bdc96cb07c45a6c5615d2d065ad108c4a4f7
+size 2479292000

MiniMA-2-3B-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21c1cfb7d853accd06b6f5b07b5b760547a5598cf5ffe814446ca94852d337f7
+size 3210768992

MiniMA-2-3B-f16.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4211a5597771040961e17636fdca599f1eeeba382a96a4ae746c821cda916823
+size 6042292800
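
Every file added above is a Git LFS pointer rather than the weights themselves: `version` names the pointer spec, `oid sha256:...` is the hash of the real blob, and `size` is its length in bytes. Note that the `Q3_K`/`Q3_K_M`, `Q4_K`/`Q4_K_M`, and `Q5_K`/`Q5_K_M` pairs carry identical `oid` and `size` values, i.e. they point at the same blob (the bare `K` name is effectively the `_M` variant). A minimal sketch of verifying a downloaded blob against its pointer, with illustrative function and path names:

```python
# Minimal sketch: check a downloaded blob against a Git LFS pointer file.
# Function names and paths are illustrative, not part of this commit.
import hashlib
from pathlib import Path

def parse_lfs_pointer(pointer_text: str) -> dict:
    """Parse the key/value lines of an LFS pointer (version, oid, size)."""
    fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
    return {
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

def verify_blob(blob_path: str, pointer_text: str) -> bool:
    expected = parse_lfs_pointer(pointer_text)
    blob = Path(blob_path)
    if blob.stat().st_size != expected["size"]:
        return False
    h = hashlib.sha256()
    with blob.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == expected["oid"]
```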
README.md CHANGED
@@ -68,7 +68,7 @@ The following clients/libraries will automatically download models for you, prov
 
 ### In `text-generation-webui`
 
-Under Download Model, you can enter the model repo: andrijdavid/MiniMA-2-3B-GGUF and below it, a specific filename to download, such as: MiniMA-2-3B.gguf.
+Under Download Model, you can enter the model repo: andrijdavid/MiniMA-2-3B-GGUF and below it, a specific filename to download, such as: MiniMA-2-3B-f16.gguf.
 
 Then click Download.
 
@@ -83,7 +83,7 @@ pip3 install huggingface-hub
 Then you can download any individual model file to the current directory, at high speed, with a command like this:
 
 ```shell
-huggingface-cli download andrijdavid/MiniMA-2-3B-GGUF MiniMA-2-3B.gguf --local-dir . --local-dir-use-symlinks False
+huggingface-cli download andrijdavid/MiniMA-2-3B-GGUF MiniMA-2-3B-f16.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 <details>
@@ -106,7 +106,7 @@ pip3 install hf_transfer
 And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:
 
 ```shell
-HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download andrijdavid/MiniMA-2-3B-GGUF MiniMA-2-3B.gguf --local-dir . --local-dir-use-symlinks False
+HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download andrijdavid/MiniMA-2-3B-GGUF MiniMA-2-3B-f16.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 Windows Command Line users: You can set the environment variable by running `set HF_HUB_ENABLE_HF_TRANSFER=1` before the download command.
@@ -118,7 +118,7 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 35 -m MiniMA-2-3B.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<PROMPT>"
+./main -ngl 35 -m MiniMA-2-3B-f16.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<PROMPT>"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
@@ -169,7 +169,7 @@ pip install llama-cpp-python
 from llama_cpp import Llama
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
 llm = Llama(
-  model_path="./MiniMA-2-3B.gguf", # Download the model file first
+  model_path="./MiniMA-2-3B-f16.gguf", # Download the model file first
   n_ctx=32768, # The max sequence length to use - note that longer sequence lengths require much more resources
   n_threads=8, # The number of CPU threads to use, tailor to your system and the resulting performance
   n_gpu_layers=35 # The number of layers to offload to GPU, if you have GPU acceleration available
@@ -182,7 +182,7 @@ output = llm(
   echo=True # Whether to echo the prompt
 )
 # Chat Completion API
-llm = Llama(model_path="./MiniMA-2-3B.gguf", chat_format="llama-2") # Set chat_format according to the model you are using
+llm = Llama(model_path="./MiniMA-2-3B-f16.gguf", chat_format="llama-2") # Set chat_format according to the model you are using
 llm.create_chat_completion(
     messages = [
         {"role": "system", "content": "You are a story writing assistant."},
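
The README edits above only swap the placeholder filename for a real one (`MiniMA-2-3B-f16.gguf`). For completeness, the download step the diff documents via `huggingface-cli` can also be done from Python with `huggingface_hub`; a minimal sketch, where the choice of `Q4_K_M` is just one example from the files in this commit:

```python
# Minimal sketch: fetch one quant from this repo with huggingface_hub,
# mirroring the huggingface-cli command shown in the README diff.
# The quant filename is an example; any file listed in this commit works.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="andrijdavid/MiniMA-2-3B-GGUF",
    filename="MiniMA-2-3B-Q4_K_M.gguf",
    local_dir=".",
)
print(path)  # local path to the downloaded GGUF file
```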