gguf 4bit (#1)
by pacozaa — opened

Files changed:
- .gitattributes +0 −1
- Modelfile +0 −4
- README.md +2 −27
- openthaigpt-Q4_K_M.gguf +0 −3
.gitattributes
CHANGED
```diff
@@ -34,4 +34,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 ggml-model-f16.gguf filter=lfs diff=lfs merge=lfs -text
-openthaigpt-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
```
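The removed `.gitattributes` line stops routing `openthaigpt-Q4_K_M.gguf` through Git LFS (the file itself is deleted in this PR). As a rough illustration of how such attribute lines select files, here is a minimal Python sketch using `fnmatch` globs; real Git attribute matching has more rules (path-anchored patterns, `**`, negation), so this is an approximation only.

```python
from fnmatch import fnmatch

def lfs_patterns(gitattributes_text):
    """Collect glob patterns whose attribute lines include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if len(parts) > 1 and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns

def is_lfs_tracked(path, patterns):
    """True if the path matches any LFS pattern (simplified glob semantics)."""
    return any(fnmatch(path, pat) for pat in patterns)

# The rules that remain after this PR:
attrs = """\
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
ggml-model-f16.gguf filter=lfs diff=lfs merge=lfs -text
"""

pats = lfs_patterns(attrs)
print(is_lfs_tracked("ggml-model-f16.gguf", pats))     # True
print(is_lfs_tracked("openthaigpt-Q4_K_M.gguf", pats))  # False: rule removed
```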
Modelfile
DELETED
```diff
@@ -1,4 +0,0 @@
-FROM ./openthaigpt-Q4_K_M.gguf
-TEMPLATE """[INST] <<SYS>>{{ .System }}<</SYS>>
-
-{{ .Prompt }} [/INST]"""
```
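The deleted Modelfile's `TEMPLATE` is a Go template that Ollama expands into Llama-2's `[INST]`/`<<SYS>>` chat format. A minimal Python sketch of the same expansion (the function name `render_prompt` and the sample strings are illustrative, not part of Ollama's API):

```python
def render_prompt(system, prompt):
    # Mirrors the deleted Modelfile TEMPLATE: Llama-2 [INST] chat format,
    # with {{ .System }} and {{ .Prompt }} substituted in.
    return f"[INST] <<SYS>>{system}<</SYS>>\n\n{prompt} [/INST]"

print(render_prompt("You are a helpful assistant.", "สวัสดีครับ"))
```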
README.md
CHANGED
````diff
@@ -1,5 +1,5 @@
 ---
-license:
+license: apache-2.0
 language:
 - th
 - en
@@ -44,17 +44,6 @@ Thai language multiple choice exams, Test on unseen test sets, Zero-shot learning
 
 (Updated on: 7 April 2024)
 
-## Benchmark on M3Exam evaluated by an external party (Float16.cloud)
-
-| **Models** | **ENGLISH (M3EXAM)** | **THAI (M3EXAM)** |
-|---------------------|------------------|---------------|
-| <b style="color:blue">OTG-7b</b> | <b style="color:blue">40.92 %</b> | <b style="color:blue">25.14 %</b> |
-| OTG-13b | 53.69 % | 36.49 % |
-| OTG-70b | 72.58 % | 48.29 % |
-| GPT-3.5-turbo-0613* | - | 34.1 % |
-| GPT-4-0613* | - | 56.0 % |
-More information: https://blog.float16.cloud/the-first-70b-thai-llm/
-
 ## Licenses
 **Source Code**: License Apache Software License 2.0.<br>
 **Weight**: Research and **Commercial uses**.<br>
@@ -236,20 +225,6 @@ curl --location 'http://localhost:8000/completion' \
 }'
 ```
 
-### Ollama
-
-There are two ways to run on ollama
-
-1. From this repo Modelfile and 4 bit quantized gguf
-```bash
-ollama create -f ./Modelfile
-```
-
-2. From Ollama CLI
-```bash
-ollama run pacozaa/openthaigpt
-```
-
 ### GPU Memory Requirements
 | **Number of Parameters** | **FP 16 bits** | **8 bits (Quantized)** | **4 bits (Quantized)** | **Example Graphic Card for 4 bits** |
 |------------------|----------------|------------------------|------------------------|---------------------------------------------|
@@ -275,4 +250,4 @@ ollama run pacozaa/openthaigpt
 * Kriangkrai Saetan (kraitan.ss21@gmail.com)
 * Pitikorn Khlaisamniang (pitikorn32@gmail.com)
 
-<i>Disclaimer: Provided responses are not guaranteed.</i>
+<i>Disclaimer: Provided responses are not guaranteed.</i>
````
openthaigpt-Q4_K_M.gguf
DELETED
```diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7c9fe7f5dfb4adaac2211b8159297d3689abb9c94259a9e8b75a86348cf6fdda
-size 4132760544
```
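What gets deleted here is not the 4 GB of weights themselves but the Git LFS pointer file that stands in for them in the repository: three `key value` lines giving the spec version, the content hash, and the byte size. A small sketch parsing that pointer format (the helper name `parse_lfs_pointer` is illustrative):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content removed by this PR:
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7c9fe7f5dfb4adaac2211b8159297d3689abb9c94259a9e8b75a86348cf6fdda
size 4132760544
"""

info = parse_lfs_pointer(pointer)
print(round(int(info["size"]) / 2**30, 2))  # 3.85 (GiB) for the Q4_K_M file
```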