kuotient committed
Commit: 05afa0b
Parent: 3c4bd05

Update README.md

Files changed (1): README.md (+5, -3)
README.md CHANGED
@@ -12,8 +12,10 @@ tags:
  ![Mamba-ko-2.8B](./Seagull-mamba.png)
  **Mamba-ko-2.8B** is a state space model, further pretrained (or continually trained) on the synthetically generated dataset [**korean_textbooks**](https://huggingface.co/datasets/maywell/korean_textbooks).
 
- If you're interested in building large-scale language models to solve a wide variety of problems in a wide variety of domains, you should consider joining [Allganize](https://allganize.career.greetinghr.com/o/65146).
+ > If you're interested in building large-scale language models to solve a wide variety of problems in a wide variety of domains, you should consider joining [Allganize](https://allganize.career.greetinghr.com/o/65146).
  For a coffee chat or if you have any questions, please do not hesitate to contact me as well! - jisoo.kim@allganize.ai
+
+ I would like to thank Allganize Korea for their generosity in providing resources for this personal project. This project is not directly related to the company's goals or research.
  ## TODO
  - Complete training with korean_textbooks - 6B tokens down, 2B to go.
  - More training with publicly available Korean corpora
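The continued-pretraining corpus referenced in the hunk above is public on the Hub. Below is a minimal sketch for inspecting it with the `datasets` library; korean_textbooks ships several configs, so the sketch discovers them at runtime instead of assuming a name:

```python
# Minimal sketch: peek at the korean_textbooks corpus used for continued pretraining.
# The dataset ships several configs, so we list them rather than hardcoding one.
from datasets import get_dataset_config_names, load_dataset

configs = get_dataset_config_names("maywell/korean_textbooks")
print(configs)

# Stream a few rows from the first config without downloading the full corpus.
ds = load_dataset("maywell/korean_textbooks", configs[0], split="train", streaming=True)
for i, row in enumerate(ds):
    print(row)  # each row is a dict of text fields
    if i == 2:
        break
```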
@@ -31,7 +33,7 @@ Jisoo Kim(kuotient)
  ### KoBEST
  | Model | boolq | copa | hellaswag | sentineg |
  | --- | --- | --- | --- | --- |
- | kuotient/mamba-ko-2.8b | 0.5825 | 0.6166 | 0.4051 | 0.3383 |
+ | kuotient/mamba-ko-2.8b* | 0.5825 | 0.6166 | 0.4051 | 0.3383 |
  | state_spaces/mamba-2.8b-slimpj | 0.3343 | 0.4867 | 0.3452 | 0.3547 |
  | kuotient/mamba-ko-2.8b-old (2B trained only) | 0.4236 | 0.5896 | 0.4012 | 0.4348 |
  | kuotient/mamba-ko-2.8b-old-instruct | 0.4041 | 0.6505 | 0.4906 | 0.3348 |
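To reproduce the KoBEST rows above, a hedged sketch with EleutherAI's lm-evaluation-harness follows. It assumes a recent harness (0.4 or later) that ships the `kobest_*` tasks and a `mamba_ssm` model backend; both names are assumptions about the installed version, not something this card pins down:

```python
# Sketch: score the model on KoBEST with lm-evaluation-harness.
# Assumes lm-eval >= 0.4 with the kobest_* tasks and the mamba_ssm backend available.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="mamba_ssm",  # Mamba checkpoints need the mamba_ssm backend, not plain "hf"
    model_args="pretrained=kuotient/mamba-ko-2.8b",
    tasks=["kobest_boolq", "kobest_copa", "kobest_hellaswag", "kobest_sentineg"],
    batch_size=32,
)
print(results["results"])
```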
@@ -39,7 +41,7 @@
  | maywell/TinyWand-SFT | 0.3455 | 0.6142 | 0.3944 | N/A |
  | microsoft/phi-2 | 0.3343 | 0.4792 | 0.3235 | N/A |
  | TinyLlama/TinyLlama-1.1B | 0.3343 | 0.4784 | 0.3396 | N/A |
- 
+ *Trained on >6B tokens so far; training continues up to 8B tokens.
  ### Thanks
  Many thanks to [maywell](https://huggingface.co/maywell), who has contributed a great deal to the Korean LLM community and has been a great source of motivation.
  ## Usage
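The Usage section's body lies outside this hunk. For reference, here is a minimal generation sketch built on the `mamba_ssm` reference implementation (installed via `pip install mamba-ssm causal-conv1d`); loading the tokenizer straight from the model repo is an assumption, so point `AutoTokenizer` at the base checkpoint's tokenizer if the repo does not ship one:

```python
# Minimal sketch: text generation with the mamba_ssm reference implementation.
# Assumption: the model repo ships tokenizer files; otherwise load the tokenizer
# the base checkpoint was trained with (e.g. EleutherAI/gpt-neox-20b).
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"  # the reference kernels require a CUDA device
tokenizer = AutoTokenizer.from_pretrained("kuotient/mamba-ko-2.8b")
model = MambaLMHeadModel.from_pretrained(
    "kuotient/mamba-ko-2.8b", device=device, dtype=torch.float16
)

input_ids = tokenizer("대한민국의 수도는", return_tensors="pt").input_ids.to(device)
# mamba_ssm's generate() returns the full token sequence tensor by default.
out = model.generate(
    input_ids=input_ids, max_length=128, temperature=0.7, top_k=50, top_p=0.9
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```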
 