sho-takase committed
Commit 3fa4cb4
1 Parent(s): b083c4b

Update README.md
README.md
CHANGED
@@ -11,7 +11,9 @@ This repository provides Japanese language models trained by [SB Intuitions](htt
 
 ## How to use
 
-
+Please set **use_fast=False** to use our tokenizer properly.
+
+```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, set_seed
 
@@ -35,7 +37,7 @@ for t in text:
 
 ## Configuration
 
-| Parameters | Vocab size |
+| Parameters | Vocab size | Training tokens | Architecture | Position type | Layers | Hidden dim | Attention heads |
 | :-----: | :-----------: | :-------------: | :----------- | :-----------: | :----: | :--------: | :-------------: |
 | [7B](https://huggingface.co/sbintuitions/sarashina1-7b) | 51200 | 1.0T | GPTNeoX | RoPE | 32 | 4096 | 32 |
 | [13B](https://huggingface.co/sbintuitions/sarashina1-13b) | 51200 | 1.0T | GPTNeoX | RoPE | 40 | 5120 | 40 |
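
For context, the **use_fast=False** note added in the first hunk applies to tokenizer loading: in `transformers`, `use_fast=False` selects the slow (pure-Python) tokenizer class instead of the Rust-backed fast one. Below is a minimal sketch of how that flag fits together with the imports shown in the diff; the checkpoint name, prompt, and sampling settings are illustrative assumptions, not necessarily the README's exact example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, set_seed

model_name = "sbintuitions/sarashina1-7b"  # illustrative; the 13B checkpoint loads the same way

# use_fast=False selects the slow tokenizer class, as the updated README instructs.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
# device_map="auto" assumes the accelerate package is installed.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
set_seed(42)

# Prompt and generation settings are illustrative, not the README's exact values.
text = generator(
    "おはようございます、今日の天気は",
    max_length=30,
    do_sample=True,
    num_return_sequences=3,
)
for t in text:
    print(t["generated_text"])
```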
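
The second hunk restores the full header of the Configuration table. As a quick cross-check of the listed values, one could read them back from the published configs; this sketch assumes the checkpoints expose the standard GPTNeoX config attributes (`vocab_size`, `num_hidden_layers`, `hidden_size`, `num_attention_heads`), matching the Architecture column in the table.

```python
from transformers import AutoConfig

# Cross-check the Configuration table against the published configs.
for name in ("sbintuitions/sarashina1-7b", "sbintuitions/sarashina1-13b"):
    cfg = AutoConfig.from_pretrained(name)
    print(
        name,
        cfg.vocab_size,           # table: Vocab size
        cfg.num_hidden_layers,    # table: Layers
        cfg.hidden_size,          # table: Hidden dim
        cfg.num_attention_heads,  # table: Attention heads
    )
```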