tokyo-electron-device-ai committed
Commit eb59e64 · verified · 1 Parent(s): 51185a3

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -9,8 +9,7 @@ base_model:
 
 ## Model Details
 Llama 3 tedllm is a large language model (8B) built by continual pre-training on the Meta Llama 3 8B model. Llama 3 tedllm is developed to enhance Japanese language capabilities and to incorporate domain-specific data.
- We use approximately 160 billion tokens from a large Japanese corpus. This model was trained using Cerebras CS-3s. The Cerebras CS-3 is a new AI accelerator that differs from conventional GPUs.
-
+ We use approximately 160 billion tokens from a large Japanese corpus. This model was trained on Cerebras CS-3 wafer-scale systems. Cerebras' weight streaming technology simplifies the training of LLMs by disaggregating compute from model storage, which allowed training to scale efficiently across nodes using simple data parallelism.
 ## Intended uses & limitations
 
  You can use the raw model for text generation or fine-tune it to a downstream task.
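
The updated training note above mentions scaling across nodes with simple data parallelism. As an illustration of that general idea only — this is a generic PyTorch DistributedDataParallel sketch, not Cerebras' weight-streaming stack — each worker holds a full replica of the model, consumes its own shard of the data, and gradients are averaged across workers after every backward pass:

```python
# Generic data-parallelism sketch with PyTorch DDP (illustrative only;
# NOT Cerebras' weight-streaming API). Launch with: torchrun --nproc_per_node=N train.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")                     # one process per GPU
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    model = torch.nn.Linear(4096, 4096).cuda()          # stand-in for the LLM
    model = DDP(model)                                  # full replica per rank
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(8, 4096, device="cuda")         # each rank sees its own shard
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()                                 # DDP all-reduces gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```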
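
For the "raw model for text generation" use mentioned under Intended uses, a minimal sketch with the Hugging Face transformers API. The model id below is an assumption based on the committer's namespace; substitute the actual repository name:

```python
# Minimal text-generation sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyo-electron-device-ai/llama3-tedllm-8b"  # hypothetical id -- replace with the real repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 8B weights fit in bf16 on a single 24 GB+ GPU
    device_map="auto",            # requires the accelerate package
)

prompt = "東京エレクトロンデバイスは"  # Japanese prompt ("Tokyo Electron Device is..."), matching the model's focus
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```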