OrionZheng committed
Commit cda22d1 · 1 Parent(s): 6996791
Update README.md
README.md
CHANGED
@@ -25,7 +25,7 @@ The table below lists the 8B/8B-Chat model that has completed training on 1.1T t
 
 | Model Name | Description | #Param |Huggingface |
 |----------------|-------------------------------------------------|----------|-------------|
-| **OpenMoE-8B(1.1T)** | 8B MoE with comparable FLOPs of a
+| **OpenMoE-8B(1.1T)** | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b) |
 | **OpenMoE-8B-Chat (1.1T+SFT)** | OpenMoE-8B-1.1T supervised finetuned on the [WildChat GPT-4 Subset](https://huggingface.co/datasets/allenai/WildChat-nontoxic) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-chat) |
 
 
@@ -34,11 +34,11 @@ Besides, we also provide all our intermediate checkpoints(base, 8B, 34B) for res
 | Model Name | Description | #Param |Huggingface |
 |----------------|-------------------------------------------------|----------|-------------|
 | **OpenMoE-34B-200B** | 34B MoE with comparable FLOPs of a 7B LLaMA(No SFT) |34B |[Link](https://huggingface.co/OrionZheng/openmoe-34b-200B) |
-| OpenMoE-8B-200B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-400B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-600B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-800B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-1T | 8B MoE with comparable FLOPs of a
+| OpenMoE-8B-200B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-200B) |
+| OpenMoE-8B-400B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-400B) |
+| OpenMoE-8B-600B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-600B) |
+| OpenMoE-8B-800B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-800B) |
+| OpenMoE-8B-1T | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-1T) |
 | OpenMoE-base(128B) | A small MoE model for debugging only |637M |[Link](https://huggingface.co/OrionZheng/openmoe-base) |
 | OpenLLaMA-base(128B) | A dense counter-part of OpenMoE-base |310M |[Link](https://huggingface.co/fuzhao/OpenLLaMA_Base) |
 
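For reference, the checkpoints listed in the updated tables are plain Hugging Face model repos, so they can typically be pulled with the standard `transformers` Auto classes. The snippet below is only a minimal sketch, not the authors' documented usage: it assumes the `OrionZheng/openmoe-8b-chat` repo ships custom modeling code (hence `trust_remote_code=True`) and that `accelerate` is installed for `device_map="auto"`; the prompt and generation settings are illustrative.

```python
# Minimal sketch (assumptions noted above): load one of the checkpoints from the
# tables with transformers and run a short generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "OrionZheng/openmoe-8b-chat"  # any checkpoint name from the tables above

# trust_remote_code=True because the repo is assumed to provide custom MoE modeling code
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    device_map="auto",  # spread weights across available devices (requires accelerate)
)

inputs = tokenizer("What is a Mixture-of-Experts model?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```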