Commit 819d065 by sedrickkeh
Parent(s): 329ccb0
Update README.md
README.md
CHANGED
@@ -85,7 +85,7 @@ We uptrain Mistral-7B on 100B tokens of RefinedWeb.
 
 ## Model Details
 - **Developed by**: [Toyota Research Institute](https://www.tri.global/our-work/robotics)
-- **Model Type**: This is an auto-regressive language model initialized from [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) and uptrained into a linear model based on the [SUPRA]() architecture.
+- **Model Type**: This is an auto-regressive language model initialized from [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) and uptrained into a linear model based on the [SUPRA](https://arxiv.org/abs/2405.06640) architecture.
 - **Dataset**: Initialized from [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). Uprained on 100B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
 - **Tokenizer**: `mistralai/Mistral-7B-v0.1`
 - **Library**: [OpenLM](https://github.com/mlfoundations/open_lm/) (we use a [fork](https://github.com/TRI-ML/linear_open_lm/) of OpenLM that supports linear attention)