sedrickkeh committed on
Commit
819d065
1 Parent(s): 329ccb0

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -85,7 +85,7 @@ We uptrain Mistral-7B on 100B tokens of RefinedWeb.
 
 ## Model Details
 - **Developed by**: [Toyota Research Institute](https://www.tri.global/our-work/robotics)
-- **Model Type**: This is an auto-regressive language model initialized from [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) and uptrained into a linear model based on the [SUPRA]() architecture.
+- **Model Type**: This is an auto-regressive language model initialized from [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) and uptrained into a linear model based on the [SUPRA](https://arxiv.org/abs/2405.06640) architecture.
 - **Dataset**: Initialized from [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). Uprained on 100B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
 - **Tokenizer**: `mistralai/Mistral-7B-v0.1`
 - **Library**: [OpenLM](https://github.com/mlfoundations/open_lm/) (we use a [fork](https://github.com/TRI-ML/linear_open_lm/) of OpenLM that supports linear attention)