Update README.md

README.md CHANGED
@@ -1,4 +1,4 @@
-# LongLoRA and LongAlpaca
+# LongLoRA and LongAlpaca for Long-context LLMs
 
 
 [![Gradio](https://img.shields.io/badge/Gradio-Online%20Demo-green)](https://1841bb028d32e8619c.gradio.live)
@@ -30,8 +30,8 @@ For detailed usage and codes, please visit the [Github project](https://github.c
 15. [License](#license)
 
 ## News
-- [x] [2023.10.8] We release the long instruction-following dataset
-- (The previous sft models
+- [x] [2023.10.8] **We release the long instruction-following dataset**, [LongAlpaca-12k](https://drive.google.com/file/d/1JVC1p_Ht-1h61tKitOCW0blnCHf-552U/view?usp=share_link), and **the corresponding models**, [LongAlpaca-7B](https://huggingface.co/Yukang/LongAlpaca-7B), [LongAlpaca-13B](https://huggingface.co/Yukang/LongAlpaca-13B), and [LongAlpaca-70B](https://huggingface.co/Yukang/LongAlpaca-70B).
+- (*The previous SFT models*, [Llama-2-13b-chat-longlora-32k-sft](https://huggingface.co/Yukang/Llama-2-13b-chat-longlora-32k-sft) and [Llama-2-70b-chat-longlora-32k-sft](https://huggingface.co/Yukang/Llama-2-70b-chat-longlora-32k-sft), *have been deprecated*.)
 - [x] [2023.10.3] We add support for GPTNeoX models. Please refer to this [PR](https://github.com/dvlab-research/LongLoRA/pull/32) for usage. Thanks to @naubull2 for this contribution.
 - [x] [2023.9.22] We release all our fine-tuned [models](https://huggingface.co/Yukang), including the **70B-32k model** [LLaMA2-LongLoRA-70B-32k](https://huggingface.co/Yukang/Llama-2-70b-longlora-32k) and [LLaMA2-LongLoRA-7B-100k](https://huggingface.co/Yukang/Llama-2-7b-longlora-100k-ft). Welcome to check them out!
 - [x] [2023.9.22] We release the [paper](http://arxiv.org/abs/2309.12307) and this GitHub repo, including training and evaluation code.
@@ -95,11 +95,11 @@ We did not use the `input` format in the Alpaca format for simplicity.
 ## Models
 
 ### Models with supervised fine-tuning
-| Model | Size | Context | Train | Link
-|
-| LongAlpaca-7B | 7B | 32768 | Full FT | [Model](https://huggingface.co/Yukang/LongAlpaca-7B)
-| LongAlpaca-13B | 13B | 32768 | Full FT | [Model](https://huggingface.co/Yukang/LongAlpaca-13B)
-| LongAlpaca-70B | 70B | 32768 | LoRA+ | [Model
+| Model          | Size | Context | Train   | Link                                                       |
+|:---------------|------|---------|---------|------------------------------------------------------------|
+| LongAlpaca-7B  | 7B   | 32768   | Full FT | [Model](https://huggingface.co/Yukang/LongAlpaca-7B)       |
+| LongAlpaca-13B | 13B  | 32768   | Full FT | [Model](https://huggingface.co/Yukang/LongAlpaca-13B)      |
+| LongAlpaca-70B | 70B  | 32768   | LoRA+   | [Model](https://huggingface.co/Yukang/LongAlpaca-70B-lora) |
 
 
 ### Models with context extension via fully fine-tuning
@@ -361,4 +361,4 @@ If you find this project useful in your research, please consider citing:
 
 ## License
 - LongLoRA is licensed under the Apache License 2.0. This means that it requires the preservation of copyright and license notices.
-- Data and weights are under CC-BY-NC 4.0 License.
+- Data and weights are under the CC-BY-NC 4.0 License. They are licensed for research use only, and commercial use is not allowed. Models trained using the dataset should not be used outside of research purposes.
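
The LongAlpaca checkpoints in the table above are hosted as standard Hugging Face causal-LM repos, so they load through the usual `transformers` flow. Below is a minimal sketch of that flow, assuming `Yukang/LongAlpaca-7B` follows the standard `AutoModelForCausalLM` usage; the dtype, device placement, and prompt are illustrative choices, not values taken from the repo.

```python
# Minimal sketch: load a released LongAlpaca checkpoint and generate text.
# Assumes standard transformers causal-LM usage; fp16 and device_map="auto"
# are illustrative choices for fitting a 32k-context 7B model in GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yukang/LongAlpaca-7B"  # full fine-tuned, 32768-token context (see table)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; long contexts are memory-hungry
    device_map="auto",          # spread layers across available devices
)

prompt = "Below is a paper. Summarize its main contributions.\n\n<paper text here>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the 70B row links to LoRA+ weights rather than a merged checkpoint, so those would first need to be attached to the Llama-2-70B base (for example with `peft.PeftModel.from_pretrained`) before being used this way.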
|