---
license: apache-2.0
---

# llama-70b-chat-4-shards
GitHub link: https://github.com/mexiQQ/llama-70b-chat-4-shards
This repository contains a script to transform the weights of Llama v2 70B (chat) from an 8-shard configuration to a 4-shard configuration, making it more accessible for users with machines that have only 4 GPUs.
For convenience, the pre-converted weights can be downloaded directly from https://huggingface.co/Jinawei/llama-v2-70b-chat-4-shards
## Introduction
Meta released the weights for Llama v2 70B distributed across 8 shards. Users with hardware constraints, such as a machine with only 4 GPUs, cannot load the model directly in that layout. This repository provides a script that transforms the 8-shard weight distribution of Llama v2 70B into a 4-shard configuration, making the model easier to use on machines with fewer GPUs.
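
Conceptually, halving the shard count means merging shards pairwise: tensors that are sharded along some dimension are concatenated along that dimension, while replicated tensors (e.g. norm weights) are taken from a single shard. The sketch below illustrates the idea; the tensor names and sharding dimensions are illustrative assumptions, not the actual Llama checkpoint keys, and the real script must handle the full key list:

```python
import numpy as np

# Hypothetical per-tensor sharding dims (illustrative only):
# column-parallel weights split along dim 0, row-parallel along dim 1,
# replicated tensors (e.g. norms) identical across shards -> dim None.
SHARD_DIM = {"wq.weight": 0, "wo.weight": 1, "norm.weight": None}

def merge_pair(shard_a, shard_b):
    """Merge two adjacent shards into one by concatenating each tensor
    along its sharding dimension; replicated tensors are kept from one shard."""
    merged = {}
    for name, dim in SHARD_DIM.items():
        if dim is None:
            merged[name] = shard_a[name]  # replicated: one copy suffices
        else:
            merged[name] = np.concatenate([shard_a[name], shard_b[name]], axis=dim)
    return merged

# Toy example with two tiny shards.
a = {"wq.weight": np.zeros((2, 4)), "wo.weight": np.zeros((4, 2)), "norm.weight": np.ones(4)}
b = {"wq.weight": np.zeros((2, 4)), "wo.weight": np.zeros((4, 2)), "norm.weight": np.ones(4)}
m = merge_pair(a, b)
print(m["wq.weight"].shape)  # (4, 4)
print(m["wo.weight"].shape)  # (4, 4)
```

In Megatron-style tensor parallelism, column-parallel weights are typically split along dim 0 and row-parallel weights along dim 1, so the conversion must apply the correct axis per tensor.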
## Usage
```shell
python convert.py \
  --input_llama_path ~/llama-2-70b-chat \
  --input_shards 8 \
  --output_llama_path ~/llama-2-70b-chat-4-shards \
  --output_shards 4
```
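
Assuming the conversion merges adjacent input shards pairwise (an assumption about convert.py's internals, but the usual pattern when halving a tensor-parallel world size), each output shard draws from a contiguous group of input shards:

```python
# Hypothetical shard mapping, assuming adjacent input shards are grouped:
# output shard i is built from input shards [i*g, ..., (i+1)*g - 1].
def shard_mapping(input_shards, output_shards):
    assert input_shards % output_shards == 0, "input count must divide evenly"
    g = input_shards // output_shards
    return {i: list(range(i * g, (i + 1) * g)) for i in range(output_shards)}

print(shard_mapping(8, 4))  # {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}
```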
## Support

If this script proves useful in your work or projects, please consider giving it a star on GitHub. Your support makes the project more visible, encourages future development, and is greatly appreciated.
## Acknowledgements
- Thanks to Meta for releasing the Llama v2 70B weights.
## Contact
For any inquiries or to report issues, please open an issue on this repository.