yunmorning committed
Commit • 01bd3fb
Parent(s): 55f5a6d
Update docker run command
README.md CHANGED
@@ -49,7 +49,6 @@ This model is compatible with **[Friendli Container](https://friendli.ai/product
 - Before you begin, make sure you have signed up for [Friendli Suite](https://suite.friendli.ai/). **You can use Friendli Containers free of charge for four weeks.**
 - Prepare a Personal Access Token following [this guide](#preparing-personal-access-token).
 - Prepare a Friendli Container Secret following [this guide](#preparing-container-secret).
-- Install Hugging Face CLI with `pip install -U "huggingface_hub[cli]"`
 
 ### Preparing Personal Access Token
 
@@ -88,25 +87,16 @@ You should pass the container secret as an environment variable to run the conta
 Once you've prepared the image of Friendli Container, you can launch it to create a serving endpoint.
 
 ```sh
-export MODEL_DIR=$PWD/FriendliAI--Llama-2-70b-chat-hf-fp8
-export FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET"
-export FRIENDLI_CONTAINER_IMAGE="registry.friendli.ai/trial"
-export GPU_ENUMERATION='"device=0,1"'
-
-huggingface-cli download FriendliAI/Llama-2-70b-chat-hf-fp8 \
-  --local-dir $MODEL_DIR \
-  --local-dir-use-symlinks False
-
 docker run \
-  --gpus
-  -
-  -
-
-  "
-
-  --
-  --
-  --
+  --gpus '"device=0,1"' \
+  -p 8000:8000 \
+  -v ~/.cache/huggingface:/root/.cache/huggingface \
+  -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
+  -e HF_TOKEN="YOUR HUGGING FACE TOKEN" \
+  registry.friendli.ai/trial \
+  --web-server-port 8000 \
+  --hf-model-name meta-llama/Llama-2-70b-chat-hf-fp8 \
+  --num-devices 2 # Use tensor parallelism degree 2
 ```
 
 ---
@@ -146,7 +136,7 @@ Meta developed and publicly released the Llama 2 family of large language models
 
 **License** A custom commercial license is available at: [https://ai.meta.com/resources/models-and-libraries/llama-downloads/](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
 
-**Research Paper** ["Llama-2: Open Foundation and Fine-tuned Chat Models"](arxiv.org/abs/2307.09288)
+**Research Paper** ["Llama-2: Open Foundation and Fine-tuned Chat Models"](https://arxiv.org/abs/2307.09288)
 
 ## Intended Use
 **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.
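
The updated `docker run` command in this commit sets the GPU list (`device=0,1`) and the tensor parallelism degree (`--num-devices 2`) as two separate literals, so they can drift apart when one is edited. A minimal sketch, assuming the degree is meant to equal the number of listed GPUs, derives one from the other. The names `count_devices`, `GPU_DEVICES`, and `NUM_DEVICES` are hypothetical helpers introduced here for illustration; they are not part of Docker or Friendli Container.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a Docker or Friendli feature):
# count the GPU IDs in a device list such as "device=0,1".
count_devices() {
  ids=${1#device=}   # strip the leading "device=" prefix
  old_ifs=$IFS
  IFS=,
  set -- $ids        # split on commas into positional parameters
  IFS=$old_ifs
  echo "$#"          # number of IDs found
}

GPU_DEVICES="device=0,1"
NUM_DEVICES=$(count_devices "$GPU_DEVICES")

# Print the two flags that would be passed to the docker run command.
printf '%s\n' "--gpus '\"$GPU_DEVICES\"' --num-devices $NUM_DEVICES"
```

The printed flags can then be substituted into the `docker run` invocation in place of the hard-coded values, leaving the remaining options unchanged.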