Include Massed Compute VM with Steps
#1
by nic-mc - opened
README.md CHANGED
@@ -101,6 +101,37 @@ Hardware kindly provided by [Massed Compute](https://massedcompute.com/?utm_sour
Only the train splits were used (if a split was provided), and an additional pass of decontamination is performed using approximate nearest neighbor search (via faiss).

## How to easily download and use this model

[Massed Compute](https://massedcompute.com/?utm_source=huggingface&utm_creative_format=model_card&utm_content=creator_jon) has created a Virtual Machine (VM) pre-loaded with TGI and Text Generation WebUI.

1) For this model, rent the [Jon Durbin 4xA6000](https://shop.massedcompute.com/products/jon-durbin-4x-a6000?utm_source=huggingface&utm_creative_format=model_card&utm_content=creator_jon) Virtual Machine.
2) After you start your rental, you will receive an email with instructions on how to log in to the VM.
3) Once inside the VM, open the terminal and run `conda activate text-generation-inference`
4) Then `cd Desktop/text-generation-inference/`
5) Run `volume=$PWD/data`
6) Run `model=jondurbin/bagel-8x7b-v0.2`
7) `sudo docker run --gpus '"device=0,1,2,3"' --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
8) The model will take some time to load...
9) Once loaded, the model will be available on port 8080.
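Steps 5-7 above can be collected into a single script. This is a minimal sketch using the exact paths, image tag, and GPU list from the steps as written; it prints the composed docker command for review rather than running it, so remove the leading `echo` to actually launch the container.

```
#!/bin/sh
# Steps 5 and 6: set the data volume and model id.
volume=$PWD/data
model=jondurbin/bagel-8x7b-v0.2

# Step 7: compose and print the docker invocation for review;
# drop the leading `echo` to execute it for real.
echo sudo docker run --gpus '"device=0,1,2,3"' --shm-size 1g \
  -p 8080:80 -v "$volume":/data \
  ghcr.io/huggingface/text-generation-inference:1.3 --model-id "$model"
```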

Sample command within the VM:
```
curl 0.0.0.0:8080/generate \
-X POST \
-d '{"inputs":"<|system|>You are a friendly chatbot.\n<|user|>What type of model are you?\n<|assistant|>","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}' \
-H 'Content-Type: application/json'
```

You can also access the model from outside the VM:
```
curl IP_ADDRESS_PROVIDED_BY_MASSED_COMPUTE_VM:8080/generate \
-X POST \
-d '{"inputs":"<|system|>You are a friendly chatbot.\n<|user|>What type of model are you?\n<|assistant|>","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}' \
-H 'Content-Type: application/json'
```
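The two curl commands above differ only in the host, and the request body is one long JSON string. A small sketch like the following keeps the body in a variable so the prompt and sampling parameters can be edited in one place, and validates it locally before sending (it assumes `python3` is available, which is not stated in the steps above):

```
#!/bin/sh
# Keep the request body in one place for editing.
PAYLOAD='{"inputs":"<|system|>You are a friendly chatbot.\n<|user|>What type of model are you?\n<|assistant|>","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}'

# Check that the body parses as JSON before POSTing it.
printf '%s' "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"

# Then send it: use 0.0.0.0 inside the VM, or the IP address
# provided by Massed Compute from outside.
# curl 0.0.0.0:8080/generate -X POST -d "$PAYLOAD" -H 'Content-Type: application/json'
```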

For assistance with the VM, join the [Massed Compute Discord Server](https://discord.gg/Mj4YMQY3DA).

## Prompt formatting

In sticking with the theme of the bagel, I didn't want to use a single prompt format, so I used 4 - vicuna, llama-2, alpaca, and chat-ml (sorta).