jondurbin committed
Commit 05db6fd
1 Parent(s): bb0527d

Update README.md

Files changed (1):
  1. README.md +2 -8
README.md CHANGED
@@ -580,26 +580,20 @@ Experiment, and find out what works and doesn't.
 2) After you created your account update your billing and navigate to the deploy page.
 3) Select the following
 - GPU Type: A6000
-- GPU Quantity: 2
+- GPU Quantity: 4
 - Category: Creator
 - Image: Jon Durbin
 - Coupon Code: JonDurbin
 4) Deploy the VM!
 5) Navigate to 'Running Instances' to retrieve instructions to login to the VM
 6) Once inside the VM, open the terminal and run `volume=$PWD/data`
-7) Run `model=jondurbin/airoboros-34b-3.3`
+7) Run `model=jondurbin/airoboros-110b-3.3`
 8) `sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
 9) The model will take some time to load...
 10) Once loaded the model will be available on port 8080
 
 For assistance with the VM join the [Massed Compute Discord Server](https://discord.gg/Mj4YMQY3DA)
 
-### Latitude.sh
-
-[Latitude](https://www.latitude.sh/r/4BBD657C) has h100 instances available (as of today, 2024-02-08) for $3/hr!
-
-They have a few blueprints available for testing LLMs, but a single h100 should be plenty to run this model with 8k ctx.
-
 ## Support me
 
 - https://bmc.link/jondurbin
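Once the container from step 8 is running, text-generation-inference serves a REST API on port 8080, including a `/generate` endpoint that accepts a JSON body with `inputs` and `parameters`. The sketch below builds such a request with only the standard library; `build_generate_request` is a hypothetical helper name, and `localhost:8080` assumes you are calling from inside the VM.

```python
import json
import urllib.request


def build_generate_request(prompt, max_new_tokens=128, base_url="http://localhost:8080"):
    """Build a POST request for TGI's /generate endpoint.

    The payload shape ({"inputs": ..., "parameters": {...}}) follows the
    documented TGI API; the helper itself is illustrative, not part of TGI.
    """
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("Hello, how are you?")
# Actually sending it requires the container from step 8 to be up:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["generated_text"])
```

A successful response is a JSON object whose `generated_text` field holds the completion; sampling controls such as `temperature` and `top_p` go alongside `max_new_tokens` in `parameters`.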