Tags: Text Generation · NeMo · English · nvidia · steerlm · llama2
zhilinw committed
Commit 6c49012
1 Parent(s): cd31078

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED

```diff
@@ -63,7 +63,7 @@ H100, A100 80GB, A100 40GB
 
 ## Steps to run inference:
 
-We demonstrate inference using NVIDIA NeMo Framework, which allows easy model deployment based on [NVIDIA TRT-LLM](https://github.com/NVIDIA/TensorRT-LLM), a highly optimized inference solution focussing on high throughput and low latency.
+We demonstrate inference using NVIDIA NeMo Framework, which allows hassle-free model deployment based on [NVIDIA TRT-LLM](https://github.com/NVIDIA/TensorRT-LLM), a highly optimized inference solution focussing on high throughput and low latency.
 
 Pre-requisite: you would need at least a machine with 4 40GB or 2 80GB NVIDIA GPUs, and 300GB of free disk space.
```