petermcaughan committed on
Commit cceea9d
1 Parent(s): a44ba12

Update README.md

  1. README.md +8 -8
README.md CHANGED
@@ -35,14 +35,14 @@ Below is average latency of generating a token using a prompt of varying size us
 
 | Prompt Length | Batch Size | PyTorch 2.1 torch.compile | ONNX Runtime CUDA |
 |-------------|------------|----------------|-------------------|
-| 16 | 1 | N/A | N/A |
-| 256 | 1 | N/A | N/A |
-| 1024 | 1 | N/A | N/A |
-| 2048 | 1 | N/A | N/A |
-| 16 | 4 | N/A | N/A |
-| 256 | 4 | N/A | N/A |
-| 1024 | 4 | N/A | N/A |
-| 2048 | 4 | N/A | N/A |
+| 32 | 1 | 53.64ms | 15.68ms |
+| 256 | 1 | 59.55ms | 26.05ms |
+| 1024 | 1 | 89.82ms | 99.05ms |
+| 2048 | 1 | 208.0ms | 227.0ms |
+| 32 | 4 | 70.8ms | 19.62ms |
+| 256 | 4 | 78.6ms | 81.29ms |
+| 1024 | 4 | 373.7ms | 369.6ms |
+| 2048 | 4 | N/A | 879.2ms |
 
 ## Usage Example
 