optimum-nvidia's Collections

H100 Optimized TensorRT-LLM Models

Inference engines optimized for NVIDIA H100 Tensor Core GPUs. These engines can potentially leverage the `float8` data type to speed up computations.

This collection has no items.