Can this model run on A800?
2
#10 opened 18 days ago by wang35
FP4 in attention proj
2
#9 opened 28 days ago by yoursmin
Can this model run on Hopper GPUs?
6
#8 opened 29 days ago by simonlindelta

Can this model work with vLLM?
3
#7 opened about 1 month ago by KimChen

Request for Detailed Benchmarking Setup with TensorRT-LLM on B200
1
#6 opened about 1 month ago by StardusterLiu

Benchmark results compared to the original FP8 / INT4 quants, etc.?
5
#1 opened about 1 month ago by CHNtentes