elinas/llama-30b-int4

Text Generation

text-generation-inference

Inference Endpoints

I'm getting 0.4 tokens/s on a 4090.

#2 opened almost 2 years ago by