Text Generation
Transformers
PyTorch
English
llama
Inference Endpoints
text-generation-inference
TheBloke
Change cache = true in config.json to significantly boost inference performance
b8eb946
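A minimal sketch of the kind of change this commit describes. It assumes the setting referred to as "cache" is the `use_cache` key that Transformers model configs use to turn on key/value caching during generation; the helper name `enable_kv_cache` and the path passed to it are hypothetical and not taken from the repository.

```python
import json
from pathlib import Path


def enable_kv_cache(config_path):
    """Set "use_cache" to true in a config.json file and return the updated dict.

    Assumption: "use_cache" is the key meant by "cache" in the commit title.
    """
    path = Path(config_path)
    # Load the existing configuration so every other key is preserved.
    config = json.loads(path.read_text(encoding="utf-8"))
    # With caching on, past key/value states are reused at each decoding step
    # instead of being recomputed for the whole sequence.
    config["use_cache"] = True
    # Write the file back with stable, readable formatting.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `enable_kv_cache("config.json")` in a local copy of the model directory would rewrite the file in place. How much generation speeds up depends on the model, the sequence length, and the hardware.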