aws-neuron / optimum-neuron-cache
License: apache-2.0
optimum-neuron-cache / inference-cache-config (revision 6f90e1d)
8 contributors - History: 29 commits
Latest commit: Temporarily remove SD 1.5 from Runway (Jingya, HF staff) - a74d412 (verified) - 3 months ago
gpt2.json (398 Bytes) - Add more gpt2 configurations - 8 months ago
llama-variants.json (2.63 kB) - Update inference-cache-config/llama-variants.json - 6 months ago
llama2-70b.json (287 Bytes) - Create llama2-70b.json - 6 months ago
llama2-7b-13b.json (2.02 kB) - Rename inference-cache-config/llama2.json to inference-cache-config/llama2-7b-13b.json - 6 months ago
llama3-70b.json (283 Bytes) - Create llama3-70b.json - 6 months ago
llama3-8b.json (883 Bytes) - Rename inference-cache-config/llama3.json to inference-cache-config/llama3-8b.json - 6 months ago
mistral-variants.json (3.29 kB) - Remove SalesForce embedding model - 9 months ago
mistral.json (1.46 kB) - Add more batch_size for mistral on smaller instances - 7 months ago
mixtral.json (294 Bytes) - Create mixtral.json - 6 months ago
stable-diffusion.json (1.54 kB) - Temporarily remove SD 1.5 from Runway - 3 months ago
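
The JSON files above hold the inference cache configurations for this repository. As a minimal sketch (not an official optimum-neuron API), the snippet below downloads and prints one of them with the Hugging Face Hub client; the repo id, file path, and revision come from this listing, while the contents and schema of the file are not documented on this page.

# Minimal sketch: fetch one cache configuration file from the Hub and print it.
# Assumes the huggingface_hub package is installed.
import json

from huggingface_hub import hf_hub_download

# Repo id, file path, and revision pin (6f90e1d) are taken from this listing;
# omitting `revision` would fetch the latest version instead.
config_path = hf_hub_download(
    repo_id="aws-neuron/optimum-neuron-cache",
    filename="inference-cache-config/gpt2.json",
    revision="6f90e1d",
)

with open(config_path) as f:
    config = json.load(f)

# The exact schema of these JSON files is not shown on this page, so just inspect it.
print(json.dumps(config, indent=2))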