[Cache Request] aws-neuron/Llama-2-7b-hf-neuron-throughput

#57
by Gerald001 - opened

Please add the following model to the Neuron cache: aws-neuron/Llama-2-7b-hf-neuron-throughput

AWS Inferentia and Trainium org

This neuron model has been deprecated because Llama-2-7b-hf is now present in the cache and can be deployed directly.

dacorvo changed discussion status to closed
