[Cache Request] TheBloke/Llama-2-7B-Chat-GGML

#36
by lou987 - opened

Please add the following model to the neuron cache.

AWS Inferentia and Trainium org

GGML models are not supported on Neuron. You can, however, use meta-llama/Llama-2-7b-chat-hf instead.
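As a sketch of how the suggested alternative could be compiled for Neuron, the `optimum-cli export neuron` command from optimum-neuron can export the Hugging Face checkpoint to a Neuron-compiled model. The batch size, sequence length, core count, and output directory below are illustrative assumptions, not values from this thread, and the command requires an Inferentia/Trainium instance with the Neuron SDK installed:

```shell
# Export the supported HF checkpoint to a Neuron-compiled model.
# Shapes (batch_size, sequence_length) and num_cores are example values;
# adjust them to your instance and workload. Requires Neuron hardware + SDK.
optimum-cli export neuron \
  --model meta-llama/Llama-2-7b-chat-hf \
  --batch_size 1 \
  --sequence_length 2048 \
  --num_cores 2 \
  --auto_cast_type fp16 \
  llama2_chat_neuron/
```

If a matching configuration is already in the public neuron cache, compilation artifacts are fetched instead of being rebuilt locally.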

dacorvo changed discussion status to closed

Sign up or log in to comment