This model is a 4-bit quantized version of the MNN model exported from Llama-2-7b-chat using llm-export.
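
For readers unfamiliar with the term, the following is a minimal sketch of what 4-bit quantization means in general. It is not the scheme MNN or llm-export actually uses, which may differ in block size, rounding, and storage layout. The function names `quantize_4bit`, `dequantize_4bit`, and `pack_nibbles` are hypothetical and exist only for this illustration.

```python
# Illustrative only: a generic asymmetric 4-bit quantizer for one block of
# weights. This is an assumption-based sketch, not MNN's implementation.

def quantize_4bit(weights):
    """Map floats to integer codes in [0, 15] plus a scale and an offset."""
    lo, hi = min(weights), max(weights)
    # 4 bits give 16 levels, so the range is split into 15 steps.
    # Fall back to 1.0 when all weights are equal to avoid dividing by zero.
    scale = (hi - lo) / 15 or 1.0
    codes = [min(15, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo

def dequantize_4bit(codes, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale + lo for c in codes]

def pack_nibbles(codes):
    """Store two 4-bit codes per byte (pads with a zero code if odd)."""
    if len(codes) % 2:
        codes = codes + [0]
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))

if __name__ == "__main__":
    block = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0]
    codes, scale, lo = quantize_4bit(block)
    print("codes:", codes)
    print("restored:", dequantize_4bit(codes, scale, lo))
    print("packed bytes:", len(pack_nibbles(codes)))
```

Each weight is stored as a 4-bit code, so two weights fit in one byte, at the cost of a rounding error of at most half a quantization step per weight.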