The model in this repository utilizes Mistral-7B-Instruct-v0.1 (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), the mlc-llm (https://llm.mlc.ai/docs/) Metal version with 4-bit quantization and an embedding layer for MLC embedding. You have the option to use the FastAPI server instead of OpenAI to run the model locally. For using in langchain, please refer to the sample_langchain.py file in the following GitHub link: https://github.com/mlc-ai/mlc-llm/blob/main/examples/rest/python/sample_langchain.py.

Environment setup

conda create -n mlc-chat-venv -c mlc-ai -c conda-forge mlc-chat-cli-nightly

conda activate mlc-chat-venv

Fast API Server

python -m mlc_chat.rest --model Mistral-7B-Instruct-v0.1-q4f16_1/ --lib-path Mistral-7B-Instruct-v0.1-q4f16_1/Mistral-7B-Instruct-v0.1-q4f16_1-metal.so

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference API
Unable to determine this model's library. Check the docs .