Hardware Requirements

#10 opened by Ravnoor1

What are the exact hardware requirements to run this model locally on a machine or VM? Storage, RAM, GPU, cache/buffer, etc. Please advise.

Owner

This model's architecture is the same as Mistral-7B except it does not have the LM head. A single V100 GPU and 30GB of RAM are more than enough to perform inference, but I have not tested the minimum requirements.
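For reference, here is a minimal sketch of running the model directly with transformers, loading the weights in float16. The last-token pooling matches what the model card describes, but the query string and the prompt/EOS handling here are simplified placeholders, so treat this as a footprint check rather than the exact recipe:

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "intfloat/e5-mistral-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
model.eval()

# Illustrative query; the model card prescribes a task instruction prefix.
texts = ["Instruct: Given a query, retrieve relevant passages\nQuery: hardware requirements"]
batch = tokenizer(texts, max_length=512, padding=True, truncation=True,
                  return_tensors="pt").to("cuda")

with torch.no_grad():
    out = model(**batch)

# Last-token pooling: take the hidden state of each sequence's final
# non-padding token, then L2-normalize it to get the embedding.
last = batch["attention_mask"].sum(dim=1) - 1
emb = out.last_hidden_state[torch.arange(len(texts), device="cuda"), last]
emb = F.normalize(emb, p=2, dim=1)
print(emb.shape)  # torch.Size([1, 4096])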


I'm not having luck getting it to fit on a 16GB V100. I assume the V100 that should fit this is the 32GB variant?
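Back-of-the-envelope: 7B parameters at 4 bytes each (fp32, the default load) is roughly 28GB of weights, which only fits the 32GB variant. At 2 bytes each (fp16) it is roughly 14GB, which can squeeze onto a 16GB V100 but leaves little headroom for activations on 512-token batches, so expect it to be tight.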

I'm having trouble getting this to fit on 4x NVIDIA A6000 (Ada) GPUs with 48GB of VRAM each. Is there something I did wrong with the model setup? Here is my setup:

import pickle

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

model_name = "intfloat/e5-mistral-7b-instruct"
model_kwargs = {"device": "cuda"}
encode_kwargs = {"normalize_embeddings": False}
embed_model = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
    multi_process=True,  # one worker process per visible GPU
    show_progress=True,
)

# Load the pickled documents and split them into 512-character chunks.
collection_name = "local_huggingface_doc_embedding"
with open("data.pkl", "rb") as file:
    data = pickle.load(file)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=30)
docs = text_splitter.split_documents(data)

# Embed the chunks into a Chroma collection and expose a top-10 retriever.
vectorstore = Chroma.from_documents(
    collection_name=collection_name, documents=docs, embedding=embed_model)
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
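One likely culprit for the OOM above: HuggingFaceEmbeddings loads the model in fp32 by default, i.e. roughly 28GB of weights per process, and multi_process=True spawns one worker per visible GPU, each holding a full copy. If your sentence-transformers version forwards a nested model_kwargs dict to transformers' from_pretrained (added around v2.3.0, if I recall, so verify against your install), a half-precision variant would look like this sketch:

import torch
from langchain_community.embeddings import HuggingFaceEmbeddings

# Assumes sentence-transformers >= 2.3.0, where the inner "model_kwargs"
# dict is forwarded to transformers' from_pretrained().
embed_model = HuggingFaceEmbeddings(
    model_name="intfloat/e5-mistral-7b-instruct",
    model_kwargs={
        "device": "cuda",
        "model_kwargs": {"torch_dtype": torch.float16},  # ~14GB of weights vs ~28GB
    },
    encode_kwargs={"normalize_embeddings": False},
)

Failing that, lowering the encode batch size (encode_kwargs={"batch_size": 8}, for example) reduces activation memory without touching the weights.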
