Running on single Nvidia K80 GPU with large context to generate long output

#29
by airtable - opened

Hi there

I previously used dolly-v2-3b and dolly-v2-7b to try to generate long text, but they kept regenerating the same text over and over.

My scenario is that I retrieve relevant documents using FAISS vector search (with k=10, k=20, etc.) and expect the model to generate similar documents from a prompt that includes the retrieved data.

Can I use MPT-7B to generate this type of document with a long context on a single-GPU Azure cloud instance?

VM Size
Standard_NC6 (6 cores, 56 GB RAM, 380 GB disk) with an Nvidia K80 GPU

Mosaic ML, Inc. org

The model should fit onto a K80. You'll need to use standard torch attention. If you instantiate the model with max_seq_len=4096, you should be able to generate sequences twice as long as the Dolly models you were trying.
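Concretely, that setup can be sketched with the Hugging Face `transformers` API. This is a sketch, not a verified recipe for this VM: the fp16 cast and the GPT-NeoX tokenizer choice are assumptions (the latter per the MPT-7B model card), and the heavy download is kept inside a function so nothing runs until you call it:

```python
def mpt_k80_overrides(max_seq_len: int = 4096) -> dict:
    """Config overrides suggested above: standard torch attention
    (the K80 cannot run the triton flash-attention kernels) and a
    longer context, which MPT's ALiBi positions allow at inference."""
    return {
        "attn_impl": "torch",        # goes into config.attn_config
        "max_seq_len": max_seq_len,  # 2x the 2048-token training length
    }

def load_mpt_for_k80(name: str = "mosaicml/mpt-7b"):
    """Build model and tokenizer; imports live here so the sketch's
    pure-Python part works without torch/transformers installed."""
    import torch
    from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

    overrides = mpt_k80_overrides()
    config = AutoConfig.from_pretrained(name, trust_remote_code=True)
    config.attn_config["attn_impl"] = overrides["attn_impl"]
    config.max_seq_len = overrides["max_seq_len"]

    model = AutoModelForCausalLM.from_pretrained(
        name,
        config=config,
        torch_dtype=torch.float16,  # assumption: fp16 to cut weight memory;
                                    # may still be tight on a 12 GB K80
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    return model, tokenizer
```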

Depending on the type of documents, this could work; it really depends on how similar the task is to the pretraining data.

If you get repetitive output, try searching for a good no_repeat_ngram_size (somewhere between 3 and 9) and repetition_penalty (somewhere between 1.01 and 1.2), as well as increasing the temperature.
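With the Hugging Face `generate` API, those settings can be collected like this. The defaults below are assumptions picked from inside the suggested ranges and are meant to be swept, not final values:

```python
def anti_repetition_kwargs(ngram: int = 4,
                           penalty: float = 1.1,
                           temperature: float = 0.9) -> dict:
    """Decoding settings from the advice above; sweep `ngram` over
    3-9 and `penalty` over 1.01-1.2 until repetition stops."""
    return dict(
        do_sample=True,                # temperature only matters when sampling
        temperature=temperature,       # raise this for more varied output
        no_repeat_ngram_size=ngram,    # forbid exact n-gram repeats
        repetition_penalty=penalty,    # down-weight already-generated tokens
        max_new_tokens=1024,           # assumption: budget for long output
    )

# Usage with a loaded model/tokenizer (hypothetical `prompt` string):
# inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# output_ids = model.generate(**inputs, **anti_repetition_kwargs())
```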

sam-mosaic changed discussion status to closed

Thanks @sam-mosaic, I appreciate your response. I will look into configuring these settings and try again.

airtable changed discussion status to open
Mosaic ML, Inc. org

Closing for now as this issue has gone stale.

abhi-mosaic changed discussion status to closed
