
6.0 bpw exl2 quant (8-bit head) of the Fireworks Hermes 2.5 fine-tune of Mixtral-8x22b.

Use the Vicuna prompt template.
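As a sketch, a single-turn Vicuna-style prompt can be assembled like this. The system preamble below is the common Vicuna v1.1 default and is an assumption, not something stated on this card; adjust it if the fine-tune expects a different one.

```python
def vicuna_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the Vicuna style.

    NOTE: the system preamble is the widely used Vicuna v1.1 default,
    assumed here for illustration.
    """
    system = (
        "A chat between a curious user and an artificial intelligence "
        "assistant. The assistant gives helpful, detailed, and polite "
        "answers to the user's questions."
    )
    # Vicuna v1.1 turns are "USER: ... ASSISTANT:", with generation
    # continuing after the trailing "ASSISTANT:".
    return f"{system} USER: {user_message} ASSISTANT:"

print(vicuna_prompt("What is Mixtral-8x22b?"))
```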

Requires ~120 GB of VRAM (2× A100 or 3× RTX 6000).
