---
tags:
- deepsparse
---
# Sparse MPT-7B-Chat - DeepSparse
[Chat-aligned MPT-7B model](https://huggingface.co/mosaicml/mpt-7b-chat) pruned to 50% sparsity and quantized with SparseGPT for efficient inference with DeepSparse.
```python
from deepsparse import TextGeneration

model = TextGeneration(model="hf:neuralmagic/mpt-7b-chat-pruned50-quant")
model("Tell me a joke.", max_new_tokens=50)
```