This model was quantized and pruned with [SparseGPT](https://arxiv.org/abs/2301.00774), using [SparseML](https://github.com/neuralmagic/sparseml).
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "nm-testing/OpenHermes-2.5-Mistral-7B-pruned50"

# Load the pruned checkpoint in half precision and let accelerate place it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Move the tokenized prompt onto the same device as the model before generating
inputs = tokenizer("Hello my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(outputs)[0])
"""
<s> Hello my name is Katie and I am a student at the University of Gloucestershire. I am currently studying
"""
```
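As a quick sanity check (not part of the original card), you can measure the fraction of zero-valued weights in the downloaded checkpoint to confirm the roughly 50% unstructured sparsity produced by SparseGPT. The sketch below is an assumption about how to verify this, not an official SparseML utility; it simply counts zeros across the 2-D weight matrices, so embeddings are included and the measured figure may land slightly under 50%.

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "nm-testing/OpenHermes-2.5-Mistral-7B-pruned50"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

total, zeros = 0, 0
for name, param in model.named_parameters():
    # Count zeros in the 2-D weight matrices (linear layers and embeddings)
    if param.dim() == 2 and "weight" in name:
        total += param.numel()
        zeros += (param == 0).sum().item()

print(f"Overall weight sparsity: {zeros / total:.2%}")
```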