This model was quantized and pruned with [SparseGPT](https://arxiv.org/abs/2301.00774), using [SparseML](https://github.com/neuralmagic/sparseml).
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "nm-testing/OpenHermes-2.5-Mistral-7B-pruned50"

# Load the pruned checkpoint in half precision and let accelerate place it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Move the tokenized prompt onto the same device as the model before generating
inputs = tokenizer("Hello my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(outputs)[0])
"""
<s> Hello my name is Katie and I am a student at the University of Gloucestershire. I am currently studying
"""
```
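As a quick sanity check (not part of the original card), you can measure the fraction of zero-valued weights in the downloaded checkpoint to confirm the roughly 50% unstructured sparsity produced by SparseGPT. The sketch below is an assumption about how to verify this, not an official SparseML utility; it simply counts zeros across the 2-D weight matrices, so embeddings are included and the measured figure may land slightly under 50%.

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "nm-testing/OpenHermes-2.5-Mistral-7B-pruned50"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

total, zeros = 0, 0
for name, param in model.named_parameters():
    # Count zeros in the 2-D weight matrices (linear layers and embeddings)
    if param.dim() == 2 and "weight" in name:
        total += param.numel()
        zeros += (param == 0).sum().item()

print(f"Overall weight sparsity: {zeros / total:.2%}")
```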