Update README.md
README.md
CHANGED
@@ -27,7 +27,7 @@ By leveraging a pre-sparsified model's structure, you can efficiently fine-tune
 
 ### Running the model
 
-This model
+This model may be run with the transformers library. For accelerated inference with sparsity, deploy with [nm-vllm](https://github.com/neuralmagic/nm-vllm) or [deepsparse](https://github.com/neuralmagic/deepsparse).
 
 ```python
 # pip install transformers accelerate
@@ -37,7 +37,7 @@ tokenizer = AutoTokenizer.from_pretrained("neuralmagic/Llama-2-7b-pruned50-retra
 model = AutoModelForCausalLM.from_pretrained("neuralmagic/Llama-2-7b-pruned50-retrained-ultrachat", device_map="auto")
 
 input_text = "Write me a poem about Machine Learning."
-input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+input_ids = tokenizer.apply_chat_template(input_text, add_generation_prompt=True, return_tensors="pt").to("cuda")
 
 outputs = model.generate(**input_ids)
 print(tokenizer.decode(outputs[0]))
@@ -47,7 +47,7 @@ print(tokenizer.decode(outputs[0]))
 
 Model evaluation metrics and results.
 
-| Benchmark | Metric | Llama-2-7b | Llama-2-7b-pruned50-retrained-ultrachat |
+| Benchmark | Metric | Llama-2-7b-ultrachat | Llama-2-7b-pruned50-retrained-ultrachat |
 |------------------------------------------------|---------------|-------------|-------------------------------|
 | [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot, top-1 | xxxx | xxxx |
 | [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot | xxxx | xxxx |
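Assembled from the fragments in the hunks above, the updated snippet would look roughly like the following sketch. Two details are assumptions on my part rather than part of the commit: `apply_chat_template` expects a list of message dicts, so the raw prompt string is wrapped in a single user turn, and since it returns a tensor (not a tokenizer dict), it is passed to `generate` positionally instead of with `**` unpacking. The `max_new_tokens` value and the use of `model.device` are likewise illustrative choices, not taken from the diff.

```python
# pip install transformers accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "neuralmagic/Llama-2-7b-pruned50-retrained-ultrachat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# apply_chat_template takes a list of chat messages rather than a bare
# string, so the prompt is wrapped in one user turn (assumption, see above).
messages = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# apply_chat_template returns a tensor, so it is passed positionally;
# **input_ids unpacking only works on the dict a plain tokenizer call returns.
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```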