---
license: apache-2.0
---

A toy Llama adapted from [JackFram/llama-160m](https://huggingface.co/JackFram/llama-160m) with special tokens added.

This checkpoint can be loaded into MASE's `LlamaQuantizedForCausalLM`:

```python
from transformers.models.llama import LlamaTokenizer
from chop.models.manual.llama_quantized import (
    LlamaQuantizedConfig,
    LlamaQuantizedForCausalLM,
)

name="Cheng98/llama-160m"
tokenizer = LlamaTokenizer.from_pretrained(name)

# override the quant_config to quantized the model
# default does not quantize llama
config = LlamaQuantizedConfig.from_pretrained(
    name,
    # quant_config="./quant_config_na.toml"

)

llama = LlamaQuantizedForCausalLM.from_pretrained(
    name,
    config=config,
)
```
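
As a quick smoke test, the loaded model can be driven through the standard `transformers` `generate` API (a minimal sketch; the prompt and generation settings below are arbitrary, not part of this checkpoint):

```python
# Hypothetical usage example: greedy decoding with the model loaded above.
prompt = "The meaning of life is"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = llama.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```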