---
license: apache-2.0
---

A toy Llama adapted from [JackFram/llama-160m](https://huggingface.co/JackFram/llama-160m) with special tokens added.

This checkpoint can be loaded into MASE's `LlamaQuantizedForCausalLM`:

```python
from transformers.models.llama import LlamaTokenizer

from chop.models.manual.llama_quantized import (
    LlamaQuantizedConfig,
    LlamaQuantizedForCausalLM,
)

name = "Cheng98/llama-160m"
tokenizer = LlamaTokenizer.from_pretrained(name)

# Override quant_config to quantize the model;
# the default config does not quantize Llama.
config = LlamaQuantizedConfig.from_pretrained(
    name,
    # quant_config="./quant_config_na.toml",
)
llama = LlamaQuantizedForCausalLM.from_pretrained(
    name,
    config=config,
)
```
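As a quick sanity check after loading, the standard `transformers` generation API can be used, assuming `LlamaQuantizedForCausalLM` follows the usual causal-LM interface (a sketch; the prompt is illustrative):

```python
# Illustrative prompt for a generation sanity check.
prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt")

# Standard Hugging Face generation call; assumes the quantized
# model exposes the usual generate() interface.
outputs = llama.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```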