---
license: apache-2.0
---

A toy Llama adapted from [JackFram/llama-160m](https://huggingface.co/JackFram/llama-160m) with special tokens added.

This checkpoint can be loaded into MASE's `LlamaQuantizedForCausalLM` as follows:
```python
from transformers.models.llama import LlamaTokenizer

from chop.models.manual.llama_quantized import (
    LlamaQuantizedConfig,
    LlamaQuantizedForCausalLM,
)

name = "Cheng98/llama-160m"
tokenizer = LlamaTokenizer.from_pretrained(name)

# Override quant_config to quantize the model;
# the default config leaves the Llama unquantized.
config = LlamaQuantizedConfig.from_pretrained(
    name,
    # quant_config="./quant_config_na.toml",
)

llama = LlamaQuantizedForCausalLM.from_pretrained(
    name,
    config=config,
)
```