# Llama-2-70B-Chat-GGUF-tokenizer-legacy

## Tokenizer for llama-2-70b-chat

This repository contains four files: `special_tokens_map.json`, `tokenizer_config.json`, `tokenizer.json`, and `tokenizer.model`. They are used to load a llama.cpp model as a Hugging Face Transformers model through the [llamacpp_HF loader](https://github.com/oobabooga/text-generation-webui/blob/main/modules/llamacpp_hf.py).

Note: these files were converted using [convert_llama_weights_to_hf.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) with the legacy method.

## How to use with [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui)

1. Download a `.gguf` file from [TheBloke/Llama-2-70B-Chat-GGUF](https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGUF), choosing the quantization method you prefer.

2. Place the `.gguf` file in a subfolder of `models/`, together with the four files from this repository.
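
Before starting the web UI, it can help to confirm that the subfolder is laid out as the steps above describe. The following is a minimal sketch using only the Python standard library. The helper name `check_model_folder` and the example folder name are hypothetical, and the check only looks at file names; it does not validate the contents of the files or reproduce what the loader itself does.

```python
from pathlib import Path

# The four tokenizer files named in this repository.
TOKENIZER_FILES = (
    "special_tokens_map.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "tokenizer.model",
)


def check_model_folder(folder):
    """Return a list of problems found in `folder` (hypothetical helper).

    An empty list means the folder holds at least one .gguf file
    and all four tokenizer files.
    """
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]

    problems = []
    # Step 1: a .gguf model file must be present.
    if not any(folder.glob("*.gguf")):
        problems.append("no .gguf file found")
    # Step 2: the tokenizer files must sit next to it.
    for name in TOKENIZER_FILES:
        if not (folder / name).is_file():
            problems.append(f"missing {name}")
    return problems


if __name__ == "__main__":
    # "models/llama-2-70b-chat" is an assumed folder name; use your own.
    for problem in check_model_folder("models/llama-2-70b-chat"):
        print(problem)
```

If the script prints nothing, the folder contains a `.gguf` file and all four tokenizer files.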