Initial Upload

# Llama-2-70B-Chat-GGUF-tokenizer-legacy

## Tokenizer for llama-2-70b-chat

This repository contains the following files: special_tokens_map.json, tokenizer_config.json, tokenizer.json, and tokenizer.model. These files are used to load a llama.cpp model as a HuggingFace Transformers model using [llamacpp_HF loader](https://github.com/oobabooga/text-generation-webui/blob/main/modules/llamacpp_hf.py).

Note: converted using [convert_llama_weights_to_hf.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) with legacy method.

## How to use with [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui)

1. Download a .gguf file from [TheBloke/Llama-2-70B-Chat-GGUF](https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGUF) based on your preferred quantization method;

2. Place your .gguf in a subfolder of models/ along with these 4 files.

Files changed (5) hide show

README.md +14 -3
special_tokens_map.json +23 -0
tokenizer.json +0 -0
tokenizer.model +3 -0
tokenizer_config.json +41 -0

README.md CHANGED Viewed

@@ -1,3 +1,14 @@
----
-license: llama2
----

+# Llama-2-70B-Chat-GGUF-tokenizer-legacy
+## Tokenizer for llama-2-70b-chat
+This repository contains the following files: special_tokens_map.json, tokenizer_config.json, tokenizer.json, and tokenizer.model. These files are used to load a llama.cpp model as a HuggingFace Transformers model using [llamacpp_HF loader](https://github.com/oobabooga/text-generation-webui/blob/main/modules/llamacpp_hf.py).
+Note: converted using [convert_llama_weights_to_hf.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) with legacy method.
+## How to use with [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui)
+1. Download a .gguf file from [TheBloke/Llama-2-70B-Chat-GGUF](https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGUF) based on your preferred quantization method;
+2. Place your .gguf in a subfolder of models/ along with these 4 files.

special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,23 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED Viewed

The diff for this file is too large to render. See raw diff

tokenizer.model ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,41 @@

+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "</s>",
+  "legacy": true,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": null,
+  "sp_model_kwargs": {},
+  "spaces_between_special_tokens": false,
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}