beomi committed on
Commit
b3d5578
1 Parent(s): 98b3f59

Add guide for oobabooga/text-generation-webui

Files changed (1): README.md (+26 −0)
*Llama-2 Original 7B used https://huggingface.co/meta-llama/Llama-2-7b-hf (No tokenizer updated)

## Note for oobabooga/text-generation-webui

Remove the `ValueError` from the `except` clause in the `load_tokenizer` function (around line 109) of `modules/models.py`, so the fast-tokenizer fallback also runs for other exceptions:

```diff
diff --git a/modules/models.py b/modules/models.py
index 232d5fa..de5b7a0 100644
--- a/modules/models.py
+++ b/modules/models.py
@@ -106,7 +106,7 @@ def load_tokenizer(model_name, model):
             trust_remote_code=shared.args.trust_remote_code,
             use_fast=False
         )
-    except ValueError:
+    except:
         tokenizer = AutoTokenizer.from_pretrained(
             path_to_model,
             trust_remote_code=shared.args.trust_remote_code,
```
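The effect of the patch can be sketched in isolation (the helper names here are hypothetical stand-ins for the two `AutoTokenizer.from_pretrained(...)` calls in `modules/models.py`):

```python
# Sketch of the patched control flow: try the slow (sentencepiece) tokenizer
# first and fall back to the fast one on *any* exception, not only ValueError.
def load_tokenizer_with_fallback(load_slow, load_fast):
    try:
        return load_slow()
    except:  # bare except, as in the patch above
        return load_fast()

def failing_slow_loader():
    # Llama-2-Ko ships no sentencepiece model, so the slow path raises an
    # error that `except ValueError:` would NOT have caught.
    raise OSError("no sentencepiece tokenizer found")

print(load_tokenizer_with_fallback(failing_slow_loader, lambda: "fast-tokenizer"))
# → fast-tokenizer
```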

Since Llama-2-Ko uses the fast tokenizer provided by the HF `tokenizers` package, NOT the `sentencepiece` package, the `use_fast=True` option is required when initializing the tokenizer.
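Outside the webui, loading the tokenizer directly with `transformers` looks roughly like this (a minimal sketch; the default repo id `beomi/llama-2-ko-7b` is an assumption not stated in this README, so substitute the actual model path):

```python
from transformers import AutoTokenizer

def load_llama2_ko_tokenizer(path_or_repo: str = "beomi/llama-2-ko-7b"):
    # use_fast=True selects the HF tokenizers backend; Llama-2-Ko ships no
    # sentencepiece model, so the slow path (use_fast=False) fails.
    # The default repo id above is an assumption, not stated in this README.
    return AutoTokenizer.from_pretrained(path_or_repo, use_fast=True)
```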

Apple Silicon does not support BF16 computation, so use the CPU instead. (BF16 is supported when using an NVIDIA GPU.)

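That dtype choice can be sketched with PyTorch (assuming `torch` is installed; the webui exposes its own flags for this, so this is only an illustration):

```python
import torch

# Pick bfloat16 only where it is actually supported (a BF16-capable NVIDIA
# GPU); otherwise fall back to CPU with float32, as the note above suggests
# for Apple Silicon.
if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
    device, dtype = "cuda", torch.bfloat16
else:
    device, dtype = "cpu", torch.float32

print(device, dtype)
```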

---

> Below is the original model card of the Llama-2 model.