beomi committed on
Commit b5ef04e • 1 Parent(s): b3d5578

Fix typo on tokenize example

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -43,7 +43,7 @@ Llama-2-Ko is an auto-regressive language model that uses an optimized transform
  - New vocab and merges, trained with Korean Corpus
  - Tokenizer Examples: Llama-2 vs **Llama-2-Ko**
  - Use the same tokenization for English, but a shorter/merged tokenization for Korean.
- - Tokenize "안녕하세요, 오늘은 날씨가 참 좋네요."
+ - Tokenize "안녕하세요, 오늘은 날씨가 좋네요."
  - Llama-2:
  ```
  ['▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',', '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은', '▁', '<0xEB>', '<0x82>', '<0xA0>', '씨', '가', '▁', '<0xEC>', '<0xA2>', '<0x8B>', '<0xEB>', '<0x84>', '<0xA4>', '요']
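
The `<0x..>` entries in the Llama-2 output are byte-fallback tokens: a syllable missing from the vocabulary is emitted as its raw UTF-8 bytes, three tokens per Korean syllable here. The sketch below is a minimal illustration rather than the tokenizer's own decoding code. It rebuilds the sentence (roughly "Hello, the weather is nice today") from the token list and counts the fallback tokens. The helper name `detokenize` and the constants are hypothetical, and the token list is copied from the example with its Korean characters restored.

```python
import re

# Llama-2 tokens from the README example; "▁" marks a word boundary (a space).
LLAMA2_TOKENS = [
    '▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',',
    '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은',
    '▁', '<0xEB>', '<0x82>', '<0xA0>', '씨', '가',
    '▁', '<0xEC>', '<0xA2>', '<0x8B>', '<0xEB>', '<0x84>', '<0xA4>', '요',
]

# Matches a single byte-fallback token such as "<0xEB>".
BYTE_TOKEN = re.compile(r"<0x([0-9A-Fa-f]{2})>")


def detokenize(tokens):
    """Hypothetical helper: rebuild text from SentencePiece-style tokens."""
    raw = bytearray()
    for tok in tokens:
        match = BYTE_TOKEN.fullmatch(tok)
        if match:
            # Byte-fallback token: append the raw byte it stands for.
            raw.append(int(match.group(1), 16))
        else:
            # Ordinary token: restore spaces, then append its UTF-8 bytes.
            raw.extend(tok.replace("▁", " ").encode("utf-8"))
    # Drop the leading space produced by the first "▁".
    return raw.decode("utf-8").lstrip(" ")


text = detokenize(LLAMA2_TOKENS)
fallback_count = sum(1 for t in LLAMA2_TOKENS if BYTE_TOKEN.fullmatch(t))
print(text)
print(f"{fallback_count} of {len(LLAMA2_TOKENS)} tokens are byte fallbacks")
```

Each run of three byte tokens decodes to one syllable (녕, 늘, 날, 좋, 네), which is why the Llama-2 tokenization of this sentence is so much longer than a tokenization with a Korean-aware vocabulary would be.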