Xenova committed f2253a2 (parent: 4ec0196)

Update README.md

Files changed (1): README.md (+4, -4)
README.md CHANGED
@@ -5,9 +5,9 @@ tags:
 - tokenizers
 ---
 
-# GPT-4 Tokenizer
+# DBRX Instruct Tokenizer
 
-A 🤗-compatible version of the **GPT-4 tokenizer** (adapted from [openai/tiktoken](https://github.com/openai/tiktoken)). This means it can be used with Hugging Face libraries including [Transformers](https://github.com/huggingface/transformers), [Tokenizers](https://github.com/huggingface/tokenizers), and [Transformers.js](https://github.com/xenova/transformers.js).
+A 🤗-compatible version of the **DBRX Instruct tokenizer** (adapted from [databricks/dbrx-instruct](https://huggingface.co/databricks/dbrx-instruct)). This means it can be used with Hugging Face libraries including [Transformers](https://github.com/huggingface/transformers), [Tokenizers](https://github.com/huggingface/tokenizers), and [Transformers.js](https://github.com/xenova/transformers.js).
 
 ## Example usage:
 
@@ -15,7 +15,7 @@ A 🤗-compatible version of the **GPT-4 tokenizer** (adapted from [openai/tikto
 ```py
 from transformers import GPT2TokenizerFast
 
-tokenizer = GPT2TokenizerFast.from_pretrained('Xenova/gpt-4')
+tokenizer = GPT2TokenizerFast.from_pretrained('Xenova/dbrx-instruct-tokenizer')
 assert tokenizer.encode('hello world') == [15339, 1917]
 ```
 
@@ -23,6 +23,6 @@ assert tokenizer.encode('hello world') == [15339, 1917]
 ```js
 import { AutoTokenizer } from '@xenova/transformers';
 
-const tokenizer = await AutoTokenizer.from_pretrained('Xenova/gpt-4');
+const tokenizer = await AutoTokenizer.from_pretrained('Xenova/dbrx-instruct-tokenizer');
 const tokens = tokenizer.encode('hello world'); // [15339, 1917]
 ```
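Besides the Transformers and Transformers.js snippets in the diff above, the updated README also lists the standalone [Tokenizers](https://github.com/huggingface/tokenizers) library as compatible but shows no example for it. A minimal sketch, assuming the repo publishes the `tokenizer.json` file that `Tokenizer.from_pretrained` requires:

```py
from tokenizers import Tokenizer

# Sketch only: load the repo's tokenizer.json directly with the standalone
# `tokenizers` library (assumes the file is present in the repo).
tokenizer = Tokenizer.from_pretrained('Xenova/dbrx-instruct-tokenizer')

encoding = tokenizer.encode('hello world')
print(encoding.ids)  # expected to match the README example: [15339, 1917]
```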