add info about loading tokenizer
README.md
CHANGED
@@ -7,12 +7,12 @@ It is intended for language acquisition research, on a single desktop with a sin
 
 ## Loading the tokenizer
 
-BabyBERTa was trained with `add_prefix_space=
+BabyBERTa was trained with `add_prefix_space=True`, so it will not work properly with the tokenizer defaults.
 Make sure to load the tokenizer as follows:
 
 ```python
 tokenizer = RobertaTokenizerFast.from_pretrained("phueb/BabyBERTa",
-                                                 add_prefix_space=
+                                                 add_prefix_space=True)
 ```
 
 ### Performance
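
To illustrate why the default matters, here is a minimal, library-free sketch of how a byte-level BPE pre-tokenizer treats the first word of a sentence. The regular expression and the `pre_tokenize` helper are simplified assumptions for illustration and are not the actual `transformers` implementation; only the `add_prefix_space` parameter name comes from the README change above.

```python
import re
from typing import List

# Simplified stand-in for a byte-level BPE pre-tokenizer (an assumption:
# the real splitting pattern is more elaborate). A leading space stays
# attached to the word that follows it and is rendered here as "Ġ".
PATTERN = re.compile(r" ?\S+|\s+")


def pre_tokenize(text: str, add_prefix_space: bool) -> List[str]:
    """Split text into pieces, optionally prepending a space first."""
    if add_prefix_space and not text.startswith(" "):
        text = " " + text
    return [piece.replace(" ", "Ġ") for piece in PATTERN.findall(text)]


default_pieces = pre_tokenize("the dog barks", add_prefix_space=False)
prefixed_pieces = pre_tokenize("the dog barks", add_prefix_space=True)

print(default_pieces)   # first word has no "Ġ" marker
print(prefixed_pieces)  # every word, including the first, carries "Ġ"
```

In this sketch the first word differs between the two settings while all later words are identical. A model trained only on the prefixed form would therefore see an unfamiliar first token under the defaults, which is consistent with the warning in the README.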