## Change rope scaling to match max embedding size

#16 by Blackroot - opened
Rope theta appears to be configured for a 32k context length, while the max position embedding is 131072. Using the formula from https://arxiv.org/pdf/2310.05209:

β = 1,000,000^(log_(T_train / 2π) (T_new / 2π))

Where T_train is 32768 (the old context length) and T_new is 131072 (the new context length), solving the equation gives 9,370,821 as the new value.

```
import math

def calculate_beta():
    t_train = 8192 * 4      # 32768, the old context length
    t_new = 8192 * 4 * 4    # 131072, the new context length
    # inner = log_[t_train / 2pi](t_new / 2pi)
    inner = math.log(t_new / (2 * math.pi), t_train / (2 * math.pi))
    # beta = base ** inner, with the original rope theta base of 1,000,000
    beta = 1_000_000 ** inner
    return beta
```
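
Since the PR updates config.json, the computed beta would replace the `rope_theta` field next to `max_position_embeddings`. A minimal sketch of that step, assuming a Llama-style config with those key names (the exact keys depend on the model architecture):

```
import json
import math

def calculate_beta(t_train: float, t_new: float) -> float:
    # log base (t_train / 2pi) of (t_new / 2pi), applied as an
    # exponent to the original rope theta base of 1,000,000
    inner = math.log(t_new / (2 * math.pi), t_train / (2 * math.pi))
    return 1_000_000 ** inner

# Hypothetical config fragment; real config.json files carry many more fields
config = {"max_position_embeddings": 131072, "rope_theta": 1_000_000.0}
config["rope_theta"] = calculate_beta(32768, config["max_position_embeddings"])
print(json.dumps(config, indent=2))
```

With these inputs the result lands near the 9,370,821 value quoted above.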

Blackroot changed pull request title from **Update config.json** to **Change rope scaling to match max embedding size**