How is this different from v1?

#2 opened by amgadhasan


It seems they changed `rope_theta` to 1e6 for all their models.

32k context

@Yuuru What is the source of this information?

> It seems they changed `rope_theta` to 1e6 for all their models.

They also set "sliding_window" to null for some reason.

> @Yuuru What is the source of this information?

The config.json file. (it's the same context size as the previous version)
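If you want to check it yourself, here's a minimal sketch using transformers. I'm assuming the thread is about mistralai/Mistral-7B-Instruct-v0.2 (adjust the repo id if not), and the v0.1 comparison values in the comments are as I recall from that model's config.json:

```python
from transformers import AutoConfig

# Pull the config straight from the Hub and read the fields discussed above.
cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
print(cfg.rope_theta)               # 1000000.0, up from 10000.0 in v0.1
print(cfg.sliding_window)           # None (v0.1 set this to 4096)
print(cfg.max_position_embeddings)  # 32768, same as v0.1
```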

@mrfakename, vLLM says it when loading the model, btw:

 […] max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=1, quantization=None, seed=0
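For reference, a minimal sketch of triggering that startup log with vLLM; the model id here is an assumption, point it at whatever repo you're loading:

```python
from vllm import LLM

# Constructing the engine prints its arguments at startup,
# including max_seq_len=32768 for this config.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
```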

Yeah, it would be interesting to understand how it's actually different from the first one.

It is a lot less obedient, for one: v0.1 refuses to answer a sixth of my test prompts, while v0.2 refuses to answer three-quarters of them.
