Higher perplexity than Meta-Llama-3-70B-Instruct? Meta-Llama-3-8B-Instruct-abliterated was lower.

#1 opened by matatonic

I found that Meta-Llama-3-8B-Instruct-abliterated had lower perplexity than Meta-Llama-3-8B-Instruct, which was amazing, and I was expecting the same here, but it's not: it's higher. Any idea how to reproduce the lower-perplexity result seen with the 8B model? Was there some key difference in how the abliteration was applied?
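For reference, this is roughly the kind of comparison I'm running. A minimal sketch with Hugging Face transformers is below; the model IDs and the WikiText-2 evaluation text are assumptions on my part, not necessarily the exact setup used for the published numbers (which may have come from llama.cpp's perplexity tool instead):

```python
# Sketch of a perplexity comparison between a baseline and an abliterated model.
# Model IDs and dataset are placeholders/assumptions, not the exact original setup.
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer


def perplexity(model_id: str, text: str, stride: int = 2048) -> float:
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model.eval()

    ids = tok(text, return_tensors="pt").input_ids.to(model.device)
    nlls, n_tokens = [], 0
    # Slide non-overlapping fixed-size windows over the text and accumulate
    # the total negative log-likelihood per token.
    for start in range(0, ids.size(1) - 1, stride):
        chunk = ids[:, start : start + stride + 1]
        with torch.no_grad():
            out = model(chunk, labels=chunk)  # loss = mean NLL over shifted tokens
        n = chunk.size(1) - 1
        nlls.append(out.loss * n)
        n_tokens += n
    return math.exp(torch.stack(nlls).sum().item() / n_tokens)


if __name__ == "__main__":
    # WikiText-2 (test split) is a common choice for this kind of sanity check.
    text = "\n\n".join(
        load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"]
    )
    for mid in (
        "meta-llama/Meta-Llama-3-8B-Instruct",           # baseline (assumed ID)
        "failspy/Meta-Llama-3-8B-Instruct-abliterated",  # abliterated variant (assumed ID)
    ):
        print(mid, perplexity(mid, text))
```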

Honestly, I don't think perplexity is even a relevant metric here, other than as a sanity check that you didn't omega-break the model and send it to infinity.

