@Jaward on Hugging Face: "When untrained tokens play "catch me if you can" the Fishing For Margikarp…"

Post

890

When untrained tokens play "catch me if you can" the Fishing For Margikarp paper is the detective:)
The playbook:
- Inspect token vocab & study encode/decode pattern.
- Brute-force on architecture-dependent indicators (same matrix in token embeddings and final layer) to identify untrained tokens.
- Then verify if identified tokens are out of distribution by prompting a target llm (with no tied threshold).

Quite a bait huh, Cohere:)

Paper: https://arxiv.org/pdf/2405.05417
Code: https://github.com/cohere-ai/magikarp

Join the conversation