Legal/law expert version

#7
by Adikul - opened

Can you extract an expert, like one for legal or law case work?

Owner

@Adikul Unfortunately, that’s not how these experts work. The experts actually learn grammar-like, token-level patterns instead of a specific topic. That’s why I hate the term “expert”. For example, one expert might learn when to put a space, another might learn when to put symbols, and another might learn when to write words. But no single expert is going to get specifically good at, let’s say, coding or math or role-playing. The whole point of MoE is to make the computation cheaper, since you don’t have to run the entire model for every token, even though you still need to load the entire model into RAM.
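To make the per-token routing concrete, here is a rough sketch of top-2 routing over 8 experts, which is the setup the Mixtral paper describes. The router logits below are random stand-ins for a learned router, and `route_token` and the token list are hypothetical names made up for illustration, not anything from the actual model code.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a small list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, top_k=2):
    """Pick the top_k experts for ONE token and return [(expert_index, weight), ...].

    The weights are a softmax over the selected logits only, so they sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:top_k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

random.seed(0)
NUM_EXPERTS = 8  # assumption: 8 experts per layer, 2 active per token

# Hypothetical token sequence mixing "legal" and "code" text.
tokens = ["The", " court", " ruled", " that", " x", " =", " 1", ";"]
for tok in tokens:
    # Stand-in for the learned router: random logits per token.
    logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
    chosen = route_token(logits)
    print(repr(tok), [(i, round(w, 2)) for i, w in chosen])
```

The choice is made again for every token (and, in the real model, again in every layer), so a legal document ends up spread across all the experts rather than landing on one that could be pulled out. Only 2 of the 8 expert blocks run per token, which is where the compute saving comes from, but all 8 still have to sit in memory.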

Here are some sources you can read if you would like to understand them a bit further. The Mixtral paper has a graph showing what each 7B expert got good at, and spoiler alert: none of them is good at one specific topic.

https://huggingface.co/blog/moe#what-does-an-expert-learn

https://arxiv.org/pdf/2401.04088.pdf
