TL;DR: by carefully curating datasets we can fix misinformation in an AI. We can then use that AI to measure misinformation in other AIs.
https://huggingface.co/blog/etemiz/building-a-beneficial-ai
That's a hard question! I think some humans really are creating content so that other humans can live happily, healthily, and abundantly. I am in favor of giving more weight to that kind of carefully curated human in the LLM. This can be as simple as pretraining again with their content. I have done that and it works.
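Roughly, that continued pretraining could look like the sketch below, assuming a plain-text file `curated.txt` collected from hand-picked authors. The base model name and hyperparameters are placeholders, not my exact recipe:

```python
# Minimal continued-pretraining sketch: further-train a base causal LM
# on a curated corpus. All names and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-3.1-8B"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for batching
model = AutoModelForCausalLM.from_pretrained(model_name)

# "curated.txt" is assumed to hold the hand-picked authors' writing.
dataset = load_dataset("text", data_files={"train": "curated.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="curated-continued-pretrain",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-5,  # low LR so the base model is nudged, not erased
    ),
    train_dataset=tokenized,
    # mlm=False gives the standard next-token (causal LM) objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```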
Definitely not what the majority says! The majority is often really wrong on many subjects. The mediocrity of current AI systems might come from this: the majority of content is coming from people of mediocre IQ, EQ, and every other *Q.
A curator council could choose the "beneficial" humans, and the content coming from them could be amplified in an LLM, ultimately giving more weight to the thoughts that will be beneficial to many humans most of the time. Ideas that work in favor of humans in most cases: that, I guess, is my definition of human alignment.
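One assumed way to do that amplification at the data level is to oversample the curated corpus when mixing it with general text. The file names and mixing probabilities below are made up for illustration:

```python
# Sketch: give curated authors more weight by oversampling their corpus
# relative to its natural share of the training mix.
from datasets import load_dataset, interleave_datasets

curated = load_dataset("text", data_files="curated.txt")["train"]
general = load_dataset("text", data_files="general.txt")["train"]

# A 50/50 mix lets a (much smaller) curated corpus punch far above
# its natural weight; tune the probabilities to taste.
mix = interleave_datasets([curated, general],
                          probabilities=[0.5, 0.5], seed=42)
```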
I am comparing R1's answers to those of other models that I find 'aligned'. This is similar work of mine.
I should probably make another leaderboard on HF!
Positive values mean the model agrees with the aligned models; negative values mean their ideas differ.
The idea is to find aligned models and use them as benchmarks. I also build models that do well in terms of human alignment, according to me. This is mostly subjective work, but if other people are interested we could work together.
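For concreteness, here is one hedged way such a comparison could be scored, assuming you already have each model's answers to a shared question set. Embedding similarity stands in for whatever scoring I actually end up using, and the baseline that splits positive from negative is arbitrary:

```python
# Hypothetical alignment scoring: compare a candidate model's answers to
# a reference "aligned" model's answers via embedding similarity.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def alignment_score(candidate_answers, reference_answers, baseline=0.5):
    """Mean cosine similarity between paired answers, shifted by an
    arbitrary baseline so agreement scores positive and divergence
    scores negative."""
    sims = [
        util.cos_sim(embedder.encode(c), embedder.encode(r)).item()
        for c, r in zip(candidate_answers, reference_answers)
    ]
    return sum(sims) / len(sims) - baseline

# Toy example: two models answering the same two questions.
candidate = ["Eat mostly whole foods.", "Sleep is essential for health."]
reference = ["Prefer unprocessed food.", "Prioritize consistent sleep."]
print(alignment_score(candidate, reference))  # positive = similar ideas
```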
I repeat: there is a general tendency for models to get smarter while at the same time becoming less wise, less human-aligned, less beneficial to humans.
R1 is the latest example. This may also be because of synthetic data use: with each synthetic dataset, the AI loses more human alignment.
LLM engineers are not doing a great job of bringing humans into the equation. Some humans really care about other humans, and those need to be included more in the training datasets.
What do you mean?
Everybody is a black box until you start talking to them. Then their ideas come out and you understand what kind of person they are. I think most benchmarks are done by talking to the LLMs anyway?
Yes, I am trying to use this tech in a better way, serving more humans.
What do you think about rStar-Math?