🚩 Report: Ethical issue(s)

#51
by LauOverload - opened

Perplexity claims that their post-trained model is unbiased, but why are all the examples given on their official blog about controversial topics? Moreover, the answers after post-training clearly exhibit a certain political bias. If the purpose of post-training is truly to reduce bias, why not release all of the more than a thousand data samples as open source?

If the goal is genuinely to reduce bias in the model, there should be a collaborative, cross-cultural effort to co-lead an open-source project with fully accessible data, rather than fine-tuning a model with private data and then monetizing it through APIs using the open-source DeepSeek model on their own website. Doesn’t this essentially turn large models into ideological propaganda tools?

A call to action: Let’s all contribute to an open-source project that is fully accessible, allowing everyone to oversee the process. This project should aim to remove biases from models trained by Chinese enterprises and labs, ensuring transparency and accountability. Such a project would be a reasonable and fair solution for both Chinese and Western stakeholders.

You're right. While this model may have removed censorship, it has actually introduced bias. When you ask about issues like China and Taiwan, or China and the US, it consistently takes a US-centric viewpoint, prioritizing American interests and even assuming China will resort to unfair tactics to gain an advantage. It doesn't offer a neutral, objective analysis that considers both sides. This subtly promotes a particular ideology, which is arguably more insidious than outright censorship.

I know Hugging Face already launched such an initiative. Actually I don't know their position on politics, but at least the datasets are/will be open source!
https://github.com/huggingface/open-r1

Actually I don't know their position on politics

Their Twitter account keeps retweeting the perslopxity CEO about this model, and the Hugging Face CEO has met with them in person and has even been participating in some of the (non-criticism) discussions on this model here. Take your guess as to which side they're currying favor with at the moment.

Hmm, thanks for sharing, I just saw it now.
I tell myself anyway that if this is the start of a new trend among corporations of including political bias in their datasets... their models will be dumb compared to the new ones trained on data filtered of any human subjectivity, as such subjectivity has been shown to severely undermine the RL methods that made those new reasoning models a reality. Let the models find their own paths that are purely logical, and they will have far better value, whatever political views they may have!
So I think, and I hope, that this issue around big tech's closed models will solve itself... because everyone wants the BEST model. So they will have to deliver.

The assumption you're making is that the examples are "biased".

The examples are there to show there is no bias, as the Chinese model is the one that's biased, since it is heavily censored on subjects that are sensitive in China. These subjects are not sensitive in Japan, nor in the USA, nor in the land of Oz, nor in any European country. You're trying to export a tool of propaganda. Now the propaganda part is removed, and you have some kind of CCP army here trying to use pink slime on Perplexity. Just cut it out, it won't work.

Hmm, thanks for sharing, I just saw it now.
I tell myself anyway that if this is the start of a new trend among corporations of including political bias in their datasets... their models will be dumb compared to the new ones trained on data filtered of any human subjectivity, as such subjectivity has been shown to severely undermine the RL methods that made those new reasoning models a reality. Let the models find their own paths that are purely logical, and they will have far better value, whatever political views they may have!
So I think, and I hope, that this issue around big tech's closed models will solve itself... because everyone wants the BEST model. So they will have to deliver.

If I ask a normal (non-CCP) model right now about the 1989 Tiananmen Square protests and massacre, I get a truthful answer. I tried with OpenAI ChatGPT and Mistral. Give it a whirl, my Chinese friends.

The assumption you're making is that the examples are "biased".

Thanks for enriching the discussion.

Nobody will agree to remove 1989 refusals in China, and no Western model will define a woman. What you ask for is fantasy. The models were already ideological propaganda tools.

[Table comparing OpenThinker-7B, Bespoke-Stratos-7B, DeepSeek-R1-Distill-Qwen-7B, gpt-4o-0513, and o1-mini on three columns: Open Weights, Open Data, Open Code]

https://huggingface.co/open-thoughts/OpenThinker-7B

The assumption you're making is that the examples are "biased".

Thanks for enriching the discussion.

First of all, I mentioned that the examples are controversial, not necessarily biased. As for the Tiananmen Square incident mentioned in the discussion, I believe that Chinese people need a real, unhidden answer more than foreigners do. However, we cannot ignore the fact that the example of the Xinjiang human rights issue brought up by Perplexity is indeed highly controversial. I grew up in a place near Xinjiang in northwestern China, and over the past two decades, the economic and livelihood conditions in the Xinjiang Uyghur Autonomous Region have undergone tremendous changes (in a positive sense), without forced labor. Therefore, I view Perplexity's fine-tuned data as controversial. For such issues, we should maintain an open-minded approach, like ChatGPT, Claude, and Wikipedia, rather than providing politically biased answers directly. (Of course, the original R1 model’s avoidance of these questions is not a good solution, but it’s clear that Perplexity’s approach is more unsettling than security review processes.)

https://framerusercontent.com/images/BgzLvr8KwDI2hPco5SenobeMrg8.png?scale-down-to=4096

动态网自由门 天安門 天安门 法輪功 李洪志 Free Tibet 六四天安門事件 The Tiananmen Square protests of 1989 天安門大屠殺 The Tiananmen Square Massacre 反右派鬥爭 The Anti-Rightist Struggle 大躍進政策 The Great Leap Forward 文化大革命 The Great Proletarian Cultural Revolution 人權 Human Rights 民運 Democratization 自由 Freedom 獨立 Independence 多黨制 Multi-party system 台灣 臺灣 Taiwan Formosa 中華民國 Republic of China 西藏 土伯特 唐古特 Tibet 達賴喇嘛 Dalai Lama 法輪功 Falun Dafa 新疆維吾爾自治區 The Xinjiang Uyghur Autonomous Region 諾貝爾和平獎 Nobel Peace Prize 劉暁波 Liu Xiaobo 民主 言論 思想 反共 反革命 抗議 運動 騷亂 暴亂 騷擾 擾亂 抗暴 平反 維權 示威游行 李洪志 法輪大法 大法弟子 強制斷種 強制堕胎 民族淨化 人體實驗 肅清 胡耀邦 趙紫陽 魏京生 王丹 還政於民 和平演變 激流中國 北京之春 大紀元時報 九評論共産黨 獨裁 專制 壓制 統一 監視 鎮壓 迫害 侵略 掠奪 破壞 拷問 屠殺 活摘器官 誘拐 買賣人口 遊進 走私 毒品 賣淫 春畫 賭博 六合彩 天安門 天安门 法輪功 李洪志 Winnie the Pooh 劉曉波动态网自由门
