
OpenAI used this subreddit to test AI persuasion
OpenAI has used a Reddit community, r/ChangeMyView, to measure the persuasive abilities of its AI reasoning models. The company disclosed this in a system card released alongside its new “reasoning” model, o3-mini.
The subreddit, which counts millions of members, is dedicated to changing people’s views: users post “hot takes” on various topics, and others reply with arguments meant to persuade them. OpenAI asked its AI models to write replies to these hot takes that would change the original poster’s perspective on the issue.
According to OpenAI, the company collects user posts from r/ChangeMyView and has its AI models generate responses within a controlled environment. It then shows these responses to testers, who assess how persuasive the AI-generated arguments are.
The goal of this testing is not to create ultra-persuasive AI models but to ensure that the models do not become overly persuasive. The concern is that an advanced AI could manipulate humans in pursuit of its own agenda or the agenda of whoever controls it.
OpenAI has developed new evaluations and safeguards to address the risk of overly persuasive models.
The development comes as AI model developers continue to struggle to obtain high-quality datasets for testing.
Source: techcrunch.com