
An organization developing math benchmarks for artificial intelligence (AI) has come under fire for failing to disclose its funding from OpenAI. According to reports, Epoch AI, a nonprofit primarily backed by Open Philanthropy, a research and grantmaking foundation, did not reveal that it had received support from OpenAI until December 20.
It appears that many contributors to the FrontierMath benchmark were not informed of OpenAI’s involvement until this point. A contractor for Epoch AI, who goes by the username “Meemi,” stated on the forum LessWrong that numerous individuals working on the project were unaware of OpenAI’s support until it was publicly disclosed.
“Many people contributed to FrontierMath without knowing about OpenAI’s role in it,” Meemi claimed. “This lack of transparency is problematic.”
The secrecy surrounding Epoch AI's funding from OpenAI has led some in the AI community to question the integrity of the benchmark, with critics arguing that the lack of disclosure may have compromised FrontierMath's objectivity.
Epoch AI lead mathematician Elliot Glazer attempted to address these concerns, stating that he believes OpenAI's results on the benchmark are legitimate and that the company does not appear to have trained its AI model on the dataset.
However, Glazer also acknowledged that Epoch AI has been unable to independently verify OpenAI's results. He noted that the organization is still conducting an independent evaluation that would allow it to confirm or refute OpenAI's reported scores.
The controversy highlights the difficulties involved in developing objective benchmarks for AI and securing the necessary resources for these initiatives without creating the perception of conflicts of interest.
As AI development advances at a rapid pace, the need for reliable and unbiased benchmarks becomes increasingly critical. Organizations such as Epoch AI must prioritize transparency in their funding and operations to maintain public trust and ensure the integrity of their work.
In conclusion, this latest controversy serves as a stark reminder of the challenges associated with developing rigorous, independent benchmarks for AI.
Source: techcrunch.com