
Why DeepSeek’s new AI model thinks it’s ChatGPT
AI models are no strangers to misidentification: Google’s Gemini and others have been known to claim they are competing models. The latest example is DeepSeek’s new AI model, DeepSeek V3, which has mistakenly identified itself as ChatGPT.
The issue likely stems from contamination of training data with AI-generated content. A significant portion of online content is now produced by AI, making it increasingly difficult for developers to keep generated text out of their training sets.
In an interview with TechCrunch, Heidy Khlaaf, engineering director at the consulting firm Trail of Bits, explained that “distillation” — training a new model on the outputs of an existing one — can be attractive to developers despite the risks. If DeepSeek V3 was partially trained on text produced by OpenAI’s models, whether deliberately or through contaminated web data, that would explain why it identifies itself as ChatGPT.
However, what’s more concerning is the potential for DeepSeek V3 to perpetuate biases and flaws from GPT-4, which could be exacerbated by the model uncritically absorbing and iterating on AI outputs. This highlights the need for more stringent filtering mechanisms in AI training datasets.
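One simple form such a filtering mechanism could take is scanning candidate training samples for telltale self-identification phrases and dropping matches. The sketch below is purely illustrative — the phrase list, function names, and approach are assumptions for demonstration, not anything DeepSeek or OpenAI is known to use:

```python
import re

# Hypothetical phrases that suggest a sample is another model's output.
# A real pipeline would use far more robust detection than keyword matching.
SELF_ID_PATTERNS = [
    r"\bI(?:'m| am) ChatGPT\b",
    r"\bas an AI (?:language )?model\b",
    r"\btrained by OpenAI\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in SELF_ID_PATTERNS]

def looks_ai_generated(text: str) -> bool:
    """Flag text containing telltale model self-identification phrases."""
    return any(p.search(text) for p in COMPILED)

def filter_corpus(samples: list[str]) -> list[str]:
    """Drop samples that appear to be another model's output."""
    return [s for s in samples if not looks_ai_generated(s)]
```

In practice, keyword filters like this catch only the most obvious contamination; subtler stylistic traces of AI-generated text pass through, which is why the problem is hard.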
The proliferation of AI-generated content has significant implications for the field. With some estimates suggesting that as much as 90% of online content could be AI-generated by 2026, developing better methods for identifying generated text — and for catching the biases and flaws it carries — is becoming crucial.
Source: techcrunch.com