OpenAI Accuses DeepSeek of Copyright Infringement in AI Model Dispute

DeepSeek made a significant splash in the AI community this month, but the company is now under fire over allegations of unauthorized data use. OpenAI says it has evidence that DeepSeek used OpenAI’s models to develop a rival AI model, which would breach OpenAI’s terms of service if verified.

OpenAI told the Financial Times that DeepSeek used a technique known as “knowledge transfer” or “model distillation” to train its AI models. Put simply, distillation uses the outputs of an existing model (the ‘teacher’) to train a newer one (the ‘student’). The technique saves resources because the student benefits from the work already invested in the teacher model.
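To make the teacher and student idea concrete, here is a minimal sketch of distillation in plain Python. It is a toy illustration under stated assumptions, not a description of how DeepSeek or OpenAI train their models: both models are small linear classifiers, and the names `softmax`, `linear_logits`, `kl_divergence`, `mean_kl`, and `distill`, along with the weights and inputs, are hypothetical. The student never sees ground-truth labels; it only learns to match the teacher's softened output probabilities.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def linear_logits(weights, x):
    """Compute one logit per class; `weights` holds one row of coefficients per class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def kl_divergence(p, q):
    """KL(p || q): how far the student's distribution q is from the teacher's p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_kl(teacher_w, student_w, inputs, temperature):
    """Average teacher-to-student KL divergence over all inputs."""
    total = 0.0
    for x in inputs:
        p = softmax(linear_logits(teacher_w, x), temperature)
        q = softmax(linear_logits(student_w, x), temperature)
        total += kl_divergence(p, q)
    return total / len(inputs)


def distill(teacher_w, student_w, inputs, temperature=2.0, lr=0.5, epochs=200):
    """Train a copy of the student to imitate the teacher's soft outputs."""
    student = [row[:] for row in student_w]  # leave the caller's weights untouched
    for _ in range(epochs):
        for x in inputs:
            target = softmax(linear_logits(teacher_w, x), temperature)
            pred = softmax(linear_logits(student, x), temperature)
            for c, row in enumerate(student):
                # Gradient of the cross-entropy with respect to the student's logit c.
                grad = (pred[c] - target[c]) / temperature
                for j, xj in enumerate(x):
                    row[j] -= lr * grad * xj
    return student


if __name__ == "__main__":
    # Hypothetical teacher weights and unlabeled inputs, chosen only for illustration.
    teacher = [[2.0, -1.0], [-1.0, 2.0], [0.0, 0.0]]
    inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5], [0.5, -1.0], [-0.5, -0.5]]
    untrained = [[0.0, 0.0] for _ in teacher]
    trained = distill(teacher, untrained, inputs)
    print("mean KL before:", mean_kl(teacher, untrained, inputs, 2.0))
    print("mean KL after: ", mean_kl(teacher, trained, inputs, 2.0))
```

The point of the sketch is the data flow: the only training signal is what the teacher outputs for each input, which is why access to a model's outputs at scale (for example through an API) is enough to attempt distillation.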

DeepSeek’s R1 model shook the AI sector and sent stock prices sharply lower because the company claimed to have built it at a tiny fraction of the cost of rival models. If it turns out that DeepSeek used distillation to build on OpenAI’s existing models, those cost claims would be open to question.

Distillation isn’t inherently negative; it’s quite common in the AI sector. The controversy arises because DeepSeek is alleged to have misused distillation in a way that breaches OpenAI’s terms of service. Specifically, those terms bar using OpenAI’s API to replicate its service outright, and they forbid users from using the API’s outputs to create models that compete with OpenAI.

The Financial Times spoke with a person familiar with OpenAI, who described the terms of OpenAI’s user agreement.

Following Bloomberg’s initial report, it has come to light that OpenAI and Microsoft investigated accounts they believe are linked to DeepSeek. Those accounts came under scrutiny last year and were restricted on suspicion of distillation.

OpenAI shared the following statement with our colleagues at TechRadar:

We’re aware that PRC-based firms, along with others, are persistently attempting to replicate the AI models of top US companies. Being the foremost AI developer, we take active steps to safeguard our intellectual property. This involves a meticulous selection process for the advanced features included in our public models. Moving forward, we strongly advocate for close collaboration with the U.S. government to ensure our most sophisticated models remain secure from potential theft by adversaries and rivals, aiming to preserve US technological advancements.

David Sacks, the White House’s point person on AI and cryptocurrency, has also weighed in on the dispute between DeepSeek and OpenAI. Sacks said there is a strong indication that DeepSeek distilled knowledge from OpenAI’s models, and that he does not think OpenAI is happy about it.

OpenAI’s ironic accusation

The irony is that OpenAI has itself been accused multiple times of using data without consent. In December 2023, The New York Times sued OpenAI, arguing that using its articles to train generative models does not qualify as fair use. The Intercept, Raw Story, and AlterNet filed further lawsuits in February 2024.

The common allegation against OpenAI is that its AI models were trained on data obtained without proper permission from the original owners.

None of this makes it appropriate for DeepSeek to use data without permission to train AI models, and OpenAI’s terms do specifically prohibit using its models to develop competing technology. Even so, given the allegations about its own data practices, OpenAI’s accusations seem quite ironic.


2025-01-29 21:09