October 05, 2023
Foundation models and the privatization of public knowledge
"To protect the integrity of knowledge production, the training procedures of foundation models such as GPT-4 need to be made accessible to regulators and researchers. Foundation models must become open and public, and those are not the same thing", writes AlgoSoc PI José van Dijck with researchers Fabian Ferrari and Antal van den Bosch in the September issue of Nature Machine Intelligence.
Their article discusses concerns surrounding ChatGPT, an AI system developed by OpenAI. These concerns encompass safety, plagiarism, bias, and accuracy, with a particular focus on how large language models (LLMs) owned by private entities may impact knowledge production. The authors note that OpenAI, despite its nonprofit origins, has partly transformed into a capped-profit company with Microsoft as a major investor. The question of whether regulators and scientists will have access to the inner workings of deep neural networks, including their training data and procedures, remains under-addressed. This lack of transparency could hinder thorough inspection, replication, and testing, potentially compromising the integrity of public knowledge.
In their article, Van Dijck et al. also examine the training data and procedures of foundation models such as GPT-4 and PaLM. While some information is available about the datasets used, details about their specific composition and sources remain undisclosed. The authors highlight that making foundation models both open and public is crucial: "open" refers to making them available for detailed inspection and replication, while "public" implies treating them as utilities accessible to all. Van Dijck et al. suggest that ensuring transparency, implementing technical safeguards, and enacting legal and regulatory measures are essential steps. Moreover, they emphasize the need for improved AI literacy among citizens to foster a democratic understanding of these technologies in the context of knowledge creation.
Cite this article: Ferrari, F., van Dijck, J. & van den Bosch, A. Foundation models and the privatization of public knowledge. Nat Mach Intell 5, 818–820 (2023). https://doi.org/10.1038/s42256...