Channel: Artificial Intelligence News, Analysis and Resources - The New Stack

Calls to Ban Open Source are Misguided and Dangerous


The cries to “ban open source” first surfaced last autumn—partly a response to Meta and others’ “opening” large language models (LLMs). Lobbyists bandied the phrase around political rallies and across policy circles. Yet many critics could not explain what open source means in any context and were unfamiliar with the Open Source Definition (OSD). Not knowing or understanding the technical details did not appear to be a barrier to sharing a negative opinion.

This brouhaha was not about open source software but about opening AI, as open source was used as a generic term to capture any form of openness in AI.

Alarm bells started ringing.

While the existence of this conversation in credible circles may be difficult to accept, look at this recent article from the San Francisco Examiner. The war on open source is real and current, and conversations on banning “open source” are far from over.

The Examiner contextualizes open source as something that has “long had Silicon Valley’s support,” referring not to AI but to open source software. Open source software is good for the innovation business and sits at its heart today.

What may come as a shock is venture capitalists like Vinod Khosla describing it as “a national-security hazard.” The Examiner shares Marc Andreessen’s counterargument — “that restricting open source access to AI would lead to a cartel of big companies dominating the technology and would undermine academic research into it.”

Like Andreessen, you are probably wondering, “What kind of society would you need to design that would have the enforcement mechanisms to enforce an open-source ban?”

“Now you start to get into [George] Orwell territory.”

Open Washing

The conflation of two concepts — open source software and open source in AI — is a crucial challenge.

Open source software shares its human-readable source code and is distributed under a license that meets the now over 25-year-old Open Source Definition (OSD), hosted by the Open Source Initiative (OSI). The words “open source” as applied to AI, on the other hand, have not been clearly defined and are being used by some media, lobbyists, lawmakers and policymakers as a catch-all for any form of open AI.

This misuse of the term “open source” has led to articles in the mainstream press, like the New York Times, explaining the concerns and risks of “open washing” in AI.

Open washing is a concept anyone familiar with open source software knows. It’s the colloquial term for software that does not meet the open source requirements being misrepresented as open source. Mislabeling gives the distributor the perceived benefits of open source (adoption at pace, the potential to become a de facto standard, community contribution and collaboration, and any legal benefits or exclusions) while not conveying the full benefits that genuine open source does.

Criteria 5 and 6 of the OSD (no discrimination against persons or groups, and no discrimination against fields of endeavor) mean that anyone may use open source software for any purpose, enabling a free flow. Licensing, whether open source or otherwise, does not trump law, and despite the free flow of the license, open source software remains subject to laws such as export control.

Lawmakers make, and are accountable for, decisions on matters such as ethics; open source software licensing does not.

Open washing means that the license applied to the software doesn’t meet the OSD, generally because it restricts the free flow. Such a license isn’t, and couldn’t be, approved as open source because it doesn’t meet the OSD. Classically, open washing licenses restricted commercial use: the software could be used for any purpose except commercially, contrary to criterion 6.

Monetizing open source software is challenging due to its free flow, which removes the ability to restrict competition. Open source effectively means you enable your competitors with your innovation. In open washing, a distributor may retain a level of control while wrongly implying its software is open source. They effectively have their cake and eat it, too.

The term “open source” was not registered as a trademark; therefore, it is challenging to police its use. That fact is often overlooked but significant — and a lesson to all.

Openness in AI

The Examiner described that AI “systems consist of multiple components that typically include a model architecture, which is the core algorithm that determines what the system does with and learns from inputted data; model weights, which are the variables that determine how inputted data such as prompts are turned into output, such as illustrations or essays; the software code that’s used to train the model or run it after it’s trained; and the training data.”

In short, AI—in this context, Generative AI and LLMs—is made up of more than software, open source or otherwise. The AI components may be disaggregated, and each may itself be open, closed or in between. When a component is opened, that openness can be full—with the free flow and good practices associated with open source software—or it may be restricted or fully closed.

Meta’s Llama 2 is labeled on its website as “open innovation,” yet Mark Zuckerberg refers to it as open source, which could be a case of open washing.

Last December, Stanford HAI’s report stated that AI systems run the gamut in terms of their openness, followed in February by the U.K.’s House of Lords Communications Committee report on its LLM Inquiry, which noted that “use of the term ‘open source’ model remains contested.” They also set out gradients between fully open and fully closed, with a number of intermediate levels.

In June, the Centre for Language Studies at Radboud University Nijmegen in The Netherlands set out 14 “dimensions” as criteria, to enable clear disaggregation and assessment of the gradients of openness.

The importance of this breakdown and assessment is two-fold:

Firstly, components and levels must be understood to determine the risks and benefits of each open component.

Secondly, it raises legitimate concerns about the prematurity of the world’s first AI legislation, the European Union’s AI Act, which uses the term open source to offer a special status, with exemptions from liability, to AI meeting the criteria of open source.

This approach of disaggregating and assessing the components of AI is essential to managing the assessment of risks, benefits and liabilities in AI.

The Linux Foundation’s Model Openness Framework tries to remove ambiguity and create transparency by clarifying the availability, licensing and suitability for commercial use of each AI component. It is now backed by an assessment tool. Open source tooling like this, and the U.K. AI Safety Institute’s Inspect evaluation platform, to which several countries and 16 key AI providers have committed, enable compliance assessment without regulation.

Creating an Open Source AI Definition

An “open source AI” definition is being developed with the OSI. If, as presumed, it captures all the component parts of AI, including data, it may serve as the anchor of fully open or “open source AI,” sitting at the opposite end of the scale from fully closed, with a sliding scale between the two.

The definition’s utility may, like the AI Act’s, be limited in time to the moment of its finalization, at risk of being rapidly outdated by the “moving target” of AI today. Wherever it lands, its creation and content must support the OSD.

Impact on Open Source Software

Regulators seek someone to hold liable and to assign risk to. The last couple of years have seen concerns about open source security raised in White House executive orders and the EU’s Cyber Resilience Act. Few anywhere understand the nuances of the creation, distribution and commercialization of open source software, and it’s not surprising that these nuances are not translating well into regulation.

Open source is already under challenge, and conversations about “banning open source” in AI raise genuine concerns.

The risk the AI debate poses to the future of open source software is palpable. Its recent mass adoption has normalized open source at an unmatched pace, including among users who may not understand or curate it well.

Open source has indeed won a software battle. The unanswered question is whether this is a fleeting victory in the fight to democratize technology. The guardians of open source must protect open source software from any impact AI may have on it.

The post Calls to Ban Open Source are Misguided and Dangerous appeared first on The New Stack.


