Artificial intelligence is in the middle of a perfect storm in the software industry, and now Mark Zuckerberg is calling for open sourced AI.
Three powerful perspectives are colliding on how to control AI:
- All AI should be open source for sharing and transparency.
- Keep AI closed-source and allow big tech companies to control it.
- Establish regulations for the use of AI.
There are a few facts that make this debate tricky. First, if you have a model’s source code, you know absolutely nothing about how the model will behave. Openness in AI requires far more than providing source code. Second, AI comes in many different flavors and can be used to solve a broad range of problems.
From traditional AI for fraud detection and targeted advertising to generative AI for creating chatbots that, on the surface, produce human-like results, pushing us closer and closer to the ultimate (and scary) goal of Artificially Generated Intelligence (AGI). Finally, the ideas listed above for controlling AI all have a proven track record for improving software in general.
Understanding the Different Perspectives
Let’s discuss the different perspectives listed above in more detail.
Perspective #1 — All AI should be open source for sharing and transparency: This comes from a push for transparency with AI. Open source is a proven way to share and improve software. It provides complete transparency when used for conventional software. Open source software has propelled the software industry forward by leaps and bounds.
Perspective #2 — Keep AI closed-source and allow big tech companies to control it: Closed-source, or proprietary software, is the idea that an invention can be kept a secret, away from the competition, to maximize financial gain. To open source idealists, this sounds downright evil; however, it is more of a philosophical choice than one that exists on the spectrum of good and evil. Most software is proprietary, and that is not inherently bad — it is the foundation of a competitive and healthy ecosystem. It is a fundamental right of any innovator who creates something new to choose the closed-source path. The question becomes, if you operate without transparency, what guarantees can there be around responsible AI?
Perspective #3 — Establish regulations for using AI: This comes from lawmakers and elected officials pushing for regulation. The basic idea is that if a public function or technology is so powerful that bad actors or irresponsible management could hurt the general public, a government agency should be appointed to develop controls and enforce those controls. A school of thought suggests that incumbent and current leaders in AI also want regulation, but for reasons that are less pure — they want to freeze the playing field with them in the lead. We will primarily focus on the public good area.
The True Nature of Open Source
Before generative AI burst onto the scene, most software running in data centers was conventional. You can determine precisely what it does if you have the source code for traditional software. An engineer fluent in the appropriate programming language can review the code and determine its logic. You can even modify it and alter its behavior. Open source (or open source code) is another way of saying — I am going to provide everything needed to determine behavior and change behavior. In short, the true nature of open source software is to provide everything you need to understand the software’s behavior and change it.
For a model to be fully open, you need the training data, the source code of the model, the hyperparameters used during training, and, of course, the trained model itself, which is composed of the billions (and soon trillions) of parameters that store the model’s knowledge — also known as parametric memory. Now, some organizations only provide the model, keep everything else to themselves, and claim it is “open source.” This practice is known as “open-washing” and is generally frowned upon by both the open and closed-source communities as disingenuous. I would like to see a new term used for AI models that are partially shared. Maybe “partially open model” or “model from an open washing company.”
There is one final rub when it comes to fully shared models. Let’s say an organization wants to do the right thing and shares everything about a model — the training data, the source code, the hyperparameters, and the trained model. You still can’t determine precisely how it will behave unless you test it extensively. The parametric memory that determines behavior is not human-readable. Again, the industry needs a different term for fully open models. A term that is different from “open source,” which should only be used for non-AI software because the source code of a model does not help determine the behavior of the model. Perhaps “open model.”
Common Arguments
Let’s look at some common arguments that endorse using only one of the previously described perspectives. These are passionate defenders of their perspective, but that passion can cloud judgment.
Argument: Closed AI supporters claim that big tech companies have the means to guard against potential dangers and abuse. Therefore, AI should be kept private and out of the open source community.
Rebuttal: Big tech companies have the means to guard against potential abuse, but that does not mean they will do it judiciously or at all. Furthermore, there are other objectives besides this. Their primary purpose is making money for their shareholders, which will always take precedence.
Argument: Those who think that AI could threaten humanity like to ask, “Would you open source the Manhattan Project?”
Rebuttal: This is an argument for governance. However, it is an unfair and incorrect analogy. The purpose of the Manhattan Project was to build a bomb during wartime by using radioactive materials to produce nuclear fusion. Nuclear fusion is not a general-purpose technology that can be applied to different tasks. You can make a bomb and generate power — that’s it. The ingredients and the results are dangerous to the general public, so all aspects should be regulated. AI is much different. As described above, it comes in varying flavors with varying risks.
Argument: Proponents of open sourcing AI say that open source facilitates the sharing of science, provides transparency, and is a means to prevent a few from monopolizing a powerful technology.
Rebuttal: This is primarily true, but it is not entirely true. Open source does provide sharing. For an AI model, it is only going to provide some transparency. Finally, whether “open models” will prevent a few from monopolizing their power is debatable. To run a model like ChatGPT at scale, you must compute that only a few companies can acquire it.
Needs of the Many Outweigh the Needs of the Few
In “Star Trek II: The Wrath of Khan,” Spock dies from radiation poisoning. Spock realizes that the ship’s main engines must be repaired to facilitate an escape, but the engine room is flooded with lethal radiation. Despite the danger, Spock enters the radiation-filled chamber to make the necessary repairs. He successfully restores the warp drive, allowing the Enterprise to reach a safe distance. Unfortunately, Vulcans are not immune to radiation. His dying words to Captain Kirk explain the logic behind his actions, “The needs of the many outweigh the needs of the few or the one.”
This is perfectly sound logic and will have to be used to control AI. Specific models pose a risk to the general public. For these models, the general public’s needs outweigh innovators’ rights.
Should All AI Be Open Source?
Let’s review the axioms established thus far:
- Open Source should remain a choice.
- Open models are not as transparent as non-AI software that is open sourced.
- Close Source is a right of the innovator.
- There is no guarantee that big tech will correctly control their AI.
- The needs of the general public must take precedence over all others.
The five bullets above represent everything I tried to make clear about open source, closed source, and regulations. If you believe them to be accurate, the answer to the question, “Should All AI be Open Source?” is no because it will not control AI, nor will a closed source. Furthermore, in a fair world, open source and open models should remain a choice, and close source should remain a right.
We can go one step further and talk about the actions the industry can take as a whole to move toward effective control of AI:
- Determine the types of models that pose a risk to the general public. Because they control information (chatbots) or dangerous resources (automated cars), models with high risk should be regulated.
- Organizations should be encouraged to share their models as fully open models. The open source community will need to step up and either prevent or label models that are only partially shared. The open source community should also put together tests that can be used to rate models.
- Closed models should still be allowed if they do not pose a risk to the general public. Big Tech should develop its controls and tests that it funds and shares. This may be a chance for Big Tech to work closely with the open source community to solve a common problem.
The post Open Source or Closed? The AI Dilemma appeared first on The New Stack.
In a fair world, open source and open models should remain a choice, and close source should remain a right.