
German physicist Werner Heisenberg, on his deathbed, said, “When I meet God, I am going to ask him two questions: Why relativity? And why turbulence? I really believe he will have an answer for the first.”
It’s a quote commonly misattributed to Albert Einstein. But the point is that scientists get extremely uncomfortable when trying to model equations against systems where they can’t predict the underlying behavior, and turbulence in fluid dynamics is one of the most unpredictable of all mathematical environments.
That’s where we are today with security in the fast-moving waters of AI. The volume and velocity of code generation are creating new back pressures that are beyond the reasoning of Old World models of software security.
What’s going to happen next? This may be the first time anyone’s attempted to explain a software problem by talking about the mental models in fluid dynamics, but as a mechanical engineer, I can’t stop myself from observing the parallels.
How Turbulence Breaks Models
The first important concept in fluid dynamics is laminar flow, which describes fluid moving through a pipe in smooth, orderly layers. The name comes from "lamina," a thin layer: think of laminate flooring or the layers in a croissant.
When water moves in a laminar way, it doesn't mix between layers. The water at the edges of the pipe moves the slowest because it is right against the pipe wall. The layers slide past one another, and a little friction between them slows the flow down.
When you model this out, you get a nice curve: a parabola that describes the flow, and it's easy to predict the relationship between pressure and velocity. The effect of the fluid's thickness, a property known as viscosity, is also easy to account for.
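If you want to see that parabola for yourself, here's a minimal Python sketch of the textbook Hagen-Poiseuille profile for laminar pipe flow. The pipe dimensions and pressure drop below are made-up illustrative numbers, not measurements from any real system.

```python
def laminar_velocity(r, pipe_radius, pressure_drop, viscosity, length):
    """Hagen-Poiseuille profile: velocity (m/s) at radial position r in a round pipe.

    Fastest at the center (r = 0), zero at the wall (r = pipe_radius),
    tracing a parabola across the pipe.
    """
    return (pressure_drop / (4 * viscosity * length)) * (pipe_radius**2 - r**2)

# Illustrative numbers only: a water-like fluid in a 2 cm pipe, 1 m long,
# with a gentle 1 Pa pressure drop so the flow stays laminar.
R, dP, mu, L = 0.01, 1.0, 1.0e-3, 1.0
for r in [0.0, 0.005, 0.009, 0.01]:
    print(f"r = {r:.3f} m -> u = {laminar_velocity(r, R, dP, mu, L):.4f} m/s")
```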
Laminar systems are awesome because they are easy to model and understand. You can predict the behavior of every water particle as it flows through that pipe.
What does this have to do with AI? We’re getting there. Unfortunately, laminar flow doesn’t always happen.
If you start pushing too much water through the pipe, if it moves too quickly, or if the pipe gets too large or the viscosity gets too low, the system breaks down. This happens with a lot of complicated systems: they are easy to model and predict right up until, suddenly, they are not, and the behavior can change rapidly.
When that happens in water, it switches over to something called turbulence. With turbulence, that nice assumption we were making about the water not crossing between layers breaks down, and the particles start crossing between streams. You can't model it cleanly anymore, and you can't explain it.
When scientists try to predict turbulence, they rely on computer simulations and experiments to see exactly what happens. So even though they can't fully explain it, they can come up with equations that model specific scenarios.
Scientists have long tried to derive turbulence from first principles, but they still can't. That doesn't mean we can't predict the behavior; we just can't explain or model it at the level of a single particle. We can still come up with some pretty good linear regression models or other frameworks for determining what will happen to a system with a given pressure input. But the assumptions change completely.
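As a rough sketch of that kind of empirical modeling (the measurements below are synthetic, purely to show the shape of the approach), you can fit a simple regression between flow rate and pressure drop and use it to predict other operating points:

```python
import numpy as np

# Synthetic measurements: flow rate (L/s) vs. observed pressure drop (kPa).
# In a real rig these would come from experiments, not from a formula.
flow = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 12.0])
pressure_drop = np.array([0.4, 1.5, 3.2, 8.6, 21.0, 45.0])

# Turbulent losses roughly follow a power law, dP ~ a * Q**b, which becomes
# a straight line in log-log space -- plain old linear regression.
slope, intercept = np.polyfit(np.log(flow), np.log(pressure_drop), 1)

def predict_pressure_drop(q):
    """Predict the pressure drop (kPa) at flow rate q (L/s) from the fitted model."""
    return np.exp(intercept) * q**slope

print(f"fitted exponent: {slope:.2f}")                        # close to 2 for turbulent flow
print(f"predicted dP at 10 L/s: {predict_pressure_drop(10.0):.1f} kPa")
```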
Software Security Is in Transitional Flow
Today, many developers are using AI tools to help write software faster, but far fewer use AI to review the pull requests and code their team is writing. It's similar to the shift from laminar flow into turbulent flow.
Laminar flow is where most of the models we use to write and secure software still sit today. Developers write code, developers review code, and they rely on that review at the individual pull request and line-of-code level to catch security defects and vulnerabilities.
Open source projects have a set list of maintainers, and they review code from contributors. Vulnerability researchers discover new vulnerabilities all the time. They get assigned unique CVE identifiers, and they’re tracked by security teams in spreadsheets. It’s easy to understand and simple to model.
There's a saying that all models are wrong, but some are useful. To take that a little further: these models and ways of thinking about software development work when you know which regime you're in and how to model that regime correctly.
But if developers are using AI to write more and more code without applying the same techniques to reviewing it, we're now pushing more water through the pipe than it can handle, and the model's assumptions start to break down.
In fluid dynamics, when a system starts to shift from one model to another, it's referred to as transitional flow. This is actually the toughest type of system to model and reason about. The transition doesn't happen at a set point, either. There are some constraints and ratios you can plug in to predict which state you're in, but a lot of engineering work goes into knowing which regime you're operating in: laminar, transitional or turbulent.
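The best-known of those ratios is the Reynolds number. Here's a minimal Python sketch, using the commonly quoted textbook cutoffs for pipe flow and illustrative inputs:

```python
def reynolds_number(velocity, diameter, kinematic_viscosity):
    """Re = v * D / nu: the ratio of inertial to viscous forces in pipe flow."""
    return velocity * diameter / kinematic_viscosity

def flow_regime(re):
    """Classify pipe flow using the commonly quoted textbook thresholds."""
    if re < 2300:
        return "laminar"
    if re < 4000:
        return "transitional"   # the hardest regime to model, as described above
    return "turbulent"

# Illustrative example: water (nu ~ 1e-6 m^2/s) in a 5 cm pipe.
for v in [0.02, 0.06, 0.5]:
    re = reynolds_number(v, 0.05, 1.0e-6)
    print(f"v = {v} m/s -> Re = {re:,.0f} -> {flow_regime(re)}")
```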
AI Is Bursting the Pipe
When a pipe bursts, you can see the exact transition from laminar to turbulent flow. There's a brief moment before the burst when the water is flowing in a nice steady stream, and then boom, the explosion happens.
That’s where we are today with AI security. Not all hope is lost. We can still get good results. We can still secure systems. But we definitely have to start thinking differently about the models, because right now it’s obvious they are being overwhelmed by the new volume and velocity of AI.
With AI, the scale at which software development is going to happen is only going to increase. Developers are writing more and more code using these systems. All code has bugs, and some of those bugs have a security impact. So if devs are writing more and more code, there’s going to be more and more security defects in it.
Security researchers are also getting more productive at finding known classes of bugs as they're introduced into systems. If they're discovering more vulnerabilities than teams can triage and deal with, we're not going to be able to build trust in and rely on those systems anymore.
And the pace at which these AI systems are being pushed into production to process sensitive workloads keeps getting faster. It's really hard to deploy systems securely, and the more sensitive the data they're accessing, the harder it is to do securely, because the stakes are so much higher. Companies are rushing AI into these systems at a much faster pace than anything else I've seen in my career.
When you combine all these factors, we see software security transitioning from something like laminar — where it’s easy to understand, model and predict — to something where we need to get useful results from a turbulent data set, even if we don’t fully understand every particle.
We’re going to need new techniques. We’re going to have to start helping developers review and catch security issues just to keep up with the pace that new software is being written. We’re going to have to help security teams deal with new permutations and vulnerabilities faster than their spreadsheets can keep up, which is how most of this is managed today.
There's no simple math here. We have to change the way we think and come up with probabilistic or statistical approaches. We'll have to take a "pets to cattle" approach, another analogy you hear a lot in infrastructure.
If you have just a couple of systems or computers, you can give each one a name and know everything about it. We do that a lot with vulnerabilities today: each gets a unique identifier, and security researchers remember a lot of them. If that scale grows by 20 times, we're not going to be able to do that anymore.
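What could that look like in practice? Here's a hypothetical sketch in Python (the fields, thresholds and findings below are made up, not any particular scanner's schema): instead of triaging each CVE by name, you triage in aggregate, by component and severity.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Finding:
    cve_id: str        # still recorded, but no longer the unit of triage
    component: str
    severity: float    # e.g., a CVSS-style score from 0 to 10
    fix_available: bool

def triage(findings, severity_floor=7.0):
    """Group findings by component and surface the ones worth human attention.

    Instead of walking a spreadsheet row by row, rank components by how much
    high-severity, fixable risk they carry.
    """
    buckets = defaultdict(list)
    for f in findings:
        buckets[f.component].append(f)

    report = []
    for component, items in buckets.items():
        actionable = [f for f in items if f.severity >= severity_floor and f.fix_available]
        report.append((component, len(items), len(actionable)))
    # Components with the most actionable, high-severity findings float to the top.
    return sorted(report, key=lambda row: row[2], reverse=True)

# Hypothetical scanner output, just to show the shape of the workflow.
findings = [
    Finding("CVE-2024-0001", "libfoo", 9.8, True),
    Finding("CVE-2024-0002", "libfoo", 5.3, True),
    Finding("CVE-2024-0003", "libbar", 7.5, False),
]
for component, total, actionable in triage(findings):
    print(f"{component}: {total} findings, {actionable} actionable now")
```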
AI Software Attacks Are Rising
That transition is already happening, it’s just not evenly distributed yet.
Developers can't keep up with code review anymore because so much of the code comes from people they don't trust or don't know, and it's too hard to catch security issues in everything they're reviewing.
Attackers understand that attacking systems is the perfect use case for a large language model, because the cost of getting something wrong is basically zero: they can just keep trying over and over again, whereas defenders need to be perfect all the time. That change is making security teams uncomfortable in the same way physicists are uncomfortable discussing turbulent systems.
So what do we do here? We need to get comfortable operating when there are too many vulnerabilities to give unique names to, where you can’t rely on humans capturing every security defect in code reviews anymore.
We’ve Survived the Transition to Turbulence Before
This AI gold rush isn’t the first time that the industry has had to take a major new form factor and learn how to trust the security of software components created by other people. We’ve been doing this successfully with open source for decades.
And the infrastructure used to run AI workloads today overlaps almost 1:1 with the infrastructure everyone is already running: it's basically Kubernetes with some CUDA sprinkled in, give or take. Many of us still have battle scars from rolling out secure multitenant Kubernetes for application teams, and this is no more difficult.
Playing with AI toolkits from source reveals an incredibly fast-moving and turbulent security environment. PyTorch builds only work with specific CUDA builds, which all require super-old versions of GNU Compiler Collection (GCC)/Clang to compile against. Packages can take hours to build even on powerful systems.
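If you want to see that coupling first-hand and already have a PyTorch wheel installed, a few lines will tell you which CUDA toolkit your build is pinned to (the exact version strings will differ on your machine):

```python
import torch

# The wheel encodes which CUDA toolkit it was built against (e.g., "2.4.0+cu121").
print("torch version:", torch.__version__)
print("built against CUDA:", torch.version.cuda)   # None for CPU-only builds
print("GPU visible:", torch.cuda.is_available())
print("cuDNN:", torch.backends.cudnn.version() if torch.cuda.is_available() else "n/a")
```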
Also scary: the state of these supply chains, with most links leading to random forum dumps. If a CVE appeared in one of the low-level native libraries, I’m not sure how the industry would react. Most of this stuff is very hard to bootstrap or build from source, so it would be a mess.
These libraries often are processing completely untrusted and unsanitized training data, and the AI space in total feels like we’ve regressed a few years in overall supply chain security.
I’m still hopeful in the long run that defenders can use AI to help deal with this noise at scale and extract better signals. But we learned from decades of establishing trust in the open source domain that you have to start with a secure foundation and build things right from the start.
At Chainguard, we launched an Images AI bundle that includes a comprehensive collection of images for every stage of the AI workload life cycle, from development images to workflow management tools to vector databases for production storage. When you address your AI software supply chain from the base layer up, with software signatures, software bills of materials (SBOMs) and CVE remediation, you are basically widening the pipes so that the burst of AI development ambition stays within security guardrails.
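As one hedged sketch of what building on a verifiable base can look like: check the base image's signature before you build on it. This assumes the cosign CLI is installed, and the image name and signer identity below are placeholders you would swap for whatever your base image's publisher actually documents.

```python
import subprocess

# Placeholder values: swap in the image you actually use and the signing
# identity its publisher documents for keyless verification.
IMAGE = "cgr.dev/chainguard/python:latest"
IDENTITY_REGEXP = r"https://github\.com/chainguard-images/.*"
ISSUER = "https://token.actions.githubusercontent.com"

def verify_base_image():
    """Fail the build early if the base image's signature can't be verified."""
    subprocess.run(
        [
            "cosign", "verify", IMAGE,
            "--certificate-identity-regexp", IDENTITY_REGEXP,
            "--certificate-oidc-issuer", ISSUER,
        ],
        check=True,  # raises CalledProcessError if verification fails
    )

if __name__ == "__main__":
    verify_base_image()
    print("base image signature verified")
```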
Applying old security models to the new pressures of AI is giving enterprises a crash course in the unpredictability of turbulence.