OWASP has spawned a Top Ten list for generative artificial intelligence (AI).
Generative AI is aka the Large Language Model Applications — such as ChatGPT— whose own makers are worried that their creations are going to wipe the unsightly scourge known as “you and me” from the planet.
Contrast Security CISO David Lindner’s response to the OWASP Top Ten list: “This will be fun to follow!”
Lindner plucked the OWASP news from the news-o-sphere in his May 26 CISO Insights column. He’s far from the only one who’s either fretting and/or eyeball-rolley at the hype. The big tech companies — Alphabet, Apple, Microsoft, Nvidia, et al., are drooling over the prospects of how they can use generative AI in countless applications. Europe is, as usual, out in front: On Wednesday, June 14, the European Parliament passed landmark legislation to regulate AI.
CISOs, likewise, are abuzz. Lindner, who participates in a CISO roundtable every quarter, was at one such roundtable last month when OWASP laid out its list. The No, 1 topic at the roundtable was, in fact, AI.
“I mean, AI? it’s hot,” he says. “There are people across the spectrum who think it's the greatest thing, that it will alter the tech space like the web once did. And you can see that that could be the case.”
Sure, eventually, but the technology isn’t there at this point, Lindner cautions. “I think it's a little immature yet,” the CISO says, though he believes it will quickly get to the level that people and businesses are dreaming of — particularly when it comes to automating what can be tedious work on the part of technology workers who spend precious time helping customers by, for example, programming a chatbot to help those customers.
Potential problems
OWASP says that its aim with the list is “to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing Large Language Models (LLMs)” As such, the list includes the top 10 most critical vulnerabilities often seen in LLM applications, highlighting their potential impact, ease of exploitation and prevalence in real-world applications. The goal is to raise awareness of the vulnerabilities, to suggest how they can be remediated and, ultimately, to beef up security for LLM applications.
Examples of vulnerabilities include prompt injections, data leakage, inadequate sandboxing, and unauthorized code execution, among others. The goal is to raise awareness of these vulnerabilities, suggest remediation strategies and ultimately to improve the security posture of LLM applications.
Lindner says that from a security perspective, we don’t yet have solutions for the potential problems LLM apps will bring, which include prompt injections, data leakage, inadequate sandboxing and unauthorized code execution, among others.
“It's going to be interesting to follow the OWASP Top 10 list, because we're going to come up against these 10 things, and I'm guessing half of them we can't solve yet,” he predicts.
Predictions on the No. 1 LLM vulnerability
Lindner’s take on the top (currently) unsolvables:
Prompt Injections. OWASP’s definition:“Bypassing filters or manipulating the LLM using carefully crafted prompts that make the model ignore previous instructions or perform unintended actions.”
Insufficient Access Controls. OWASP’s definition: “Not properly implementing access controls or authentication, allowing unauthorized users to interact with the LLM and potentially exploit vulnerabilities.”
Both of those are “big, big, big potential problem[s],” Lindner says. His rationale is that if you think about the way these big systems work, “It’s like a really, really, really intelligent Google search, but trained on old data sets and having the reasoning capability of about a second grader.”
He points to the noted tendency of generative AI applications to hallucinate, which is when LLM apps spout nonsense and present it as facts. Examples include Google’s February promotional video for its Bard chatbot — in which it flubbed facts about the James Webb Space Telescope — or the recent incident of ChatGPT spitting out fictional legal cases and arguments in a New York federal court filing, replete with bogus quotes and bogus internal citations.
The problems
The problem is that LLMs are doing something similar to what Google does: namely, crowd-sourcing.
“From an engineering perspective, when we’re writing code, and we're doing things like making a SQL query, such as to the back end to get some data, we want to know that a specific person logged in,” Lindner explains. “Now, I need to query to get her account information. I know what the response should be or look like, or the format of it, at least. I know exactly what I'm trying to get back. So I create this query, and it's going to return the same stuff every time.”
But that’s not how LLM works, he continues. Instead of writing that highly structured SQL query, a user instead enters a prompt. The LLM app then returns what’s known as “completion.” The thing is, the completion could be different every time, even for the same prompt.
“Part of the problem is refining the prompt to the point where you know what you're going to get back, or can at least have some level of expectation of what you're going to get back,” Lindner says.
And that's exactly where the focus will be for engineers and developers as they continue to work with LLMs for many of the types of services and tooling that they’re accustomed to using with software, the CISO emphasizes. “We have to be specific, and it's going to require a lot of training for a lot of these LLMs to get to the point where we understand what we're asking and what we're going to get back.”
Prompt injections are going to be an even bigger problem because on the front end, prompts involve language, along with every nuance of how you can phrase a query. What if the prompt were to include a character the LLM isn’t trained on, such as a Hebrew character? Would the app pop a gasket?
“I've seen three-, four-, five-word prompts where [the user] just keeps asking the same [thing], and the answers are different every time,” the CISO notes. “We have to think about [the LLM apps] like they’re a human. When we're asked a question we might respond differently. Like, if one of my kids asked me something, and I said, No,’ and literally, three minutes later they asked me the same thing, I'd probably be frustrated and answer differently.”
Data poisoning
How can we differentiate between a hallucinating, absent-minded LLM vs. one that’s been purposefully trained, with malicious intent, on tainted data? In fact, computer scientists in February demonstrated that “data poisoning” — in which deep-learning training data is intentionally polluted with garbage information — is theoretically possible.
“For just $60 USD, we could have poisoned 0.01% of the LAION-400M or COYO-700M datasets in 2022,” they write, referring to popular data sets. From a security perspective, such poisoning attacks would enable malicious actors to embed a backdoor in a model so as to control its behavior after training, according to Florian Tramèr, assistant professor at ETH Zurich and one of the paper’s coauthors.
“The large machine-learning models that are being trained today — like ChatGPT, Stable Diffusion, or Midjourney — need so much data to [train], that the current process of collecting data for these models is just to scrape a huge part of the Internet,” Tramèr told IEEE Spectrum — a fact that makes it extremely difficult to maintain quality control.
Maintaining control over sensitive data
Many organizations are looking at LLMs to enable chatbots, and you can see the appeal: Customers are comfortable with using them, and they would spare engineers having to fumble through finding the answer to a specific question about some project from years back.
It sounds great at first blush, Lindner acknowledges, but the amount of sensitive information that could get pulled in by trawling is unnerving: legal documents, nonpublic intellectual property, sensitive information that could prove valuable to would-be attackers and more.
That’s where the access question comes into play, he says. “How do you control the access?” he ponders. “And the answer is, ‘You can't.’ So then, do we just go back to just loading the stuff that's public? Well, then, it starts to lose what it was created for. Let's just say, for instance, some organizations decide to throw all of its customer data in the LLM. To train it. How do you control access to that? It's almost like you have to put a front end in front of your prompts that are going to the back end, and then not only validating it, going in, but validating it, coming back out. Because, remember, it responds differently every time.”
That's going to be “the real money machine,” Lindner concludes. “And I think we aren't quite there yet.”