Proofpoint is focused on developing smaller, more efficient AI models to power its cybersecurity tools, according to executives at Proofpoint Protect London 2024.
Daniel Rapp, vice president of AI at Proofpoint, outlined the company's concerns on stage at the inaugural conference, saying one of the challenges it faces is how to reduce the size of its models for greater efficiency in certain use cases.
“If I were writing a thesis on English literature, I might want a model for understanding all of Shakespeare's works, but threat actors don't actually quote Hamlet,” Rapp said.
“So what I need is a model that detects misleading language instead of detecting whether or not an email is delivered in blank verse,” Rapp added.
Essentially, Rapp wants to make Proofpoint's models “more computationally effective,” and he outlined some of the key techniques the company can use to achieve this, such as shrinking model size through quantization or distillation. Proofpoint has applied the latter to Nexus, the company's artificial intelligence platform. Distillation involves training a smaller model to “mimic” some of the key capabilities of a larger model for certain use cases.
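For readers unfamiliar with the technique, the sketch below shows the general shape of knowledge distillation in PyTorch: a small “student” network is trained to match the softened output distribution of a larger “teacher” alongside the usual labelled objective. The architectures, loss weighting, and two-class task are illustrative assumptions for this article, not details of Proofpoint's Nexus models.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical teacher (large) and student (small) classifiers over 768-dim embeddings.
teacher = nn.Sequential(nn.Linear(768, 2048), nn.ReLU(), nn.Linear(2048, 2))
student = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 2))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
temperature = 2.0  # softens the teacher's distribution so the student learns its shape

def distill_step(x, labels):
    with torch.no_grad():
        teacher_logits = teacher(x)  # teacher is frozen, only used as a target
    student_logits = student(x)

    # KL divergence between the softened teacher and student distributions
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Standard cross-entropy against the ground-truth labels
    hard_loss = F.cross_entropy(student_logits, labels)

    loss = 0.5 * soft_loss + 0.5 * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage on a random batch of 32 embeddings
x = torch.randn(32, 768)
labels = torch.randint(0, 2, (32,))
print(distill_step(x, labels))
```

The end result is a much smaller model that reproduces the teacher's behaviour on a narrow task, which is the trade-off Rapp describes: less general knowledge, lower compute cost.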
In a separate roundtable with media, Ryan Kalember, executive vice president of cybersecurity strategy at Proofpoint, went into more detail about the advantages of small footprint models. One of those advantages was protection from abuse.
“If you can prompt the model with more things, that introduces risks,” Kalember said in response to a question from ITPro. “The vast majority of attacks we've seen against language models involve having to be able to interact with them directly.”
“So when we look at smaller models that do discrete things, they're much less risky, just for that reason,” he added.
Kalember said that if nothing other than Proofpoint's internal APIs interacts with the models, it is far less likely the models can be poisoned or subjected to any other type of model abuse.
Smaller models are in fashion
Small language models (SLMs) are becoming increasingly popular as enterprises look to reduce the costs associated with training and deploying large language models (LLMs).
OpenAI's release of GPT-4o mini put SLMs in the spotlight recently, and its price of 15 cents per million input tokens and 60 cents per million output tokens makes the lightweight model more than 60% cheaper than GPT-3.5 Turbo.
However, at the time, experts told ITPro there was a “hidden fallacy” in SLMs: Articul8 founder and CEO Arun Subramaniyan said companies would eventually consider them insufficient to “get to production.”
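As a rough illustration of the cost gap cited above, the short sketch below compares a hypothetical monthly workload under the two pricing schedules. The GPT-4o mini figures are OpenAI's launch prices; the GPT-3.5 Turbo rates used for comparison ($0.50 in / $1.50 out per million tokens) are assumed here for illustration.

```python
# USD per million tokens: (input, output)
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-3.5-turbo": (0.50, 1.50),  # assumed comparison rate
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p_in, p_out = PRICES[model]
    return (input_tokens / 1_000_000) * p_in + (output_tokens / 1_000_000) * p_out

# Example: 10M input and 2M output tokens per month
for model in PRICES:
    print(model, round(monthly_cost(model, 10_000_000, 2_000_000), 2))
# gpt-4o-mini comes to $2.70 versus $8.00 for gpt-3.5-turbo under these assumptions,
# roughly a two-thirds saving.
```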