Opaque Systems Extends Confidential Computing to Augmented Language Model Implementations 


Opaque Systems recently unveiled Opaque Gateway, a software offering that broadens the utility of confidential computing to include augmented prompt applications of language models. One of the chief use cases of the gateway technology is to protect the privacy, sovereignty, and security of the enterprise data that organizations frequently use to augment language model prompts.

With Opaque Gateway, users can facilitate Retrieval Augmented Generation (RAG) and other forms of prompt augmentation while ensuring data remains encrypted to and from language models, including Large Language Models (LLMs). These capabilities are a natural extension of the confidential computing tenet that data is encrypted in transit, at rest, and in use.

According to Aaron Fulkerson, Opaque Systems CEO, Opaque Gateway was partly inspired by the fact that “companies want to take the same concept of confidential data and apply it to Generative AI implementations. We have a platform that companies will use to run AI workloads on encrypted datasets that stay encrypted all the way through, including during processing.”

Architecturally, Opaque Gateway is positioned between the data sources that augment a prompt, including any vector database involved, and the particular language model selected to answer it. The gateway administers encryption via a client interface (client-side, not server-side, encryption), so that by the time the prompt and its augmented data, drawn from any variety of enterprise sources, reach the gateway, they are already secured. Customers may also supply their own encryption for this step.
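To make the client-side encryption idea concrete, the minimal sketch below encrypts an augmented prompt before it ever reaches a gateway. It is illustrative only: the payload fields are assumptions, Fernet (from the Python cryptography package) stands in for whatever cipher and key management a real deployment would use, and nothing here reflects Opaque Gateway’s actual API.

```python
# A minimal sketch of client-side encryption of an augmented prompt before it
# reaches a gateway. The payload shape is hypothetical, and Fernet stands in
# for whatever cipher and key management a real deployment would use.
import json
from cryptography.fernet import Fernet

# In practice the key would come from the customer's own key-management system.
key = Fernet.generate_key()
cipher = Fernet(key)

augmented_prompt = {
    "user": "analyst@example.com",
    "prompt": "Summarize Q3 churn drivers.",
    "context": ["<retrieved enterprise document chunks>"],
}

# Encrypt on the client, so the payload is already secured when it hits the gateway.
ciphertext = cipher.encrypt(json.dumps(augmented_prompt).encode("utf-8"))

# Only the gateway (or a workload running in a trusted environment) holding the
# key can decrypt and process the request.
recovered = json.loads(cipher.decrypt(ciphertext).decode("utf-8"))
print(recovered["prompt"])
```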

Once transmissions reach the gateway, it performs a number of critical functions for optimizing implementations. This functionality includes monitoring and reporting, machine learning processing, non-deterministic rule filtering (for access controls at the data level), and securing the model’s response, which is ideal for training language models.

There’s also a complete audit log with a root of trust to verify data sovereignty, data security, and data privacy.

Monitoring and Reporting

Opaque Gateway’s monitoring and reporting features assist language model implementations in several ways. Firstly, they enable administrators, data governance teams, and IT personnel to review what data is going to and from the language models, whether it originates in data warehouses, databases, transactional systems, or other sources.

“If I’m an employee of this company who has admin rights, even though the data’s been encrypted I can look at a report, just like you would from a network firewall, but it’s a data firewall, to see what data’s flowing through my gateway,” Fulkerson explained. Doing so provides strategic benefits by enabling organizations to see what prompts are being issued, what the model’s outputs are, and how to get better results. For example, users might be “asking about specific implementations of a product and… not getting good responses,” Fulkerson commented. “Well, we can fine-tune this, or we can improve the RAG by adding an additional data source.”
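As an illustration of the “data firewall” view Fulkerson describes, a gateway could append every prompt and response it sees to an audit trail that administrators report on later. The sketch below is a simplified assumption of what such logging might look like, not Opaque Gateway’s implementation, and the field names are invented for the example.

```python
# An illustrative sketch of gateway-side monitoring: every prompt and response
# passing through is appended to an audit log that an administrator could later
# report on. Field names are assumptions, not Opaque Gateway's schema.
import json
import time

AUDIT_LOG = "gateway_audit.jsonl"

def audit(event: str, user: str, detail: dict) -> None:
    """Append one audit record per gateway event (prompt in, response out)."""
    record = {"ts": time.time(), "event": event, "user": user, **detail}
    with open(AUDIT_LOG, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

# Example: log a prompt on the way to the model and the response on the way back.
audit("prompt", "analyst@example.com", {"sources": ["crm", "warehouse"], "tokens": 412})
audit("response", "analyst@example.com", {"model": "example-llm", "tokens": 187})
```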

Machine Learning Processing 

Opaque Gateway also involves machine learning to process the augmented data before routing it to a language model. The utility derived from these capabilities is twofold. Firstly, “we use Natural Language Processing and we identify PII, and we can redact or sanitize the PII,” Fulkerson remarked. “We can provide guardrails around sentiment and other activities.”
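A rough sense of what redacting PII before routing might involve is sketched below. It uses simple regular expressions rather than the NLP models Fulkerson describes, and the patterns are illustrative assumptions, far from production-grade.

```python
# A rough sketch of PII redaction before a prompt is routed to a language model.
# Real systems use NLP-based entity detection; the regex patterns here are
# simplified, illustrative stand-ins.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace detected PII with labeled placeholders before the prompt leaves the gateway."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact_pii("Contact Jane at jane.doe@example.com or 555-867-5309."))
```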

According to Fulkerson, the ML capabilities also involve the employment of LLaMA for prompt compression inside the gateway.  This expression of generative AI reduces the number of tokens transmitted to the model, without compromising accuracy, to decrease the cost of language model implementations. “Prompt compression, right now, is a big deal because people’s LLM implementation for internal use cases is really expensive,” Fulkerson added. “But, you’re going to see a rapid lowering in cost on LLM implementations and cost per tokens as [prompt compression] becomes more competitive.”
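The economics behind prompt compression are easy to see with a back-of-the-envelope sketch. The “compressor” below merely drops filler words, a crude stand-in for the LLaMA-based compression inside Opaque Gateway, and the per-token price is an assumed figure for illustration only.

```python
# A naive sketch of why prompt compression lowers cost: fewer tokens sent to the
# model means a smaller bill. The word-dropping "compressor" and the price below
# are illustrative assumptions, not Opaque Gateway's LLaMA-based method.
FILLER = {"the", "a", "an", "of", "that", "which", "to", "in", "is", "are", "and"}
PRICE_PER_1K_TOKENS = 0.01  # assumed price, for illustration only

def naive_compress(prompt: str) -> str:
    """Drop filler words as a crude proxy for model-driven prompt compression."""
    return " ".join(w for w in prompt.split() if w.lower() not in FILLER)

def est_cost(prompt: str) -> float:
    # Treat whitespace-separated words as a rough proxy for tokens.
    return len(prompt.split()) / 1000 * PRICE_PER_1K_TOKENS

prompt = ("Summarize the key points of the attached support tickets that relate "
          "to the rollout of the new billing system in the EMEA region.")
compressed = naive_compress(prompt)
print(f"original: {len(prompt.split())} words, ~${est_cost(prompt):.5f}")
print(f"compressed: {len(compressed.split())} words, ~${est_cost(compressed):.5f}")
```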

Non-Deterministic Rule Filtering

Opaque Gateway also applies what Fulkerson referred to as “AI” to facilitate non-deterministic rules for access controls, user permissions, and group permissions. According to the CEO, many organizations have struggled to implement access controls at the data level for prompt augmentation when the data stems from heterogeneous sources across environments.

“There’s multiple data sources that are doing the [prompt] augmentation and injecting enterprise data into that,” Fulkerson noted. “As soon as it hits the gateway, we’re passing along the notion of who is the user.” That knowledge informs the permissions to reinforce facets of data governance, access controls, and regulatory compliance. Moreover, these non-deterministic rules are applied bi-directionally, both to and from the language model.
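In simplified form, user-aware filtering at a gateway might look like the sketch below: retrieved context is checked against the requesting user’s permissions before it augments the prompt, and the same check can run on the model’s response on the way back. The roles, tags, and rules are assumptions for illustration, not Opaque’s rule engine.

```python
# A simplified sketch of user-aware filtering at the gateway: retrieved context
# is checked against the requesting user's permissions before augmenting the
# prompt. Roles, tags, and the filtering rule are illustrative assumptions.
USER_ROLES = {"analyst@example.com": {"finance"}, "intern@example.com": set()}

documents = [
    {"text": "Q3 revenue detail...", "tags": {"finance"}},
    {"text": "Public product FAQ...", "tags": set()},
]

def allowed(user: str, doc: dict) -> bool:
    """Permit a document only if the user holds every tag it requires."""
    return doc["tags"] <= USER_ROLES.get(user, set())

def filter_context(user: str, docs: list[dict]) -> list[str]:
    return [d["text"] for d in docs if allowed(user, d)]

print(filter_context("analyst@example.com", documents))  # both documents
print(filter_context("intern@example.com", documents))   # public FAQ only
```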

A Catch-On Situation 

Language model implementations requiring augmentation with enterprise data will likely increase in the near future as organizations become better acclimated to vector databases and LLMs. The ability to access these resources in a secure, governed, sovereign manner is critical to ensuring regulatory compliance and the long-term success of these endeavors. Opaque Gateway’s extension of the confidential computing paradigm to accommodate this growing use case could considerably help organizations achieve these key objectives.

About the Author

Jelani Harper is an editorial consultant servicing the information technology market. He specializes in data-driven applications focused on semantic technologies, data governance and analytics.
