Enhancing Predictive Cognitive Computing Models with Traditional Data Modeling


With so much attention devoted to the purported wonders of predictive cognitive computing models (typically characterized by classic machine learning and deep learning), it’s easy to lose sight of the conventional data models underpinning these applications.

The latter variety of data modeling is typically part of the data engineering process for aligning datasets in specific settings, such as databases, data science tools for data preparation, and production environments for deploying cognitive predictive models.

With the sundry data going into machine learning training datasets and the array of mathematical outputs computed by their adaptive predictive models, it's imperative to understand how this data relates to predictive model objectives, business objectives, and the feedback loop through which machine learning technologies enrich operational use cases.

Standardized data models (called ontologies) have consistently proven one of the most expressive ways of describing each of these objectives and the data requirements supporting them. According to TopQuadrant CTO Ralph Hodgson, they improve machine learning’s effectiveness in two ways.

“If we are addressing the neural network world where you can have training sets and results sets, there is a strong role that ontologies can play in making sense of the results sets and training sets,” Hodgson commented. “If we’re talking about machine learning as a predictive propensity or trained analytics, those are more mathematically based. You can bring into play ontologies as a way to make sense of what it is that’s being mapped.”

In both of these use cases, conventional data modeling strengthens the effectiveness of predictive cognitive models to maximize machine learning’s enterprise worth.

Ontological Modeling

The ontological foundation supporting cognitive predictive models is broad. Ontologies naturally evolve to include any data type or variation. They harmonize diverse data for singular deployments, such as machine learning applications for individualized marketing and sales opportunities. Moreover, ontologies encompass each dimension of the data modeling process, from the most rudimentary details to the most wide-sweeping ones. At a basic level, these models "are simply schemas that are very useful to people," Hodgson acknowledged. "If they do entity relationship models they can easily relate to those kind of things."

However, their greater value proposition for the myriad data attributes pertaining to machine learning training and results datasets is in incorporating an almost limitless supply of intimate facts about such data. Ontologies excel at modeling “complex properties and axioms,” Hodgson mentioned. “They represent the knowledge of a domain in great detail.” When that domain is focused on data for machine learning, organizations get a clear understanding of how their machine learning data is impacting its business goals.
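To make the idea concrete, the sketch below models a fragment of what an ontology captures: classes, a subclass axiom, and properties with domains and ranges. It is plain Python for illustration only, not a real ontology language such as OWL, and every name in it (Customer, RetailCustomer, and so on) is hypothetical.

```python
# Hypothetical subclass axiom: every RetailCustomer is a Customer.
SUBCLASS_OF = {
    "RetailCustomer": "Customer",
}

# Properties declared with a domain (which class they describe) and a
# range (the type of value they take) -- the kind of detailed knowledge
# about training-data fields that an ontology records.
PROPERTIES = {
    "hasEmail":      {"domain": "Customer",       "range": "string"},
    "lifetimeValue": {"domain": "Customer",       "range": "decimal"},
    "loyaltyTier":   {"domain": "RetailCustomer", "range": "string"},
}

def ancestors(cls):
    """All classes a given class inherits from, following subclass axioms."""
    seen = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        seen.append(cls)
    return seen

def properties_for(cls):
    """Properties applicable to a class, including those inherited
    through subclass axioms."""
    applicable = {cls, *ancestors(cls)}
    return sorted(p for p, meta in PROPERTIES.items()
                  if meta["domain"] in applicable)

# A RetailCustomer inherits Customer's properties via the subclass axiom.
print(properties_for("RetailCustomer"))
```

Even this toy version shows why such models help: given any field in a training dataset, the ontology can answer what entity it describes, what type it should hold, and how that entity relates to others.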

Neural Networks

When applied to highly layered neural networks (particularly those involving scores of parameters and hyper-parameters), ontologies substantially impact the inputs and outputs of these predictive models. These standardized data models initially affect the model building process in relation to the "training [data] sets, and give you a way of nominating those training sets," Hodgson remarked. "This is typically the case with what are called Convolutional Neural Networks and Recurrent Neural Networks."

When using these types of neural networks or others, there's an indispensable feedback mechanism that's critical to the model's ability to learn and improve its predictive accuracy, such as when underpinning natural language technology use cases for text analytics. According to Hodgson, "There's always a feedback involved. So you can have an ontology making sense of the feedback and you can have an ontology that's making sense of the results."

Data Quality

When deployed on less complicated, more straightforward mathematical approaches to machine learning, ontologies are an immensely helpful means of reinforcing data quality for training datasets. With this use case, data are mapped to ontologies relevant to the machine learning algorithms invoked, which “helps sort out what the quality of the data is before you get busy with those algorithms,” Hodgson explained. One of the pivotal ways ontologies support this functionality is via standard vocabularies and taxonomies that are used to describe the various data aspects modeled within them.

Data validation measures such as SHACL only reinforce the capability of ontologies to appropriately vet the data prior to leveraging it for machine learning algorithms. “The data that feeds these things has to be validated in terms of how it represents the domain of interest properly,” Hodgson observed. “There’s a strong role of ontologies, particularly with the SHACL validation going on, to make sure the data is good. That is a very clear role.”
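The sketch below conveys the idea behind this kind of validation in stdlib-only Python: declarative constraints ("shapes") are checked against records before the data reaches a training pipeline. A real deployment would instead run actual SHACL shapes against RDF data with a SHACL processor; the field names and rules here are hypothetical.

```python
# A hypothetical shape for customer records: required fields, datatypes,
# value ranges, and a controlled vocabulary -- simplified analogues of
# SHACL constraint components.
CUSTOMER_SHAPE = {
    "email":   {"required": True,  "datatype": str},
    "age":     {"required": False, "datatype": int, "min": 0, "max": 120},
    "segment": {"required": True,  "datatype": str,
                "in": {"retail", "wholesale", "online"}},
}

def validate(record, shape):
    """Return a list of constraint violations; an empty list means the
    record conforms and may proceed to the training dataset."""
    violations = []
    for field, rules in shape.items():
        if field not in record:
            if rules.get("required"):
                violations.append(f"{field}: missing required value")
            continue
        value = record[field]
        if not isinstance(value, rules["datatype"]):
            violations.append(f"{field}: wrong datatype")
            continue
        if "min" in rules and value < rules["min"]:
            violations.append(f"{field}: below minimum {rules['min']}")
        if "max" in rules and value > rules["max"]:
            violations.append(f"{field}: above maximum {rules['max']}")
        if "in" in rules and value not in rules["in"]:
            violations.append(f"{field}: not in controlled vocabulary")
    return violations

good = {"email": "a@example.com", "age": 34, "segment": "retail"}
bad = {"age": 150, "segment": "walk-in"}
print(validate(good, CUSTOMER_SHAPE))   # conforming: []
print(validate(bad, CUSTOMER_SHAPE))    # three violations reported
```

Filtering or flagging non-conforming records this way, before any algorithm runs, is the "make sure the data is good" role Hodgson describes.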

Supervising Machine Learning

Whether used in conjunction with deep neural networks or more basic machine learning approaches, ontologies inform both the inputs and outputs of the data involved. They're instrumental in the data selection process and in ensuring data are suitable for cognitive computing applications. In addition to aligning data, however varied, into a homogeneous form, these uniform data models are influential in "guiding machine learning, assuring that machine learning focuses on the right terms, and capturing the results of those learnings in such a way that people can examine them," noted TopQuadrant CEO Irene Polikoff. "I know that there is quite a bit of concern about the black box approach which is often just pure machine learning. So, augmenting it with some guidance from rules becomes very important in that context."

By serving as a medium to inject business logic, rules, and expectations via the representation of domain knowledge apposite to any particular deployment, ontologies are able to maximize the effect of conventional data modeling on predictive cognitive computing models. Specifically, “ontologies are always in this kind of supervisory role, if you like, in situations like this,” Hodgson disclosed. “They’re not directly executing the machine learning; they’re not actually machine learning engines in their own right. They’re actually in a supervisory capacity.”

As such, they help ensure this technology is functioning as it’s designed to, and that the results of its algorithms are achieving the business aims for which they were implemented.

About the Author

Jelani Harper is an editorial consultant servicing the information technology market. He specializes in data-driven applications focused on semantic technologies, data governance and analytics.
