The Importance of Opening the AI Black Box in 2019

Today we continue the insideBIGDATA Executive Round Up, our annual feature showcasing the insights of thought leaders on the state of the big data industry, and where it is headed. In today’s discussion, our panel of experienced big data executives – Ayush Parashar, Co-founder and Vice President of Engineering for Unifi Software, Robert Lee, Vice President & Chief Architect, Pure Storage, Inc., and Oliver Schabenberger, COO & CTO at SAS – examine the importance of opening the AI “black box” and how “explainability” will be hot in 2019.

The conversation is moderated by Daniel D. Gutierrez, Managing Editor & Resident Data Scientist of insideBIGDATA.

insideBIGDATA: AI and its adaptability come with a significant barrier to deployment, particularly in regulated industries like drug discovery: “explainability” as to how AI reached a decision and produced its predictions. How will 2019 mark a new era in coming up with solutions to this very real problem?

Ayush Parashar, Co-founder and Vice President of Engineering for Unifi Software

Ayush Parashar: For AI-generated answers to be broadly trusted and adopted, the AI decision-making process must be transparent and presented in a human-friendly fashion. In other words, we’ll need to visibly show how AI algorithms arrived at a conclusion in a way that an expert in the relevant area can easily understand.

We see that level of transparency emerging in data analytics already, where we can see data lineage and a full audit trail from the origin of a data source through any manipulations of that data as it’s joined with other data and served up as an insight. In 2019 and beyond, I expect we’ll see even more transparency in the AI powering analytics, and in more forms, including the broader use of visualizations to instantly show the path by which decisions are reached when making diagnoses for patients in the healthcare industry. As companies embrace a culture of self-service data use, transparency around data’s origin to determine its trustworthiness will play an even greater role in ‘explainability’.

Robert Lee, Vice President & Chief Architect for Pure Storage

Robert Lee: The importance of data governance and provenance has never been as front and center as it is today. Regulations around data privacy, data sharing, and third-party use of data have been thrust into the limelight by recent events, and there is real demand, from a data management perspective alone, for better solutions for tracking data access and usage. These issues are magnified considerably with machine learning because the correctness, accuracy, and biases of a model are derived from the data used to train it, not from an explicit algorithm. The only way to explain and defend a model for completeness and against bias is to maintain provenance for the data used to train it: to show that a self-driving car was in fact trained on a sufficient set of data from low-light conditions; to show that a facial recognition model was trained on a sufficiently diverse set of races and ethnicities. Solving this problem is rooted in maintaining and archiving test data alongside rich enough metadata and data management catalogs to be able to recall and access those data as needed.
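As a rough illustration of the kind of provenance record described here, the sketch below fingerprints a set of training records and stores per-condition coverage alongside the model name. Everything in it is an assumption made for illustration: the record layout, the `lighting` tag, the model name `lane_detector_v1`, and the helper names `fingerprint` and `provenance_entry` are hypothetical and do not come from any particular catalog product.

```python
import hashlib
import json
from collections import Counter


def fingerprint(records):
    """Stable SHA-256 digest of the records, so the exact data set can be identified later."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def provenance_entry(model_name, records, tag_field):
    """Build a catalog entry linking a model to its data digest and per-tag coverage."""
    counts = Counter(r[tag_field] for r in records)
    total = len(records)
    return {
        "model": model_name,
        "data_sha256": fingerprint(records),
        "n_records": total,
        "coverage": {tag: n / total for tag, n in counts.items()},
    }


# Hypothetical training records for a driving model, tagged by lighting condition.
records = [
    {"id": 1, "lighting": "daylight"},
    {"id": 2, "lighting": "low_light"},
    {"id": 3, "lighting": "daylight"},
    {"id": 4, "lighting": "low_light"},
]

entry = provenance_entry("lane_detector_v1", records, "lighting")
print(json.dumps(entry, indent=2))
```

With an entry like this archived next to the data, a question such as "how much low-light data was this model trained on?" can be answered from the catalog, and the digest ties the answer to one specific data set.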

Oliver Schabenberger, COO & CTO at SAS

Oliver Schabenberger: Explainability of AI is part of a larger effort toward fair, accountable and transparent AI systems. The issues about algorithmic decision making are not new, but the conversation has ramped up in recent years. AI is bringing automated decisioning to new domains such as medical diagnostics and autonomous driving, and is building systems that are more complex, less transparent and highly scalable. That combination makes us uneasy.

2019 will be the year that transparency in AI comes front and center. We need to shine a light on the black box. Right now, it is way too dark.

Explainability and transparency are not achieved by handing over thousands of lines of computer code or by listing the millions of parameters of an artificial neural network. The inputs and outputs of an AI system must be communicated in an easily consumable form. What type of information is the system relying on, and how does that affect its predictions and decisions?

Methods to explain the impact of features on predictions already exist; these enable us to shine a light on the workings of the black box by manipulating it.
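One widely used method of this kind is permutation feature importance: shuffle one input feature at a time and measure how much the model's error grows. The sketch below is a minimal pure-Python version, not a reference to any specific product or to the speaker's own tooling. The `black_box` model, the synthetic data, and the function name `permutation_importance` are assumptions made purely for illustration.

```python
import random


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in mean-squared error when each feature column is shuffled."""
    rng = random.Random(seed)

    def mse(data):
        return sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(targets)

    baseline = mse(rows)
    importances = []
    for j in range(len(rows[0])):
        increase = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increase += mse(shuffled) - baseline
        importances.append(increase / n_repeats)
    return importances


# Hypothetical black box: leans heavily on feature 0, slightly on feature 1,
# and ignores feature 2 entirely.
def black_box(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [black_box(r) for r in rows]

scores = permutation_importance(black_box, rows, targets)
for name, score in zip(["feature_0", "feature_1", "feature_2"], scores):
    print(f"{name}: {score:.3f}")
```

The procedure only calls the model's prediction function, so it needs no access to the model's internals. A feature the model ignores scores zero, while a feature it depends on heavily scores high.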

Automation and autonomy are not the same thing. Monitoring the performance of systems built on algorithms is key, and it is the responsibility of both those developing and those deploying the algorithms.

We have lots of algorithms that we understand pretty well in isolation, but we are not quite sure what happens when we put hundreds or even thousands of these algorithms together.

One of AI’s most important and useful features is its ability to make connections and inferences that are not obvious or may even be counter-intuitive. As AI takes on increasingly important and diverse tasks, data scientists need to be able to explain clearly and simply what their models are doing and why. This will build confidence and trust in AI and the decisions it supports.

 
