Federal Reserve Considering Formal Request for Public Feedback about the Adoption of AI in the Financial Services Sector

It’s great to see the Fed move toward potentially providing clarification and additional guidance on the supervision of AI/ML in financial services. Additional guidance from the Fed and other agencies (OCC, CFPB) would go a long way toward clarifying supervisory expectations when AI/ML models are used. Specific guidance on the following topics is important:

  • Explainability. How model decisions should be explained, e.g. to customers whose loan applications have been denied (i.e. the adverse action notice requirements under the Equal Credit Opportunity Act).
  • Model Risk Management. How model risk management guidance should be updated to reflect the additional complexity of AI/ML models; how to assess model robustness, conceptual soundness, and stability, identify model blind spots, and monitor models on an ongoing basis. This will require additional clarity on how the Fed’s SR-11-7 guidance from 2011 should be updated and interpreted.
  • Fairness. How to perform fairness assessment of AI/ML models.

These are all areas in which there is significant work in the academic and research community that is starting to make it into software tools. Clarity on supervisory expectations will help with protecting customers and setting uniform standards for financial services firms to adhere to.

These perspectives come from Anupam Datta, Co-Founder, President, and Chief Scientist of Truera, a model intelligence platform. He is also Professor of Electrical and Computer Engineering and (by courtesy) Computer Science at Carnegie Mellon University. His research focuses on enabling real-world complex systems to be accountable for their behavior, especially as it pertains to privacy, fairness, and security. Datta elaborates on the above topics:

Explainability

How model decisions should be explained, e.g. to customers whose loan applications have been denied (i.e. the adverse action notice requirements under the Equal Credit Opportunity Act). 

When banks and other financial services companies deny an individual’s credit application, they have to explain to them the reasons for the denial as well as provide guidance on what they could do to get a favorable outcome. These kinds of explanations are difficult to generate for complex machine learning and artificial intelligence models — the type of models that are increasingly being used to make decisions about credit and other high stakes use cases. 

A significant body of research tackles these questions and some amount of convergence is starting to happen around two classes of methods that are also finding their way into software products: 

(1) Feature importance methods help identify the most important features (or data inputs) that drive a model’s decision, e.g. identifying that Jane was denied credit because of her high debt-to-income ratio and her low income. The resulting outputs are sometimes referred to as reason codes. Examples of these methods include work on Quantitative Input Influence from CMU, SHapley Additive exPlanations (SHAP) from the University of Washington, and Integrated Gradients from Google Research.
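To make the idea of a reason code concrete, here is a minimal sketch of one classical attribution approach, the exact Shapley value from cooperative game theory, applied to a made-up scoring function. It is not the implementation used by Quantitative Input Influence, SHAP, or Integrated Gradients. The `toy_score` function, the feature names, and the baseline applicant are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Features outside a coalition are held at their baseline value. The cost
    grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        # Coalition members take the applicant's values; the rest stay at baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical scoring function: inputs are [debt_to_income, income_in_thousands].
def toy_score(z):
    dti, income = z
    return 0.6 - 1.5 * dti + 0.005 * income - 0.004 * dti * income


applicant = [0.55, 38.0]  # assumed applicant: high debt-to-income ratio, low income
baseline = [0.25, 70.0]   # assumed reference applicant

phi = shapley_values(toy_score, applicant, baseline)
for name, contribution in zip(["debt_to_income", "income"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the gap between the applicant's score and the baseline score, so the features with the most negative contributions are natural candidates for reason codes. How the baseline is chosen materially changes the result, which is one reason the accuracy of explanations is a real question.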

(2) Counterfactual explanations provide guidance on how an individual should change their data in order to get a different outcome from a machine learning or artificial intelligence model (e.g., increasing income by $10K a year in order to be approved). These are sometimes called action codes. Examples include Wachter et al., Ustun et al., and Poyiadzi et al.
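As a rough illustration of an action code, the sketch below computes the smallest change to a single feature that moves a linear score up to the approval threshold. This is a deliberately simplified stand-in, not the method of any of the papers cited above, which handle non-linear models, multiple features, and feasibility constraints. The weights, bias, threshold, and applicant values are hypothetical.

```python
def single_feature_counterfactual(weights, bias, x, threshold, feature):
    """Smallest change to x[feature] that lifts a linear score to the threshold.

    Returns 0.0 if the applicant is already approved, and None if the chosen
    feature has zero weight and therefore cannot change the outcome.
    """
    score = sum(w * v for w, v in zip(weights, x)) + bias
    if score >= threshold:
        return 0.0
    w = weights[feature]
    if w == 0:
        return None
    # A negative result means the feature has to decrease (e.g. debt-to-income).
    return (threshold - score) / w


# Hypothetical linear model over [annual_income_usd, debt_to_income].
weights = [0.00004, -2.0]
bias = -1.0
threshold = 0.0  # approve when score >= threshold

applicant = [40_000, 0.5]
delta_income = single_feature_counterfactual(weights, bias, applicant, threshold, feature=0)
delta_dti = single_feature_counterfactual(weights, bias, applicant, threshold, feature=1)
print(f"Raise income by about ${delta_income:,.0f}, or change debt-to-income by {delta_dti:+.2f}")
```

With these assumed numbers, the income change works out to roughly $10,000 a year. A production system would also need to check that the suggested change is achievable for the applicant and that it actually flips the decision of the deployed model.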

There are important open questions about supervisory expectations on these two topics, in particular the detailed requirements for reason codes and action codes, and clarity on them would be useful. One key consideration is the accuracy of explanations: do they correctly identify the reasons for a decision and the actions that would change it? This topic has received attention from researchers and deserves attention in regulatory conversations.

Model Risk Management

Model risk management is a powerful and well-developed function in financial institutions. It provides independent validation of models to ensure that they are robust, well understood, and carefully monitored. The SR-11-7 guidance issued by the Federal Reserve Board in 2011 provides a well-thought-out framework for model risk management.

The introduction of machine learning and artificial intelligence models into use cases such as credit, fraud, and marketing raises new challenges for effective and efficient model risk management. It raises a key question: How should model risk management guidance be updated to reflect the additional complexity of AI/ML models? Specifically, how should model development and validation teams assess model robustness, conceptual soundness, and stability, identify model blind spots, and monitor models on an ongoing basis, so that models with issues are detected early and fixed, and models in production continue to perform well and are adjusted when they start degrading?

While there is considerable research on these topics and various methods are finding their way into software products, exactly how the SR-11-7 guidance from 2011 should be interpreted for AI/ML models remains unclear. This is an area where the Federal Reserve and the OCC could provide additional clarification. 
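One example of the kind of ongoing monitoring check discussed above is the population stability index (PSI), a drift measure commonly used in credit modeling. The sketch below is only an illustration under stated assumptions: the bin edges, the sample scores, and the small floor `eps` that guards against empty bins are all hypothetical, and SR-11-7 does not prescribe this or any other specific metric.

```python
import math


def population_stability_index(expected, actual, edges, eps=1e-6):
    """PSI between a reference sample and a recent sample of scores.

    `edges` are sorted interior bin boundaries. Each bin's share is floored
    at `eps` so that an empty bin does not produce log(0).
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of boundaries strictly below the value.
            counts[sum(v > e for e in edges)] += 1
        return [max(c / len(values), eps) for c in counts]

    e_shares, a_shares = shares(expected), shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Hypothetical model scores at validation time and in a recent production window.
validation_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
production_scores = [0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
edges = [0.25, 0.50, 0.75]

psi = population_stability_index(validation_scores, production_scores, edges)
print(f"PSI = {psi:.3f}")
```

A PSI of zero means the two binned distributions are identical, and larger values indicate more drift. Any alert threshold is a policy choice for the institution and its validators, which is exactly the sort of expectation where supervisory clarification would help.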

Fairness

A substantial body of work shows that machine learning models may exhibit unfair bias against historically disadvantaged groups. In the context of financial services, this could mean that, unless additional steps are taken, a model may systematically favor men over women in making credit decisions if the historical data on which the model is trained has this pattern. This observation raises several questions: How do you measure indicators of disparity between groups? How do you attribute this disparity to the features or training data issues that cause it? How do you mitigate fairness issues with models if they exist? These are all questions on which there has been considerable research. This is a good time for fair lending testing to be revisited in light of the additional complexity of AI/ML models and the shared understanding of this topic in the community.
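For the first of these questions, two widely used indicators of disparity are the difference in approval rates between groups and the ratio of those rates. The sketch below computes both on a tiny made-up sample. The group labels and decisions are hypothetical, and these two indicators are only a starting point, not a supervisory standard or a complete fair lending test.

```python
def approval_rate(decisions):
    """Share of approvals, where each decision is 1 (approved) or 0 (denied)."""
    return sum(decisions) / len(decisions)


def disparity_indicators(decisions, groups, protected, reference):
    """Approval-rate difference and ratio of a protected group vs. a reference group."""
    protected_rate = approval_rate([d for d, g in zip(decisions, groups) if g == protected])
    reference_rate = approval_rate([d for d, g in zip(decisions, groups) if g == reference])
    return {
        "protected_rate": protected_rate,
        "reference_rate": reference_rate,
        "difference": protected_rate - reference_rate,  # 0 means equal rates
        "ratio": protected_rate / reference_rate,       # 1 means equal rates
    }


# Hypothetical model decisions for eight applicants.
groups = ["women", "women", "women", "women", "men", "men", "men", "men"]
decisions = [1, 0, 0, 1, 1, 1, 1, 0]

result = disparity_indicators(decisions, groups, protected="women", reference="men")
for key, val in result.items():
    print(f"{key}: {val:.3f}")
```

Measuring a gap like this does not by itself explain it. Attributing the disparity to specific features or to training data, and deciding how to mitigate it, are the harder follow-up questions raised above.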
