Understanding 4 Concepts for Avoiding Bias in AI-enabled Fraud Detection


In this special guest feature, Ilya Gerner, Director of Compliance Strategy for GCOM, explains why bias can be an issue when using artificial intelligence (AI) for fraud detection. By understanding key concepts of machine learning (ML), organizations can ensure greater equity in AI outputs. Ilya has more than ten years of experience in advanced analytics, leading teams in the development of fraud detection algorithms, building decision-support tools, and conducting statistical analysis. Since 2020, he has supported the Internal Revenue Service in its Identity Theft Strategy initiative, leading efforts to provide data analytic capabilities to the Security Summit and the Information Sharing and Analysis Center (ISAC) and conducting strategic risk analysis to identify gaps in identity theft protection.

Fraud can be a big problem for government agencies that deliver benefits to the public. As one example, the proportion of unemployment benefits improperly paid out by states can exceed 40%.

Artificial intelligence (AI) can help. AI can pore through reams of data to uncover potential fraud – and do it far more quickly and accurately than humans can. So, it’s no surprise that more agencies are turning to AI to help identify fraud and reduce fraudulent payouts.

But AI has a known potential to introduce bias. For instance, the Lensa AI image generator was recently found to deliver renderings that altered people’s appearance in ways that could be considered biased based on gender and race.

Bias can enter machine learning (ML) models in multiple ways. One way is through historical data. If you train a model based on a dataset that itself contains bias, that bias will be baked into the model. 

Another way is through the introduction of proxy data. Imagine that you’re looking for evidence of fraud in tax return filings, for example. A model that omits the age of the person filing the return but includes the total number of tax returns the person has previously filed could still result in disparate age-based impacts, because the number of tax returns filed in a lifetime could be a rough proxy for age.
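One simple way to spot a proxy is to check how strongly a candidate feature tracks the sensitive attribute. The sketch below computes a Pearson correlation between age and the number of returns filed. The filer records are invented purely for illustration, and a high correlation is only a warning sign, not proof of disparate impact.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# ASSUMPTION: hypothetical filers as (age, returns filed so far); values are invented.
filers = [(22, 3), (27, 8), (31, 10), (38, 19), (44, 22), (52, 33), (59, 36), (66, 47)]

ages = [age for age, _ in filers]
returns_filed = [count for _, count in filers]

r = pearson(ages, returns_filed)
print(f"correlation between age and returns filed: {r:.2f}")
```

A value close to 1 suggests that a model given the number of returns filed has effectively been given age as well.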

Unfairness is of particular concern for governments, which deal in datasets that include legally protected attributes such as age, gender, and race. Agencies want to avoid both disparate treatment – applying decisions to demographic groups in dissimilar ways – and disparate impacts – harming or benefiting demographic groups in dissimilar ways.

But there are strategies for avoiding bias and inequity in AI-driven fraud detection. By better understanding how ML models function, organizations can help ensure fairness in AI.

ML Concepts for Blunting Bias

Let’s look at four key approaches to mitigating unfairness in ML algorithms – Unawareness, Demographic Parity, Equalized Odds, and Predictive Rate Parity – and how they might play out in a hypothetical but realistic scenario.

State tax agencies make every effort to collect overdue taxes, but resource constraints mean that they might not be able to resolve every case. So, agencies prioritize cases that will result in a high amount of money collected but at a low cost.

Let’s say a state tax agency wants to identify 50 taxpayers to receive a mailed notification that their taxes are past due. But it wants to avoid contacting taxpayers who are likely to consume resources by calling the agency’s customer service center after they receive the notification.

The agency knows from historical data that taxpayers over age 45 are more likely to call the customer service center. That means age, a sensitive attribute, is part of the picture. This will have different implications, depending on which strategy for mitigating unfairness is applied:

Mitigation Approach 1: Unawareness. A model using this concept omits sensitive attributes such as age. But it doesn’t account for proxies of such sensitive attributes.

In our hypothetical example, the tax agency’s ML model applies the Unawareness concept to select taxpayers based on their frequency of calls to the contact center. It doesn’t directly use age as an attribute, but because age correlates with phone usage, it will favor younger taxpayers. As a result, the model selects 35 taxpayers under age 45, and 15 taxpayers over age 45. The outcome is that 10 of the taxpayers end up calling the contact center – not a bad result, but perhaps not ideal.
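To see how an age-blind rule can still skew by age, consider a minimal sketch that ranks taxpayers only by past calls. The eight taxpayer records and the function name `select_unaware` are hypothetical, chosen so the effect is easy to see; they are not the agency's data or model.

```python
# ASSUMPTION: hypothetical taxpayers as (taxpayer_id, age, past_calls); values are invented.
taxpayers = [
    ("T1", 29, 0), ("T2", 34, 1), ("T3", 41, 0), ("T4", 38, 2),
    ("T5", 52, 3), ("T6", 61, 1), ("T7", 47, 4), ("T8", 70, 5),
]

def select_unaware(records, k):
    """Pick the k taxpayers with the fewest past calls; age is never read."""
    ranked = sorted(records, key=lambda rec: rec[2])
    return ranked[:k]

selected = select_unaware(taxpayers, k=4)

# Age is used only afterwards, to audit the result.
under_45 = sum(1 for _, age, _ in selected if age < 45)
over_45 = len(selected) - under_45
print(f"selected under 45: {under_45}, over 45: {over_45}")
```

Even though the selection function never looks at age, three of the four taxpayers it picks from this toy pool are under 45, because past calls track age.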

Mitigation Approach 2: Demographic Parity. With this concept, the probability that the model predicts a specific outcome is the same for individuals who differ in a sensitive attribute.

Applying the Demographic Parity concept, the tax agency’s ML model directly uses age to ensure equal distribution of taxpayers above and below age 45. As a result, the model selects 25 taxpayers under age 45, and 25 over age 45. The outcome is that 14 taxpayers call the contact center – a less favorable result than with the Unawareness concept.
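Demographic Parity is usually checked by comparing selection rates across groups. The sketch below does this for the Unawareness and Demographic Parity allocations described so far. It assumes 100 eligible taxpayers in each age group, a figure the scenario does not state, and the helper `parity_gap` is a hypothetical name.

```python
# ASSUMPTION: 100 eligible taxpayers per age group (not stated in the scenario).
POOL = {"under_45": 100, "over_45": 100}

# Selected counts per group, taken from the scenario text.
allocations = {
    "Unawareness": {"under_45": 35, "over_45": 15},
    "Demographic Parity": {"under_45": 25, "over_45": 25},
}

def parity_gap(selected_by_group, pool_by_group):
    """Largest difference in selection rate between any two groups."""
    rates = [selected_by_group[g] / pool_by_group[g] for g in pool_by_group]
    return max(rates) - min(rates)

gaps = {name: parity_gap(sel, POOL) for name, sel in allocations.items()}
for name, gap in gaps.items():
    print(f"{name}: selection-rate gap = {gap:.2f}")
```

A gap of zero means both age groups are selected at the same rate. With unequal pool sizes, equal counts would no longer imply equal rates.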

Mitigation Approach 3: Equalized Odds. With Equalized Odds, if two individuals with different sensitive attributes have the same actual outcome, the probability that the model selects either individual is the same.

Applying this concept, the agency’s ML model uses age to ensure that true-positive and false-positive rates are the same for taxpayers above and below age 45. As a result, it selects 30 taxpayers under age 45, and 20 over age 45. In this case, eight taxpayers call the customer service center – the best outcome so far.
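An Equalized Odds check compares true-positive and false-positive rates by group. In the sketch below, only the 30 and 20 selected taxpayers and the eight callers come from the scenario. The pool sizes and the 3/5 split of callers between age groups are assumptions invented so the arithmetic works out cleanly.

```python
# ASSUMPTION: pool sizes and the split of the eight callers are hypothetical.
groups = {
    "under_45": {"pool_callers": 30, "pool_noncallers": 135,
                 "selected_callers": 3, "selected_noncallers": 27},
    "over_45": {"pool_callers": 50, "pool_noncallers": 75,
                "selected_callers": 5, "selected_noncallers": 15},
}

def odds(counts):
    """Return (true-positive rate, false-positive rate) for one group.

    A positive prediction means the taxpayer is selected for a notice;
    it is a true positive if that taxpayer does not call.
    """
    tpr = counts["selected_noncallers"] / counts["pool_noncallers"]
    fpr = counts["selected_callers"] / counts["pool_callers"]
    return tpr, fpr

results = {name: odds(counts) for name, counts in groups.items()}
for name, (tpr, fpr) in results.items():
    print(f"{name}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Under these assumed counts both groups share the same true-positive and false-positive rates, even though the number of taxpayers selected from each group differs.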

Mitigation Approach 4: Predictive Rate Parity. With this concept, if the model predicts a specific outcome for two individuals with different sensitive attributes, the probability that the prediction turns out to be correct is the same for either individual.

Applying this concept, the agency’s ML model uses age to ensure that a selected taxpayer is equally likely to call the contact center whether that taxpayer is under or over age 45. As a result, it selects 40 taxpayers under age 45 and 10 over age 45. Eight taxpayers call the customer service center – the same outcome as with Equalized Odds.
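Predictive Rate Parity is commonly checked by comparing precision across groups: of the taxpayers the model selects, what share turn out not to call? The sketch below uses a small set of toy records that are invented for illustration and do not reproduce the counts in the scenario.

```python
from collections import defaultdict

# ASSUMPTION: toy records as (age_group, selected, called); not the scenario's figures.
records = (
    [("under_45", True, False)] * 8 + [("under_45", True, True)] * 2
    + [("over_45", True, False)] * 4 + [("over_45", True, True)] * 1
    + [("under_45", False, True)] * 3 + [("over_45", False, False)] * 2
)

def precision_by_group(rows):
    """Share of selected taxpayers in each group who did not call."""
    selected = defaultdict(int)
    correct = defaultdict(int)
    for group, was_selected, called in rows:
        if was_selected:
            selected[group] += 1
            if not called:
                correct[group] += 1
    return {group: correct[group] / selected[group] for group in selected}

precision = precision_by_group(records)
for group, value in precision.items():
    print(f"{group}: precision = {value:.2f}")
```

Unselected taxpayers are ignored here, which is the key difference from Equalized Odds: this check conditions on the model's decision rather than on the actual outcome.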

To summarize the results of this hypothetical situation, two of the models achieve a more desirable outcome, but they rely on sensitive data. The model that doesn’t rely on sensitive data achieves a fairly desirable outcome, but the data it uses is a proxy for sensitive data.
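The four outcomes can be lined up side by side with a few lines of code. The counts below are the ones given in the scenario; the `summarize` helper and the two summary measures are just one illustrative way to compare them.

```python
# Counts from the scenario: (selected under 45, selected over 45, callers).
scenarios = {
    "Unawareness": (35, 15, 10),
    "Demographic Parity": (25, 25, 14),
    "Equalized Odds": (30, 20, 8),
    "Predictive Rate Parity": (40, 10, 8),
}

def summarize(under, over, callers):
    """Share of notified taxpayers who call, and share of notices sent to over-45s."""
    total = under + over
    return {"call_rate": callers / total, "share_over_45": over / total}

summary = {name: summarize(*counts) for name, counts in scenarios.items()}
fewest_calls = min(s["call_rate"] for s in summary.values())
best = sorted(name for name, s in summary.items() if s["call_rate"] == fewest_calls)

for name, s in summary.items():
    print(f"{name:<24} call rate {s['call_rate']:.0%}  over-45 share {s['share_over_45']:.0%}")
print("fewest calls:", ", ".join(best))
```

Laid out this way, the tension is visible: the allocation that splits notices evenly by age produces the most calls, while the two with the fewest calls split notices unevenly.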

Balancing Accuracy and Fairness

One challenge for ML modelers is that the four concepts for mitigating unfairness are generally mutually exclusive: a model usually cannot satisfy all of them at once. Modelers have to select one fairness definition to apply to an algorithm, and then accept the tradeoffs.

Demographic Parity, Equalized Odds, and Predictive Rate Parity all involve disparate treatment. Unawareness doesn’t involve disparate treatment, but it can result in disparate impacts. Each concept has its pros and cons, and there’s no correct or incorrect choice.

Another challenge is that there’s often a tradeoff between accuracy and fairness. A highly accurate model might not be equitable. But improving the model’s fairness can make it less accurate. For fraud detection, an agency might choose to run a less accurate model to make fraud detection more equitable.

AI is helping governments more efficiently and effectively identify and prevent fraud. What’s important is that they understand how ML concepts can affect treatment and outcomes, and that they be transparent about how they’re using AI. By leveraging strategies to avoid bias and inequity in AI-enabled fraud detection, they can serve the public fairly.

