
The Quest for Objective Truth: Evolving from Opinions to Facts Using Machine Learning in Real Estate

Introduction

Artificial intelligence (AI) and machine learning (ML) are reshaping the way we live and conduct business. From medical diagnoses to autonomous driving, the applications for these technologies seem endless. Real estate is no exception. Throughout the industry, ML is being used to predict market trends, forecast risks, calculate valuations, streamline appraisals, match properties to customer preferences and more. In short, a user’s next home could theoretically be just a click away.

An application that harnesses ML can take into account a family’s size, commute, hobbies and preferred home style. What’s more, it can help borrowers avoid overpaying by factoring in market conditions and future trends. In short, it finds them as close to a perfect match as possible.

Whether at real estate tech (RETech) startups or established organizations, data scientists are looking to pioneer AI and ML methodologies and lead the space. Customer experience is at the forefront of their objectives. The accuracy of their data models, how customers interact with a business’s brand and how satisfied those businesses are with the outcomes will help measure success.

Ultimately, the predictions and forecasts ML models make are only as good as the data used to train them.

Real Estate Data is Opinionated

Throughout the industry, most ML models start by learning the patterns that exist in the data. Using basic property characteristics such as living area in square feet, number of bedrooms and bathrooms, or type of neighborhood, the models look to fit each property into a pattern so that accurate predictions can be made. The better the data about these characteristics, the better the model performs.

The problem is that real estate data is, by nature, opinionated. For instance, tax rolls are the product of a tax assessor’s evaluation. Property listings on multiple listing services reflect the descriptions of listing agents and brokers. Appraisals can vary based on who you hire. Because data is gathered and recorded in such varied ways, it is prone to human error and inconsistency, which can undermine the accuracy of the models.

Why It Matters

Subjectivity is what it means to be human.

A listing agent may call a 7×7-foot space with a nook to hang clothes a “full room” to increase the likelihood of a sale at a higher price, but an appraiser may not agree and may just call it a den.

These wide-ranging perspectives, and the ability to imagine and reimagine the same home as something bold and new, can be beautiful. But they can also make it difficult to find an objective truth.

In the absence of a data set that acts as a single source of truth, ML models are left with no choice but to treat this subjective data as reality. Because of this, the models run the risk of replicating and sometimes even amplifying human estimations and biases. Most models aren’t as precise in their predictions as we would like because they fail to recognize statistical biases, reproduce past prejudices or don’t consider all relevant factors.

How to Find Objective Truth

Fortunately, accurate data does exist, although it is scattered across disparate sources, so it takes a little more effort to find and organize it. Having more data makes it easier to identify when and where human judgments cause bias in model predictions.
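As a rough illustration of what organizing disparate sources can look like, the sketch below compares one attribute across several sources and flags the sources that disagree with the consensus. The source names, property IDs and bedroom counts are all hypothetical, and using the median as the consensus is an assumption made for the example, not a method described in this article.

```python
from statistics import median_low

# Hypothetical bedroom counts for the same properties from three sources.
sources = {
    "tax_roll": {"p1": 3, "p2": 4, "p3": 2},
    "listing": {"p1": 3, "p2": 5, "p3": 2},
    "appraisal": {"p1": 3, "p2": 4},
}

def reconcile(sources):
    """Return {property_id: (consensus_value, disagreeing_sources)}."""
    result = {}
    property_ids = set().union(*(s.keys() for s in sources.values()))
    for pid in sorted(property_ids):
        # Collect the value each source reports for this property, if any.
        values = {name: s[pid] for name, s in sources.items() if pid in s}
        # median_low always returns one of the observed values.
        consensus = median_low(values.values())
        outliers = sorted(n for n, v in values.items() if v != consensus)
        result[pid] = (consensus, outliers)
    return result

report = reconcile(sources)
for pid, (consensus, outliers) in report.items():
    print(pid, consensus, outliers)
```

A record that is flagged this way is not necessarily wrong, but it marks a place where one party's opinion diverges from the others and may deserve a closer look before the data is used for training.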

One method to overcome this bias is to build ML models specifically to find opinion patterns in the data, and then use those models’ predictions to construct the training data sets for the actual ML models. Knowing, for instance, that a particular appraiser tends to underestimate living area based on ceiling height, or that an assessor can be inexact in valuing home improvements based on permits, makes those opinion patterns visible and correctable.
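The two-stage idea can be sketched in a few lines. The example below is a deliberately simplified stand-in: instead of a trained ML model, it estimates each appraiser's tendency as the median ratio of reported to reference living area, then rescales new reports before they would enter a training set. The appraiser IDs, the figures and the existence of a trusted reference measurement are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (appraiser_id, reported_sqft, reference_sqft).
# reference_sqft stands in for a more trusted measurement; values are made up.
records = [
    ("A", 1800, 2000),
    ("A", 1350, 1500),
    ("A", 2250, 2500),
    ("B", 1010, 1000),
    ("B", 1990, 2000),
    ("B", 1500, 1500),
]

def estimate_bias(rows):
    """Stage 1: median ratio of reported to reference area, per appraiser."""
    ratios = defaultdict(list)
    for appraiser, reported, reference in rows:
        ratios[appraiser].append(reported / reference)
    return {appraiser: median(r) for appraiser, r in ratios.items()}

def correct(appraiser, reported, bias, default=1.0):
    """Stage 2: rescale a reported area by the appraiser's estimated bias.

    Appraisers with no history are left unchanged (ratio of 1.0).
    """
    return reported / bias.get(appraiser, default)

bias = estimate_bias(records)
corrected = round(correct("A", 1620, bias))
print(bias, corrected)
```

In this toy data, appraiser A consistently reports about 90 percent of the reference area, so a new report of 1,620 square feet is adjusted upward before use. A production system would likely learn such patterns from many more features than a single ratio.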

Another option to remove inaccurate evaluations is to provide more data to frontline workers in the real estate industry. Doing so gives lenders, assessors, brokers, appraisers and listing agents the ability to move the needle towards data-driven insights. As a result, not only do workflows become more automated but the data behavior slowly changes. Over time, the model builds confidence in key data points and minimizes biases in data.

Data is the Future of Real Estate

The American engineer Dr. William Edwards Deming is often quoted as saying, “Without data, you’re just a person with an opinion.”

Successfully drawing relevant conclusions from data requires human opinions and judgments to be accounted for and then suppressed through ML. Even as AI advances to recognize human emotions, this is one area where innovation may fare better without them.

About the Author

Anand Singh is a Senior Leader of Big Data Platform & Solutions at CoreLogic® and leads development of a few next-gen products in the real estate space. Singh holds four patents in the data valuation and monetization vertical and, through the power of data, analytics, machine learning and artificial intelligence, has diverse experience in building data products and solutions across different industries.

Sign up for the free insideBIGDATA newsletter.

Join us on Twitter: @InsideBigData1 – https://twitter.com/InsideBigData1
