When Data-Driven Meets Data Silos: Let the Fun Really Begin


In this special guest feature, Ed Thompson, CTO and co-founder at Matillion, believes that on balance, the systems that lead to having many data silos are a good thing; they indicate a business has the autonomy to choose the best systems in each department. This should make the business more efficient overall. However, the business needs data from all these systems. At Matillion, Ed leads the product and engineering teams to deliver world-class data integration and transformation products for customers looking to modernize their data analytics approaches. Previously, Ed spearheaded client projects involving Software Lifecycle Management and led the Integrations team for a technical services provider.

A giant new edifice dominates the San Francisco skyline. Bulging as if inflated by the very wealth and success that paid for it is the Salesforce Tower. The empire is a physical marker of what success looks like in the age of internet computing. Salesforce got where it is today by choosing a big market, betting big, and building a global sales and marketing powerhouse. However, the element of its strategy that every smart SaaS company wants to emulate was its bottom-up approach.

Put yourself in the shoes of the VP Sales of any small to medium organization. It’s 2003, IT has implemented a bargain basement CRM system on an unreliable server behind a VPN that your reps have to access with patchy mobile connectivity. Your customizations are taking forever, it’s not practical, it’s not working, and it’s holding back sales. One of your new reps mentions his prior employer had tried out this new service called Salesforce.com. It’s priced keenly enough that you can get it under the radar via your company credit card.

A couple of days later, whether anyone likes it or not, whether anyone even realizes it, your organization is a Salesforce customer. Anyone prepared to stand in the way of that becomes an organizational King Canute, washed away by the tide of “getting on with the job.”

In 2019, every ISV wants to ape that Salesforce model and every crevice of the IT industry has been mined by a torrent of great SaaS tools. These are priced enticingly just within the spending power of middle management, team leaders, developers, or anyone with an unscratched IT itch and a job they need to get done.

That gets customers through the door, but what makes them stay isn’t wining and dining the CIO for multi-year bundle deals. It is the technology that just works and keeps working to solve problems. The subscription-based cloud models make it just as easy to attract new customers as it is to lose them.

Beyond yielding success for the next Salesforce-style ISV giant, these tools give organizations a new, unique data source that can tell them more about their customers, logistical and operational efficiency, or pricing model. They also introduce the risk of creating yet another data silo. Every time someone in an organization willingly commits their company to another open-ended subscription, they create a brand new silo of data. It’s never been easier for anyone to create a new source of critical data in a business, and it has never been more necessary to harness the power of that data.

Of course, this model works both ways. When an ISV needs to entice thousands of new users onto a platform and can’t rely on wining and dining the CIO to make that multi-year bundle deal, the software itself has to just work. It needs to comprehensively solve the problem it sets out to defeat. When it does, both the ISV and their customer win.

But using a slew of the latest and greatest SaaS tools tactically across a business only gets you so far. Every business needs to be data-driven to be competitive, and the best businesses discipline themselves to bring the best possible data to every decision they make – that is where the data integration problem starts. The answers are all there in those different systems, but the data is simply not ready to be used for decision making. Nowhere near.

Even a relatively small company will start collecting data silos from the very start. Now consider the landscape of an enterprise company.

It is common that key business areas are each supported by one or more systems:

● Sales CRM
● Marketing automation
● Partner management portal
● Video conferencing
● Finance tools
● Billing systems
● Product telemetry
● Customer support software
● Web analytics tools
● Issue tracking software
● Software Development Management
● Human resources and recruiting management platforms

Every good business is focused on the customer and ensuring that their experience is smooth and serves their needs. However, to remain customer-centric a business must understand its customers, where they interact and how they buy. With the digital buying journey as complex as it is today, we may need data from any of the above systems to answer critical business questions that help refine the customer experience.

Questions such as:

  • Did our message resonate with website visitors?
  • How was the sales experience? Did they find what they needed and get in touch with a person when they wanted to?
  • How long was the sales cycle? Was the lead a result of a partner referral?
  • How much did they spend? What products did they buy? Are they using the product? Are they getting stuck in the product?
  • Did they need support? Did we do a good job of quickly fixing issues?
  • Based on their experience, are they likely to renew? Churn?

We spin up and onboard new technologies to help answer our questions, leaving us with silos of data. To effectively answer these questions enterprises need access to ALL of the data, in one place, which sounds like a terrifying problem, but rest assured, it’s not as painful as you might think. There are two key things all enterprise businesses must get right when overcoming data silos.

Step #1: Collect all of the data into one cloudy place

The software industry has never been able to standardize its data models and while the past 40 years have been littered with attempts (SOAP web services, you told me you would solve everything!) the dominance of some of the SaaS vendors is at least ensuring a largely open approach across the board.

At the same time, it has never been easier or cheaper to store all the sources of data and harmonize them into data sets that anyone can understand. There are plenty of tools to choose from that can vastly simplify this task, but there are a few key considerations before you break open your wallet.

Connector Availability

Most data integration tools have lots of connectors out of the box, but there is always a balance between quantity and quality. Check the quality of these and ensure they match up with the systems you have now and the ones you are likely to use in the future. Will they expose your customizations? Do they cover all the data you need? And can they shift that data at pace? Since the number of silos in your organization is likely to keep growing, treat a per-connector pricing model as a negative as well.

Generic Connectors

Everyone has at least one or two niche systems that are not covered by the usual crop of connectors and probably will be low on the priority list for a vendor. Ensure the products you are assessing have generic connector options so you can build your own.
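
To make "build your own" a little more concrete, here is a minimal sketch of what a home-grown connector for a niche system might look like. It assumes the system can export JSON; the function names (`json_list_connector`, `flatten`), the `tickets` key, and the sample payload are all hypothetical, and a real integration product will have its own connector interface.

```python
import json
from typing import Dict, Iterator

# Hypothetical shape for a "build your own" connector: it takes a raw
# JSON payload exported by a niche system and yields flat rows that are
# ready to load into a central store.
Row = Dict[str, object]


def flatten(record: dict, prefix: str = "") -> Row:
    """Flatten nested objects into dotted column names."""
    flat: Row = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def json_list_connector(payload: str, record_key: str) -> Iterator[Row]:
    """Yield one flat row per record found under `record_key`."""
    document = json.loads(payload)
    for record in document.get(record_key, []):
        yield flatten(record)


# Made-up export from an imaginary ticketing tool.
sample_payload = json.dumps({
    "tickets": [
        {"id": 1, "owner": {"name": "Ana", "team": "EMEA"}},
        {"id": 2, "owner": {"name": "Raj", "team": "APAC"}},
    ]
})

rows = list(json_list_connector(sample_payload, "tickets"))
```

The point is less the code than the checklist it implies: a generic connector option should let you name where the records live, cope with nested structures, and hand back rows in the same shape as the vendor's built-in connectors.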

Security and Control

If you are dealing with sensitive data and need full control of the data processing infrastructure and its security architecture, a SaaS vendor might not be for you. You can still take advantage of a cloud data warehouse.


It’s tough to estimate just how much data you have or will have in the future, and this shouldn’t be a limiting factor. Take a careful look at vendors who use esoteric utility metrics.

Once you have centralized the data, it must be cataloged in its raw form so authorized users can find and access what they need with ease. This is also the best time to lock away any data sets that are too hot to handle, or to anonymize them. While this solves the problem of data siloed across source systems, it doesn’t address the problem of format-siloed data: data that cannot be brought together because of the way it is collected and stored. A process needs to take place to transform this data into a common format.
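
As a rough illustration of format-siloed data, the sketch below maps records from two imaginary systems into one common shape. Everything here is an assumption made for the example: the field names, the CRM exporting dates as text and amounts as dollar strings, the billing system using epoch seconds and cents, and the salt value. Note also that salted hashing of an email address is strictly pseudonymization rather than full anonymization, so treat it as a starting point only.

```python
import hashlib
from datetime import datetime, timezone


def pseudonymize(email: str, salt: str = "example-salt") -> str:
    """Replace an email with a stable, non-reversible key (illustrative salt)."""
    normalized = email.strip().lower()
    digest = hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()
    return digest[:16]


def from_crm(record: dict) -> dict:
    """Hypothetical CRM export: ISO date strings, amounts as dollar strings."""
    closed = datetime.strptime(record["CloseDate"], "%Y-%m-%d").date()
    return {
        "customer_key": pseudonymize(record["Email"]),
        "event_date": closed.isoformat(),
        "amount_usd": round(float(record["Amount"]), 2),
        "source": "crm",
    }


def from_billing(record: dict) -> dict:
    """Hypothetical billing export: epoch seconds (UTC), amounts in cents."""
    paid = datetime.fromtimestamp(record["paid_at"], tz=timezone.utc).date()
    return {
        "customer_key": pseudonymize(record["customer_email"]),
        "event_date": paid.isoformat(),
        "amount_usd": round(record["amount_cents"] / 100, 2),
        "source": "billing",
    }


crm_row = from_crm(
    {"Email": "Pat@Example.com", "CloseDate": "2019-03-01", "Amount": "1200.50"}
)
billing_row = from_billing(
    {"customer_email": "pat@example.com ", "paid_at": 1551398400, "amount_cents": 120050}
)
```

Once both rows share the same keys, date format and currency unit, they can sit in one table and be compared directly, even though the two source systems never agreed on anything.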

Step #2: Transform data to get the most meaningful insights for your business — and, make sure everyone can do this

Once the data has been dropped into a cloud data warehouse, data lake or whatever the most en vogue term for a big ol’ data store is, most data pipeline vendors just leave you to it. And by ‘leave you to it’ I mean they rely on one of your technical departments to build some big, slow, difficult to maintain SQL or other pipeline routines that turn the data into something useful. This approach does NOT scale with your ambitions to be a data-driven organization.

It’s critically important to make sure the data, and the ability to work with it, are accessible across an organization. Sure, some data might be used for critical management reports that the business relies on, powered by centrally managed ETL jobs, but far more of it is most useful for everyday tactical decisions. The ability to transform that data should be a skill that is available right across an organization. Every department should have access to the best visualization and business intelligence made possible by a simple yet powerful data transformation capability.
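
For a sense of what an everyday, tactical transformation can look like once the data is in one place, here is a small sketch that relates support experience to renewals. It uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse, and the table names, columns and rows are all invented for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for the central data store.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE crm_accounts (account_id TEXT, renewed INTEGER);
    CREATE TABLE support_tickets (account_id TEXT, hours_to_resolve REAL);
    INSERT INTO crm_accounts VALUES ('a1', 1), ('a2', 0), ('a3', 1);
    INSERT INTO support_tickets VALUES ('a1', 2.0), ('a1', 4.0), ('a2', 30.0);
    """
)

# One row per account: did they renew, how many tickets did they raise,
# and how long did those tickets take to resolve on average?
query = """
    SELECT a.account_id,
           a.renewed,
           COUNT(t.account_id)     AS tickets,
           AVG(t.hours_to_resolve) AS avg_hours
    FROM crm_accounts AS a
    LEFT JOIN support_tickets AS t ON t.account_id = a.account_id
    GROUP BY a.account_id, a.renewed
    ORDER BY a.account_id
"""
summary = conn.execute(query).fetchall()
conn.close()
```

The `LEFT JOIN` keeps accounts that never raised a ticket, which matters for a question like "are customers who needed support less likely to renew?" Neither the CRM nor the support tool could answer that on its own.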

ETL tools have traditionally been the preserve of the hard-core data professional, but this is changing faster than ever. A new wave of data transformation technology is fast enough and simple enough to be used by a wide range of business professionals, from software engineers to marketing analysts.


On balance, the systems that lead to having many data silos are a good thing; they indicate a business has the autonomy to choose the best systems in each department. This should make the business more efficient overall. However, the business needs data from all these systems.

If there is one thing that EVERYONE in an organization who can purchase a SaaS system (no matter how small or cheap) needs to know, it’s that the data collected by those systems must be available; any system that hoards its own data in the cloud is a non-starter.

Dealing with those data silos is not nearly as painful as you might think. There are a large number of extract and load tools that can centralize your data. Forget these, as they only solve half of your data silo problems. To overcome the more painful problem of data transformation, look to the cloud ISVs, the next Salesforces, that are developing and evolving in the name of customer obsession.
