
Graph Databases and the Connected Enterprise

In this special guest feature, Emil Eifrem, Founder and CEO of Neo Technology, suggests that in order to achieve connected enterprise status and realize the significance of the graph database, companies must understand the database options that are available. Right now, the landscape consists of three categories, which he outlines below, including where he sees their growth in the next few years. Emil sketched what is known today as the property graph model on a flight to Mumbai in 2000. As the CEO of Neo Technology, co-founder of Neo4j and a co-author of the O’Reilly book Graph Databases, he has devoted his professional life to building and evangelizing graph databases. Committed to sustainable open source, Emil guides Neo along a balanced path between free availability and commercial reliability. He plans to save the world with graphs and own Larry’s yacht by the end of the decade.

As we enter a new year, many of us are asking what the future will look like in our respective industries. In the database industry specifically, we have seen a surge of interest in NoSQL databases, which range from niche to mainstream and leave us with many choices to consider. As companies look to reevaluate their business models (including their databases of choice) in 2017 and beyond, it will behoove them to remember that those who don’t innovate will be left behind.

The reality is that companies will be replaced if they don’t update their business models. Did you know that in 1960, the average company lifespan on the S&P 500 was about 65 years? It is now projected that by 2027, 75% of the companies currently on the S&P 500 will be replaced. Think about that for a minute. That’s just 10 years!

Survival comes through innovation. Through 2027 and beyond, companies that successfully leverage the data within their enterprise have a better chance of surviving. The need for businesses to be able to treat every single customer with the most tailored, personalized experience has never been more important than it is now.

Information is now connected and synthesized to drive decision making, direction and interaction with the customer. This is what we refer to as the connected enterprise: a system of interwoven data points, working together on a single fabric to help businesses glean insights from data connections.

In order to achieve connected enterprise status and realize the significance of the graph database, companies must understand the database options that are available. Right now, the landscape consists of three categories, which I have outlined below, including where I see their growth in the next few years.

Relational Databases (RDBMS)

Not a revelation, but this is still the dominant technology and will likely remain as such. Countless companies and industries rely on relational databases, and I don’t see that changing any time soon.

Tier-One NoSQL Databases

With continued developer backing and commercial success, we’ll see a handful of tier-one, non-relational databases rise to the top of the NoSQL ranks. Furthermore, by 2020 the growth of open source databases will outpace that of their closed source competitors. Also by this date, it is likely that most tier-one NoSQL vendors will have open source products and vibrant communities supporting them. There will likely be overlap as these vendors offer secondary support for other data models, which will, of course, increase competition.

Tier-Two NoSQL Databases

Non-leader NoSQL databases, which I’ll dub tier-two databases, will focus on niche models, many of which haven’t had the time to fully blossom. Most of these databases have narrower use cases and boutique models, which is why I predict they will have less of a commercial impact.

Where do graph databases come into play? We consider graph databases, especially ones with native graph storage and query processing, the next generation of relational databases, but with first-class support for “relationships,” or those implicit connections indicated via foreign keys in the relational model. By definition, a native graph database is one that stores data as a graph and processes queries that return data and its relationships in real time, making it ideal for querying connected data. It models data in a straightforward, whiteboard-friendly manner, enabling users to represent the complexity of data connections more explicitly than a relational database allows.

Relationships are first-class citizens of the graph data model, unlike other database systems which require us to infer connections between entities using special properties such as foreign keys or out-of-band processing like MapReduce. Each node in the graph database model directly and physically contains a list of relationship records that represent its relationships to other nodes.
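To make the idea concrete, here is a minimal Python sketch of nodes that hold their own relationship records. It is an illustration only, not how Neo4j or any particular graph database is implemented; the class names, the relationship types (KNOWS, WORKS_AT) and the sample data are all hypothetical.

```python
class Relationship:
    """A relationship record: a typed, directed link between two nodes."""

    def __init__(self, rel_type, start, end):
        self.rel_type = rel_type
        self.start = start
        self.end = end


class Node:
    """A node that directly holds the relationship records it takes part in."""

    def __init__(self, name):
        self.name = name
        self.relationships = []

    def connect(self, rel_type, other):
        # Store the same record on both endpoints so either side can follow it.
        rel = Relationship(rel_type, self, other)
        self.relationships.append(rel)
        other.relationships.append(rel)
        return rel

    def neighbors(self, rel_type):
        # Follow outgoing relationships of one type; no join or key lookup.
        return [r.end for r in self.relationships
                if r.rel_type == rel_type and r.start is self]


# Hypothetical sample data.
alice, bob, acme = Node("Alice"), Node("Bob"), Node("Acme")
alice.connect("KNOWS", bob)
bob.connect("WORKS_AT", acme)

# "Where do the people Alice knows work?" is a two-hop walk from Alice.
employers = [company.name
             for friend in alice.neighbors("KNOWS")
             for company in friend.neighbors("WORKS_AT")]
```

In a relational schema, the same question would typically be answered by joining tables on foreign keys; here each hop simply follows records the node already holds.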

Companies can essentially keep the data as it is in the real world: small, normalized, yet richly connected entities. This allows users to query and view data from any imaginable point of interest, supporting many different use cases.
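As a rough illustration of querying from any point of interest, the sketch below runs a breadth-first search over a small in-memory graph to find how two entities are connected. The entity names and the `shortest_path` helper are hypothetical, and a real graph database would expose this kind of traversal through its own query language rather than hand-written code.

```python
from collections import deque

# Hypothetical data: each entity maps to the entities it is directly connected to.
GRAPH = {
    "Alice": ["Bob", "Order-1"],
    "Bob": ["Alice", "Acme"],
    "Order-1": ["Alice", "Widget"],
    "Acme": ["Bob", "Widget"],
    "Widget": ["Order-1", "Acme"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of hops, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# The same function works from any starting entity: a person, an order, a product.
path_from_product = shortest_path(GRAPH, "Widget", "Bob")
path_from_person = shortest_path(GRAPH, "Alice", "Acme")
```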

For companies trying to be among the 25% that remain on the S&P 500, the increasingly clear success factor is leveraging connected data. Graph databases can provide a complete, real-time view of interconnected information. They can also scale more naturally to large datasets and answer relationship-heavy queries in less time, an advantage for anyone who wants to ask complicated questions of a dataset. By treating the relationships between data points as first-class objects, a graph database is optimized to answer questions about those relationships. Ultimately, a connected enterprise powered by a graph database is more profitable than a disconnected one, and companies that recognize this will thrive in the years to come.

 
