Big Data Industry Predictions for 2017

Wow! What a year 2016 has been. The big data industry has significant inertia moving into 2017. In order to give our valued readers a pulse on important new trends leading into next year, we here at insideBIGDATA heard from all our friends across the vendor ecosystem to get their insights, reflections and predictions for what may be coming. We were very encouraged to hear such exciting perspectives. Even if only half actually come true, Big Data in the next year is destined to be quite an exciting ride. Enjoy!

Daniel D. Gutierrez – Managing Editor


IT becomes the data hero. It’s finally IT’s time to break the cycle and evolve from producer to enabler. IT is at the helm of the transformation to self-service analytics at scale. IT is providing the flexibility and agility the business needs to innovate all while balancing governance, data security, and compliance. And by empowering the organization to make data-driven decisions at the speed of business, IT will emerge as the data hero who helps shape the future of the business. – Francois Ajenstat, Chief Product Officer at Tableau

In 2017, we’re going to see analytics do more than ever to drive customer satisfaction. As the world of big data exploded, business leaders had a false comfort in having these mammoth data lakes which brought no value on their own when they were sitting unanalyzed. Plain and simple, data tells us about our customers — it’s how we learn more about customers and how to better serve them. As today’s customers expect a personalized experience when interacting with a business, we’re going to see customer analytics become the spinal cord of the customer journey, creating touch points at every level of the funnel and at every moment of interaction. – Ketan Karkhanis, SVP and GM of the Salesforce Analytics Cloud

Democratization of Data Analysis – In 2017 I believe that C-suite executives will begin to understand that there is a real gap between their data visions and the ability of their enterprise to move data horizontally throughout the organization. In the past, big data analysis has lagged in implementation compared to other parts of the business being transformed by advanced technology such as supply chains. I believe companies will begin to place different data storage systems into the hands of end users in a fast and efficient manner that has user self-direction and flexibility, democratizing data analysis. –  Chuck Pieper, CEO, Cambridge Semantics

The battleground for data-enriched CRM will only continue to heat up in 2017. Data is a great way to extend the value proposition of CRM to businesses of all sizes, especially those in the small- to mid-size range. By providing pre-populated data sets, the amount of “busy work” done by sales and other CRM users is reduced, and the better the data, the more effective individuals can be every moment of the day. A lot of M&A as well as in-house development and partnerships will fuel more data-powered CRM announcements in 2017. The key, of course, is seeing which providers provide the most seamless and most sensible use cases out of the box for their customers. – Martin Schneider, Vice President of Corporate Communications, SugarCRM

In 2017 (and 2018), streaming analytics will become a default enterprise capability, and we’re going to see widespread enterprise adoption and implementation of this technology as the next big step to help companies gain a competitive advantage from their data. The rate of adoption will be a hockey stick model and ultimately take half the time it has taken Hadoop to rise as the default big data platform over the past six years. Streaming analytics will enable the real-time enterprise, serving as a transformational workload over their data platforms that will effectively move enterprises from analyzing data in batch mode once or twice a day to analyzing it on the order of seconds, so they can gain real-time insights and take opportunistic actions. Overall, enterprises leveraging the power of real-time streaming analytics will become more sensitive and agile, and will gain a better understanding of their customers’ needs and habits to provide an overall better experience. In terms of the technology stack to achieve this, there will be an acceleration in the rise and spread of the usage of open source streaming engines, such as Spark Streaming and Flink, in tight integration with the enterprise Hadoop data lake, and that will increase the demand for tools and easier approaches to leverage open source in the enterprise. – Anand Venugopal, Head of Product, StreamAnalytix, Impetus Technologies

The unique value creation for businesses comes not just from processing and understanding transactions as they happen and then applying models, but by actually doing it before the consumer, or the sensor, logs in to do something. I predict we will quickly move from post-event and even real-time to preemptive analytics that can drive transactions instead of just modifying or optimizing them. This will have a transformative impact on the ability of a data-centric business to identify new revenue streams, save costs and improve their customer intimacy. – Scott Gnau, Chief Technology Officer, Hortonworks

Text analytics will be subsumed by ML/AI in 2017. The terms Text Mining and Text Analytics never really gained the kind of cachet and power in the marketplace that most of us hoped they would. This year will see the terms be subsumed by ML/AI and they’ll become component pieces of AI. – Jeff Catlin, CEO, Lexalytics

IT will start automating the choices for data management and analysis, leading to standardized data prep, quality, and governance. BI tools have been making more decisions for people and automating more processes. The knowledge for doing this — e.g., choosing one chart type over another — was embedded into the tools themselves. Data prep and management tends to be different, because the required rules are specific to the business requirements rather than being inherent in the data. Rule-based data management will enable IT to define rules that the business uses in its analytics processes, making business analysts more productive while still ensuring reliability and reproducibility. For a use case, consider a data scientist who sources data externally, and lets the data tools automatically choose which enterprise data prep and cleansing processes need to be applied. – Jake Freivald, Vice President, Information Builders

Managing the sprawl: Self-service analytics technologies have put analysis into the hands of more users and as a byproduct, led to the creation of derivative artifacts: additional datasets and reports, think Tableau workbooks and Excel spreadsheets. These artifacts have taken on a life of their own. In 2017, we will see a set of technologies begin to emerge to help organize these self-service data sets and manage data sprawl. These technologies will combine automation and encourage organic understanding, guided by well thought-out, but broadly applicable policies. – Venky Ganti, CTO, Alation

We will move from “only visual analysis” to include the whole supply chain of data. We will eventually see visualizations in unified hubs that show us more data, including asset management, catalogs, and portals, as well as visual self-service data preparation. Further, visualizations will become a more common means of communicating insights. The result of this is that more users will have a deeper understanding of the data supply chain, and the use of visual analysis will increase. – Dan Sommer, Senior Director and Market Intelligence Lead, Qlik

Artificial Intelligence

AI, ML, and NLP innovations have really exploded this past year but despite a lot of hype, most of the tangible applications are still based on specialized AI and not general AI. We will continue to see new use-cases of such specialized AI across verticals and key business processes. These use-cases would primarily be focused on the evolutionary process improvement side of the digital transformation. Since the efficiency of ML is based on constant improvement through better and wider training data, this would only add to the already expanding size of the data that enterprises need to manage. Good data management policies would be key to achieving a scalable and sustainable AI vision. For the business users this would mean better access to actionable intelligence, and elimination of routine tasks that can be delegated to the bots. For users who want to stay relevant in the new economy, this would allow them to transform their roles into knowledge workers who focus on tasks that can still only be done based on general intelligence. Business users who can train the AI models would also be a very hot commodity in the economy of the future. – Vishal Awasthi, Chief Technology Officer, Dolphin Enterprise Solutions Corporation

Why machine-led, human-augmented intelligence is the next tech revolution – In 2017, more C-suite executives are going to prioritize data-driven business outcomes. As C-level executives see the potential for analytics, they’ve begun to show greater participation in getting analytics off the ground in their organizations, and I expect they’ll be leading the charge this year to ensure insights permeate every level and department of the business. All of the true technological revolutions have happened when people at a mass scale are empowered. So, shifting data science from an ivory tower function to giving everyone in an organization access to advanced, interactive AI will help each employee become smarter and more productive. It’s becoming clearer that when data can inform each and every decision a business user is making, businesses are going to see a real competitive advantage and business outcome. – Ketan Karkhanis, SVP and GM of the Salesforce Analytics Cloud

Graph-Based Databases for Emerging Tech – The key applications companies are exploring – IoT, machine learning and AI – will be constrained by relational database technology. These areas will move towards sitting on top of graph-based architecture, which by definition expands much more quickly in response to the output of those learnings. If you think of AI, it cycles back on data many, many times, and once it has a conclusion, it asks for more information. If that information in a relational format is not already there, all those AI, IoT and machine learning programs stop. But if it’s on a graph-based architecture, it automatically allows itself those multiple levels of joins to bring in more information. That will help unleash the real potential of some of those new technologies. – Chuck Pieper, CEO, Cambridge Semantics

The symbiotic relationship between man and machine will enable better decisions. Machines will never replace man, but they will empower and complement the data-driven efforts of workers in the coming years, especially as data becomes more accessible across departments and organizations. The democratization of data, the self-service movement and data’s continued simplicity means more people will be leveraging it in more applications – paving the way for a better man vs. machine relationship. For example, IBM Watson can go through medical papers, research and journals and then present top choices, but only a trained doctor can make the right decision for a specific patient. Adding to that, the reskilling of the workforce through nanodegrees will simplify data even further. Technology is sharpening the workforce and putting the power of data into the hands of business users – AI and machine-learning will only help them achieve more. – Laura Sellers, VP of Product Management, Alteryx

My prediction about Big Data is that it will be subsumed into the topic of AI, as big data is an enabler of AI not an end in itself. The lack of focus on big data will actually let the field mature with only the serious players and result in much better business results. – Anil Kaul, Co-Founder and CEO of Absolutdata

Companies will stop reinventing the AI wheel. More and more companies are building artificial intelligence and deep learning into their applications, but a unified, standardized engine to facilitate this process has lagged behind. Today, to insert AI into robots, drones, self-driving cars, and other devices, each company needs to reinvent the wheel. In 2017, we will see the emergence of unified AI engines that will eliminate or greatly mitigate these inefficiencies and propel the formation of a mature AI tech supplier industry. – Massimiliano Versace, cofounder and CEO, Neurala

AI will (still) be the new black. One topic that was covered ad nauseam in 2016 was AI. While it’s important to be cautious about all of the AI hype (especially when it comes to use cases that sound like science fiction), the reality is that this technology is going to evolve even faster from here on out. It’s just in the past few years that innovative business-to-business companies have started using AI to achieve specific business outcomes. Keynoters at this year’s IBM World of Watson conference highlighted ways in which it is already delivering impressive business value, as well as examples of how it might help a CEO decide whether to buy a competitor, or help a doctor diagnose a patient’s symptoms in just the next three to five years. – Sean Zinsmeister, Senior Director of Product Marketing, Infer

Artificial intelligence (AI) initiatives will continue, but in the vein of commoditisation – AI is garnering interest in the legal sector, but a closer inspection of the tools and apps being made available reveals that they are presently more similar to commoditised legal services in the form of packaged, low cost modules for areas such as wills, contracts, pre-nuptials and non-disclosure agreements for the benefit of consumers. Undoubtedly, AI offers tremendous potential and some large law firms have launched initiatives to leverage the technology. However, there’s a significant amount of work to be done in defining the ethical and legal boundaries for AI, before the technology can truly be utilised for delivering legal services to clients with minimal human involvement. Until then, in 2017 and perhaps for a few more years yet, we will continue to see incremental innovative efforts to leverage the technology, but in the vein of commoditisation – similar to what we have seen in the last 12 months. – Roy Russell, CEO of Ascertus Limited

AI and analytics vendor M&A activity will accelerate — There’s no doubt that there’s a massive land grab for anything AI, machine learning or deep learning. Major players as diverse as Google, Apple, Salesforce and Microsoft to AOL, Twitter and Amazon drove the acquisition trend this year. Due to the short operating history of most of the startups being acquired, these moves are as much about acquiring the limited number of AI experts on the planet as the value of what each company has produced to date. The battle for AI enterprise mindshare has clearly been drawn between IBM Watson, Salesforce Einstein, and Oracle’s Adaptive Intelligent Applications. What’s well understood is that AI needs a consistent foundation of reliable data upon which to operate. With a limited number of startups offering these integrated capabilities, the quest for relevant insights and ultimately recommended actions that can help with predictive and more efficient forecasting and decision-making will lead to even more aggressive M&A activity in 2017. – Ramon Chen, CMO, Reltio

AI and machine learning are already infiltrating the workforce across a multitude of industries. In fact, when it comes to HR and people management, more and more companies are starting to deploy technologies that bring transparency to data around the work employees do. This is creating huge opportunities for businesses to leverage frequent touch points, check-ins and opportunities to provide feedback to employees and get a holistic picture of what’s driving work. In 2017 we can expect to see data and analytics used more in HR and management to help visualize behaviors of employees, from the time they were hired to their success down the road, and understand why they have been so successful. By using machine learning companies can focus on building teams to support long-term goal achievement, instead of frantically hiring to fill immediate needs. – Kris Duggan, CEO of BetterWorks

Artificial intelligence (AI) is rapidly becoming more accessible. Previously, you needed a lot of training to implement AI, but this is becoming less and less true as technology becomes more intelligent. Over the next several years, we can expect AI to become more of a commodity and companies like Google and Microsoft will make it extremely easy for developers to analyze large amounts of data on their platform. Once that data analysis is done, developers will be able to implement processes based on those results, which is essentially AI. In the next year we can expect that AI will become much easier to implement for developers via API calls into their applications. – Kurt Collins, Director of Technology Evangelism & Partnerships,

This year we saw customer interactions evolve from traditional question and answer dialogues to intelligent machines now enhancing the process and experience. Machines are learning patterns and providing answers to customers to help eliminate some of the mundane tasks that customer service agents used to handle, and intelligent machine personas like Alexa in the Amazon Echo and Siri in various Apple devices are paving the way. In 2017, we’ll see more capabilities when it comes to artificial intelligence and customer service, like Alexa triggering a call from a contact center based on a question about online order status, thermostats submitting a trouble ticket after noticing a problem with the heater, or Siri searching through a cable company’s FAQ to answer a commonly asked question about internet service troubleshooting. However, one thing will always remain true – human interactions will still be critical when dealing with complex situations or to provide the empathy that is needed in customer service. – Mayur Anadkat, VP of Product Marketing, Five9

For some, the mere mention of artificial intelligence (AI) corresponds to a fashion return from decades ago. So yes, those wide ties are back, and in 2017 we’ll see the rapid adoption of AI in the form of relatively straightforward algorithms deployed on large data sets to address repetitive automated tasks. First, a brief history of AI. In the 1960s, Ray Solomonoff laid the foundations of a mathematical theory of AI, introducing universal Bayesian methods for inductive inference and prediction. In 1980 the First National Conference of the American Association for Artificial Intelligence (AAAI) was held at Stanford and marked the application of theories in software. AI is now back in mainstream discussions and is the umbrella buzzword for machine intelligence, machine learning, neural networks, and cognitive computing. Why is AI a rejuvenated trend? The three V’s come to mind: Velocity, Variety and Volume. Platforms can now process the three V’s with modern and traditional processing models that scale horizontally, providing 10-20X cost efficiency over traditional platforms. Google has documented how simple algorithms executed frequently against large datasets yield better results than other approaches using smaller sets. We’ll see the highest value from applying AI to high volume repetitive tasks where consistency is more effective than gaining human intuitive oversight at the expense of human error and cost. – John Schroeder, Chairman and Founder, MapR

The Cognitive Era of computing will make it possible to converge artificial intelligence, business intelligence, machine learning and real-time analytics in various ways that will make real-time intelligence a reality. Such “speed of thought” analyses would not be possible were it not for the unprecedented performance afforded by hardware acceleration of in-memory data stores. By delivering extraordinary performance without the need to define a schema or index in advance, GPU acceleration provides the ability to perform exploratory analytics that will be required for cognitive computing. – Eric Mizell, Vice President, Global Solutions Engineering, Kinetica

We expect three of the well-funded ML/AI companies to go out of business, while a number of the lesser funded companies will not get off the ground. In addition, we’ll lose more than a few pure-play text analytics companies as ML/AI subsumes more and more of the functionality. The influx of cash isn’t infinite, and companies will need to learn the importance of ROI/TCO analysis. Do you really need a slide or firepole between floors? No. Do you need to have budget for things like, say, salary and advertising? Yes. Another common failure will be over-investing in the engineering aspect of the business. While it’s critical to have a great product, people also need to hear about it. If you can’t clearly articulate your business necessity, then it doesn’t matter how cool the product is. – Jeff Catlin, CEO, Lexalytics

Deep Learning will move out of the hype zone and into reality. Deep learning is getting massive buzz recently. Unfortunately, many people are once again making the mistake of thinking that deep learning is a magic, cure-all bullet for all things analytics. The fact is that deep learning is amazingly powerful for some areas such as image recognition. However, that doesn’t mean it can apply everywhere. While deep learning will be in place at a large number of companies in the coming year, the market will start to recognize where it really makes sense and where it does not. By better defining where deep learning plays, it will increase focus on the right areas and speed the delivery of value. – Bill Franks, Chief Analytics Officer, Teradata

By the end of 2017, the idea of deep learning will have matured and true use cases will emerge. For example, Google uses it to look at faces and then determine if the face is happy, sad, etc. There are also existing use cases in which the police are using it to compare the “baseline” facial structure to “real time” facial expressions to determine intoxication, duress or other potentially adverse activities. – Joanna Schloss, Director of Product Marketing, Datameer

The future of all enterprise processes will be driven by Artificial Intelligence, which requires the highest quality of data to be successful. AI is where all business processes are headed; however, with the recent push of AI technology advancements for businesses – many companies have not addressed how they will ensure that the data their AI models are built on is high quality. Data quality is key to pulling accurate insights and actions and in 2017, we will see more companies focus on solving the challenge of maintaining accurate, valuable data, so that AI technology lives up to its promise of driving change and improvement for businesses. – Darian Shirazi, CEO and Co-Founder, Radius

Prediction: Artificial Intelligence will Create New Marketing Categories, Like the B2B Business Concierge. In 2017, AI will allow marketers to create highly personalized ads tailored to buyers’ specific interests in real-time through superior and infinite knowledge. AI will also make mass email marketing tools (and the resulting spam email) obsolete, automatically screening out the “bad” leads and creating custom, personalized communication instead. As AI continues to advance, we can expect to see the recommendation engines that power companies like Netflix and Amazon develop specifically for the B2B market. This will start to pave the way for a B2B business concierge – a completely automated and customized buyer’s journey throughout the funnel that is driven by AI. – Chris Golec, Founder & CEO, Demandbase

AI-as-a-Service will take off: In 2016 AI was applied to solve known problems. And as we move forward, we will start leveraging AI to gain greater insights into ongoing problems that we didn’t even know existed. Using AI to uncover these “unknown unknowns” will free us to collaborate more and tackle new, interesting and life-changing challenges … AI will amplify humans: We have made enormous leaps forward to build machines capable of understanding and simulating human tasks, even mimicking our thought process. 2017 will be the year of knowledge-based AI, as we develop systems based on knowledge, which learn and retain knowledge of prior tasks, rather than pure automation of tasks we want performed. This will completely disrupt the way we work as human capabilities are amplified by machines that learn, remember and inform … AI will be seen as solving the workforce crisis, not creating it: As the baby boomer generation retires, enterprises are on the brink of losing significant institutional mindshare and knowledge. With the astronomical price tag of losing these workers, enterprises are turning to knowledge management and machine learning to train AI to capture institutional knowledge and act on our behalf. In the coming year and beyond, we will see AI adoption not only come from technological need, but also from the need to capture current employee insights and know-how. – Abdul Razack, SVP & Head of Platforms, Big Data and Analytics, Infosys

How Does AI Fit in an Enterprise? Whatever the industry, we can take better advantage of AI by making our current work tools — apps, medical devices, supply chain systems — much better through machine learning. The key is in the delivery — in other words, the “operationalization” of the analytics. I like to use the analogy of the self-driving car. The best autonomous vehicle systems will surely be able to handle the driving task in typical conditions; there are lots of little decisions to be made, but they are straightforward and easy to make. It’s when conditions become more challenging that the magic happens; the car will not only know when a human should intervene but also will smoothly transfer control to the driver and then back again to the machine. We’re on the cusp of where our everyday work apps and devices shift from repositories to assistants — and we need to start planning for it. Today, employees — or their boss — determine the next set of tasks to focus on. They log into an app, go through a checklist, generate a BI report, etc. In contrast, AI could automatically serve up 50% (or more) of what a specific employee needs to focus on that day, and deliver those tasks via a Slack app or Salesforce Chatter. Success will be found in making AI pervasive across apps and operations and in its ability to affect people’s work behavior to achieve larger business objectives. – Dan Udoutch, CEO, Alpine Data

Many Fortune 500 brands are already using chatbots, and many more are developing them as we speak. What’s ahead for the industry? Though it may not seem sexy, the next year will be a foundational one when it comes to applying AI. Chatbots are only as valuable as the relationships they build and the scenarios they can support, so their level of sophistication will make or break them. Investing in AI is only one piece of the puzzle, and 2017 will be the year that companies need to expand their AI initiatives while also doubling down on investing to improve them with new data streams and integration across channels. – Dave O’Flanagan, CEO, Boxever

The AI Hypecycle and Trough of Disillusionment, 2017: IDC predicts that by 2018, 75 percent of enterprise and ISV development will include cognitive/AI or machine learning functionality in at least one application. While dazzling POCs will continue to capture our imaginations, companies will quickly realize that AI is a lot harder than it appears at first blush and a more measured, long-term approach to AI is needed. AI is only as intelligent as the data behind it, and we are not yet at a point where enough organizations can harvest their data well enough to fulfill their AI dreams. – Ashley Stirrup, CMO, Talend

Hybrid Deep Learning systems. In 2017 we’ll see the rise of embedded analytics, optimized by cloud-based learning. The hybrid architectures used by autonomous vehicles – systems embedded within the vehicle to make numerous decisions per second, augmented by cloud-based learning platforms capable of optimizing decisions across the fleet – will serve as the foundation for the next generation of IoT machines. – Snehal Antani, CTO, Splunk

The focus will shift from “advanced analytics” to “advancing analytics.” Advanced analytics will continue to grow, and eventually be brought into self-service tools. With more users advancing their analytics, Artificial Intelligence (AI) might play a bigger role in organizations. But that means AI will also need to have high levels of usability as well, since users will need it to augment their analyses and business decisions. – Dan Sommer, Senior Director and Market Intelligence Lead, Qlik

AI and Advanced Machine Learning: The Automatic Enterprise. Thanks to parallel processing, big data, cloud technology, and advanced algorithms, Artificial Intelligence (AI) and machine learning are becoming more powerful. As tech giants like Google, Facebook, and Apple invest in AI, it is becoming more mainstream. People already interact with virtual personal assistants (PAs) like Apple’s Siri® and Google Assistant®. Facebook successfully created technology to identify people’s faces with its facial recognition app. Recommendation engines and robo-advisors are becoming a reality in financial services. And robotic butlers are delivering room service in hotels around the world. The analysts are jumping on board, with Forrester predicting that investments in AI will grow 300% in 2017 and Gartner forecasting that 50% of all analytical interactions will be delivered via AI in the next three to five years. These are impressive numbers. But how will these investments pay off for the enterprise? Are computers really more intelligent than people? Many jobs will disappear through automation and others will change significantly as the enterprise becomes more automated and intelligent. Over the next few years, some of us could be answering to robo-bosses. From a productivity perspective, we spend a third of our time in the workplace collecting and processing data—AI could all but eliminate this work. Every job in every industry will be impacted by machine learning. The upside? The opportunity to think exponentially means that the potential applications for these technologies are limitless. For businesses, understanding cognitive systems, big data analytics, machine learning technology, and AI—and how to leverage them—will be critical for survival. In the short term, these technologies will give organizations faster access to sophisticated insights, empowering them to make better decisions and act with agility to outpace their competitors. 
– Mark Barrenechea, CEO and CTO, OpenText

Big Data

Many companies have ideas and initiatives around big data, but not a solid understanding of how it, along with the subsequent insights, will help them better the business or develop new solutions. Technology suddenly gave organizations the ability to process large amounts of data at a high frequency. That, together with the move to mobile (as every consumer has one or more devices that they are constantly online with), drives a lot of data – whether through social networks, search engines or more. You have the information but it needs to be taken one step further – you need to analyze it. The question for big data is “What can I learn from it? Where can I make meaningful insights?” – Dr. Werner Hopf, CEO and Archiving Principal, Dolphin Enterprise Solutions Corporation

Big data becomes fast and approachable. Sure, you can perform machine learning and conduct sentiment analysis on Hadoop, but the first question people often ask is: “How fast is the interactive SQL?” SQL, after all, is the conduit to business users who want to use Hadoop data for faster, more repeatable KPI dashboards as well as exploratory analysis. In 2017, options will expand to speed up Hadoop. This shift has already started, as evidenced by the adoption of faster databases like Exasol and MemSQL, Hadoop-based stores like Kudu, and technologies that enable faster queries. – Dan Kogan, director of product marketing at Tableau

Big Data, More Data, Fragmented Data – As we amass more enterprise data and blend third-party data, we create greater opportunity for insight and impact. However, let’s be honest. Not all companies are created equal when it comes to their Big Data learning curves and sophistication. We will continue to see companies investing in, yet struggling with, building their data layers. Opera Solutions expects to see more attention and focus on data flow, data layers, and the emergence of the insights layer. – Georges Smine, VP Product Marketing, Opera Solutions

Moving into SMB – I see big data analytics and discovery for SMBs starting to take root in 2017. Big, rich data environments such as pharma, healthcare, life sciences, financial services, and insurance are the current industries leading big data analytics, but graph-based databases can also be used by small companies, where you don’t want to spend your time coding and recoding every time you change your mind about what it is you want to look for. – Chuck Pieper, CEO, Cambridge Semantics

Despite the hype and promise of big data and AI, few clear examples exist today where these technologies impact our lives on a daily basis. Serving relevant ads to website visitors and detecting fraud in credit card transactions come to mind. The companies behind these applications have invested in big data and machine learning for years, which has allowed them to develop solid data architectures. Companies that have lived with NoSQL databases for more than a year know that ignoring data model design and instead leaning too heavily on the flexible, schema-free capabilities of these databases leads to poorly performing applications, difficult maintainability, and ultimately rework. In 2017, I predict the discipline of data modeling will gain strength as a sought-after skill set and project activity, particularly for companies dedicated to building impactful data strategies. Tools such as well-designed industry clouds provide the professional data model design necessary for long-term success. – J.J. Jakubik, Chief Architect, Vlocity

The sheer volume of data generated by applications and infrastructure will only increase, resulting in data overload. For the first time, IT Operations teams will embrace an algorithmic approach – also known as Algorithmic IT Operations, or AIOps – to separate signal from noise and ensure successful service delivery. AIOps platforms will provide IT Operations teams with situational awareness and diagnostic capabilities that were not previously possible using manual, non-algorithmic techniques. – Michael Butt, Senior Product Marketing Manager at BigPanda

We’re living in a big data glut. But in 2017, we’ll see data become more intelligent, more usable, and more relevant than ever. The cloud has opened the doors to more affordable, smart data solutions that make it possible for non-technical users to explore, through visualization tools, the power of predictive analytics. We’re also seeing the increasing democratization of artificial intelligence, which is driving more sophisticated consumer insights and decision-making. Forward-thinking organizations need to approach predictive analytics with the future and extensibility in mind. Today’s tools may not be the best for tomorrow’s needs. Cloud solutions are still evolving and haven’t reached functional maturity yet, but by merging cloud, open source, and agile development methodologies into their predictive analytics stack, organizations will be able to adapt easily as technology advances. – Slava Koltovich, CEO, EastBanc Technologies

One Team, One Platform – Data is the common thread within the enterprise, regardless of where the source might be. In the past, data handlers have relied on disparate systems for data needs. Next year, the goal will be to move data into the future by providing a one-stop shop to access, develop and explore data. Companies will now look to one data platform for integrated cloud services with easy access and consistent behavior that is equipped to satisfy the needs of diverse data-hungry professionals across the organization. Just as you can easily access a variety of apps on your smartphone, business users and data professionals will look to deploy one platform that allows their organization to tap into the rich capabilities of data. – Derek Schoettle, General Manager, Cloud Data Services, IBM Watson and Cloud Platform

Next year will bring about another deluge of data brought on by advancements in the way we capture it. As more hardware and software, such as IoT devices, is instrumented especially for this purpose, it will become easier and cheaper to capture data. Organizations will continue to feed on the increased data volume while the big data industry struggles through a shortage of data scientists and the boundaries imposed by non-scalable legacy software that can’t perform analytics at a granular level on big data. Healthcare will be especially hard hit in this regard. Sources of huge healthcare data sets are becoming more abundant, ranging from macro-level sources like surveys by the World Health Organization to micro-level sources like next-generation genomics technologies. Healthcare professionals are leveraging these data to improve the quality and speed of their services. Even traditional technology companies are venturing into this field. For example, Google is ploughing money into its healthcare initiatives like Calico, its “life-extension” project, and Verily, which is aimed at disease prevention. We expect the demand for innovative technical solutions in all industries, particularly healthcare, to explode next year. – Michael Upchurch, COO, Fuzzy Logix

Data lakes will finally become useful – Many companies that took the data lake plunge in the early days have spent a significant amount of money buying into not only the promise of low-cost storage and processing, but also a plethora of services to aggregate and make available significant pools of big data to be correlated and mined for better insights. The challenge has been finding skilled data scientists who are able to make sense of the information, while also guaranteeing the reliability of the data against which other data is being aligned and correlated (although noted expert Tom Davenport recently claimed it’s a myth that data scientists are hard to find). Data lakes have also fallen short in providing input into and receiving real-time updates from operational applications. Fortunately, the gap is narrowing between what has traditionally been the discipline and set of technologies known as master data management (MDM), and the world of operational applications, analytical data warehouses and data lakes. With existing big data projects recognizing the need for a reliable data foundation, and new projects being combined into a holistic data management strategy, data lakes may finally fulfill their promise in 2017. – Ramon Chen, CMO, Reltio

I believe customers will choose solutions in Big Data that deliver faster time to value, simple deployment with ease of management, interoperability with open source tools and solutions that help bridge the skills gap. I predict that Big Data technologies like Hadoop will be adopted at an accelerated rate because customers must get smarter about data. Based on customer conversations, they understand they could be disrupted by a new competitor with a data driven business model. Hadoop will be at the core of a data driven business allowing organizations to be more agile, know more about their customers, and offer new services ahead of the competition. I believe the strength of the community, the work of Cloudera and Hortonworks along with maturing ecosystem tools, as well as interoperability with analytical tools, will provide a secure, enterprise ready data platform. – Armando Acosta, Hadoop Product Manager and Data Analytics SME, Dell EMC

Open source and faux-pen source data technology choices will continue to proliferate, but the new model will redistribute rather than purely reduce costs for enterprises. Vendors are walking away from traditional database and data warehouse business models. Prime examples of this are Pivotal open sourcing Greenplum, Hewlett Packard Enterprise (HPE) spinning off Vertica and other assets, and Actian stopping support for Matrix (formerly ParAccel). Open source projects – or in many cases, vendor sponsored faux-pen sources – are becoming the new model for data processing technology. But while open source reduces the costs of vendor licensing, it also shifts responsibility to the enterprise to sort through the options, assemble stacks and productionize open source projects. This increase in complexity and consumption challenges requires new hiring and/or partnering with as-a-Service cloud vendors. – Prat Moghe, Founder and CEO, Cazena

In 2017, organizations will shift from the “build it and they will come” data lake approach to a business-driven data approach. Use case orientation drives the combination of analytics and operations. Approaching a data lake as “Imagine what your business could do if all your data were collected in one centralized, secure, fully-governed place that any department can access anytime, anywhere” could sound attractive at a high level, but too frequently results in a data swamp that looks like a data warehouse rebuild and can’t address real-time and operational use case requirements. Once the lake is in place, the concept is to “ask questions”. In reality, the world moves faster today. Today’s world requires analytics and operational capabilities to address customers, process claims and interface to devices in real time at an individual level. For example, any ecommerce site must provide individualized recommendations and price checks in real time. Healthcare organizations must process valid claims and block fraudulent claims by combining analytics with operational systems. Media companies are now personalizing content served through set-top boxes. Auto manufacturers and ride sharing companies are interoperating at scale with cars and their drivers. Delivering these use cases requires an agile platform that can provide both analytical and operational processing to increase value from additional use cases that span from back office analytics to front office operations. In 2017, organizations will push aggressively beyond an “asking questions” approach and architect to drive initial and long-term business value. – John Schroeder, Chairman and Founder, MapR

Big data goes self-service. Organizations that have realized the value of big data now face a new problem: IT and data teams are being flooded with requests from users to pull data. To address this, we’ll see more organizations opt for a self-service data model so that anyone in the company can easily pull data to uncover new insights to make business decisions. A self-service infrastructure allows any employee to easily access and analyze data, saving IT and data teams precious time and resources. To make this a reality, all types of data in every department will need to be published so that users can self-serve. – Ashish Thusoo, CEO, Qubole

2017 will be the year organizations begin to rekindle trust in their data lakes. The “dump it in the data lake” mentality compromises analysis and sows distrust in the data. With so many new and evolving data sources like sensors and connected devices, organizations must be vigilant about the integrity of their data and expect and plan for regular, unanticipated changes to the format of their incoming data. Next year, organizations will begin to change their mindset and look for ways to constantly monitor and sanitize data as it arrives, before it reaches its destination. – Girish Pancha, CEO and Founder, StreamSets

Companies have been collecting data for a while, so the data lake is well-stocked with fish. But the people who needed data most couldn’t generally find the right fish. I support the notion of a data lake, dumping all your raw data into one data warehouse. But it doesn’t work if you don’t have a way to make it cohesive when you query it. There have been great innovations by companies like Segment, Fivetran and Stitch, which make moving data into the lake easier. Modeling data is the final step that brings it all together and helps some of the best companies in the world see through data. Companies like Docker, Amazon Prime Now and BuzzFeed are using all their data to create comprehensive views of their customers and of their businesses. When these final two steps are added, the data lake can finally be a powerful way to get all your data into the hands of every decision-maker to make companies more successful. – Lloyd Tabb, Founder, Chairman & CTO, Looker

In 2017, organizations will stop letting data lakes be their proverbial ball and chain. Centralized data stores still have a place in initiatives of the future: How else can you compare current data with historical data to identify trends and patterns? Yet, relying solely on a centralized data strategy will ensure data weighs you down. Rather than a data lake-focused approach, organizations will begin to shift the bulk of their investments to implementing solutions that enable data to be utilized where it’s generated and where business processes occur – at the edge. In years to come, this shift will be understood as especially prescient, now that edge analytics and distributed strategies are becoming increasingly important parts of deriving value from data. – Adam Wray, CEO, Basho Technologies

In 2017, the reports of Big Data’s death will be greatly exaggerated, as will the hype around IoT and AI. In reality, all of these disciplines focus on data capture, curation, analysis and modeling. The importance of that suite of activities won’t go away unless all businesses cease operation. – Andrew Brust, Senior Director, Market Strategy and Intelligence, Datameer

Big data or bust in 2017? Big data is an example of something that didn’t get as far along as people predicted. Of course, it wasn’t stagnant. But nearly everyone involved in the enterprise sector would like it to move faster. The problem is that companies struggle, in general, to make sense of big data because of its sheer volume, the speed at which it is collected and the great variety of content it encompasses. Looking ahead, we can expect to see newer tools and procedures that will help companies house and examine these massive amounts of data and help them move toward truly making data-driven decisions. – Bob DeSantis, COO, Conga

In the new world of data, DBMS is really the management of a collection of data systems. This calls for new thinking about how we manage these systems and the applications that leverage them. The enterprise has long relied on raw logs and systems monitoring solutions to optimize its Big Data applications. As companies continue to adopt numerous disparate Big Data technologies to help meet their business needs, complexity is only increasing while the time required to diagnose and resolve issues grows exponentially, all of which is underlined by an acute shortage of talent capable of effectively running and maintaining these intricate Big Data systems. The primary challenge faced by the enterprise is finding a single full-stack platform capable of analyzing, optimizing and resolving any issues that exist with Big Data applications and the infrastructure supporting them. In the year ahead, the enterprise will search for a solution that addresses the unmet challenges of data teams that find themselves spending much of their day digging through machine logs in order to identify the root cause of problems on a Big Data stack. These problems, if not eradicated, will continue to reduce application performance and divert teams from their real mission of deriving the full value from their Big Data. Ideal solutions will be ones that resolve problems automatically, detecting and pinpointing performance and reliability issues with Big Data applications running on clusters; solutions that open up the doors to data equality across the enterprise and that, with just the click of a button, drastically accelerate the time-to-value of Big Data investments. – Unravel Data

Big data wanes – Big data will continue to wane as a term. The focus now turns from infrastructure to applications with specific purposes. Companies will look to applications and new business models for concrete value, rather than the more general idea that data can be useful at scale. – Satyen Sangani, CEO, Alation

Big Data analytics to become more visually immersive in 2017: Big Data insights will become more visually immersive in 2017. The capability to generate quality insights has come a long way, but companies are still searching for the secret sauce to really put the insights to work. Companies need a way to make analytics easier to understand and a way to more effectively tell a story through analytics in a straightforward but compelling way. Spatial computing can do this; visualization in three dimensions will expand the range of business users who can benefit from Big Data insights – from C-suite to functional managers to data scientists. Doing it right will drive more data-driven decision making, better complex problem solving and corporate innovation. Companies such as IBM (through Watson) and Accenture (through Accenture Connected Analytics Experience) are leading the way. – Oblong Industries

Business Intelligence

Self-service extends to data prep. While self-service data discovery has become the standard, data prep has remained in the realm of IT and data experts. This will change in 2017. Common data-prep tasks like data parsing, JSON and HTML imports, and data wrangling will no longer be delegated to specialists. With new innovations in this transforming space, everyone will be able to tackle these tasks as part of their analytics flow. – Francois Ajenstat, Chief Product Officer at Tableau

Many Big Data systems lack simple UIs for data input and classification. As a result, configuration, ongoing use, and interpretation of Big Data usually require highly technical staff. This produces a high cost of entry and ongoing expenses. To add insult to injury, even once deployed, if the tool cannot be completely adopted by all necessary end users due to complexity, all BI efforts may be for naught. Successful User Interfaces (UIs) are simple and flexible, and adapt to the needs of a variety of users and to any changes in fluid data sets. This is the future of Big Data: making it even more accessible, accurate, and therefore indispensable. Just as other technologies have evolved, BI is evolving to be more accessible than ever to today’s business. This will only continue in the future. – Dave Bethers, Chief Operations Officer, TCN

Digital transformation will be a CIO imperative for more than 50% of all institutions. As such, IT will no longer be pushing Big Data technologies to the business owners. Instead, IT will need to respond to the demands for faster and more predictive analytics. Data scientists will be embedded into the business units in larger companies, and in the smaller firms, everyone will be considered a citizen data scientist. Regardless, business intelligence will no longer be considered a department but an attitude. A way of life. At least for those who plan to be in business by 2019. – Anthony Dina, Director Data Analytics, Dell EMC

In 2017, business people will become ‘data mixologists’, capable of blending data from any combination of systems – centralized and decentralized – to produce new insights on their own, share them with others, and make better, more trusted business decisions. Historically, mixing together data from spreadsheets, databases, or applications like Marketo, Salesforce and Google Analytics has been an inaccessible capability for business people, as well as a data governance nightmare. Until now, self-service data prep tools have been designed for data scientists who work in silos of disconnected data – a phenomenon known as “data discovery sprawl”. These silos produce inaccurate and unreliable insights, and they don’t put those insights in the hands of business decision-makers. In the coming year, we will see business users choose modern tools that help them become data mixologists, making empowered decisions from trustworthy data sets. – Pedro Arellano, VP of Product Strategy, Birst


The move to serverless architectures will become more widespread in the coming years, and will impact how applications are deployed and managed. Serverless architectures allow users to deploy code and run applications without managing the supporting infrastructure. Instead, the supporting infrastructure is managed by a third party. AWS Lambda is an example, and we anticipate growth in the number of providers and the breadth of enterprise-ready applications. As use of serverless architectures begins to rise, the overall application development and deployment strategy will begin to shift away from operations and more towards business logic. More cloud providers will also begin migrating to this form of architecture, allowing for a more competitive market with more expansive application support. As such, it will be important for database solution providers to be ‘cloud-ready.’ – Patrick McFadin, Chief Evangelist for Apache Cassandra, DataStax

The conversation around vendor lock-in is becoming much more prominent in senior level meetings, spurred on by many enterprises’ decision to move to the public cloud. To this point, the issue of vendor lock-in was initially discussed as a black or white situation. However, in 2017 we are going to see this conversation shift to acknowledge the many shades of gray, as executives realize and consider the varying degrees of lock-in and how it impacts various departments and levels of management. Examining the potential consequences of using proprietary technology on the different levels of the hardware and software stack will be an important issue within companies this year as more enterprises implement digital transformation initiatives. – Bob Wiederhold, CEO, Couchbase

Big data and the cloud will go hand-in-hand. Five years ago, concerns over security and compliance kept enterprises from embracing big data in the cloud. Now, best practices and advancements in technology have allayed those concerns while the cloud’s agility and ease of use are becoming must-haves for processing big data. As big data moves from an experiment to an organization-wide endeavor, the cost, time and resources needed to manage a massive data center don’t make sense. As a result, more and more companies will look to the cloud to help with the costs of data management. In 2017, expect enterprises to move their big data projects to the cloud in droves. – Ashish Thusoo, CEO, Qubole

2017 will be the year big data platforms go operational with the rise of hybrid clouds. We will see more customer cloud apps, such as Salesforce CRM and Oracle CX, accessing big data insights directly from on-premises big data platforms, which are the foundations of enterprises’ digital transformation and omni-channel marketing strategies. Examples of big data insights that support additional functional areas, such as sales and marketing, include predictive models, lead scoring or personalization. This typically starts with the ingestion of customer and marketing data into a data lake, where the source data is commonly stored in hybrid cloud and on-premises systems. And to operationalize those insights, we’ll see greater demand for standard REST interfaces to big data sets primarily accessible from SQL (such as Hive, Impala or HAWQ) for hybrid connectivity from SaaS applications or cloud and mobile application development. For on-premises consumers of hybrid data, we expect hosted big data platforms such as IBM BigInsights on Cloud, Amazon EMR, Azure HDInsight or SAP Altiscale to run more big data workloads that are not suitable for local data centers in the cloud, sending only the insights to on-premises systems for core business operations. – Sumit Sarkar, Chief Data Evangelist, Progress

Big-Data-as-a-Service. Big Data continued to see rising adoption throughout 2016, and we’ve observed an increasing number of organizations that are transitioning from experimental projects to large-scale deployments in production. However, the complexity and cost associated with traditional Big Data infrastructure has also prevented a number of enterprises from moving forward. Until recently, most enterprise Hadoop deployments were implemented the traditional way: on bare-metal physical servers with direct attached storage. Big-Data-as-a-Service (BDaaS) has emerged as a simpler and more cost-effective option for deploying Hadoop as well as Spark, Kafka, Cassandra, and other Big Data frameworks. As the public cloud becomes a more common deployment model for Big Data, we anticipate many of these deployments shifting to BDaaS offerings in 2017. In addition to solutions offered by newer BDaaS vendors like BlueData and Qubole, we’ll see more initiatives from established public cloud players like AWS, Google, IBM, and Microsoft. We can also expect a range of other announcements that will further validate the trend toward BDaaS, including both major partnerships (such as VMware’s recent embrace of AWS) and acquisitions (SAP buying Altiscale). As the ecosystem expands, customers will have the flexibility to choose from a range of BDaaS solutions, including public cloud as well as on-premises and even hybrid options (e.g. compute in the cloud and data stored on-premises). – BlueData

Data Governance

The Chief Data Officer Moves to New Heights – In this past year, we’ve seen the Chief Data Officer emerge as an instrumental part of the organization’s plan to harness the full value of data for competitive advantage. In 2017 we will see this role evolve further with the acceleration of CDO hires across industries to help with competitive pressures, aggressive global regulations (things like GDPR and BCBS 239) and the general increasing speed of business. Gartner predicts that by 2019, 90% of large organizations will have a CDO. We see this happening much quicker with the CDO rising as data hero within the organization when faced with the new challenges of managing the big data overload dispersed in separate systems and data silos among specific groups and users enterprise-wide. Wearing a super cape, CDOs will figure out a way to break down the data unrest that likely exists today by implementing business-focused governance processes and platforms and enabling and empowering every user across the enterprise to use and capitalize on data for competitive advantage. – Stan Christiaens, co-founder and CTO of data governance leader Collibra

In 2017, the governance vs. data value tug of war will be front and center. Enterprises have a wealth of information about their customers and partners. Leaders are transforming their companies from industry sector leaders to data-driven companies. Organizations are now facing an escalating tug of war between the governance required for compliance, and the use of data to provide business value and implement security to avoid damaging data leaks and breaches. Financial services and health care are the most obvious industries, with customers counting in the millions and heavy governance requirements. Leading organizations will manage their data between regulated and non-regulated use cases. Regulated use case data requires governance, data quality and lineage so a regulatory body can report on and track data through all transformations to the originating source. This is mandatory and necessary but limiting for non-regulatory use cases like customer 360 or offer serving, where higher cardinality, real-time processing and a mix of structured and unstructured data yield more effective results. – John Schroeder, Chairman and Founder, MapR

Moore’s Law holds true for databases. Per Moore’s law, CPUs are always getting faster and cheaper. Of late, databases have been following the same pattern. In 2013, Amazon changed the game when they introduced Redshift, a massively parallel processing database that allowed companies to store and analyze all their data for a reasonable price. Since then, however, companies that saw products like Redshift as datastores with effectively limitless capacity have hit a wall. They have hundreds of terabytes or even petabytes of data and are stuck between paying more for the speed they had become accustomed to, or waiting five minutes for a query to return. Enter (or reenter) Moore’s law. Redshift has become the industry standard for cloud MPP databases, and we don’t see that changing anytime soon. With that said, our prediction for 2017 is that on-demand MPP databases like Google BigQuery and Snowflake will see a huge uptick in popularity. On-demand databases charge pennies for storage, allowing companies to store data without worrying about cost. When users want to run queries or pull data, these databases spin up the hardware they need and get the job done in seconds. They’re fast and scalable, and we expect to see a lot of companies using them in 2017. – Lloyd Tabb, Founder, Chairman & CTO, Looker

The rise of “applied governance” to unstructured data. Earlier this year, more than 20,000 pages of top-secret Indian Navy data, including schematics on their Scorpene-class submarines, were leaked. It’s been a huge setback for the Indian government. It’s also an unfortunate case study for what happens when you lack controls over unstructured information, such as blueprints that might be sitting in some legacy engineering software system. Now, replace the Indian Navy scenario with a situation involving the schematics for a nuclear power plant or consumer IoT device, and the value of secure content curation becomes even more immeasurable. If unstructured blueprints and files are being physically printed or copied, or digitally transferred, how will you even know that content now exists? Tracking this ‘dark data’ – particularly in industrial environments – will be a top security priority in 2017. – Ankur Laroia, Leader – Solutions Strategy, Alfresco

Organizations have viewed data governance as a tax. It’s something you had to do for compliance or regulatory reasons, but it wasn’t adding value to the business. In reality, governance is crucial to driving business value. Think about the enormous amount of time and money being spent these days to harness the value of data – the whole Big Data movement. Organizations know there is tremendous value to be had, but many of them aren’t actually getting the value despite their investment. Gartner says: Through 2018, 80% of data lakes will not include effective metadata management capabilities, making them inefficient. Why? Two reasons: First, they don’t have the lineage and provenance of the data they’re analyzing. When they put bad or misleading data into their analysis, they’re going to get unreliable results back out. That’s a lack of data governance. Second, and perhaps even worse, organizations are afraid to share the data they’ve gone to great expense to create. They can’t answer questions such as: Under what agreements was the data collected? Which pieces are personal information? Who’s allowed to see it? In which geographies? With what redistribution rights? If you can’t answer these questions, you can’t share the data. Your data lake is fenced off. This is another failure of governance. Businesses will realize that governance gives them the highest quality results, that can be shared with the right audiences, and drive the greatest business value. – Joe Pasqua – EVP Products, MarkLogic

The Chief Data Officer position will pick up steam significantly. This is a sure sign of the pendulum swinging back: A company officer centrally managing the value of data. And a CDO’s job isn’t to empower analysts per se, although that will often be part of what they do. If that were all it was, companies could save a lot of money by handing out tools and not creating the CDO position. The CDO’s job is to extract maximum value from data. That can be done in many ways, including customer-facing portals, large-scale analytical apps, data feeds that stem from unified views of business entities, embedded BI inside other enterprise applications, and so on. So as the CDO position picks up steam, we can expect to see larger data-focused projects where information is managed and shared across divisional and even company boundaries, leading to better data monetization, lower per-user cost of data, and higher business value per unit of data. – Jake Freivald, Vice President, Information Builders

Data Science

In 2017 we will see critical thinking valued more highly in the workplace, as people realize that there is not a deficit of data in the enterprise, but a deficit of insight. Companies will realize that data without additional tenets of knowledge or value is both polarizing and damaging. The role of data scientist will evolve to become “the knowledge engineer.” We will see fewer “alchemists” promising magic from data patterns alone, and more “chemists” combining the elements of knowledge, data, context, and insight to deliver productivity enhancements that we have yet to imagine. – Donal Daly, CEO, Altify

We spend a lot of time thinking about what developers want & need in a tool, both right now and in the future. In software development, complexity is inevitable – tech stack, libraries, formats, protocols – and that complexity won’t be decreasing any time soon. The most successful tool is one that is simple, but not dumbed down or less powerful. I believe that tools will need to become even more powerful in 2017, and the successful tools will be ones that work for the developer rather than the other way around. Tools will need to be smarter to learn from the user automatically, proactive to inform the user automatically, collaborative to connect users with others, and visual and tangible to show and manipulate. This meta-increase in toolsets is possible now for a number of reasons. Memory, processing power, and connectivity speed continue to explode, while at the same time visual tools (like 4K screens) get better and better. Plus, the continued rise of social coding increases the need for powerful collaborative tools to support the developer. – Abhinav Asthana, CEO of Postman

2017 will be the “Year of the Data Scientist.” According to the McKinsey Global Institute, demand for data scientists is growing by as much as 12 percent a year and the US economy could be short by as many as 250,000 data scientists by 2024. Thanks to advances driven by AI companies in 2017, however, 2018 is when AI will become buildable – not just usable – but buildable by non-data scientists. This is not to say that data science will become less useful or in-demand post-2017, rather that some of the simpler problems will be solvable through a hyper-personalized AI built by someone who is not a data scientist. This will open up capabilities for coders and data scientists that will be mind-blowing. – Jeff Catlin, CEO, Lexalytics

SQL will have another extraordinary year. SQL has been around for decades, but from the late 1990s to the mid-2000s, it went out of style as people started exploring NoSQL and Hadoop alternatives. SQL, however, has come back with a vengeance. The renaissance of SQL has been beautiful to behold and I don’t even think it’s near its peak yet. The innovations we’re seeing are blowing our minds. BigQuery has created a product that is essentially infinitely scalable, the original goal of Hadoop, AND practical for analytics, the original goal of relational databases. Additionally, Google recently announced that the new version, BigQuery Standard SQL, is fully ANSI compliant. Prior to this release, BigQuery’s Legacy SQL was peculiar and so presented a steep learning curve. BigQuery’s implementation of Standard SQL is amazing, with really advanced features like Arrays, Structures, and user-defined functions that can be written in both SQL and JavaScript. SQL engines for Hadoop have continued to gain traction. Products like SparkSQL and Presto are popping up in enterprises and as cloud services because they allow companies to leverage their existing Hadoop clusters and cloud storage for speedy analytics. What’s not to love? To top it all off, companies like Snowflake, and now Amazon Athena, are building giant SQL data engines that query directly on S3 buckets, a source that was previously only accessible via command line. 2016 was the best year SQL has ever had – 2017 will be even better. – Lloyd Tabb, Founder, Chairman & CTO, Looker

The data skills gap widens. Problem: The demand for data scientists and data engineers continues to challenge enterprises that need to make the most of their data. And even when the right skillsets are in play, the New York Times recently reported that these critical personnel are often spending more time cleaning the data than actually mining it. Prediction: Businesses will seek any tool that helps to put more data in the hands of business analysts with the minimum data scientist intervention. In addition, new machine learning tools will emerge to help automate some of these data-focused tasks to scale the models that data scientists create. – SnapLogic

There will continue to be a shortage of qualified data scientists. I don’t expect the market to be in equilibrium until 2019 at the earliest. Every major university will have a data science program in place by 2017. – Michael Stonebraker, Ph.D., co-founder and CTO, Tamr

Data Scientists failed to predict the election – will they fail to predict your business? The other day I was giving a talk on ‘What is Machine Learning?’ and, barely two minutes in, someone said, ‘You’re saying we can do all these amazing things with big data and algorithms, but you had all the data in the world for the election, and you got it wrong. Why should we trust you?’ There are plenty of important takeaways from the election: First, Nate Silver and enterprise data scientists both try to learn from historical events to predict future events, and the margins of error can be high in both. But in predicting an election you only get one chance. In business, you make predictions constantly, and the cost of error tends to be low. Also, there are fewer curve-balls in business. Customers and businesses tend to be pretty predictable. Voters and politicians are not. Second, the media committed the same sin we see business people make every day: falling too hard for the analytic ‘black box’ that does seemingly magical number crunching. Without a basic understanding of what types of analyses have been done on different types of data and why, the end users will never know the true value of the information they have at their disposal or how they should use it. There’s no better illustration of this than the little needle on The New York Times’ election ‘dial’ which bounced violently from Clinton to Trump in the middle of the evening and had me screaming at my phone. – Steven Hillion, Chief Product Officer, Alpine Data

Data is fueling an incredible pace of innovation and in turn, those new innovations are creating more diverse types of data, including everything from machine generated metadata to drone telemetry data. For data scientists and analysts, this means a new wealth of data for finding insights and creating business value, but it also means more challenges in understanding and working with these new types of data. In 2017, the tools that support data preparation and visualization in analytics will grow in importance. – Adam Wilson, CEO, Trifacta

GPUs and HPC

2017 will be the year when “accelerated compute” becomes known just simply as “compute”. This is a direct response to the use cases driving up utilization the most, and the explosion of accelerator availability in both the data center and the public cloud. As these use cases continue to ramp up in the Enterprise (particularly machine learning), we’ll see even more demand for computational accelerators. CPUs have been king for decades, and serve the general purpose quite well. But what we’re seeing now is an emphasis on deriving insight from data, versus just indexing it, and this requires orders of magnitude faster (and more specialized) resources in order to deliver feasible economics. It’s not that computational accelerators are necessarily “faster” than CPUs, but rather, they can be deployed as coprocessors and therefore take on very specialized identities. Because of this specialization, they can be programmed to do certain very discrete computations much quicker and at lower aggregate power consumption. Application developers and ISVs are pouncing on these capabilities (and their increasing availability) to create amazing new products and services. A good example of a red-hot technology in this space is the GPU-accelerated database, such as GPUdb from Kinetica (available as a turnkey workflow on the Nimbix Cloud). Rather than focusing on indexing massive amounts of information like a traditional RDBMS, it’s used to ingest fragments into memory for tremendously fast queries. In fact the queries are so fast that it blurs the line between analytics and machine learning (after all, machine learning involves processing massive data sets very quickly in order to create “models” that operate somewhat like human brains). Despite the advanced computing underneath, these tools serve traditional enterprise markets, not just “research labs”. Not only does its product name imply it, but the use case simply would be impossible without GPUs. This is a very real example of mainstream technology that demands computational accelerators. In talking with customers and business partners, the one common thread they all seek is more accelerated computational power (at reasonable economics) to do even more advanced things. I don’t see this trend slowing down anytime soon, which is why I’m predicting that we’ll drop the “accelerated” in front of “compute” as it will become a given. – Leo Reiter, CTO, Nimbix

Graphics Processing Units (GPUs) are capable of delivering up to 100-times better performance than even the most advanced in-memory databases that use CPUs alone. The reason is their massively parallel processing, with some GPUs containing over 4,000 cores, compared to the 16-32 cores typical in today’s most powerful CPUs. The small, efficient cores are also better suited to performing similar, repeated instructions in parallel, making GPUs ideal for accelerating the compute-intensive workloads required for analyzing large streaming data sets in real-time. – Eric Mizell, Vice President, Global Solutions Engineering, Kinetica

Amazon has already begun deploying GPUs, and Microsoft and Google have announced plans. These cloud service providers are all deploying GPUs for the same reason: to gain a competitive advantage. Given the dramatic improvements in performance offered by GPUs, other cloud service providers can also be expected to begin deploying GPUs in 2017. – Eric Mizell, Vice President, Global Solutions Engineering, Kinetica

While deep learning is now a high priority for the C-Suite, many organizations realize that they are not quite ready to execute or see benefits from these activities in 2017. What is in their sights, however, is the capability to extend their current analytical capabilities to larger and larger datasets – without introducing productivity killing wait times. This is what will drive deep adoption of supercomputing solutions like the DGX-1 from Nvidia alongside GPU-tuned analytics. – Todd Mostak, Founder and CEO, MapD Technologies


Hadoop

As I predicted last year, 2016 was not a good year for Hadoop and specifically for Hadoop distribution vendors. Hortonworks is trading at one-third its IPO price and the open source projects are wandering off. IaaS cloud vendors are offering their own implementations of the open source compute engines – Hive, Presto, Impala and Spark. HDFS is legacy in the cloud and is rapidly being replaced by blob storage such as S3. Hadoop demonstrates the perils of being an open source vendor in a cloud-centric world. IaaS vendors incorporate the open source technology and leave the open source service vendor high and dry. Open source data analysis remains a complicated and confusing world. Wouldn’t it be nice if there were one database that could do it all? Wait, there is one, it’s called Snowflake. – Bob Muglia, CEO, Snowflake Computing Inc.

Don’t be a Ha-dope! For all those folks running around saying Hadoop is dead – they’re dead wrong. In 2017, we’re going to see an increased adoption of Hadoop. So far this year, I haven’t talked to a single organization with a digital data platform who doesn’t see Hadoop at the center of their infrastructure. Hadoop is an assumed part of every modern data architecture and nobody can question the value it brings with its flexibility of data ingestion and its scalable computational power. Hadoop is not going to replace other databases but it will be an essential part of data ingestion in the IoT/digital world. – George Corugedo, CTO, RedPoint Global

Hadoop distribution vendors will have crossed the chasm – unstructured data in Hadoop is a reality. But, since the open source problem has not been addressed, they aren’t making much money. As such, many of these vendors will be acquired by bigger players. Bigger ISV Hadoop vendors may also band together and create larger entities in hopes of capitalizing on economies of scale. – Joanna Schloss, Director of Product Marketing, Datameer

The Failure (and future) of Hadoop. Problem: Fifty percent of Hadoop deployments have failed. While it has commanded the lion’s share of attention, it has suffered from product overload. Because new projects are added every month and the data in the Hadoop cluster is ever-growing, the result is a complex, multidimensional environment that’s difficult to maintain in production. Prediction: To actually make Hadoop work beyond a test environment, enterprises will shift it to the cloud in 2017, and abstract storage from compute. This enables enterprises to select the tools they want to use (Spark, Flink or others) instead of being forced to carry excessive Hadoop baggage with them. – SnapLogic

Wider Adoption of Hadoop: More and more organizations will be adopting Hadoop and other big data stores; in turn, they will need to secure the massive amounts of data stored there. – Axiomatics

In-Memory Computing

In 2017, in-memory computing will enter the mainstream as the enabling technology for adding operational intelligence to live systems, and it will supplant legacy streaming technologies. In 2017, the adoption of in-memory computing technologies, such as in-memory data grids (IMDGs), will provide the enabling technology to capture perishable opportunities and make mission-critical decisions on live data. Driven by the need for real-time analytics, the IMDG market alone – currently estimated at $600 million – will exceed $1 billion by 2018, according to Gartner. Unlike big data technologies, such as Spark, created for the data warehouse and legacy streaming technologies, in-memory computing enables the straightforward modeling and tracking of a live system by analyzing and correlating persistent data with live fast-changing data in real time, and it provides immediate feedback to that system for automated decision making. Gartner recently elevated the term “digital twin” in its Top 10 strategic technology trends for 2017 to describe the shift in focus from data streams to the data sources which produce those streams. In-memory computing technology enables applications to easily create and manage digital representations of real-world devices, such as Industrial Internet of Things (IIoT) sensors and actuators, and this enables real-time introspection for operational intelligence. – Dr. William Bain, CEO and founder, ScaleOut Software
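
To make the “digital twin” idea above concrete, here is a minimal sketch of an in-memory object that pairs persistent data about a device (its rated limit) with live readings and returns immediate feedback. It uses a plain Python dictionary in one process, not an actual in-memory data grid, and the names `DigitalTwin`, `ingest`, `handle_event` and the device `pump-7` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalTwin:
    """Hypothetical in-memory representation of one real-world device."""
    device_id: str
    max_temperature_c: float                      # persistent data: the rated limit
    readings: list = field(default_factory=list)  # live, fast-changing data

    def ingest(self, temperature_c):
        """Record a live reading; return an alert string, or None if in range."""
        self.readings.append(temperature_c)
        if temperature_c > self.max_temperature_c:
            return (f"{self.device_id}: {temperature_c} C exceeds "
                    f"limit {self.max_temperature_c} C")
        return None


# One twin per data source, held in memory and keyed by device ID.
twins = {"pump-7": DigitalTwin("pump-7", max_temperature_c=80.0)}


def handle_event(device_id, temperature_c):
    """Route an incoming event to the twin of the device that produced it."""
    return twins[device_id].ingest(temperature_c)


alerts = [handle_event("pump-7", t) for t in (72.5, 79.0, 83.1)]
print(alerts)
```

Only the third reading exceeds the assumed limit, so only that event produces feedback; a real deployment would distribute the twins across a grid rather than keep them in a single dictionary.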

In-Memory and Temporary Storage become more important as new sources of data growth such as augmented and virtual reality, AI and machine learning become popular: While analyzing these new sources of data is becoming critical to long-term business goals, storing the data long term is both impractical and unnecessary when the results of analysis are more important than the data itself. Although 2017 will see plenty of data growth that will require permanent storage, most net new data generated next year will be ephemeral; it will quickly outlive its usefulness and be discarded. So despite exponential data growth, there won’t be as much storage growth as we might otherwise have expected. – Avinash Lakshman, CEO, Hedvig

In-memory computing becomes a key enabler of digital transformation. There is one common characteristic in any discussion of digital transformation use cases: the need to process huge amounts of data in real-time or near real-time. The use cases can be transactional or analytical and they can be deployed on premises, in the cloud or in a hybrid environment. And no matter the industry – financial services, fintech, ecommerce, online services, telecom, etc. – the two universal challenges are performance and scalability. In-memory computing (IMC) is the only technology enabling this kind of performance. In 2017, as enterprises look to make their own digital transformation, IMC will rise to the top of every IT solution checklist. – Abe Kleinfeld, CEO, GridGain


IoT

The future of IoT will be focused on security. Recently, a major DDoS attack caused outages at major organizations. This is going to be a growing issue in the near future, and the concern at the forefront of IoT will be safeguarding networks and connected devices. – Dr. Werner Hopf, CEO and Archiving Principal, Dolphin Enterprise Solutions Corporation

IoT grows up – The enterprise has paid attention to IoT for some time, though this year will be the year we move past the “wow” phase and into the “how do we securely and effectively bring IoT to the enterprise, how do we handle the high speed data ingest, and how do we optimize analytics and decisions based on IoT data” phase. Those will be the questions enterprises will need to solve in 2017. – Leena Joshi, VP of Product Marketing, Redis Labs

IoT continues to pose a major threat. In late 2016, all eyes were on IoT-borne attacks. Threat actors were using Internet of Things devices to build botnets to launch massive distributed denial of service (DDoS) attacks. In two instances, these botnets collected unsecured “smart” cameras. As IoT devices proliferate, and everything has a Web connection – refrigerators, medical devices, cameras, cars, tires, you name it – this problem will continue to grow unless proper precautions like two-factor authentication, strong password protection and others are taken. Device manufacturers must also change behavior. They must scrap default passwords and either assign unique credentials to each device or apply modern password configuration techniques for the end user during setup. – A10 Networks
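
As one illustration of assigning unique credentials to each device instead of a shared default password, the sketch below generates an independent random password per device with Python’s standard `secrets` module. The function name `provision_credentials`, the device IDs, and the 20-character length are assumptions made for the example, not a description of any manufacturer’s process.

```python
import secrets
import string

# Character set for generated passwords (letters and digits only).
ALPHABET = string.ascii_letters + string.digits


def provision_credentials(device_ids, length=20):
    """Return a dict mapping each device ID to its own random password."""
    return {
        device_id: "".join(secrets.choice(ALPHABET) for _ in range(length))
        for device_id in device_ids
    }


# Hypothetical device IDs; each one gets an independently generated password.
credentials = provision_credentials(["cam-001", "cam-002", "cam-003"])
for device_id, password in credentials.items():
    print(device_id, password)
```

In practice the generated secret would be written to the device at manufacture and stored only in hashed form on any server side, so that compromising one device reveals nothing about the others.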

The Internet of Things (IoT) is widely acknowledged as a big growth area for 2017. More connected devices will create more data, which has to be securely shared, stored, managed and analyzed. As a result, databases will become more complex and the management burden will increase. Those organizations which can most effectively monitor their database layer to optimize peak performance and resolve bottlenecks will be better placed to exploit the opportunities the IoT will bring. – Mike Kelly, CTO, Blue Medora

The future of retirement is gearing up for a major shift and Internet of Things (IoT) along with it. Baby boomers are retiring, and there are many economic and lifestyle reasons for them to live in their homes longer. This means changes for insurance companies, healthcare, medical devices, and appliance manufacturers. The proliferation of the IoT or “the connected life” allows for monitoring the elderly in their homes, from monitoring blood pressure to typical daily habits such as whether or not they turned on the TV or opened the refrigerator. Elderly parents want autonomy and their children want them to be safe – connected technology can bridge the gap between the two. Basic monitoring as well as more advanced medical monitoring is shifting the way we will live out our retirement. – Kevin Petrie, Attunity

The Internet of Things (IoT) is still a popular buzzword, but adoption will continue to be slow. Analyzing data from IoT and sensors clearly has the potential for massive impact, but most companies are far (FAR!) from ready. IoT will continue to get lots of lip service, but actual deployments will remain low. Complexity will continue to plague early adopters that find it a major challenge to integrate that many moving parts. Companies will instead focus resources on other low-hanging fruit data and analytics projects first. – Prat Moghe, Founder and CEO, Cazena

The Internet of Things is delivering on the promise of big data. Increasingly, big data projects are going through multiple updates in a single year – and the Internet of Things (IoT) is largely the reason. That’s because IoT makes it possible to examine specific patterns that deliver specific business outcomes, and this increasingly has to be done in real time. This will drive healthier investment in, and faster returns from, big data projects. – Ettienne Reinecke, Chief Technology Officer, Dimension Data

Next year, organizations will stop putting IoT data on a pedestal, or, if you like, in a silo. IoT data needs to be correlated with other data streams, tied to historical or master data or run through artificial intelligence algorithms in order to provide business-driving value. Despite the heralded arrival of shiny new tools that can handle IoT’s massive, moving workloads, organizations will realize they need to integrate these new data streams into their existing data management and governance disciplines to gain operational leverage and ensure application trust. – Girish Pancha, CEO and Founder, StreamSets

The Internet of Things Architect role will eclipse the data scientist as the most valuable unicorn for HR departments. The surge in IoT will produce a surge in edge computing and IoT operational design. Thousands of resumes will be updated overnight. Additionally, fewer than 10% of companies realize they need an IoT Analytics Architect, a distinct species from IoT System Architect. Software architects who can design both distributed and central analytics for IoT will soar in value. – Dan Graham, Internet of Things Technical Marketing Specialist, Teradata

At least one major manufacturing company will go belly up by not utilizing IoT/big data: The average lifespan of an S&P 500 company has dramatically decreased over the last century, from 67 years in the 1920s to just 15 years today. The average lifespan will continue to decrease as companies ignore or lag behind changing business models ushered in by technological evolutions. It is imperative that organizations find effective ways to harness big data to remain competitive. Those that have not already begun their digital transformations, or have no clear vision for how to do so, have likely already missed the boat, meaning they will soon be a footnote in a long line of once-great S&P 500 players. – Ashley Stirrup, CMO, Talend

As IoT continues to grow, so will the realization that CPU-powered analytics solutions don’t scale elegantly or economically. This will precipitate broad deployment of GPUs in these infrastructure industries. – Todd Mostak, Founder and CEO, MapD Technologies

Unfortunately, the October Dyn DDoS attack was not an isolated event. As Internet of Things (IoT) devices come into common use, they will continue to come under attack. Because these smart devices are what’s known as “sticky” – people who buy them rarely replace or upgrade them – the IoT device makers often include only minimum features, shortening the development process and cutting costs. This is particularly dangerous for privacy, since fewer security features mean easier backdoor access. When one device is compromised, the hacker can easily take over the whole system of interconnected devices. Smart devices shipped out in 2017 may have backdoors and malware already installed, and this will be a huge privacy concern. – NordVPN

Internet of Things (IoT): It is about interconnectivity and the increasing number of interconnected electronic devices producing massive amounts of data. Interconnectivity between these devices could reveal new attack vectors to compromise security. – Axiomatics

Machine Learning

In-memory computing techniques will leverage the power of machine learning to enhance the value of operational intelligence. The year 2017 will see an accelerated adoption of scenarios that integrate machine learning with the power of in-memory computing, especially in e-commerce systems and the Internet of Things (IoT). E-commerce applications benefit by offering highly personalized experiences created by tracking and analyzing dynamic shopping behavior. IoT applications, such as those associated with windmills and solar arrays, benefit by delivering predictive feedback based on rapidly emerging patterns. In both of these applications, machine learning techniques can dramatically deepen the introspection and enhance operational intelligence. Once practical only on supercomputers, machine learning techniques have evolved to become increasingly available on standard, commodity hardware. This enables IMDGs to apply them to the analysis of fast changing data and specifically to dynamic digital models of live systems. The ability of IMDGs to perform iterative computation in real-time and at extreme scale enables machine learning techniques to be easily integrated into stream processing which provides operational intelligence. – Chris Villinger, Vice President, Business Development and Marketing, ScaleOut Software

Machine learning will change the fabric of the enterprise – Machine learning will enable the adaptive enterprise, one that aligns business outcomes and customer needs in new and different ways. – Leena Joshi, VP of Product Marketing, Redis Labs

In 2017, I expect to see an increased emphasis on democratization of machine learning and artificial intelligence (AI). We’ve seen machine learning evolve from IBM Watson a few years ago to most recently with Salesforce and Oracle. While many think machine learning has gone mainstream, there is the potential for much more, such as performance monitoring and intelligent alerting. While companies might face false starts and initial mishaps while trying to crack the code, the increased number of organizations turning to AI and machine learning will lead to more successes next year. This increased adoption will help bring innovations faster to market, especially from a wide range of industries. – Mike Kelly, CTO, Blue Medora

There has been a lot of hype around machine learning for some time now, but in most cases it hasn’t been used very effectively. As we move forward, organizations are learning how to bring together all the ingredients needed to leverage machine learning – and I think that’s the story for 2017. We’ll see machine learning move from a mystical, over-hyped holy grail to more real-world, successful applications. Those who dismiss it as hocus-pocus will finally understand it’s real; those who distrust it will come to see its potential; and companies that are poised to leverage this capability for appropriate, practical applications will be able to ride the swell. It will still be a few years before machine learning becomes a tidal wave, but in 2017 it will be clear that it has a credible place in the business toolkit. – Jeff Evernham, Director of Consulting, North America, Sinequa

In 2017, ‘centralized-only’ monolithic software and silos of data disappear from the enterprise. Smart devices will collaborate and analyze what one another is saying. Real time machine-learning algorithms within modern distributed data applications will come into play – algorithms that are able to adjudicate ‘peer-to-peer’ decisions in real time. Data has gravity; it’s still expensive to move versus store in relative terms. This will spur the notion of processing analytics out at the edge, where the data was born and exists, and in real-time (versus moving everything into the cloud or back to a central location). – Scott Gnau, Chief Technology Officer, Hortonworks

Machine Learning will become de rigueur in the enterprise without many even noticing: What’s unique to today’s machine learning technology is that much of it originated and continues to be open source. This means that many different products and services are going to build machine learning into their platforms as a matter of course. As a result, more enterprises will be adopting machine learning in 2017 without even knowing they’re doing it because vendors are actively using ML to make their products smarter. Even existing products will soon use some variety of machine learning that will be delivered via an update or as an extra perk. – Avinash Lakshman, CEO, Hedvig

The Future of Machine Learning. We will finally deliver on the promise of machine learning: building models that can directly suggest or take action for large audiences. When we effectively scale machine learning, we can greatly increase the action-taking bandwidth of an enterprise. Instead of presenting a small number of business users in the enterprise with historical statistics à la business intelligence, companies can bring specific recommendations to thousands of front-line individuals responsible for taking action on behalf of the business. – Josh Lewis, VP of Product, Alpine Data

Machine learning-washing – Expect the market to be flooded with solutions that promise machine learning capabilities and grab headlines, but deliver no substance. – Toufic Boubez, VP Engineering, Machine Learning, Splunk

Machine learning and data science are rapidly emerging trends that will have a significant impact across major technologies. While at the outset they may not seem entirely connected to the fixed wireless broadband market, the opportunities they create are boundless. There is too much data but not enough insight. Understanding often requires domain expertise and a significant time investment to parse operational details. This is true whether one is troubleshooting wireless connectivity or quantifying how well an IoT infrastructure is operating. The amount of data can be staggering, and most organizations lack the capability to extract meaningful decision-making information from this massive influx. Data collection, including storage and streaming, forms the first part of this revolution, whether in wireless optimization or IoT management. What must follow are systems intelligent enough to autonomously synthesize the data into actionable information for decision makers. Machine learning will be key to ensuring that information is relevant, and human-centered design guarantees it is readily used. In fixed wireless broadband, this means network issues will become easier to diagnose and mitigate, and deployments will be protected by proactively anticipating problems and automatically applying the most effective remedies. Operational expertise will become more a feature of the infrastructure, with built-in inherent intelligence for increased reliability and efficiency. – Atul Bhatnagar, CEO, Cambium Networks


NoSQL

In 2017, NoSQL’s coming of age will be marked by a shift to workload-focused data strategies, meaning executives will answer questions about their business processes by examining the data workloads, use cases and end results they’re looking for. This mindset is in contrast to prior years when many decisions were driven from the bottom up by a technology-first approach, where executives would initiate projects by asking what types of tools best serve their purposes. This shift has been instigated by data technology, such as NoSQL databases, becoming increasingly accessible. – Adam Wray, CEO, Basho Technologies


Cloud and data security agility will gain further importance – This is a rather obvious prediction, given the phobia of data breaches and the reticence of industries such as the financial sector to use public cloud technologies. Meanwhile, life sciences and retail, to name two industries, continue to forge ahead, realizing efficiencies while adhering to some of the strictest privacy and governance requirements set forth by regulators. With requirements such as the General Data Protection Regulation (GDPR) now in effect, companies not only have to ensure that their data is physically housed in the right geographic centers, but that the access complies with the most stringent regulations related to personal access and approvals for use of that data. Many vendors are now taking steps to provide the most secure, validated and agile infrastructure possible. Partnerships and use of Amazon Web Services, Google Cloud, and Microsoft Azure go a long way to providing the confidence and flexibility that many companies are looking for. In 2017, vendors offering Platform as a Service (PaaS) and tools themselves must also do their part by complying with Service Organization Control (SOC) types and, in the case of healthcare data, with HITRUST (Health Information Trust Alliance), which provides an established security framework that can be used by all organizations that create, access, store or exchange sensitive and regulated data. – Ramon Chen, CMO, Reltio

Under the covers, machine learning is already becoming ubiquitous as it is embedded in many services that consumers take for granted. Increasingly, machine learning is becoming embedded in enterprise software and tooling for integrating and preparing data. Machine learning is placing a stress on enterprises to make data science a team sport; a big area for growth in 2017 will be solutions that spur collaboration, so the models and hypotheses that data scientists develop do not get bottled up on their desktops. – Ovum

Expect IoT to be even more vulnerable. Previous hacks into connected devices can be deemed as minor or inconvenient. But the recent DDoS attack involving Dyn shows IoT hacks are taking place on a larger and more disruptive scale. Hacking lightbulbs or setting off fire alarms is on the more mischievous side of the spectrum, but having the ability to override a car’s brake system or a “smart” pacemaker, for example, can turn connected devices into deadly weapons. Even worse, the lack of one standard for IoT (unlike Wi-Fi) will just make our devices more susceptible to large-scale breaches. Vendors have to recognize the parallels between security issues when Wi-Fi hit the mass market, and what’s happening with IoT. If they don’t move quickly to address the vulnerabilities, government regulations will need to come into play. Still, it would take something disastrous to galvanize government into action. – Richard Walters, SVP of Security Products, Intermedia

Over the past year there has been increased focus on data privacy, especially with the passing of the GDPR, which represents one of the most comprehensive and refined sets of standards put forth to date. In 2017, the trend line will continue to move in the same direction and there will be a higher premium on data protection. With increased sensitivity around personal data, software vendors and enterprises will need to focus on what is being done to protect and manage personal data within the enterprise. To be successful, companies must embrace privacy by design for themselves and the service providers they work with. – Anthony West, CTO, Actiance


Spark and machine learning light up big data. In a survey of data architects, IT managers, and BI analysts, nearly 70% of the respondents favored Apache Spark over incumbent MapReduce, which is batch-oriented and doesn’t lend itself to interactive applications or real-time stream processing. These big-compute-on-big-data capabilities have elevated platforms featuring computation-intensive machine learning, AI, and graph algorithms. Microsoft Azure ML in particular has taken off thanks to its beginner-friendliness and easy integration with existing Microsoft platforms. Opening up ML to the masses will lead to the creation of more models and applications generating petabytes of data. In turn, all eyes will be on self-service software providers to see how they make this data approachable to the end user. – Dan Kogan, director of product marketing at Tableau

Analytics will experience a revolution in 2017. In the past, conversations about big data always included Hadoop (HDFS). But the industry today has hit a wall with Hadoop's limited ability to back up and preserve big data. As a result, big data has become a black hole in the HDFS cluster with no one managing it. In 2017, the Spark operating model – through ‘in memory analytics’ – will become a popular Big Data analytics option due to its ability to significantly reduce data movement and allow analytics to occur much earlier and faster in the process. – Vincent Hsu, VP, IBM Fellow, CTO for Storage and Software Defined Environment, IBM


People may think backup and recovery is dead, but they are sorely mistaken: the move to the cloud actually makes backup and recovery more important than ever to safeguard data. Relying on the cloud won’t take care of everything! The need for backup and recovery will become very real as organizations continue betting on enterprise applications. Moreover, backup and recovery will take center stage as IT Ops and others in organizations have never stopped worrying about recovery, particularly as companies aggressively move toward modernized application and data delivery and consumption architectures. The risk of not knowing how to respond, or who to turn to, in the event of an outage is just too great. – Tarun Thakur, Co-founder and CEO at Datos IO

The Rise of the JBOD. In 2017, more users will come to understand that the storage for their scale-out nodes — whether you call it software-defined, “server SAN,” DAS, hyperconverged, whatever — can be attached externally to servers instead of buying servers with lots of disks and SSDs, without losing any of the performance or ease-of-use of internal DAS. Using simple, dumb, industry standard SAS JBODs (Just a Bunch Of Disks) means not having to throw away your storage when you upgrade your servers and vice-versa. It also gives you better flexibility and density in your deployments. – Tom Lyon, Chief Scientist, DriveScale


One of the ongoing challenges in using big data to improve outcomes in healthcare has been its siloed nature. Healthcare providers have detailed clinical (patient) data within their organizations, while health insurers (payers) have more general claims data that goes across many providers. That is beginning to change, though, as the move to value-based care is encouraging providers and health payers to share their data to create a more complete picture of the patient. The latest trend is to bring in additional behavioral data, such as socio-economic and attitudinal data, to create more of a 360 degree view of not only what patients do but also what drives them to do it. Much as Facebook and use behavioral data to match users to relevant content. By applying next-generation analytics to this larger dataset, providers and payers can work together to help patients become healthier and stay healthy, reducing costs while helping them lead happier, more productive lives. – Rose Higgins, President, SCIO Health Analytics

We’ll usher in the next iteration of personalized care. Increased self-tracking, preventative care efforts, and advances in data science will give us more information on patients than ever before. We’ll use this data to create highly individual portraits of patients that, in turn, enable us to match physicians to patients in a very specific way. We can assign physicians based on their past success in treating similar patients and enable patients to have more informed and personal care. – Mark Scott, Chief Marketing Officer, Apixio

Data Analytics will go vertical (financial, medical, etc), and companies that build vertical solutions will dominate the market. General-purpose data analytics companies will start disappearing. Vertical data analytics startups will develop their own full-stack solutions to data collection, preparation and analytics. – Ihab Ilyas, co-founder of Tamr and Professor of Computer Science at the University of Waterloo

Big Data Will Transform Every Element of the Healthcare Supply Chain: The entire healthcare supply chain has been undergoing digitization for the last several years. We’ve already witnessed the use of big data to improve not only patient care but also payer-provider systems, reduce wasted overhead, predict epidemics, cure diseases, improve quality of life and avoid preventable deaths. Combine this with the mass adoption of edge technologies that improve patient care and wellbeing, such as wearables, mobile imaging devices and mobile health apps. The use of data across the entire healthcare supply chain is about to reach a critical inflection point where the payoff from these initial big data investments will be bigger and come more quickly than ever before. As we move into 2017, healthcare leaders will find new ways to harness the power of big data to identify and uncover new areas for business process improvement, diagnose patients faster, and drive better, more personalized preventative programs by integrating personally generated data with broader healthcare provider systems. – Ashley Stirrup, CMO, Talend

As traditional industries (e.g. oil and gas) continue to digitize aspects of their businesses, they’ll eventually catch up to industries on the forefront of digital transformation. Healthcare, marketing and finance were the industries leading the trend in 2016; in 2017, manufacturing will tap into the data being generated from IoT to become a leader in digital transformation. – Adam Wilson, CEO, Trifacta




