I found an interesting discussion going on in the Global Big Data & Analytics group on LinkedIn – “Why do Hadoop projects fail?” Having just returned from the Hadoop Summit 2014 in San Jose, where I witnessed plenty of examples of wildly successful Hadoop implementations, I was intrigued by the effort to itemize the causes of failed projects. Here is the list of causes that kicked off the discussion:
- Inability to ingest data
- Inability to access data in-place
- Inability to get a cluster running and stable
- Inability to reach a successful proof of concept
- Inability to complete a successful pilot project
- No financially compelling use case
- Lack of exec sponsorship, budget, priority
The discussion took some interesting paths. Here is a sampling of comments accompanied by my own take:
“Lack of articulation and clarity of the business problem that one is attempting to solve, linked with what data is required and how the question would be answered using the data.”
This is an excellent point, and one I’ve seen in my own data science consulting practice. If a client says “Here’s some data, now go do your magic,” I’ll run for the hills, because without a singular purpose and a well-defined goal the project is destined for failure. But this is true of all data science projects, not only those based on Hadoop.
“I am surprised there is no mention of organizational barriers. Unlike many other technologies, Big Data may require coordination and agreement across many business units that are not used to working together. Because one of the most common use cases is to centralize data, or create a data lake, companies need to deal with wide initiatives to enable the implementation of such projects. Business units will usually look at the benefits of such corporate initiatives at their own level, balancing this against their existing business priorities and roadmap. In order to succeed, Big Data projects may require organizational structure changes.”
So true! Someone in the enterprise needs to “own” the big data project, and a Hadoop project is no different. The need may originate with finance, but in all likelihood IT will need to weigh in, since they are probably responsible for the company’s data assets. If the organization has no concept of data governance, then a big data project is going to be a stretch for its organizational capabilities. Stepping back to put governance policies in place first may be well worth the time.
“I’ve led teams on two successful Hadoop projects, but I’m not sure that Hadoop projects are any different than any other technical projects. If I think about the reasons that tech projects I’ve been a part of have failed, it’s pretty consistent with the literature on why tech projects fail (which is not often about the technology) – poor executive sponsorship, no clear objectives, poor scoping, bad requirements, poor project management, mismatched resourcing, lack of cross-functional alignment, mismanaged expectations, etc.”
The above was obviously said by someone who’s been down in the trenches on an enterprise-wide technology project – his experience speaks volumes. Great advice!
If you’ve been down this road yourself and have some wisdom to offer, please leave us a note here.
Daniel – Managing Editor, insideBIGDATA
Sign up for the free insideBIGDATA newsletter.