Snowflake Big Data Industry Predictions for 2024


Our friends over at Snowflake have prepared a special set of compelling technology predictions for the year ahead. From the company’s point of view, 2024 should be quite a year! Straight from the executive suite, you’ll learn about what’s predicted to happen with AI, GenAI, LLMs, BI, data science, data engineering, and much more. Enjoy these special perspectives from one of our industry’s best-known movers and shakers.

Sridhar Ramaswamy, SVP of AI, Snowflake

Prediction: Generative AI’s negative impacts will be hard to manage early on — including job loss, deep fakes, and a deepening digital divide.

Although generative AI is reimagining how we interact with machines, there are some immediate concerns that will be particularly challenging in the early years of widespread AI and language model adoption. For a lot of people involved in what we loosely call “knowledge work,” quite a few of their jobs are going to vaporize. Rapid change makes it hard to quickly absorb displaced workers elsewhere in the workforce, and as a result both the private sector and governments will need to step up. Deep fakes are another hurdle, and we can expect increased attacks on what we humans collectively think of as our reality, resulting in a world where no one can, or should, trust a video of you because it may be AI-generated. Finally, advances in AI will exacerbate the digital divide that has been widening over the past 20-30 years between the haves and have-nots, and will further increase inequality across the globe. I can only hope that by making information more accessible, this emerging technology leads to a new generation of young adults who better understand the issues and potential, and can counter that risk.

Prediction: Ethical guardrails for AI will emerge, from both private and public sectors, faster than with other tech upheavals such as privacy.

I’d like to think that we’ve learned from our past when it comes to establishing safe and ethical rules for leveraging new technologies, with the lack of privacy frameworks and guidelines around sensitive data serving as a cautionary tale of what not to do. Governments are stepping up earlier in the cycle when it comes to AI adoption and use. For example, in mid-September the U.S. Senate hosted a private, informational round table that included leaders from OpenAI, NVIDIA, Google, Meta and more. However, quick regulatory intervention will not solve all problems, and I suspect the industry will primarily be responsible for defining what “responsible AI” means. Narrow tech regulation is very hard. Look at Section 230, part of the 1996 U.S. Communications Decency Act, which provides immunity to websites for what third parties post. While it made the internet as we know it possible, the internet is also rife with lies, hate speech, and bullying. We’ve seen that well-meaning regulation can sometimes play out in bad ways.

Prediction: LLMs will become commonplace, but most people will use “MLM”s (smaller models trained using the very large ones) because we don’t all need trillion-parameter models!

As large language models (LLMs) become more democratized, we’ll see most organizations start to downsize, with smaller language models becoming the industry standard. There will still be some big players, but in general most vendors will fine-tune smaller models catered toward specific verticals and use cases. I see a future with millions of smaller language models, operating at the company or department level, and providing hyper-customized insights based on the employee or need. Smaller language models require less time and fewer resources to maintain, can be operated inside a company’s existing security perimeter, and are often faster and more accurate because they’re optimized for a narrower set of tasks compared to the do-it-all models that have garnered most of the attention to date. There is more and more evidence that a 20 billion parameter model can do most of the things that you want from language models, and do them as effectively as, if not more effectively than, a model the size of OpenAI’s GPT-4 (reportedly ~1.8 trillion parameters).

Sunny Bedi, CIO and CDO, Snowflake

Prediction: AI will be your best work buddy. 

One of the most exciting ideas for workplace productivity is the deployment of AI assistants that help employees become — and continue to be — efficient and effective. For example, onboarding new workers is a complex process of educating the worker about systems, processes, and culture, alongside ensuring that they quickly gain access to only the right systems and projects. Looking forward, we can expect AI assistants tuned to specific departments and roles to provide that orientation, tied to their individual persona and accompanying them throughout their tenure at the company. As an organization’s processes and needs become more mature, they can then train the agent to do the same thing on their behalf next time. And when they hire a new person into their organization, that person gets that full wealth of knowledge from the beginning. Taking it one step further, these AI assistants will start contributing to larger enterprise knowledge. By ingesting new documentation and thorough feedback from existing employees, scouring the internet for new ways to optimize processes, and more, these AI assistants will become every employee’s best work buddy. This will quickly become the workplace standard and table stakes for increased productivity.   

Prediction: Developers expect to be 30% more efficient using generative AI assistants.

I asked my developer team to estimate how much of the code they’re writing could be produced by a generative AI tool, and they consistently estimate 30%. While this has yet to be proven, that level of efficiency is a real game changer. Beyond that initial increase in productivity, there are also the benefits of reusability and sharing. The AI-generated piece of code that makes development 30% more efficient today could be reused to help transform the AI tool from an assistive technology to a semi-autonomous technology. By continually improving the AI tool, organizations could then get it to extend code generation and efficiency even further, and leverage it again on other projects, increasing the overall capabilities of future developments. In addition, I expect testing and quality assurance to eventually be assisted by an AI agent, leading to faster and higher-quality deployments. Developers will still be essential for the creative thinking and problem solving necessary to drive innovation forward, but their day-to-day activities and where they spend their brain power will shift.

Jeff Hollan, Director of Product Management, Snowflake

Prediction: Data engineering will evolve — and be highly valued — in an AI world.

There’s been a lot of chatter that the AI revolution will replace the role of data engineers. That’s not the case, and in fact their data expertise will be more critical than ever — just in new and different ways. To keep up with the evolving landscape, data engineers will need to understand how generative AI adds value. The data pipelines built and managed by data engineers will be perhaps the first place to connect with large language models for organizations to unlock value. Data engineers will be the ones who understand how to consume a model and plug it into a data pipeline to automate the extraction of value. They will also be expected to oversee and understand the AI work.

Prediction: Data scientists will have more fun. 

Just as cloud infrastructure forced IT organizations to learn new skill sets by moving from builders of infrastructure and software, to managers of third-party infrastructure and software vendors, data science leaders will have to learn to work with external vendors. It will be an increasingly important skill to be able to pick the right vendors of AI models to engage with, similar to how data scientists today choose which frameworks to use for specific use cases. The data scientist of tomorrow might be responsible for identifying the right vendors of AI models to engage with, determining how to feed the right context into a large language model (LLM), minimizing hallucinations, or prompting LLMs to answer questions correctly through context and formalizing metadata. These are all new and exciting challenges that will keep data scientists engaged and hopefully inspire the next generation to get into the profession.

Prediction: BI analysts will have to uplevel. 

Today, business intelligence analysts generally create and present canned reports. When executives have follow-up questions, the analysts then have to run a new query to generate a supplemental report. In the coming year, executives will expect to interact directly with data summarized in that overview report using natural language. This self-service will free up analysts to work on deeper questions, bringing their own expertise to what the organization really should be analyzing, and ultimately upleveling their role to solve some of the challenges AI can’t.

Christian Kleinerman, SVP of Product, Snowflake

Prediction: It’s not just generative AI — it’s the apps and experiences it enables that will revolutionize how we live and work.

If you look at the smartphone in your hand, it’s not the phone itself that makes it revolutionary; it’s all the various applications that you use every day for different functionalities. The generative AI revolution follows a similar pattern, with the data and apps being where the value will be realized for this next phase of innovation. An explosion of apps and novel use cases was already underway, but it will be significantly accelerated by generative AI and large language models. We’ll see applications across categories and industry verticals leverage these technologies, and many more apps will have AI-based search, conversational, and assistive experiences built in. These apps will bring true disruption, mostly around end-user experiences and interactions. How will people access systems in the future? They’ll need to know less, lowering the bar to access and broadening the reach of technology. Today there’s expertise around how to manage a CRM or point-of-sale app. In the fullness of time, app management will converge toward simpler experiences with natural language as a core interface. It’s going to be a different experience that redefines roles and responsibilities across the board.

Adrien Treuille, Director of Product Management and Head of Streamlit, Snowflake

Prediction: “Generative AI Apps” like chatbots will be easily built by enterprise data teams, and become commonplace for daily tasks.

Large language models are not only a super powerful new form of AI, but also incredibly easy to use. The recent State of LLM Apps 2023 report analyzed how more than 13K developers created more than 21K generative AI apps in just a matter of months. About 74% of those apps were built with OpenAI, which democratized the development of generative AI apps through its simple API, catalyzing a new wave of innovation in the open source community. Now, however, major data and compute platforms are bringing these simple APIs into the enterprise. Looking ahead to 2024 and beyond, that same level of creativity, bold experimentation, and new applications we’ve seen in the open source world will surge within the corporate sector. Generative AI will become a driving force in business, making AI applications a regular feature for various daily business operations and tasks.

Prediction: The hallucination problem will be largely solved, removing a major impediment to the adoption of generative AI. 

In 2023, “hallucinations” by large language models were cited as a major barrier to adoption. If generative AI models can simply invent facts, how can they be trusted in enterprise settings? I predict, however, that several technical advances will all but eliminate hallucinations as an issue. One such innovation is Retrieval Augmented Generation (RAG), which primes the large language model with true, contextually relevant information right before prompting it with a user query. This technique, still in its infancy, has been shown to dramatically decrease hallucinations and is already making waves. According to the State of LLM Apps 2023 report, which analyzed 21K+ apps built by 13K+ different developers, 20% of the apps use vector retrieval, a key ingredient of RAG. In the coming year, I predict that RAG, along with better trained models, will begin to rapidly solve the hallucination problem, ultimately paving the way for widespread adoption of generative AI within enterprise workflows.
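
To make the RAG idea concrete, here is a minimal sketch in Python of the two steps described above: retrieve the most relevant passage by vector similarity, then place it in the prompt ahead of the user's question. Everything in it is an illustrative assumption rather than any vendor's API: the `embed` function is a toy word-count vector standing in for a real embedding model, `DOCS` is made-up sample data, and the resulting prompt would be passed to whatever language model you use.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector.

    Assumption: a real system would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Vector retrieval: return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prime the model with retrieved context right before the user query."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical sample knowledge base, for illustration only.
DOCS = [
    "The travel policy allows economy class flights for trips under six hours.",
    "Expense reports must be submitted within 30 days of purchase.",
    "The office is closed on public holidays.",
]

if __name__ == "__main__":
    print(build_prompt("When must expense reports be submitted?", DOCS))
```

Because the model is asked to answer from the retrieved passage rather than from memory alone, it has less room to invent facts. Production systems typically swap the toy pieces for a learned embedding model and a vector database.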

Prediction: The open source ecosystem around generative AI will parallel and rival the corporate ecosystem.

Six months ago, many were worried that the generative AI landscape would be dominated by the few big companies that could afford to build “foundation models.” These massive AI brains with tens of billions of parameters form the basis for all other generative AI apps, from summarization, to chat, to image generation. However, a startling development of the past several months is Meta’s open-sourcing of LLaMA and LLaMA 2, essentially putting such foundation models directly in the hands of academics and the open source community. Next year, we’ll continue to witness significant developments in large language models and large generative models from the open source community. We’ll see a combination of more models becoming open source, as well as new technologies emerging like Low-Rank Adaptation (LoRA), which lets academics fine-tune existing models faster while consuming less memory. LoRA took everyone by surprise, and I expect to see even more innovation outside corporate structures in this supposedly impossibly rarified world of 70 billion parameter models.
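
A rough sketch of why LoRA saves memory may help here. Instead of updating a full weight matrix W of shape d_out × d_in, LoRA keeps W frozen and trains two small matrices, B (d_out × r) and A (r × d_in), whose product is added to W. The Python below only counts parameters and applies the update to a tiny matrix; the layer sizes, the function names, and the plain `scale` factor are illustrative assumptions, not taken from any particular model or library.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters with LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def effective_weight(w, b, a, scale=1.0):
    """Frozen weight plus the low-rank update: W + scale * (B @ A).

    Assumption: `scale` stands in for the scaling factor that real
    implementations apply to the update.
    """
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    # Hypothetical layer size chosen for illustration.
    d_out, d_in, rank = 4096, 4096, 8
    full = full_finetune_params(d_out, d_in)
    low = lora_params(d_out, d_in, rank)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank {rank}): {low:,} trainable parameters")
    print(f"reduction: {full // low}x")
```

For this hypothetical 4096 × 4096 layer at rank 8, the trainable parameter count falls from 16,777,216 to 65,536, a 256-fold reduction, which is the kind of saving that lets small teams fine-tune large models on modest hardware.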

Prediction: Generative AI will accelerate the impact of small, open source software teams.

History shows that tiny teams — sometimes just one person! — can have outsized impact with open source. Generative AI is going to amplify this “open source impact effect” to incredible new levels. When we look at the cost of developing open source, actually writing the code itself isn’t the expensive part. It’s the documentation, bug handling, talking to people, responding to requests, checking examples of code on GitHub and more — all of which is very human-intensive. The open source community will benefit from generative AI for the same reason so many other efforts will: efficient elimination of tiresome human tasks. By helping with all of that, large language models will accelerate open source development this upcoming year, making smaller teams even more powerful.
