Heard on the Street – 5/15/2023


Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!

Generative AI will create zettabytes of data over the next 5 years. Is data storage ready? Commentary by Colin Presly, Senior Director, Office of the CTO, Seagate Technology

“Generative AI’s transformative potential has captured imaginations across the technology sector and beyond. But there has been little discussion of the practical data storage implications organizations must address to maximize this technology. Generative AI isn’t waiting for anyone. In only two months, ChatGPT gained more than 100 million users, and companies are rushing to release their own AI bot solutions and integrate the technology into their current processes.

Data is the currency of AI. As companies develop and adapt AI tools for internal use, they will train them on internal and external data. They can only do this effectively and efficiently if they have the right data storage and management processes in place. This includes comprehensive data classification and the ability to move data seamlessly and in real time to where it can provide the most value.   

Simultaneously, chatbots as well as image and video AI generators will give rise to more data for companies to manage as they generate content. First, the more data AI models can train on, the better and more robust their inferences, which stresses the need to preserve all the data that companies can store. Second, Seagate predicts that generative AI apps will also create more data for companies to manage as they generate answers, data that will need to be preserved to feed future AI algorithms.

By 2025, Gartner expects generative AI to account for 10% of all data produced, up from less than 1% today. By cross-referencing this study with IDC’s Global DataSphere Forecast study, we can expect that generative AI technology like ChatGPT, DALL-E, and DeepBrain AI will create at least several zettabytes of data over the next five years.  
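The cross-referenced estimate can be sanity-checked with back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not the studies' exact numbers: a rough IDC-style forecast of about 180 ZB of data created per year by 2025, combined with Gartner's 10% generative AI share.

```python
# Rough sketch of the zettabyte claim. Both inputs are assumptions
# standing in for the cited forecasts, not their exact figures.
idc_datasphere_zb_2025 = 180    # assumed annual data creation, zettabytes
gartner_genai_share = 0.10      # Gartner: 10% of all data by 2025

genai_zb_per_year = idc_datasphere_zb_2025 * gartner_genai_share
print(f"Generative AI output by 2025: ~{genai_zb_per_year:.0f} ZB/year")

# Even holding the share at today's sub-1% level, five years of output
# already lands in zettabyte territory:
conservative_share = 0.01       # 1%, roughly today's level per Gartner
five_year_floor = idc_datasphere_zb_2025 * conservative_share * 5
print(f"Five-year floor at a 1% share: ~{five_year_floor:.0f} ZB")
```

Under these assumptions the 10% share alone implies on the order of 18 ZB per year, so "at least several zettabytes over five years" is a conservative reading.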

Massive data sets need mass-data capacity storage. Organizations can only take advantage of AI applications effectively if their data storage strategy allows for simple and cost-effective methods to train and deploy these tools at scale.” 

Super cloud – what is it? Commentary by William Collins, Chief Cloud Architect,  Alkira

“Organizations looking to stay ahead of the digital transformation curve are realizing that the super cloud is key. By leveraging cloud-native tools from multiple clouds, and optimizing performance with a service-oriented approach, networking complexity becomes no more than an afterthought. The enterprise sector has fully embraced the cloud; nearly all large companies (94%) leverage multiple clouds for their IT needs. By utilizing a super cloud, businesses can speed deployment times by up to 80% and save on infrastructure costs by up to 60%. With the super cloud, network teams can now keep up with DevOps and deliver a flexible, fast, and modern infrastructure that moves at the speed of the business.”

On which industries will be most affected by machine learning within the next 5 years. Commentary by Bernadette Nixon, CEO at Algolia

“AI tech is advancing so quickly that within the next five years, companies in industries like retail and ecommerce that are not AI-powered will face obsolescence. Technologies like ML-powered search engines are already setting new standards for retailers, learning from user behavior to predict intent, inform buyer profiles, and provide relevant results while respecting user privacy. Consumer behavior is constantly changing, so retailers with the speediest, most user-friendly platforms will stay competitive.”

Why a Pause in AI Development is Not the Answer. Commentary by Michael Hayes, CEO of InWith AI

“The AI community has been a-buzz with nervous chatter ever since Elon Musk’s open letter suggested a pause in AI development. 2022 ignited a cascade of AI innovation at a pace previously unseen, and now skeptics are questioning whether the collective movement has created a Frankenstein. But abandoning this technology now would have dire consequences. You simply cannot put the genie back in the bottle. For as advanced as we think we are, we are only seeing the tip of the iceberg in terms of the benefits and uses of this technology. To put the brakes on now would unnecessarily cripple impactful societal innovation. There are safe ways to leverage AI and guard against the risks of misinformation and harm. Rather than stepping away, our job is to build better models with superior control mechanisms.”

Data visualization upskilling is the next milestone for the modern workplace. Commentary by John F. Bremer Jr., CTO and Chief of Business Development at LiftedViz

“We’ve all heard about incidents of data loss costing companies copious amounts of money, but what about all the businesses that aren’t even putting their data to good use? The unexplored asset of data might as well represent a loss for those organizations. Nowadays, the more value you can garner from your data, the more returns you’ll be able to receive from it (be it in productivity, process efficiency, or straight-up revenue streams). The next question is how to extract this value and who should be able to do it in the workplace. The answer is anyone in any department. Hiring a data analyst to create reports and find trends in datasets might be a costly expense. However, data analysis tools, such as data visualization platforms, are becoming widely available for free, allowing anyone to leverage them without needing a professional degree. Visually analyzing data can help business owners and staff uncover future market trends, examine business performance, find process inconsistencies, and even spot reasons for revenue loss. So, to yield these benefits and close the gap between productive and stagnant data, businesses must focus on upskilling their employees on data visualization. Thankfully, it’s a rewarding and straightforward task: Learning the basics to make the most of visualization tools takes one to three hours of training, and video tutorials are available online for free or through inexpensive online courses. As businesses produce more data by the minute, upskilling workers to make sense of it via visualizations is becoming increasingly essential in the workplace. The benefits of understanding your data will prove to be countless.”

GPT-4 Provides Improved Answers While Posing New Questions. Commentary by Richard Searle, Vice President of Confidential Computing at Fortanix

“It has been a busy time in the world of large language models (LLMs). In late March, OpenAI disclosed a breach of users’ personal data. ChatGPT does not appear to have issued an independent apology. As a mere algorithm, how could it? “Right” and “wrong” are only known to it by the artifice of its content filters. Soon after reporting of the initial OpenAI breach came the Future of Life Institute’s open letter, calling for a six-month halt to what they termed “giant AI experiments.” The letter was signed by many of the signatories to another open letter from Future of Life Institute in 2015 that warned of the dangers of autonomous weapons that wield the power of AI to cause harm. With the scale of investment in AI, the breadth of potential applications, and the riches to be gained through competitive AI advantage, it is unlikely that the call by the Future of Life Institute will be heeded. The new capabilities of GPT-4 and services based on the prior generation of GPT technology are imbued with significant risks, many of which remain unforeseen and unimagined. While the risk to data privacy and confidentiality is evident at the implementation level, those risks are an intrinsic component of GPT-4 and other centralized AI systems. How we address these risks as a society, to exploit the welcome benefits of AI without the sacrifice of cherished human rights, is one of the many questions posed by accelerating advancements in machine intelligence. If we are not sufficiently concerned about the information our human-machine conversations disclose about us today, we might worry more when, as a team of AI researchers have already demonstrated, one day soon, machines will possess the capability to actually read our minds. There is much for humanity and GPT-4 to ponder.”

The need for strong European players in the AI space and the potential of smaller, more targeted language models. Commentary by Victor Botev, CEO and Co-founder of Iris.ai

“With evident progress in AI development from the US and now China, it’s more important than ever for Europe to have strong players in the AI space. Large language models (LLMs), such as ChatGPT and Alibaba’s equivalent, are already prone to hallucinations and errors. Competing biases, instilled through the development process of models from major players in different regions, pose a huge problem for the acquisition of objective information using AI. Smart language models, with a focus on high-quality data for a specific use over large quantities of data, can avoid these issues. In the international race to build the next LLM, let’s not forget that language models targeted at specific use cases are already providing practical value for many organizations, augmenting human capacity and delivering high-quality outputs without compromising factuality.”

On the Future of AI Language Modeling. Commentary by Jake Klein, CEO of Dealtale

“There’s a sweet spot at the intersection of data science and business where causal AI modeling is actually helping organizations use their own data to improve decision-making outcomes within the business – meaning models could be trained to understand the language and approach of each industry or even each company. Using causal inference can allow marketers to analyze campaign strategies to predict the best avenues to success for the business within seconds. This enables an unprecedented level of understanding, literacy, and intimacy with the data that ultimately helps departments prove their value to the organization during a time when every department has to bring something to the revenue table.”

ChatGPT underscores importance of data quality & model training. Commentary by Tooba Durraze, VP of Product (AI and Data) at Qualified

“People who rush to implement gimmicky features – which are still really cool – are probably not going to be the ones who find actual utility in the technology. The people who are more structured in what they go out with, or who at least start to make long-term investments, are going to be the ones to take advantage of this long term. AI can emulate a human, no problem. It can answer a question like you, and it can even physically look like you, but how good it is depends on how long you train the model and the data you use to train it. Knowing that the technology is a lot further ahead than even the use cases we are talking about today is going to make a difference, especially for business leaders.”

Transparent, Explainable AI in Healthcare is Necessary to Build Clinician Trust and Deliver Personalized Care. Commentary by Christine Swisher, PhD, Chief Scientific Officer, Ronin 

“Clinicians are naturally passionate about delivering high-quality care, but healthcare needs to be faster to adopt cutting-edge technologies that improve patient outcomes. The discussion surrounding the use of AI in healthcare has become pervasive due to its potential to efficiently analyze disparate data, surface hidden insights into a patient’s disease state, and inform personalized treatment. Healthy skepticism exists around its safety and efficacy, and AI can be an opaque black box for clinicians who lack data science training. To increase the adoption of transformative approaches to care, like AI, vendors must build AI-human interfaces designed for transparency and explainability. Delivering this technology requires a deep understanding of clinical workflows and how clinicians access and utilize data. Components vital to transparent, explainable AI interfaces include an intuitive user experience, a concise presentation of inherent biases, and a clear breakdown of the models using accessible language. There is no technology to replace humans when it comes to delivering care. However, we can impact patient lives by expertly designing technology for the clinicians’ experience and building trust by removing friction points via transparent, explainable AI technology.”

Core pillars needed to navigate an increasingly complex data landscape. Commentary by Óscar Méndez Soto, CEO and Founding Partner, Stratio BD

“Today, businesses are dealing with a huge amount of data that is growing exponentially at scale. Having to also navigate a proliferation of data intelligence platforms and dashboards from different sources is resulting in increasingly dynamic, complex, and often siloed data sets. With so many challenges afoot, data fabric solutions are clearly the next stage in data management maturity. Companies that try to create a unified data layer via human effort alone are realizing that it is impossible to do at scale. Data fabric solutions can do this automatically, with no human effort required. A data fabric also makes the data easy to understand by business units that have no expertise in this area – pointing to true data democratization – while being more trustworthy in terms of quality and security. Finally, data fabric greatly enhances efficiency in getting value from data, at 4x the speed of human efforts. This capability might sound unbelievable, but a growing number of global financial, retail, and telecom companies are already embracing it, realizing that data fabric is the future.”

Don’t Let Cloud Costs Catch You Off-Guard. Commentary by Adit Madan, Director of Product Management at Alluxio

“With the economic outlook continuing to be uncertain, containing cloud costs has become a business imperative. While major public cloud providers, such as AWS, Google Cloud, and Microsoft Azure, offer free data input to the cloud, they impose high fees for data retrieval from the cloud, also known as data egress fees. Data egress fees are charged retroactively and are therefore often considered hidden fees. This means that applications, workloads, and end-users may unknowingly accumulate high fees until they receive the bills. Data caching is a very effective technique for reducing data access costs because it reduces data round trips across the network by bringing data closer to applications. Efficiency should be a guiding principle in any data architecture, and one of the goals should be to avoid unnecessary data copies, duplication of storage, or network traffic.”
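The caching idea described above can be sketched in a few lines. This is a minimal illustration using Python's standard `functools.lru_cache`, not any particular vendor's product; the fetch function, object key, and fee model are hypothetical stand-ins for a real cloud storage client.

```python
from functools import lru_cache

# Counter standing in for billable egress: in a real deployment, each
# call to the cloud store would incur an egress fee.
EGRESS_CALLS = 0

def fetch_from_cloud(object_key: str) -> bytes:
    """Hypothetical cloud read; every call crosses the network."""
    global EGRESS_CALLS
    EGRESS_CALLS += 1
    return f"payload-for-{object_key}".encode()

@lru_cache(maxsize=1024)  # cache layer sitting close to the application
def read_object(object_key: str) -> bytes:
    return fetch_from_cloud(object_key)

# Ten reads of the same hot object: only the first one is a cache miss,
# so only one network round trip (and one egress charge) occurs.
for _ in range(10):
    read_object("models/weights.bin")

print(EGRESS_CALLS)  # prints 1 instead of 10
```

The same principle scales up: whether the cache is an in-process LRU, a local SSD tier, or a distributed caching layer, repeated reads are served near the application instead of repeatedly crossing the billable network boundary.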

Generative AI Beyond Prompting for Artists. Commentary by Dr. Ahmed Elgammal, CEO and Founder of Playform.ai

“Last year witnessed an explosion in generative AI text-to-image tools, pioneered by DALL-E 2 and the open-source Stable Diffusion. Creatives took note of the potential power of generative AI that can produce images and videos, all from a simple text prompt. With it emerged heavy debates around the ethics of using images to train large models without artists’ consent, debates on copyright infringement, and whether art created using AI tools should be considered “art.” Text-to-image prompts, however, do not line up with the creative process of most artists.

For visual thinkers who use visual imagination and references in their projects, text prompts can feel very foreign and unnatural in the creative process. This is partly because language is a higher-level intelligent construct, more sophisticated than the visual stimuli communicated through visual art. It takes non-obvious effort to describe an artwork with language. Additionally, “prompt engineering” requires the right keywords to invoke certain visual effects and generate desirable imagery. This reverse-engineering is a process totally foreign to an artist’s creative process. Moreover, all the effort an artist puts into learning how to engineer a prompt will soon become obsolete with the release of the next version of these models, requiring them to start the reverse-engineering process all over again.

The bottom line is that text-to-image tools are for consumers who want to create visuals fast, which is not really what artists want. Artists are finding AI tools that fit their creative process and allow them to take their art to the next level. There are other ways to generate AI images that do not involve text prompting and do not invoke copyright infringement. There is AI training technology out there that allows you to train your own AI model from scratch based on your own images, with no worry about violating any copyright laws. I am looking forward to seeing these AI training tools evolve to provide a natural way for artists to harness the power of AI as a partner in their creative process.”

Contributory Data Models. Commentary by Tyler Jones, chief customer officer at CLARA Analytics 

“AI solutions are hungry for data. AI solutions with a contributory data model allow enterprises across the industry to benefit and improve their outcomes. Large enterprises that have long tried to develop their own AI solutions in-house are finding value in purchasing outside solutions with contributory data models. The broader industry view improves AI performance and helps mitigate bias. The value of sharing anonymized transactional data now outweighs the perceived competitive advantage of keeping it walled within the enterprise.”

How Ubiquitous Can AI Get? Commentary by Vid Jain, CEO and founder, Wallaroo.AI

“Data is an untapped goldmine for whole categories of organizations. Professional sports teams, for example, generate vast amounts of data from a variety of core business sources, including ticketing, concessions, retail, social media, vendors, and more. But historical trends have become less predictive of where the business needs to go, due to changes in the economy and the increasing fickleness of the consumer. AI is increasingly important as these teams try to understand new data. Plus, they can improve other areas of their business: better security, parking, and concession experiences; pricing better matched to demand to increase revenue; improved performance marketing; customer support; and more.

What excites me the most is the potential to bring personalization at scale with AI – How do you optimize the experience for each and every customer through all the touchpoints they have with a business? Organizations need to be able to deploy, test, and iterate on new AI ideas quickly and see which ones are working. That’s been a historical hurdle but no longer. Today we have great options – and that’s going to make AI everywhere all at once possible.”

How Optical AI Can Protect Against Counterfeit Art. Commentary by Roei Ganzarski, CEO of Alitheon

“As the global art market continues to grow, so does the counterfeit art market. Experts estimate that at least 50% of artwork in circulation may be fake, costing investors millions of dollars and eroding trust in the industry. Traditional anti-counterfeiting measures have become easier to copy and circumvent, but the rise of optical AI technology has created a new solution to an old problem.

Optical AI technology is at the forefront of the fight against counterfeit art as it can easily authenticate genuine pieces of art with no special equipment, physical markings or false positives. Using a unique digital ‘fingerprint’ inherent in each piece, anyone can identify a verified piece of artwork with just a smartphone. So as the art world continues to evolve, we must turn toward cutting-edge technology like optical AI to protect investors and ensure art authenticity for generations to come.” 
