Heard on the Street – 3/8/2023


Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!

The link between data democratization and your team’s agility. Commentary by Felipe Henao Brand, Senior Product Marketing Manager at Talend

Businesses are overflowing with data, but without the right processes, they won’t be able to share the right data with the quality needed to make informed business decisions. Prioritizing data democratization, or the process of making data available and accessible to everyone within an organization, can improve your team’s agility. Having the right data at every touchpoint will keep your teams focused and connected, leading to increased productivity and success. To achieve operational excellence in this aspect, leaders must establish clear policies and procedures for data access that are centered around transparency. This is a critical step in creating an efficient data democratization system because your team will not only have the access necessary to maintain data-driven insights, but they’ll also understand why this data matters in their day-to-day work.

The role of AI and ML in Identity & Access Management. Commentary by Jim Barkdoll, CEO of Axiomatics

The role of artificial intelligence (AI) and machine learning (ML) in Identity & Access Management (IAM) gets a lot of attention, but isn’t well understood. We are starting to see a shift as security leaders move from a traditional compliance-focused security approach to a more risk-driven approach. The difference is that in a traditional approach the organization reacts to known compliance requirements, while in a risk-based approach it continuously evaluates the threat landscape and takes proactive action to prevent threats. For IAM solutions, ML can be deployed to determine both whether someone is who they claim to be and whether they are authorized to use the data or apps they are trying to access. Machine learning technology can evaluate user requests in real time, assess the security context such as the device, network, and related behavioral data, and evaluate the risk. Access control policies can include this risk data to allow access, deny access, or require more stringent authentication. When AI and ML are introduced with the appropriate monitoring and reporting tools, enterprise organizations can visualize network access and reduce overall breach risk using intelligent, adaptable IAM policies.
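To make the allow, deny, or step-up idea concrete, here is a minimal Python sketch of a risk-aware access decision. It is an illustration only, not Axiomatics’ method or any product’s API: the `AccessContext` fields, the weights, and the thresholds are all hypothetical, and the `behavior_anomaly` value stands in for whatever score an ML model might produce.

```python
from dataclasses import dataclass


# All names, weights, and thresholds here are hypothetical illustrations.
@dataclass
class AccessContext:
    known_device: bool       # has this device been seen for this user before?
    trusted_network: bool    # is the request coming from a trusted network?
    behavior_anomaly: float  # 0.0 (typical) to 1.0 (highly unusual), e.g. from an ML model


def risk_score(ctx: AccessContext) -> float:
    """Combine the context signals into a single score between 0 and 1."""
    score = 0.0
    if not ctx.known_device:
        score += 0.3
    if not ctx.trusted_network:
        score += 0.2
    score += 0.5 * ctx.behavior_anomaly
    return min(score, 1.0)


def decide(ctx: AccessContext, step_up_at: float = 0.3, deny_at: float = 0.7) -> str:
    """Map the risk score to one of three policy outcomes."""
    score = risk_score(ctx)
    if score >= deny_at:
        return "deny"
    if score >= step_up_at:
        return "step_up"  # require more stringent authentication
    return "allow"


if __name__ == "__main__":
    print(decide(AccessContext(known_device=True, trusted_network=True, behavior_anomaly=0.1)))
    print(decide(AccessContext(known_device=False, trusted_network=True, behavior_anomaly=0.2)))
    print(decide(AccessContext(known_device=False, trusted_network=False, behavior_anomaly=0.9)))
```

In a real deployment the score would more plausibly come from a trained model and the thresholds from policy, but the three-way outcome is the part the commentary describes.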

ChatGPT and Ethical Decision Making: Can the Two Co-Exist? Commentary by Nick Orlando, Director of Product Marketing at Kore.ai

ChatGPT burst onto the scene last November, bringing the state and potential of AI to the forefront of everyone’s mind. We are now seeing applications of generative AI being integrated across various companies, which of course will spur economic growth. From healthcare and financial services to retail and entertainment, there’s a clear opportunity for generative AI to drive not just profit, but also additional innovation in the constantly evolving space. However, with the rapid mainstream adoption of large language models (LLMs) such as ChatGPT, many are wondering: What ethical considerations need to be taken into account? Technology is advancing at a pace that could potentially leave current legislation in the dust. If the industry continues to move in this direction, it is absolutely vital that ethical considerations such as security and data privacy are prioritized. AI ethics should not be an afterthought in this race to build and deploy solutions. Industry leaders must have open and ongoing discussions about the ethical implications of AI so that disastrous consequences are avoided. Society needs to be intelligent and diligent about how this technology is deployed. Simply put, we need to think about the main question at hand: it is no longer ‘can we do this?’ but ‘should we do this?’.

Embracing advanced automation to strengthen data analysis. Commentary by Beena Wood, Senior Vice President of Product Management, Safety, ArisGlobal

With established models trained on industry-relevant data, organizations from the smallest startup to the largest enterprise can take advantage of automation, such as artificial intelligence and machine learning, to transform manual, complex, and expensive processes in drug discovery and development, giving way to novel innovations, advanced personalized therapies, and accelerated patient access to safe treatments. Organizations need to focus more on data analysis, which leads to better outcomes, lowered risk, and reduced cost in the highly competitive life sciences space. Yielding accurate insights and connecting to real-world data (RWD) sources empowers organizations to build more accurate models to enhance drug discovery, discover potential adverse events, and strengthen post-marketing potential.

The data behind ChatGPT. Commentary by Travis Taylor, co-host of What the Hack with Adam Levin.

We’ve heard much about the capabilities of ChatGPT, but one question that has largely flown under the radar is where OpenAI acquired the data to enable it. The answer, unfortunately, is by scraping content that was posted online over the last few years.  This presents several concerns, both in terms of privacy and ethics. The data used in results may be compiled, at least in part, from copyrighted works, communications or messages not intended for wider use, proprietary code, or information meant to have been taken down under the GDPR’s “right to be forgotten.” As ChatGPT is already used by over 100 million people and is integrated into the Bing search engine, there’s currently no way to know what sensitive data may have been inadvertently shared with absolute strangers and no means of requesting its removal. This should have been addressed before its release to the general public.

In 2023, AI-driven insights will help merchants thrive amidst an unpredictable economic climate. Commentary by Michael Reitblat, CEO and co-founder, Forter

Retailers that harness AI and ML insights to understand their customers on a deep level (and on the flip side identify who is not a legitimate customer) will create superior experiences in-store and online. Ultimately, this will lead to stronger customer loyalty and lifetime value, all while stopping fraud from impacting the bottom line.

The value of AI and Big Data for client support cannot be overestimated. Commentary by Dmytro Tokar, co-founder of Zadarma

Running a successful business can be quite challenging. Combining functions such as sales, marketing, and quality control while ensuring the security of your business and customer data is no easy task. However, there are AI-powered tools that can optimize business processes. Sales and support by phone are relevant for most companies. Taking call centers as an example, AI can help answer the question of how to evaluate the effectiveness of a conversation. Why didn’t it end in a sale? Why is the client dissatisfied? AI-based services such as speech analytics can quickly and efficiently detect problems and shortcomings of operators or the conversation script. Chatbots can take the load off operators in the evenings and on weekends, and the need to wait for an employee to become available is eliminated, which significantly reduces the time needed to resolve the problem and satisfy the client. Moreover, based on the Big Data collected through the website, you can offer a potential or existing client the optimal or most interesting solution to their problems. When it comes to marketing, AI can help build optimal strategies, improve the path to purchase, and change the way leads are acquired and converted with a personalized, segmented approach. Call tracking and end-to-end analytics help determine not only the effectiveness of advertising channels but also the ROI of attracting each client. Thus, modern technologies represented by AI and Big Data are something that a customer-oriented business simply cannot do without.

Controversy in the Legality of AI. Commentary by Timothy Porter, CTO, Mod Tech Labs

Advances in diffusion models, the latest cutting-edge AI trend that generates a multitude of unique high-resolution images, have massively increased public interest in generative models. The intellectual property rights surrounding artificial intelligence are a hot topic. According to the U.S. Copyright Office, images and assets created purely by artificial intelligence cannot be copyrighted because there is no human creator involved. There are also larger legal ramifications of the illegal use of images, movies, videos, etc. in the creation of these models. Several lawsuits have already been filed, some citing multi-million infringements, alleging that using images in training without authorization misappropriates intellectual property and therefore constitutes copyright infringement. Ongoing advancements to the use cases and data sources will extend deeper and further than just images and image generation, into the intellectual property rights surrounding individual information. Several standards groups have taken note and are working to create a more level playing field for consumers and businesses leveraging this technology. One persistent issue that has continued to plague advancement is the availability of open datasets. Many companies have created patents and have processes that are considered trade secrets to cover both the output of their models and the input in order to avoid litigation. Input needs further scrutiny to ensure that no copyrighted images, assets, objects, or other data are being used, and that the data has minimal bias. Historically, open source datasets have been biased against a number of different groups, including minorities, and this has caused many secondary issues in bank loan approvals, misidentification of suspects, and other situations due to datasets that lack diverse inputs.

It’s Hard to be the Bard: Human intervention still needed in AI/ML. Commentary by Anthony Deighton, Chief Product Officer at Tamr

The reality is that while AI/ML makes it easy to resolve large amounts of data quickly and at scale, it lacks the human feedback needed to improve models. While machines excel at speed, humans provide feedback and ensure accurate results. Supervised AI/ML combines the best of what machines and humans have to offer. The more feedback humans provide, the better the machine becomes. Organizations benefit from the power of the machine to clean and curate data from a myriad of sources across multiple data silos while also reaping the value of human feedback in delivering the best results. The best modern data mastering solutions should be 80% machine, 10% humans and 10% rules.

Injecting LLMs into Key Applications. Commentary by Amin Ahmad, Chief Technology Officer at Vectara

Almost every single application developer in big and small companies is being asked a key question, “How/where/when should I build LLMs into our application architecture?” Good engineers realized early on that not every user request is served well by Generative AI. Engineers embedding LLMs into some of the most widely used sites and applications have learned that design patterns like fast LLM-powered retrieval as a precursor step to a generative prompt can help to reduce cost, increase performance, and reduce the possibility of AI hallucination and other trust issues. Leading app dev teams are beginning a new wave of innovation that includes redefining information discovery experiences, changing the way everyone from shopping consumers to data analysts to executives will request, discover and consume information. And as Generative AI creates new data, there will be an equal demand for applications that manage, serve and organize this new wave of information. Generative language models will continue to combine with neural search in new and unforeseen ways to provide unprecedented levels of automation to consumers and users. However, the key hurdle to reaching this future is the complexity of the underlying infrastructure required for adding these new capabilities into applications. Machine learning is notoriously complex to develop and deploy, with NLP models being among the largest and most computationally intensive of the bunch. It will be nearly impossible for application teams to avail themselves of these breakthroughs in language understanding until the requisite infrastructure, including dual encoders, vector databases, cross-attentional models, generative models, and purpose-built hardware such as TPUs and GPUs, is packaged into a cohesive, serverless platform accessible through simple APIs.
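The retrieval-before-generation pattern mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Vectara’s implementation: the function names are hypothetical, plain word overlap stands in for neural (dual-encoder) retrieval, and no actual LLM is called. The point is only the shape of the flow: retrieve first, then build a grounded prompt, and skip generation when nothing relevant is found.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query.

    Word overlap is a stand-in (assumption) for a neural retriever.
    Documents with no overlap are dropped entirely.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, passages):
    """Build a generative prompt grounded in the retrieved passages.

    Returns None when nothing was retrieved, so the caller can skip the
    (costly) generation step instead of letting the model guess.
    """
    if not passages:
        return None
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    docs = [
        "Returns are accepted within 30 days of purchase.",
        "Our office is closed on public holidays.",
        "Shipping takes five business days.",
    ]
    question = "How many days do I have for returns?"
    print(build_prompt(question, retrieve(question, docs, top_k=1)))
```

Restricting the prompt to retrieved passages is what is expected to lower cost and hallucination risk here: the model sees a small, relevant context rather than being asked to answer from memory.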

Gen AI’s Role in Data Scarcity? Commentary by Ajoy Singh, COO, Fractal

As generative AI begins to be incorporated across all industries to expedite production and revenue streams, the prospect of wrong, poor, or limited data being used in the machine learning algorithm can be a cause of concern for industry leaders, especially when the end results are significant and widespread. However, these concerns may be overblown, and in many cases, misplaced altogether. That is because as generative AI incorporation becomes more widespread, the number of safeguards used in the model to produce accurate data will also increase over time. Machine learning models will combat the problem of data scarcity by pooling patterns and knowledge from a vast number of reliable, trustworthy sources of information. The models will also cross-reference information to ensure that the data used in the model is good, reducing the risk of incomplete or inaccurate results. Furthermore, while generative AI products like ChatGPT can produce a significant number of results, these products will be programmed to only generate outcomes that are data-driven and therefore will not develop answers for questions or tasks where there is not enough knowledge or information available to produce a reliable result. For such situations, these products will learn over time based on user inputs and new data sources and improve the results.

