Heard on the Street – 2/21/2023


Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topics areas: big data, data science, machine learning, AI and deep learning. Enjoy!

How ChatGPT will improve education. Commentary by Research Director at The Learning Agency, Perpetual Baffour

ChatGPT, a revolutionary chatbot technology, is sparking controversy, debate, and even citywide bans. OpenAI, a research company specializing in AI and machine learning, released the large language model late last year; it can read, analyze, and create original text akin to a human. While there are legitimate concerns about ChatGPT's fit in the classroom, natural language processing (NLP) tools like ChatGPT demonstrate the promise of artificial intelligence (AI) in helping students learn and write better. For instance, ChatGPT can be used for automated writing evaluation: systems that use NLP technologies to automatically assess student writing. Automated writing evaluation can benefit classrooms because students rarely receive writing tasks in school, teachers struggle to find time to provide feedback on student writing, and as a result few students in the U.S. graduate high school proficient in writing. As a virtual assistant, NLP tools like ChatGPT can identify areas of student writing that need improvement and provide automatic suggestions on grammar, spelling, clarity, cohesion, word choice, and more. They can also generate writing prompts to help students hone their writing skills, or tailor feedback to specific student needs, as the Feedback Prize algorithms do for English Language Learners. Automated assessment also saves teachers the time and effort of manual grading, and AI text generators can further support teachers in generating ideas for lessons, tests, and quizzes. Educators should not fear this technology. NLP tools will provide more opportunities for students to learn and empower teachers in assessment and instruction.

How AI models like ChatGPT change sales. Commentary by Parth Mukherjee, global VP of product marketing at Mindtickle

While some fear that ChatGPT is coming for their jobs, it’s almost impossible for AI to completely replace a human’s finesse and expertise. These advances in AI are just changing the way salespeople interact with potential customers. In recent years, B2B buying has become more reliant on digital sources of information, with buyers doing their own research online rather than having to rely on their sales contact for basic information. AI models like ChatGPT will have a huge impact on this stage of the selling process, with the AI consuming more signals from the vendor’s website and other online sources like G2, making the digital research process easier and more in-depth. The next stage, where the selling happens, will still require a human, personal touch that can’t be automated.

ChatGPT: How Cyber Risk Professionals Should Adapt Threat Models. Commentary by Igor Volovich, Vice President of Compliance Strategy at Qmulos

The advent of ChatGPT and other AI technologies represents a major shift in the cybersecurity landscape. On one hand, ChatGPT has the potential to greatly augment the abilities of cyber risk professionals by automating tedious and time-consuming tasks. This allows security personnel to focus on more complex and strategic work, increasing their efficiency and effectiveness. However, the lower barrier to entry for attackers that comes with AI technology also presents new challenges. With ChatGPT and other AI tools readily available, malicious actors now have access to advanced capabilities that were previously accessible only to highly skilled cybersecurity professionals. This means that even novice attackers can now carry out sophisticated attacks with ease. As a result, it is more important than ever for cyber risk professionals to adapt their threat models to incorporate these new technologies. This includes not only utilizing ChatGPT and other AI tools to improve their own security protocols, but also staying informed about the latest advancements in AI and anticipating the potential new threats that come with them. Embracing big data analytics and investing in automation that goes beyond simple task or workflow efficiency improvements (i.e., RPA) should be considered a strategic priority for smart enterprises seeking to maintain a credible defensive posture in a climate characterized by stagnating security budgets, talent acquisition and retention challenges, and a constantly evolving threat landscape.

Launch of data mesh creator’s Nextdata OS toolset. Commentary by Shane Murray, Field CTO of Monte Carlo

The launch and potential of Nextdata are exciting because instead of offering 'data mesh in a box,' they are offering 'data product in a box.' In other words, rather than trying to create a platform that solves every single tick box on the data mesh implementation journey, they focused on creating a solution that addresses the real problems organizations experience when implementing data mesh at scale. Nextdata isn't trying to tell you what your domains should be or how to organize your team; they've smartly set those tactical, process-related questions aside. What they are doing is creating a "data product container" that will make data products more discoverable, governable, understandable, and scalable. Right now, when you look behind the curtain, nine times out of ten an organization's data product is actually just a production table. That isn't necessarily a bad thing: these tables made good units of value that could be interoperable and standardized across the multiple tools needed to scale governance, data quality, and self-service frameworks without too much additional complexity. The "data product container" concept has a lot of promise, but the reality of adoption will hinge on whether these containers make this task easier than scaling production tables. If so, how will they work alongside other solutions like data catalogs, data governance/access management, DataOps, and data observability platforms?

The unstructured data boom. Commentary by Frank Liu, Director of Operations at Zilliz

By utilizing unique, cloud-native architectures, companies such as Snowflake and Databricks have revolutionized the way structured data is stored and processed. However, structured data forms only approximately 20% of all data generated today. The other 80% is unstructured, and has traditionally been difficult, if not impossible, to store, index, and search. The modern computing era has coincided with the rise of AI, giving us new ways to represent unstructured data. Modern AI algorithms can transform all manner of unstructured data into high-dimensional vectors: two semantically similar images of cars, for example, can be transformed into corresponding vectors that are very close to each other in terms of distance. By leveraging the power of modern AI and a database purpose-built to store these vectors, companies and organizations now have access to a wealth of different ways to process unstructured data.
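The "close in terms of distance" idea above can be sketched with cosine similarity, a common metric for comparing embedding vectors. This is a minimal illustration: the four-dimensional vectors below are made-up stand-ins for real embeddings, which typically have hundreds or thousands of dimensions produced by a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" of three images.
car_a = [0.90, 0.10, 0.80, 0.20]   # one car image
car_b = [0.85, 0.15, 0.75, 0.25]   # a visually similar car
cat   = [0.10, 0.90, 0.20, 0.80]   # an unrelated image

# The two car vectors point in nearly the same direction,
# so their similarity is higher than car-vs-cat.
print(cosine_similarity(car_a, car_b))  # close to 1.0
print(cosine_similarity(car_a, cat))    # noticeably lower
```

A vector database does this comparison at scale, indexing millions of such vectors so that nearest-neighbor queries stay fast.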

How organizations can use data intelligence to uphold ethical data usage and business practices. Commentary by Jay Militscher, Head of Data Office at data intelligence company Collibra

We continue to hear stories about organizations mishandling data, which makes customers feel that their privacy has been violated. A company cannot grow without its customers' trust, and in the worst-case scenario, a damaged reputation can cause business failure. How can organizations ensure they are getting data privacy right? Proper use of data is not just about privacy compliance; it's really about data ethics. Organizations should aim to weave data ethics into their company culture across all business units. Data does not have its own voice, so the people using data determine that voice, including how data informs decision-making. The end results of these choices, negative or positive, are based on the moral decisions of those using the data. Leadership should implement data ethics training for all employees and enforce high standards to ensure data ethics becomes a part of everyday processes. Data ethics initiatives should be transparent about data practices, prescriptive in their policies and guidance, and values-driven in their strategy. Upholding customer trust is of utmost importance for maintaining and growing a business, and the decisions made with data have lasting implications for businesses and people. Organizations need to approach this wisely in the years to come.

Is data modeling dead? Commentary by Stewart Bryson, Head of Customer Experience at Coalesce 

With legacy on-prem data platforms, we were limited by compute and storage, so modeling techniques arose to mitigate these limitations. We found the most efficient way to store data once, because that’s all we could afford. Now, in the data cloud, these limitations don’t exist. It’s not so much that data modeling is dead, it’s simply evolved. We need to ask ourselves if the techniques we used on-prem are the same ones we should use in the data cloud.

AI Modeling is Dead. Commentary by Gantry CEO Josh Tobin

AI modeling has been king for the last few years. It's been the end goal for ML researchers globally. But its reign is quickly coming to an end. The truth is the barrier to building sufficient models is significantly lower than it used to be. Everything from platforms to mega-APIs to ample educational resources has made building models accessible to ML professionals at all levels, obviating the need for expensive, specialized talent. That's why modeling as we know it is dead. Benchmarks and winning Kaggle competitions are no longer the metrics of success; it's about building applications with models that solve real problems for real people. But to make this shift, ML practitioners need to broaden their roles beyond just building and handing off static models. Building a model isn't a one-and-done deal. Any successful model requires constant monitoring and nurturing to become production-ready; it requires real-world data from the very beginning. As an industry, we need to take a page from the world's most talented ML teams: OpenAI, Tesla, and TikTok, just to name a few. These teams understand that keeping a model in a lab for months before it's ready to see the light of day is a mistake. The way to accelerate time to production is to get a model out in the world and use the data from users to continuously make the model better.

Why ChatGPT can’t (and shouldn’t) answer all your questions. Commentary by Max Shaw, SVP of Product Management at Yext

The early success of ChatGPT is inspiring more industries to explore how they can leverage generative and conversational AI in their work, and many individuals see this as an opportunity to drastically improve their business’s efficiency. However, there first needs to be a deeper understanding of how these technologies generate responses. ChatGPT and other large language models (LLMs) need to be supplied with authoritative information to answer a user’s questions effectively and accurately. If left to their own devices, they’ll pull information from a wide variety of sources across the internet, and the validity of generated responses is immediately thrown into question. Bad information means bad answers, and for certain industries, like financial services and healthcare, bad answers can be a serious issue. In other scenarios, inaccurate information can erode consumer confidence in your product or service. To address this, organizations can build and maintain a strong repository of structured content in the form of a knowledge graph. Authoritative information can be fed directly from a knowledge graph into an LLM to ensure that generated responses are accurate. Put differently, LLMs can be used for natural language understanding, but they shouldn’t be used for domain or company-specific knowledge. This is why effective approaches to conversational AI must leverage both LLMs and knowledge graphs. Even then, conversational AI may not be the magic bullet businesses want it to be. Certain interactions will always require a human touch, and AI still requires a “human-in-the-loop” approach for most enterprise use cases.
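The pattern described above, feeding authoritative facts from a knowledge graph into an LLM rather than letting the model free-associate, can be sketched in a few lines. Everything here is illustrative: the triple-style store, the entity and attribute names, and the prompt format are assumptions for the sketch, not any vendor's actual API.

```python
# A tiny stand-in for a knowledge graph: (entity, attribute) -> value.
# In practice this would be a managed knowledge graph, not a dict.
knowledge_graph = {
    ("Acme Savings Account", "apy"): "4.25%",
    ("Acme Savings Account", "minimum_deposit"): "$100",
    ("Acme Savings Account", "monthly_fee"): "none",
}

def build_grounded_prompt(entity: str, question: str) -> str:
    """Assemble a prompt that constrains the LLM to authoritative facts."""
    facts = [
        f"- {attribute}: {value}"
        for (subject, attribute), value in knowledge_graph.items()
        if subject == entity
    ]
    context = "\n".join(facts)
    return (
        "Answer using ONLY the facts below. "
        "If the facts do not cover the question, say you don't know.\n\n"
        f"Facts about {entity}:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt("Acme Savings Account", "What is the current APY?")
print(prompt)
```

The resulting prompt would then be sent to an LLM; the division of labor matches the commentary's point, with the knowledge graph supplying domain facts and the LLM supplying natural language understanding.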

Businesses Shouldn’t Rush to Adopt Generative AI. Commentary by Scott Varho, Chief Evangelist of 3Pillar

Generative AI, a form of artificial intelligence that creates net-new content, has been getting a lot of buzz for its novelty, unique applications, and potential impact on the business world. New technologies offer fertile ground for breakthrough products, services, and experiences. However, companies shouldn’t be clamoring to implement generative AI without first determining the customer value and business impact it can deliver beyond existing alternatives. Innovation isn’t limited to leveraging new inventions; it’s also using existing technologies in a new way that creates new value. Value is the key. Before investing deeply in generative AI, businesses need to evaluate their target market for a compelling use case that will create value for customers and benefit the business. Then leaders can test that use case in a lean way to enrich their understanding of the value potential (as well as potential pitfalls). If generative AI demonstrates value through a modest investment, then more investment makes sense. New technologies ignite the imagination; rushing to over-invest in them can do more harm than good.

Sign up for the free insideBIGDATA newsletter.

Join us on Twitter: https://twitter.com/InsideBigData1

Join us on LinkedIn: https://www.linkedin.com/company/insidebigdata/

Join us on Facebook: https://www.facebook.com/insideBIGDATANOW
