Heard on the Street – 5/19/2022

Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topics areas: big data, data science, machine learning, AI and deep learning. Enjoy!

Meta and the New Boardroom Imperative on Data. Commentary by Rajiv Dholakia, SVP of Products, Privacera

The realization that Facebook doesn’t really know where all of its data goes is a scary one. However, you would be mistaken to think Meta, Facebook’s parent company, is alone. Gartner suggests that, through 2025, 80% of organizations seeking to scale their digital business will fail because they don’t take a modern approach to data governance. Companies, Meta included, will need to consider a few things as they try to reconcile their digital transformation initiatives with the responsibility of living up to the trust consumers place in them when they share personal information. Whether Meta takes data privacy seriously is for its board and leadership to decide, but there is clear evidence that many leading organizations are building their reputations and brands on the notion that consumers can trust them – Apple being a prime example. Data governance has been on a serious transformational journey as a result of the digital transformation and cloud migration we’ve experienced. Ten years ago, data was locked in the enterprise data warehouse under IT control. Today, it lives in multiple cloud-based data storage and processing environments. Command-and-control, IT-centric data governance has given way to a decentralized and often chaotic free-for-all in which data is collected, stored and processed by anyone with a credit card and access to a data science notebook. A new style of data governance is called for – one that offers centralized oversight with distributed or federated responsibilities and guardrails.

What is the most important trait for IT leaders in 2022? Commentary by Rupert Colbourne, CTO, Orbus Software

Rather than focusing on technical skills today, IT leaders need to be able to define a vision for the technology roadmap to transform the business and communicate the plan and the execution. This also requires interacting with multiple stakeholders and creating alignment on the plan. Therefore, strong communication and collaboration skills are vital for an IT leader in order to be able to sell their technology roadmap for improving the business. To articulate a clear plan, you need to focus on building up your planning skills and the ability to roadmap the strategy. Start small and work towards more ambitious, wide-scale plans. In addition, you need to work on your communication skills by engaging with different parts of the business. You can’t break down silos if you don’t break out of your IT silo! Doing this requires joining cross-functional projects to understand more about the different needs of stakeholders. This, in turn, will improve your strategy and planning skills as you will have a much better understanding of what is really working and where the pain points are that need to be fixed.

AlloyDB by Google. Commentary by Amit Sharma, CEO of CData

As businesses modernize their tech stack, they are increasingly shifting to cloud databases to make their data more easily accessible and agile. AlloyDB now offers a new level of speed, scalability, and reliability that will change the way customers use cloud databases to power their applications. By partnering with leading data connectivity providers, Google is providing an unmatched level of access between AlloyDB and the hundreds of other tools in their customers’ data ecosystem in near real-time. Now, Google customers can enjoy direct, manageable access to their data in the cloud without disrupting existing processes or burdening IT as data gatekeepers.

Bad Data: A $12.9 Million Problem. Commentary by Jonathan Grandperrin, Co-Founder and CEO, Mindee

As of 2021, Gartner estimates that companies lose $12.9 million annually due to poor-quality data. Funds are wasted on two time-consuming processes: digitizing data from paper documents and capturing the correct information from digital ones. Apart from the impact on revenue, bad data (or the lack of it) leads to poor decision-making and business assessments in the long run. Luckily, by adopting advanced technologies such as Application Programming Interfaces (APIs), we can mitigate the issue at hand. To get started, leaders must first identify their valued data and what form it is in – then, it’s time to find (or build) the right solution. I highly recommend finding a built-in low- or no-code solution that is already available. Because automating accurate data extraction is a complicated process that involves machine learning, computer vision, algorithms, and deep learning engines, a low/no-code system can streamline adoption. In addition, by adopting an existing solution, leaders can skip the tedious task of building their own deep learning solution and focus on maximizing the benefits of a trained model – saving money and time while remaining accurate and competitive.

Why we should treat emotion AI like data privacy. Commentary by Theresa Kushner, Data and Analytics Practice Lead at NTT DATA Services

Emotional AI is still relatively new, and all of its use cases – both good and bad – have yet to be explored. I sit on the board of AI Truth, an NGO dedicated to ensuring integrity in AI applications, so of course, any “new” application of AI is going to give me pause as we consider its potential negative implications. In this case, when personal information, such as the emotion your face is expressing, is made available to corporations, I think about the potential impact on consumers. I liken these concerns to the issues around data privacy – just because an individual gives permission for a company to access their personal information doesn’t mean that they’re okay with that company making inferences about their emotional state based on that information. Someone’s emotional fingerprint is just as much a part of their private self as their real fingerprint. That said, emotional AI can be used in very positive ways. One exciting use case of emotional AI is using it in conjunction with avatars that can respond like human beings, or “digital humans”. Think of the possibilities for improving patient care in hospitals that are short on nursing staff. Digital humans can be there to record what’s happening with the patient and communicate with the nurses who are on staff. Improving customer service and experience is what these digital humans do best. They never get tired or annoyed at others. Instead, they can pick up on unhappiness and, if programmed appropriately, respond in a way that alleviates it.

S3 Select: fast decisions for AI/ML apps. Commentary by Cloudian CTO Gary Ogasawara

As organizations look to leverage rapidly growing data sets to support AI, ML and advanced analytics use cases, they often struggle to efficiently filter and score the data. For some ML use cases, 90% of the work is simply cleansing and filtering data. The emerging S3 Select API provides a new way to cleanse and sort data, making it much easier for organizations to support AI, ML and analytics workloads in production. S3 Select uses SQL syntax to look inside an object and return a subset of that object’s data – “selectively” querying that data. For example, when querying a CSV file of bank transactions, instead of retrieving the whole file (which may be hundreds of gigabytes) and then processing it on the client side, an application can use the S3 Select API to filter to a specific date range and retrieve only withdrawals greater than a specific value. For AI, ML and analytics use cases, S3 Select offers the advantages of reducing network traffic, reducing the compute load of data processing, and reusing the same base object for multiple uses. S3 Select is useful in all environments, but it is especially advantageous for edge applications where fast decision-making is often required and fewer compute and storage resources are available.
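
The bank-transaction example can be sketched roughly as follows. This is a minimal illustration, not a vendor-endorsed implementation: it assumes an S3-compatible store that implements the SelectObjectContent operation, the boto3 client library, and a CSV object with a header row whose columns are named txn_date (ISO 8601 date strings), txn_type and amount. The bucket, key, column and function names are all hypothetical.

```python
from datetime import date


def build_select_request(bucket, key, start, end, min_amount):
    """Build the arguments for a SelectObjectContent call (no network access)."""
    # Parse and re-serialize the inputs so only well-formed values reach the SQL.
    start_s = date.fromisoformat(start).isoformat()
    end_s = date.fromisoformat(end).isoformat()
    amount = float(min_amount)
    # Column names (txn_date, txn_type, amount) are assumptions about the CSV.
    expression = (
        "SELECT s.txn_date, s.amount FROM S3Object s "
        "WHERE s.txn_type = 'withdrawal' "
        f"AND s.txn_date >= '{start_s}' AND s.txn_date <= '{end_s}' "
        f"AND CAST(s.amount AS FLOAT) > {amount}"
    )
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV row as column names so the query can use them.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def fetch_withdrawals(client, **request):
    """Run the query with a boto3-style S3 client and return the matching rows."""
    response = client.select_object_content(**request)
    chunks = []
    # The response payload is an event stream; only Records events carry data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")
```

With boto3 installed and credentials configured, a call such as fetch_withdrawals(boto3.client("s3"), **build_select_request("bank-data", "transactions.csv", "2022-01-01", "2022-03-31", 1000)) would return only the filtered rows as CSV text, so the full object never crosses the network.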

Improving Observability in DevOps Lifecycles. Commentary by Brian Rue, Co-founder and CEO of Rollbar

As the cycle of software development speeds up to meet customer demand for more features, the release cycle and DevOps cycles also need to speed up. Ultimately, DevOps is about shipping code to users. So how can we make the cycle more efficient? There are two different planes to the DevOps cycle: the plane of infrastructure and the plane of code. One way to make the process more efficient is to help move DevOps out of the way. Giving developers their own observability into how their code is functioning makes the whole cycle more efficient and faster. Giving developers observability tools that are tailored to the code simplifies governance because DevOps no longer has to be the go-between that gets developers the logs and other information they need to debug and fix code issues. Automating these tools also helps make the process more efficient and easier to manage. That makes governance more efficient and the whole process faster.

Border Gateway Protocol (BGP) hijacking. Commentary by Arrcus CTO Keyur Patel

“BGP hijackings continue to present a major threat to enterprises as attackers reroute internet traffic. During a time of economic volatility, these attacks can be particularly effective and devastating,” said Keyur Patel, Founder and CTO of Arrcus, who was formerly with Cisco and helped write the BGP protocol. “Given the digitalization of the global economy, building a more resilient internet is critical, and enterprises need to proactively address the vulnerabilities in their architecture. Service providers must invest in intelligently routed redundant infrastructure that ensures reliability. It’s particularly important to include BGP route origin validation and security to stop wide-scale propagation of invalid routes at large hub networks. This helps both to protect end users by minimizing the impact of route leaks and hijacks and to reduce exposure to potential attack.”
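
To make route origin validation concrete, here is a minimal sketch of the classification logic described in RFC 6811: an announced route is compared against a list of validated authorizations, each giving a prefix, a maximum prefix length and an authorized origin AS. The Roa type, the validate_origin function and the sample prefixes and AS numbers (taken from documentation ranges) are illustrative assumptions, not anything from the commentary, and details such as the special handling of AS 0 are omitted.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str      # authorized prefix, e.g. "192.0.2.0/24"
    max_length: int  # longest prefix length the origin may announce
    asn: int         # authorized origin AS number


def validate_origin(route_prefix, origin_asn, roas):
    """Classify a route as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # An authorization covers the route only if the route lies inside it.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A match needs the right origin and a prefix no longer than allowed.
        if route.prefixlen <= roa.max_length and origin_asn == roa.asn:
            return "valid"
    # Covered but never matched means a wrong origin or a too-specific prefix.
    return "invalid" if covered else "not-found"


roas = [Roa("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, roas))     # authorized origin
print(validate_origin("192.0.2.0/24", 64511, roas))     # wrong origin AS
print(validate_origin("192.0.2.0/25", 64500, roas))     # more specific than allowed
print(validate_origin("198.51.100.0/24", 64500, roas))  # no covering authorization
```

A router applying this policy would typically reject or de-prefer routes classified as invalid, which is what limits how far a hijacked or leaked prefix can propagate.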

Leveraging Data For Better Decision-Making. Commentary by John Peebles, CEO, Administrate

When it comes to driving improvements with executive data-driven decision-making, it’s essential first to ensure leaders can make sense of all the data generated in the first place. Why are we collecting this information, and how do we use it to answer questions and drive better decision-making? We often get asked how we shorten and improve the on-boarding experience. Complicated on-boarding is a real problem for our clients, and one we could not answer without getting into data-driven decision-making and learning archetypes. Organizations can engage in data-driven decision-making by building out a learning archetype that streamlines training data into one central ecosystem. We find that, for many training teams, adding more tools, systems, and manual processes to their data pipeline only creates poor data hygiene. The end result? An enormous amount of data that isn’t very useful. We think the key elements of a learning analytics architecture help jump-start healthy internal transformation into data-driven decision-making for training teams. Securing these archetypes into the strategy allows an organization to draw credible conclusions that future-proof its business for decades to come. The key elements of a Learning Analytics Architecture include: Catalyzing change, Descriptive analytics, Diagnostic analytics, Predictive analytics, and Futureproof.

Consequences of low IT talent retention. Commentary by Josh Perkins, Field CTO at AHEAD

The professional landscape has undergone a tremendous amount of change over the past two years, impacting not only where employees work but also how much they work. When employees exit the workforce in droves, especially those who have been with an organization for years, they also take with them their accrued knowledge and experiences. This exposes knowledge gaps among remaining workers and processes. Gaps and blind spots in day-to-day functions pose great risk to an organization’s core capabilities, employee engagement levels and its bottom line financially. To fill in these gaps, organizations are now turning to artificial intelligence. As AI/ML solutions enable smarter, more efficient processes, organizations can equip their employees with the necessary tools to foster knowledge continuity across teams. For example, as modern organizations gather larger and larger quantities of data, AI is able to help cut through the clutter during transitions by extracting the company’s most important insights and making them readily available to new employees. The departing employee is no longer taking their legacy knowledge with them when they depart. Instead, AI processes are able to provide on-boarding consistency, as well as company auditability across the board. Utilizing AI/ML solutions will equip organizations with the right tools to avoid stumbling when legacy employees make the decision to leave.

On the role of machine learning in automating data analytics and cross channel optimisation in online advertising. Commentary by Torkel Öhman, CTO & co-founder of Amanda AI

When it comes to online advertising, rising costs, fierce competition and time-consuming manual methods are, quite frankly, straining the resources of e-commerce sites. However, the advancements seen in the AI sector over the past couple of decades mean that, for many online retailers, ML is now an invaluable tool for establishing and optimizing their marketing campaigns. AI is now able to provide businesses with the data-driven edge they need to thrive online by removing the needless hands-on work involved with traditional digital marketing campaigns. In fact, AI can now be trained to automate the entire digital marketing process from start to finish. To begin with, web crawlers can be used to process e-commerce sites, evaluating and categorizing titles, tags and meta descriptions for potential advertisements. Data-driven modules can then analyze thousands of different target groups and demographics to ensure said advertisements are only reaching the audiences with the highest probability of conversion. Such processes require the analysis of vast amounts of data, a task simply unimaginable for human beings. However, as impressive as this is, the ability to publish millions of unique advertisements is not the only application of ML in the field of digital marketing. Algorithms can also be tailored to undertake the continuous maintenance and optimization of these marketing campaigns. For instance, AI engines can be programmed to evaluate historical data and adjust the budget spent across different search and social channels, whether that be Facebook, Instagram or Google. This continuous and automatic refinement, without need for human support, will ultimately lead to a return on investment far beyond that of standard approaches. Technologically, we have now reached a point where the field of online marketing is ripe for automation. Long gone is the need for the manual effort required to maintain traditional search and social campaigns. The future of advertising will be driven by AI.

Kids and Coding. Commentary by self-taught software engineer and author, Cory Althoff

The workplace is changing fast. Some of the most critical skills you need to succeed in today’s workforce are problem-solving, creativity, and persistence. Coding teaches all three. That’s one of the many reasons I will teach my daughters how to code as soon as they are old enough. I want my kids to experience the joy of not just playing games but building them. Most kids won’t study math or history for fun. If taught correctly, they will code games for fun, though. And even more than the skills I mentioned earlier, a love of learning is the most important skill you should teach your children if you want them to have successful careers. 

Mitigating loss of knowledge during the Great Resignation by changing the search paradigm. Commentary by Jeff Evernham, Vice President Product Strategy at Sinequa

The Great Resignation is not only about losing good talent, but also about losing organizational intelligence – the invaluable tacit knowledge that departing employees take with them. This risk, along with the existing challenges associated with the unfathomable amounts of data and content generated in today’s workplace, has caused a resurgence of interest in Knowledge Management. KM is back on C-level agendas as one of the most important initiatives for business resiliency in the years ahead. As IT leaders assess how best to implement technologies that discover, cultivate, and protect the collective knowledge of an organization and seamlessly disseminate it to employees where and when they need it, having the right automation and tools and aligning them to maximize knowledge use is critical. One best practice is injecting relevant knowledge into the flow of work, and technologies such as neural search are changing the paradigm that underpins KM. Natural Language Understanding (NLU), leveraging AI and deep learning neural networks, uses language models to understand language and context, improving relevance. This brings the right information to employees across a broader range of use cases, even when the employee doesn’t know where or how to look – or even that they should look in the first place! This is transformative – shifting the experience from “employees searching for information” to “information finding employees.” This push model makes work more intuitive and is bringing a step change to contextual relevance, speed, and performance. This new search paradigm contributes not just to workplace effectiveness but also to satisfaction for employees as they are empowered with the information they need to do their jobs well. This reduces the impact of the Great Resignation in two ways: it ensures that relevant information is always findable, minimizing the knowledge lost when employees do leave, and it makes employees’ jobs better, which leads to increased satisfaction and reduced turnover in the first place.

How AI will change the future of data centers. Commentary by Tony Pialis, co-founder and CEO of Alphawave IP

Artificial intelligence is well known for transforming businesses and workforces through faster and smarter software and applications. On the hardware side, there’s an emerging and highly competitive market for developing AI chips that’s poised to revolutionize computing performance. We’re seeing a surging amount of computational power running on AI chips, which means that these chips are growing exponentially in complexity. However, this isn’t necessarily practical or economical in the long term. Companies like Intel have recently announced major inroads in building advanced semiconductors equipped with high-end connectivity to tackle this challenge, and there will undoubtedly be more focus in this area as AI chips reshape the traditional architecture in next-generation data centers. In this reconfiguration, CPU, GPU, storage and memory components are separated into dedicated rooms instead of residing together in server racks, and all feed into racks of AI chips. This will enable low-latency connections with near-zero ping times.

New CA Privacy Agency Impacts Enterprises. Commentary by Steve Padgett, CIO for Actian

California’s new Privacy Protection Agency was created to regulate tech giants collecting consumer data – the first government body of its kind in the U.S. The problem is that the industry has yet to establish set standards for self-reporting adherence to data privacy standards. Similar to restaurant grading, there would need to be a grading system in place that assigns organizations grades based on their adherence to these standards, to be published on their websites for public view. The only way this is possible is if the industry pushes for standards around how policy is mapped to data pipelines, with the data lineage recorded on a blockchain. Organizations need to be willing to adapt and be flexible when adopting new guidelines based on privacy regulation standards. There are three rules all businesses handling consumer data must follow to ensure they are meeting these standards: always obtain and log opt-ins, immediately encrypt all personal data at rest, and unmask data only when de-identifying/anonymizing it. Data lineage is critical across all of these steps: make sure to continuously log any actions, ideally utilizing blockchain to ensure there is no tampering with data.
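
The tamper-evidence idea behind logging opt-ins and data actions can be illustrated with a simple hash chain, in which each log entry's hash covers the previous entry's hash. This is only a sketch of the underlying principle, not a blockchain (there is no distribution or consensus here), and the field names, action labels and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(body):
    # Serialize with sorted keys so the same entry always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, action, subject_id):
    """Append an action whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"action": action, "subject_id": subject_id, "prev_hash": prev_hash}
    log.append({**body, "hash": _digest(body)})
    return log


def verify(log):
    """Return True only if no entry was altered, removed or reordered."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("action", "subject_id", "prev_hash")}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "opt_in_recorded", "user-123")
append_entry(log, "encrypted_at_rest", "user-123")
append_entry(log, "unmasked_for_anonymization", "user-123")
print(verify(log))                    # chain is intact
log[0]["action"] = "opt_in_skipped"   # simulate tampering with history
print(verify(log))                    # altered entry no longer matches its hash
```

Changing any earlier entry invalidates its own hash and breaks the link to every later entry, which is the property that makes a lineage log of this kind auditable.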
