Heard on the Street – 1/24/2022

Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favorite technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!

COVID-19: A Data Tsunami That Ushered in Unprecedented Opportunities for Businesses and Data Scientists. Commentary by Thomas Hazel, founder & CTO at ChaosSearch

From creating volatile data resources to skewing forecasting models, the pandemic has caused countless challenges for organizations that rely on data to inform business decisions. However, there is also an upside to the data tsunami that COVID-19 created. The movement to all-things-digital produced a flood of log data streaming from newly digital systems. All this data presented an incredible opportunity for companies to deeply understand their customers and then tailor customer and product experiences accordingly, provided they had the right tools and processes in place to avoid being overwhelmed by the volume. The impact spans all industries, from retail to insurance to education. Blackboard is a perfect example. The world-leading EdTech provider was initially challenged at the start of the pandemic by the surge of daily log volumes from students and school systems that moved online seemingly overnight. The company quickly realized it needed a way to efficiently analyze log data for real-time alerts and troubleshooting, as well as a method to access long-term data for compliance purposes. To accomplish this, Blackboard leverages its data lake to monitor cloud deployments, troubleshoot application issues, maximize uptime, and deliver on data integrity and governance for highly sensitive education data. This use case demonstrates just how important data has become to organizations that rely on digital infrastructure, and how a strong data platform is a must to reduce the time, cost, and complexity of extracting insights from data. While the pandemic created the initial data tsunami, tech-driven organizations like Blackboard that have evolved to capitalize on its benefits have accepted that this wave of data is now a constant force they will have to manage effectively for the foreseeable future.

Cloud Tagging Best Practices. Commentary by Keith Neilson, Technical Evangelist at CloudSphere

While digital transformation has been on many organizations’ priority lists for years, the COVID-19 pandemic applied more pressure and urgency to move it forward. Through their modernization efforts, companies have unfortunately wasted time and resources on unsuccessful data deployments, ultimately jeopardizing company security. For optimal cyber asset management, consider the following cloud tagging best practices. First, take an “algorithmic” approach to tagging. While tags can represent simple attributes of an asset (like region, department, or owner), they can also assign policies to the asset. This way, assets can be effectively governed, even on a dynamic and elastic platform. Next, optimize tagging for automation and scalability. Proper tagging allows for rigorous infrastructure provisioning for IT financial management, greater scalability, and automated reporting for better security. Finally, be sure to implement consistent cloud tagging processes and parameters within your organization. Designate a representative to enforce certain tagging formulas, retroactively tag assets or functions that IT personnel didn’t think to tag, and reevaluate business outputs to ensure tags are effective. While many underestimate just how powerful cloud tagging can be, the companies embracing this practice will ultimately experience better data organization, security, governance and system performance.
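The “algorithmic” approach described above, where tags carry governance policies rather than just attributes, can be sketched as follows. This is a hypothetical illustration, not any vendor’s API: the `data-class` tag, the policy table, and the required-tag list are all assumptions made up for the example.

```python
# Hypothetical sketch: tags assign policies to assets, not just attributes.
from dataclasses import dataclass, field

# Governance policies keyed by a (hypothetical) "data-class" tag value.
POLICIES = {
    "pii": {"encrypt_at_rest": True, "retention_days": 365},
    "public": {"encrypt_at_rest": False, "retention_days": 30},
}

@dataclass
class CloudAsset:
    name: str
    tags: dict = field(default_factory=dict)

def effective_policy(asset: CloudAsset) -> dict:
    """Resolve governance settings from the asset's 'data-class' tag,
    defaulting to the most permissive class when untagged."""
    return POLICIES[asset.tags.get("data-class", "public")]

def find_untagged(assets):
    """Retroactive tagging: flag assets missing any required tag."""
    required = {"owner", "department", "data-class"}
    return [a.name for a in assets if not required <= a.tags.keys()]

assets = [
    CloudAsset("db-prod", {"owner": "it", "department": "sales", "data-class": "pii"}),
    CloudAsset("cdn-logs", {"owner": "ops"}),
]
print(effective_policy(assets[0]))  # {'encrypt_at_rest': True, 'retention_days': 365}
print(find_untagged(assets))        # ['cdn-logs']
```

Because the policy is derived from the tag at query time, the same rule keeps applying as assets appear and disappear on an elastic platform, which is the point of governing by tag rather than by inventory list.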

Using AI to improve the supply chain. Commentary by Melisa Tokmak, GM of Document AI, Scale AI

As supply chain delays continue to threaten businesses at the beginning of 2022, AI can be a crucial tool for logistics companies to speed up their supply chains as the pandemic persists. Logistics and freight forwarding companies are required to process dozens of documents – such as bills of lading, commercial invoices and arrival notices – fast, and with the utmost accuracy, in order to report data to Customs, understand changing delivery timelines, and collect and analyze data about moving goods to paint a picture of global trade. For already overtaxed and paperwork-heavy systems, manual processing and human error are some of the most common points of failure, which exacerbate shipping delays and result in late cargo, delayed cash flow and hefty fines. As logistics companies have a wealth of information buried in the documents they process, updating databases with this information is necessary to make supply chains more predictable globally. Most companies spend valuable time analyzing inconsistent data or navigating OCR and template-based solutions, which aren’t effective due to the high variability of data in these documents. Machine learning-based, end-to-end document processing solutions, such as Scale AI’s Document AI, don’t rely on templates and can automate this process; AI solutions allow logistics companies to leverage the latest industry research without changing their developer environment. This way, companies can focus on using their data to cater to customers and serve the entire logistics industry, rather than spending valuable time and resources on data-mining. ML-based solutions can extract the most valuable information accurately in seconds, accelerating internal operations and reducing the number of times containers are opened for checks – decreasing costs and shipping delays significantly. Using Scale’s Document AI, freight forwarding leader Flexport achieved significant cost savings in operations and decreased the processing time of each document.
Flexport’s documents were formerly processed in over two days, but with Document AI, were processed in less than 60 seconds with 95%+ accuracy, all without having to build and maintain a team of machine learning engineers and data scientists. As COVID has led to a breakdown of internal processes, AI-powered document processing solutions are helping build systems back up: optimizing operations to handle any logistics needs that come their way at such a crucial time.

IBM to Sell Watson Health. Commentary by Paddy Padmanabhan, Founder and CEO of Damo Consulting

IBM’s decision to sell the Watson Health assets is not an indictment of the promise of AI in healthcare. Our research indicates AI was one of the top technology investments for health systems in 2021. Sure, there are challenges such as data quality and bias in the application of AI in the healthcare context, but by and large there has been progress with AI in healthcare. The emergence of other players – notably Google with its Mayo Clinic partnership, or Microsoft with its partnership with healthcare industry consortium Truveta – are strong indicators of progress.

Data Privacy Day 2022. Commentary by Lewis Carr, Senior Director, Product Marketing at Actian

In 2022, expect to see all personal information and data sharing options get more granular in how we control them – both on our devices and in the cloud – specific to each company, school or government agency. We’ll also start to get some visibility into and control over how our data is shared between organizations without us involved. Companies and public sector organizations will begin to pivot away from binary options (opt-in or opt-out) tied to a lengthy legal letter that no one will read, and will instead use data management and cybersecurity platforms to grant granular permission to parts of your personal data, such as where it’s stored, for how long, and under what circumstances it can be used. You can also expect new service companies to sprout up that will offer intermediary support to monitor and manage your data privacy across these organizations.

Data Privacy Day 2022. Commentary by Rob Price, Principal Expert Solution Consultant at Snow Software

The adoption of cloud technology has been a critical component of how we approach privacy and data protection today. A common misconception is that if your data is offsite or cloud-based, it’s not your problem – but that is not true, because the cloud is not a data management system. Two fundamental factors for data protection and security are the recovery point objective (how old recovered data is allowed to be) and the recovery time objective (how quickly the data must be recoverable). Every company’s needs are different, but these two factors are important when planning for data loss.
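The two objectives mentioned above can be checked mechanically against a backup plan. The sketch below uses made-up numbers purely for illustration: the worst-case data age is taken to be the backup interval, and the worst-case downtime the restore duration, which is a simplifying assumption rather than a full recovery model.

```python
# Illustrative sketch (hypothetical figures): does a backup plan meet
# the recovery point objective (RPO) and recovery time objective (RTO)?
from datetime import timedelta

def meets_objectives(backup_interval, restore_duration, rpo, rto):
    """Worst-case data age = time since the last backup (the interval);
    worst-case downtime = how long a restore takes."""
    return backup_interval <= rpo and restore_duration <= rto

# A nightly backup that takes 2 hours to restore:
ok = meets_objectives(
    backup_interval=timedelta(hours=24),
    restore_duration=timedelta(hours=2),
    rpo=timedelta(hours=4),   # tolerate losing at most 4h of data
    rto=timedelta(hours=6),   # tolerate at most 6h of downtime
)
print(ok)  # False: up to 24h of data loss exceeds the 4h RPO
```

The example shows why the two objectives are planned together: here the restore is fast enough (RTO met) but backups are too infrequent (RPO missed), so only the backup schedule needs to change.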

Data Privacy Day 2022. Commentary by Ricardo Amper, CEO and founder of Incode

There are a lot of misconceptions about how facial recognition technology is currently used. However, despite the reported privacy mishaps and concerns, there is a true inclination among consumers to embrace this technology. Trust is essential, and it is often missing when consumers aren’t at the forefront of the conversation around privacy. The individual must be put first, which means getting their consent. The more an individual feels they can trust the technology, the more open they will be to using it in additional capacities.

Data Privacy Day 2022. Commentary by Karen Worstell, Senior Cybersecurity Strategist at VMware

The lines between work and our personal life have increasingly been blurred over the past few years as our homes now double as our offices. This is not likely to change soon, as companies continue to delay their return to office plans. As we settle into a new era of anywhere work, enterprises must understand that data privacy practices rest on a foundation of strong cybersecurity controls. Data Privacy Week is a time for organizations to set goals for implementing best practices that improve data protection and cybersecurity. These include robust vulnerability management, implementing multifactor authentication, threat hunting, and network micro-segmentation, among others.

Data Privacy Day 2022. Commentary by Rajesh Ganesan, Vice President of Product Management at ManageEngine

Data protection is only successful when all components within the infrastructure—including all employees—are prepared to handle it. To do this efficiently, data protection must be built in right from the design stages of all services and operations. Moreover, data protection should be present as a strong, invisible layer; it shouldn’t hamper operations, nor should it require big changes or specialized training. It’s best to educate employees on the do’s and don’ts of data protection in a way that is contextually integrated into their work, as opposed to relying solely on periodic trainings. To do this, leaders should implement alerts in the system that pop up and inform users of any data protection policy violations their actions are causing. Such alerts help employees learn contextually, and ultimately, this training results in fewer data management errors.

Data Privacy Day 2022. Commentary by Cindi Howson, Chief Data Strategy Officer, ThoughtSpot

Data privacy, governance, and business success are very much intertwined. Those working with data must feel a sense of responsibility, as if they were keeping their best friend’s most vulnerable secret. In a digital world, data links back to real people – where they went in that Uber, what store they visited before shopping at a lingerie store, and what movie they streamed on their phone. Data enables personalized digital interactions and more efficient movement of goods. But failure to respect customers’ data privacy risks loss of trust, revenue, and brand value. With more digital data, businesses need to be more transparent about the data they collect and how it’s used.

Data Privacy Day 2022. Commentary by Amy Yeung, General Counsel & Chief Privacy Officer, Lotame

Although awareness of data privacy is at an all-time high – our heavier digital footprint being largely a result of COVID-19 – there’s a healthy debate about the best course of action and next steps. The local, state and federal dialogue continues to form, while outside the U.S. there are countless (sometimes conflicting) laws about how data should and shouldn’t be handled. Despite the many challenges businesses face in this climate, there’s ample opportunity. Integrating and aligning with the underlying policy goals of laws like GDPR, within and outside the EU, rather than doing only “what the law requires,” can preserve room to stay the business course through fluctuating regulatory requirements. Like Lotame, many in our industry see the negative impacts of staying in a boat so close to the waves rather than buoying ourselves to a higher plane, and 2022 brings with it an opportunity for data privacy by design to effectively execute on our business roadmaps.

Data Privacy Day 2022. Commentary by Pritesh Parekh, Chief Trust & Security Officer, VP of Engineering at Delphix  

Modern technologies – such as data masking – can help mitigate attacks and improve data privacy throughout an organization. Data masking can automatically identify where sensitive data resides — across every system, including non-production environments for development, testing, and analytics. It then applies algorithms that replace the original value with a fictitious but realistic equivalent in an irreversible way. This, ultimately, decreases the risk of a breach and prevents hackers from getting hold of valuable data. The more masked data your company has, the less there is for bad actors to steal.
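A minimal sketch of the masking idea described above: replace a real value with a fictitious but realistic-looking one, deterministically (so the same input always masks to the same output, preserving joins across systems) yet irreversibly. This is a toy illustration, not Delphix’s algorithm; the secret key, name list, and email format are all assumptions, and a real product would discover sensitive fields automatically rather than masking one hard-coded field.

```python
# Hedged sketch: deterministic, irreversible masking of an email address.
import hashlib
import hmac

SECRET = b"masking-key"  # hypothetical key, stored apart from masked data

FIRST_NAMES = ["alex", "sam", "jordan", "casey", "riley"]

def mask_email(email: str) -> str:
    """Map an email to a fictitious but realistic equivalent.
    Same input -> same output (referential integrity across systems),
    but the keyed hash cannot be reversed to recover the original."""
    digest = hmac.new(SECRET, email.lower().encode(), hashlib.sha256).digest()
    name = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    suffix = digest[1:3].hex()
    return f"{name}.{suffix}@example.com"

m1 = mask_email("jane.doe@acme.com")
m2 = mask_email("jane.doe@acme.com")
print(m1 == m2)                      # True: deterministic
print(m1.endswith("@example.com"))   # True: realistic shape, no real PII
```

Because the mask is derived from a keyed hash rather than stored in a lookup table, a stolen copy of the masked dataset reveals nothing about the original values, which is the property the commentary is pointing at.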

Sign up for the free insideBIGDATA newsletter.

Join us on Twitter: @InsideBigData1 – https://twitter.com/InsideBigData1
