Video Highlights: A Path Into Data Science

Are you interested in getting ahead in data science? On this TalkPython podcast episode, you’ll meet Sanyam Bhutani who studied computer science but found his education didn’t prepare him for getting a data science-focused job. That’s where he started his own path of self-education and advancement. Now he’s working at an AI startup and ranking high on Kaggle.

Video Highlights: Thinking Sparse and Dense

The video below, “Thinking Sparse and Dense,” is a presentation by Paco Nathan from the live@Manning Developer Productivity Conference, held June 15, 2021. In a post-Moore’s Law world, how do data science and data engineering need to change? This talk presents design patterns for idiomatic programming in Python so that hardware can optimize machine learning workflows.

Circular Statistics in Python: An Intuitive Intro

In this contributed article, Amit Babayoff, a data scientist at Deeyook, discusses circular statistics by looking at some of its basic principles and tools and explaining why conventional linear methods don’t work well on circular data. She also explores how a simple filter for handling noise can be constructed from these basic tools.
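
To illustrate why linear methods struggle with circular data, here is a minimal sketch that is not taken from the article itself. It compares the ordinary arithmetic mean of two compass angles with a circular mean computed from summed unit vectors. The function name `circular_mean_deg` and the sample angles are assumptions chosen only for illustration.

```python
import math


def circular_mean_deg(angles_deg):
    """Return the mean direction of angles given in degrees, in [0, 360).

    Each angle is treated as a unit vector; the mean direction is the
    angle of the summed vector. (Hypothetical helper, for illustration.)
    """
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Two directions that sit close together on either side of 0 degrees.
angles = [350.0, 20.0]

# The arithmetic mean ignores the wrap-around and points the opposite way.
linear_mean = sum(angles) / len(angles)

# The circular mean lands between the two directions, near 5 degrees.
circ_mean = circular_mean_deg(angles)

print(f"linear mean:   {linear_mean:.1f} degrees")
print(f"circular mean: {circ_mean:.1f} degrees")
```

The arithmetic mean of 350° and 20° is 185°, which points almost directly away from both inputs, while the circular mean falls between them. Note that the circular mean is undefined when the unit vectors cancel out exactly, a case this sketch does not handle.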

Video Highlights: BigQuery + Notebooks: Building an Analytics Pipeline on Kaggle

Your architecture choices impact how efficiently you’re able to use your data. In this “Snapshots” video produced by Kaggle, Data Scientist Wendy Kan demonstrates how she incorporates BigQuery and Kaggle Notebooks into her workflow. Watch her create an interactive network analysis graph that explores the most commonly installed Python packages!

The Impact of Python: How It Could Rule the AI World?

In this contributed article, writer, AI researcher, and business strategist Michael Lyman discusses the growing use of the Python language and how it is playing a significant role in the rise of AI and deep learning. Python’s power and ease of use have catapulted it to become one of the core languages for building machine learning solutions.

Interview: Terry Deem and David Liu at Intel

I recently caught up with Terry Deem, Product Marketing Manager for Data Science, Machine Learning and Intel® Distribution for Python, and David Liu, Software Technical Consultant Engineer for the Intel® Distribution for Python, both from Intel, to discuss the Intel® Distribution for Python (IDP): targeted classes of developers, use with commonly used Python packages for data science, benchmark comparisons, the solution’s use in scientific computing, and a look to the future with respect to IDP.

Supercharge Data Science Applications with the Intel® Distribution for Python

Intel® Distribution for Python is a distribution of commonly used packages for computation- and data-intensive domains, such as scientific and engineering computing, big data, and data science. With this performance-oriented distribution, you can supercharge Python applications and speed up core computational packages. Professionals who can benefit from this product include machine learning developers, data scientists, numerical and scientific computing developers, and HPC developers.

CryptoNumerics Announces CN-Protect for Data Science Python Library

CryptoNumerics, a Toronto-based enterprise software company, announced the launch of CN-Protect for Data Science, which enables data scientists to implement state-of-the-art privacy protection, such as differential privacy, directly into their data science stack while maintaining analytical value.

Python: Unlocking the Power of Data Science & Machine Learning

Python stands out as the language best suited for all areas of the data science and machine learning framework. Designed as a flexible general purpose language, Python is widely used by programmers and easily learnt by statisticians. Download the new guide from ActiveState that provides a summary of Python’s attributes, as well as considerations for implementing the programming language to drive new insights and innovation from big data.

Book Review: Python Data Science Handbook

I recently needed a Python language resource to supplement a series of courses on deep learning that I was evaluating, all of which depended on this widely used language. As a long-time data science practitioner, my language of choice has been R, so I relished the opportunity to dig into Python to see firsthand how the other side of the data science world did machine learning. The book I settled on was “Python Data Science Handbook: Essential Tools for Working with Data” by Jake VanderPlas.