Google’s New Big Data View of Ethereum: What to Know


Google wants to make all blockchain data associated with Ethereum easily accessible for people to study. It’s doing that by making all Ethereum data sets available through BigQuery, Google’s enterprise-level and highly scalable data warehouse geared towards data analysts.

People Can Query Data Tables for Free

One of the most advantageous aspects of this development for data scientists is that they can dive into the data for free. When using BigQuery’s Python library, it’s possible to use Kernels, a complimentary, browser-based coding platform available on Kaggle, a well-known data science website. The Kernels tool lets people work with Ethereum data tables through SQL queries.
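As a rough sketch of what querying from Python might look like, the snippet below builds a small SQL statement and hands it to the google-cloud-bigquery client. The table path and column names are assumptions and may differ from what the BigQuery console currently lists. The helper names `build_preview_query` and `run_query` are made up for illustration.

```python
# Assumption: the public dataset is addressed as below; check the BigQuery
# console for the current dataset and table names before running.
ASSUMED_TABLE = "bigquery-public-data.ethereum_blockchain.transactions"


def build_preview_query(table: str = ASSUMED_TABLE, limit: int = 5) -> str:
    """Return SQL that previews a few transactions (column names are assumed)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT from_address, to_address, value, block_timestamp\n"
        f"FROM `{table}`\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str) -> list:
    """Run SQL on BigQuery; needs google-cloud-bigquery and valid credentials."""
    # Imported lazily so the sketch still loads where the package is missing.
    from google.cloud import bigquery

    client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Only prints the SQL; call run_query(...) yourself once credentials exist.
    print(build_preview_query())
```

Naming columns explicitly instead of using `SELECT *` keeps the amount of data scanned down, which matters because BigQuery bills by bytes processed once the free allowance is used up.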

Ethereum and Smart Contracts

Ethereum is a blockchain platform for running smart contracts, programs that carry out transactions using tokens. In simple terms, smart contracts enforce relationships with cryptographic code and function exactly as their creators dictate. For example, an individual could make a smart contract that sends a particular amount of Ether, the cryptocurrency associated with Ethereum, on a given date and repeats the payment on a recurring schedule. Other smart contracts only function if a minimum number of people agree to enter into them.

To demonstrate what an Ethereum query could tell people through BigQuery, Google sought to determine the most popular smart contract associated with Ethereum. The company found it was one connected to a game called CryptoKitties, and that transactions involving it had occurred more than 2.3 million times.
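A "most popular contract" question of this kind comes down to grouping transactions by recipient and counting. The sketch below shows the shape of such a query against a tiny in-memory SQLite table rather than BigQuery. The table layout, the addresses, and the `top_recipients` helper are invented for illustration and are not Google's actual query.

```python
import sqlite3

# Toy stand-in for a transactions table; every address here is invented.
ROWS = [
    ("0xalice", "0xkitties"),
    ("0xbob", "0xkitties"),
    ("0xcarol", "0xkitties"),
    ("0xalice", "0xtoken"),
    ("0xbob", "0xtoken"),
    ("0xcarol", "0xdave"),
]

# Count transactions per recipient and keep the busiest ones.
TOP_RECIPIENTS_SQL = """
SELECT to_address, COUNT(*) AS tx_count
FROM transactions
GROUP BY to_address
ORDER BY tx_count DESC, to_address
LIMIT ?
"""


def top_recipients(rows, limit=2):
    """Return (to_address, tx_count) pairs, most frequently targeted first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (from_address TEXT, to_address TEXT)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    result = conn.execute(TOP_RECIPIENTS_SQL, (limit,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(top_recipients(ROWS))
```

On real data, the recipients would presumably also need to be restricted to contract addresses, since ordinary wallets receive transactions too.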

An Expansion of What’s Already Available

Software exists that allows checking the balance in an Ethereum wallet or finding out about the status of a transaction. However, what Google offers is different because it gives access to all data stored on the Ethereum blockchain.

This progress could open up more possibilities for data scientists who are interested in analyzing the blockchain but previously lacked the resources to do so. Using big data this way could also help people realize there are plenty of uses for blockchain technology that don't relate to cryptocurrencies. For example, a business might create a fully automated supply chain management tool or give updated business information to stakeholders in real time. Google's Ethereum data visualizations represent data pulled from the Ethereum ledger each day, so data scientists can rest assured the data is up to date.

The company ran another example related to the CryptoKitties game mentioned earlier: a query that counted how many users owned at least 10 of the game's creatures, known as CryptoKitties.
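One plausible way to answer an "owns at least N" question is to net out token transfers per address: add one for each token received, subtract one for each token sent, then filter on the total. The sketch below does this on toy data in SQLite. The table layout, the addresses, and the `holders_with_at_least` helper are assumptions for illustration, and Google's own query may be written differently.

```python
import sqlite3

ZERO = "0x0"  # stand-in for the address newly created tokens come from

# (from_address, to_address, token_id); all values are invented.
TRANSFERS = [
    (ZERO, "0xalice", 1),
    (ZERO, "0xalice", 2),
    (ZERO, "0xalice", 3),
    (ZERO, "0xbob", 4),
    ("0xalice", "0xbob", 1),
    (ZERO, "0xbob", 5),
]

# +1 for every token received, -1 for every token sent, summed per address.
HOLDERS_SQL = """
SELECT address, SUM(delta) AS balance
FROM (
    SELECT to_address AS address, 1 AS delta FROM token_transfers
    UNION ALL
    SELECT from_address AS address, -1 AS delta FROM token_transfers
)
GROUP BY address
HAVING SUM(delta) >= ?
ORDER BY balance DESC, address
"""


def holders_with_at_least(transfers, min_tokens):
    """Return (address, balance) pairs for addresses holding >= min_tokens."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_transfers "
        "(from_address TEXT, to_address TEXT, token_id INTEGER)"
    )
    conn.executemany("INSERT INTO token_transfers VALUES (?, ?, ?)", transfers)
    result = conn.execute(HOLDERS_SQL, (min_tokens,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(holders_with_at_least(TRANSFERS, 3))
```

The threshold is a parameter, so the same sketch covers "at least 10" or any other cutoff.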

If data scientists wanted to use the data differently, they might use a similar query showing the most popular ways people use Ethereum and how they change over time. Alternatively, they could track surges in activity and attempt to discover the reasons behind those spurts of abnormal usage.
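Spotting surges could start with something as simple as counting transactions per day and flagging days that sit well above the average. The sketch below uses a z-score cutoff on invented daily counts. The threshold of two standard deviations and the helper names are arbitrary choices for illustration, not a method from Google's post.

```python
from collections import Counter
from statistics import mean, pstdev


def daily_counts(dates):
    """Count transactions per day, given one ISO date string per transaction."""
    return dict(sorted(Counter(dates).items()))


def find_surges(counts, threshold=2.0):
    """Return days whose count is more than `threshold` std devs above the mean."""
    values = list(counts.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly flat activity, nothing stands out
    return [day for day, n in counts.items() if (n - mu) / sigma > threshold]


if __name__ == "__main__":
    # Invented counts: five ordinary days and one obvious spike.
    sample = {"day1": 10, "day2": 12, "day3": 11, "day4": 9, "day5": 10, "day6": 50}
    print(find_surges(sample))
```

A flagged day is only a starting point. Working out the reason behind the spike still means looking at which contracts or addresses drove the extra traffic.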

Google Explains Its Big Data Blockchain Offering

A blog post by Google goes into more detail about why the company decided to give people access to Ethereum data this way. For starters, API endpoints for aggregate data related to Ethereum didn't exist before Google made this move.

In the blog entry, the writer suggests a business depending on the Ethereum architecture for part of its operations might use a big data query to conclude it’s time for an architecture upgrade. It’s also possible to use a query to find out transaction frequencies between particular wallet addresses.
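Counting how often particular wallets transact with each other is a straightforward tally over sender and recipient pairs. Here is a minimal sketch on invented data. The `pair_frequencies` helper and the short address labels are hypothetical.

```python
from collections import Counter


def pair_frequencies(transfers, watched=None):
    """Count transfers per (sender, recipient) pair.

    If `watched` is given, only pairs where both addresses are in that set
    are counted. Direction matters: (a, b) and (b, a) are separate pairs.
    """
    counts = Counter()
    for sender, recipient in transfers:
        if watched is None or (sender in watched and recipient in watched):
            counts[(sender, recipient)] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Invented transfers between three made-up wallets.
    sample = [("a", "b"), ("a", "b"), ("b", "a"), ("a", "c")]
    print(pair_frequencies(sample, watched={"a", "b"}))
```

The same tally maps onto a SQL `GROUP BY` over the sender and recipient columns when the data lives in a warehouse instead of a Python list.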

Moreover, BigQuery can give insights about the functions of particular smart contracts, even if users do not have the source code of those contracts. So, if a company that's considering a certain kind of smart contract employs a data scientist, that analyst could pull data showing whether those contracts are commonplace or still emerging.
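One way to learn about a contract's functions without its source, though not necessarily the approach Google's post takes, relies on how Ethereum encodes calls: the first four bytes of a transaction's input data identify the function being invoked. Tallying those selectors shows which functions are used most. The sketch below assumes inputs arrive as hex strings and maps three widely published ERC-20 selectors to readable names. The helper names and sample inputs are made up.

```python
from collections import Counter

# Widely published ERC-20 selectors: the first 4 bytes of the Keccak-256
# hash of each function signature.
KNOWN_SELECTORS = {
    "0xa9059cbb": "transfer(address,uint256)",
    "0x095ea7b3": "approve(address,uint256)",
    "0x23b872dd": "transferFrom(address,address,uint256)",
}


def selector(tx_input):
    """Return the selector as '0x' plus 8 hex chars, or None if there is none."""
    data = tx_input.lower()
    if not data.startswith("0x") or len(data) < 10:
        return None  # plain value transfer or malformed input
    return data[:10]


def function_usage(inputs):
    """Tally calls per function, falling back to the raw selector if unknown."""
    counts = Counter()
    for tx_input in inputs:
        sel = selector(tx_input)
        if sel is not None:
            counts[KNOWN_SELECTORS.get(sel, sel)] += 1
    return dict(counts)


if __name__ == "__main__":
    padding = "00" * 64  # placeholder arguments, not real call data
    print(function_usage(["0xa9059cbb" + padding, "0x095ea7b3" + padding, "0x"]))
```

Unknown selectors are kept as raw hex, so unfamiliar functions still show up in the tally and can be investigated separately.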

More Blockchain Data Could Be Forthcoming

While discussing the process of updating the Ethereum blockchain data in BigQuery, Google briefly mentions the company welcomes other kinds of blockchain information for its system, as well as more contributors. Since Google is such a respected name in the technology sector at large, people may soon see more diverse data on BigQuery.

Similarly, people who work in fields including data science and machine learning might begin investigating with Google’s tool, then develop their own platforms or coordinate with others in a team effort to do so.

If that happens, blockchain data would become more readily available, opening up many possibilities for people interested in data science to work with platforms specialized for their needs.

About the Author

Contributed by: Kayla Matthews, a technology writer and blogger covering big data topics for websites like Productivity Bytes, CloudTweaks, SandHill and VMblog.


Speak Your Mind



  1. The reason for making the Ethereum blockchain data available on Google Cloud is to make all information stored on the blockchain easily accessible. While Ethereum's software contains APIs for functions that can be accessed randomly, such as checking wallet balances, the API endpoints are not easily accessible for all information stored on the blockchain.