Building a Better AI Brain with Object Storage

In this special guest feature, Michael Tso, CEO of Cloudian, discusses how AI is rapidly changing the business world and why the storage industry will play a key role in helping AI deliver business value, a role that scale-out object storage with full S3 compatibility matches perfectly. Michael Tso holds 36 patents and has been a technology trailblazer for over 20 years. Michael co-founded Cloudian and Gemini Mobile Technologies, and built business and engineering operations in the US, Japan, and China. For more than 10 years, Cloudian and Gemini Mobile have provided mission-critical, carrier-grade infrastructure software that serves hundreds of millions of users. Now, Cloudian is trailblazing distributed object storage for cloud and enterprise storage use cases.

The rise of artificial intelligence (AI) and machine learning will completely reshape the business world, in much the same way that personal computers have over the past two decades. Companies and their workers are transforming into AI-assisted organizations where vast amounts of data are collected, stored, and consumed by machine learning, enabling real-time decisions precisely matched to business and consumer needs. We see examples of this already with recommendation engines on shopping sites like Amazon, or digital billboards in Tokyo that match advertisements to individual cars. But this is just the beginning. No one knows exactly how AI will change the business world, just as no one foresaw the boundless changes ushered in by the introduction of PCs.

Training Data is the Key to AI Readiness

AI and machine learning are only as good as the quality and quantity of their training data. A child with a high IQ raised in a library will be far more knowledgeable than one raised without access to information. For AI to deliver business value, the storage industry will play a key role. The emphasis on building more ‘intelligent’ technology means companies must retain massive volumes of unstructured data. Powerful computers equipped with AI software will iteratively consume the data in a training process. After each training run, the “knowledge model” created by the AI software will be tested for effectiveness, and additional data will be gathered, or existing data reorganized, to improve the next training run.

As an example, to train deep learning software to automatically recognize a particular car model, thousands of images of that car must be tagged with the model name for training; equally important, thousands more images of other cars must be tagged with different model names so the learning software can identify them as non-matches.  Here, the effectiveness of the learning software is completely determined by the training data.
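
As a rough illustration of the tagging described above, the sketch below turns a list of tagged images into match and non-match examples for one target car model. The file names, model names, and both helper functions are hypothetical and invented for this example; a real pipeline would work with thousands of images per model.

```python
from collections import Counter

# Hypothetical tagged images: (file name, car model tag). Invented for illustration.
TAGGED_IMAGES = [
    ("img_0001.jpg", "model_a"),
    ("img_0002.jpg", "model_b"),
    ("img_0003.jpg", "model_a"),
    ("img_0004.jpg", "model_c"),
]


def build_binary_labels(tagged_images, target_model):
    """Label each image 1 if it is tagged with target_model, else 0 (a non-match)."""
    return [(name, 1 if tag == target_model else 0) for name, tag in tagged_images]


def label_counts(labeled):
    """Count matches and non-matches so a lopsided training set is easy to spot."""
    return Counter(label for _, label in labeled)


labeled = build_binary_labels(TAGGED_IMAGES, "model_a")
counts = label_counts(labeled)
print(counts[1], "matches,", counts[0], "non-matches")
```

Images tagged with other model names are kept as non-matches rather than discarded, because the learning software needs both kinds of example.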

Similarly, a company’s data will be the key differentiator in an AI world, because the data embodies essential business know-how and intellectual property. Businesses must protect their training data with the same vigor as any confidential designs or business secrets. Would an online store put its crown jewels, its customers’ buying and browsing history, on the internet?

Therefore, for a company to be AI ready, it must do the following:

  1. Store vast amounts of unstructured data cheaply.
  2. Make it easy to add or modify the data.
  3. Tag data sets with attributes as training hints.
  4. Make the data accessible via an industry standard API supported by current and emerging AI tools.

Scale-out object storage with full S3 compatibility matches these criteria perfectly.
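
To make criteria 3 and 4 a little more concrete, here is a minimal sketch of storing an object together with its training hints through the S3 API. It assumes the boto3 client library, whose `put_object` call accepts a `Metadata` dictionary and whose client constructor accepts an `endpoint_url` for S3-compatible stores. The bucket name, key, attribute names, and helper functions are hypothetical.

```python
def make_s3_client(endpoint_url):
    """Create a client for an S3-compatible store (requires the boto3 package)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    # Credentials are resolved by boto3 from the environment or its config files.
    return boto3.client("s3", endpoint_url=endpoint_url)


def build_put_request(bucket, key, body, attributes):
    """Build keyword arguments for an S3 PutObject call with user metadata."""
    # S3 user metadata values are strings, so coerce each attribute.
    metadata = {str(k): str(v) for k, v in attributes.items()}
    return {"Bucket": bucket, "Key": key, "Body": body, "Metadata": metadata}


def store_tagged_object(client, bucket, key, body, attributes):
    """Upload one object together with its training-hint attributes."""
    return client.put_object(**build_put_request(bucket, key, body, attributes))


# Hypothetical names, for illustration only.
request = build_put_request(
    "training-data",
    "cars/img_0001.jpg",
    b"<image bytes>",
    {"car-model": "model_a", "split": "train"},
)
print(request["Metadata"])
```

With boto3 installed and credentials configured, calling `store_tagged_object(make_s3_client(url), ...)` would send the same request to whichever S3-compatible endpoint `url` points at.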

Most companies are new to AI and machine learning, but it is essential to start maintaining and growing this data now. When they are ready to adopt AI technologies, companies with years of historical data available for training will be far ahead of companies starting from scratch. Companies whose data is already in an S3-compatible scale-out object store will realize value from AI technologies much sooner than those that did not store the data at all or that kept it in traditional storage silos. AI training tools are great, but they are only as good as the quality and quantity of data you have to train them.

The Advantage of Object Storage

Scale-out object storage offers clear advantages when it comes to the storage and management of data to train machines. It allows all the data to reside in a single namespace at a low cost. With object storage, you can tag data with metadata, which is a key component of machine learning. Furthermore, S3 API compatibility opens access to new tools.
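
One way metadata tagging could feed a training run is sketched below: list the objects in a bucket and keep only those whose metadata matches the wanted attributes. The functions use the S3 `ListObjectsV2` and `HeadObject` operations as exposed by boto3-style clients (`list_objects_v2`, `head_object`). The `InMemoryStore` class is a hypothetical stand-in so the sketch runs without a server, and all keys and attribute names are invented.

```python
def iter_keys(client, bucket, prefix=""):
    """Yield every object key under prefix, following ListObjectsV2 pagination."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        for entry in page.get("Contents", []):
            yield entry["Key"]
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def select_by_metadata(client, bucket, wanted, prefix=""):
    """Return keys whose user metadata contains every item in wanted."""
    selected = []
    for key in iter_keys(client, bucket, prefix):
        metadata = client.head_object(Bucket=bucket, Key=key).get("Metadata", {})
        if all(metadata.get(k) == v for k, v in wanted.items()):
            selected.append(key)
    return selected


class InMemoryStore:
    """Hypothetical stand-in exposing the two S3 calls used above, for local runs."""

    def __init__(self, objects):
        self._objects = objects  # maps key -> metadata dict

    def list_objects_v2(self, Bucket, Prefix="", **_):
        keys = sorted(k for k in self._objects if k.startswith(Prefix))
        return {"Contents": [{"Key": k} for k in keys], "IsTruncated": False}

    def head_object(self, Bucket, Key):
        return {"Metadata": self._objects[Key]}


store = InMemoryStore({
    "cars/img_0001.jpg": {"car-model": "model_a", "split": "train"},
    "cars/img_0002.jpg": {"car-model": "model_b", "split": "test"},
    "cars/img_0003.jpg": {"car-model": "model_a", "split": "train"},
})
train_keys = select_by_metadata(store, "training-data", {"split": "train"}, prefix="cars/")
print(train_keys)
```

This sketch issues one metadata request per object, which is simple but slow for very large buckets; at scale, a separate index of the metadata is a common alternative.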

Just as businesses deploying PCs gave rise to email, office productivity applications, the internet, and the cloud, AI and machine learning technology will become mainstream with use cases we cannot even imagine today. Internet and retail search engines already complete search words based on what they ‘think’ we might want. They know your search and buying history: what you have searched for and eventually purchased.

Five to 10 years from now, companies that are not AI-ready will fall by the wayside, just like companies that were slow to adopt PCs. AI will be the new norm, feeding organizations critical business intelligence about their consumers and how to reach them.

For those wanting to invest in AI or machine learning, the organization’s data history will be its biggest differentiator in the future. Learning from that data history will feed the AI engine tomorrow, but only if the data is retained and is easily consumable. The IT world is shifting and the winners will keep data in a format that is AI friendly.

