Interview: Paulo Sampaio, Data Scientist at EDITED

I recently caught up with Paulo Sampaio, Data Scientist at EDITED, to talk about applying machine learning, neural networks, natural language processing, and big data analytics to the retail industry. Paulo came from the professional services world, which is more accustomed to machine learning approaches, so he has had to research techniques from tech titans like Google and adapt them to retail. Paulo and his team are applying neural networks, machine learning and other models to analyze over 520 million products in real-time across 42 countries to make granular distinctions in clothing styles, sizes and categories.

insideBIGDATA: How is the EDITED team applying neural networks, machine learning and other statistical models to analyze products in real-time across a wide geographical area?

Paulo Sampaio: EDITED gives apparel retailers around the world the real-time data they need to always have the right product at the right price, at the right time. We apply machine learning, AI, neural networks and other statistical models to help brands understand competitors’ pricing, merchandise the best product assortments, and spot key trends early to gain a competitive edge.

For EDITED, it’s about applying the best tool to each situation – for instance, product categorization is a Natural Language Processing (NLP) classification task. We use a support vector machine-inspired algorithm that learns from a custom-built dataset: our apparel specialists label a set of products, and the trained model then predicts the labels for all the other items in the database. For product color detection, EDITED uses a shape detection algorithm combined with pixel-level clustering to identify and isolate the products in the pictures. We also use a random forest model to remove any skin pixels left over in the process, which better isolates the colors belonging to the product rather than the background or the model wearing it.
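To make the pixel-level clustering idea concrete, here is a minimal sketch in Python. It is not EDITED’s implementation: the interview does not name the clustering algorithm, so plain k-means is assumed here, the function name `dominant_colors` is hypothetical, and the input is assumed to be a list of RGB tuples that have already been isolated as product pixels (the shape detection and skin removal steps are not shown).

```python
def dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dominant_colors(pixels, k=2, iterations=10):
    """Group RGB pixels with k-means (an assumed technique) and
    return (center, share) pairs, largest cluster first."""
    # Farthest-point initialisation: deterministic, and it spreads
    # the starting centers across the color space.
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(
            max(pixels, key=lambda p: min(dist2(p, c) for c in centers))
        )

    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        clusters = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            tuple(sum(channel) / len(cl) for channel in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]

    # Shares are taken from the last assignment step.
    total = len(pixels)
    result = [(centers[i], len(cl) / total) for i, cl in enumerate(clusters)]
    return sorted(result, key=lambda pair: pair[1], reverse=True)


# Toy example: a mostly red garment with a smaller blue trim.
toy_pixels = [(200, 30, 40)] * 6 + [(210, 40, 50)] * 4 + [(20, 20, 120)] * 5
for center, share in dominant_colors(toy_pixels, k=2):
    print(center, round(share, 2))
```

On real images the number of clusters and the color space (RGB versus a perceptual space such as Lab) would be design choices to tune; the sketch fixes both for brevity.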

Because the apparel and retail industry is incredibly fast-paced, with shrinking product cycles and high consumer demand, it’s absolutely critical for our customers to have the latest insights on opportunities and trends in an instant. We are constantly looking at ways to apply the latest techniques to our platform, which currently analyzes over 550 million product SKUs (adding over half a million more each month), across 42 countries.

insideBIGDATA: What would you estimate to be the coolest use of AI in the fast-moving apparel retail sector?

Paulo Sampaio: One of the most interesting uses of AI in the apparel retail sector today is convolutional neural networks. They are part of virtually any sort of computer vision application you can imagine – including visual product identification, object detection, visual similarity and more.

From a scientific research perspective, last year we saw a lot of papers published on GANs (Generative Adversarial Networks), a branch of neural network research that studies how to train models that can create new data based on the data they were trained on. For instance, you can train a model to understand a particular style of floral dress. Then you can ask the model to create a new floral dress from scratch, and it will hopefully produce a new and unique design based on what it learned a floral dress to be during the training phase. Such applications are still in their infancy, but they are very interesting and exciting for the apparel retail industry!

insideBIGDATA: How does EDITED work with customers including some large fashion brands?

Paulo Sampaio: At EDITED, we have a data manager with extensive experience in the retail industry. She acts as the bridge between our customers and the data science team: she understands our customers’ challenges and requirements and conveys them to the data scientists, who then work towards the best solution.

It’s very important to have a business-oriented person working together with the data scientists, so that the company doesn’t lose sight of the customers’ needs and has the experience and insights to back this up. As a data scientist, it can be easy to get distracted by the cool research, so working together with a business partner helps everyone focus.

insideBIGDATA: How does the EDITED technology help to boost sales using AI?

Paulo Sampaio: EDITED provides the largest and most accurate source of real-time apparel retail data in the world, delivering all the insights a retailer can ask for – such as historical product information, pricing averages, replenishment data, and so on.

Combined with AI, this data lets our customers evaluate how their assortment is performing compared to their competitors’. They can then plan their buying decisions, pricing strategies, markdowns and so on. As an example, let’s imagine that you are assessing denim jeans. You can create a dashboard showing black, blue and khaki styles from your competitors, and you see that the black jeans have been priced up over the last couple of weeks and are selling out across the board. This is probably the right time to replenish your stock and evaluate the best price.
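The denim example boils down to two signals per color: how the average competitor price has moved, and what share of styles has sold out. The Python sketch below illustrates that arithmetic only. The observations are made-up numbers, and the function names and thresholds are hypothetical; none of this reflects how the EDITED dashboard actually computes its figures.

```python
from collections import defaultdict

# Hypothetical competitor observations for denim jeans:
# (color, price two weeks ago, price today, sold out today?)
observations = [
    ("black", 40.0, 46.0, True),
    ("black", 50.0, 55.0, True),
    ("black", 45.0, 45.0, False),
    ("blue", 40.0, 40.0, False),
    ("blue", 60.0, 57.0, False),
    ("khaki", 35.0, 35.0, True),
    ("khaki", 30.0, 27.0, False),
]


def summarize(rows):
    """Per color: percentage change in average price and share of styles sold out."""
    grouped = defaultdict(list)
    for color, old_price, new_price, sold_out in rows:
        grouped[color].append((old_price, new_price, sold_out))

    summary = {}
    for color, items in grouped.items():
        avg_old = sum(old for old, _, _ in items) / len(items)
        avg_new = sum(new for _, new, _ in items) / len(items)
        summary[color] = {
            "price_change_pct": 100 * (avg_new - avg_old) / avg_old,
            "sold_out_share": sum(1 for _, _, s in items if s) / len(items),
        }
    return summary


def replenish_candidates(summary, min_price_change_pct=0.0, min_sold_out_share=0.5):
    """Colors whose prices are rising while at least half the styles are sold out.
    Both thresholds are illustrative assumptions."""
    return [
        color
        for color, stats in summary.items()
        if stats["price_change_pct"] > min_price_change_pct
        and stats["sold_out_share"] >= min_sold_out_share
    ]


summary = summarize(observations)
print(replenish_candidates(summary))
```

With these toy numbers only the black styles show both rising prices and widespread sell-outs, which matches the situation described in the example.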

insideBIGDATA: What does EDITED have in store for the future?

Paulo Sampaio: The apparel retail market is one of the most competitive industries in the world, driven by shrinking product cycles, changeable consumer tastes, and the ability to buy anything from anywhere in the world with a click of a button. We’re seeing that the savviest retailers are taking advantage of the latest advancements in machine learning, deep analytics and AI to deliver a more targeted and personalized shopping experience. At EDITED, we plan to continue innovating to help brands and retailers grow, and be better, faster and more efficient in how they do business.

