Cleanlab for E-Commerce and Retail

Ensure accurate information on your website and in your product listings, customer reviews, and internal data. With more accurate information, you can deploy more reliable ML models and analytics.

Case Study
Ping An Insurance

Ping An Insurance used Cleanlab in an e-commerce application to: find 10% noise in their data labels, filter the detected bad data, and more robustly train their product classifier.

10% reduction in label noise

If the classifier is trained with these noisy images directly, its performance could be degraded. In view of this, we attempted to find label errors in the image dataset with an open source tool cleanlab, a framework powered by the theory of confident learning. Specifically, we trained multiple ResNet50 image classifiers to compute the predicted product category probabilities for all the training samples in a cross-validation manner. Then the cleanlab tool could utilize the matrix of predicted probabilities to find noisy samples, ordered by likelihood of being an error. We removed the top 10% noisy samples from the training set.

A Multimodal Late Fusion Model for E-Commerce Product Classification
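
The procedure described in the quote above (score every training sample with out-of-sample predicted probabilities, rank samples by how likely their label is wrong, drop the top 10%) can be sketched in a few lines. This is a simplified illustration, not Ping An's code and not cleanlab's full confident learning algorithm: it ranks samples only by "self-confidence", the predicted probability of the given label. The function names and the toy data are hypothetical, and the probabilities are assumed to come from cross-validation as in the quote.

```python
def rank_by_self_confidence(labels, pred_probs):
    """Return sample indices ordered from most to least likely mislabeled.

    Self-confidence is the predicted probability of the given label.
    A low value means the model disagrees with the label.
    """
    scores = [probs[label] for label, probs in zip(labels, pred_probs)]
    return sorted(range(len(labels)), key=lambda i: scores[i])


def drop_noisiest(labels, pred_probs, fraction=0.10):
    """Return the indices kept after removing the top `fraction` noisiest samples."""
    ranked = rank_by_self_confidence(labels, pred_probs)
    n_drop = int(len(labels) * fraction)
    flagged = set(ranked[:n_drop])
    return [i for i in range(len(labels)) if i not in flagged]


# Toy data (hypothetical): 10 products, 3 categories.
# Each row holds out-of-sample predicted probabilities for one product.
labels = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
pred_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
    [0.05, 0.90, 0.05],  # labeled 0, but the model strongly predicts 1
    [0.20, 0.70, 0.10],
    [0.10, 0.20, 0.70],
    [0.85, 0.10, 0.05],
    [0.15, 0.75, 0.10],
    [0.10, 0.10, 0.80],
    [0.60, 0.30, 0.10],
]

kept = drop_noisiest(labels, pred_probs, fraction=0.10)
print(kept)  # indices of the samples that remain in the training set
```

In practice, the open-source cleanlab package exposes this kind of ranking through its `cleanlab.filter.find_label_issues` function, which takes the given labels and the matrix of predicted probabilities; the sketch above only mirrors the idea on a small scale.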

Ping An Insurance is a Chinese holding conglomerate whose subsidiaries provide insurance, banking, asset management, financial, and healthcare services.

HOW CLEANLAB CAN HELP YOUR BUSINESS

Better estimate the true quality of a product from noisy reviews. In this example, Cleanlab Studio automatically found the given label to be incorrect and suggested the correct label of "5 stars".

Cleanlab Studio enables data-centric AI to build accurate ML models for messy real-world data. You can harness AutoML for text, image, and tabular data (Excel, CSV, JSON), allowing you to focus on the most important aspect: the data.

Cleanlab Studio scans any image dataset for common real-world issues, such as images that are blurry, under- or over-exposed, oddly sized, or (near) duplicates of others, enabling you to produce high-quality computer vision datasets.

Videos on using Cleanlab Studio to find and fix incorrect labels for:

product reviews (text data)
product categories (image data)
tabular data (e.g. numeric/categorical product metadata like price, rating, brand, etc.)

Detect errors in product descriptions/categorizations and issues like (near) duplicate or anomalous SKUs.

Related applications

Cleanlab Studio auto-corrects raw data to ensure reliable predictions so you can maximize customer experience.

Case Study
Automated quality assurance for product catalogs

Cleanlab Studio was used to improve an e-commerce website, its product listings, and its analytics. Finding and fixing errors in product descriptions and metadata can be entirely automated, and doing so improves customer experience, product discoverability, SEO, advertising, and analytics/decision-making.

Read more: Enhancing Product Analytics and E-commerce with Cleanlab Studio

Cleanlab Studio seamlessly handles data with image, text, and structured/tabular features (e.g. product price, size, etc.) to auto-detect many common issues in product catalogs, including:

Products (SKUs) that are miscategorized or have incorrect tags (tax-classifications, age-restrictions, ...)
Near-duplicate products (SKUs)
Products with images that are low-quality or NSFW
Products with low-quality text descriptions
Products whose image does not match description
Text in descriptions or review comments that contains toxic language or Personally Identifiable Information, or that is not in English

Related applications

Customer Service

Business Intelligence / Analytics

Data Entry, Management, and Curation

Content Moderation

Foundation and Large Language Models

Data Annotation & Crowdsourcing