About this project
Color Matching Beyond Brand Shade Names
Shade names like "Velvet Plum," "Midnight Berry," and "Spiced Rosewood" are evocative and aspirational, but not helpful when searching for a specific shade among thousands of options. Online shopping makes it worse: coarse color groupings like "Pink," "Berry," and "Nude" leave users scrolling through hundreds of products... and even those groupings span a wide range of shades!
So I built a color search and discovery tool that skips the names entirely, matching lip products by actual color using CIELAB and Delta E, the same standard cosmetics manufacturers use internally.
Behind the tool is an ML pipeline — image classification, semantic segmentation, and perceptual color clustering — applied to 9,000+ raw product images to build a searchable color index from scratch.
And it's a sizable market
$17.49B → $23.77B. The global lipstick market is projected to grow 4.7% a year from 2024 to 2030. It is an enormous category that still has considerable user pain points in search and discovery.
The why, what & how, in full below↓
Why: The palette
Every major retailer collapses the entire color spectrum into about a dozen named groups: Sephora has 14, Ulta has 12. A dozen groupings might sound like a lot, but a single category can hold hundreds of products spanning very different shades.
Take "Pink," for instance. The Sephora and Ulta palettes above already show that "Pink" means something different at each retailer. Apply that filter and the result is a wide range of shades, hundreds of products, endless scrolling, and no way to quickly find the color someone is searching for:
Marketplaces like Amazon and Google Shopping offer just a search bar and no color filter at all. Finding something more specific than "red" or "pink" requires fluency in makeup vocabulary: knowing that the shade in mind is "terracotta," "mauve," or "dusty rose." These terms mean nothing to a casual shopper. Moreover, discovery becomes impossible when there are no words for what you're looking for.
In fact, here is what searching "pink lipstick" returns:
The takeaway
Searching for "Pink" returns every shade below (and more) at once, making it difficult to search for and discover lip products.
What: The approach
Color search works differently depending on how much is already known about the target shade. The three modes below cover everything from casual exploration to color-perfect precision.
A. Select a hue from the color wheel.
B. Matching products ranked by similarity.
C. Zoom into the selected hue for lighter to deeper options.
D. Results re-rank to the new choice.
A. Upload an image and sample a color.
B. Matching products and more options.
Pick any color you love or paste a hex, and we'll find the lipsticks closest to it.
A. Enter a hex number or open the color field to find similar lip products.
B. Opening the full color field allows for further discovery across hues and shades.
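For the curious, here is a minimal sketch of how a pasted hex value could be turned into a CIELAB coordinate before matching. It is not the app's actual code: the function names are hypothetical, and it assumes the hex is an sRGB color and uses the D65 white point.

```python
def srgb_to_linear(channel: float) -> float:
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    if channel <= 0.04045:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def lab_f(t: float) -> float:
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    return t / (3 * delta ** 2) + 4 / 29


def hex_to_lab(hex_color: str) -> tuple:
    """Convert '#rrggbb' (assumed sRGB) to an (L*, a*, b*) tuple under D65."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a 6-digit hex color such as '#c2185b'")
    r, g, b = (srgb_to_linear(int(digits[i:i + 2], 16) / 255) for i in (0, 2, 4))

    # Linear sRGB -> CIE XYZ (D65 reference white).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    # XYZ -> CIELAB, normalizing by the D65 white point.
    fx, fy, fz = lab_f(x / 0.95047), lab_f(y / 1.0), lab_f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


if __name__ == "__main__":
    print(hex_to_lab("#c2185b"))
```

Once the query is in CIELAB, it sits in the same space as every indexed product, so distances between the two are meaningful.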
From the wishlist
Saved products become a starting point. Anything in the wishlist can anchor a new search, making it possible to explore similar shades without starting from scratch, or find a cheaper alternative to a favorite.
How: Under the hood
Building the tool required an end-to-end ML pipeline, from raw scraped images to a searchable perceptual color index. Each stage is outlined below.
Scraped product listings and imagery across major retailers. Validated and repaired downloaded images.
No labeled dataset existed, so one was built. Stratified-sampled images and extracted CIELAB values as ground truth.
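As an illustration of the sampling step, here is a rough sketch of stratified sampling in plain Python. The stratum key (`retailer`), the 10% fraction, and the function name are assumptions made for the example; the real pipeline may stratify on something else entirely.

```python
import random
from collections import defaultdict


def stratified_sample(items, key, fraction, seed=0):
    """Draw roughly `fraction` of the items from every stratum.

    `key` maps an item to its stratum label. Each stratum contributes at
    least one item so that small groups are never dropped.
    """
    rng = random.Random(seed)  # fixed seed keeps the labeling set reproducible
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)

    sample = []
    for label in sorted(strata):  # sorted for a stable iteration order
        members = strata[label]
        n = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, n))
    return sample


if __name__ == "__main__":
    # Hypothetical catalog: 10 images from one retailer, 90 from another.
    images = [{"id": i, "retailer": "A"} for i in range(10)]
    images += [{"id": i, "retailer": "B"} for i in range(10, 100)]
    picked = stratified_sample(images, key=lambda im: im["retailer"], fraction=0.1)
    print(len(picked))
```

Sampling per stratum rather than from the whole pool keeps smaller groups represented in the labeled set instead of being swamped by the largest one.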
Trained a classifier to distinguish image types such as swatch, bullet lipstick, and liquid lipstick, among others. Image type determines the color extraction strategy, rather than applying a one-size-fits-all method.
A segmentation model identifies the color region for each image type — the bullet tip, the wand — isolating the actual product color from packaging and background.
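To make the isolation step concrete, here is a small sketch of how a segmentation mask could be used to pull one color out of an image. It assumes the pixels are already in CIELAB and that the per-channel median is the aggregate; both are assumptions for illustration, as is the function name.

```python
from statistics import median


def masked_lab_color(lab_pixels, mask):
    """Return the per-channel median (L*, a*, b*) of the masked pixels.

    lab_pixels: rows of (L*, a*, b*) tuples.
    mask: rows of booleans with the same shape, True where the segmentation
          model marked product color (for example, the bullet tip).
    """
    selected = [
        pixel
        for pixel_row, mask_row in zip(lab_pixels, mask)
        for pixel, keep in zip(pixel_row, mask_row)
        if keep
    ]
    if not selected:
        raise ValueError("mask selects no pixels")
    # zip(*selected) regroups the pixels into three per-channel sequences.
    return tuple(median(channel) for channel in zip(*selected))


if __name__ == "__main__":
    # Tiny 2x2 example: the top-right pixel is background and is masked out.
    pixels = [
        [(50.0, 60.0, 30.0), (90.0, 0.0, 0.0)],
        [(52.0, 62.0, 28.0), (54.0, 64.0, 32.0)],
    ]
    mask = [
        [True, False],
        [True, True],
    ]
    print(masked_lab_color(pixels, mask))
```

A median is used here because it is less sensitive than a mean to a few stray highlight or shadow pixels inside the mask.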
Color is extracted in CIELAB space for each product, giving every shade a precise, perceptually grounded coordinate. A Gaussian Mixture Model (GMM) clusters the full catalog to define the color wheel.
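For readers unfamiliar with GMMs, here is a simplified sketch of fitting one to CIELAB points with expectation-maximization. It is a teaching version under stated assumptions: diagonal covariances, a deterministic farthest-point initialization, and hypothetical function names. In practice a library implementation such as scikit-learn's `GaussianMixture` would be the more sensible choice.

```python
import math


def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def log_gauss(point, mean, var):
    """Log density of a diagonal-covariance Gaussian at `point`."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(point, mean, var))


def init_means(points, k):
    """Start at points[0], then keep adding the point farthest from all means."""
    means = [list(points[0])]
    while len(means) < k:
        far = max(points, key=lambda p: min(sq_dist(p, m) for m in means))
        means.append(list(far))
    return means


def responsibilities(points, weights, means, variances):
    """E-step: posterior probability of each component for every point."""
    resp = []
    for p in points:
        logs = [math.log(w) + log_gauss(p, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # subtract the max before exponentiating for stability
        exps = [math.exp(value - top) for value in logs]
        total = sum(exps)
        resp.append([e / total for e in exps])
    return resp


def fit_gmm(points, k, iters=50, floor=1e-6):
    n, dim = len(points), len(points[0])
    means = init_means(points, k)
    centre = [sum(p[d] for p in points) / n for d in range(dim)]
    spread = [sum((p[d] - centre[d]) ** 2 for p in points) / n + floor
              for d in range(dim)]
    variances = [spread[:] for _ in range(k)]  # start wide, let EM tighten
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = responsibilities(points, weights, means, variances)
        for j in range(k):  # M-step: re-estimate each component
            nj = max(sum(r[j] for r in resp), floor)
            weights[j] = nj / n
            for d in range(dim):
                means[j][d] = sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                variances[j][d] = floor + sum(
                    r[j] * (p[d] - means[j][d]) ** 2
                    for r, p in zip(resp, points)) / nj
    return weights, means, variances


def predict(points, weights, means, variances):
    """Assign each point to its most probable component."""
    resp = responsibilities(points, weights, means, variances)
    return [max(range(len(r)), key=r.__getitem__) for r in resp]
```

Each fitted component mean is itself a CIELAB coordinate, so one plausible reading is that the component means act as the anchor colors of the wheel, with each product assigned to its most probable component.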
Every product becomes a point in CIELAB space, so a query color returns the closest shades by ΔE across brands, lip product types, and price points.
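The lookup itself can be sketched in a few lines. This example assumes the CIE76 form of ΔE, which is plain Euclidean distance in CIELAB; the project may well use a more refined formula such as CIEDE2000. The product names, brands, and coordinates are made up for illustration.

```python
import math
from typing import NamedTuple


class Product(NamedTuple):
    name: str
    brand: str
    lab: tuple  # (L*, a*, b*)


def delta_e76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two Lab points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))


def closest_shades(query_lab, catalog, k=3):
    """Return the k products nearest to the query color, closest first."""
    ranked = sorted(catalog, key=lambda p: delta_e76(query_lab, p.lab))
    return ranked[:k]


# Hypothetical catalog entries, not real products.
CATALOG = [
    Product("Shade A", "Brand X", (45.0, 55.0, 25.0)),
    Product("Shade B", "Brand Y", (47.0, 54.0, 27.0)),
    Product("Shade C", "Brand Z", (70.0, 25.0, 10.0)),
]

if __name__ == "__main__":
    query = (46.0, 55.0, 25.0)
    for product in closest_shades(query, CATALOG, k=2):
        print(product.brand, product.name, round(delta_e76(query, product.lab), 2))
```

Because the ranking depends only on the Lab coordinate, brand, product type, and price never enter the distance, which is what lets a match surface from anywhere in the catalog. A full sort is fine at the scale of a few thousand products; a spatial index would only matter for a much larger catalog.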
Outcome
The web app lets users search 9,000+ products by color. For most products, the top match differs from the target shade by no more than the human eye can detect.
Pipeline code, model choices, evaluation, and the color-matching logic are all documented on GitHub.
About me
Data Scientist who builds the measurement systems, experiments, and ML pipelines behind AI products: what to build, why they fail, how to improve them, and when to launch.
I'd love to hear if you found the app useful. You can connect on LinkedIn and include a message that you checked out the app!
Connect on LinkedIn