How Color Search Works

To implement color search, we used code to analyze each image in our collection and identify its predominant colors in hexadecimal color notation. The colors identified in this analysis are then matched against two separate palettes to make them useful for searching. Because the number of possible colors is very large (more than 16 million can be expressed in six-digit hexadecimal notation), we need to make the colors less specific so that they are more likely to match colors found in other images. For instance, if one image contains one shade of red (#e81603) and another contains a slightly different shade of red (#eb1203), we can match both shades to a common red (#ff0000) found in the color palette.
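As a rough illustration of this matching step, the sketch below snaps a hexadecimal color to the closest entry in a small palette. It is a minimal example under stated assumptions, not the code used for this project: the function names, the five-color sample palette, and the use of squared distance in RGB space as the measure of "closest" are all chosen here for illustration.

```python
def hex_to_rgb(hex_color):
    """Convert a string such as '#e81603' to an (r, g, b) tuple of ints."""
    value = hex_color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def nearest_palette_color(hex_color, palette):
    """Return the palette entry closest to hex_color.

    Assumption: closeness is squared Euclidean distance in RGB space.
    Other color spaces or distance measures would work equally well here.
    """
    r, g, b = hex_to_rgb(hex_color)

    def distance(candidate):
        cr, cg, cb = hex_to_rgb(candidate)
        return (r - cr) ** 2 + (g - cg) ** 2 + (b - cb) ** 2

    return min(palette, key=distance)


# Hypothetical sample palette, far smaller than the palettes described here.
SAMPLE_PALETTE = ["#ff0000", "#008000", "#0000ff", "#ffffff", "#000000"]

# The two reds from the example above both resolve to the same palette red.
first_match = nearest_palette_color("#e81603", SAMPLE_PALETTE)
second_match = nearest_palette_color("#eb1203", SAMPLE_PALETTE)
```

Because both shades resolve to the same palette entry, a search on that entry can return both images even though their exact hexadecimal values differ.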

The first palette, used for more specific matches, is made up of 139 colors defined in the CSS4 Color Module; the second palette, used for color-family matches, is made up of 16 general CSS4 colors. Because colors are approximated in this way, users may sometimes encounter matches that seem off. Another limitation is that our color analysis is performed on the entire image, not just the object in the image. Because many of our images are sketches, the color of the paper itself is sometimes identified as a predominant color even though that is less useful to users. On the whole, though, these limitations are far outweighed by the benefits of this feature.
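The sketch below illustrates both the two-palette lookup and the paper-color limitation. It is only an approximation under stated assumptions: the palettes are small hand-picked excerpts of CSS named colors rather than the full sets, the image is a made-up list of pixel values, the predominant color is found by simple pixel counting, and "closest" again means squared distance in RGB space. None of the function names come from the project's code.

```python
from collections import Counter

# Hypothetical excerpts; the full palettes have 139 and 16 entries.
SPECIFIC_PALETTE = {
    "ivory": (255, 255, 240),
    "beige": (245, 245, 220),
    "linen": (250, 240, 230),
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
    "firebrick": (178, 34, 34),
}
FAMILY_PALETTE = {
    "white": (255, 255, 255),
    "silver": (192, 192, 192),
    "black": (0, 0, 0),
    "red": (255, 0, 0),
    "maroon": (128, 0, 0),
    "yellow": (255, 255, 0),
}


def nearest_name(rgb, palette):
    """Return the name of the palette color closest to rgb (assumed RGB distance)."""
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))
    return min(palette, key=distance)


def predominant_color(pixels):
    """Return the most frequent pixel value across the whole image."""
    return Counter(pixels).most_common(1)[0][0]


# A made-up sketch: mostly cream paper with a few strokes of red ink.
PAPER = (245, 242, 222)
INK = (232, 22, 3)
pixels = [PAPER] * 90 + [INK] * 10

top_color = predominant_color(pixels)
specific_match = nearest_name(top_color, SPECIFIC_PALETTE)
family_match = nearest_name(top_color, FAMILY_PALETTE)
ink_family = nearest_name(INK, FAMILY_PALETTE)
```

In this made-up example the paper, not the red ink, is reported as the predominant color, which is the limitation described above. Keeping several of the most frequent colors per image, rather than only the top one, would be one way for the ink color to remain searchable.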

Our work here has followed the lead of many of our colleagues, specifically the innovative uses of color in the digital collections at the Cooper Hewitt and Google Arts and Culture. The code we used for this project was adapted largely from code written by the Cooper Hewitt for their digital collections and detailed on the Cooper Hewitt Labs blog. If you have any comments, suggestions, or questions about our work, please feel free to contact us.