Comparison of methods for characterizing skin pigment diversity in research cohorts

Michael S. Lipnick
Danni Chen
Tyler Law
Kelvin Moore
Jenna C. Lester
Ellis P. Monk
Carolyn M. Hendrickson
Yu Chou
Caroline Hughes
Ella Behnke
Seif Elmankabadi
Lily Ortiz
Fekir Negussie
Gregory Leeb
Odinakachukwu Ehie
Isabella Auchus
Elizabeth N. Igaga
Ronald Bisegerwa
Olubunmi Okunlola
Philip Bickler
John Feiner
Leonid Shmuylovich

Published on Feb 25, 2025

Abstract

Background

Some pulse oximeters perform worse in people with darker skin, and this may be due to inadequate diversity of skin pigment in device development study cohorts. Guidance is needed to accurately and equitably characterize skin pigment to ensure diversity in research cohorts. We tested multiple methods for characterizing skin pigment to assess comparability and impact on cohort diversity.

Objectives

Assess reliability and comparability of common skin pigment measurement methods
Compare findings from different anatomical sites
Demonstrate that pigment cannot be assumed from US National Institutes of Health (NIH) race categories

Methods

We used three subjective methods (perceived Fitzpatrick [pFP], Monk Skin Tone [MST], and Von Luschan [VL]) and two objective instruments (the Konica Minolta CM-700d spectrophotometer and the Delfin Skin Color Catch [DSCC] colorimeter) to obtain individual typology angle (ITA) at multiple measurement sites in adults. We calculated ΔE to estimate operator perceptibility thresholds for subjective methods and to determine reproducibility for objective methods. We used each method to categorize participants as ‘light’, ‘medium’, or ‘dark’ and compared the impact of method selection on cohort diversity.
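For readers unfamiliar with the two quantities named above, the following is a minimal sketch, not the study's analysis code. ITA is conventionally computed from CIELAB coordinates as arctan((L* − 50)/b*) expressed in degrees. The abstract does not state which ΔE formula was used, so the CIE76 Euclidean distance below is an assumption, and the function names are hypothetical.

```python
import math


def ita_degrees(l_star: float, b_star: float) -> float:
    """Individual typology angle (degrees) from CIELAB L* and b*.

    Uses atan2 so the result stays defined when b* is 0. For b* > 0
    (the usual case for skin) this equals arctan((L* - 50) / b*).
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def delta_e_cie76(lab1, lab2) -> float:
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples.

    Assumption: the abstract does not say which Delta-E variant was used.
    """
    return math.dist(lab1, lab2)
```

Under this definition, lower (more negative) ITA corresponds to darker skin, which is why the ‘dark’ thresholds discussed in the Results are negative angles.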

Results

We studied 789 participants, with 33,856 assessments. The MST had the widest luminosity range, and VL had the least discernible adjacent categories. With ‘dark’ defined as ITA < -30°, 14% of participants were categorized as ‘dark’, compared with 26% by pFP and 16% by MST. Approximately half of the ‘dark’ cohort had an ITA < -50°. With an ITA threshold of < -50°, only 7% of the cohort was categorized as ‘dark’. When ‘Black or African American’ self-identification was used to define ‘dark’, 23% of the cohort was categorized as such. Each self-assigned NIH race category included a wide range of ITA and subjective scale categories. Both ITA and L* from the CM-700d and DSCC demonstrated strong correlation (ρ > 0.7).
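The sensitivity of the ‘dark’ proportion to the chosen ITA cutoff can be illustrated with a short sketch. The function name and the sample ITA values below are hypothetical and are not study data. Only the two cutoffs (-30° and -50°) come from the text above.

```python
def fraction_dark(ita_values, threshold_deg):
    """Share of participants whose ITA falls strictly below the given cutoff."""
    if not ita_values:
        raise ValueError("ita_values must not be empty")
    return sum(1 for v in ita_values if v < threshold_deg) / len(ita_values)


# Hypothetical ITA values in degrees, for illustration only (not study data).
example_ita = [-60, -55, -40, -20, 0, 10, 20, 30, 45, 55]

share_at_minus_30 = fraction_dark(example_ita, -30)  # cutoff of -30 degrees
share_at_minus_50 = fraction_dark(example_ita, -50)  # stricter cutoff of -50 degrees
```

Because the stricter cutoff selects a subset of the participants selected by the looser one, the ‘dark’ share can only stay the same or shrink as the threshold moves from -30° to -50°.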

Conclusion

Common methods for skin pigment characterization, especially the use of race or subjective scales, have significant limitations. When applied to the same cohort, different methods yield significantly different results, and some may overestimate diversity. Previously published ITA thresholds for defining ‘dark’ skin are too light and lead to underrepresentation of people with darker skin.
