Article

Evaluation of Color Difference Models for Wide Color Gamut and High Dynamic Range

1 Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 119333 Moscow, Russia
2 Smart Engines Service LLC., 117312 Moscow, Russia
3 Institute for Information Transmission Problems of the Russian Academy of Sciences, 127051 Moscow, Russia
4 Department of Computing Machines, Systems and Networks, Institute of Information Technologies and Computer Science, National Research University “Moscow Power Engineering Institute”, 111250 Moscow, Russia
* Author to whom correspondence should be addressed.
J. Imaging 2024, 10(12), 317; https://doi.org/10.3390/jimaging10120317
Submission received: 8 November 2024 / Revised: 2 December 2024 / Accepted: 4 December 2024 / Published: 10 December 2024
(This article belongs to the Special Issue Color in Image Processing and Computer Vision)

Abstract

Color difference models (CDMs) are essential for accurate color reproduction in image processing. While CDMs aim to reflect perceived color differences (CDs) from psychophysical data, they remain largely untested in wide color gamut (WCG) and high dynamic range (HDR) contexts, which are underrepresented in current datasets. This gap highlights the need to validate CDMs across WCG and HDR. Moreover, the non-geodesic structure of perceptual color space necessitates datasets covering CDs of various magnitudes, while most existing datasets emphasize only small and threshold CDs. To address this, we collected a new dataset encompassing a broad range of CDs in WCG and HDR contexts and developed a novel CDM fitted to these data. Benchmarking various CDMs using STRESS and significant error fractions on both new and established datasets reveals that CAM16-UCS with power correction is the most versatile model, delivering strong average performance across WCG colors up to 1611 cd/m². However, even the best CDM fails to achieve the desired accuracy limits and yields significant errors. CAM16-UCS, though promising, requires further refinement, particularly in its power correction component to better capture the non-geodesic structure of perceptual color space.

1. Introduction

With rapid advancements in high dynamic range (HDR) display technology—now achieving luminance levels exceeding 1000 cd/m² in accordance with new HDR standards [1,2], which also incorporate the wide color gamut (WCG) defined in the earlier BT.2020 standard [3]—tools for precise color reproduction assessment have become essential. Color difference models (CDMs) play a crucial role by estimating perceived color differences (CDs) between colors. Mathematically, a CDM can be expressed as a function $F(c_1, c_2, p) \mapsto d$, where $d$ is the perceived difference between colors $c_1$ and $c_2$, and $p$ represents the observation parameters. A basic CDM approach uses the Euclidean distance in a uniform color space (UCS), described as $F(c_1, c_2, p) = \| T(c_1, p) - T(c_2, p) \|_2$, where $T$ is the transformation from the colorimetric color space to the particular UCS. A UCS can be used not only as a CDM but also as a space for performing color transformations. It has been shown that the performance of image processing algorithms significantly depends on the chosen UCS or CDM, as demonstrated in applications such as tone mapping [4], color correction [5], and image recognition [6].
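As a minimal illustration of this definition, the following sketch computes a UCS-based color difference; the transform passed in is a hypothetical placeholder, not a model from this paper.

```python
import numpy as np

def ucs_color_difference(c1, c2, transform):
    """Euclidean CDM: distance between two colors after mapping to a UCS.

    c1, c2    -- colorimetric coordinates (e.g., CIE XYZ)
    transform -- the mapping T into the uniform color space (the viewing
                 parameters p are assumed to be baked into the callable)
    """
    t1, t2 = np.asarray(transform(c1), float), np.asarray(transform(c2), float)
    return np.linalg.norm(t1 - t2)

# Trivial stand-in transform: identity, i.e., distance in CIE XYZ itself
# (a poorly uniform space, used here only to make the sketch runnable).
d = ucs_color_difference([41.0, 35.0, 20.0], [40.0, 36.0, 21.0], lambda c: c)
```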
CDMs are developed and validated using datasets comprising color pairs and their perceived differences, as determined through psychophysical experiments. While various CDMs have been proposed for HDR contexts [7,8,9,10,11], HDR- and WCG-specific experimental data to evaluate these models are still sparse. Existing datasets, often collected based on textile and printed samples [12,13,14,15], lack WCG coverage, and recent monitor-based datasets are confined to luminance levels up to 300 cd/m2 [16,17]. Thus, dedicated experimental setups are necessary to acquire data covering both HDR and WCG ranges.
There are various types of experiments to assess CDs with different magnitudes: threshold CD (TCD) or just noticeable difference (JND), small CD (SCD), and large CD (LCD, exceeding 5–8 JNDs). Measuring CDs across these ranges is critical due to the non-geodesic nature of perceptual color space, where the cumulative JNDs along the shortest path (geodesic) can surpass the direct CD between endpoints [18,19,20]. This discrepancy challenges traditional UCS models using Euclidean metrics for varying CD magnitudes. Consequently, UCSs specifically designed for LCD or SCD applications based on the CIECAM02 model [21] have been developed, and a generalized approach using power functions (with exponents < 1) has been applied to current UCS models [22], including the state-of-the-art CAM16-UCS [23].
A recent review by Luo et al. [24] evaluated psychophysical datasets on CDs from over 20 studies, benchmarking CDMs across these datasets. For decades, such datasets have played a central role in the iterative development of CDMs, forming the foundation for refining gold-standard models widely used in color evaluation. This progress has, in turn, driven advancements in applied algorithms. While HDR datasets [17] (up to 1128 cd/m²) and WCG datasets [25] (covering the BT.2020 gamut) exist, the review acknowledges a significant gap: the absence of datasets encompassing both HDR and WCG. Additionally, existing HDR and WCG datasets are limited in their coverage of large CDs. Large CDs are particularly important in display manufacturing, where defective pixels significantly affect perceived color [26]. Addressing these gaps is critical for further advancing UCS development, necessitating new datasets that cover WCG, HDR, and large CD ranges. To this end, we obtained a new color difference dataset spanning threshold, small, and large CDs in HDR and WCG. Using this dataset, we developed an analytically fitted CDM and benchmarked several CDMs with STRESS and significant error fraction metrics across both new and established datasets, providing a robust foundation for future UCS development.

2. Materials and Methods

2.1. Developed Experiment Setup

To obtain psychophysical data across HDR and WCG color spaces at various color difference (CD) scales, we developed a custom experimental setup centered around a purpose-built stimulus generator. Since no commercially available device met the required HDR and WCG specifications, we collaborated with Visionica Ltd. (Moscow, Russia) (http://www.visionica.biz/, accessed 8 November 2024) to design a stimulus generator, named VCD. VCD provides four independent stimuli, each comprising several dozen LEDs per primary color, ensuring both high luminance (up to 5000 cd/m²) and uniformity of the light field.
Our goal was to cover as much of the BT.2020 gamut as possible using the VCD. While the BT.2020 [3] gamut triangle is theoretically defined, its primaries, located on the spectral locus, can only be practically achieved using lasers. With LEDs, it is feasible to achieve two out of three primaries with adequate luminance levels, but achieving a high-saturation green LED with sufficient luminance remains a challenge. This issue, known as the “green gap” [27], arises from the physical limitations of current LED technology. To address this, we incorporated two green LEDs with different spectral power distributions, resulting in a total of four LEDs in the VCD. This configuration expanded the device’s color gamut (see Figure 1). Adding more LEDs would provide only marginal gains in gamut coverage while significantly complicating the device design.
The full experimental setup included a light booth with two rear windows (for different scenarios), an illuminator, and the custom stimulus generator discussed above. Figure 2 shows views of the setup.
The light booth, a 70 × 70 × 70 cm cube with five white paper walls, was top-lit by a Remez 7W 5700K LED lamp (color rendering index: 98.5 and correlated color temperature: 5553 K), providing 100 cd/m² at the center of the back wall and ensuring uniform lighting. Spectral measurements were conducted using an X-Rite I1Pro spectrophotometer (3.3 nm resolution) with ArgyllCMS software (version 2.1.2) (http://argyllcms.com/, accessed 8 November 2024). Figure 3 shows the spectral power distributions (SPDs) inside the booth alongside CIE D50, D55, and D65 illuminants.
The VCD generator produced four stimuli (10 × 10 cm quadrants), each controlled independently. A 1 mm vertical separator divided the stimuli horizontally, with a 2 cm space between the upper and lower pairs; see Figure 4. The VCD was positioned 17 cm from the booth’s back wall, and a black separator was placed between viewing windows. Observers sat 80 cm from the windows, ensuring an angular stimulus size of 3.58°.
For TCD measurements, a 28 mm round hole in the back wall provided a 2° angular stimulus size; see Figure 5.
For SCD and LCD measurements, two 5 × 5 cm rectangular windows separated by 2 cm were used. The chinrest was adjusted to block the observer’s view of gaps between the VCD quadrants.

2.1.1. VCD Stimulus Generator Characterization

The VCD stimulus generator produced four independently controlled stimuli, each in a separate quadrant (Figure 4). Each quadrant featured a board with four primary LEDs (see Section 2.1.2) controlled at 16-bit accuracy, with settings managed remotely by a computer.
Uniformity testing via photographs revealed a 15% luminance drop along the left edge of Q1 near Q2, consistent across quadrants and imperceptible to the human eye. This variation was negligible for SCD and LCD measurements (see Section 2.2.2) but became significant for TCD if colors were compared side by side. Therefore, we employed an alternate procedure for TCD, where colors were flashed alternately within the same quadrant (see Section 2.2.1).

2.1.2. Primaries Spectra

The VCD featured four primary channels—red (R), green (G), blue (B), and cyan (C)—each controlled independently via 16-bit buses, allowing luminance values from 0 to 65,535. Spectral power distributions (SPDs) were measured for various channel values in Q1 (Figure 6), with settings of 30,000 for R and B yielding luminance around 1000 and 1400 cd/m², and 5000 for G and C yielding around 1000 and 1100 cd/m². Minor SPD shifts due to LED heating, most notable in the green channel, were consistent across quadrants and accounted for in the VCD control software. Luminance linearity was assessed by measuring primary outputs at various channel values, approximated by a parabolic function to ensure accuracy (Figure 7).
To verify calibration, we compared the chromaticity coordinates of the target colors that the VCD intended to produce with the actual colors formed, which were measured using an X-Rite I1Pro spectrophotometer in its colorimeter mode. Approximately 40 different xyY values, evenly distributed across the chromaticity diagram at two luminance levels (300 and 1000 cd/m²), were tested. Chromaticity errors were calculated as the distance between the measured and target colors on the CIE 1964 xy chromaticity diagram for the 10° observer. The chromaticity error was found to depend on both the y and Y coordinates, ranging from 0.007 (for y < 0.5 at 1000 cd/m²) to 0.038 (for y ≥ 0.6 at 300 cd/m²), with a mean error of 0.017, which is considered small.

2.2. Experimental Procedures

2.2.1. TCD Measurements

TCD measurements were conducted using the “strict substitution” technique [28,29]. In this method, a color center (reference color, Stimulus A) is displayed, and a test color (Stimulus B) is shifted in a chosen direction within the color space away from the color center. Each stimulus is shown for 1.5 s in a periodic, instantaneous alternation.
Observers are given unlimited time to detect any difference; if none is observed, they increase the shift of Stimulus B by pressing a button. The just noticeable difference (JND) is the smallest shift at which the observer perceives a distinction between the two stimuli.
Stimulus B is shifted in discrete steps along specific directions in CIE xyY space. While a smooth surface would require measurement in numerous directions, which was not feasible, we used 18 directions combining chromaticity and luminance shifts and 8 directions with fixed luminance to approximate this surface (details in Figure 8).

2.2.2. SCD and LCD Measurements

SCD and LCD were measured using the “gray-scale method”, which matches color differences to a gray reference scale [25,30,31,32]. Observers viewed two pairs of colors: a reference gray pair with a known difference (3 JND for SCD, 12 JND for LCD) and a test pair.
The viewed stimuli are shown in Figure 4. The reference gray pair (stimuli Q1 and Q2) had a fixed difference, while the test pair (Q3 and Q4) started at the same color center (Q4). Observers incrementally adjusted Q3 along one of 18 preset directions (Figure 8) until its perceived difference from Q4 matched that of the gray reference pair. Upon matching, the color center and the selected shift were recorded.
The reference pair had chromaticity coordinates corresponding to D65. One sample in the reference pair remained fixed as GS-1, with the other as GS-3 for SCD or GS-5 for LCD measurements (see Table 1). To improve the observers’ brightness adaptation, gray-scale luminances were increased 20-fold, with a reference white luminance of 2000 cd/m², as detailed in Table 1, where L* denotes the CIELAB lightness value.

2.2.3. Operating Conditions

Human color perception is strongly influenced by viewing conditions. For instance, the perceived magnitude of color differences depends on the chromatic adaptation of the human visual system to dominant illumination [33,34]. Chromatic adaptation transforms (CATs) describe these dependencies [23,35] and are often integrated into UCSs [23,36]. Developing such models requires the collection of specialized datasets [37,38]. However, it is not feasible to cover all possible scenarios within a single dataset, as this would require sampling a highly multidimensional parameter space. Consequently, different studies typically focus on specific subsets of this space. In our work, we addressed a four-dimensional space comprising three color coordinates and the magnitude of color difference. Nonetheless, datasets targeting other regions of the parameter space are still required to further improve these models.

We maintained consistent viewing conditions throughout the experiment. Table 2 summarizes potential biases introduced by the experimental setup and observers, as well as the measures taken to mitigate them. Despite these efforts, some residual biases may remain. These biases can be categorized into two types: biases that are consistent across all measurements, and biases that vary between measurements. To assess the impact of the first type, we verified that our measurements aligned with previous datasets and evaluated their consistency, as described in Section 3.1. For the second type, we confirmed that no significant biases were present, since model performance was not degraded: the CDM optimized on the training subset of our measured data achieved high accuracy on the validation subset, as detailed in Section 3.2. These two independent verifications demonstrate that our dataset collection process was not significantly influenced by large biases.

2.3. Collected Data

We collected comprehensive human color perception threshold data, covering small and large color differences (CDs) across 136 color centers and 2702 color pairs:
  • TCDs were measured in 109 color centers (2046 pairs, 17 participants);
  • SCDs were measured in 18 color centers (342 pairs, 9 participants);
  • LCDs were measured in 17 color centers (314 pairs, 9 participants).
Each measurement captured three-dimensional data, forming ellipsoids in WCG and HDR areas, with color center luminances up to 1000 cd/m² and a maximum luminance of 1611 cd/m². Figure 9 shows the chromaticity and luminance distribution of the collected data. The convex hull of the collected color pairs forms a polygon covering 80% of the BT.2020 or BT.2100 [1] area on the proLab chromaticity diagram [36]. ProLab is a UCS based on the 3D projective transformation of CIE XYZ, with chromaticity coordinates similar to CIE xy: x⁺ = a⁺/L⁺, y⁺ = b⁺/L⁺, where L⁺, a⁺, and b⁺ are coordinates in proLab.
The color centers were selected using two approaches. First, we measured color differences at the color centers recommended by the CIE [39] to validate our setup by comparing the results with previously published data. Second, we randomly selected color centers from a uniform distribution in proLab for luminance levels ranging from 0 to 1000 cd/m². This selection excluded the sRGB color gamut (up to 300 cd/m²). Consequently, for the standard luminance range, we focused on highly saturated colors, while, for HDR, we explored the full saturation range.
CDs were measured in 18 directions in xyY space, including 8 in the chromaticity plane (Figure 8). In each session, the following steps were taken:
  • A total of 8 pairs were measured with Δx and Δy shifts (fixed ΔY);
  • If the observer was not overly fatigued, we measured 2 pairs with positive/negative ΔY (fixed Δx, Δy) and 8 pairs with combined Δx, Δy, and ΔY shifts.
A total of 231 measurement sessions were conducted, each with an observer focusing on one color center (data in certain directions were occasionally omitted if the observer reached the device’s gamut limits):
  • A total of 8 directions were measured in 150 sessions (19 had partial data due to gamut limits);
  • A total of 18 directions were measured in 81 sessions (3 had partial data due to gamut limits).
The order of color centers and measurement directions was randomized per observer. Each color pair within a direction was measured four times to minimize variability (following CIE recommendations [40]), using medians for analysis.
We recruited 22 observers with normal or corrected vision, who passed the Color Assessment and Diagnosis test within age-appropriate thresholds. All participants received instructions and provided informed consent.

2.4. Available Datasets

The widely used COMBVD dataset [41] includes small CDs (under 5 CIELAB units) and consists of four sub-datasets: BFD-P [12] with 2779 pairs, Witt [13] with 418 pairs, Leeds [14] with 307 pairs, and RIT-DuPont [15] with 312 pairs. We used this dataset to validate our own data and for CDM benchmarking. Additionally, we used the Size-dependent Color Threshold (SDCTh) dataset [42], which includes 80 pairs across 10 color centers, each measured by 10–13 observers to obtain 8 median pairs per center for stimuli with a radius of 72 arcmin. Both datasets cover sRGB, limited to a luminance of 100 cd/m² (standard dynamic range, SDR). HDR and WCG datasets exist [17,25] but are not publicly available and were thus not used here.
Table 3 summarizes the viewing conditions for each dataset (illuminant, adapting luminance L_a, luminance factor of the neutral background Y_b, and surround). Our booth lighting was set to 100 cd/m², i.e., L_a = 100. Since stimuli were displayed against a white background, L_b = L_w = 100, resulting in Y_b = 100 (L_b / L_w) = 100.

2.5. Quality Measures of Color Difference Models

The similarity between predicted CDs and reference measurements from human studies is commonly evaluated using the STRESS measure [43,44]. Let x ∈ ℝⁿ represent the vector of experimental CDs for n color pairs from a psychophysical dataset, and y ∈ ℝⁿ represent the vector of CDs for the same color pairs computed using a given CDM. The STRESS metric, STRESS(x, y), is defined as
$$\mathrm{STRESS}(\mathbf{x}, \mathbf{y}) = \frac{\|k^* \mathbf{x} - \mathbf{y}\|_2}{\|\mathbf{y}\|_2} = \sqrt{\frac{\sum_{i=1}^{n} (k^* x_i - y_i)^2}{\sum_{i=1}^{n} y_i^2}}, \tag{1}$$
where k* is a scaling factor used for normalization:
$$k^* = \operatorname*{argmin}_{k} \|k\mathbf{x} - \mathbf{y}\|_2 = \frac{\mathbf{x}^T \mathbf{y}}{\mathbf{x}^T \mathbf{x}}. \tag{2}$$
STRESS values range from 0 (perfect agreement) to 1 (maximum discrepancy). For context, the state-of-the-art CDM, CIEDE2000, achieves a “good” STRESS value of 0.292 on the COMBVD dataset, on which it was trained. However, this value depends on the dataset’s internal consistency and noise. The upper limit for STRESS on the new dataset is estimated in Section 3.1. In contrast, the poorly uniform CIE XYZ color space achieves a STRESS value of 0.692 on the same dataset, providing a reference for poor performance.
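As a concrete reference, the following NumPy sketch transcribes Equations (1) and (2) directly (our own illustration; the function names are not from the cited literature):

```python
import numpy as np

def optimal_k(x, y):
    """Optimal scaling factor k* from Equation (2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return (x @ y) / (x @ x)

def stress(x, y):
    """STRESS between experimental CDs x and model CDs y, Equation (1)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    k = optimal_k(x, y)
    return np.linalg.norm(k * x - y) / np.linalg.norm(y)
```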
Various modifications have been proposed for handling multiple datasets [23,45]. Recently, a modified STRESS version, STRESS_group, was introduced to ensure that all color pairs are accounted for with equal importance and to adjust for CD scale variations across datasets [46].
While STRESS evaluates the average error, it is equally important to assess significant errors, as a CDM serves as a measurement tool for evaluating other algorithms, such as those used in image processing. To address this, we propose analyzing the error distribution and focusing on errors exceeding a certain threshold, Δ₀, which we define as unacceptable. Since errors below 1 JND are imperceptible, a threshold of Δ₀ = 1 JND would be unreasonably strict. We selected Δ₀ = 2 JND, as errors above 2 JND form the heavy tail of the error distribution for current state-of-the-art CDMs (see Figure 10).
To quantify significant errors, we propose calculating the fraction of errors exceeding 2 JND:
$$\mathrm{FEM}_{2\mathrm{JND}}(\mathbf{x}, \mathbf{y}) = \frac{1}{n} \sum_{i=1}^{n} \left[\, |k^* x_i - y_i| > 2 \,\right], \tag{3}$$
where k* is defined as in Equation (2).
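Continuing the sketch above (and assuming the experimental CDs x are expressed in JND units, so that the 2 JND threshold is meaningful after rescaling), Equation (3) becomes:

```python
import numpy as np

def fem_2jnd(x, y):
    """Fraction of errors above 2 JND, Equation (3); reuses optimal_k."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    k = optimal_k(x, y)
    return float(np.mean(np.abs(k * x - y) > 2.0))
```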

3. Results

3.1. Comparison of the Obtained and Previously Published Data

Establishing the consistency of our experimental data with previously published data is crucial for validating our dataset. We compared our data with the COMBVD dataset [41] (see Section 2.4), which includes SCDs in the sRGB and SDR range. We focused on sub-datasets obtained under the D65 illuminant, excluding the BFD-P sub-dataset derived under the M and C illuminants.
Our consistency evaluation method leverages the fact that human color perception can be described by a continuously differentiable function without non-physiological discontinuities. This concept implies that, within a small region of color space, perceived color differences can be approximated by a linear model. By constructing a locally linear model from data in a small color space region, we can evaluate the consistency of our data with previously published datasets.
Let us consider a dataset of color pairs and their corresponding CDs. The i-th color pair in the dataset is represented as
$$p_i \stackrel{\text{def}}{=} (c_{i1}, c_{i2}). \tag{4}$$
To define the distance between two color pairs, we first establish distances for “codirectional” and “opposite” pairs in terms of the color ordering we have chosen:
$$\rho_{\uparrow\uparrow}(p_i, p_k) = \frac{d(c_{i1}, c_{k1}) + d(c_{i2}, c_{k2})}{2}, \tag{5}$$
$$\rho_{\uparrow\downarrow}(p_i, p_k) = \frac{d(c_{i1}, c_{k2}) + d(c_{i2}, c_{k1})}{2}, \tag{6}$$
where d is a distance measure, with CIEDE2000 chosen as our metric.
The distance between color pairs is defined as
$$\rho(p_1, p_2) = \min\left( \rho_{\uparrow\uparrow}(p_1, p_2),\ \rho_{\uparrow\downarrow}(p_1, p_2) \right). \tag{7}$$
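A direct transcription of Equations (5)–(7) can look as follows (a sketch; any color distance function can stand in for d, with CIEDE2000 used in this work):

```python
def pair_distance(p_i, p_k, d):
    """Semi-distance between color pairs p_i = (c_i1, c_i2) and p_k = (c_k1, c_k2)."""
    (c_i1, c_i2), (c_k1, c_k2) = p_i, p_k
    rho_co = 0.5 * (d(c_i1, c_k1) + d(c_i2, c_k2))   # "codirectional" matching, Eq. (5)
    rho_op = 0.5 * (d(c_i1, c_k2) + d(c_i2, c_k1))   # "opposite" matching, Eq. (6)
    return min(rho_co, rho_op)                       # Eq. (7)
```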
It should be noted that this distance measure may not satisfy all properties of a metric. To identify regions of high data density in color space, we constructed a complete weighted graph from the set of color pairs in the datasets, calculating all possible distances between them. We then retained only the edges with weights below a certain threshold value ρ_thr.
To identify clusters of closely positioned color pairs containing N pairs, we ranked the graph’s vertices in descending order based on their degree, i.e., the number of edges connecting each vertex to its neighbors.
The following algorithm was employed for the extraction of clusters from the graph G_thr:
  1. Initialize i = 1.
  2. If the degree of the i-th vertex satisfies k_i > N − 2, form a cluster C_i of size k_i + 1 consisting of the i-th vertex and its connected vertices.
  3. Remove the edges connecting each vertex in cluster C_i to other vertices within the same cluster, and update G_thr.
  4. Repeat steps 2–4 for i = i + 1.
Within this cluster detection framework, any color pair can belong to multiple clusters, but any two color pairs can appear together in at most one cluster. Sub-datasets within each cluster may vary in size.
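The cluster extraction step can be sketched as follows (our illustration; the graph is held as adjacency sets, with an edge present whenever the pair distance falls below ρ_thr):

```python
def extract_clusters(adj, n_min):
    """Greedy cluster extraction from the thresholded pair graph G_thr.

    adj   -- dict: vertex -> set of neighbors (edges with weight below rho_thr);
             modified in place as intra-cluster edges are removed
    n_min -- the minimum cluster size N
    """
    clusters = []
    # Visit vertices in descending order of their initial degree
    for v in sorted(adj, key=lambda u: len(adj[u]), reverse=True):
        if len(adj[v]) > n_min - 2:      # degree check: k_i > N - 2
            cluster = {v} | adj[v]       # cluster of size k_i + 1
            clusters.append(cluster)
            for u in cluster:            # drop intra-cluster edges
                adj[u] -= cluster
    return clusters
```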
To estimate internal consistency, we set ρ_thr = 2 and N = 100; then, we optimized the linear model on each cluster. Since we assumed that the units of perceptual distance may differ across datasets by a proportionality coefficient, we employed STRESS_group [46] as the loss function.
First, we calculated the internal consistency of the COMBVD-D65 dataset as a reference. The results of the linear model optimization are presented in Table 4 for all five clusters we detected. The dataset consistency is represented by the highest value, STRESS_group = 0.216.
Next, we evaluated the consistency between our data and the COMBVD-D65 dataset. The results of the linear model optimization for each detected cluster are shown in Table 5. The highest consistency value for our data with COMBVD-D65 is STRESS_group = 0.316. While this is slightly worse than the internal consistency of COMBVD-D65, it remains comparable. Thus, we conclude that our data are consistent with COMBVD-D65, at least regarding the CD scale factor. We will address the issue of data consistency on an absolute scale in Section 3.3.

3.2. Fitting the Color Difference Model to the Data: Optimally Fitted Δ E f

We employed a new flexible parametric model that implements a semi-metric in color space. We also provided an analytical method for finding its optimal parameters.
Our approach built on the CAM16-UCS model with power law correction:
$$\Delta E_{\text{CAM16-UCS-PC}} = 1.41\, (\Delta E_{\text{CAM16-UCS}})^{0.63}. \tag{8}$$
We enhanced this model with an additional trainable factor for improved accuracy. The model is linear in its parameters and is defined as
$$\Delta E_f(\mathbf{x}, \mathbf{y} \mid \theta) \stackrel{\text{def}}{=} \|\mathbf{x} - \mathbf{y}\|_2^p \left| \big( b(v(\mathbf{x}, \mathbf{y})) + b(v(\mathbf{y}, \mathbf{x})) \big)^T \theta \right|, \tag{9}$$
where x = (J_x, a_x, b_x) and y = (J_y, a_y, b_y) are CAM16-UCS coordinates of a color pair, and p = 0.63. Here,
$$v(\mathbf{x}, \mathbf{y}) \stackrel{\text{def}}{=} \left( J_x,\ \frac{\sqrt{a_x^2 + b_x^2}}{J_x},\ \|\mathbf{x} - \mathbf{y}\|_2 \right), \tag{10}$$
where b: ℝ³ → ℝ^d is a basis function dictionary listed in Appendix A, and θ ∈ ℝ^d is the vector of CDM parameters.
This model is a semi-metric, as it does not satisfy the triangle inequality, similar to the “gold standard” CIEDE2000 formula.
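Given a parameter vector θ and the basis dictionary b from Appendix A, evaluating the model is straightforward. A sketch with our own naming, assuming basis returns a NumPy vector of length d:

```python
import numpy as np

def delta_e_f(x, y, theta, basis, p=0.63):
    """Optimally fitted ΔE_f, Equations (9)-(10).

    x, y  -- CAM16-UCS coordinates (J', a', b') of the two colors
    basis -- the dictionary b: R^3 -> R^d from Appendix A
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    dist = np.linalg.norm(x - y)

    def v(s):
        # v = (lightness, chroma-to-lightness ratio, pair distance), Eq. (10)
        J, a, b = s
        return np.array([J, np.hypot(a, b) / J, dist])

    return dist ** p * abs((basis(v(x)) + basis(v(y))) @ theta)
```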
To train the model, we optimized the parameter vector θ to minimize STRESS on the dataset:
$$\theta^* \stackrel{\text{def}}{=} \operatorname*{argmin}_{\theta} \mathrm{STRESS}(\mathbf{e}, M\theta) = \operatorname*{argmin}_{\theta} \min_{k} \frac{\|k\mathbf{e} - M\theta\|_2^2}{\|M\theta\|_2^2}, \tag{11}$$
where
$$\mathbf{e} \stackrel{\text{def}}{=} \begin{pmatrix} e_1 & e_2 & \cdots & e_n \end{pmatrix}^T, \quad M \stackrel{\text{def}}{=} \begin{pmatrix} \mathbf{m}_1 & \mathbf{m}_2 & \cdots & \mathbf{m}_n \end{pmatrix}^T, \quad \mathbf{m}_i \stackrel{\text{def}}{=} \|\mathbf{x}_i - \mathbf{y}_i\|_2^p \big( b(v(\mathbf{x}_i, \mathbf{y}_i)) + b(v(\mathbf{y}_i, \mathbf{x}_i)) \big), \tag{12}$$
which are the vector of experimental distances and the model basis matrix, respectively; n represents the number of color pairs in the dataset.
This formulation results in a quadratic fractional programming problem:
$$\theta^* \stackrel{\text{def}}{=} \operatorname*{argmin}_{\theta} \min_{k} \frac{(\theta^T\ k)\, A\, (\theta^T\ k)^T}{(\theta^T\ k)\, B\, (\theta^T\ k)^T}, \tag{13}$$
where
$$A \stackrel{\text{def}}{=} [M \mid -\mathbf{e}]^T [M \mid -\mathbf{e}], \qquad B \stackrel{\text{def}}{=} [M \mid \mathbf{0}]^T [M \mid \mathbf{0}]. \tag{14}$$
This problem can be solved analytically, avoiding numerical optimization methods. To solve (13) analytically, we first regularize the matrices A and B:
$$\hat{A} \stackrel{\text{def}}{=} A + \lambda I, \qquad \hat{B} \stackrel{\text{def}}{=} B + \lambda I, \qquad \lambda = 0.01. \tag{15}$$
Using the Cholesky decomposition, we find an upper triangular matrix R such that $R^T R = \hat{B}$. The matrix V of eigenvectors of $K \stackrel{\text{def}}{=} R^{-T} \hat{A} R^{-1}$ is
$$V \stackrel{\text{def}}{=} \begin{pmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_{d+1} \end{pmatrix}: \qquad K \mathbf{v}_i \parallel \mathbf{v}_i. \tag{16}$$
The optimal solution corresponds to one of the columns of $W \stackrel{\text{def}}{=} R^{-1} V$:
$$\begin{pmatrix} \theta^* \\ k^* \end{pmatrix} = \operatorname*{argmin}_{(\theta;\, k) \in \text{columns}(W)} \frac{\|k\mathbf{e} - M\theta\|_2^2}{\|M\theta\|_2^2}. \tag{17}$$
The optimization of CDM parameters is typically performed using numerical methods, which are computationally intensive and do not guarantee the discovery of a global minimum for non-convex objectives like STRESS. In contrast, the proposed analytical optimization approach ensures that the global minimum of STRESS is achieved while maintaining low computational complexity (approximately 0.12 s on an Intel Core i7-8550U).
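A compact NumPy sketch of the solver in Equations (13)–(17) (our illustration; only the regularization constant λ = 0.01 follows the text):

```python
import numpy as np

def fit_theta(e, M, lam=0.01):
    """Globally minimize STRESS(e, M @ theta) via Equations (13)-(17).

    e -- (n,) vector of experimental distances
    M -- (n, d) model basis matrix
    """
    n, d = M.shape
    Me = np.hstack([M, -e[:, None]])       # [M | -e]
    M0 = np.hstack([M, np.zeros((n, 1))])  # [M | 0]
    A = Me.T @ Me + lam * np.eye(d + 1)    # regularized A-hat, Eq. (15)
    B = M0.T @ M0 + lam * np.eye(d + 1)    # regularized B-hat, Eq. (15)
    R = np.linalg.cholesky(B).T            # upper triangular, R^T R = B-hat
    R_inv = np.linalg.inv(R)
    K = R_inv.T @ A @ R_inv
    _, V = np.linalg.eigh(K)               # eigenvectors of K, Eq. (16)
    W = R_inv @ V                          # candidate (theta; k) columns

    def ratio(w):                          # the objective of Eq. (17)
        theta, k = w[:-1], w[-1]
        return np.sum((k * e - M @ theta) ** 2) / np.sum((M @ theta) ** 2)

    best = min(W.T, key=ratio)
    theta, k = best[:-1], best[-1]
    # Normalize so that the model is on the scale of the experimental data
    return theta / k if abs(k) > 1e-12 else theta
```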
The model was trained on a combined dataset comprising COMBVD, SDCTh, and our data. The dataset was split into 75% for training and 25% for testing. Using the analytical optimization Equations (15)–(17), the model achieved a STRESS value of 0.390 on the training set and 0.393 on the test set. These results demonstrate that the model is not overfitted.

3.3. Benchmarking Color Difference Models

We used the optimally fitted ΔE_f to analyze our dataset. Given the similar STRESS values across both previously published datasets and ours (see Table 6 and Table 7), we conclude that the datasets are consistent in terms of absolute CD scale. To evaluate the randomness of the obtained STRESS and FEM2JND values, we constructed confidence intervals at a confidence level of 0.95, corresponding to 1.96 standard deviations. For STRESS, bootstrapping [47] was applied, while the Wald confidence interval was used for FEM2JND [48].
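Assuming the stress and fem_2jnd helpers sketched in Section 2.5, the two interval constructions can be illustrated as follows (a sketch, not the exact evaluation code):

```python
import numpy as np

def stress_ci(x, y, n_boot=2000, z=1.96, seed=0):
    """Bootstrap +/- 1.96 sd interval for STRESS."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))  # resampled pair indices
    sd = np.std([stress(x[i], y[i]) for i in idx])
    s = stress(x, y)
    return s - z * sd, s + z * sd

def fem_2jnd_ci(x, y, z=1.96):
    """Wald interval for the FEM2JND binomial proportion."""
    f, n = fem_2jnd(x, y), len(x)
    half = z * np.sqrt(f * (1.0 - f) / n)
    return f - half, f + half
```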
We benchmarked the following CDMs on psychophysical datasets. CAM16-UCS, introduced in 2016 [23], is the current state-of-the-art UCS [24]; we also tested CAM16-UCS with power law correction (Equation (8)), which accounts for the non-geodesic properties of human color space. Other UCS models designed for WCG and HDR include ICaCb [8], ICTCp [9], and Jzazbz [10]. We also assessed Oklab [49], which is promising for LCD, and proLab [36], which offers good geometric properties. Additionally, we considered the CIEDE2000 formula [41], the state of the art for SCD.
The results confirmed that CAM16-UCS with power correction (CAM16-UCS-PC) is the most versatile model, demonstrating the best performance on the combined dataset across both sRGB/SDR and HDR/WCG domains (see Table 8, where lower STRESS values indicate better consistency). However, CAM16-UCS-PC is not without limitations. Its accuracy could potentially improve by at least 1.3 times, as suggested by the optimally fitted Δ E f . For the combined dataset, the standard deviation for CAM16-UCS-PC is 0.013, while that for the optimally fitted Δ E f is 0.041, meaning the difference between the mean values exceeds the standard deviation. Moreover, despite an acceptable average error, CAM16-UCS-PC shows a notable proportion of significant errors, measured as the fraction of errors above 2 JND (Table 9). A comparison between CAM16-UCS and CAM16-UCS-PC reveals that the power correction increases significant errors by 1.8 times. While CAM16-UCS-PC performs well on average, it still produces a considerable number of significant errors.
A universal model is yet to be developed, but CAM16-UCS is a promising foundation, with power correction being a key area for improvement in CAM16-UCS-PC.

4. Conclusions

We collected a new experimental dataset on perceived color differences spanning three scales (threshold, small, and large). The dataset covers a wide color gamut (80% of BT.2020) at high luminance levels (up to 1611 cd/m²) using a custom stimulus generator. Measurements were conducted with 22 observers, yielding color difference data for 2605 color pairs.
To ensure dataset consistency, we introduced a method leveraging the continuous nature of human color perception. This involved fitting linear models to small psychophysical data clusters and evaluating them using the STRESS metric. Our analysis confirmed consistency between our dataset and established datasets.
We further improved the power-corrected CAM16-UCS (CAM16-UCS-PC) model by introducing a novel color difference model, optimally fitted Δ E f . This refinement employed an analytical optimization approach capable of efficiently minimizing non-convex objectives, such as STRESS, while avoiding the risk of local minima. Our results demonstrated that existing models, including CAM16-UCS-PC, have not yet reached the theoretical accuracy limit, as evidenced by the performance of the optimally fitted Δ E f on the combined dataset of new and established data.
Benchmarking across multiple datasets revealed that CAM16-UCS-PC is currently the most versatile model, achieving the best STRESS performance for sRGB, WCG, SDR, and HDR domains. However, the model is not without limitations. In traditional domains (SDR and sRGB), CAM16-UCS-PC exhibits fewer significant errors than CAM16-UCS. Conversely, in HDR-WCG domains, the number of significant errors for CAM16-UCS-PC is 1.8 times higher than for CAM16-UCS without power correction.
These effects can be attributed to the fact that neither model was trained on HDR or WCG data. While CAM16-UCS-PC introduces power correction to address the non-geodesic nature of color space, it remains a less appropriate model and performs worse than the base CAM16-UCS when predicting new data.
Our findings suggest that incorporating our dataset into the development of UCS could enhance future models. However, the root causes of these discrepancies remain unclear and require further investigation. Additional measurements, particularly in regions with the greatest errors, are necessary to improve model accuracy and better understand the observed effects.

Author Contributions

Conceptualization, I.N.; methodology, I.N.; software, V.K., M.K., and A.S.; validation, I.K. and O.B.; formal analysis, V.K. and O.B.; investigation, S.G.; resources, I.N.; data curation, S.G.; writing—original draft preparation, M.K. and A.S.; writing—review and editing, O.B. and A.S.; visualization, I.K.; supervision, I.N.; project administration, S.G. and M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by project FA2019101041 at Saint Petersburg State University.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Ethics Board of Institute for Information Transmission Problems of the Russian Academy of Sciences (The Ethics Committee Conclusion dated 13 March 2023. The Approval Code is EC-2023/5).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author; because the data are shared by several organizations, each transfer will be decided on individually.

Acknowledgments

The authors thank Alexander Belokopytov and Pavel Maksimov for their support with the experimental work, and Dmitry Nikolaev for his valuable discussions and insightful comments on this study.

Conflicts of Interest

Author Olga Basova was employed by the company Smart Engines Service LLC. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CDM   Color difference model
CD    Color difference
UCS   Uniform color space
WCG   Wide color gamut
HDR   High dynamic range
TCD   Threshold color difference
JND   Just noticeable difference
SCD   Small color difference
LCD   Large color difference
LED   Light-emitting diode
SPD   Spectral power distribution
GS    Gray scale
SDR   Standard dynamic range

Appendix A. Basis Functions and Their Values for Optimally Fitted Δ E f

We selected 92 basis functions for Equation (9):
$$b = (b_i)_{i=1}^{92}:$$
$$b_1 = 1,$$
$$b_2 = v_1, \quad b_3 = v_2, \quad b_4 = \sqrt{v_1}, \quad b_5 = \sqrt{v_2}, \quad b_6 = v_1 v_2, \quad b_7 = \sqrt{v_1 v_2}, \quad b_8 = v_1^{1.5}, \quad b_9 = v_2^{1.5},$$
$$b_{10} = v_1^2, \quad b_{11} = v_2^2, \quad b_{12} = v_1^3, \quad b_{13} = v_2^3, \quad b_{14} = v_1^2 v_2, \quad b_{15} = v_1 v_2^2, \quad b_{16} = (v_1 v_2)^2, \quad b_{17} = (v_1 v_2)^3,$$
$$b_{18} = v_3, \quad b_{19} = v_1 v_3, \quad b_{20} = v_2 v_3, \quad b_{21} = v_3^2, \quad b_{22} = v_1 v_3^2, \quad b_{23} = v_2 v_3^2, \quad b_{24} = v_3^3, \quad b_{25} = v_1 v_3^3, \quad b_{26} = v_2 v_3^3, \quad b_{27} = v_3^4, \quad b_{28} = v_1 v_3^4, \quad b_{29} = v_2 v_3^4, \quad b_{30} = v_3^5, \quad b_{31} = v_1 v_3^5, \quad b_{32} = v_2 v_3^5,$$
$$b_{27+6\omega} = \sin(2\omega v_1), \quad b_{28+6\omega} = \sin(2\omega v_2), \quad b_{29+6\omega} = \cos(2\omega v_1), \quad b_{30+6\omega} = \cos(2\omega v_2), \quad b_{31+6\omega} = \sin(2\omega v_1 v_2), \quad b_{32+6\omega} = \cos(2\omega v_1 v_2), \qquad \omega = 1, 2, \ldots, 10.$$
The values of these basis functions are shown in Table A1.
Table A1. Values of the basis functions.
b_i | Value | b_i | Value | b_i | Value | b_i | Value
b1 | −0.0029 | b24 | 2.976 × 10⁻⁶ | b47 | −0.0002 | b70 | −0.0166
b2 | −0.0057 | b25 | −1.8581 × 10⁻⁸ | b48 | 0.0117 | b71 | −2.5185 × 10⁻⁶
b3 | −0.004 | b26 | 3.9624 × 10⁻⁷ | b49 | 4.8434 × 10⁻⁵ | b72 | 0.028
b4 | 0.0009 | b27 | −4.9423 × 10⁻⁸ | b50 | 2.406 × 10⁻⁵ | b73 | 3.8023 × 10⁻⁵
b5 | 0.0023 | b28 | 3.0614 × 10⁻¹⁰ | b51 | 3.3194 × 10⁻⁵ | b74 | −7.4377 × 10⁻⁵
b6 | 0.0066 | b29 | −8.5736 × 10⁻⁹ | b52 | 0.007 | b75 | −0.0003
b7 | −0.0008 | b30 | 2.8571 × 10⁻¹⁰ | b53 | 5.5821 × 10⁻⁵ | b76 | −0.0289
b8 | −7.1602 × 10⁻⁵ | b31 | −1.7522 × 10⁻¹² | b54 | −0.0036 | b77 | 3.0788 × 10⁻⁵
b9 | −0.0034 | b32 | 5.5611 × 10⁻¹¹ | b55 | 4.657 × 10⁻⁵ | b78 | −0.0304
b10 | 2.2075 × 10⁻⁶ | b33 | 8.8041 × 10⁻⁵ | b56 | −0.0001 | b79 | −0.0002
b11 | −0.0035 | b34 | 0.0096 | b57 | −0.0002 | b80 | 4.6816 × 10⁻⁵
b12 | −3.9465 × 10⁻¹⁰ | b35 | −4.5735 × 10⁻⁶ | b58 | 0.002 | b81 | −1.1926 × 10⁻⁵
b13 | −0.0084 | b36 | −0.0034 | b59 | 4.3452 × 10⁻⁵ | b82 | 0.0207
b14 | −1.881 × 10⁻⁷ | b37 | −8.2306 × 10⁻⁵ | b60 | 0.0242 | b83 | 3.1525 × 10⁻⁵
b15 | −4.2926 × 10⁻⁵ | b38 | 1.5887 × 10⁻⁵ | b61 | 1.9099 × 10⁻⁵ | b84 | −0.0113
b16 | 5.8133 × 10⁻⁶ | b39 | −4.6291 × 10⁻⁵ | b62 | −0.0002 | b85 | −8.7425 × 10⁻⁵
b17 | −1.7872 × 10⁻⁸ | b40 | 0.0017 | b63 | 6.9818 × 10⁻⁵ | b86 | 1.7498 × 10⁻⁵
b18 | 0.0006 | b41 | −3.2055 × 10⁻⁵ | b64 | −0.003 | b87 | 6.1974 × 10⁻⁵
b19 | −3.8521 × 10⁻⁶ | b42 | −0.0089 | b65 | 0.0002 | b88 | 0.0016
b20 | 4.0407 × 10⁻⁵ | b43 | 4.5402 × 10⁻⁵ | b66 | −0.0001 | b89 | −1.3631 × 10⁻⁵
b21 | −6.9548 × 10⁻⁵ | b44 | 9.0806 × 10⁻⁵ | b67 | −0.0002 | b90 | 0.0055
b22 | 4.3857 × 10⁻⁷ | b45 | 3.8185 × 10⁻⁷ | b68 | −2.5034 × 10⁻⁵ | b91 | 4.3339 × 10⁻⁵
b23 | −5.5443 × 10⁻⁶ | b46 | 0.0106 | b69 | 9.7498 × 10⁻⁶ | b92 | 0.0001

Appendix B. Formulas for Jzazbz

In this section, we provide the formulas for the uniform color space Jzazbz [10].
$$X' = bX - (b - 1)Z, \qquad Y' = gY - (g - 1)X,$$
$$\begin{pmatrix} L \\ M \\ S \end{pmatrix} = \begin{pmatrix} 0.41478972 & 0.579999 & 0.0146480 \\ -0.2015100 & 1.120649 & 0.0531008 \\ -0.0166008 & 0.264800 & 0.6684799 \end{pmatrix} \begin{pmatrix} X' \\ Y' \\ Z \end{pmatrix},$$
$$\{L', M', S'\} = \left( \frac{c_1 + c_2 \left( \{L, M, S\}/10{,}000 \right)^n}{1 + c_3 \left( \{L, M, S\}/10{,}000 \right)^n} \right)^p,$$
$$\begin{pmatrix} I_z \\ a_z \\ b_z \end{pmatrix} = \begin{pmatrix} 0.5 & 0.5 & 0 \\ 3.524000 & -4.066708 & 0.542708 \\ 0.199076 & 1.096799 & -1.295875 \end{pmatrix} \begin{pmatrix} L' \\ M' \\ S' \end{pmatrix},$$
$$J_z = \frac{(1 + d)\, I_z}{1 + d\, I_z} - d_0,$$
where the variables X, Y, and Z represent the input CIE XYZ tristimulus values. X′ and Y′ are pre-adjusted values of X and Y, respectively, used to eliminate hue deviations before optimizing the model for improved perceptual uniformity. L, M, and S denote the cone primary responses, while L′, M′, and S′ are dynamically transformed cone responses. The constants are defined as follows: b = 1.15, g = 0.66, c₁ = 3424/2¹², c₂ = 2413/2⁷, c₃ = 2392/2⁷, n = 2610/2¹⁴, p = 4289.1/2⁵, d = −0.56, and d₀ = 1.6295499532821566 × 10⁻¹¹.
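For reference, the forward transform above can be collected into a single function (a sketch; input is absolute XYZ, and negative out-of-range cone responses are not handled):

```python
import numpy as np

B_, G_ = 1.15, 0.66
C1, C2, C3 = 3424 / 2**12, 2413 / 2**7, 2392 / 2**7
N_, P_ = 2610 / 2**14, 4289.1 / 2**5
D_, D0 = -0.56, 1.6295499532821566e-11

M_XYZ_TO_LMS = np.array([[ 0.41478972, 0.579999,  0.0146480],
                         [-0.2015100,  1.120649,  0.0531008],
                         [-0.0166008,  0.264800,  0.6684799]])
M_LMS_TO_IAB = np.array([[0.5,       0.5,        0.0     ],
                         [3.524000, -4.066708,   0.542708],
                         [0.199076,  1.096799,  -1.295875]])

def xyz_to_jzazbz(x, y, z):
    """Absolute CIE XYZ (cd/m^2) -> (Jz, az, bz)."""
    xp = B_ * x - (B_ - 1) * z                 # hue-deviation pre-adjustment of X
    yp = G_ * y - (G_ - 1) * x                 # hue-deviation pre-adjustment of Y
    lms = M_XYZ_TO_LMS @ np.array([xp, yp, z])
    lms_p = ((C1 + C2 * (lms / 10000) ** N_) /
             (1 + C3 * (lms / 10000) ** N_)) ** P_   # PQ-style compression
    iz, az, bz = M_LMS_TO_IAB @ lms_p
    jz = (1 + D_) * iz / (1 + D_ * iz) - D0
    return jz, az, bz
```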

Appendix C. Formulas for CAM16-UCS and CAM16-UCS-PC

This section outlines the formulas for CAM16-UCS and CAM16-UCS with power-law correction [23].
The initial step involves calculating values and parameters that are independent of the input sample:
$$\begin{pmatrix} R_w \\ G_w \\ B_w \end{pmatrix} = M_{16} \begin{pmatrix} X_w \\ Y_w \\ Z_w \end{pmatrix}, \qquad M_{16} = \begin{pmatrix} 0.401288 & 0.650173 & -0.051461 \\ -0.250268 & 1.204414 & 0.045854 \\ -0.002079 & 0.048952 & 0.953127 \end{pmatrix},$$
where X w , Y w , and Z w represent the adopted white in the test illuminant, and R w , G w , and B w are sensor response signals.
$$D = F \left[ 1 - \frac{1}{3.6}\, e^{(-L_A - 42)/92} \right],$$
where D is the degree of adaptation, L_A is the luminance of the test adapting field (cd/m²), and F is a parameter related to the influence of the environment on color perception.
If D exceeds 1 or is less than 0, it is constrained to 1 or 0, respectively.
$$D_R = D\,\frac{Y_w}{R_w} + 1 - D, \qquad D_G = D\,\frac{Y_w}{G_w} + 1 - D, \qquad D_B = D\,\frac{Y_w}{B_w} + 1 - D,$$
$$F_L = 0.2 k^4 (5 L_A) + 0.1 (1 - k^4)^2 (5 L_A)^{1/3},$$
where k = 1/(5 L_A + 1).
$$n = \frac{Y_b}{Y_w}, \qquad z = 1.48 + \sqrt{n}, \qquad N_{bb} = 0.725 \left( \frac{1}{n} \right)^{0.2}, \qquad N_{cb} = N_{bb},$$
$$\begin{pmatrix} R_{wc} \\ G_{wc} \\ B_{wc} \end{pmatrix} = \begin{pmatrix} D_R R_w \\ D_G G_w \\ D_B B_w \end{pmatrix},$$
$$\{R'_{aw}, G'_{aw}, B'_{aw}\} = \frac{400 \left( F_L \{R_{wc}, G_{wc}, B_{wc}\}/100 \right)^{0.42}}{\left( F_L \{R_{wc}, G_{wc}, B_{wc}\}/100 \right)^{0.42} + 27.13} + 0.1,$$
$$A_w = \left( 2 R'_{aw} + G'_{aw} + \frac{1}{20} B'_{aw} - 0.305 \right) N_{bb},$$
where A w is the achromatic response.
Further calculation steps are as follows:
$$\begin{pmatrix} R \\ G \\ B \end{pmatrix} = M_{16} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}, \qquad \begin{pmatrix} R_c \\ G_c \\ B_c \end{pmatrix} = \begin{pmatrix} D_R R \\ D_G G \\ D_B B \end{pmatrix},$$
$$R'_a = \frac{400 (F_L R_c / 100)^{0.42}}{(F_L R_c / 100)^{0.42} + 27.13} + 0.1.$$
If R_c is negative,
$$R'_a = -\frac{400 (-F_L R_c / 100)^{0.42}}{(-F_L R_c / 100)^{0.42} + 27.13} + 0.1,$$
where R′_a is the postadaptation cone response, resulting in dynamic range compression. G′_a and B′_a are computed similarly.
$$a = R'_a - \frac{12 G'_a}{11} + \frac{B'_a}{11}, \qquad b = \frac{R'_a + G'_a - 2 B'_a}{9}, \qquad h = \tan^{-1}\left( \frac{b}{a} \right),$$
where a represents the Redness–Greenness component, b the Yellowness–Blueness component, and h the hue angle.
Using the unique hue data in Table A2, we set h′ = h + 360° if h < h₁; otherwise, h′ = h. We choose an appropriate i (i = 1, 2, or 3) such that h_i ≤ h′ < h_{i+1}.
$$e_t = \frac{1}{4} \left[ \cos\left( \frac{h' \pi}{180} + 2 \right) + 3.8 \right],$$
where e_t is the eccentricity factor.
$$H = H_i + \frac{100\,(h' - h_i)/e_i}{(h' - h_i)/e_i + (h_{i+1} - h')/e_{i+1}},$$
where H is the hue quadrature.
$$A = \left( 2 R'_a + G'_a + \frac{1}{20} B'_a - 0.305 \right) N_{bb},$$
where A is the achromatic response.
Table A2. Unique hue data for calculation of the hue quadrature.

      | Red   | Yellow | Green  | Blue   | Red
i     | 1     | 2      | 3      | 4      | 5
h_i   | 20.14 | 90.00  | 164.25 | 237.53 | 380.14
e_i   | 0.8   | 0.7    | 1.0    | 1.2    | 0.8
H_i   | 0.0   | 100.0  | 200.0  | 300.0  | 400.0
The correlate of lightness J is calculated as
$$J = 100 \left( \frac{A}{A_w} \right)^{cz}, \qquad Q = \frac{4}{c} \left( \frac{J}{100} \right)^{0.5} (A_w + 4)\, F_L^{0.25},$$
where Q is the correlate of brightness.
The last step is the calculation of chroma (C), colorfulness (M), and saturation (s):
$$t = \frac{(50{,}000/13)\, N_c N_{cb}\, e_t \sqrt{a^2 + b^2}}{R'_a + G'_a + \frac{21}{20} B'_a},$$
$$C = t^{0.9} \left( \frac{J}{100} \right)^{0.5} (1.64 - 0.29^n)^{0.73}, \qquad M = C\, F_L^{0.25}, \qquad s = 100 \left( \frac{M}{Q} \right)^{0.5}.$$
The CAM16-UCS is defined as
$$J' = \frac{1.7 J}{1 + 0.007 J}, \qquad M' = \frac{\ln(1 + 0.0228 M)}{0.0228}, \qquad a' = M' \cos(h), \qquad b' = M' \sin(h).$$
The color difference between two colors is determined by the Euclidean distance in CAM16-UCS:
$$\Delta E_{\text{CAM16-UCS}} = \sqrt{\Delta J'^2 + \Delta a'^2 + \Delta b'^2}.$$
The use of the power correction function can further improve CAM16-UCS:
$$\Delta E_{\text{CAM16-UCS-PC}} = 1.41\, (\Delta E_{\text{CAM16-UCS}})^{0.63}.$$
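The final UCS step and the two color difference variants are compact enough to sketch directly (the CAM16 correlates J, M, and h are assumed to have been computed as above):

```python
import numpy as np

def cam16_to_ucs(J, M, h_deg):
    """CAM16 correlates (J, M, h) -> CAM16-UCS coordinates (J', a', b')."""
    Jp = 1.7 * J / (1 + 0.007 * J)
    Mp = np.log(1 + 0.0228 * M) / 0.0228
    h = np.radians(h_deg)
    return np.array([Jp, Mp * np.cos(h), Mp * np.sin(h)])

def delta_e_cam16_ucs(jmh1, jmh2, power_correction=True):
    """Euclidean CAM16-UCS difference, optionally with the power correction."""
    de = np.linalg.norm(cam16_to_ucs(*jmh1) - cam16_to_ucs(*jmh2))
    return 1.41 * de ** 0.63 if power_correction else de
```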

References

  1. ITU-R. BT.2100-2: Image Parameter Values for High Dynamic Range Television for Use in Production and International Programme Exchange; ITU Radiocommunication Sector: Geneva, Switzerland, 2018. [Google Scholar]
  2. ITU-R. BT.2390-11: High Dynamic Range Television for Production and International Programme Exchange; ITU Radiocommunication Sector: Geneva, Switzerland, 2023. [Google Scholar]
  3. ITU-R. BT.2020-2: Parameter Values for Ultra-High Definition Television Systems for Production and International Programme Exchange; ITU Radiocommunication Sector: Geneva, Switzerland, 2015. [Google Scholar]
  4. Xu, L.; Zhao, B.; Luo, M.R. Colour gamut mapping between small and large colour gamuts: Part I. gamut compression. Opt. Express 2018, 26, 11481–11495. [Google Scholar] [CrossRef] [PubMed]
  5. Kucuk, A.; Finlayson, G.D.; Mantiuk, R.; Ashraf, M. Performance Comparison of Classical Methods and Neural Networks for Colour Correction. J. Imaging 2023, 9, 214. [Google Scholar] [CrossRef] [PubMed]
  6. Ghanem, S.; Holliman, J.H. Impact of color space and color resolution on vehicle recognition models. J. Imaging 2024, 10, 155. [Google Scholar] [CrossRef] [PubMed]
  7. Fairchild, M.D.; Wyble, D.R. hdr-CIELAB and hdr-IPT: Simple models for describing the color of high-dynamic-range and wide-color-gamut images. In Proceedings of the Color and Imaging Conference, San Antonio, TX, USA, 8–12 November 2010; Society for Imaging Science and Technology: Springfield, VA, USA, 2010; Volume 2010, pp. 322–326. [Google Scholar] [CrossRef]
  8. Froehlich, J.; Kunkel, T.; Atkins, R.; Pytlarz, J.; Daly, S.; Schilling, A.; Eberhardt, B. Encoding color difference signals for high dynamic range and wide gamut imagery. In Proceedings of the Color and Imaging Conference, Darmstadt, Germany, 19–23 October 2015; Society for Imaging Science and Technology: Springfield, VA, USA, 2015; Volume 2015, pp. 240–247. [Google Scholar]
  9. Dolby. What is ICTCP—Introduction? White Paper, Version 7.1; Technical Report; Dolby: San Francisco, CA, USA, 2016. [Google Scholar]
  10. Safdar, M.; Cui, G.; Kim, Y.J.; Luo, M.R. Perceptually uniform color space for image signals including high dynamic range and wide gamut. Opt. Express 2017, 25, 15131–15151. [Google Scholar] [CrossRef]
  11. Huang, Y.; Xu, H.; Zhang, Y.; Hu, B.; Deng, J.; Li, L. Towards perceptual uniformity and HDR-WCG image processing: A projection-based color space. Opt. Express 2024, 32, 30742–30755. [Google Scholar] [CrossRef]
  12. Luo, M.R.; Rigg, B. BFD (l: C) colour-difference formula. Part 1—Development of the formula. J. Soc. Dyers Colour. 1987, 103, 86–94. [Google Scholar] [CrossRef]
  13. Witt, K. Geometric relations between scales of small colour differences. Color Res. Appl. 1999, 24, 78–92. [Google Scholar] [CrossRef]
  14. Kim, D.H.; Nobbs, J.H. New weighting functions for the weighted CIELAB colour difference formula. In AIC Colour 97, Proceedings of the 8th Congress of the International Colour Association, Kyoto, Japan, 25–30 May 1997; Color Science Association of Japan: Kyoto, Japan, 1997; Volume 97, pp. 446–449. [Google Scholar]
  15. Berns, R.S.; Alman, D.H.; Reniff, L.; Snyder, G.D.; Balonon-Rosen, M.R. Visual determination of suprathreshold color-difference tolerances using probit analysis. Color Res. Appl. 1991, 16, 297–316. [Google Scholar] [CrossRef]
  16. Xu, Q.; Shi, K.; Luo, M.R. Parametric effects in color-difference evaluation. Opt. Express 2022, 30, 33302–33319. [Google Scholar] [CrossRef]
  17. Xu, Q.; Cui, G.; Safdar, M.; Xu, L.; Luo, M.R.; Zhao, B. Assessing Colour Differences under a Wide Range of Luminance Levels Using Surface and Display Colours. In Proceedings of the Color and Imaging Conference, Paris, France, 21–25 October 2019; Society for Imaging Science and Technology: Springfield, VA, USA, 2019; Volume 2019, pp. 355–359. [Google Scholar] [CrossRef]
  18. Judd, D.V. Ideal color space: Curvature of color space and its implications for industrial color tolerances. Palette 1968, 29, 4–25. [Google Scholar]
  19. Izmailov, C.A.; Sokolov, E.N. Spherical model of color and brightness discrimination. Psychol. Sci. 1991, 2, 249–260. [Google Scholar] [CrossRef]
  20. Bujack, R.; Teti, E.; Miller, J.; Caffrey, E.; Turton, T.L. The non-Riemannian nature of perceptual color space. Proc. Natl. Acad. Sci. USA 2022, 119, e2119753119. [Google Scholar] [CrossRef] [PubMed]
  21. Luo, M.R.; Li, C. CIECAM02 and its recent developments. In Advanced Color Image Processing and Analysis; Springer: New York, NY, USA, 2013; pp. 19–58. [Google Scholar]
  22. Huang, M.; Cui, G.; Melgosa, M.; Sánchez-Marañón, M.; Li, C.; Luo, M.R.; Liu, H. Power functions improving the performance of color-difference formulas. Opt. Express 2015, 23, 597–610. [Google Scholar] [CrossRef] [PubMed]
  23. Li, C.; Li, Z.; Wang, Z.; Xu, Y.; Luo, M.R.; Cui, G.; Melgosa, M.; Brill, M.H.; Pointer, M. Comprehensive color solutions: CAM16, CAT16, and CAM16-UCS. Color Res. Appl. 2017, 42, 703–718. [Google Scholar] [CrossRef]
  24. Luo, M.R.; Xu, Q.; Pointer, M.; Melgosa, M.; Cui, G.; Li, C.; Xiao, K.; Huang, M. A comprehensive test of colour-difference formulae and uniform colour spaces using available visual datasets. Color Res. Appl. 2023, 48, 267–282. [Google Scholar] [CrossRef]
  25. Xu, Q.; Zhao, B.; Cui, G.; Luo, M.R. Testing uniform colour spaces using colour differences of a wide colour gamut. Opt. Express 2021, 29, 7778–7793. [Google Scholar] [CrossRef]
  26. Basova, O.; Gladilin, S.; Grigoryev, A.; Nikolaev, D. Two calibration models for compensation of the individual elements properties of self-emitting displays. Comput. Opt. 2022, 46, 335–344. [Google Scholar] [CrossRef]
  27. der Maur, M.A.; Pecchia, A.; Penazzi, G.; Rodrigues, W.; Di Carlo, A. Unraveling the “Green Gap” problem: The role of random alloy fluctuations in InGaN/GaN light emitting diodes. arXiv 2015, arXiv:1510.07831. [Google Scholar]
  28. Wyszecki, G.; Stiles, W.S. Color Science: Concepts and Methods, Quantitative Data and Formulae; John Wiley & Sons: Hoboken, NJ, USA, 2000; Volume 40. [Google Scholar]
  29. Danilova, M.V.; Mollon, J.D. Bongard and Smirnov on the tetrachromacy of extra-foveal vision. Vis. Res. 2022, 195, 107952. [Google Scholar] [CrossRef]
  30. Pytlarz, J.A.; Pieri, E.G. How close is close enough? In Proceedings of the International Broadcasting Convention (IBC 2017), Amsterdam, The Netherlands, 13–17 September 2017; pp. 1–9. [Google Scholar]
  31. Huang, M.; Liu, H.; Cui, G.; Luo, M.R.; Melgosa, M. Evaluation of threshold color differences using printed samples. J. Opt. Soc. Am. A 2012, 29, 883–891. [Google Scholar] [CrossRef]
  32. Wyszecki, G.; Fielder, G.H. New Color-Matching Ellipses. J. Opt. Soc. Am. 1971, 61, 1135–1152. [Google Scholar] [CrossRef] [PubMed]
  33. Fairchild, M.D. Color Appearance Models, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2005. [Google Scholar]
  34. Susstrunk, S.E.; Holm, J.M.; Finlayson, G.D. Chromatic adaptation performance of different RGB sensors. In Proceedings of the IS&T/SPIE Electronic Imaging, San Jose, CA, USA, 20–26 January 2001; SPIE: Bellingham, WA, USA, 2001; Volume 4300. [Google Scholar] [CrossRef]
  35. Moroney, N.; Fairchild, M.D.; Hunt, R.W.; Li, C.; Luo, M.R.; Newman, T. The CIECAM02 Color Appearance Model. In Proceedings of the Color and Imaging Conference, Scottsdale, AZ, USA, 12–15 November 2002; Imaging Science and Technology: Springfield, VA, USA; Volume 10, pp. 23–27. [Google Scholar]
  36. Konovalenko, I.A.; Smagina, A.A.; Nikolaev, D.P.; Nikolaev, P.P. ProLab: A Perceptually Uniform Projective Color Coordinate System. IEEE Access 2021, 9, 133023–133042. [Google Scholar] [CrossRef]
  37. Luo, M.R.; Clarke, A.A.; Rhodes, P.A.; Schappo, A.; Scrivener, S.A.; Tait, C.J. Quantifying Colour Appearance. Part I. LUTCHI Colour Appearance Data. Color Res. Appl. 1991, 16, 166–180. [Google Scholar] [CrossRef]
  38. Braun, K.M.; Fairchild, M.D. Psychophysical Generation of Matching Images for Cross-Media Color Reproduction. J. Soc. Inf. Disp. 2000, 8, 33–44. [Google Scholar] [CrossRef]
  39. Robertson, A.R. CIE guidelines for coordinated research on colour-difference evaluation. Color Res. Appl. 1978, 3, 149–151. [Google Scholar] [CrossRef]
  40. Witt, K. CIE Guidelines for Coordinated Future Work on Industrial Colour-Difference Evaluation. Color Res. Appl. 1995, 20, 399–403. [Google Scholar] [CrossRef]
  41. Luo, M.R.; Cui, G.; Rigg, B. The development of the CIE 2000 colour-difference formula: CIEDE2000. Color Res. Appl. 2001, 26, 340–350. [Google Scholar] [CrossRef]
  42. Danilova, M.V. Colour sensitivity as a function of size: Psychophysically measured discrimination ellipses. In Proceedings of the 27th International Colour Vision Society Meeting, Ljubljana, Slovenia, 5–9 July 2024. [Google Scholar]
  43. Kruskal, J.B. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 1964, 29, 1–27. [Google Scholar] [CrossRef]
  44. García, P.A.; Huertas, R.; Melgosa, M.; Cui, G. Measurement of the relationship between perceived and computed color differences. J. Opt. Soc. Am. A 2007, 24, 1823–1829. [Google Scholar] [CrossRef]
  45. Melgosa, M.; Huertas, R.; Berns, R.S. Performance of recent advanced color-difference formulas using the standardized residual sum of squares index. J. Opt. Soc. Am. A 2008, 25, 1828–1834. [Google Scholar] [CrossRef]
  46. Nikolaev, D.P.; Basova, O.A.; Usaev, G.R.; Tchobanou, M.K.; Bozhkova, V.P. Detection and correction of errors in psychophysical color difference Munsell Re-renotation dataset. In Proceedings of the London Imaging Meeting, London, UK, 28–30 June 2023; Society for Imaging Science and Technology: Springfield, VA, USA, 2023; pp. 40–44. [Google Scholar] [CrossRef]
  47. Horowitz, J.L. Bootstrap methods in econometrics. Annu. Rev. Econ. 2019, 11, 193–224. [Google Scholar] [CrossRef]
  48. Wallis, S. Binomial confidence intervals and contingency tests: Mathematical fundamentals and the evaluation of alternative methods. J. Quant. Linguist. 2013, 20, 178–208. [Google Scholar] [CrossRef]
  49. Ottosson, B. A Perceptual Color Space for Image Processing. 2020. Available online: https://bottosson.github.io/posts/oklab/ (accessed on 8 November 2024).
Figure 1. The VCD gamut (blue), BT.2020 (red), human visual gamut (gray), and sRGB gamut (colored) shown on the proLab chromaticity diagram.
Figure 2. Front and side views of the experiment setup.
Figure 3. Spectral power distributions of the luminance inside the booth compared to CIE D50, D55, and D65 illuminants.
Figure 4. Stimuli setup for SCD experiments. Quadrants: Q1, upper right; Q2, upper left; Q3, lower right; and Q4, lower left.
Figure 5. Interior of the light booth for TCD measurements.
Figure 6. Spectral power distributions of Q1 primaries R, G, B, and C.
Figure 7. Luminance vs. channel value for Q1 primaries (left to right, top to bottom: R, G, B, and C). Dots illustrate measurements; dashed lines represent ideal linearity; and solid lines, parabolic approximation.
Figure 8. Arrangement of 18 measurement directions around the color center in xyY space. Blue points indicate chromaticity-only shifts (ΔY = 0), yellow points show luminance shifts (Δx = 0, Δy = 0), and red points represent combined chromaticity and luminance shifts. (a) 3D view; (b) section showing chromaticity-only shifts; (c) section with x-coordinate fixed at 0; (d) section with y-coordinate fixed.
Figure 9. Collected data shown on the proLab chromaticity diagram (left) and corresponding luminance Y in cd/m² (right). The centers of the blue segment intersections indicate the centers of the measured ellipses, while the segments themselves represent the measured color differences.
Figure 10. Histogram of errors for the current state-of-the-art CDMs (CIEDE2000 and CAM16-UCS-PC) on the COMBVD dataset. The x-axis represents the magnitude of the CDM error, while the y-axis indicates the number of these errors.
Table 1. Gray-scale parameters from [25] and in our experiments.

Gray-Scale Number | L* | ΔE with GS-1 | cd/m² [25] | cd/m², Ours
GS-1 | 40.0 | 0.0 | 11.2510 | 225.02
GS-2 | 41.5 | 1.5 | 12.1795 | 243.59
GS-3 | 43.0 | 3.0 | 13.1578 | 263.16
GS-4 | 46.0 | 6.0 | 15.2687 | 305.37
GS-5 | 52.0 | 12.0 | 20.1443 | 402.89
Table 2. Potential biases and the measures taken to address them.

Potential Biases | Mitigation Measures
LEDs in different quadrants may vary. | LEDs were sourced from the same model and production batch. Each quadrant was calibrated individually.
LED spectral power distribution may shift due to LED heating. | The VCD was pre-warmed before use. A fan was installed to maintain a stable temperature during operation, and the LED temperature was monitored, ensuring it remained below 40 °C.
LED spectral power distribution may shift due to room temperature fluctuations. | Room temperature was stabilized at 24 °C using air conditioning to maintain consistent LED spectral output.
The VCD may exhibit day-to-day performance variability. | Random colors were verified for accuracy using a spectrophotometer during measurements.
Room lighting may vary throughout the day. | Experiments were conducted in a fully darkened room, with D50 adaptation uniform lighting within the viewing cabin.
Observers may have color vision deficiencies. | Observers were tested using the Color Assessment and Diagnosis test, and those with color vision anomalies were excluded.
Observers may have individual variations in visual systems and color perception. | Measurements were stored for each individual separately without averaging across subjects, recognizing the importance of human diversity in perception.
Observers’ visual systems may adapt to conditions other than those in the experiment. | Observers were allowed to adapt to the experimental viewing conditions for at least 1 min before testing.
Observers may have misunderstood the task. | A learning session was conducted for each observer to discuss and clarify their observations and tasks.
Observers may adjust the number of steps arbitrarily, taking approximately the same number to find the desired color difference. | Measurement directions were randomized, and each color pair was measured four times in different directions. Medians of the measurements were used for analysis.
Observers may experience fatigue, leading to less accurate observations. | Each observer completed 2–3 sessions per day with breaks between sessions. Sessions lasted approximately 40 min, requiring several days of measurements per observer.
Table 3. Viewing condition parameters for experimental CD datasets.
| Dataset | Illuminant | L_a, cd/m2 | Y_b, % | Surround |
|---|---|---|---|---|
| BFD-P | D65, M, C | 100 | 20 | Average |
| RIT-DuPont | D65 | 127.3 | 10.9 | Average |
| Leeds | D65 | 100 | 20 | Average |
| Witt | D65 | 82.8 | 24.9 | Average |
| SDCTh | D65 | 80 | 50 | Average |
| Ours | D50 | 100 | 100 | Average |
Table 4. Results of linear model optimization for COMBVD-D65 clusters.
| Region | Red | White | Blue | Yellow | Green |
|---|---|---|---|---|---|
| STRESS_group | 0.197 | 0.179 | 0.181 | 0.216 | 0.197 |
| Cluster size | 294 | 264 | 206 | 205 | 202 |
Table 5. Consistency scores of COMBVD-D65 sub-datasets and our data. For each detected cluster, the number of points from each dataset (size) and the S T R E S S g r o u p metric for the correspondence of cluster points to the constructed local linear model are provided.
| No. | STRESS_group | Sizes: BFD-P / RIT-DuPont / Witt / Leeds / Ours |
|---|---|---|
| 1 | 0.240 | 1516855272 |
| 2 | 0.260 | 15518781365 |
| 3 | 0.316 | 107831583 |
| 4 | 0.242 | 1218564 |
| 5 | 0.237 | 10884943 |
Table 6. Performance of CDMs on previously published datasets (lower values indicate better consistency).
| Model | STRESS, SDR & sRGB | STRESS, HDR or WCG | STRESS, All |
|---|---|---|---|
| CIEDE2000 | 0.297 ± 0.032 | 0.328 ± 0.077 | 0.306 ± 0.031 |
| CAM16-UCS | 0.303 ± 0.036 | 0.320 ± 0.058 | 0.308 ± 0.031 |
| CAM16-UCS-PC | 0.301 ± 0.014 | 0.293 ± 0.030 | 0.311 ± 0.014 |
| Jzazbz | 0.383 ± 0.021 | 0.329 ± 0.050 | 0.371 ± 0.021 |
| ICaCb | 0.410 ± 0.022 | 0.355 ± 0.050 | 0.398 ± 0.022 |
| Oklab | 0.475 ± 0.026 | 0.439 ± 0.086 | 0.466 ± 0.030 |
| proLab | 0.475 ± 0.024 | 0.439 ± 0.054 | 0.466 ± 0.022 |
| ICTCp | 0.492 ± 0.027 | 0.408 ± 0.061 | 0.472 ± 0.026 |
| Opt. fitted ΔE_f | 0.334 ± 0.021 | 0.303 ± 0.038 | 0.327 ± 0.019 |
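The STRESS values in Tables 6–8 measure how far a CDM's predictions are from proportional agreement with the visual data. For reference, here is a minimal sketch of the standard STRESS index (García et al., 2007), written on the 0–1 scale used in these tables rather than the classical 0–100 scale; it assumes the paper follows the standard definition.

```python
import numpy as np

def stress(dE: np.ndarray, dV: np.ndarray) -> float:
    """Standard STRESS index between CDM predictions dE and visual
    differences dV, on a 0-1 scale (the classical definition
    multiplies by 100). Zero means the predictions are perfectly
    proportional to the visual data."""
    F = np.sum(dE ** 2) / np.sum(dE * dV)        # optimal scaling factor
    return float(np.sqrt(np.sum((dE - F * dV) ** 2) /
                         np.sum((F * dV) ** 2)))

# Sanity check: predictions proportional to the visual data give 0.
dV = np.array([1.0, 2.0, 3.0])
print(stress(2.0 * dV, dV))   # 0.0
```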
Table 7. Performance of CDMs on the dataset we collected (lower values indicate better consistency).
| Model | STRESS, SDR & sRGB | STRESS, HDR or WCG | STRESS, All |
|---|---|---|---|
| CAM16-UCS-PC | 0.326 ± 0.026 | 0.500 ± 0.033 | 0.528 ± 0.032 |
| Jzazbz | 0.470 ± 0.055 | 0.584 ± 0.046 | 0.583 ± 0.044 |
| ICaCb | 0.485 ± 0.046 | 0.583 ± 0.040 | 0.589 ± 0.037 |
| CAM16-UCS | 0.486 ± 0.033 | 0.625 ± 0.068 | 0.625 ± 0.063 |
| Oklab | 0.461 ± 0.111 | 0.634 ± 0.043 | 0.633 ± 0.043 |
| ICTCp | 0.564 ± 0.045 | 0.601 ± 0.038 | 0.635 ± 0.034 |
| proLab | 0.471 ± 0.048 | 0.646 ± 0.044 | 0.648 ± 0.042 |
| CIEDE2000 | 0.525 ± 0.035 | 0.694 ± 0.089 | 0.693 ± 0.082 |
| Opt. fitted ΔE_f | 0.439 ± 0.039 | 0.401 ± 0.099 | 0.408 ± 0.091 |
Table 8. Performance of CDMs on the combined dataset: previously published and ours (lower values indicate better consistency).
| Model | STRESS, SDR & sRGB | STRESS, HDR or WCG | STRESS, All |
|---|---|---|---|
| CAM16-UCS-PC | 0.395 ± 0.023 | 0.490 ± 0.031 | 0.493 ± 0.026 |
| ICaCb | 0.484 ± 0.030 | 0.571 ± 0.039 | 0.558 ± 0.033 |
| Jzazbz | 0.495 ± 0.035 | 0.584 ± 0.045 | 0.589 ± 0.042 |
| CAM16-UCS | 0.438 ± 0.036 | 0.622 ± 0.069 | 0.610 ± 0.064 |
| proLab | 0.508 ± 0.027 | 0.635 ± 0.042 | 0.616 ± 0.038 |
| ICTCp | 0.528 ± 0.027 | 0.615 ± 0.033 | 0.630 ± 0.027 |
| Oklab | 0.532 ± 0.037 | 0.641 ± 0.042 | 0.656 ± 0.037 |
| CIEDE2000 | 0.454 ± 0.036 | 0.688 ± 0.090 | 0.668 ± 0.090 |
| Opt. fitted ΔE_f | 0.377 ± 0.024 | 0.395 ± 0.096 | 0.393 ± 0.080 |
Table 9. Performance of CDMs on the combined dataset, previously published and ours, in terms of the FEM2JND metric (lower values indicate better performance).

| Model | FEM2JND, SDR & sRGB | FEM2JND, HDR or WCG | FEM2JND, All |
|---|---|---|---|
| CAM16-UCS-PC | 0.026 ± 0.010 | 0.342 ± 0.037 | 0.164 ± 0.018 |
| ICaCb | 0.050 ± 0.013 | 0.266 ± 0.034 | 0.139 ± 0.017 |
| Jzazbz | 0.046 ± 0.013 | 0.250 ± 0.033 | 0.123 ± 0.016 |
| CAM16-UCS | 0.038 ± 0.012 | 0.194 ± 0.031 | 0.090 ± 0.014 |
| proLab | 0.063 ± 0.015 | 0.309 ± 0.036 | 0.172 ± 0.018 |
| ICTCp | 0.076 ± 0.016 | 0.356 ± 0.037 | 0.257 ± 0.021 |
| Oklab | 0.065 ± 0.015 | 0.246 ± 0.033 | 0.131 ± 0.016 |
| CIEDE2000 | 0.043 ± 0.013 | 0.182 ± 0.030 | 0.095 ± 0.014 |
| Opt. fitted ΔE_f | 0.019 ± 0.008 | 0.179 ± 0.030 | 0.081 ± 0.013 |
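One plausible reading of the FEM2JND name is the fraction of color pairs whose model prediction deviates from the visual difference by more than 2 just-noticeable differences. The sketch below follows that reading; the least-squares rescaling step and the JND unit are assumptions, as the paper's exact definition is given in the main text.

```python
import numpy as np

def fraction_errors_over_2jnd(dE: np.ndarray, dV: np.ndarray,
                              jnd: float = 1.0) -> float:
    """Assumed reading of FEM2JND: the fraction of pairs whose
    rescaled model prediction deviates from the visual difference by
    more than 2 JND. The scaling and threshold are illustrative."""
    F = np.sum(dE * dV) / np.sum(dE ** 2)   # least-squares scale of dE onto dV
    return float(np.mean(np.abs(F * dE - dV) > 2.0 * jnd))
```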