Information Flow in Biological Networks for Color Vision

Malo, Jesús

doi:10.3390/e24101442

Open AccessArticle

Information Flow in Biological Networks for Color Vision

by

Jesús Malo

Image Processing Lab, Universitat de Valencia, 46980 Valencia, Spain

Entropy 2022, 24(10), 1442; https://doi.org/10.3390/e24101442

Submission received: 11 August 2022 / Revised: 20 September 2022 / Accepted: 3 October 2022 / Published: 10 October 2022

(This article belongs to the Section Entropy and Biology)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Biological neural networks for color vision (also known as color appearance models) consist of a cascade of linear + nonlinear layers that modify the linear measurements at the retinal photo-receptors leading to an internal (nonlinear) representation of color that correlates with psychophysical experience. The basic layers of these networks include: (1) chromatic adaptation (normalization of the mean and covariance of the color manifold); (2) change to opponent color channels (PCA-like rotation in the color space); and (3) saturating nonlinearities to obtain perceptually Euclidean color representations (similar to dimension-wise equalization). The Efficient Coding Hypothesis argues that these transforms should emerge from information-theoretic goals. In case this hypothesis holds in color vision, the question is what is the coding gain due to the different layers of the color appearance networks? In this work, a representative family of color appearance models is analyzed in terms of how the redundancy among the chromatic components is modified along the network and how much information is transferred from the input data to the noisy response. The proposed analysis is performed using data and methods that were not available before: (1) new colorimetrically calibrated scenes in different CIE illuminations for the proper evaluation of chromatic adaptation; and (2) new statistical tools to estimate (multivariate) information-theoretic quantities between multidimensional sets based on Gaussianization. The results confirm that the efficient coding hypothesis holds for current color vision models, and identify the psychophysical mechanisms critically responsible for gains in information transference: opponent channels and their nonlinear nature are more important than chromatic adaptation at the retina.

Keywords:

chromatic information; color appearance networks; efficient coding hypothesis; total correlation; mutual information; Gaussianization

1. Introduction

Biological vision is relevant to manage visual data because natural visual systems evolved to develop efficient representations of visual features that may be an inspiration for artificial systems. Examples include (1) the equivalence between the spatio-spectral sensitivity of receptive fields in natural neural systems and those emerging from information maximization and optimal matching [1], or the filters found by maximizing classification performance [2], and (2) the importance of human-like spatio-chromatic representations in image coding algorithms [3,4].

Conversely, the quantitative tools of statistical learning are key to propose principled theories in visual neuroscience. For instance, the classical efficient coding hypothesis argues that the organization of biological sensors comes from the optimization of information-theoretic goals [5,6]. The conventional approach to check this hypothesis is from-statistics-to-perception: i.e., deriving the biological behavior from statistical arguments. In color vision, this includes the derivation of opponent channels from principal components [7], and the derivation of the frequency bandwidth of color channels [8,9] nonlinearities of opponent channels [10,11], and even the reproduction of color illusions [12,13], from information maximization or error minimization arguments. However, there is an alternative way to check the hypothesis: from-perception-to-statistics [14,15,16]. In this case, perceptually meaningful models which have not been statistically optimized are shown to have a statistically efficient behavior.

In this work, we take this alternative approach (from-perception-to-statistics) for color vision models. Here, we analyze the communication efficiency of standard color vision models (that already describe a wide range of color vision psychophysics) using multivariate information-theoretic measurements. This analysis is interesting for data science and computer vision because the building blocks of color appearance models have received statistical interpretation: adaptation is related to manifold alignment [17], opponency is related to the principal components of color data [7], and the nonlinearities of the opponent channels are related to histogram equalization [10,11,12]. In case the efficient coding hypothesis holds for color vision networks, the question is what is the coding gain due to the different layers of the color appearance networks?

A quantitative response to that question is interesting for data science to select which process should be addressed first for the efficient management of color data. The perception-to-statistics approach has been previously applied to texture perception [14,15] and to models that involve spatial transforms [16], but it is original in purely color vision: note that previous information-theoretic analysis of color vision (e.g., Ref. [18] and the references therein) were focused on the amount of information from an image which can be obtained from the corresponding scene under a different illumination and not on the efficiency of the system to transmit generic color information (irrespective of the illumination) as done herein. The work presented here is an extended version of the preliminary results presented in an oral communication [19].

2. Statistical Interpretation of Building Blocks of Color Vision

The basic elements of biological color vision are:

(1) Linear integration of spectral irradiance at the retina by three sensors tuned to Long, Medium and Short wavelengths, which are commonly referred to as LMS sensors [20]. In every natural or artificial system, this initial linear stage is the necessary transduction from electromagnetic energy to the first numerical representation of color data. Here, we will start by expressing colorimetrically calibrated images in LMS tristimulus values via the Stockman and Sharpe fundamentals [21].

(2) Nonlinear adaptation at the retina adjusts the sensitivity (or gain) of LMS sensors to the illumination of the scene. For instance, classical von Kries chromatic adaptation normalizes the sensors by the responses of what is considered to be white in the scene [22]. From a statistical point of view, the role of chromatic adaptation is the same as manifold alignment in machine learning to make the interpretation of the data easier in changing environments (in this case, environments with different illuminations). According to this interpretation, generalizations of the von Kries transform have been proposed, such as, for instance, trying to make first and second moments of the different color manifolds equal [17,23], or using higher-order equalization transforms for the different datasets and making their dimensions equal in the canonical domain. The latter higher-order methods may be linear [1] or nonlinear [11,12]. Here, we will explore the behavior of classical von Kries adaptation [22], and the adaptation through the equalization of mean and covariance, which will be referred to as the Webster–Clifford approach following [17,23]. More sophisticated nonlinear equalization techniques such as the sequential principal curves analysis (SPCA) [11,12] will be used as a convenient statistical benchmark.

(3) Linear opponent channels in ganglion cells and beyond. The change from a color representation mediated by sensors with all-positive (physically realizable) sensitivities, as the LMS sensors at the retina, to color representations in opponent channels is obtained via the linear recombination of the LMS signals. This recombination leads to an achromatic channel and two red-green and yellow-blue chromatic channels [20,22,24]. As such, neural computation allows to obtain sensors with opponent (positive-and-negative) spectral sensitivities that are not easy to implement physically. Spectral sensitivities which are effectively opponents are found at different layers along the neural pathway: at the ganglion cells, the lateral geniculate nucleus, and the visual cortex [25]. This linear change of color representation has been statistically interpreted as the identification of the principal components of the color manifold [7]. Here, we will use the classical Jameson and Hurvich transform to opponent channels [24], which is not based on image statistics, but on color-matching experiments. These opponent channels are also called achromatic, tritanopic, and deuteranopic (ATD) due to their relation with dichromatic vision [20,26].

(4) Nonlinearities of opponent channels. Nonuniform color discrimination thresholds [27,28] imply that the saturation of the Weber law occurring in the achromatic channel [22,29] also appears in the chromatic channels [20]. This nonlinearity of the opponent channels has been explained as the necessary transform to obtain equalized PDFs in the responses [10]. This nonlinearity can be optimized for PDF equalization or error minimization after uniform quantization. The authors referred to this nonlinearity as the pleistochrome transform [10]. This dimension-wise equalization concept was generalized to multivariate scenarios through principal curves (the above mentioned SPCA) [11,12]. Here, the simpler (univariate) pleistochrome transform will be compared to the more general SPCA transform.

The sequence of the building blocks considered above fits into the current deep-network paradigm [30] because it can be implemented as a cascade of two linear+nonlinear layers performing spectral integration + adaptation followed by opponency + saturation:

(1)

In this architecture,

x^{0}

is the spectral irradiance at the photo-receptors,

r^{1}

is the vector of linear LMS responses,

x^{1}

is the vector of nonlinearly adapted LMS responses,

r^{2}

is the vector of linearly recombined ATD responses, and

x^{2}

is the vector of nonlinear ATD responses. The goal of the work was to measure how much information from the linear LMS stage is transmitted to the other layers of the network. As such, we check whether information transmission can be a sensible organization principle for these natural systems which are the critical layers of the network.

3. Materials and Methods

3.1. Colorimetrically Calibrated Database

The IPL color image database [1,11,12] is well suited to study color adaptation because its controlled illumination under CIE A and CIE D65 allows us to know the white point in each scene. This implies that chromatic adaptation transforms will not require extra approximations such as the gray-world assumption. The acquisition of the controlled images and resulting data is illustrated in Figure 1.

3.2. Quantification of the Information Flow

The information flow along the considered networks can be measured in direct and indirect ways. The direct measure consists of estimating the information about the input natural colors

x^{0}

contained at the color representation (or response) at the i-th layer of the network,

x^{i}

: the Mutual Information

I (x^{0}, x^{i})

. Of course, the information transferred from the input can also be measured for the stages after the linear transforms,

I (x^{0}, r^{i})

. Given samples of the color signals at the different stages (either

r^{i}

or

x^{i}

), the same numerical technique will be applied to estimate the corresponding mutual information I. Therefore, below we will give expressions for

x^{i}

which will also be applicable for

r^{i}

.

These direct measures, I, can be put in terms of other (indirect) magnitudes. For illustration purposes, following [16], one can compare the efficiency of the representation in the i-th layer assuming that it performs a certain deterministic transform,

f^{i} (\cdot)

, corrupted by noise due to the non-ideal nature of the sensors:

x^{i} = f^{i} (x^{0}) + n^{i}

. In this setting, from the definition of I in terms of the joint and the conditional entropies (chapter 2 in [31]) leads to:

I (x^{0}, x^{i}) = h (x^{i}) - h (x^{i} | x^{0}) = h (x^{i}) - h (n^{i})

, because the uncertainty in the i-th layer given the input just comes from the noise

n^{i}

. Then, from the definition of Total Correlation, T, Refs. [32,33] that describe the redundancy between the coordinates,

x_{j}

, of a d-dimensional vector,

T (x) = \sum_{j = 1}^{d} h (x_{j}) - h (x)

, the information transferred up to the i-th layer is just [16]:

I (x^{0}, x^{i}) = \sum_{j = 1}^{d} h (x_{j}^{i}) - T (x^{i}) - h (n^{i})

(2)

where the superindex refers to the i-th layer and the subindex j refers to the individual neurons within that layer.

The right-hand side of Equation (2) implies that the transferred information can be increased in different ways: (1) by increasing the entropy of the coefficients of the response, which is limited by energy constraints that prevent arbitrarily large gains to increase the variance; (2) by reducing the noise at that layer, which is also limited by the finite quality of the sensors; and finally (3) by reducing the redundancy, T, between the sensors in the representation. Therefore, for sensors of fixed signal/noise quality, indirect measures of efficiency include the redundancy, T, within each layer along the networks, and the differential entropy, h, of the signal at each layer of the networks.

Equation (2) introduced in [16] clarifies the statements made in [34]: representations that minimize the redundancy, T, and maximize the entropy, h, are better for signal representation because independent components lead to factorial codes that maximize the use of the channel (maximize the transferred information), and maximum entropy representations imply that the signal accepts more noise without significant information loss.

By definition, noiseless and invertible representations preserve all the information from the input. Therefore, the measurement of the transmitted information only makes sense for noisy sensors (which is the case of actual physiological mechanisms). According to this, just for illustrative purposes as in [16], in our experiments measuring I, every considered sensor (or representation) was subject to Gaussian noise whose standard deviation was 5% of the total deviation of the response. We computed I between these noisy representations and the input (assuming a negligible noise in the input, with a standard deviation of 0.05% of the total deviation). The I between the noiseless input and the negligible-noise input will set a convenient reference for the maximum available information.

Finally, the Kullback–Leibler divergence, KLD or

D_{K L}

, which measures the lack of correspondence between data distributions [31], is a convenient information-theoretic measure to assess the match between the color-compensated sets using different chromatic adaptation mechanisms. Lower divergences indicate better adaptation.

In summary, I, T, h, and

D_{K L}

are appropriate measures to assess the information flow and adaptation in color appearance networks. However, the estimation of these (multivariate) quantities directly from the color samples injected into the networks is not straightforward: the naive use of the direct definitions implies the estimation of multivariate PDFs and this would introduce substantial bias in the results. This estimation problem is addressed in the next section.

3.3. Total Correlation, Mutual Information, and Kullback–Leibler Divergence from Gaussianization

In this work, we solve the problem of estimating multivariate information-theoretic quantities using a novel estimator of T which only relies on easier (univariate) density estimations: the rotation-based iterative Gaussianization (RBIG) [35,36].

The RBIG transform,

x^{'} = G_{x} (x)

, is a cascade of nonlinear+linear iterations, or the sequential application of two operations: nonlinear marginal Gaussianizations,

Ψ

, and linear rotations,

R

:

(3)

where n refers to the step in the sequence, and convergence takes N steps,

n = 0, \dots, N - 1

.

This invertible iterative procedure in Equation (3) is able to transform any input PDF into a zero-mean unit-covariance multivariate Gaussian even if the chosen rotations are random [35]. This ability to completely remove the structure of any PDF is useful to estimate the T of arbitrary vectors

x

: as the redundancy of a Gaussianized signal is zero,

T (x)

corresponds to the sum of the individual variations that take place along the iterations of RBIG while Gaussianizing

x

. Interestingly, the individual variation in each RBIG iteration only depends on (easy to compute) univariate entropies [36]:

T (x) = \sum_{n = 0}^{N - 1} Δ T^{(n)} = \frac{(N - 1) d}{2} log (2 π e) - \sum_{n = 1}^{N} \sum_{j = 1}^{d} h (x_{j}^{(n)})

(4)

where the (hard) multivariate problem was reduced to the computation of many (easy to compute) univariate entropies

h (x_{j}^{(n)})

.

The entropy of the multivariate signal at the i-th layer,

h (x^{i})

, can be obtained from

T (x^{i})

, and univariate entropy estimations from the aforementioned definition of T:

h (x^{i}) = - T (x^{i}) + \sum_{j = 1}^{d} h (x_{j}^{i})

(5)

The information shared by multidimensional datasets, I, is just the remaining total correlation within the variables once they have been separately Gaussianized [36,37]:

I (x, y) = T ([G_{x} (x), G_{y} (y)]),

(6)

where

G_{x} (\cdot)

and

G_{y} (\cdot)

are the Gaussianization transforms learnt for the random variables x and y, and T in Equation (6) can be reduced to univariate operations using Equation (4).

Similarly, the KLD between two multidimensional variables depends on the redundancy and the negentropy of one of the variables when it has been Gaussianized using the Gaussianization transform learnt from the other [36]:

D_{KL} (x | y) = T (z) + \sum_{j = 1}^{d} D_{KL} (p_{z_{j}} (z_{j}) | N (0, 1))

(7)

where

z = G_{x} (y)

is the variable y transformed according to the Gaussianization mapping for x.

In summary, according to Equations (4)–(7), all the hard-to-compute multivariate measures (T, h, I,

D_{K L}

) can be reduced to the computation of univariate quantities thanks to RBIG.

We also computed T and I via the Kozachenko–Leonenko estimator [38] used in other studies on color data [39]. However, note that this alternative estimator cannot be used to compute the KLD.

4. Experiments and Results

Here, we studied the communication efficiency of three illustrative families of color vision models:

Physiological networks: Cascades of physiologically meaningful linear + nonlinear layers:
(i)
LMS sensors [21] + von Kries or Webster–Clifford adaptation [17], followed by
(ii)
Jameson and Hurvich linear opponent channels [24] + pleistochrome saturations [10].
Psychophysical networks: Standard color appearance models that are made of the same ingredients, including
(i)
The classical CIE L $^{*}$ a $^{*}$ b $^{*}$ model [40],
(ii)
The LLAB model [22,41], and
(iii)
The more recent CIECAM02 [42,43].
Statistical models: Multivariate equalization based on principal curves, SPCA, was used to explain color vision [11,12], and this will also be considered (with or without classical chromatic adaptation transforms) as a convenient statistically based benchmark.

We used the following parameters in the above models. In the von Kries adaptation, the corresponding white points were computed by the mean of the color samples obtained under the different illuminations. The same was performed in the Webster–Clifford approach, which also used the sample covariance matrices for manifold alignment. The pleistochrome saturation functions were approximated by exponential functions. All the psychophysical networks also used the mean of the color samples for the different illuminations to obtain the white points. This is the only parameter in CIE L

^{*}

a

^{*}

b

^{*}

. However, LLAB and CIECAM02 require extra parameters. In LLAB, we took the adaptation and inductions parameters listed in [22] for a default average surround. In CIECAM02, we also assumed an average surround (with the corresponding consequence in adaptation, impact of surround, and induction parameters [43]). Moreover, we took the adapting field, the background, and the white point from the mean of the colors. Given the luminances of these mean colors, the degree of adaptation in CIECAM02 was approximately 0.9 in the

[0, 1]

range. Finally, in the statistical SPCA, we used the parameters for information maximization in color manifolds reported in [11], which imply relatively stiff principal curves but adaptive histogram equalization (or generalized pleistochrome) at different points of these curves.

The physiological and psychophysical networks are implemented in Colorlab [44] available here (https://isp.uv.es/code/visioncolor/colorlab.html, accessed on 5 October 2022) and the SPCA algorithm [11,12] is available here (https://isp.uv.es/spca.html, accessed on 5 October 2022).

4.1. Visualization of the Color Manifolds

First, we show how the color manifolds of natural scenes under the CIE D65 and CIE A illuminations change through the layers of the considered networks. Information measures are related to the volume and density of the data distribution [31]. As a result, the visualization of the shape of the manifolds is important to understand the geometric effect of the considered transforms, and hence, their impact on the measures. Specifically, Figure 2 shows the separate effect of the elements of the physiological networks (opponency and nonlinearities of opponent channels without and with different chromatic adaptations). Figure 3 shows the effect of different (psychophysical) color appearance models once the corresponding chromatic adaptation has been performed and after the nonlinear opponent responses have been obtained. Finally, Figure 4 shows the result of the nonlinear equalization SPCA when it is applied to nonaligned manifolds or to manifolds aligned according to different color adaptation transforms.

4.2. Quantification of the Information Flow

We provide indirect measures of efficiency, such as the redundancy, T, within each layer along the networks, and the differential entropy, h, of the signal at each layer of the networks (Table 1 and Table 2, respectively). We also provide a direct measure of efficiency: the mutual information between the noiseless input and the noisy response, I, at each layer of the network (Table 3).

These magnitudes were estimated from the responses of the models to test sets of randomly chosen color samples from the IPL database shown in Figure 1. The parameters of the models that require statistical training (namely the Webster–Clifford adaptation, the pleistochrome nonlinearity, and SPCA) were trained over a set with

2 \times 10^{6}

natural color samples. Then,

4 \times 10^{4}

color samples not included in the training set were transformed using the perceptual models and the purely statistical SPCA. Ten independent estimations of the quantities were performed using random subsets of 80% of the test set. All tables shown were estimated using the RBIG (code in https://isp.uv.es/RBIG4IT.htm, accessed on 5 October 2022). Additionally, entropy and mutual information were also estimated using the Kozachenko–Leonenko procedure leading to qualitatively similar results (tables equivalent to Table 2 and Table 4, not shown).

Finally, in Table 5, we report the KLD that measures the correspondence between the color sets corresponding to D65 illumination and A illumination after chromatic adaptation (or white balance).

The tables have to be interpreted according to the following reference values:

Total correlation in Table 1 should be interpreted in light of Equation (2) where the transmitted information is maximized only if redundancy is completely removed: optimal means T = 0.
The differential entropy, h, depends on arbitrary changes of scale (or units) of the response. It can be even negative for PDFs of small support (chapt. 17 in [31]). Thus, for fair comparison, all h values in Table 2 were computed after linearly re-scaling the signals to be inscribed in the same 3D cube of size S. As such, the h values do describe how uniform the distributions are in the common support (in our case, we chose S = 10). As a useful reference, the upper bound for h is achieved by the uniform distribution, which is $h_{unif} = d \log_{2} S = 3 \log_{2} 10 = 9.97$ bits.
As stated in Section 3.2, the information shared by the noiseless input, $x^{0}$ , and the negligible-noise input, $x_{☆}^{0}$ , is a convenient reference because other (more noisy) layers will share less information with $x^{0}$ . Assuming, as indicated in Section 3.2, that noise in LMS with 0.05% deviation is negligible, the empirical RBIG computation of $I (x^{0}, x_{☆}^{0})$ on natural colors in LMS gives $14.1 \pm 0.1$ bits. The values of I in Table 3 should be compared to that upper bound.
In Table 4, lower divergences indicate a better match between the color compensated sets, so the optimal value would be $D_{K L} = 0$ .

In all tables, the three best results are highlighted in blue.

5. Discussion

The visualization of the manifold changes through the transforms of the considered models confirms their statistical interpretation outlined in the introduction. Adaptation aligns the data obtained under different acquisition conditions: compare the unaligned sets in blue and red in the plots of the left column of Figure 2 (no adaptation) with all the other cases where both sets overlap. Linear opponent channels rotate the input LMS representation following the axes of the data: this rotation is particularly clear in the transition from the first row to the second row in Figure 2, which shows rotations of the sets consistent with the principal component analysis interpretation of opponent channels [7]. Finally, the nonlinear nature of the opponent channels equalizes the responses: note in the bottom row of Figure 2 and Figure 3 how the biological networks make the different dimensions comparable in size. The nonlinearities equalize the responses as opposed to the linear representations where the luminance has substantially bigger energy and the PDF is more nonuniform. This is consistent with the equalization goal proposed in [10,11,45].

While data in the input representation are highly correlated due to the overlap of the LMS sensitivities (strong alignment along the diagonal of the domain in the top row of Figure 2 and Figure 3), the redundancy between the responses clearly reduces at later stages. This is quantitatively confirmed by the reduction in T along the layers of the networks in Table 1. On the other hand, visualizations suggest that data are progressively equalized along the networks and this is quantitatively confirmed by the progressive increase in the differential entropy along the columns of Table 2. As a result of the progressive independence and equalization, the amount of information about the input available at the different representations increases along the columns of Table 3.

It is remarkable that psychophysically-tuned models such as CIE-Lab and CIECAM have similar or better information transmission performance than an unsupervised learning method, SPCA, specifically trained for information maximization. While the emergence of perceptual nonlinearities, adaptation, and aftereffects from SPCA [11,12] is a confirmation of the efficient coding hypothesis in the classical from-statistics-to-perception direction, the quantitative efficiency of CIE color appearance models presented here is a confirmation in the opposite direction (from-perception-to-statistics). The results presented here complement in the purely chromatic modality previous results of this alternative approach to check Barlow’s hypothesis [14,15,16].

Beyond this confirmation, our results enable the quantification of the gain in information transference due to the specific layers of the network: retinal adaptation, transform to opponent channels, and the saturation nonlinearities of the opponent channels. The average results of

Δ I

are given in Table 5. These gains,

Δ I

, imply that opponency is the most relevant feature of color vision to favor the efficient information transmission, followed by the saturating response of the opponent channels. These processes are way more important than chromatic adaptation. One may argue that the goal of chromatic adaptation is making scene interpretation more robust and not merely improving the information capacity of the visual pathway.

Note also that KLD results in Table 4 show that SPCA (designed for information maximization) actually achieves a lower adaptation performance than the simple von Kries or Webster–Clifford procedures in the LMS space. This, together with the small impact of chromatic adaptation in information transmission, suggest that adaptation should be explained by a different principle. The small effect of chromatic adaptation in information theoretic terms was also identified in a previous analysis of biological spatio-chromatic vision models [16]: von Kries did not reduce redundancy and the chromatically adapted domain with the same amount of noise shared less information with the original signal than the linear LMS space. In that study [16], the benefits of chromatic adaptation in information transmission were only apparent, if any, after the consideration of spatial transforms.

The communication efficiency analysis performed herein following the perception-to-statistics logic is different from the work of Foster et al., which was also concerned about the use of accurate information-theoretic measures in color vision [18]. Note that their work is mainly focused on determining the number of discriminable colors/surfaces under different illumination conditions [46,47,48], which is related to the amount of color information in a scene that can be extracted from color measurements under other illumination [18,39]. These problems are related to entropy and mutual-information measures, but they do not quantify the information transference through the visual pathway (mutual information between layers and redundancy within layers). As an example, in [47,49], the redundancy is only considered because of its impact on the available information in the color compensation context, not as a measure of information transmission of the visual system.

In more general scenarios (where perception is complemented with action), it has been suggested that information flow is not solely a function of the scenes to be encoded, but it can be maximized through active interaction between the system and the environment [50], e.g., through saccades, foveation, or adaptive saliency. This (more general) perception–action loop is beyond the scope of our work which is restricted to bottom–up feed-forward networks. However, from the technical point of view, the estimation of information-theoretic quantities using Gaussianization can relax the severe quantization and Gaussian assumptions that had to be used in [50]. Similarly, following [51], where estimations based on Gaussianization were used to quantify connectivity between nodes in neural networks, RBIG could be used to measure normalized mutual information which is the core of successful methods to identify actual relations in noisy networks [52].

6. Conclusions

The results show that biological architectures that mediate color vision are quite efficient in information theoretic terms: they reduce approximately 75% of the redundancy present at the input linear responses (Table 1 shows that the redundancy of the raw signal (input linear LMS) is 5.71 bits while the redundancy in the inner representation of psychophysical models is in the range of [1,2] bits, which represents reductions of [65,82]% of the redundancy present in the input). As a result, while noisy sensors at the input representation would only retain approximately 35% of the chromatic information (Table 2 shows that LMS sensors with noise of 5% deviation would retain 5.0 bits compared to the ideal 14.1 bits assuming negligible noise of 0.05% deviation), sensors with the same amount of noise at the inner representations retain approximately 65% of the information. The inner representation of the psychophysical models with 5% noise shown in Table 2 retain approximately [7.8, 9.8] bits from the input. This is approximately 65% of the maximum available information, 14.1 bits). The results of biological networks (not explicitly optimized for statistical goals) are on par with those of non-linear equalization methods. In terms of communication efficiency, the most relevant transform in color vision is the consideration of opponent channels followed by the nonlinear response of the opponent channels. On the contrary, the impact of adaptation to improve the transmission is almost negligible (see Table 5). The small impact of chromatic adaptation in information transmission is consistent with the previous analysis of the von Kries transform in more general (spatio-chromatic) vision models [16].

From the theoretical neuroscience perspective, these results confirm the Efficient Coding Hypothesis for human color vision in the perception-to-statistics direction: statistically agnostic color vision models such CIE L

^{*}

a

^{*}

b

^{*}

and CIECAM02 (only based on psychophysics) are remarkably efficient in transmitting natural colors. Moreover, for the data science community, these results rank the relevance of color vision features in terms of their impact in optimal color information transmission: linear decorrelation and nonlinear equalization are more important than manifold alignment (or white balance).

Funding

This research and the APC were funded by the Spanish Ministerio de Ciencia e Innovación (MICIIN/FEDER/UE), grant numbers PID2020-118071GB-I00 and PDC2021-121522-C21, and and by the Generalitat Valenciana, grant numbers GrisoliaP/2019/035 and CIPROM/2021/056.

Conflicts of Interest

The author declares no conflict of interest regarding the publication of this paper.

References

Gutmann, M.U.; Laparra, V.; Hyvärinen, A.; Malo, J. Spatio-chromatic adaptation via higher-order canonical correlation analysis of natural images. PLoS ONE 2014, 9, e86481. [Google Scholar] [CrossRef] [PubMed]
Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 25th Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2012; pp. 1097–1105. [Google Scholar]
Taubman, D.S.; Marcellin, M.W. JPEG 2000: Image Compression Fundamentals, Standards and Practice; Kluwer Academic Publishers: Norwell, MA, USA, 2001. [Google Scholar]
Gutierrez, J.; Camps, G.; Luque, M.; Malo, J. A Color Contrast Definition for Perceptually Based Color Image Coding. Recent Patents Sign. Proc. 2012, 2, 33–55. [Google Scholar]
Barlow, H. Sensory mechanisms, the reduction of redundancy, and intelligence. In Proceedings of the National Physics Laboratory Symposium on the Mechanization of Thought Process, London, UK, 24–27 November 1958; pp. 535–539. [Google Scholar]
Barlow, H. Redundancy reduction revisited. Netw. Comp. Neur. Syst. 2001, 12, 241–253. [Google Scholar] [CrossRef]
Buchsbaum, G.; Gottschalk, A. Trichromacy, opponent colours coding and optimum colour information transmission in the retina. Proc. Roy. Soc. Lond. B Biol. Sci. 1983, 220, 89–113. [Google Scholar]
Atick, J.; Li, Z.; Redlich, A. Understanding Retinal Color Coding from First Principles. Neural Comput. 1992, 4, 559–572. [Google Scholar] [CrossRef]
Li, Q.; Gomez-Villa, A.; Bertalmío, M.; Malo, J. Contrast sensitivity functions in autoencoders. J. Vis. 2022, 22, 8. [Google Scholar] [CrossRef] [PubMed]
MacLeod, D.; von der Twer, T. The pleistochrome: Optimal opponent codes for natural colors. In Color Perception: From Light to Object; Heyer, D., Mausfeld, R., Eds.; Oxford Univ. Press: Oxford, UK, 2003. [Google Scholar]
Laparra, V.; Jiménez, S.; Camps-Valls, G.; Malo, J. Nonlinearities and adaptation of color vision from sequential principal curves analysis. Neural Comput. 2012, 24, 2751–2788. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Laparra, V.; Malo, J. Visual aftereffects and sensory nonlinearities from a single statistical framework. Front. Hum. Neurosci. 2015, 9, 557. [Google Scholar] [CrossRef] [Green Version]
Gomez-Villa, A.; Martín, A.; Vazquez-Corral, J.; Bertalmío, M.; Malo, J. Color illusions also deceive CNNs for low-level vision tasks: Analysis and implications. Vis. Res. 2020, 176, 156–174. [Google Scholar] [CrossRef]
Malo, J.; Laparra, V. Psychophysically tuned divisive normalization approximately factorizes the PDF of natural images. Neural Comput. 2010, 22, 3179–3206. [Google Scholar] [CrossRef]
Gomez-Villa, A.; Bertalmio, M.; Malo, J. Visual Information Flow in Wilson-Cowan Networks. J. Neurophysiol. 2020, 123, 2249–2268. [Google Scholar] [CrossRef]
Malo, J. Spatio-Chromatic Information available from different Neural Layers via Gaussianization. J. Math. Neurosci. 2020, 10, 18. [Google Scholar] [CrossRef]
Webster, M.A.; Mollon, J. Adaptation and the color statistics of natural images. Vis. Res. 1997, 37, 3283–3298. [Google Scholar] [CrossRef] [Green Version]
Foster, D.H. The Verriest Lecture: Color vision in an uncertain world. JOSA A 2018, 35, 192–201. [Google Scholar] [CrossRef]
Malo, J. Information flow in Color Appearance Neural Networks. In Proceedings of the Entropy Conference, Porto, Portugal; 2021; p. 48. Available online: https://entropy2021.sciforum.net/ (accessed on 5 October 2022).
Stockman, A.; Brainard, D. OSA Handbook of Optics, 3rd ed.; McGraw-Hill: New York, NY, USA, 2010; Chapter Color Vision Mechanisms; pp. 147–152. [Google Scholar]
Stockman, A.; Sharpe, L. The spectral sensitivities of the middle- and long-wavelength-sensitive cones derived from measurements in observers of known genotype. Vis. Res. 2000, 40, 1711–1737. [Google Scholar] [CrossRef] [Green Version]
Fairchild, M. Color Appearance Models; The Wiley-IS & T Series in Imaging Science and Technology; Wiley: Sussex, UK, 2013. [Google Scholar]
Clifford, W.; Webster, M.; Stanley, G.; Stocker, A.; Kohn, A.; Sharpee, T.; Schwartz, O. Visual adaptation: Neural, psychological and computational aspects. Vis. Res. 2007, 47, 3125–3131. [Google Scholar] [CrossRef] [Green Version]
Hurvich, L.; Jameson, D. An opponent-process theory of color vision. Psychol. Rev. 1957, 64, 384–404. [Google Scholar] [CrossRef] [Green Version]
Shapley, R.; Hawken, M. Color in the cortex: Single- and double-opponent cells. Vis. Res. 2011, 51, 701–717. [Google Scholar] [CrossRef] [Green Version]
Capilla, P.; Diez-Ajenjo, M.; Luque, M.; Malo, J. Corresponding-pair procedure: A new approach to simulation of dichromatic color perception. JOSA A 2004, 21, 176–186. [Google Scholar] [CrossRef]
Krauskopf, J.; Gegenfurtner, K. Color discrimination and adaptation. Vis. Res. 1992, 32, 2165–2175. [Google Scholar] [CrossRef]
Romero, J.; García, J.; Jiménez del Barco, L.; Hita, E. Evaluation of color-discrimination ellipsoids in two-color spaces. J. Opt. Soc. Am. A 1993, 10, 827–837. [Google Scholar] [CrossRef]
Wyszecki, G.; Stiles, W. Color Science: Concepts and Methods, Quantitative Data and Formulae; John Wiley & Sons: New York, NY, USA, 1982. [Google Scholar]
Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Available online: https://www.deeplearningbook.org (accessed on 5 October 2022).
Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley-Interscience: Hoboken, NJ, USA, 2006. [Google Scholar]
Watanabe, S. Information theoretical analysis of multivariate correlation. IBM J. Res. Dev. 1960, 4, 66–82. [Google Scholar] [CrossRef]
Studeny, M.; Vejnarova, J. The Multi-information function as a tool for measuring stochastic dependence. In Learning in Graphical Models; Jordan, M.I., Ed.; Kluwer: Amsterdam, The Netherlands, 1998; pp. 261–298. [Google Scholar]
Bell, A.J.; Sejnowski, T.J. An information maximization approach to blind separation and blind deconvolution. Neural Comput. 1995, 7, 1129–1159. [Google Scholar] [CrossRef] [PubMed]
Laparra, V.; Camps-Valls, G.; Malo, J. Iterative gaussianization: From ICA to random rotations. IEEE Trans. Neural Netw. 2011, 22, 537–549. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Laparra, V.; Johnson, E.; Camps, G.; Santos, R.; Malo, J. Information Theory Measures via Multidimensional Gaussianization. arXiv 2020, arXiv:2010.03807. [Google Scholar]
Johnson, J.; Laparra, V.; Santos, R.; Camps, G.; Malo, J. Information Theory in Density Destructors. In Proceedings of the 7th International Conference of Machine Learning ICML 2019, Workshop on Invertible Normalization Flows, Long Beach, CA, USA, 10–15 June 2019. [Google Scholar]
Kozachenko, L.; Leonenko, N. Sample estimate of the entropy of a random vector. Probl. Inf. Trans. 1987, 23, 95–101. [Google Scholar]
Marin-Franch, I.; Foster, D.H. Estimating Information from Image Colors: An Application to Digital Cameras and Natural Scenes. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 78–91. [Google Scholar] [CrossRef] [Green Version]
CIE-Commission. Colorimetry-Part 4: CIE 1976 L*a*b* Colour Space; Technical Report ISO/CIE 11664-4:2019; CIE: Cambridge, UK, 1976. [Google Scholar]
Luo, M.; Lo, M.; Kuo, W. The LLAB colour model. Color Res. Appl. 1996, 21, 412. [Google Scholar] [CrossRef]
Luo, M.; Hunt, R. The structure of the CIE 1997 colour appearance model (CIECAM97s). Color Res. Appl. 1998, 22, 138–146. [Google Scholar] [CrossRef]
Moroney, N. The CIECAM02 Color Appearance Model. In Proceedings of the 10th Color Imaging Conference, Scottsdale, AZ, USA, 12–15 November 2002; pp. 23–27. [Google Scholar]
Malo, J.; Luque, M. ColorLab: A Matlab Toolbox for Color Science and Calibrated Color Image Processing. 2002. Available online: http://isp.uv.es/code/visioncolor/colorlab.html (accessed on 5 October 2022).
Laughlin, S.B. Matching coding to scenes to enhance efficiency. In Physical and Biological Processing of Images; Braddick, O.J., Sleigh, A.C., Eds.; Springer: Berlin, Germany, 1983; pp. 42–52. [Google Scholar]
Foster, D.; Nascimento, S.; Amano, K. Information limits on neural identification of colored surfaces in natural scenes. Vis. Neurosci. 2004, 21, 331–336. [Google Scholar] [CrossRef] [Green Version]
Foster, D.H.; Marín-Franch, I.; Amano, K.; Nascimento, S.M.C. Approaching ideal observer efficiency in using color to retrieve information from natural scenes. J. Opt. Soc. Am. A 2009, 26, 14–24. [Google Scholar] [CrossRef] [Green Version]
Marin-Franch, I.; Foster, D. Number of perceptually distinct surface colors in natural scenes. J. Vis. 2010, 10, 9. [Google Scholar] [CrossRef] [Green Version]
Foster, D.; Marin-Franch, I.; Nascimento, S. Coding efficiency of CIE color spaces. In Proceedings of the 16th Color Imaging Conference on Social Imaging Science and Technology, Portland, OR, USA, 10–14 November 2008; pp. 285–288. [Google Scholar]
Lungarella, M.; Sporns, O. Mapping Information Flow in Sensorimotor Networks. PLoS Comput. Biol. 2006, 2, e144. [Google Scholar] [CrossRef] [Green Version]
Li, Q.; Steeg, G.; Malo, J. Functional Connectivity in Visual Areas from Total Correlation. arXiv 2022, arXiv:2208.05770. [Google Scholar]
Wang, B.; Pourshafeie, A.; Zitnik, M.; Zhu, J.; Bustamante, C.; Batzoglou, S.; Leskovec, J. Network enhancement as a general method to denoise weighted biological networks. Nat. Comm. 2018, 9, 3108. [Google Scholar] [CrossRef]

Figure 1. Natural colors used in the experiments. (Left): experimental setting to record the natural scenes under calibrated illumination (CIE D65 and CIE A spectra) and representative corresponding scenes. (Right): color measurements in the CIE xy diagram. Blue and red dots represent the colors under D65 (white) and A (yellowish) illuminations.

Figure 2. Manifold changes through physiological networks. (Top Row): Retinal responses of LMS cones with no adaptation (left) and with different retinal adaptation schemes (center and right). (Center Row): Linear opponent ATD responses (recombination after the retina) with no adaptation (left) and different adaptation schemes. (Bottom Row): Nonlinear opponent ATD responses after dimension-wise PDF equalization (pleistochrome transform). Red and blue dots represent the samples under the A and D65 illuminations.

Figure 3. Manifold changes through psychophysical color appearance models. (Top): Responses of sensors of standard models tuned to long, medium, and short wavelengths. All these representations already include divisive adaptation schemes for manifold alignment. (Bottom): Nonlinear opponent responses in the different color appearance models. Red and blue dots represent the samples under the A and D65 illuminations.

Figure 4. Manifold equalization using nonlinear independent component analysis (sequential principal curves, SPCA). (Left): Responses of the SPCA sensors from non-aligned manifolds. (Center): SPCA responses from von Kries adapted measurements. (Right): SPCA responses from Webster–Clifford adapted colors. Red and blue dots represent the samples under the A and D65 illuminations.

Table 1. Intra-layer Total Correlation through the networks (in bits). Optimal systems (Equation (2)) should have T = 0 bits.

Physiol. Models	No Adaptation	Von Kries	Webster–Clifford
Linear LMS input	5.71 ± 0.04	6.26 ± 0.02	6.29 ± 0.03
Linear ATD channels	3.23 ± 0.03	2.21 ± 0.03	2.30 ± 0.05
Pleistochrome	3.26 ± 0.04	2.22 ± 0.04	2.33 ± 0.05
Color App. Models	CIE L $^{}$ a $^{}$ b $^{*}$	LLAB	CIECAM
LMS nonlin. sensors	5.89 ± 0.04	5.84 ± 0.02	6.55 ± 0.03
ATD nonlin. sensors	1.04 ± 0.06	2.00 ± 0.02	1.58 ± 0.05
Statist. Model	No adaptation	Von Kries	Webster–Clifford
Infomax SPCA	1.12 ± 0.04	0.95 ± 0.06	1.08 ± 0.04

Table 2. Differential entropy of color through the networks (in bits). Upper bound for h in the 3D cube of size 10 is

h_{unif} = 3 \log_{2} 10 = 9.97

bits.

Table 2. Differential entropy of color through the networks (in bits). Upper bound for h in the 3D cube of size 10 is

h_{unif} = 3 \log_{2} 10 = 9.97

bits.

Physiol. Models	No Adaptation	Von Kries	Webster–Clifford
Linear LMS input	−4.97 ± 0.04	−4.01 ± 0.04	−4.01 ± 0.02
Linear ATD channels	−1.68 ± 0.03	−0.71 ± 0.03	−0.72 ± 0.01
Pleistochrome	6.24 ± 0.04	7.64 ± 0.0.3	7.56 ± 0.02
Color App. Models	CIE L $^{}$ a $^{}$ b $^{*}$	LLAB	CIECAM
LMS nonlin. sensors	1.93 ± 0.04	−2.18 ± 0.02	−1.70 ± 0.03
ATD nonlin. sensors	4.61 ± 0.02	1.10 ± 0.03	2.74 ± 0.01
Statist. Model	No adaptation	Von Kries	Webster–Clifford
Infomax SPCA	8.70 ± 0.04	8.82 ± 0.03	8.71 ± 0.03

Table 3. Transferred information (input–output mutual information, in bits). Empirical RBIG upper bound for I is 14.1 bits (negligible noise of 0.05% deviation).

Physiol. Models	No Adaptation	Von Kries	Webster–Clifford
Linear LMS input	$5.0 \pm 0.1$	$5.1 \pm 0.1$	$5.2 \pm 0.1$
Linear ATD channels	$7.7 \pm 0.2$	$7.6 \pm 0.2$	$7.6 \pm 0.1$
Pleistochrome	$8.58 \pm 0.05$	$8.6 \pm 0.1$	$8.7 \pm 0.1$
Color App. Models	CIE L $^{}$ a $^{}$ b $^{*}$	LLAB	CIECAM
LMS nonlin. sensors	$6.6 \pm 0.1$	$5.44 \pm 0.03$	$5.8 \pm 0.1$
ATD nonlin. sensors	9.8 ± 0.2	$7.8 \pm 0.2$	8.8 ± 0.3
Statist. Model	No adaptation	Von Kries	Webster–Clifford
Infomax SPCA	8.9 ± 0.2	$8.7 \pm 0.3$	8.8 ± 0.3

Table 4. Chromatic adaptation (KLD between the D65 and A sets, in bits). Perfect color compensation is represented (by definition) by KLD = 0 bits.

Physiol. Models	No Adaptation	Von Kries	Webster–Clifford
Linear LMS input	5.7 ± 0.1	0.84 ± 0.05	0.67± 0.06
Linear ATD channels	3.6 ± 0.3	0.83 ± 0.05	0.78 ± 0.04
Pleistochrome	3.6 ± 0.3	0.7 ± 0.1	0.82 ± 0.07
Color App. Models	CIE L $^{}$ a $^{}$ b $^{*}$	LLAB	CIECAM
LMS nonlin. sensors	0.54 ± 0.07	0.72 ± 0.09	0.73 ± 0.07
ATD nonlin. sensors	0.55 ± 0.07	0.72 ± 0.07	0.72 ± 0.02
Statist. Model	No adaptation	Von Kries	Webster–Clifford
Infomax SPCA	2.2 ± 0.1	1.8 ± 0.1	1.6 ± 0.1

Table 5. Gains in available information (

Δ I

, in bits) due to different features of the models.

Table 5. Gains in available information (

Δ I

, in bits) due to different features of the models.

	Retinal Adaptation	Opponency	Saturation
Physiol. Models	0.1 ± 0.1	2.5 ± 0.2	1.0 ± 0.2
	Retinal Adaptation	Opponency + Saturation
Color App. Mod. Infomax SPCA	0.9 ± 0.9 0.0 ± 0.3	2.8 ± 0.4 3.7 ± 0.2

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Malo, J. Information Flow in Biological Networks for Color Vision. Entropy 2022, 24, 1442. https://doi.org/10.3390/e24101442

AMA Style

Malo J. Information Flow in Biological Networks for Color Vision. Entropy. 2022; 24(10):1442. https://doi.org/10.3390/e24101442

Chicago/Turabian Style

Malo, Jesús. 2022. "Information Flow in Biological Networks for Color Vision" Entropy 24, no. 10: 1442. https://doi.org/10.3390/e24101442

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Information Flow in Biological Networks for Color Vision

Abstract

1. Introduction

2. Statistical Interpretation of Building Blocks of Color Vision

3. Materials and Methods

3.1. Colorimetrically Calibrated Database

3.2. Quantification of the Information Flow

3.3. Total Correlation, Mutual Information, and Kullback–Leibler Divergence from Gaussianization

4. Experiments and Results

4.1. Visualization of the Color Manifolds

4.2. Quantification of the Information Flow

5. Discussion

6. Conclusions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI