**1. Introduction**

In 2010, the World Health Organization estimated that at least 285 million people worldwide have some form of visual impairment, 39 million of whom are blind. Moreover, this number is likely to increase due to several factors, including an aging population. Driven by advances in technology and the need to improve the quality of life of visually impaired and blind people, the research community has recently shown a growing interest in developing assistive technology [1]. Most of the focus has been on mobility, navigation, and object recognition and, more recently, on improving access to printed media and social interaction [2]. Although access to artistic culture prevents isolation and fosters the participation of visually impaired and blind people in society and communication [3], there is limited research on assistive technology for accessing the contents of visual artworks.

Traditionally, access to visual artworks by blind and visually impaired people has been provided through accessible tours and workshops [4,5], audio guides [6], Braille leaflets with embossed tactile diagrams [7], tactile 3D models [8], and, more recently, interactive interfaces that provide rich, location-based information through multiple sensory channels [9–13]. However, these methods fail to convey the artwork's color content. Color is a fundamental artwork element that expresses depth, form, and movement; it also strongly affects the mood, harmony, importance, and emotions expressed. Perceiving color is therefore essential for a complete artwork experience.

**Citation:** Cavazos Quero, L.; Lee, C.-H.; Cho, J.-D. Multi-Sensory Color Code Based on Sound and Scent for Visual Art Appreciation. *Electronics* **2021**, *10*, 1696. https://doi.org/10.3390/electronics10141696

Academic Editor: Mazdak Zamani

Received: 11 March 2021; Accepted: 12 July 2021; Published: 15 July 2021

In this work, we propose a multi-sensory color code for visual artworks that uses melodies and scents. We selected the audio and olfactory components following a cross-modal correspondence approach, since there is evidence that such correspondences can facilitate learning [14–17]. Most of the literature has traditionally expressed color in a uni-sensory way, but, as the number of encoded colors increases, so does the mental effort required to learn and use the code. We hypothesize that a hybrid approach eases the effort of recognizing the encoded colors and improves the color identification success rate compared to uni-sensory methods. Additionally, a multi-sensory approach is potentially more expressive, as the different modalities can complement each other. In this work, we also explore the use of the proposed color code system and implement a sensory substitution device prototype for exploring the color content of visual art. Participants were asked to explore a visual artwork through the prototype and to compare it with a tactile graphics alternative in order to elicit feedback and identify the advantages and challenges of the multi-sensory approach. The main contributions of this work are:


### **2. Related Work**

### *2.1. Auditory Representation of Color*

Visual Sensory Substitution Devices (SSDs) are systems that convey visual information through non-visual modalities. Typically, they translate visual features (vertical position, horizontal position, luminance, and color) into tactile or auditory stimuli (texture, shape, pitch, frequency, time, intensity) [18]. The translation is often based on the principle of cross-modal correspondence, which refers to a compatibility effect between attributes or dimensions of a stimulus in different sensory modalities. In this work, the compatibility effect studied is between visual stimuli and the sound and smell sensory modalities. The specific case of compatibility between color and audio is known as color sonification, and the set of associations or mappings made between sound and color is known as a Sound Color Code (SCC).

Examples of SSDs that use color sonification are:

Soundview [19] represents the color of a point selected with a stylus from an image displayed on a tablet. Color sonification maps the color's hue, saturation, and brightness onto a white noise sound that is then modified by a low-pass filter. The color's brightness determines the filter's cutoff frequency. The color's hue determines which of 12 pitch-filters modulates the signal, and the saturation sets the extent to which the pitch-filter modifies the signal. Soundview has been successfully evaluated for the identification of black-and-white 2D geometric shapes [20].
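The brightness-to-cutoff mapping can be sketched as follows. The cutoff range and the one-pole filter are illustrative assumptions, not Soundview's actual implementation:

```python
import math
import random

def brightness_to_cutoff(brightness, f_min=200.0, f_max=8000.0):
    """Map brightness in [0, 1] to a low-pass cutoff frequency in Hz.
    The 200-8000 Hz range is an assumption for illustration."""
    return f_min + brightness * (f_max - f_min)

def one_pole_lowpass(samples, cutoff_hz, sample_rate=44100):
    """Apply a simple one-pole low-pass filter to a sample buffer."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)   # exponential smoothing of the input
        out.append(y)
    return out

noise = [random.uniform(-1.0, 1.0) for _ in range(1024)]
bright = one_pole_lowpass(noise, brightness_to_cutoff(0.9))  # mild filtering
dark = one_pole_lowpass(noise, brightness_to_cutoff(0.1))    # heavy filtering
```

Darker colors yield a lower cutoff, so more of the noise energy is removed and the sound becomes duller.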

Eyeborg [21] uses a head-mounted webcam. A single color is picked from the center of the image using a weighted distribution of the pixel colors. The color's hue and saturation are used for color sonification: the hue is classified into 360 tones spanning an octave, and the saturation modulates the volume of the sound.
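The 360-tone hue classification can be sketched as log-spaced frequencies within one octave; the 440 Hz base note is an assumption for illustration:

```python
def eyeborg_tone(hue_deg, base_hz=440.0):
    """Map a hue angle (0-359 degrees) onto one of 360 tones spanning a
    single octave, spaced evenly on a logarithmic (pitch) scale."""
    return base_hz * 2.0 ** ((hue_deg % 360) / 360.0)
```

Hue 0 plays the base note and hue 359 approaches (but never reaches) the note one octave above, so every hue gets a distinct pitch.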

See ColOr [22] horizontally samples 25 pixels from a video feed and simultaneously sonifies each pixel's color based on its hue, saturation, and brightness. The hue is classified into seven colors, each represented by the timbre of a musical instrument: red (oboe), orange (viola), yellow (pizzicato), green (flute), cyan (trumpet), blue (piano), and purple (saxophone). Mixed hues can be expressed by modulating the volume between instruments. Saturation is expressed in four steps by the pitch of the note played on the hue's instrument: 0–24% is represented by a C note, 25–49% by a G note, 50–74% by a B-flat note, and 75–100% by an E note. Luminance is mapped as follows: for values below 50%, a double bass sound is added; for values above 50%, a singing voice is added. High luminance uses high-pitched notes, while low luminance uses low-pitched notes. See ColOr has been evaluated for color recognition [23], navigation, and object localization [24].
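The timbre and note assignments described above amount to two lookup tables; a minimal sketch (the function names are our own):

```python
# Hue-to-instrument mapping reported for See ColOr.
HUE_TO_INSTRUMENT = {
    "red": "oboe", "orange": "viola", "yellow": "pizzicato",
    "green": "flute", "cyan": "trumpet", "blue": "piano",
    "purple": "saxophone",
}

def saturation_to_note(saturation_pct):
    """Quantize saturation (0-100%) into the four notes used by See ColOr."""
    if saturation_pct < 25:
        return "C"
    if saturation_pct < 50:
        return "G"
    if saturation_pct < 75:
        return "B-flat"
    return "E"

def sonify_pixel(hue_name, saturation_pct):
    """Return the (instrument, note) pair for one sampled pixel."""
    return HUE_TO_INSTRUMENT[hue_name], saturation_to_note(saturation_pct)
```

For example, a weakly saturated red pixel would be voiced as a C note on the oboe.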

Kromophone [25] represents a specific color as the sum of focal colors and luminance. The focal colors are red, green, blue, and yellow. The luminance is divided into white, grey, and black. Each focal color is represented by a characteristic sound composed of pitch, timbre, and panning. A color representation is made by mixing the sound of the different focal colors and luminance. The contribution of each of the focal colors is expressed by its volume in the mix. Kromophone was evaluated for color identification, object discrimination, and navigation tasks.

ColEnViSon [26] converts the Red, Green, and Blue (RGB) space of an image into the CIE Lch color space, which characterizes colors with 267 centroids. Each color in the image is simplified to that of its nearest centroid and then grouped into one of ten color categories selected from hue and chroma. Each of the ten color categories is given a unique timbre, and luminance is represented using musical notes. Experiments showed that users were able to interpret sequences of colored patterns and identify the number of colors in an image.
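The nearest-centroid simplification step can be sketched as follows. The toy centroids are hypothetical, and plain Euclidean distance is used for simplicity even though the hue axis in CIE Lch is angular:

```python
def nearest_category(color, centroids):
    """Assign a color (a 3-tuple in some color space) to the nearest
    category centroid by Euclidean distance. ColEnViSon works in CIE Lch
    with 267 centroids; the toy centroids below are illustrative only."""
    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda name: dist_sq(color, centroids[name]))

# Hypothetical (L, c, h) centroids for three categories, not the paper's values.
TOY_CENTROIDS = {
    "red":   (50.0, 60.0, 30.0),
    "green": (55.0, 50.0, 140.0),
    "blue":  (40.0, 45.0, 280.0),
}
```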

EyeMusic [14] conveys color information using a musical instrument's timbre for each of the white (choir), blue (brass), red (Reggae organ), green (Rapman's reed), and yellow (string) colors. Black is represented by silence.

EyeMusic resizes and clusters the colors in the image to produce a six-color 40 × 24-pixel image. The image is then divided into columns, which are processed from left to right to construct a soundscape. The musical note played on the instrument is determined by the y-axis coordinate of the pixel, and the luminance determines the note's volume. The sounds of all the pixels in a column are combined, and the combined audio is reproduced column by column. EyeMusic was evaluated for shape and color recognition, achieving 91.5% and 85.6% accuracy, respectively.
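The column-scanning scheme can be sketched as follows; the MIDI-style note numbering and the toy image are assumptions for illustration, not EyeMusic's actual parameters:

```python
# Timbre table reported for EyeMusic; black maps to silence.
TIMBRE = {"white": "choir", "blue": "brass", "red": "reggae organ",
          "green": "reed", "yellow": "string", "black": None}

def column_events(image, base_midi=48):
    """image: rows of (color_name, luminance in [0, 1]) pixels.
    Scan columns left to right; within a column, the pixel's row sets the
    note and its luminance sets the volume."""
    height = len(image)
    events = []
    for col in range(len(image[0])):
        notes = []
        for row in range(height):
            color, lum = image[row][col]
            instrument = TIMBRE[color]
            if instrument is None:
                continue                            # black pixels are silent
            pitch = base_midi + (height - 1 - row)  # top rows -> higher notes
            notes.append((instrument, pitch, lum))
        events.append(notes)
    return events

demo = [[("red", 0.8), ("black", 0.0)],
        [("blue", 0.5), ("yellow", 1.0)]]
soundscape = column_events(demo)
```

Each entry of `soundscape` holds the simultaneous (instrument, note, volume) events for one column, played in sequence from left to right.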

Creole [27] is similar to Soundview in that it uses a tablet and stylus to explore two-dimensional colored images. The stylus selects a pixel in the image whose color information is translated into sound. The pixel's RGB value is transformed into the CIE LUV color space, and the proportions of white, grey, black, red, green, yellow, and blue are calculated. Each color is mapped to sound as follows: white = 3520 Hz pure tone, grey = 100–3200 Hz noise, black = 110 Hz pure tone, red = 'u' vowel sound, green = 'i' vowel sound, blue = 262, 311, and 392 Hz tones, and yellow = 1047, 1319, and 1568 Hz tones. Creole was evaluated through color-sound associative memory tasks and object recognition; the Creole coding was found to be easier to memorize and more effective for correctly identifying colors after less than 15 min of training.
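These assignments can be collected into a single lookup table; the "sound kind" labels and the mixing helper below are our own illustration, not Creole's implementation:

```python
# Creole's color-to-sound assignments as reported in the literature.
CREOLE_SOUNDS = {
    "white":  ("pure tones", [3520.0]),
    "grey":   ("band-limited noise", [100.0, 3200.0]),  # noise band edges, Hz
    "black":  ("pure tones", [110.0]),
    "red":    ("vowel", "u"),
    "green":  ("vowel", "i"),
    "blue":   ("pure tones", [262.0, 311.0, 392.0]),
    "yellow": ("pure tones", [1047.0, 1319.0, 1568.0]),
}

def creole_mix(proportions):
    """Given {color: proportion} weights from the CIE LUV decomposition,
    return the active (sound, weight) pairs for the final mix."""
    return [(CREOLE_SOUNDS[c], w) for c, w in proportions.items() if w > 0]

mix = creole_mix({"red": 0.7, "blue": 0.3, "green": 0.0})
```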

Colorphone [28] is a wearable device that represents color by associating each RGB color component with a sine-wave frequency: red with 1600 Hz, green with 550 Hz, and blue with 150 Hz. Low-pass-filtered white noise is mixed into the signal to represent whiteness. This color system was evaluated for color recognition and navigation.
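The core sine mixture can be sketched as follows; the normalization and the omission of the whiteness noise term are simplifications for illustration:

```python
import math

# Component frequencies reported for Colorphone.
RGB_FREQS = {"red": 1600.0, "green": 550.0, "blue": 150.0}

def colorphone_sample(r, g, b, t):
    """Instantaneous amplitude at time t (seconds) for an RGB color with
    components in [0, 1]: each channel drives a sine wave at its assigned
    frequency, weighted by the channel's intensity."""
    return (r * math.sin(2.0 * math.pi * RGB_FREQS["red"] * t)
            + g * math.sin(2.0 * math.pi * RGB_FREQS["green"] * t)
            + b * math.sin(2.0 * math.pi * RGB_FREQS["blue"] * t)) / 3.0
```

Sampling this function at audio rate produces a chord whose component loudnesses track the RGB intensities.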

Most of the previous studies involving SSDs were conducted only for research purposes [17]. Like most assistive technologies, they focus mainly on functionality, with little emphasis on aesthetics [27]. As shown in Table 1, in this paper, we explore a dual-modality color code that exploits the characteristics of hearing and smell to provide an improved user experience beyond existing color codes that use a single sensory modality, such as touch, hearing, or smell.


**Table 1.** Comparison of existing color codes in terms of coding method and the number of colors coded.

Two recent works propose color coding schemes designed for visual artwork color exploration with an emphasis on aesthetics: ColorPoetry [37] and Bartolome et al. [36]. ColorPoetry [37] proposes a color scheme that uses poem narrations with voice modulation to represent colors and their different shades. Bartolome et al. [36], instead, propose a multi-sensory approach involving the auditory and tactile channels that uses musical sounds and temperature cues to convey color. Compared to our work, ColorPoetry uses a uni-sensory approach for color representation and was not evaluated for color identification performance. The approach of Bartolome et al. [36] is multi-sensory, using sound and touch; our work, in contrast, uses sound and smell. We chose these sensory channels based on the results of a preliminary study in which participants proposed the use of music and scent over other sensory channels. In addition, the temperature actuator proposed by Bartolome et al. [36] is limited in the number of colors it can represent spatially at the same time for tactile exploration. We compare the performance evaluations of the different methods in Table 2.


**Table 2.** Comparison of existing multi-modal color codes in terms of user evaluation.

The sound color code study in Reference [33] confirmed that distinguishing light and dark colors through sound (classical music melodies) is easy for most participants. However, when the palette was extended to include warm and cool color variants, participants started to experience difficulties. In this work, we propose using scent in addition to sound to design a color code that serves two purposes. The first is to easily differentiate the light-dark and warm-cool color variants from each other. The second is to take advantage of the additional sensory mode to convey more of the sensorial properties of color. The proposed code decomposes a specific color into a hue and a set of color dimensions (saturated, light, and dark) for each hue. It uses the correspondence between musical instruments' timbre and color to facilitate hue identification, and it employs pitch modulation and selected melodies for aesthetic representation of the color dimensions. Alongside sound, the proposed multi-sensory color code simultaneously integrates smell to represent each hue's saturated, light, and dark dimensions. This approach has several advantages: most users improve their identification of the hue's color dimensions and also report improved expressivity of the color and a richer artwork experience.

### *2.2. Olfactory Representation of Color*

The correspondence between color, scent, and taste has been explored for the food and consumer industries by Frieling [38], who proposed the scent-hue mappings in Table 3. Li et al. [39] developed ColorOdor, a sensory substitution device that uses scents to help blind and visually impaired people identify colors. The device uses a camera to recognize the color of objects and, through a piezoelectric transducer, vaporizes scents following the scent-hue mapping in Table 3. The scent-color mappings were designed through a survey with two visually impaired participants. ColorOdor is intended for the color identification of everyday objects and as a learning tool for children with congenital blindness; its use for visual artwork appreciation was not explored in the study. Lee and Cho [40] explored the implicit associations between color and concepts and described two relationships: color orientation and concept orientation. They used these two relationships to map the association between scent and color. They found that the orange scent shows a directionality toward a highly saturated orange color and toward the concepts of brightness, extroversion, and strong stimulus. This property also applies when describing the characteristics of yellow and red; since orange is a mixture of these two colors, the orange scent has a universality that covers all three. The chocolate scent showed a brown color directionality with low brightness and a concept directionality of round, low, warm, and introverted. Menthol and pine had a similar turquoise color directivity, but the concept of coolness was rated about 22% higher for menthol than for pine; menthol was therefore assigned blue, the color associated with coolness, and pine was assigned green. Using these mappings, they produced a Tactile Color Book [41] that conveys the color information of visual artworks to provide immersive and active exploration for blind and visually impaired people. The book is printed using a special ink impregnated with scent that can be smelled when rubbed. Using this approach, blind and visually impaired students obtained a color identification accuracy of 94.3%. They reported that the scent-color mapping was intuitive and easy to learn, and that it helped them understand the visual artwork's content.


**Table 3.** Odor-hue color codes for olfactory representation of color.

The olfactory system is connected directly to the limbic system, the section of the brain that processes emotions. When the olfactory receptors are stimulated by a scent, they often produce emotive responses in the subject that trigger associated memories [43]. Thus, scents can be used to mediate the exploration of artworks through emotion [44]. Besides emotion, olfactory stimuli can trigger light or dark sensations [45]. Gilbert et al. [42] described how people tend to associate particular scents (Table 3) with specific colors in a non-random way. In addition, they associated the scents with the concepts of darkness and lightness: civet was rated as the darkest, and bergamot oil, aldehyde C-16, and cinnamic aldehyde were rated as the lightest. Kemp et al. [46] described a correspondence between the strength-intensity of a scent and the lightness of a color; for example, strong scents were matched with darker colors. Conversely, Fiore [47] described a correlation between floral scents and bright colors.

Olfactory stimuli can also trigger thermal sensations, such as a feeling of coolness or warmth. Laska et al. [48] found that menthol (peppermint) and cineol (eucalyptus) consistently matched cooling temperature conditions. Madzharov [49] pretested six essential oils, three of which were expected to be perceived as warm scents (warm vanilla sugar, cinnamon, and pumpkin spice) and three as cool scents (eucalyptus-spearmint, peppermint, and winter wonderland). Thirty-three undergraduate students evaluated each scent on perceived temperature and liking. Of the six scents, cinnamon and warm vanilla sugar were rated as the warmest, and peppermint as the coolest. Cinnamon and peppermint differed significantly on the temperature dimension, as did warm vanilla sugar and peppermint. Adams et al. [50] identified lemon, apple, and peach as the brightest and lightest scents, and coffee, cinnamon, and chocolate as the dimmest and darkest odorants. Stevenson et al. [51] argued that the strongest cross-correspondence between scent and color occurs when the scent evokes a specific object (or context), producing a semantic match with a specific color. A summary of the correspondences between scent and color dimensions described in previous works is shown in Table 4.


**Table 4.** Correspondence between scent and color dimensions described in previous literature.

### *2.3. Multi-Sensory Representation of Color Based on Sound and Scent*

While there is a growing number of developments that improve the access of blind and visually impaired people to visual artworks through several of the human senses, it is still uncommon to provide access to color content through multiple senses simultaneously. This work follows such a multi-sensory approach, which has also been explored in Reference [52]. Previous research has explored the relationship between scent and sound. Piesse [53] described correspondences between scents and sounds using the musical notes of the diatonic scale; the match could also be made using variations in sound pitch. Crisinel et al. [54] studied the correspondence between scent and musical features and found that the scents of orange and iris flower could be mapped to higher-pitched sounds compared to the scents of musk and roasted coffee. They expanded the correspondence study to include shapes and emotions: scents judged as joyful, pleasant, and sweet were more frequently associated with higher pitches and round, curved shapes; scents judged as arousing were more frequently associated with angular shapes, but no correlation with pitch was found; and scents judged as brighter were associated with higher pitches and round shapes. In Reference [55], Velasco et al. describe the emotional similarity between olfactory and auditory information, which is potentially crucial for cross-modal correspondences and multi-sensory processing. Olfactory-auditory multi-sensory representation is more frequently explored in media artworks; two examples are the Tate Sensorium [56] and the Perfumery Organ [57]. In the latter, a fragrance is released when the piano is played, using an "incense" that connects fragrance and sound as devised by Piesse [53], who matched the musical note Do with rose, Re with violet, and Mi with acacia scents.

Inspired by the strong correspondences between color, sound, and scent, in this paper, we attempt to identify the correspondence between scent and four color dimensions (warm-cool and light-dark) using semantic match mediation. Once the correspondence has been established, we use it in conjunction with a timbre-hue auditory correspondence to create a multi-sensory color code.

### *2.4. Tactile Representation of Color*

Tactile graphics use raised lines and textures to convey images by touch. They serve as the basis for Tactile Color Patterns (TCPs), which are among the most common forms of accessible color representation for blind and visually impaired people. TCPs are a series of tactile pattern symbols that can be embossed along tactile reproductions of visual artworks to help identify the color of a specific object or area in the artwork. They follow logical patterns to ease memorization and recollection, and they are common in printed media as they can be embossed along with Braille and tactile graphics. Compared to other color representation methods, they have several advantages. One is the immediacy of feedback: a trained user can identify the color as soon as the pattern is touched. Another is that the shape, size, and position of the pattern can communicate additional characteristics of the color [32]. To ease the understanding and learning of the patterns, designers base the tactile patterns of their color symbols on different properties or motifs. For example, Taras et al. [29] designed a TCP for Braille display devices. It uses two dots in a Braille cell to represent the symbols for the primary colors red, blue, and yellow; secondary and tertiary color symbols are represented using combinations of these patterns. Ramsamy-Iranah et al. [30] based their tactile patterns on children's knowledge of basic shapes and their surroundings. For example, the color red is represented by a circle, which the children associate with the red 'bindi' dot used by Hindu women on their foreheads; blue is represented by the outline of a square, as an analogy of the blue rectangular soap used in laundry; and yellow is represented by small dots reminiscent of the pollen in flowers. Shin et al. [31] developed a line-pattern texture by decomposing color into three components: hue, saturation, and value (brightness). These components map into the line-pattern texture as follows: the hue affects the orientation of the lines, the saturation determines the width of the lines, and the value (brightness) dictates the interval (density) of the lines. Stonehouse [58] proposed a TCP based on common geometric shapes. Cho et al. [32] proposed three different TCPs: CHUNJIIN, CELESTIAL, and TRIANGLE. CHUNJIIN is inspired by the three basic components of the Korean alphabet, CELESTIAL is based on curved and straight lines, and TRIANGLE is based on Goethe's color triangle.

TCPs are well suited to printed media and color learning. However, when embossed into the tactile graphic representation of an artwork, they obstruct exploration, making it hard to discern what is part of the visual artwork and what is part of the color pattern. A solution to this problem is to produce two or three tactile graphic versions of the artwork: one without the TCP for easy shape recognition, one with the TCP for color identification, and a third that combines the first two [59]. This approach has the disadvantage of forcing the reader to explore the different versions to build a complete mental image of the artwork [60]. More importantly, as the number of colors increases, so do the difficulty of correct identification and the training required to be proficient. Multi-sensory color coding using sounds and scents, which we explore in this work, can alleviate the complexity of using tactile graphics with tactile color patterns.
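The hue/saturation/value convention of Shin et al. [31] can be sketched as a simple parameter mapping. All numeric ranges, and the direction of the value-to-spacing mapping, are assumptions for illustration rather than the values used in the original work:

```python
def line_pattern(hue_deg, saturation, value):
    """Map an HSV color to tactile line-pattern parameters following the
    convention of Shin et al.: hue -> line orientation, saturation -> line
    width, value (brightness) -> line spacing. Saturation and value are
    given in [0, 1]; the output ranges are illustrative assumptions."""
    orientation_deg = (hue_deg / 360.0) * 180.0   # fold hues onto 0-180 degrees
    width_mm = 0.5 + 1.5 * saturation             # wider lines when saturated
    spacing_mm = 1.0 + 5.0 * (1.0 - value)        # denser lines when brighter
    return orientation_deg, width_mm, spacing_mm
```

For example, a fully saturated, fully bright color halfway around the hue circle would yield lines at 90 degrees with maximum width and minimum spacing.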

### *2.5. Visual Art Appreciation*

Art appreciation rests on basic principles of exploration, technique examination, information analysis, and interpretation that enable the viewer to experience and understand an artwork. The community has long used art appreciation frameworks as a tool to establish common ground and a defined process for appreciating visual artworks. Feldman [61] proposes a framework composed of the study of the artwork's information, analysis of its techniques, interpretation of the artwork's meaning, and value judgment. With the advent of modern art, the perception of art has moved from what was traditionally considered aesthetic or 'high art' toward cognitive experiences in which viewing the artwork produces affective and self-rewarding aesthetic experiences. This has led to the development of new frameworks, such as the information-processing stage model of aesthetic processing (Leder et al. [62]), which considers art appreciation through both aesthetic experiences and judgments in a five-stage process comprising perception of the artwork, explicit classification, implicit classification, cognitive mastering, and evaluation. The stages influence one another, and, while the process follows an order, it also has feedback loops and can repeat its cycle. Most frameworks include a perception phase early in the process, which involves acquiring different perceptual variables, such as complexity, contrast, symmetry, and order. This stage is particularly challenging for blind and visually impaired people, since their visual impairment hinders their ability to assess these perceptual variables; the subsequent stages are then affected, producing impaired aesthetic judgments and emotional responses. This work focuses on improving the perception of the color content of visual artworks through non-visual channels. Facilitating the perceptual analysis of color information will positively influence the later stages of the aesthetic processing model, helping blind and visually impaired people reach better aesthetic judgments and experience richer aesthetic emotions.

### **3. Materials and Methods**
