Article

Unlocking Visual Attraction: The Subtle Relationship between Image Features and Attractiveness

1 School of Engineering Audit, Jiangsu Key Laboratory of Public Project Audit, Nanjing Audit University, Nanjing 211815, China
2 School of Computer Science, Jiangsu Key Laboratory of Public Project Audit, Nanjing Audit University, Nanjing 211815, China
3 School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2024, 12(7), 1005; https://doi.org/10.3390/math12071005
Submission received: 25 February 2024 / Revised: 18 March 2024 / Accepted: 25 March 2024 / Published: 28 March 2024

Abstract:
The interest of advertising designers and operators in crafting appealing images is steadily increasing. With a primary focus on image attractiveness, this study endeavors to uncover the correlation between image features and attractiveness. The ultimate objective is to enhance the accuracy of predicting image attractiveness to achieve visually captivating effects. The experimental subjects encompass images sourced from the Shutterstock website, and the correlation between image features and attractiveness is analyzed through image attractiveness scores. In our experiments, we extracted traditional features such as color, shape, and texture from the images. Through a detailed analysis and comparison of the accuracy in predicting image attractiveness before and after feature selection using Lasso and LassoNet, we confirmed that feature selection is an effective method for improving prediction accuracy. Subsequently, the Lasso and LassoNet feature selection methods were applied to a dataset containing image content features. The results verified an enhancement in the model’s accuracy for predicting image attractiveness with the inclusion of image content features. Finally, through an analysis of the four-dimensional features of color, texture, shape, and content, we identified specific features influencing image attractiveness, providing a robust reference for image design.
MSC:
94A08; 94A16; 91-10; 91C99

1. Introduction

In contemporary society, we find ourselves in an era characterized by an unprecedented deluge of information [1]. Daily, individuals are bombarded with a vast array of data inputs, ranging from incessant social media notifications to a continuous stream of emails, from an abundance of news stories to pervasive advertising campaigns. This phenomenon, known as information overload, presents a formidable challenge, particularly detrimental to the efficacy of advertising initiatives [2,3]. Due to the saturation of information, the public’s attention span is becoming increasingly fragmented, which substantially undermines the impact of advertisements. Consequently, the quest to craft advertisements that captivate within these constrained attention windows has emerged as a critical concern for designers [4,5,6]. In response, some scholars have explored the relative allure of images versus text within advertising contexts, positing a general predilection for visual stimuli [7,8,9]. This inclination towards imagery is often attributed to a diminishing patience for textual information in an era overwhelmed by information surplus. As such, the art of designing compelling visual content has ascended to a pivotal role within the advertising domain [10]. Nevertheless, the goal of quantifying visual appeal remains elusive, often relegated to subjective assessments due to the diverse interpretations that an image may elicit [11]. The variability in engagement and approval observed across social media platforms, where some images garner significant attention and accolades, while others languish in obscurity, underscores the challenge of discerning universally appealing imagery amidst a vast digital landscape [12].
Recent years have witnessed significant advancements in image recognition within the realm of computational vision, propelled by the application of machine learning methods across diverse fields [13,14,15]. The prediction of image popularity poses a formidable task, which has prompted numerous researchers to tackle this issue [16,17]. Presently, the prevailing approach relies on machine learning algorithms for prediction [18,19]. Notably, support vector machine (SVM) and multilayer perceptron (MLP) algorithms are employed for image popularity prediction. These algorithms leverage features such as color, texture, shape, and label information as inputs, constructing a prediction model through training on a designated set. Experimental assessments on test sets substantiate the superior predictive outcomes achieved [20,21].
In the realm of computer vision, machine learning methodologies have found extensive applications in tasks like image classification, object detection, and image segmentation [22]. However, image popularity prediction introduces added complexities. Investigating the extraction of effective image features and predicting image popularity have emerged as pivotal research avenues in digital image processing and computer vision. Image features, encompassing mathematical representations defining specific regions or objects within an image, include attributes like color, texture, and shape. Extracting image features and assessing image popularity contribute to more precise image search and recommendation mechanisms [23,24,25].
Research underscores the idea that an image’s allure extends beyond its visual aesthetics; the information and narrative embedded in the image’s content are crucial determinants of its attractiveness. For instance, Li and Xie discovered that an image’s content influences user engagement on social media, analyzing content features such as the number of faces and emotional states in forwarded images [26]. Content features offer nuanced information about an image’s allure, capturing semantic elements like objects, scenes, and emotions. This research enhances the efficacy of advertisement placement by aligning with audience preferences, thereby optimizing the impact of advertising strategies [27,28].
This study centers on a comprehensive analysis of images, delving into the intricacies of color, texture, shape, and content features to discern the pivotal elements influencing image attractiveness. Building upon this investigation, the paper presents an innovative predictive model for image attractiveness. This model stands poised to significantly contribute to the creation of highly compelling images by offering valuable insights into the factors that drive visual allure. The proposed model not only advances our understanding of image attractiveness but also serves as a practical tool for enhancing the design process, ensuring the production of visually captivating content.

2. Materials and Methods

2.1. Data Sources and Processing

For the purposes of this research, we selected a dataset comprising images from Shutterstock as the primary source of data. Shutterstock’s image collection includes comprehensive details, encompassing various attributes such as titles, keywords, classifications, timestamps, shooting locations, techniques, resolutions, sizes, image quality, and image popularity. This information can serve as a valuable database, facilitating a more precise assessment of image quality and appeal. The specific image categories are presented in Table 1.
We acquired a total of 2080 images from Shutterstock, selecting 80 per category. Employing Shutterstock’s image filtering functionality, we prioritized images based on their ratings during the search process for each category. To ensure a diverse representation, we segmented the search results into four quartiles and randomly chose pages from each quartile to download 20 images. This methodology aimed to achieve a balanced collection of images across different rating levels, thereby enriching the dataset’s diversity.
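The quartile-based sampling of the rating-sorted search-result pages described above can be sketched as follows (a minimal pure-Python illustration; the function name, seed, and per-quartile count are illustrative assumptions, not part of the study's tooling):

```python
import random

def quartile_sample(pages, per_quartile=1, seed=0):
    """Split rating-sorted result pages into four quartiles and randomly
    pick page(s) from each, approximating a rating-balanced download."""
    rng = random.Random(seed)
    q = max(1, len(pages) // 4)
    picks = []
    for i in range(4):
        # The last quartile absorbs any remainder pages.
        chunk = pages[i * q:(i + 1) * q] if i < 3 else pages[3 * q:]
        picks += rng.sample(chunk, min(per_quartile, len(chunk)))
    return picks
```

Downloading 20 images from the page(s) picked in each quartile then yields the 80 images per category reported above.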
To guarantee the dataset’s quality and the reliability of future analyses, we performed thorough data cleaning and preprocessing on the collected images. This involved filtering out images that were either irrelevant or devoid of substantial informational content, including those that were blurry, duplicative, unclear, or unrelated to our research focus, such as illustrations and vector images. We made a conscious decision to retain only photographs that conveyed a strong sense of realism. Through this meticulous filtering process, we condensed our collection to 1040 high-quality professional images, distributing an average of 40 images across each of the 26 predetermined categories. Additionally, we documented the ratings (i.e., popularity) of each image for further analysis.

2.2. Image Processing

2.2.1. The Color Features of the Image

Color perception is one of the most intuitive aspects of human perception, with each object displaying a unique color due to the reflection and absorption of light. To represent colors in computers, researchers have developed various mathematical models. The most widely used model in modern computer science and digital image processing is RGB (red, green, and blue) [29]. The intensity of each of the three colors ranges from 0 to 255. Through the nuanced blending of these colors, the entire spectrum becomes visible. This model finds extensive applications in both computer vision and photography [30].
The HSV (hues, saturation, and value) model is commonly employed in image processing and editing [31]. Compared to the RGB model, HSV offers a more intuitive means of controlling and representing color. In HSV, ‘hue’ is measured in angular degrees and is used to describe the type of color. ‘Saturation’ represents the purity or intensity of a color, and ‘value’ (also known as ‘luminance’) indicates the degree of lightness or darkness of a color. Higher luminance values correspond to brighter colors, while lower luminance values indicate darker colors [32,33].
We computed and extracted the color histogram and color moment features of the image. A color histogram is a feature representation used in computational vision and image processing. It quantifies the likelihood of various colors occurring in an image and categorizes pixels within an image by selecting a color model, such as RGB or HSV. For instance, in the HSV model, the color histogram tallies the number of pixels in three distinct color channels (hue, saturation, and value), providing a means to characterize differences between images. Color moment features, on the other hand, are a method for converting color information into numerical values, facilitating tasks like image analysis and recognition. In both RGB and HSV color spaces, each image has three color channels. First-order color moments describe the color distribution within an image by calculating statistical properties like the mean, variance, skewness, and other features of the pixel values in each color channel.
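As a concrete illustration, the color histogram and the first three color moments (mean, standard deviation, skewness) described above can be computed per channel as follows (a pure-Python sketch on a flat list of channel values; in practice the inputs would be pixel arrays from an RGB or HSV decomposition):

```python
def color_moments(channel):
    """First three color moments of one channel: mean, std, skewness."""
    n = len(channel)
    mean = sum(channel) / n
    std = (sum((v - mean) ** 2 for v in channel) / n) ** 0.5
    # Sign-preserving cube root of the third central moment.
    third = sum((v - mean) ** 3 for v in channel) / n
    skew = abs(third) ** (1 / 3) * (1 if third >= 0 else -1)
    return mean, std, skew

def histogram(channel, bins=8, vmax=256):
    """Normalized color histogram: fraction of pixels per intensity bin."""
    counts = [0] * bins
    for v in channel:
        counts[min(v * bins // vmax, bins - 1)] += 1
    n = len(channel)
    return [c / n for c in counts]
```

Applied to each of the three channels of an RGB or HSV image, this yields a fixed-length feature vector comparable across images.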

2.2.2. Texture Features of the Image

Texture features in an image describe the distribution and arrangement relationships between pixel points [34,35,36]. They can reveal features such as roughness, directionality, structure, and color of an object’s surface. These features are valuable for identifying and classifying different objects or regions within an image. The texture features that we extracted include Gabor filter features, grayscale covariance matrix features, Tamura texture features, and local binary pattern (LBP) features [36].
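A minimal version of one of these descriptors, the basic 3×3 local binary pattern, can be written as follows (a pure-Python sketch on a grayscale image given as nested lists; production code would typically use an optimized library implementation):

```python
def lbp_code(img, y, x):
    """Basic 3x3 LBP code for pixel (y, x): each of the 8 neighbors
    contributes a bit, 1 if it is >= the center pixel, read clockwise
    from the top-left neighbor."""
    center = img[y][x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dy, dx in offsets:
        code = (code << 1) | (1 if img[y + dy][x + dx] >= center else 0)
    return code

def lbp_histogram(img):
    """256-bin LBP histogram over all interior pixels of the image."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist
```

The normalized histogram of these codes is the LBP texture feature; Gabor, GLCM, and Tamura features are computed analogously from their own filter or statistic definitions.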

2.2.3. Shape Features of the Image

Shape features in an image are relatively stable attributes that remain unaffected by changes in color and brightness [37,38]. These features typically represent an object’s shape in terms of its regions, boundaries, and skeletons. Generally, before extracting image feature parameters, an initial edge detection step is performed to obtain the edge profile of the target image. Shape features can be broadly categorized into contour features and region features. Contour features primarily describe the object’s contour shape, while region features encompass information related to the entire shape region [39,40].

2.2.4. Content Features of the Image

While features such as color, shape, and texture are valuable for studying image attractiveness, they may not always provide a comprehensive description. In many cases, additional content features are required to capture richer information about image attractiveness, including semantic details like objects, scenes, and emotions within the image [41,42]. These content features enable a more thorough and accurate assessment of image attractiveness.
Hence, this study proposes the incorporation of content features, which encompass information such as the type and quantity of objects within the image and the spatial relationships among neighboring objects. We achieve this by categorizing objects in the image and extracting data on the number and location of various object types, such as people, cars, animals, buildings, furniture, and others. Additionally, this study examines various aspects of content features, including the relative positions of objects in the image and the distribution of their centers of gravity.
In summary, this study extracts color features, shape features, texture features, and content features from the image dataset. The specific feature indicators are shown in Supplementary File.

2.3. Model Building

2.3.1. Lasso Model

The Lasso model is a linear regression model augmented with L1 regularization, making it a valuable tool for feature selection and regression tasks. This regularization technique aims to optimize model parameters by minimizing a specific objective function [43,44,45]. One of the key advantages of Lasso is its ability to drive the coefficients of many features to precisely zero, effectively facilitating feature selection. In the context of prediction, the Lasso model utilizes its trained parameters to make predictions on new data. These predictions can serve various purposes, including classification or regression tasks. Lasso models excel when dealing with high-dimensional datasets, as they can effectively handle scenarios with numerous features. Additionally, a Lasso model’s automatic feature selection capabilities enable it to identify and retain only the most relevant features. This is particularly advantageous compared to traditional regression models, as Lasso models effectively address the issue of multicollinearity. As a result, Lasso models yield more robust and interpretable models in many cases. Equation (1) is the objective function of the Lasso model, where L(β) denotes the Lasso objective, Y is the vector of target values, X is the matrix of predictor variables, β is the vector of coefficients to be estimated, ‖Y − Xβ‖² is the sum of squared errors between the predicted values (Xβ) and the actual targets (Y), λ is the regularization parameter, and ‖β‖₁ denotes the L1 norm of the coefficient vector β.
L(β) = ‖Y − Xβ‖² + λ‖β‖₁ (1)
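One standard way to minimize this objective is proximal gradient descent (ISTA), where the soft-thresholding operator is what produces Lasso's exact zero coefficients. The following is an illustrative solver, not the implementation used in the study:

```python
def soft_threshold(z, t):
    """Proximal operator of the L1 norm: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_ista(X, Y, lam, step=0.01, iters=5000):
    """Minimize ||Y - X b||^2 + lam * ||b||_1 by proximal gradient."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(iters):
        resid = [sum(X[i][j] * b[j] for j in range(p)) - Y[i]
                 for i in range(n)]
        grad = [2 * sum(X[i][j] * resid[i] for i in range(n))
                for j in range(p)]
        # Gradient step on the squared error, then L1 proximal step.
        b = [soft_threshold(b[j] - step * grad[j], step * lam)
             for j in range(p)]
    return b
```

Features whose coefficients end at exactly zero are the ones Lasso discards, which is the feature-selection behavior exploited in this study.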

2.3.2. LassoNet Model

The LassoNet model differs from the Lasso method in that it is a neural network-based feature selection method that borrows the Skip Connection mechanism from ResNet, aiming to be better adapted to the task of feature selection and prediction in high-dimensional data [46,47]. LassoNet combines elements of both Lasso and neural networks to perform feature selection and regression on high-dimensional datasets [48].
The fundamental idea behind LassoNet is to embed a Lasso-regularized linear model within a neural network, allowing it to simultaneously learn feature representation and predict the target variable. The LassoNet algorithm consists of two main components: a Lasso-regularized linear model and a neural network. In the Lasso-regularized linear model, the objective function comprises two parts: the fitting error of the data and an L1 penalty term. The L1 penalty term sums the absolute values of coefficients, promoting sparsity in the coefficients and, thus, feature selection. In the neural network component, the objective function consists of a squared error and an L2 penalty term, which is the sum of the squares of all weights. The L2 penalty term helps control the magnitude of weights to prevent overfitting. LassoNet trains the model by simultaneously minimizing these two objective functions, where the strength of both penalty terms is controlled by hyperparameters. This combination of Lasso and neural networks enables LassoNet to perform feature selection and regression on high-dimensional datasets, making it more adaptable to complex data patterns. Figure 1 illustrates the LassoNet model used in this study; the yellow line marks the Skip Connection structure that LassoNet borrows from ResNet.
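The coupling between the skip connection and the network can be illustrated with a simplified version of LassoNet's projection step: a feature's first-layer weights are bounded by a multiple M of its (soft-thresholded) skip weight, so a feature drops out of the whole network only when its skip weight reaches zero. This is a loose sketch of the idea, not the exact hierarchical proximal operator of the original algorithm:

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (L1 proximal operator)."""
    return z - t if z > t else (z + t if z < -t else 0.0)

def lassonet_project(theta, W, lam, M, step):
    """Simplified sketch of LassoNet's feature-coupling step.

    theta: skip-connection (linear) weights, one per feature.
    W: first-layer weights; W[j] lists the weights leaving feature j.
    Soft-threshold each skip weight, then clip the corresponding
    first-layer weights so that |W[j][k]| <= M * |theta[j]|.
    """
    new_theta, new_W = [], []
    for j, t in enumerate(theta):
        t = soft_threshold(t, step * lam)   # sparsify the skip path
        bound = M * abs(t)
        # A zeroed skip weight forces the feature out of the network too.
        new_W.append([max(-bound, min(bound, w)) for w in W[j]])
        new_theta.append(t)
    return new_theta, new_W
```

Under this constraint, the L1 penalty on the skip path alone controls which features survive, which is what makes the selection interpretable despite the nonlinear network.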

2.4. Study Design

In this study, we undertake the design of two sets of experiments with the aim of predicting the popularity of 1040 images sourced from the Shutterstock website. The initial set of experiments involves the exclusion of content features from the images. Instead, we exclusively employ the color, texture, and shape features as the training set. Utilizing the Lasso and LassoNet methods for feature selection, we subsequently apply a classification algorithm to predict the popularity of the images. In the second set of experiments, we replicate the procedures of the first set while incorporating the content features of the images, enabling us to predict image popularity using this augmented dataset. The experimental flow is shown in Figure 2.
When employing the Lasso and LassoNet methods for feature selection, cross-validation is utilized to determine the optimal regularization parameter lambda. This parameter governs the number and weighting of features within the model. Selecting various lambda values yields different feature sets and corresponding model performances. During the cross-validation process, the dataset is partitioned into multiple mutually exclusive subsets. Subsequently, different subsets are utilized to train and validate the model, thereby obtaining an estimate of the average performance. By employing cross-validation to select lambda values, we ensure that the model does not become overfitted to a specific dataset, thereby enhancing the model’s generalization capability.
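The lambda-selection loop described above can be sketched as follows (a generic k-fold scheme; `score_fn` stands in for fitting Lasso or LassoNet at a given lambda on the training indices and scoring the held-out fold):

```python
import random

def kfold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k mutually exclusive folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def select_lambda(lambdas, n, k, score_fn, seed=0):
    """Pick the lambda with the best mean validation score across folds."""
    folds = kfold_indices(n, k, seed)
    best_lam, best_score = None, float("-inf")
    for lam in lambdas:
        scores = []
        for i in range(k):
            val = folds[i]
            train = [j for f in folds[:i] + folds[i + 1:] for j in f]
            scores.append(score_fn(train, val, lam))
        mean = sum(scores) / k
        if mean > best_score:
            best_lam, best_score = lam, mean
    return best_lam
```

Because every candidate lambda is scored on data the model was not fitted to, the chosen value reflects generalization rather than in-sample fit.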
Meanwhile, to validate the efficacy of our proposed LassoNet method in feature selection, we randomly generated 500 artificial features based on the original dataset and merged them with the original dataset to create a new feature set, thereby simulating the presence of “noise” in real data. Following feature selection, the study ranked the selected features based on their importance coefficients and identified the top 100 features as candidates. Subsequently, we assessed the number of actual features among these 100 candidate features, i.e., the count of genuine features included. By comparing this with a dataset comprising randomly generated artificial features, we could more accurately evaluate the effectiveness of the LassoNet method.
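The bookkeeping of this noise check can be sketched as follows (the importance scores here are random stand-ins purely to show the counting; in the study they come from the fitted LassoNet model):

```python
import random

def noise_validation(true_names, n_fake=500, top_k=100, seed=0):
    """Mix real features with randomly generated fakes, rank by an
    importance score, and count real features among the top k."""
    rng = random.Random(seed)
    # Stand-in scores: real features boosted so the bookkeeping is visible.
    scored = [(name, rng.random() + 1.0) for name in true_names]
    scored += [(f"fake_{i}", rng.random()) for i in range(n_fake)]
    ranked = sorted(scored, key=lambda kv: kv[1], reverse=True)[:top_k]
    return sum(1 for name, _ in ranked if not name.startswith("fake_"))
```

The closer the returned count is to the number of genuine features, the better the selector separates signal from the injected noise.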

3. Results

3.1. Lasso Feature Selection Results

We first calculated the correlation between image popularity and the feature set and produced a correlation heat map (see Figure 3). Details of the parameter tuning of the Lasso and LassoNet models are given in the Supplementary File. Without considering the inclusion of image content features, our study conducted Lasso feature selection [44] (refer to the Supplementary File for details on the Lasso model regularization parameter tuning process; here, we selected λ = 0.071). This process led to the identification of eight salient feature variables from a pool of fifty-nine image features (refer to Table 2; Post-EST OLS denotes the coefficient obtained from a post-estimation ordinary least squares fit). In the realm of color features, the variable “Black” exhibited a negative correlation with attractiveness, while “Gray distribution” showed a positive correlation. Within the domain of shape features, both “Image size” and “Low gradient hue” demonstrated positive correlations with attractiveness, whereas “Aspect ratio” exhibited a negative correlation. Regarding texture features, “Tamura roughness”, “Saturation energy (GLCM)”, and “GLCM homogeneity” displayed negative correlations.
Upon incorporating the content features of the images, our study employed Lasso feature selection (refer to the Supplementary File for details on the regularization parameter selection process; in this instance, λ = 0.098). Subsequently, the dataset, initially comprising 89 image features, underwent refinement, isolating 15 salient variables, of which 8 are associated with image content (refer to Table 3).

3.2. LassoNet Feature Selection Results

In investigating the relationship between image features and attractiveness, specifically without the inclusion of content features, a dataset comprising 59 features (excluding content features) underwent processing through the LassoNet model. The feature selection results, depicted in Figure 4, reveal that the highest accuracy, reaching 0.719, is achieved when 38 features are selected. Detailed characterization information can be found in the Supplementary File.
After incorporating the content features of the images in our study, we utilized the LassoNet method to select features from the initially extracted set of 89 image features. Notably, we discovered that 81 features exhibited the highest accuracy, as illustrated in Figure 5. Importantly, among the top 50 highest-scoring features, content features alone constituted 27.3%, underscoring the crucial role of content features in studying the influence of image attractiveness.
The results displaying the importance scores of image features prior to the inclusion of image content features are presented in Figure 6; for detailed feature importance scores, please refer to the Supplementary File. The sum of the importance scores for each feature type was calculated, and the outcomes for all selected features in the absence of content features are depicted in Figure 6a. Texture features exhibit the highest combined score, constituting 75.2%, followed by color features at 15.8%, while shape features have the lowest share, standing at 9.0%. Furthermore, we computed the average importance scores for each feature type, as illustrated in Figure 6b. Consistent with the previous findings, the average importance score for image texture features remains the highest at 55.4%, followed by shape features at 23.4% and color features at 21.2%.
With the inclusion of content features, as depicted in Figure 7, texture features maintained their position as the highest contributors, accounting for 49.6% of the total importance ratings. Content features followed closely, representing 28.1% in the sum of importance ratings. When considering the average importance scores, content features rank third with a percentage of 22.1%. This underscores the significance of content features in determining image attractiveness.
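The per-type aggregation behind Figures 6 and 7, i.e., summed and averaged importance shares per feature group, can be sketched as follows (illustrative feature names; the real inputs are the LassoNet importance scores):

```python
def group_shares(scores, groups):
    """Percentage share of summed and mean importance per feature group.

    scores: {feature: importance}; groups: {feature: group name}.
    Returns (sum_share, mean_share), each mapping group -> percentage.
    """
    sums, counts = {}, {}
    for feat, s in scores.items():
        g = groups[feat]
        sums[g] = sums.get(g, 0.0) + s
        counts[g] = counts.get(g, 0) + 1
    total = sum(sums.values())
    sum_share = {g: 100 * v / total for g, v in sums.items()}
    means = {g: sums[g] / counts[g] for g in sums}
    mtotal = sum(means.values())
    mean_share = {g: 100 * v / mtotal for g, v in means.items()}
    return sum_share, mean_share
```

Summed shares reward groups with many selected features, while mean shares normalize by group size, which is why the two rankings in the text can differ.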

3.3. Image Popularity Prediction

Initially, excluding any content features, Table 4 reveals a relatively low prediction accuracy across all models. The random forest model outperformed the others with an accuracy of 0.498, while the naive Bayes model performed the worst, at 0.418. Introducing Lasso feature selection resulted in an improved prediction accuracy for all models, with the naive Bayes and k-nearest neighbor models exhibiting the most significant enhancements, improving by 0.117 and 0.185, respectively. Furthermore, Lasso feature selection effectively aided in feature screening. The naive Bayes model screened six features, while the decision tree and support vector machine screened thirteen and nineteen features, respectively. It is important to note that the effectiveness of Lasso feature selection is closely tied to the appropriate choice of regularization parameters; an improper choice may compromise the predictive accuracy.
After incorporating the LassoNet method and content features, there was a substantial improvement in the predictive accuracy of all models. Through cross-validation, we determined the regularization parameter for the LassoNet method to be alpha = 0.003. The models selected using the LassoNet method exhibited superior performance in both prediction accuracy and feature selection compared to those selected using Lasso features alone.
With the inclusion of the LassoNet method and content features, the accuracy of the naive Bayes (NB) model increased from 0.418 to 0.814, the accuracy of the k-nearest neighbor (KNN) model increased from 0.478 to 0.852, and the accuracy of the random forest (RF) model increased from 0.498 to 0.853. Similarly, the accuracy of the support vector machine (SVM) model improved from 0.489 to 0.846. Furthermore, the decision tree (DT) model’s accuracy improved from 0.438 to 0.831.
It is noteworthy that the number of selected features significantly decreased for all models compared to when no content features were added. Particularly, the support vector machine model retained only seven features. This indicates that the LassoNet method excels in selecting the most relevant features for the prediction task, thereby enhancing the generalization ability of the models.

4. Discussion

This study empirically validates the impact of integrating content features and employing various feature selection methods in machine learning on the prediction accuracy and feature screening efficacy. The experimental findings reveal a substantial enhancement in the prediction accuracies of all models when content features are added, compared to models without such features. Notably, the LassoNet method emerges as particularly adept at identifying the most pertinent features for the prediction task, thereby further elevating the prediction accuracies and the efficacy of feature selection.
The introduction of content features alone led to an improved prediction accuracy across all models, with the most noteworthy enhancements observed in the naive Bayes and k-nearest neighbor models. The utilization of Lasso feature selection further boosted the prediction accuracy, yet its effectiveness hinged closely on the judicious selection of regularization parameters. However, when the LassoNet method was employed in conjunction with content features, the prediction accuracy of all models experienced even more substantial improvements. This underscores the superiority of the LassoNet method over Lasso feature selection, both in terms of prediction accuracy and its ability to discern the features most relevant to the prediction task.
In the realm of color features, this study discerned a positive correlation between light effect, saturation standard deviation, brightness standard deviation, and the colors brown, gray, orange, pink, and yellow with image popularity. This implies that the saturation, the luminance, and the utilization of specific colors such as brown, gray, orange, pink, and yellow exert a certain influence on image attractiveness [49]. The pronounced impact of color on image allure serves to elevate both the recognition and appeal of the image, fostering heightened sensory impact and memory retention [49,50]. Concurrently, variations in saturation and brightness emerge as pivotal factors in the selection and favorability of images, with light effects potentially indicative of picture quality and realism [51]. Thus, it is logical to posit that individuals lean towards images characterized by higher color saturation and brightness, and a predilection for brown, gray, orange, pink, and yellow. These colors elicit positive visual experiences and emotional resonance, underscoring their efficacy in visual design to enhance the overall attractiveness and aesthetics of images.
Regarding texture features, this investigation revealed a positive correlation between hue wavelet level 3, saturation wavelet level 1, saturation wavelet level 2, the saturation wavelet mean, the luminance wavelet mean, hue contrast, hue correlation, hue uniformity, saturation uniformity, luminance correlation, and GIST with image popularity. Conversely, grayscale distribution, hue wavelet level 1, hue wavelet level 2, saturation wavelet level 3, luminance wavelet level 2, luminance wavelet level 3, the hue wavelet mean, roughness, contrast, directionality, hue energy, saturation correlation, saturation energy, luminance contrast, luminance energy, and luminance uniformity exhibited a negative correlation with image popularity. As a result, the popularity of an image is intricately linked with factors such as color depth, texture complexity, and contrast [52,53,54]. Optimal visual appeal is achieved when the texture and color of an image are diverse and layered, complemented by colors that harmonize well with the environment. Moderately enhancing the contrast serves to heighten attention and perception. These findings underscore the significance of texture and color as pivotal factors in people’s preferences and enjoyment of images, providing valuable guidance for diverse visual designers [55,56,57].
In the context of shape features, this study uncovered positive correlations between the degree of detail, mean region size, low-DOF saturation (low depth-of-field saturation), and image size and image popularity. Conversely, the number of edges, low-DOF hue, low-DOF brightness, tristimulus saturation, tristimulus brightness, and aspect ratio exhibited negative correlations with image popularity. This implies that images boasting greater detail, larger dimensions, higher low-DOF saturation, and larger average region sizes are deemed more attractive. Furthermore, high saturation within a shallow depth of field suggests that images with clearer themes and fewer cluttered elements are more likely to capture attention. Conversely, fewer edges, low-DOF hue and brightness, tristimulus saturation and brightness, and larger aspect ratios were negatively linked to image attractiveness. These findings indicate a preference for images rich in detail and with overt themes, with reduced interest in images that are excessively monotonous or cluttered. Additionally, larger image and average region sizes contribute to an enhanced visual experience and garner increased viewer attention. Moreover, smaller aspect ratios and higher low-DOF saturation can enhance an image’s prominence and facilitate easier distinction, offering insight into why individuals prefer such images [58].
Regarding content features, this study identified positive correlations between the number of people, number of airplanes, airplane regions, number of details, number of bottles, number of buses, number of bus regions, number of cats, number of motorcycles, number of chairs, face regions, congestion clutter, the gradient indicated by intensity, and Beltrami traffic features and the popularity of the image. Conversely, the bottle region, number of cars, car region, cat region, motorcycle region, chair region, number of faces, face angle, agency region, color patch, grayscale patch, and number of people exhibited negative correlations with the popularity of the image. Hence, it is reasonable to posit that individuals are inclined towards images depicting diverse scenes and landscapes that evoke resonance, while eschewing those that are overly intricate and crowded [59]. These findings align with the viewing habits and psychological preferences of individuals, though specific preferences may vary in consideration of other factors, such as culture and social customs. These conclusions serve as a guide for designing images that are more appealing and visually striking. It is crucial to note that these features are not mutually independent and may have interactive relationships. In practical applications, selecting an appropriate combination of features is essential to enhancing the overall attractiveness of the image [60].
On the other hand, Spape et al.'s study on generating personally attractive images with a brain–computer interface (BCI) represents a cutting-edge effort to integrate neuroscience techniques into image generation [61]. By analyzing an individual's brainwave response to images and generating images that match that individual's preferences, the method has a clear advantage in personalized image design. However, while the BCI method excels at highly personalized image generation, its reliance on specialized hardware, high technical requirements, and complex experimental setup and data processing limit its wide application. In contrast, our study lowers the technical barrier and broadens the applicability of the findings across domains through a general data analysis approach and a readily reproducible experimental procedure, though it cannot offer solutions as direct and personalized as brainwave-based BCI methods.

5. Limitations

In this study, we primarily focused on quantitatively analyzing the relationship between image features and attractiveness using specific feature selection methods, which highlighted the necessity for further experimental research to validate these features’ applicability in real-world advertising design. Additionally, we acknowledge two limitations in our work. First, we did not explore the potential interactions between different categories of features, an aspect that warrants deeper investigation in future research endeavors to understand the complex dynamics between these features fully. Second, our study’s reliance on a limited range of image data sources poses a limitation to the generalizability and robustness of our findings. Future work will aim to incorporate a broader array of image data, such as from Wallpaper Engine, to enhance the model’s applicability and validity across various contexts. These limitations are integral to our study’s discussion and will guide the direction of our subsequent research efforts.

6. Conclusions

The findings of this study indicate that the inclusion of content features and the utilization of the LassoNet method contribute to enhanced prediction accuracy and feature selection in machine learning models. This offers novel ideas and methods applicable to popular image prediction, image recognition, classification, and recommendation fields. In future research, additional validation can be conducted on diverse datasets, exploring the application of alternative machine learning algorithms. Moreover, further optimization of the model’s regularization parameter selection and feature screening process can be pursued to investigate the impacts of more intricate image features on the model.
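To make the role of the regularization parameter concrete, here is a minimal coordinate-descent Lasso in NumPy: the penalty `lam` shrinks uninformative feature weights exactly to zero, which is the feature-screening behavior discussed above. This is a textbook sketch on synthetic data, not the tuned pipeline used in this study.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso minimizing
    (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding feature j, then soft-threshold.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r / n
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

# Synthetic data: only features 0 and 3 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=200)
w = lasso_cd(X, y, lam=0.1)
# The weights on the four noise features are driven to (near) zero.
```

In practice `lam` would be chosen by cross-validation, as in the Lambda tuning procedures reported in Tables S2 and S5.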

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math12071005/s1, Table S1: Extracted image features. Table S2: Lambda parameter tuning procedure for Lasso models. Table S3: Importance scores of features selected by the LassoNet model in Study 1 related to image attractiveness. Table S4: Importance scores of features selected by the LassoNet model related to image attractiveness in Study 2. Table S5: Results of the Lambda cross-validation process.

Author Contributions

Conceptualization, Z.S. and P.W.; Data curation, K.Z. and Y.J.; Formal analysis, Y.Z.; Investigation, K.Z. and Y.J.; Methodology, K.Z., Y.Z. and P.W.; Project administration, P.W.; Resources, Y.Z.; Software, Y.J.; Supervision, Z.S. and P.W.; Validation, Z.S. and Y.J.; Writing—original draft, K.Z. and Y.Z.; Writing—review and editing, Z.S. and P.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Major Project of the Natural Science Foundation of the Jiangsu Education Department (22KJA630001) and the National Natural Science Foundation of China (72271126).

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Arnold, M.; Goldschmitt, M.; Rigotti, T. Dealing with Information Overload: A Comprehensive Review. Front. Psychol. 2023, 14, 1122200. [Google Scholar] [CrossRef] [PubMed]
  2. Bawden, D.; Robinson, L. Information Overload: An Overview. In Oxford Encyclopedia of Political Decision Making; Oxford University Press: Oxford, UK, 2020; pp. 1–60. [Google Scholar] [CrossRef]
  3. Hwang, M.I.; Lin, J.W. Information Dimension, Information Overload and Decision Quality. J. Inf. Sci. 1999, 25, 213–218. [Google Scholar] [CrossRef]
  4. Cheng, J.; Sun, A.; Zeng, D. Information Overload and Viral Marketing: Countermeasures and Strategies. In Advances in Social Computing; Chai, S.-K., Salerno, J.J., Mabry, P.L., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6007, pp. 108–117. ISBN 978-3-642-12078-7. [Google Scholar]
  5. Rehman, A.; Ahmad, I.; Amin, K.; Noor, N.; Rehman, A. Marketing Overload: The Impact of Information Overload on Brand Recall (A Case Study of Students of the University of Swat). J. Soc. Sci. Rev. 2023, 3, 70–78. [Google Scholar] [CrossRef]
  6. Meyer, J.-A. Information Overload in Marketing Management. Mark. Intell. Plan. 1998, 16, 200–209. [Google Scholar] [CrossRef]
  7. Samani, M.C. Visual Images in Advertisements: An Alternative Language. J. Komunikasi Malays. J. Commun. 2006, 22, 252–273. [Google Scholar]
  8. Chen, Y.; Jin, O.; Xue, G.-R.; Chen, J.; Yang, Q. Visual Contextual Advertising: Bringing Textual Advertisements to Images. In Proceedings of the AAAI Conference on Artificial Intelligence, Atlanta, GA, USA, 11–15 July 2010; Volume 24, pp. 1314–1320. [Google Scholar]
  9. Obermiller, C.; Sawyer, A.G. The Effects of Advertisement Picture Likeability on Information Search and Brand Choice. Mark. Lett. 2011, 22, 101–113. [Google Scholar] [CrossRef]
  10. Kergoat, M.; Meyer, T.; Merot, A. Picture-Based Persuasion in Advertising: The Impact of Attractive Pictures on Verbal Ad’s Content. J. Consum. Mark. 2017, 34, 624–635. [Google Scholar] [CrossRef]
  11. Tang, X.; Luo, W.; Wang, X. Content-Based Photo Quality Assessment. IEEE Trans. Multimed. 2013, 15, 1930–1943. [Google Scholar] [CrossRef]
  12. Abousaleh, F.S.; Cheng, W.-H.; Yu, N.-H.; Tsao, Y. Multimodal Deep Learning Framework for Image Popularity Prediction on Social Media. IEEE Trans. Cogn. Dev. Syst. 2021, 13, 679–692. [Google Scholar] [CrossRef]
  13. Han, J.; Li, H.; Lin, H.; Wu, P.; Wang, S.; Tu, J.; Lu, J. Depression Prediction Based on LassoNet-RNN Model: A Longitudinal Study. Heliyon 2023, 9, e20684. [Google Scholar] [CrossRef]
  14. Lin, H.; Han, J.; Wu, P.; Tang, H.; Zhu, L.; Wang, J.; Tu, J. Machine Learning and Human-machine Trust in Healthcare: A Systematic Survey. CAAI Trans. Intell. Technol. 2023; in press. [Google Scholar] [CrossRef]
  15. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
  16. Gelli, F.; Uricchio, T.; Bertini, M.; Del Bimbo, A.; Chang, S.-F. Image Popularity Prediction in Social Media Using Sentiment and Context Features. In Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia, 26–30 October 2015; pp. 907–910. [Google Scholar]
  17. McParlane, P.J.; Moshfeghi, Y.; Jose, J.M. “Nobody Comes Here Anymore, It’s Too Crowded”; Predicting Image Popularity on Flickr. In Proceedings of the International Conference on Multimedia Retrieval, Glasgow, UK, 1–4 April 2014; pp. 385–391. [Google Scholar]
  18. Gayberi, M.; Oguducu, S.G. Popularity Prediction of Posts in Social Networks Based on User, Post and Image Features. In Proceedings of the 11th International Conference on Management of Digital EcoSystems, Limassol, Cyprus, 12–14 November 2019; pp. 9–15. [Google Scholar]
  19. Wang, W.; Zhang, W. Combining Multiple Features for Image Popularity Prediction in Social Media. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1901–1905. [Google Scholar]
  20. Cappallo, S.; Mensink, T.; Snoek, C.G.M. Latent Factors of Visual Popularity Prediction. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, Shanghai, China, 16–23 June 2015; pp. 195–202. [Google Scholar]
  21. Hidayati, S.C.; Chen, Y.-L.; Yang, C.-L.; Hua, K.-L. Popularity Meter: An Influence- and Aesthetics-Aware Social Media Popularity Predictor. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1918–1923. [Google Scholar]
  22. Georgiou, T.; Liu, Y.; Chen, W.; Lew, M. A Survey of Traditional and Deep Learning-Based Feature Descriptors for High Dimensional Data in Computer Vision. Int. J. Multimed. Inf. Retr. 2020, 9, 135–170. [Google Scholar] [CrossRef]
  23. Egmont-Petersen, M.; de Ridder, D.; Handels, H. Image Processing with Neural Networks—A Review. Pattern Recognit. 2002, 35, 2279–2301. [Google Scholar] [CrossRef]
  24. Johnson, C.R.; Hendriks, E.; Berezhnoy, I.J.; Brevdo, E.; Hughes, S.M.; Daubechies, I.; Li, J.; Postma, E.; Wang, J.Z. Image Processing for Artist Identification. IEEE Signal Process. Mag. 2008, 25, 37–48. [Google Scholar] [CrossRef]
  25. Van der Walt, S.; Schönberger, J.L.; Nunez-Iglesias, J.; Boulogne, F.; Warner, J.D.; Yager, N.; Gouillart, E.; Yu, T. Scikit-Image: Image Processing in Python. PeerJ 2014, 2, e453. [Google Scholar] [CrossRef] [PubMed]
  26. Li, Y.; Xie, Y. Is a Picture Worth a Thousand Words? An Empirical Study of Image Content and Social Media Engagement. J. Mark. Res. 2020, 57, 1–19. [Google Scholar] [CrossRef]
  27. Iyer, A.; Webster, J.; Hornsey, M.J.; Vanman, E.J. Understanding the Power of the Picture: The Effect of Image Content on Emotional and Political Responses to Terrorism. J. Appl. Soc. Psychol. 2014, 44, 511–521. [Google Scholar] [CrossRef]
  28. Weinberg, A.; Hajcak, G. Beyond Good and Evil: The Time-Course of Neural Activity Elicited by Specific Picture Content. Emotion 2010, 10, 767–782. [Google Scholar] [CrossRef]
  29. Süsstrunk, S.; Buckley, R.; Swen, S. Standard RGB Color Spaces. In Proceedings of the Color and Imaging Conference; Society of Imaging Science and Technology, Scottsdale, AZ, USA, 16–19 November 1999; Volume 7, pp. 127–134. [Google Scholar]
  30. Sharma, S.; Tandukar, J.; Bista, R. Generating Harmonious Colors through the Combination of N-Grams and K-Means. J. Comput. Theor. Appl. 2023, 1, 140–150. [Google Scholar] [CrossRef]
  31. Chernov, V.; Alander, J.; Bochko, V. Integer-Based Accurate Conversion between RGB and HSV Color Spaces. Comput. Electr. Eng. 2015, 46, 328–337. [Google Scholar] [CrossRef]
  32. Ganesan, P.; Rajini, V. Assessment of Satellite Image Segmentation in RGB and HSV Color Space Using Image Quality Measures. In Proceedings of the 2014 International Conference on Advances in Electrical Engineering (ICAEE), Vellore, India, 9–11 January 2014; pp. 1–5. [Google Scholar]
  33. Saravanan, G.; Yamuna, G.; Nandhini, S. Real Time Implementation of RGB to HSV/HSI/HSL and Its Reverse Color Space Models. In Proceedings of the 2016 International Conference on Communication and Signal Processing (ICCSP), Melmaruvathur, India, 6–8 April 2016; pp. 462–466. [Google Scholar]
  34. Liao, X.; Yin, J.; Chen, M.; Qin, Z. Adaptive Payload Distribution in Multiple Images Steganography Based on Image Texture Features. IEEE Trans. Dependable Secur. Comput. 2020, 19, 897–911. [Google Scholar] [CrossRef]
  35. Ma, W.-Y.; Manjunath, B.S. Texture Features and Learning Similarity. In Proceedings of the CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 18–20 June 1996; pp. 425–430. [Google Scholar]
  36. Mohanaiah, P.; Sathyanarayana, P.; GuruKumar, L. Image Texture Feature Extraction Using GLCM Approach. Int. J. Sci. Res. Publ. 2013, 3, 1–5. [Google Scholar]
  37. Hiremath, P.S.; Pujari, J. Content Based Image Retrieval Using Color, Texture and Shape Features. In Proceedings of the 15th International Conference on Advanced Computing and Communications (ADCOM 2007), Guwahati, India, 18–21 December 2007; pp. 780–784. [Google Scholar]
  38. Mehtre, B.M.; Kankanhalli, M.S.; Lee, W.F. Shape Measures for Content Based Image Retrieval: A Comparison. Inf. Process. Manag. 1997, 33, 319–337. [Google Scholar] [CrossRef]
  39. Mingqiang, Y.; Kidiyo, K.; Joseph, R. A Survey of Shape Feature Extraction Techniques. Pattern Recognit. 2008, 15, 43–90. [Google Scholar]
  40. Zenggang, X.; Zhiwen, T.; Xiaowen, C.; Xue-min, Z.; Kaibin, Z.; Conghuan, Y. Research on Image Retrieval Algorithm Based on Combination of Color and Shape Features. J. Sign Process Syst. 2021, 93, 139–146. [Google Scholar] [CrossRef]
  41. Hum, N.J.; Chamberlin, P.E.; Hambright, B.L.; Portwood, A.C.; Schat, A.C.; Bevan, J.L. A Picture Is Worth a Thousand Words: A Content Analysis of Facebook Profile Photographs. Comput. Hum. Behav. 2011, 27, 1828–1833. [Google Scholar] [CrossRef]
  42. Lehmann, T.M.; Güld, M.O.; Thies, C.; Fischer, B.; Spitzer, K.; Keysers, D.; Ney, H.; Kohnen, M.; Schubert, H.; Wein, B.B. Content-Based Image Retrieval in Medical Applications. Methods Inf. Med. 2004, 43, 354–361. [Google Scholar] [CrossRef] [PubMed]
  43. Osborne, M.R.; Presnell, B.; Turlach, B.A. On the LASSO and Its Dual. J. Comput. Graph. Stat. 2000, 9, 319–337. [Google Scholar] [CrossRef]
  44. Ranstam, J.; Cook, J.A. LASSO Regression. J. Br. Surg. 2018, 105, 1348. [Google Scholar] [CrossRef]
  45. Zhao, P.; Yu, B. On Model Selection Consistency of Lasso. J. Mach. Learn. Res. 2006, 7, 2541–2563. [Google Scholar]
  46. Chen, Z.; Zeng, W.; Yang, Z.; Yu, L.; Fu, C.-W.; Qu, H. LassoNet: Deep Lasso-Selection of 3D Point Clouds. IEEE Trans. Vis. Comput. Graph. 2019, 26, 195–204. [Google Scholar]
  47. Lemhadri, I.; Ruan, F.; Tibshirani, R. Lassonet: Neural Networks with Feature Sparsity. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, Virtual, 13–15 April 2021; pp. 10–18. [Google Scholar]
  48. Wen, X.; Yang, Z. Classification Efficiency of LassoNet Model in Image Recognition. In Proceedings of the 2022 5th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), Wuhan, China, 22–24 April 2022; pp. 384–391. [Google Scholar]
  49. Van de Weijer, J.; Gevers, T.; Bagdanov, A.D. Boosting Color Saliency in Image Feature Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 28, 150–156. [Google Scholar] [CrossRef] [PubMed]
  50. Wang, B.; Yu, Y.; Wong, T.-T.; Chen, C.; Xu, Y.-Q. Data-Driven Image Color Theme Enhancement. ACM Trans. Graph. 2010, 29, 1–10. [Google Scholar] [CrossRef]
  51. Afifi, A.J.; Ashour, W.M. Image Retrieval Based on Content Using Color Feature. Int. Sch. Res. Not. 2012, 2012, 248285. [Google Scholar] [CrossRef]
  52. Freeborough, P.A.; Fox, N.C. MR Image Texture Analysis Applied to the Diagnosis and Tracking of Alzheimer’s Disease. IEEE Trans. Med. Imaging 1998, 17, 475–478. [Google Scholar] [CrossRef] [PubMed]
  53. Li, J.; Tan, J.; Shatadal, P. Classification of Tough and Tender Beef by Image Texture Analysis. Meat Sci. 2001, 57, 341–346. [Google Scholar] [CrossRef] [PubMed]
  54. Mendoza, F.; Dejmek, P.; Aguilera, J.M. Colour and Image Texture Analysis in Classification of Commercial Potato Chips. Food Res. Int. 2007, 40, 1146–1154. [Google Scholar] [CrossRef]
  55. Cheng, Y.-C.; Chen, S.-Y. Image Classification Using Color, Texture and Regions. Image Vis. Comput. 2003, 21, 759–776. [Google Scholar] [CrossRef]
  56. Liapis, S.; Tziritas, G. Color and Texture Image Retrieval Using Chromaticity Histograms and Wavelet Frames. IEEE Trans. Multimed. 2004, 6, 676–686. [Google Scholar] [CrossRef]
  57. Wang, X.-Y.; Yu, Y.-J.; Yang, H.-Y. An Effective Image Retrieval Scheme Using Color, Texture and Shape Features. Comput. Stand. Interfaces 2011, 33, 59–68. [Google Scholar] [CrossRef]
  58. Zhang, D.; Lu, G. Review of Shape Representation and Description Techniques. Pattern Recognit. 2004, 37, 1–19. [Google Scholar] [CrossRef]
  59. Shin, D.; He, S.; Lee, G.M.; Whinston, A.B.; Cetintas, S.; Lee, K.-C. Enhancing Social Media Analysis with Visual Data Analytics: A Deep Learning Approach; SSRN: Amsterdam, The Netherlands, 2020. [Google Scholar]
  60. Khosla, A.; Das Sarma, A.; Hamid, R. What Makes an Image Popular? In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Republic of Korea, 7–11 April 2014; pp. 867–876. [Google Scholar]
  61. Spape, M.; Davis, K.M.; Kangassalo, L.; Ravaja, N.; Sovijärvi-Spapé, Z.; Ruotsalo, T. Brain-Computer Interface for Generating Personally Attractive Images. IEEE Trans. Affect. Comput. 2021, 14, 637–649. [Google Scholar] [CrossRef]
Figure 1. LassoNet model structure.
Figure 2. Experimental design process.
Figure 3. Heatmap for feature correlation analysis.
Figure 4. Results of LassoNet feature selection for Study 1.
Figure 5. Results of LassoNet feature selection for Study 2.
Figure 6. (a) Percentage of importance scores for selected features. (b) Percentage of average importance scores for selected features (no content features).
Figure 7. (a) Percentage of importance scores for selected features. (b) Percentage of average importance scores for selected features.
Table 1. Categorization of collected images.

| Nature and Environment | Culture and Society | Technology and Industry | Entertainment and Life |
|---|---|---|---|
| Animals | Art | Science | Indoor |
| Nature | Architecture | Technology | Illustration |
| Park | History | Industry | Miscellaneous |
| Scenery | Business | Objects | Sports |
| Environment | Religion | | Food |
| | Politics | | Health |
| | Retro | | Travel |
| | Abstract | | Transportation |
| | People | | |
Table 2. Results after Lasso feature selection.

| Image Feature Categories | Selected Features | Feature Weights | Post-Est. OLS |
|---|---|---|---|
| Color features | Black | −0.335 | −0.489 |
| | Gray | 0.038 | −0.077 |
| Shape features | Image size | 0.001 | 0.002 |
| | Aspect ratio | −0.161 | −0.373 |
| | Low DOF hue | 0.074 | 0.300 |
| Texture features | Tamura roughness | −0.200 | −0.446 |
| | GLCM saturation energy | −0.152 | −0.284 |
| | GLCM homogeneity | −0.407 | −0.786 |
Table 3. Lasso feature selection results after adding content features.

| Image Feature Categories | Selected Features | Feature Weights | Post-Est. OLS |
|---|---|---|---|
| Color features | Black | −0.054 | −0.224 |
| | Brown | 0.159 | 0.294 |
| | Yellow | 0.070 | 0.269 |
| Shape features | Low DOF brightness | −0.045 | −0.151 |
| | Aspect ratio | −0.009 | −0.180 |
| Texture features | GLCM correlation | 0.165 | 0.388 |
| | GLCM homogeneity | −0.417 | −0.743 |
| Content features | objnumperson | −0.120 | −0.465 |
| | objnumareoplane | 0.289 | 0.475 |
| | objareaboxaerop | 0.053 | 0.165 |
| | objnumbicycle | 0.161 | 0.852 |
| | objnumbus | 0.676 | 1.635 |
| | objareaboxcat | −0.190 | −0.875 |
| | objnumchair | 0.215 | 1.115 |
| | numfaces | −0.761 | −1.890 |
Table 4. The effect of feature selection on the accuracy of image attractiveness prediction.

| Methods | Lasso * | LassoNet * | Lasso | LassoNet |
|---|---|---|---|---|
| NB | 0.418 | 0.535 | 0.529 | 0.814 |
| KNN | 0.478 | 0.663 | 0.761 | 0.852 |
| DT | 0.438 | 0.653 | 0.723 | 0.831 |
| RF | 0.498 | 0.677 | 0.771 | 0.853 |
| SVM | 0.489 | 0.663 | 0.694 | 0.846 |

* Content features of the image are added to the model.

Share and Cite

MDPI and ACS Style

Sun, Z.; Zhang, K.; Zhu, Y.; Ji, Y.; Wu, P. Unlocking Visual Attraction: The Subtle Relationship between Image Features and Attractiveness. Mathematics 2024, 12, 1005. https://doi.org/10.3390/math12071005
