Article

Nonlinear Perception Characteristics Analysis of Ocean White Noise Based on Deep Learning Algorithms

by Tao Qian, Ying Li * and Jun Chen
School of Design, Anhui Polytechnic University, Wuhu 241000, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(18), 2892; https://doi.org/10.3390/math12182892
Submission received: 12 August 2024 / Revised: 9 September 2024 / Accepted: 14 September 2024 / Published: 17 September 2024
(This article belongs to the Special Issue Modern Trends in Nonlinear Dynamics in Ocean Engineering)

Abstract:
Caused by nonlinear vibration, ocean white noise exhibits complex dynamic and nonlinear perception characteristics. To explore the potential applications of ocean white noise in the engineering and health fields, novel methods based on deep learning algorithms are proposed to generate ocean white noise, contributing to marine environment simulation in ocean engineering. A comparative study, including spectrum analysis and auditory testing, demonstrated the superiority of the deep learning generation method over general mathematical or physical methods. To further study the nonlinear perception characteristics of ocean white noise, experimental research based on multi-modal perception research methods was carried out within a constructed multi-modal perception system environment, comprising the following two experiments. The first, an audiovisual comparative experiment, explores the system user’s multi-modal perception experience and its influencing factors, focusing on the impact of ocean white noise on human perception. The second, a sound intensity testing experiment, further explores human multi-sensory interaction and change patterns under white noise stimulation. The experimental results indicate that user visual perception ability and state reach a relatively high level when the sound intensity is close to 50 dB. Further numerical analysis of the experimental results reveals the internal relationships between user perception across multiple senses, showing that the sound intensity of ocean white noise exerts a fluctuating influence on user visual concentration and a curvilinear influence on user visual psychology. This study underscores ocean white noise’s positive effect on human perception enhancement and concentration improvement, providing a research basis for applications in multiple fields such as spiritual healing, perceptual learning, and artistic creation. It also offers valuable references and practical insights for professionals in related fields, contributing to the development and utilization of the marine environment.

1. Introduction

With the development of the marine environment and ocean engineering, the noise induced by nonlinear vibration in ocean engineering has attracted significant attention from academic and technical personnel, as it adversely affects the stealth of ships and marine organisms. Researchers extensively conduct nonlinear analysis of ocean noise and apply it to ocean engineering and ship detection [1]. Research on deep-sea low-frequency noise shows that nonlinear ocean noise reveals the dynamic factors caused by seawater flow [2,3]. Nonlinear noise and underwater acoustic information [4] have always been important sources for observing and studying ocean information, such as ship detection [5] and marine environment observation [6]. As a random audio signal with constant power spectral density [7], ocean white noise, a term used to describe the ambient sound of the ocean, is not just a by-product of the natural marine environment, such as rain, wind, and breaking waves, but also a valuable source of information. It presents complex nonlinear characteristics and is suitable for nonlinear analysis methods. Liu et al. [8] showed that their white noise wave generation method is feasible in marine engineering tests and can generate waves with variable amplitude and higher energy spectral density. In early studies, uniform random sequences provided a well-known method for generating white noise, but they are usually considered to lack diversity and flexibility. In recent studies, mathematical models such as the autocorrelation coefficient matrix [9], or physical methods such as the hydraulic transmission function, are commonly used to generate white noise. Frequency-shift keying (FSK) technology and the BELLHOP module of MATLAB can be used to modulate ocean white noise [10]. The latest research uses deep learning algorithms to analyze and process white noise [11,12]. For example, Jun et al. [13] used machine learning to attenuate random seismic noise in some East China Sea areas. Simultaneously, by exploiting the mutual conversion between sound and spectrogram images, generating white noise spectrograms with deep learning networks is straightforward because image generation networks [14] have matured. In terms of sound-image conversion, the Discrete Wavelet Transform (DWT) [15] and the Fourier transform [16] are commonly used algorithms. For example, the Discrete Fourier Transform (DFT) can convert white noise to spectrograms, and the Inverse Discrete Fourier Transform (IDFT) can restore the spectrogram to white noise.
In addition to its ability to characterize nonlinear dynamics, ocean white noise has also revealed a significant impact on organisms and humans. Numerous studies have shown that ocean white noise may positively impact humans and animals, including marine organisms. Reed et al. [17] conducted a study on the influence of ocean white noise on bird songs, using large-scale playbacks of ocean surf in coastal areas and whitewater river noise in riparian areas. Their findings, which showed that the birds’ song features changed with the amplitude and frequency of background noise, underscore the long-standing influence of natural soundscapes on vocal behavior. This research also extends to the study of marine ecosystems [18] and healthcare [19]. Related studies have consistently shown that white noise has a significant and positive influence on human perception. For instance, Warjri et al. [20] studied the impact of white noise, such as waves and rainfall, on sleep quality through testing experiments. Akiyama et al. [21] found that white noise significantly influenced the functional connectivity of the scalp electroencephalogram (EEG) in newborns compared with music and non-interference auditory environments. Ohbayashi et al. [22] proposed that exposure to white noise facilitated cognitive function and that a moderate amount of auditory noise benefited individuals in hypodopaminergic states. Hiwa et al. [23] concluded, through comparative tests of subjects’ memory tasks under white noise and in a silent environment, that white noise helped strengthen the bilateral functional connection between brain regions related to memory tasks and improved the activity of brain regions that control attention. Masuda et al. [24] studied the effect of pure tones and white noise at average volume on user emotion, concluding that the pure tone stimulus was more likely to cause user aversion than the white noise stimulus. Hong et al. [25] evaluated the comfort and pleasure of participants in both audio-only and audiovisual environments by introducing eight natural sounds into urban recreational areas, including four bird sounds and four water sounds. The results show that pleasant white noise simulating nature has a positive impact on user perception and emotion.
This paper analyzes the nonlinear dynamic characteristics of ocean white noise and its nonlinear perception characteristics based on deep learning algorithms. It proposes novel methods based on deep learning algorithms to generate ocean white noise and dynamic feature graphics, contributing to marine environment simulation in ocean engineering. Simultaneously, novel experiments based on multi-modal perception research methods are carried out to explore the nonlinear perception characteristics of ocean white noise, providing a research basis for applications in multiple fields such as spiritual healing, perceptual learning, and artistic creation. Specifically, this requires developing an innovative and comprehensive perceptual system that offers a promising multi-modal perception environment by generating an ocean white noise auditory environment, simultaneously presenting a dynamic graphic visual environment, and enabling the user’s haptic interaction. Nonlinear multi-modal perception research methods are employed to explore the system user’s multi-modal perception experience and its influencing factors, focusing on the impact of ocean white noise on human perception. The specific research methods include theoretical analysis and testing experiments, using various means such as sound testing and eye movement experiments. The findings will provide a significant reference for designing white noise perception systems for users’ attention enhancement and perceptual improvement, offering a hopeful outlook for future research and application.

2. Novel Methods for Generating Ocean White Noise Perceptual System

In order to analyze the nonlinear perception characteristics of ocean white noise, we utilize deep learning algorithms to create an ocean white noise perceptual system and simultaneously utilize interactive intelligent algorithms to construct an immersive experience environment by connecting tactile sensors and external equipment. The specific construction method is depicted in Figure 1.
With the generation of ocean white noise as its core, the system construction process is shown in Figure 2. Firstly, the Discrete Fourier Transform (DFT) module converts the white noise from sound signals into spectrograms that the DCNN can process. Then, the DCNN extracts features from and recognizes the spectrograms. The extracted features are input into a generative network (attn GAN) to generate spectrograms. By connecting a text processing (TP) module and an attention module to the generative network, the system can randomly generate a white noise spectrogram of the corresponding category based on prompt text input by the user. Moreover, the Inverse Discrete Fourier Transform (IDFT) module converts the spectrogram back to sound signals. Simultaneously, corresponding feature graphics are called from the dynamic graphics database according to the sound recognition results and displayed on the system terminal along with the ocean white noise. The following subsections expand on the specific generation methods for ocean white noise and dynamic graphics.

2.1. White Noise Generation Based on Deep Learning Algorithms

This study employs deep learning algorithms to generate ocean white noise for the perceptual system. The specific method involves the following steps.
1. Data acquisition and signal processing:
In this study, we select four typical types of ocean white noise as research objects, which have been applied in the fields of sleep aid and stress relief: sea breeze, sea wave, deep sea, and whale. We also select four other typical types of white noise from other natural environments, including rain, bonfire, forest, and stream, for supplementary and possible comparative research in the future. For each white noise category, we randomly collect 50 white noise samples and convert them into spectrograms that the DCNN can process. Here, the Discrete Fourier Transform (DFT) module implements the conversion from white noise to spectrogram, which yields
$$F(k) = \sum_{n=0}^{N-1} f(n)\, e^{-j 2 \pi k n / N}$$
where F(k) is the complex representation of the frequency domain, which constitutes the spectrogram. f(n) is the input time-domain signal. N is the number of signal samples, n is the signal sample index, and k is the frequency-domain sample index. j is the imaginary unit.
Then, the spectrogram samples are expanded to 500 in each type through data augmentation, including translation, rotation, and scaling. Thus, there are a total of 4000 spectrogram sample data, which are used to train the deep learning networks to generate random ocean white noise according to the experimental setup or user requirements.
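As a minimal illustration of this conversion step, the following sketch turns a one-dimensional white noise signal into a log-magnitude spectrogram via a frame-wise DFT. The frame length, hop size, Hann windowing, and log scaling are our own illustrative assumptions; the paper does not specify these parameters.

```python
import numpy as np

def white_noise_to_spectrogram(signal, frame_len=512, hop=256):
    """Convert a 1-D white noise signal into a magnitude spectrogram
    using a frame-wise DFT (the equation above). Frame and hop sizes
    are illustrative assumptions."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    window = np.hanning(frame_len)               # reduce spectral leakage
    # np.fft.rfft computes F(k) = sum_n f(n) exp(-j*2*pi*k*n/N)
    spectra = [np.abs(np.fft.rfft(f * window)) for f in frames]
    spec = np.stack(spectra, axis=1)             # (freq bins, time frames)
    return np.log1p(spec)                        # log scale for DCNN input

# Example: 5 s of synthetic white noise at 16 kHz
noise = np.random.randn(5 * 16000)
spectrogram = white_noise_to_spectrogram(noise)
```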
2. Spectrogram feature extraction:
The Deep Convolutional Neural Network (DCNN) extracts features from the spectrogram. It is a well-known network model for image recognition and feature extraction, which includes the input layer, convolutional layers, pooling layers, fully connected layers, and output layer [26], as shown in Figure 3.
When sample images are input into the DCNN, the convolutional layers use multiple convolutional kernels to extract image features. Then, the pooling layers perform dimensionality reduction on the feature maps to reduce computational complexity and prevent overfitting. Finally, the fully connected layer flattens the output of the pooling layers and connects it to a classifier that outputs the classification results. As a powerful deep learning model, the DCNN is suitable and highly adaptable for processing various images, completing image feature extraction and recognition efficiently and accurately. During this process, the convolution operation plays a crucial role, as it identifies and learns the complex patterns in the spectrogram. The extracted features can then train a sound synthesis network, generating natural sound effects for the system terminal display. The DCNN extracts spectrogram features through multi-channel convolution operations in multiple layers, which yields
$$g(i, j) = \delta\left( I(x, y) \ast K_{m \times n} + b \right)$$
where g(i, j) is the convolution result, I(x, y) denotes the pixel values of the input spectrogram, ∗ represents the convolution operation, and K<sub>m×n</sub> is a convolutional kernel of size m × n. b is the offset value, and δ is the Sigmoid activation function. During multi-channel convolution operations across multiple layers, the relationship between layers can be represented as
$$g_l = \delta\left( g_{l-1} \ast K_l + b_l \right)$$
where l is the layer number of a convolutional layer, while different layers may use different convolution kernels. Then, the spectrogram feature extracted by the DCNN is used to train the GAN to generate white noise spectrograms.
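To make the convolution step concrete, the following minimal NumPy sketch implements the single-channel form g(i, j) = δ(I ∗ K + b) given above. The kernel values and bias are illustrative assumptions; as in most deep learning frameworks, the sketch computes cross-correlation, which is what “convolution” conventionally denotes in CNNs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_layer(image, kernel, bias=0.0):
    """Single-channel 'valid' convolution (cross-correlation, as in CNNs)
    followed by the Sigmoid activation: g(i, j) = delta(I * K + b)."""
    m, n = kernel.shape
    H, W = image.shape
    out = np.zeros((H - m + 1, W - n + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + m, j:j + n] * kernel) + bias
    return sigmoid(out)

# Example: a 3x3 edge-detection kernel applied to a random "spectrogram"
spec = np.random.rand(224, 224)
kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)
features = conv_layer(spec, kernel, bias=0.1)
```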
3. White noise spectrogram generation:
Because the system generates corresponding types of white noise based on experimental settings or user requirements, the attn GAN, which is also constructed from CNNs, generates white noise spectrograms from input text after being fully trained. In the attn GAN, whose structure is included in Figure 2, the Attention module, which includes CNNs, extracts and enhances the input text features to form the noise vector. The generator (G) then takes this noise vector as input and uses deconvolution to convert the data into images. The discriminator (D) plays a crucial role in the system, distinguishing whether the generated images are real data or fake. Both the generator and discriminator use CNNs as their core structures, which allows the generator to learn to produce realistic data through repeated adversarial training between these two components. Through this process, the generator continuously improves its ability to fit the data.
Integrating an attention module into the generation network of GAN establishes a clear correspondence between the input text feature vector (extracted by TP) and the output results of the CNN classifier. Here, the main methods for generating spectrograms can be described as
$$h_0 = G_0\left[ z, F^{te}(e) \right]$$
$$h_i = G_i\left[ h_{i-1}, F_i^{attn}(\bar{e}) \right]$$
$$x_i = G_i(h_i)$$
where i indexes the levels of generators and discriminators. h is the hidden node that contains the image features input to the generator, and z is the input noise vector extracted by the Attention module from the input text. F<sup>te</sup> represents the text feature extraction operation in the TP module, and F<sup>attn</sup> represents the attention operation at each level, which establishes the connection between text features and image features. G<sub>i</sub> is the operation module of each network level for generating images, and x<sub>i</sub> represents the images generated by the different generators.
Using this method, the generator converts the input noise vector z into an image through multiple layers of transposed (de-)convolution operations that progressively enlarge the feature maps, and the discriminator judges the image’s authenticity, guiding the generation of image data close to reality. The judgment method of the discriminator can be expressed as
$$L_{G_i} = -\tfrac{1}{2}\, \mathbb{E}_{x_i \sim P_{G_i}}\left[ \ln D_i(x_i) \right] - \tfrac{1}{2}\, \mathbb{E}_{x_i \sim P_{G_i}}\left[ \ln D_i(x_i, e) \right]$$
where Di represents the discriminators at different stages, and PGi represents the distribution of images generated from Gi. When LGi reaches a sufficiently small value, a sufficiently realistic image will be obtained.
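As a sketch of how this generator loss could be computed at one level, assuming PyTorch and discriminator outputs already expressed as probabilities (the framework choice and the stabilizing epsilon are our assumptions, not details given in the paper):

```python
import torch

def generator_loss(d_uncond, d_cond, eps=1e-8):
    """Adversarial generator loss of the equation above:
    L_Gi = -1/2 E[ln D_i(x_i)] - 1/2 E[ln D_i(x_i, e)].
    d_uncond / d_cond are discriminator probabilities in (0, 1)
    for generated images without / with the text embedding e."""
    uncond = -0.5 * torch.log(d_uncond + eps).mean()
    cond = -0.5 * torch.log(d_cond + eps).mean()
    return uncond + cond

# Example with a batch of 16 discriminator outputs
d_u = torch.rand(16).clamp(0.01, 0.99)
d_c = torch.rand(16).clamp(0.01, 0.99)
loss = generator_loss(d_u, d_c)
```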
Effect testing and model optimization: the generated ocean white noise is evaluated through real-person testing and relevant indicators to ensure its similarity to real sound effects, and the involved network models and training strategies are optimized based on the evaluation results.
4. Sound restoration and adjustment:
Then, the white noise spectrograms generated by deep learning networks are restored to authentic sound through the Inverse Discrete Fourier Transform (IDFT) operation, which yields
$$f(n) = \frac{1}{N} \sum_{k=0}^{N-1} F(k)\, e^{\,j 2 \pi k n / N}$$
The ocean white noise generated by the proposed method is further processed to provide complete control over the sound conditions in multi-modal perception testing experiments. This control extends to sound quality, intensity, and frequency. In this study, an audio processing module is used for sound adjustment employing linear algorithms.
$$A(n) = \alpha f(n) + \beta f(n-1)$$
where f(n) and A(n) represent the input and output sound signals, respectively. α and β are coefficients for adjusting pitch and volume levels, respectively. This linear algorithm module can adjust the sound’s external conditions without changing its original dynamic range and spectral characteristics.
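A minimal sketch of this adjustment module is shown below; the coefficient values are illustrative, and the zero-padding of f(n − 1) at n = 0 is our assumption.

```python
import numpy as np

def adjust_sound(f, alpha=1.0, beta=0.0):
    """Linear adjustment A(n) = alpha*f(n) + beta*f(n-1) from the
    equation above. alpha scales volume; beta blends in the previous
    sample, acting as a simple first-order filter on the signal."""
    f = np.asarray(f, dtype=float)
    prev = np.concatenate(([0.0], f[:-1]))   # f(n-1), zero-padded at n=0
    return alpha * f + beta * prev

# Example: raise the volume of a restored signal by 20%
restored = np.random.randn(16000)            # stand-in for IDFT output
louder = adjust_sound(restored, alpha=1.2, beta=0.0)
```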
Compared with traditional generation methods based on uniform random sequences or other mathematical models, such as the autocorrelation coefficient matrix, ocean white noise generated by deep learning networks presents more authenticity, diversity, and auditory comfort. It is also convenient to change the sensory characteristics of the white noise by adjusting the model parameters of the deep learning networks for human perception testing experiments. To confirm this conclusion, a comparative study, including spectrum analysis and auditory testing, was carried out between ocean white noise produced by general mathematical models and by deep learning networks. Taking the sea wave as an example, three groups of white noise were produced with uniform random sequences, the autocorrelation coefficient matrix, and deep learning networks, respectively. Ten white noise samples were selected from each group for spectrum analysis and auditory testing, comparing them with samples collected from natural marine environments. The former determines the authenticity level of the white noise by comparing the frequency spectra of the generated samples with the real collected ones. The latter, a human auditory test, subjectively scores the realism and comfort of the white noise. For this test, 20 students with normal hearing, aged 20–25, who passed a listening screening were selected. They were asked to listen to the white noise in a closed, quiet environment and rate its authenticity and comfort. To avoid interference from other senses, they were asked to close their eyes during the experiment. The playback duration of each white noise sample was set to 30 s. The data from these experiments were statistically analyzed, as shown in Table 1, in which the human testing data are the mean values of the 20 subjects.
From the results of the sound comparison testing experiment, it can be concluded that ocean white noise generated by the deep learning network is more authentic and comfortable for human listening. This finding is significant for deep learning and sound engineering, as it demonstrates the potential of deep learning networks for generating more realistic and diverse soundscapes, particularly ocean white noise. Moreover, a deep learning network model pre-trained on a large amount of data can generate realistic and diverse ocean white noise at a level that other methods cannot match.

2.2. Construction of Feature Graphic Database

Our study also uses deep learning algorithms to generate feature graphics that simulate natural scenery. This process is crucial in constructing the visual environment of the intelligent multimedia system. In order to enrich the audiovisual effects, we set 20 static or dynamic feature graphics for each type of white noise. The graphic creation process involves image feature extraction, feature graphic creation, and graphic database construction. Random static or dynamic graphics will be called from the database according to the sound recognition results by DCNN, which determines the type of white noise.
1. Image Feature Extraction:
We collect 20 natural images for each type of white noise to create the graphics library. The convolution operation is then applied to extract the image features of the natural scenery images; the specific method follows the spectrogram feature extraction method in Section 2.1. Taking a sample image of a sea wave as an example, the image brightness feature is enhanced by brightness feature enhancement and threshold processing, and then the convolution operation (described above) is applied to obtain its feature map. An inverse operation is sometimes applied to the feature map to highlight its essential parts. Figure 4 shows the image feature processing and extraction process.
2. Feature Graphic Creation:
Based on the feature maps or inverse feature maps of sample images, feature graphics are created by sampling and connecting feature points. Taking the sample image of sea waves as an example, feature points are sampled from its inverse feature map with an image size of 224 × 224, as shown in Figure 5a. When the sampling steps in the X and Y directions are both set to three pixels, the feature point map in Figure 5b is obtained. The map is optimized by doubling the sampling step in the Y direction, as shown in Figure 5c, which reflects the features of the waves more vividly and clearly. By connecting feature points under the condition that the distance between two points is smaller than a specific value (six pixels in this method), the line graphic simulating the natural wave shape is obtained, as shown in Figure 5d. A clean feature graphic map is then obtained by removing discrete points or line segments with a proper threshold.
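The sampling-and-connecting step can be sketched as follows; the feature threshold is an illustrative assumption, while the three-pixel X step, doubled Y step, and six-pixel connection distance follow the description above.

```python
import numpy as np

def sample_and_connect(feature_map, step_x=3, step_y=6,
                       threshold=0.5, max_dist=6.0):
    """Sample feature points on a pixel grid and collect line segments
    between points closer than max_dist, following the method above.
    The threshold for keeping a point is an illustrative assumption."""
    ys = np.arange(0, feature_map.shape[0], step_y)
    xs = np.arange(0, feature_map.shape[1], step_x)
    points = [(x, y) for y in ys for x in xs
              if feature_map[y, x] > threshold]     # keep strong features
    segments = []
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= max_dist ** 2:
                segments.append(((x1, y1), (x2, y2)))
    return points, segments

# Example on a random stand-in for a 224x224 inverse feature map
fmap = np.random.rand(224, 224)
pts, lines = sample_and_connect(fmap)
```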
3. Graphic database creation:
Feature graphics are then imported and stored by category in a database using NumPy tools, forming a static graphic database. This database connects with our proposed network model to provide the corresponding graphics according to the sound recognition results of the DCNN. A dynamic graphic database must also be created to design different experimental conditions. Dynamic graphics are generated from the feature graphic maps with the algorithm flow shown in Figure 6; the specific description and pseudocode of this algorithm are detailed in the authors’ previous study [27]. To unify the dynamic visual effect, the key points of the feature graphics move up and down according to a set rule, driving the lines to fluctuate. The playback speed of the dynamic graphics is set to 12 fps, a speed suitable for comfortable human viewing.
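A minimal sketch of such a dynamic effect is given below, oscillating the key points’ y-coordinates sinusoidally. The amplitude, frame count, and per-point phase are our assumptions; only the 12 fps playback rate is taken from the system description.

```python
import numpy as np

def animate_keypoints(points, n_frames=24, amplitude=4.0, fps=12):
    """Generate frames in which key points oscillate vertically,
    driving the feature lines to fluctuate up and down. Amplitude,
    frame count, and per-point phase offsets are illustrative."""
    pts = np.asarray(points, dtype=float)         # shape (N, 2): x, y
    phases = np.linspace(0, 2 * np.pi, len(pts), endpoint=False)
    frames = []
    for t in range(n_frames):
        # one full oscillation per second at the 12 fps playback rate
        offset = amplitude * np.sin(2 * np.pi * t / fps + phases)
        moved = pts.copy()
        moved[:, 1] += offset                      # shift y-coordinates
        frames.append(moved)
    return frames

frames = animate_keypoints([(10, 100), (13, 98), (16, 101)])
```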

2.3. Construction of Multi-Modal Perception System Environment for Ocean White Noise

The generated ocean white noise and its corresponding dynamic graphics are displayed on the system terminal screen. We crafted eight mountain-shaped display backgrounds, each tailored to a specific type of white noise and its dynamic graphic, as shown in Figure 7; the terminal display effect is shown in Figure 8. This design enhances the audiovisual effect and adds a unique aesthetic to the system functionality.
The system is designed to support user interaction for a better multi-modal user experience and for the testing experiments. It allows users to control the system’s sound intensity with a simple press. In addition to generating the corresponding type of white noise and dynamic graphics from preset input text, the system integrates with external equipment to receive more user data and collect experimental data. For example, a sound collection device is connected to the output of the sound adjustment module to collect sound data, and a tactile sensor is connected to the input of the sound adjustment module to control sound variables. These features give users direct control over the system and enhance their overall experience.

3. Multi-Modal Perception Experimental Research of Ocean White Noise

Driven by nonlinear vibration, ocean white noise triggers intricate nonlinear perception traits in the human senses, prompting responses that range from multi-sensory perception to cognitive emotion. This underscores the necessity of a comprehensive research methodology. A nonlinear multi-modal perception research method, incorporating multi-sensory data collection and tracking, is indispensable for comprehending the intricacy and variability of ocean white noise as the environment changes. The specific data collection methods include tactile testing, auditory testing, eye tracking, and more.

3.1. Nonlinear Multi-Modal Perception Research Method

Multi-modal perception, a theoretical concept, is fundamental for users to comprehend the external environment. Scholars use multi-modal perception to establish the connection between external symbols and the body’s internal state [28]. The multi-modal perception research method including multi-sensory data collection and tracking has been widely and reliably used in human perception research due to its diversified research methods and the reliable, nonlinear expression ability of human senses.
Firstly, multi-modal perception research is a comprehensive method for studying human sensory perception. Multi-modal perception relies on the user’s multiple senses, combining at least two sensory systems, such as the visual and auditory senses or the visual and tactile senses. A nonlinear multi-modal perception research method, covering sensory research in all aspects such as tactile testing, sound testing, and eye tracking, ensures a thorough exploration and analysis of human perception of external affairs. For instance, El Saddik [29] explored users’ comprehensive perception and response through joint research on vision, hearing, touch, smell, and other senses and compared the response thresholds of different senses, thus providing a complete understanding of user perception in Internet interaction scenarios. We reproduced and confirmed the multi-modal research content of that study, except for the smell research, using nonlinear multi-modal perception research methods including multi-sensory data collection and tracking. We obtained the same results, especially the comparison of response thresholds across senses and their specific values, such as the visual response threshold (about 10 ms) being lower than the auditory response threshold (about 100 ms). This confirms the validity and effectiveness of the method.
Secondly, multi-modal perception research provides a vital reference for improving user perception ability. For example, Wah et al. [30] researched user multi-dimensional perceptual quality in online interactive multimedia and its optimization method, which has direct implications for enhancing user experience in multimedia systems. Velasco et al. [31] aimed to improve user understanding of system attributes and interactive functions by studying the human–computer interaction matching of various senses. This research provides a crucial reference for designing human–computer interactive interfaces for intelligent systems [32] to enhance user information perception and processing ability in the human–computer interactive environment.
Lastly, at the technical level, eye movement tracking and electroencephalogram (EEG) experiments provide scientific and accurate data analysis and information visualization methods for studying users’ multi-modal perception and emotion recognition [33,34]. For example, Zheng et al. [35] proposed a multi-modal emotion recognition framework called EmotionMeter, which identifies and explores the user’s cognitive state and multi-modal emotional experience by combining eye movement and EEG data. As an important technical clue for studying machine vision [36], eye movement tracking has been studied together with its theoretical basis, which covers the principles of oculomotor control and the role of eye movements in perception and cognition; its data collection procedure, which involves using eye-tracking devices to record eye movements; and its analysis methods, which include statistical techniques for interpreting the data [37,38]. More importantly, eye movement tracking plays a crucial role in predicting user perception ability, highlighting the practical implications of this research. Scholars have also explained its working principle and data types [39], providing a basis for experimental research. Eye movement tracking has been used as an indicator for studying human sensory and cognitive functions, offering essential clues to the human cognitive state and visual attention [40]. Klaib et al. [41] developed algorithms and technologies to automatically track gaze position and direction. Yan et al. [42] used eye-tracking technology to record and analyze users’ eye movements to determine their psychological and cognitive status. Katona [43] verified the cognitive efficiency of flowcharts and the superiority of the studied algorithms through parametric analysis, including fixation duration, fixation count, and pupil diameter. The above research provides important technical references for human multi-modal perception research.

3.2. Multi-Modal Perception Experiment Research

As a complex environmental noise, ocean white noise presents nonlinear perception characteristics of diversity and uncertainty due to differences in the natural environment and user groups, specifically manifested in the influence of type, sound intensity, and user group on the perceptual characteristics of white noise. The experimental research based on multi-modal perception research methods allows us to analyze the nonlinear perception characteristics of ocean white noise on human beings and their potential influencing factors through the audiovisual comparative experiment, as well as to study the interaction and change patterns between multiple senses through the sound intensity testing experiment. This research has practical implications, helping us further understand and manage the impact of white noise on human senses, multi-sensory interaction, and change patterns under white noise stimulation through multi-modal perception testing and numerical analysis. The multi-modal perception testing methods include tactile testing, sound testing, eye tracking, and user testing.

3.3. Experimental Apparatus and Participants

The experimental setup is an integrated computer system that connects a screen to display the generated dynamic graphics and user interaction, a sound device to play the generated white noise, and a tactile sensor to receive user tactile pressure. Sound testing software and a Tobii Pro ErgoLAB 3.0 desktop telemetry eye tracker are set up in the system to collect sound data and eye movement data. Tactile pressure is also collected from the tactile sensor, ensuring a thorough data collection process.
The study encompassed a diverse yet carefully controlled sample of 86 students, ranging from first-year students to seniors, with 36 males and 50 females. The average age of the subjects was 21 years with a standard deviation of 0.92. All subjects were right-handed, had normal or corrected vision, and had no astigmatism. This careful selection supports the validity of the study and provides a well-controlled research environment.

3.4. Design of Experimental Process and Stimulus Material

This comprehensive experiment aims to understand how ocean white noise affects user perception. An audiovisual comparative experiment is conducted to thoroughly explore the system’s user multi-modal perception experience and its influencing factors. Then, a sound intensity testing experiment is conducted to further explore user multi-sensory interaction and change patterns under ocean white noise stimulation.

3.4.1. Audiovisual Comparative Experiment

This experiment compares user perception under different visual environments of dynamic and static graphics and different auditory environments of white noise and silence. The dynamic graphics are designed to simulate a changing visual environment, while the static graphics represent a stable visual environment. The comparison between white noise and a silent environment highlights the impact of white noise on user perception.
1. Experimental stimulus material design:
For the comparative experiment, the following three stimulus materials define the different perception environments:
‘Dynamic graphic in silence’: Calling the dynamic graphic database and setting the output volume of the noise to below 20 dB via the sound adjustment module;
‘Dynamic graphic in white noise’: Calling the dynamic graphic database and setting the output volume of white noise to 50 dB;
‘Static graphic in white noise’: Calling the static graphic database and setting the output volume of white noise to 50 dB.
2. Experimental process:
In the audiovisual stimulation comparative experiment, a rigorous process was ensured. After visual correction, the 86 subjects were asked to maintain correct posture and screen distance (about 60 cm) in a stable and quiet experimental environment. The subjects were then required to experience the three stimulus materials, ‘dynamic graphic in silence’, ‘dynamic graphic in white noise’, and ‘static graphic in white noise’, in a natural and relaxed state. To ensure the stability of the experimental conditions, the same type of white noise (‘sea wave’) and its static or dynamic graphics generated by the system were used for all 86 participants. User perception data were obtained through the tactile sensor, sound detection software, and eye-tracking numerical analysis software. A hot zone covering the mountain-shaped background on the screen was drawn as the Areas of Interest (AOIs) in the eye-tracking numerical analysis software. All recorded data were then analyzed using statistical methods, ensuring the findings were based on solid analysis.

3.4.2. Sound Intensity Testing Experiment

The stimulus material used in this experiment is the ‘dynamic graphic in white noise’ from the former experiment. To ensure the stability of the experimental conditions, the same type of white noise (‘whale’) and its dynamic graphic generated by the system were used for all participants. By connecting the tactile sensor to the sound adjustment module, which interprets tactile pressure as sound intensity changes, users can change the sound intensity of the white noise through tactile pressure. A human–computer interaction algorithm implements the variation pattern from touch to sound. The implementation method can be described as follows: the pressure sensor connected to the system detects the user’s pressing operation, including the force and duration of the press, which determine the amplitude and speed of volume adjustment. Different volume adjustment strategies can be set based on the detected pressure information. Thus, the increase or decrease in volume is controlled by the user’s pressing on the sensor with a preset threshold. The specific algorithm, implemented using Python tools, is shown in Figure 9.
Firstly, the sensor is connected to the system. Then, the pyautogui library is imported to receive and convert sensor presses into actions the system can recognize, such as simulated mouse presses or keyboard keys. The algorithm uses mouse.Listener or keyboard.Listener events to detect the converted mouse or keyboard events and make judgments. While users continuously press the sensor, the volume_up() function increases the system volume; otherwise, when users release the press, the system volume decreases via the volume_down() function. The corresponding parameters are adjusted through testing to maintain a linear relationship between the pressure received by the sensor and the system volume. This real-time response keeps the system dynamic and responsive. When the tactile pressure on the sensor changes from 30 to 70 Pa, the system sound intensity changes from 30 to 70 dB synchronously. The experiment covers sound intensity levels within a comfortable and tolerable range.
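The sketch below reproduces this press-to-volume loop under stated assumptions: it uses pynput, which provides the mouse.Listener/keyboard.Listener events named above (pyautogui itself emits rather than listens for input events), and volume_up()/volume_down() are hypothetical stand-ins for the actual system volume calls.

```python
from pynput import keyboard   # provides the keyboard.Listener used below
import time

PRESSED = {"state": False}

def volume_up():
    """Hypothetical helper: raise system volume by one preset step."""
    print("volume +1 step")

def volume_down():
    """Hypothetical helper: lower system volume by one preset step."""
    print("volume -1 step")

def on_press(key):
    PRESSED["state"] = True    # sensor press arrives as a simulated key

def on_release(key):
    PRESSED["state"] = False

listener = keyboard.Listener(on_press=on_press, on_release=on_release)
listener.start()

# Poll: while the (simulated) key is held, volume rises; otherwise it falls
for _ in range(100):
    volume_up() if PRESSED["state"] else volume_down()
    time.sleep(0.1)
```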
During the experiment, 86 subjects were required to change the sound intensity of white noise by pressing the tactile sensor with their right fingers while experiencing the stimulus material under the same experimental conditions. This experiment can be used to explore the conversion relationship between audiovisual and tactile senses and the variation in user perception with sound intensity.

4. Numerical Analysis on Experimental Results

4.1. Analysis of User Multi-Modal Perception and Its Influencing Factors

By conducting the audiovisual comparative experiment, we aimed to investigate system users’ multi-modal perception status and influencing factors. Table 2 presents the relevant eye-tracking data statistics, and all data are the average of 86 participants.
Table 3 contains the significance analysis of the user eye-tracking data across the different stimulus materials. F denotes the F-statistic (the ratio of inter-group to intra-group variance), while P denotes the significance probability (p-value). A larger F value indicates a more pronounced inter-group difference, while a smaller P value indicates a more statistically reliable result. The statistical data are considered significant when P < 0.05 and very significant when P < 0.01.
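For readers reproducing this analysis, the F and P values of a one-way ANOVA can be computed as in the following sketch; the data here are synthetic stand-ins, not the actual measurements behind Table 2.

```python
from scipy import stats
import numpy as np

# Hypothetical eye-tracking samples for two stimulus conditions;
# real values would come from the 86 participants in Table 2.
dynamic_group = np.random.normal(0.45, 0.05, 86)   # e.g., a fixation metric
static_group = np.random.normal(0.47, 0.05, 86)

# One-way ANOVA: F is the inter-/intra-group variance ratio,
# P the probability of observing it under the null hypothesis.
F, P = stats.f_oneway(dynamic_group, static_group)
print(f"F = {F:.3f}, P = {P:.3f}  (significant if P < 0.05)")
```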
Table 2 and Table 3 contain the analyses of users’ perception states and influencing factors in different environments, and we conclude the following points.
1. Visual perception and influence analysis:
Dynamic graphics outperform static ones in attracting the user’s visual attention (FFdynamic = 0.02 < FFstatic = 0.1), with a very significant impact (Fdynamic,static = 21.190, Pdynamic,static = 0.000 < 0.01). The dynamic visual environment relatively reduces the users’ perceived depth of information processing (AFdynamic = 0.45 < AFstatic = 0.47) and relatively increases the information processing density (ASdynamic = 2.19 > ASstatic = 1.92). This suggests that a dynamic visual environment can significantly boost the efficiency of information processing, enabling users to absorb more visual information, though this efficiency comes at the cost of a decline in the depth of information processing.
2. Auditory perception and influence analysis:
The white noise has no significant effect on the entry state of user comprehensive perception compared with the silent environment (Fwhite,silence = 0.245, Pwhite,silence = 0.622 > 0.05). However, white noise does commonly influence user visual attention, as seen by comparing the AF and AS values between the silent and white noise auditory environments (the F value is moderate, and the P value is close to 0.05). In this situation, the overall efficiency of information processing becomes slightly higher, but the depth of information processing declines slightly.
3. Perception psychology and influence analysis:
The analysis of mean pupil diameter in Table 2 reveals that both the white noise auditory environment and the dynamic graphic visual environment increase the user’s pupil diameter. This increase reflects the increased cognitive load of user perception information processing and the user’s visual psychological tension. The influence of white noise and dynamic graphics on mean pupil diameter (MP) is significant in the numerical analysis in Table 3 (Fdynamic,static = 0.002, Pdynamic,static = 0.045 < 0.05; Fwhite,silence = 0.032, Pwhite,silence = 0.038 < 0.05). These findings underscore the impact on user visual psychological tension.
The experimental results of the audiovisual comparative experiment indicate that the efficiency of user multi-modal perception is relatively higher under the combined audiovisual environment of white noise and dynamic graphics. These findings have significant implications for the design of the multi-modal perception system. Specifically, the viewpoint tracking map (Figure 10) and hot spot map (Figure 11) show that, under the audiovisual environment of white noise and dynamic graphics, the user viewpoint tracking distribution is relatively uniform, the number of gaze points is relatively high, and the area of deep processing of visual information is relatively large. This reflects that the efficiency of user information perception and cognitive processing is relatively high and that the multi-modal perception state is good in this situation.
Figure 10 provides a visual representation of the user’s gaze points and their distribution. Moreover, Figure 11 provides a visual representation of the areas that attracted the most visual attention from the users. These maps help us understand how the audiovisual environment influences the user’s visual attention.

4.2. Influence Analysis of White Noise Sound Intensity on Visual Perception

In the sound intensity testing experiment, subjects can adjust the sound intensity by pressing the sensor, influencing visual perception. Related data statistics from the mean value of all participants, presented in Table 4, reveal the shifts in user perception from tactile to auditory and then to visual. It is important to note that our study focused on the sound intensity range within the auditory comfort zone, excluding extreme conditions.
Next, we examine the impact of sound intensity on users’ visual perception through variance and statistical significance analysis of the data groups from Table 4, focusing on the eye-tracking parameters; the results are shown in Table 5.
Numerical analysis from Table 4 and Table 5 reveals a significant finding: the variation in sound intensity has a discernible impact on user perception, although the overall influence is relatively minor (indicated by the small F values, except for the average number of saccades). The details are as follows:
  • The variation in white noise intensity directly affects the entry time of users’ visual perception. Specifically, within the 40–60 dB range, the entry state of user perception is slightly delayed (FF = 0.02). However, at lower (<30 dB) or higher (>70 dB) levels, the entry state of user comprehensive perception is slightly accelerated (FF = 0.01). It suggests that relatively high or low white noise intensity is more conducive to the entry state of visual perception.
  • The variation in sound intensity significantly affects the depth of information processing in user perception. This finding has practical implications, suggesting that the appropriate sound intensity of white noise can enhance the user’s comprehensive perception ability. When the sound intensity of white noise is close to 50 dB, the information processing depth of user perception is relatively shallow (AF = 0.45), but the information processing density is relatively high (AS = 2.19). It indicates that the information processing efficiency of user perception is relatively high, albeit with a slight decrease in information processing depth in this situation.
  • The pupil diameter has an upward trend with the increase in the white noise intensity, indicating that the cognitive load of user perception and the degree of visual and psychological tension gradually increased as the sound intensity of white noise increased. Simultaneously, combined with the change in the information processing efficiency of user perception, we conclude that the information processing efficiency of user perception does not have a proportional relationship with the cognitive load and the visual and psychological tension degree. Too relaxed or tense sensory stimulation is not conducive to the information reception of user perception.
As observed from the user viewpoint tracking maps (Figure 12) and hot spot maps (Figure 13), the research confirms that the user perception state changes with the increase in the sound intensity of white noise. When the sound intensity of white noise is close to 50 dB, the user viewpoint tracking distribution is relatively uniform, the number of gaze points is relatively high, and the area of deep processing of visual information is relatively large. This reflects that the user’s visual perception efficiency and cognitive processing level are relatively high. In this situation, the user’s multi-modal perception state is relatively good.

4.3. Analysis on User Multi-Sensory Interaction and Change Patterns

In this system, we created a linear human–computer interaction algorithm that lets users adjust the sound intensity by pressing the sensor, which in turn influences their visual perception. This setup realizes the user’s multi-sensory interaction. Based on further numerical analysis, the interaction and change patterns between multiple senses are revealed as follows.
  • It presents a linear influence law from tactile press to auditory perception intensity due to the tactile sensor’s human–computer interaction algorithm, the numerical changes of which are revealed in Table 4.
  • It presents a fluctuating influence law from white noise sound intensity to visual concentration, reflected by the average fixation duration and average saccade count parameters. The user’s visual concentration is considered to improve when the average fixation duration increases and the average saccade count decreases. During the experiment, the average fixation duration fluctuated with increasing sound intensity and reached its minimum value when the sound intensity was about 50 dB, as shown in Figure 14. Simultaneously, the saccade count changed more noticeably with increasing sound intensity and reached its maximum value when the sound intensity was about 50 dB, as shown in Figure 15. Combined with the viewpoint tracking maps in Figure 12, this indicates that user visual concentration is relatively low at a sound intensity of about 50 dB and relatively high at about 60 dB.
  • It presents a curvilinear influence law from white noise sound intensity to visual psychology. The pupil diameter often reflects the level of the user’s visual psychological pressure and visual stimulation tension. In the sound intensity testing experiment, the average pupil diameter showed a curvilinear upward trend with increasing sound intensity, indicating that the degree of tension in the user’s visual psychology gradually increased. When the sound intensity increases beyond 60 dB, the average and maximum pupil diameters continue the curvilinear upward trend, but the minimum pupil diameter decreases significantly, as shown in Figure 16. This indicates that a significant change in user visual psychology occurs when the white noise sound intensity increases beyond 60 dB.

5. Conclusions

This study proposes a novel generation method based on deep learning algorithms to generate ocean white noise. The superiority of this generation method over general mathematical or physical methods was proved through a comparative study including spectrum analysis and auditory testing. A comprehensive perceptual system was developed for multi-modal perception study by generating the audiovisual environment of ocean white noise and dynamic graphics based on deep learning algorithms. The system also incorporates a human–computer interaction algorithm to create an interactive tactile environment, ensuring the participation of the user’s tactile experience. Furthermore, the nonlinear perception characteristics of ocean white noise were analyzed through multi-modal perception research methods, including the following two experiments. The first, an audiovisual comparative experiment, reveals the positive influence of white noise and dynamic graphics on user perception ability. The second, a sound intensity testing experiment, explores the impact of white noise intensity on user visual perception, finding that visual perception ability reaches a relatively high level when the white noise sound intensity is close to 50 dB, as reflected in the distribution of visual fixation points and deep processing areas in the hot spot maps. Based on the experimental results, further numerical analysis was conducted to delve into the multi-sensory interaction and change patterns under ocean white noise stimulation. Through this more in-depth analysis, we found a fluctuating influence law from white noise sound intensity to visual concentration and a curvilinear influence law from white noise sound intensity to visual psychology. These results provide more comprehensive, detailed, and accurate findings, paving the way for the design of more advanced multi-modal perception systems.
However, there is still much room for improvement in our work. For instance, the intelligent algorithms can be improved to create more comprehensive and extensive audiovisual perception environments for user experience and experimental research. In addition, this study has certain limitations, and more extensive research is required in future work, such as analyzing the nonlinear perception characteristics of different types of white noise under different environments and with different subjects to reveal their complexity and universality, laying the foundation for applying ocean white noise in multiple fields. Future work should also explore more in-depth perception characteristics of ocean white noise, for example, by developing more types of ocean white noise, conducting multi-modal perception experiments targeting special populations such as people with poor vision or special needs, and exploring the perception characteristics of ocean white noise in real and virtual environments.

Author Contributions

Conceptualization, T.Q. and Y.L.; Methodology, T.Q.; Investigation, T.Q. and Y.L.; Validation, T.Q. and J.C.; Data curation, J.C.; Writing—original draft preparation, Y.L.; Software, Y.L.; Formal analysis, J.C.; Project administration, Y.L. and J.C.; Writing—review and editing, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the National Social Science Fund’s Rare and Unique Learning Project of China (no. 21VJXG033); the Excellent Youth Program of Philosophy and Social Science of Anhui Universities (no. 2023AH030025); the Key Projects of Humanities and Social Sciences in Anhui Province’s Universities (no. KZ22023077); and the Innovation Team Project of Anhui Polytechnic University (no. KZ42022004).

Data Availability Statement

The datasets supporting the conclusion of this article are included within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Helal, K.M.; Fragasso, J.; Moro, L. Underwater noise characterization of a typical fishing vessel from Atlantic Canada. Ocean Eng. 2024, 229, 117310. [Google Scholar] [CrossRef]
  2. Ma, Z.H.; Li, P.; Wang, L.Z.; Lu, J.; Yang, Y.R. Mechanistic study of noise source and propagation characteristics of flow noise of a submarine. Ocean Eng. 2024, 302, 117667. [Google Scholar] [CrossRef]
  3. Song, R.P.; Feng, X.; Wang, J.F.; Sun, H.X.; Zhou, M.Z.; Esmaiel, H. Underwater acoustic nonlinear blind ship noise separation using recurrent attention neural networks. Remote Sens. 2024, 16, 253. [Google Scholar] [CrossRef]
  4. Zhu, C.Y.; Cao, T.Y.; Chen, L.; Dai, X.B.; Ge, Q.Q.; Zhao, X.Q. High-order domain feature extraction technology for ocean acoustic observation signals: A review. IEEE Access 2023, 11, 17665–17683. [Google Scholar] [CrossRef]
  5. Mahanty, M.M.; Latha, G.; Sanjana, M.C.; Raguraman, G.; Venkatesan, R. Passive acoustic detection of distant ship crossing signal in deep waters using wavelet denoising technique. In Proceedings of the Oceans Conference, Chennai, India, 21–24 February 2022. [Google Scholar]
  6. Nasir, F.; Taib, C.M.I.C.; Ariffin, E.H.; Padlee, S.F.; Akhir, M.F.; Ahmad, M.F.; Yusoff, B. Significant wave height modelling and simulation of the monsoon-influenced South China Sea coast. Ocean Eng. 2023, 227, 114142. [Google Scholar] [CrossRef]
  7. Pickens, T.A.; Khan, S.P.; Berlau, D.J. White noise as a possible therapeutic option for children with ADHD. Complement. Ther. Med. 2019, 42, 151–155. [Google Scholar] [CrossRef]
  8. Liu, Y.; Chen, D.X.; Jin, H.; Wang, T. White noise wave generation method controlled by a rotary valve. J. Vib. Control 2022, 28, 203–213. [Google Scholar] [CrossRef]
  9. Deon, A.F.; Karaduta, O.K.; Menyaev, Y.A. Phase congruential white noise generator. Algorithms 2021, 14, 118. [Google Scholar] [CrossRef]
  10. Yen, C.T.; Chen, U.H. Design of deep learning acoustic sonar receiver with temporal/spatial underwater channel feature extraction capability. Int. J. Eng. Technol. Innov. 2024, 14, 115–136. [Google Scholar]
  11. Lv, J.; Zhang, X.; Zhou, H.; Bai, Y.; Zhao, X.; He, P.; Zhang, Y. Low Probability of Intercept Radar Signal Identification Method, Involves Adding Weighted White Noise Item in Decoder, Where Amplitude Value Output by Coding Machine, Frequency, Phase Characteristic and Convolutional Neural Network Deep Learning. China Patent CN115409057-A, 29 November 2022. [Google Scholar]
  12. Zhu, J.; Song, S.; Yan, S.; Liu, C.; Li, C. Method for Enhancing Underwater Acoustic Target Radiation Noise Based on Deep Learning in Deep Sea Severe Environment, Involves Performing Pitch Filtering Process on Result of Pitch Analysis and Short-Time Fourier Transform, and Completing Acoustic Target Radiation Noise Enhancement Process. China Patent CN117238308-A, 15 December 2023. [Google Scholar]
  13. Jun, H.; Jou, H.T.; Kim, C.H.; Lee, S.H.; Kim, H.J. Random noise attenuation of sparker seismic oceanography data with machine learning. Ocean Sci. 2020, 26, 1367–1383. [Google Scholar] [CrossRef]
  14. Li, Y.; Tang, Y. Novel creation method of feature graphics for image generation based on deep learning algorithms. Mathematics 2023, 11, 1644. [Google Scholar] [CrossRef]
  15. Alimagadov, K.A.; Umnyashkin, S.V. Application of Wiener filter to suppress white noise in images: Wavelet vs Fourier basis. In Proceedings of the 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering, Saint Petersburg, Russia, 26–28 January 2021. [Google Scholar]
  16. Ji, U.C.; Lee, M.R.; Ma, P.C. Generalized Mehler semigroup on white noise functionals and white noise evolution equations. Mathematics 2020, 8, 1025. [Google Scholar] [CrossRef]
  17. Reed, V.A.; Toth, C.A.; Wardle, R.N.; Gomes, D.G.E.; Barber, J.R.; Francis, C.D. Experimentally broadcast ocean surf and river noise alters birdsong. PeerJ 2022, 10, e13297. [Google Scholar] [CrossRef] [PubMed]
  18. Sabbar, Y.; Khan, A.; Din, A. Probabilistic analysis of a marine ecological system with intense variability. Mathematics 2022, 10, 2262. [Google Scholar] [CrossRef]
  19. Hu, L.; Nie, L.F. Dynamics of a stochastic HIV infection model with logistic growth and CTLs immune response under regime switching. Mathematics 2022, 10, 3472. [Google Scholar] [CrossRef]
  20. Warjri, E.; Dsilva, F.; Sanal, T.S.; Kumar, A. Impact of a white noise app on sleep quality among critically ill patients. Nurs. Crit. Care 2022, 27, 815–823. [Google Scholar] [CrossRef]
  21. Akiyama, A.; Tsai, J.D.; Tam, E.W.Y.; Kamino, D.; Hahn, C.; Go, C.Y.; Chau, V.; Whyte, H.; Wilson, D.; McNair, C. The effect of music and white noise on electroencephalographic (EEG) functional connectivity in neonates in the neonatal intensive care unit. J. Child Neurol. 2021, 36, 38–47. [Google Scholar] [CrossRef]
  22. Ohbayashi, W.; Kakigi, R.; Nakata, H. Effects of white noise duration on somatosensory event-related potentials. Neuroreport 2019, 30, 26–31. [Google Scholar] [CrossRef]
  23. Hiwa, S.; Katayama, T.; Hiroyasu, T. Functional near-infrared spectroscopy study of the neural correlates between auditory environments and intellectual work performance. Brain Behav. 2018, 8, e01104. [Google Scholar] [CrossRef]
  24. Masuda, F.; Sumi, Y.; Takahashi, M.; Kadotani, H.; Yamada, N.; Matsuo, M. Association of different neural processes during different emotional perceptions of white noise and pure tone auditory stimuli. Neurosci. Lett. 2018, 665, 99–103. [Google Scholar] [CrossRef]
  25. Hong, J.Y.; Lam, B.; Ong, Z.T.; Ooi, K.; Gan, W.S.; Kang, J.; Yeong, S.; Lee, I.; Tan, S.T. Effects of contexts in urban residential areas on the pleasantness and appropriateness of natural sounds. Sustain. Cities Soc. 2020, 63, 102475. [Google Scholar] [CrossRef]
  26. Fu, J.; Deng, Z.; Liu, C.; Liu, C.; Luo, J.; Wu, J.; Peng, S.; Song, L.; Li, X.; Peng, M.; et al. Intelligent, flexible artificial throats with sound emitting, detecting, and recognizing abilities. Sensors 2024, 24, 1493. [Google Scholar] [CrossRef] [PubMed]
  27. Li, Y.; Tang, Y. Design on intelligent feature graphics based on convolution operation. Mathematics 2022, 10, 384. [Google Scholar] [CrossRef]
  28. Ang, G.; Lim, E.P. Learning semantically rich network-based multi-modal mobile user interface embeddings. ACM Trans. Interact. Intell. Syst. 2022, 12, 34. [Google Scholar] [CrossRef]
  29. El Saddik, A. Multimedia and the tactile internet. IEEE Multimed. 2020, 27, 5–7. [Google Scholar] [CrossRef]
  30. Wah, B.W.; Xu, J.X. Optimizing multidimensional perceptual quality in online interactive multimedia. IEEE Multimed. 2023, 30, 119–128. [Google Scholar] [CrossRef]
  31. Velasco, C.; Salgado-Montejo, A.; Marmolejo-Ramos, F.; Spence, C. Predictive packaging design: Tasting shapes, typefaces, names, and sounds. Food Qual. Prefer. 2014, 34, 88–95. [Google Scholar] [CrossRef]
  32. Pulvermüller, F. How neurons make meaning: Brain mechanisms for embodied and abstract-symbolic semantics. Trends Cogn. Sci. 2013, 17, 458–470. [Google Scholar] [CrossRef] [PubMed]
  33. Kulke, L.; Pasqualette, L. Emotional content influences eye-movements under natural but not under instructed conditions. Cogn. Emot. 2021, 36, 332–344. [Google Scholar] [CrossRef]
  34. Zhang, A.M.; Su, L.; Zhang, Y. EEG data augmentation for emotion recognition with a multiple generator conditional Wasserstein GAN. Complex Intell. Syst. 2021, 8, 3059–3071. [Google Scholar] [CrossRef]
  35. Zheng, W.L.; Liu, W.; Lu, Y.F.; Lu, B.L.; Cichocki, A. EmotionMeter: A multimodal framework for recognizing human emotions. IEEE Trans. Cybern. 2019, 49, 1110–1122. [Google Scholar] [CrossRef] [PubMed]
  36. Khan, W.; Crockett, K.; O’Shea, J.; Hussain, A.; Khan, B.M. Deception in the eyes of deceiver: A computer vision and machine learning based automated deception. Expert Syst. Appl. 2021, 169, 114341. [Google Scholar] [CrossRef]
  37. Conati, C.; Lallé, S.; Rahman, M.A.; Toker, D. Comparing and Combining Interaction Data and Eye-tracking Data for the Real-time Prediction of User Cognitive Abilities in Visualization Tasks. ACM Trans. Interact. Intell. Syst. 2020, 10, 12. [Google Scholar] [CrossRef]
  38. Scott, N.; Zhang, R.; Le, D.; Moyle, B. A review of eye-tracking research in tourism. Curr. Issues Tour. 2019, 22, 1244–1261. [Google Scholar] [CrossRef]
  39. Carter, B.T.; Luke, S.G. Best practices in eye tracking research. Int. J. Psychophysiol. 2020, 155, 49–62. [Google Scholar] [CrossRef]
  40. Spering, M. Eye movements as a window into decision-making. Annu. Rev. Vis. Sci. 2022, 8, 427–448. [Google Scholar] [CrossRef]
  41. Klaib, A.F.; Alsrehin, N.O.; Melhem, W.Y.; Bashtawi, H.O.; Magableh, A.A. Eye tracking algorithms, techniques, tools, and applications with an emphasis on machine learning and internet of things technologies. Expert. Syst. Appl. 2021, 166, 114037. [Google Scholar] [CrossRef]
  42. Yan, B.; Pei, T.Y.; Wang, X.J. Wavelet method for automatic detection of eye-movement behaviors. IEEE Sens. J. 2019, 19, 3085–3091. [Google Scholar] [CrossRef]
  43. Katona, J. Measuring cognition load using eye-tracking parameters based on algorithm description tools. Sensors 2022, 22, 912. [Google Scholar] [CrossRef]
Figure 1. Construction method of ocean white noise perceptual system.
Figure 2. Generation method of ocean white noise and dynamic graphics.
Figure 3. The DCNN network structure.
Figure 4. Image feature processing and extraction process.
Figure 5. The feature point sampling and optimization. (a) Inverse feature map of waves; (b) initial feature point sampling map; (c) optimized feature point sampling map; (d) the line graphic.
Figure 6. Algorithm flow for generating dynamic graphics.
Figure 7. Displaying background design for 8 types of white noise.
Figure 8. Terminal display.
Figure 9. The human–computer interaction algorithm.
Figure 10. User viewpoint tracking map under different audiovisual environments. (a) Static graphic in white noise; (b) dynamic graphic in white noise; (c) dynamic graphic in silence.
Figure 11. The hot spot map under different audiovisual environments. (a) Static graphic in white noise; (b) dynamic graphic in white noise; (c) dynamic graphic in silence.
Figure 12. User viewpoint tracking map at different sound intensity levels: (a) at 30 dB; (b) at 40 dB; (c) at 50 dB; (d) at 60 dB; (e) at 70 dB.
Figure 13. User hot spot map at different sound intensity levels (superposition of 10 subjects): (a) at 30 dB; (b) at 40 dB; (c) at 50 dB; (d) at 60 dB; (e) at 70 dB.
Figure 14. Fixation duration influence.
Figure 15. Saccade influence.
Figure 16. Pupil diameter influence.
Table 1. The comparison test result statistics of spectrum analysis and auditory testing.

| Generating Method | Spectrum Analysis (Fitness Percent: 0–100%) | Auditory Testing (Authenticity Rating: 0–10) | Auditory Testing (Comfort Rating: 0–10) |
|---|---|---|---|
| Uniform random sequences | 59.31–82.64% | 4.89–8.15 | 3.52–7.94 |
| Autocorrelation coefficients matrix | 63.54–87.96% | 5.23–8.68 | 6.86–8.73 |
| Deep learning network | 83.42–92.53% | 7.86–9.25 | 8.12–9.16 |
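The fitness percentages in Table 1 score how closely each generated signal's spectrum matches the flat power spectrum expected of ideal white noise. The paper does not publish its scoring routine, so the following is a minimal sketch of one plausible metric under stated assumptions: fitness is taken as the band-averaged spectral similarity between a generated sample and a reference white noise signal, and the function name spectral_fitness and its parameters are hypothetical.

```python
import numpy as np

def spectral_fitness(generated: np.ndarray, reference: np.ndarray,
                     n_bands: int = 64) -> float:
    """Score spectral similarity in percent (100 = identical band shape)."""
    def band_power(x: np.ndarray) -> np.ndarray:
        psd = np.abs(np.fft.rfft(x)) ** 2      # one-sided power spectrum
        bands = np.array_split(psd, n_bands)   # equal-width frequency bands
        p = np.array([b.mean() for b in bands])
        return p / p.sum()                     # normalize for comparison
    p_gen, p_ref = band_power(generated), band_power(reference)
    # 1 minus the total variation distance between the two band profiles
    return 100.0 * (1.0 - 0.5 * np.abs(p_gen - p_ref).sum())

# Illustration: score a candidate sample against ideal Gaussian white noise.
rng = np.random.default_rng(0)
reference = rng.standard_normal(48_000)
candidate = rng.standard_normal(48_000)  # stand-in for a generated sample
print(f"fitness: {spectral_fitness(candidate, reference):.2f}%")
```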
Table 2. Relevant data statistics of the audiovisual stimulation comparative experiment.

| Eye-Tracking Data | Dynamic Graphic in Silence | Dynamic Graphic in White Noise | Static Graphic in White Noise |
|---|---|---|---|
| First entry fixation time FF (s) | 0.02 | 0.02 | 0.10 |
| Average fixation duration AF (s) | 0.62 | 0.45 | 0.47 |
| Average saccade count AS (N/s) | 1.90 | 2.19 | 1.92 |
| Mean pupil diameter MP (mm) | 3.14 | 3.16 | 3.12 |
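The four metrics in Table 2 are standard aggregates of raw eye-tracker output. As a minimal sketch of how such values can be derived, the snippet below assumes each trial yields a time-ordered list of fixation events with onset, duration, and pupil size; the Fixation record and summarise helper are hypothetical, and the saccade rate is approximated as fixation transitions per second.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    start: float      # fixation onset in s, relative to stimulus onset
    duration: float   # fixation duration in s
    pupil_mm: float   # mean pupil diameter during the fixation in mm

def summarise(fixations: list[Fixation], trial_len_s: float) -> dict[str, float]:
    """Aggregate raw fixations into the four metrics reported in Table 2."""
    ff = fixations[0].start                        # first entry fixation time
    af = sum(f.duration for f in fixations) / len(fixations)
    as_rate = (len(fixations) - 1) / trial_len_s   # saccades per second
    mp = sum(f.pupil_mm for f in fixations) / len(fixations)
    return {"FF (s)": ff, "AF (s)": af, "AS (N/s)": as_rate, "MP (mm)": mp}

# Illustration with three invented fixations over a 5 s trial.
demo = [Fixation(0.02, 0.45, 3.16), Fixation(0.60, 0.40, 3.18),
        Fixation(1.20, 0.50, 3.14)]
print(summarise(demo, trial_len_s=5.0))
```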
Table 3. Significant difference analysis on the eye-tracking data with different stimulus materials (F/P analysis; FF = first entry fixation time, AF = average fixation duration, AS = average saccade count, MP = mean pupil diameter).

| Stimulation | FF: F / P / Significance | AF: F / P / Significance | AS: F / P / Significance | MP: F / P / Significance |
|---|---|---|---|---|
| Dynamic/static graphic | 21.190 / 0.000 / very | 3.181 / 0.017 / relatively | 2.136 / 0.024 / relatively | 0.002 / 0.045 / commonly |
| White noise/silence | 0.245 / 0.622 / none | 1.541 / 0.047 / commonly | 1.786 / 0.052 / commonly | 0.032 / 0.038 / commonly |
Table 4. The relevant data statistics of the sound intensity testing experiment.

| Sound intensity change (dB) | 30 | 40 | 50 | 60 | 70 |
|---|---|---|---|---|---|
| Tactile pressure (Pa) | 30 | 40 | 50 | 60 | 70 |
| First entry fixation time FF (s) | 0.01 | 0.02 | 0.02 | 0.02 | 0.01 |
| Average fixation duration AF (s) | 0.58 | 0.52 | 0.45 | 0.59 | 0.53 |
| Average number of saccades AS (N/s) | 1.86 | 1.88 | 2.19 | 1.92 | 1.94 |
| Mean pupil diameter MP (mm) | 3.14 | 3.14 | 3.16 | 3.20 | 3.24 |
Table 5. Variance and statistical significance analysis on user perception data groups at different sound intensity levels.

| | First Entry Fixation Time | Average Fixation Duration | Average Number of Saccades | Mean Pupil Diameter |
|---|---|---|---|---|
| F | 0.402 | 0.779 | 1.786 | 0.376 |
| P | 0.037 | 0.034 | 0.023 | 0.046 |
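The F and P values in Tables 3 and 5 come from variance-based significance testing across the perception data groups. The sketch below reproduces the shape of that analysis with a one-way ANOVA over five hypothetical per-subject groups, one per sound intensity level; the group means follow the average fixation durations in Table 4, while the spreads and the group size of ten subjects are invented for illustration.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical per-subject average fixation durations (s): one group of
# ten subjects per sound intensity level, means taken from Table 4.
rng = np.random.default_rng(1)
means = {30: 0.58, 40: 0.52, 50: 0.45, 60: 0.59, 70: 0.53}
groups = [rng.normal(m, 0.05, size=10) for m in means.values()]

# One-way ANOVA across the five groups yields F and P values of the kind
# reported for average fixation duration in Table 5.
F, p = f_oneway(*groups)
print(f"F = {F:.3f}, P = {p:.3f}")
```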