1. Introduction
When ocean waves propagate toward the shoreline, the decrease in water depth shortens the wavelength and increases the wave height. Upon reaching a physical limit, the wave can no longer maintain its form and breaks, which is defined as the “Wave-Breaking” phenomenon [1]. Wave breaking drives various processes while dissipating the wave energy transported from deep water, such as impact waves, longshore currents, rip currents, and sediment transport, and it also affects vessels and the stability of coastal structures [2]. Therefore, information on wave breaking is essential for the design and maintenance of coastal structures. In addition, it is a necessary factor for predicting and responding to sediment transport and morphological changes in the nearshore area. Among the physical quantities related to wave breaking, the most important are the wave height and water depth at the breaking position, which are defined as the “Breaking-Wave Height” and “Breaking-Water Depth,” respectively. These quantities, called wave-breaking indexes, indicate the starting point of the breaking wave and the maximum wave height in the nearshore, and numerous studies have been conducted to estimate them [3,4,5,6,7,8]. However, owing to strong turbulence and nonlinearity, observing and predicting coastal breaking waves is complex, and the subject remains challenging [9,10].
Initial studies on wave breaking began with the theoretical approaches presented by Michell [11] and McCowan [12]. Michell proposed a limiting wave steepness of 0.142 for deep-water waves using the relationship between the water-particle motion and wave celerity at the wave crest. McCowan proposed a limiting ratio of 0.78 between the wave height and water depth at breaking by considering a solitary wave in shallow water. Subsequently, Miche [13] applied the linear wave theory to the results of Michell and expressed the breaking-wave height under the monochromatic regular-wave condition using the wavelength and water depth at the breaking point as input variables. Later, Ippen and Kulin [14] pointed out the inadequacy of such formulas, noting that the wave-breaking mechanism of a solitary wave is not suitable for estimating the breaking-wave height and water depth of wind waves.
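As a hedged illustration of the theoretical limits above, the sketch below evaluates the 0.142 steepness limit and the 0.78 height-to-depth ratio quoted in the text, together with the commonly quoted Miche-type expression $H_b = 0.142\,L_b \tanh(2\pi h_b / L_b)$. The function names and sample values are hypothetical, and the Miche form is stated here as an assumption rather than taken from this paper.

```python
import math

MICHELL_STEEPNESS_LIMIT = 0.142  # limiting deep-water steepness H/L (value quoted in the text)
MCCOWAN_RATIO = 0.78             # limiting H_b/h_b for a solitary wave (value quoted in the text)


def miche_breaking_height(wavelength_b, depth_b):
    """Assumed Miche-type limit: H_b = 0.142 * L_b * tanh(2*pi*h_b / L_b)."""
    return MICHELL_STEEPNESS_LIMIT * wavelength_b * math.tanh(
        2.0 * math.pi * depth_b / wavelength_b
    )


def mccowan_breaking_height(depth_b):
    """McCowan-type limit: H_b = 0.78 * h_b."""
    return MCCOWAN_RATIO * depth_b


if __name__ == "__main__":
    # Hypothetical breaking-point values in metres.
    print(miche_breaking_height(wavelength_b=10.0, depth_b=100.0))  # deep water: tanh -> 1
    print(miche_breaking_height(wavelength_b=100.0, depth_b=1.0))   # shallow water
    print(mccowan_breaking_height(depth_b=1.0))
```

In deep water the hyperbolic tangent tends to 1 and the expression reduces to the 0.142 steepness limit. In shallow water it tends to roughly 0.89 times the depth, which is of the same order as the 0.78 ratio.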
It has been reported that estimating the breaking-wave height and water depth with the linear wave theory produces various errors because of the wave nonlinearity induced by the shallow-water-depth effect [8,15,16]. To overcome these limitations, numerous studies have sought empirical wave-breaking formulas for the breaking-wave height and water depth. In these studies, the breaking-wave height and water depth were estimated by reproducing various wave conditions in wave-flume laboratory experiments and measuring the successive wave transformations as a function of the water depth [17,18,19]. Maruyama et al. [20], Stive [21], and Smith and Kraus [15] performed wave-flume laboratory experiments by installing offshore structures resembling the actual coastal terrain or by setting a sand bar (formed by sand moved out towards the open sea by waves) as the topographical condition. Numerous empirical wave-breaking formulas have been proposed based on wave-flume laboratory experiment data obtained under various conditions, including the above [8,9,22,23,24,25,26,27,28,29,30]. However, because these empirical formulas depend on specific laboratory experiment data, their performance is not consistent under generalized conditions. Furthermore, the methods and basic forms of the breaking index used to develop the formulas differ from one another. In particular, the empirical wave-breaking formulas suggested by Goda [8], Robertson et al. [9], Tadayon et al. [30], and others showed a high degree of reproducibility because the estimated breaking-wave height and water depth were themselves used as input values for the equations, but this makes them insufficient for coastal engineering applications, in which these quantities are not known in advance.
In this study, a method based on a multilayer neural network with a nonlinear activation function and backpropagation is proposed to estimate the breaking-wave height and breaking-water depth by learning the nonlinear relationship between the deep-water wave condition, bottom slope, and wave-breaking index [31]. To improve applicability and usability relative to the previous wave-breaking formulas, the proposed method is designed to obtain both breaking-wave indexes (breaking-wave height and water depth) simultaneously. Additionally, because the inputs and outputs are used directly without nondimensionalization, errors arising from the secondary transformation of raw wave data are avoided. Experimental data for monochromatic regular waves published in previous studies were collected and used for training the network. The proposed neural network can use all of the data acquired from various laboratories without distinguishing between the experimental conditions, which is intended to make the model robustly applicable across laboratory experiment conditions. Finally, the performance of the proposed method is evaluated in comparison with the existing breaking-wave-index formulas.
2. Related Works
The existing breaking-wave formulas use deep-water wave data and the bottom slope as input variables and reproduce the breaking-wave height and wave-breaking location. In this section, the accuracy and significance of these formulas are summarized. The laboratory experiment data for monochromatic regular waves from these previous studies were also used to construct the neural network for estimating the breaking-wave height and water depth. The variables used in this study are defined as follows: (1) $H_0$, $h_0$, and $L_0$ indicate the deep-water wave height, water depth, and wavelength, respectively; (2) $H_b$ and $h_b$ denote the breaking-wave height and breaking-water depth, respectively; and (3) $m$ denotes the gradient of the bottom slope.
Le Mehaute and Koh [32] considered the deep-water wave steepness and bottom slope as input variables and predicted the breaking-wave height nondimensionalized by the deep-water wave height using Equation (1). This equation was derived using gentle-slope data and has high utilization potential, as it demonstrated sound predictive performance over the majority of slope ranges [33]. Additionally, this equation was the first attempt to consider the wave steepness and bottom slope simultaneously in calculating the breaking-wave height. Subsequently, it was partially modified by Galvin [34] and Collins and Weir [35]. Data from wave-flume laboratory experiments performed by Suquet [36], Hamada [37], and Iversen [17] were used to develop the equation, and the data were within the following ranges: $0.02 \le m \le 0.2$ and $0.002 \le H_0/L_0 \le 0.093$. From these datasets, the experimental results published by Iversen [17] were collected and used in this study.
Rattanapitikon and Shibayama [28] used 574 experimental data points in the ranges $0 \le m \le 0.38$ and $0.001 \le H_0/L_0 \le 0.100$ to validate the existing wave-breaking formulas. In addition, they modified the terms that correspond to the bottom slope in the equations developed by Komar and Gaughan [19], Goda [22], and Ostendorf and Madsen [24].
Rattanapitikon and Shibayama [38] added approximately 100 large-wave-flume experimental data points ($0 \le m \le 0.29$ and $0.003 \le H_0/L_0 \le 0.112$) from Kajima et al. [39] and Smith and Kraus [40] and proposed new forms of empirical equations that predict the breaking-wave height and water depth. These are presented in Equations (2) and (3).
In Rattanapitikon and Shibayama [38], the deep-water wave steepness and bottom slope were used as input variables to estimate the breaking-wave height and water depth, both nondimensionalized by the deep-water wavelength. The laboratory data used to fit the equations comprised 695 data points from 26 wave-flume laboratory experiments. In this study, the 351 accessible data points were used to construct a neural-network model for predicting the breaking-wave indexes (breaking-wave height and water depth).
Xie et al. [29] restricted the wave-breaker type to “plunging” to improve the accuracy of the breaking-water-depth prediction. The new formula was developed based on the linear wave theory, which broadens the range of input wave conditions to which it can be applied. However, the equation is complicated and applicable only to one wave-breaker type (plunging). A semi-empirical formula, Equation (4), was proposed, and its coefficients were fitted using 242 data points acquired from six sources and covering the ranges $0.0125 \le m \le 0.2$ and $0.0016 \le H_0/L_0 \le 0.092$. The proposed equation was verified using the authors' own 25 wave-flume laboratory experiments and eight plunging-wave data points from Lara et al. [41]. All of these experimental data were collected and used in the training and validation of the proposed model.
Lee and Cho [42] used 860 data points obtained from previous experimental datasets in the ranges $0.01 \le m \le 0.2$ and $0.018 \le H_0/L_0 \le 0.1272$. They derived linear relationships from these data using several linear regression methods and a feed-forward neural network. The breaking-wave height and water depth, nondimensionalized by the deep-water wavelength, were expressed in terms of the deep-water wave steepness and bottom slope, as shown in Equations (5) and (6).
The equations of Lee and Cho [42] fit the linear relationship between the deep-water wave condition, bottom slope, and wave-breaker index very well. However, both equations tend to overestimate the results for gentle bottom slopes ($m \le 0.02$). In addition, the linear relation limits the performance when the nonlinearity of the wave is strong or when the scale effect of the experiment is important. In this study, 433 data points were collected and used to construct the newly proposed model.
3. Methodology
3.1. Data
In this study, openly published wave-flume laboratory experimental data were used in developing the new breaking-wave-index-prediction model that applies a neural network with a nonlinear activation function and backpropagation to the estimation of the breaking-wave height and water depth.
Table 1 summarizes the sources and experimental conditions of the 630 data points collected from 31 previous studies. A substantial part of the data was obtained from Gaughan [43] and Smith and Kraus [15], and these are indicated as “*” and “**” in the Source column of Table 1, respectively. The remaining data were collected from Bowen et al. [44], Weggel and Maxwell [45], Ozaki et al. [46], Van Dorn [47], Kirgoz [48], Ishida and Yamaguchi [49], Sakai et al. [50], Ting and Kirby [51], Kakuno et al. [52], Yüksel et al. [53], Hoque [54], Shin and Cox [55], Deo and Jagdale [56], Lara et al. [41], Mori and Kakuno [57], and Xie et al. [29]. Each dataset consists of input values, such as the deep-water wave height, bottom slope, and period or wavelength, and the breaking-wave heights and water depths obtained from wave-flume laboratory experiments with slopes between 0.009 and 0.225. Among the data from Iversen [17], Ishida and Yamaguchi [49], Yüksel et al. [53], and Xie et al. [29], those that did not include the breaking-wave heights or water depths were excluded from this study. The breaking-wave heights and water depths of the 584 data points used in this study are plotted in Figure 1. In Figure 1, the isolated data in the upper right corner are the results of the experiments performed by Maruyama et al. [20] and Stive [21] in a large wave flume under a movable-bed condition. Other wave-flume laboratory data exist beyond those acquired here. In this study, only easily accessible data were used in order to facilitate the reproducibility of the process and results for estimating the breaking-wave height and water depth.
The 584 data points were divided into training and test sets at a ratio of approximately 3.5:1 such that the bottom slope, deep-water wave height, and period were distributed evenly according to the experimental conditions. The training and test sets thus contained 455 and 129 data points, respectively. Figure 2 presents the training data (outer circle) and test data (inner circle), together with the composition of the experimental conditions under which each wave datum was acquired.
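The text states that the split keeps the experimental conditions evenly represented but does not give the exact procedure. One way such a split could be done is a stratified split, sketched below under that assumption. The binning rule, field names, and synthetic records are hypothetical and are not the paper's data.

```python
import random
from collections import defaultdict


def stratified_split(records, condition_key, test_fraction=0.22, seed=0):
    """Split records into train/test so each condition bin keeps roughly the same share."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for record in records:
        bins[condition_key(record)].append(record)

    train, test = [], []
    for key in sorted(bins):
        members = bins[key][:]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def slope_bin(record):
    """Hypothetical binning rule: gentle versus steep bottom slope."""
    return "gentle" if record["slope"] <= 0.05 else "steep"


if __name__ == "__main__":
    # Synthetic records for illustration only.
    records = [{"slope": 0.02, "H0": 0.05 + 0.001 * i} for i in range(50)]
    records += [{"slope": 0.10, "H0": 0.05 + 0.001 * i} for i in range(50)]
    train, test = stratified_split(records, slope_bin, test_fraction=0.2)
    print(len(train), len(test))
```

In practice the bins would combine several conditions (slope, wave height, period, data source), so that the test set spans the same ranges as the training set.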
3.2. Multilayer Neural Network for Estimating Wave-Breaking Index
To extract useful features from the wave-breaking data, we propose a fully connected multilayer neural network, also called a deep neural network (DNN), which automatically learns appropriate weights and thresholds from the structure of the data. The network is characterized by its number of hidden layers, its number of neurons per layer, and its connections per unit, and it uses a nonlinear activation function and backpropagation. Figure 3 illustrates a multilayer neural network with three layers, where the input layer, denoted as $z$, represents the factors that affect wave breaking (the bottom slope and the deep-water wave height and period), and the output layer, denoted as $a$, refers to the breaking-wave height and breaking-water depth.
We used the sigmoid activation function [58], which transforms the weighted sum of the inputs, $x$, into the output of a neuron in a hidden layer of the network as follows:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
The activation function is nonlinear and may be referred to as nonlinearity in the layer or the network design. The function takes any real value as an input and outputs values in the range of 0 to 1. The larger the input (more positive), the closer the output value will be to 1.0, whereas the smaller the input (more negative), the closer the output will be to 0.0.
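The behaviour described above can be checked with a minimal sketch of the sigmoid and of a single hidden neuron. The helper names, weights, and input values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def neuron_output(inputs, weights, bias):
    """Hidden-neuron output: sigmoid of the weighted sum of the inputs plus a bias."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


if __name__ == "__main__":
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(x, sigmoid(x))
    # Hypothetical standardized inputs (slope, wave height, period) and weights.
    print(neuron_output([0.3, -1.2, 0.8], [0.5, -0.4, 0.1], bias=0.05))
```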
Backpropagation determines the optimal learning result by propagating the error in the reverse direction, opposite to the feed-forward pass. First, the input is transmitted to the final output during the feed-forward process. Second, the error and cost function are determined at the final output layer. Thereafter, in the backpropagation process, the errors between the network outputs and the actual values are propagated in the reverse direction, and each weight and bias value of the neurons is updated.
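The three steps above can be sketched for a small network with one sigmoid hidden layer and a linear output layer. This is an illustrative sketch only: the layer sizes, the plain gradient-descent update, the learning rate, and the single made-up training sample are assumptions, not the configuration used in the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, W1, b1, W2, b2):
    """Feed-forward pass: sigmoid hidden layer followed by a linear output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    output = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)]
    return hidden, output


def train_step(x, y, W1, b1, W2, b2, lr):
    """One feed-forward pass plus one backpropagation update; returns the pre-update MSE."""
    hidden, output = forward(x, W1, b1, W2, b2)
    n_out = len(y)
    # Error signal at the output layer: d(MSE)/d(output).
    delta_out = [2.0 * (o - t) / n_out for o, t in zip(output, y)]
    # Propagate the error backwards through W2 and the sigmoid derivative h * (1 - h).
    delta_hid = [
        sum(delta_out[k] * W2[k][j] for k in range(n_out)) * hidden[j] * (1.0 - hidden[j])
        for j in range(len(hidden))
    ]
    # Gradient-descent update of every weight and bias (in place).
    for k in range(n_out):
        for j in range(len(hidden)):
            W2[k][j] -= lr * delta_out[k] * hidden[j]
        b2[k] -= lr * delta_out[k]
    for j in range(len(hidden)):
        for i in range(len(x)):
            W1[j][i] -= lr * delta_hid[j] * x[i]
        b1[j] -= lr * delta_hid[j]
    return sum((o - t) ** 2 for o, t in zip(output, y)) / n_out


def train_demo(steps=200, lr=0.1, n_hidden=4, seed=0):
    """Train on one made-up standardized sample and return the loss history."""
    rng = random.Random(seed)
    x = [0.5, -1.0, 0.3]  # stand-ins for slope, deep-water wave height, period
    y = [0.2, 0.4]        # stand-ins for breaking-wave height, breaking-water depth
    W1 = [[rng.uniform(-0.5, 0.5) for _ in x] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in y]
    b2 = [0.0] * len(y)
    return [train_step(x, y, W1, b1, W2, b2, lr) for _ in range(steps)]


if __name__ == "__main__":
    losses = train_demo()
    print(losses[0], losses[-1])
```

The hidden-layer error signal is computed before any weight is changed, so both layers are updated from gradients of the same forward pass.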
Optimizers for training neural networks are responsible for finding the free parameters (usually the weights) that minimize a cost function, which typically includes a performance measure evaluated on the training set and additional regularization terms. Adaptive moment estimation (Adam) is an extension of the root-mean-square propagation (RMSProp) optimizer that incorporates momentum [59]; i.e., in addition to storing an exponentially decaying average of the previous squared gradients, Adam also employs an exponentially decaying average of the previous gradients. The loss function employed at training time is the mean-squared error (MSE), given as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$
The MSE is the average of the squared differences between the estimated ($\hat{y}$) and actual ($y$) values over the $n$ data points. The number of epochs and the batch size for the experiment were 317 and 1, respectively, and the learning rate was 0.0001. All data were standardized using the z-score to put the different variables on the same scale.
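The training ingredients named above (z-score standardization, the MSE loss, and Adam updates with a batch size of 1) can be sketched together on a toy problem. This is not the paper's network: a single line $y = wx + b$ is fitted to synthetic data, the decay rates `beta1`, `beta2` and the constant `eps` are the commonly used Adam defaults (an assumption, since the paper does not list them), and the learning rate and epoch count are enlarged from the paper's 0.0001 and 317 so that the toy example converges quickly.

```python
import math
import statistics


def zscore(values):
    """Standardize values to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std; the variant used in the paper is not stated
    return [(v - mean) / std for v in values]


def mse(estimated, actual):
    """Mean of the squared differences between estimated and actual values."""
    return sum((e - a) ** 2 for e, a in zip(estimated, actual)) / len(actual)


def adam_fit_line(xs, ys, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, epochs=500):
    """Fit y ~ w*x + b by minimizing the squared error with Adam, one sample per update."""
    params = [0.0, 0.0]  # [w, b]
    m = [0.0, 0.0]       # decaying average of gradients (momentum term)
    v = [0.0, 0.0]       # decaying average of squared gradients (RMSProp term)
    t = 0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            t += 1
            err = params[0] * x + params[1] - y
            grads = [2.0 * err * x, 2.0 * err]
            for i, g in enumerate(grads):
                m[i] = beta1 * m[i] + (1.0 - beta1) * g
                v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
                m_hat = m[i] / (1.0 - beta1 ** t)  # bias-corrected first moment
                v_hat = v[i] / (1.0 - beta2 ** t)  # bias-corrected second moment
                params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


if __name__ == "__main__":
    raw_x = [1.0, 2.0, 3.0, 4.0, 5.0]   # synthetic inputs
    raw_y = [3.0, 5.0, 7.0, 9.0, 11.0]  # synthetic targets, y = 2x + 1
    xs, ys = zscore(raw_x), zscore(raw_y)
    w, b = adam_fit_line(xs, ys)
    print(w, b, mse([w * x + b for x in xs], ys))
```

Because both variables are standardized, the exact linear relation in the synthetic data becomes a line with slope 1 and intercept 0, which is what the fit should approach.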
In contrast to the previous empirical equations, the raw wave period was used as an input instead of the deep-water wavelength because the dimensions of the input and output did not need to be matched. Moreover, this eliminated the possibility of errors arising from the dispersion relation of the linear wave theory, which is used to convert between the wave period and the deep-water wavelength.
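For reference, the conversion that the proposed model avoids is the deep-water form of the linear dispersion relation, $L_0 = gT^2/(2\pi)$. The sketch below illustrates it. The function names and the value of $g$ are assumptions made for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def deep_water_wavelength(period_s, g=G):
    """Linear-theory deep-water wavelength: L0 = g * T**2 / (2 * pi)."""
    return g * period_s ** 2 / (2.0 * math.pi)


def deep_water_period(wavelength_m, g=G):
    """Inverse relation: T = sqrt(2 * pi * L0 / g)."""
    return math.sqrt(2.0 * math.pi * wavelength_m / g)


def deep_water_steepness(height_m, period_s, g=G):
    """Deep-water wave steepness H0 / L0, the input used by the empirical formulas."""
    return height_m / deep_water_wavelength(period_s, g)


if __name__ == "__main__":
    print(deep_water_wavelength(4.0))      # wavelength for a 4 s wave
    print(deep_water_steepness(0.1, 2.0))  # steepness of a 0.1 m, 2 s wave
```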
3.3. Evaluation Metrics
The bias ($B$), root-mean-square error ($RMSE$), and Pearson correlation coefficient ($R$) were used as the evaluation metrics to verify the newly constructed breaking-wave-index-prediction model. $B$ and $RMSE$ indicate the difference between the actual value $y$ and the value estimated through the model $\hat{y}$, and $R$ numerically expresses the similarity between the two values. The model performance is considered high when the absolute value of $B$ is small, the $RMSE$ value is small, and the $R$ value is close to 1. There is no absolute standard, but an $R$ value of 0.8 or higher generally implies a suitable correlation between the estimations of the prediction model and the real values.
Here, $\bar{y}$ and $\bar{\hat{y}}$ indicate the averages of $y$ and $\hat{y}$, respectively, and $n$ refers to the number of data points.
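The three metrics can be computed as in the following sketch, which uses their standard definitions. The sign convention of the bias (estimated minus actual) is an assumption, and the function names are hypothetical.

```python
import math


def bias(estimated, actual):
    """Mean signed error; the sign convention (estimated minus actual) is an assumption."""
    return sum(e - a for e, a in zip(estimated, actual)) / len(actual)


def rmse(estimated, actual):
    """Root-mean-square error between estimated and actual values."""
    return math.sqrt(sum((e - a) ** 2 for e, a in zip(estimated, actual)) / len(actual))


def pearson_r(estimated, actual):
    """Pearson correlation coefficient between estimated and actual values."""
    n = len(actual)
    mean_e = sum(estimated) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimated, actual))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_e * var_a)


if __name__ == "__main__":
    estimated = [0.11, 0.19, 0.32, 0.41]  # made-up predictions in metres
    actual = [0.10, 0.20, 0.30, 0.40]     # made-up observations in metres
    print(bias(estimated, actual), rmse(estimated, actual), pearson_r(estimated, actual))
```

A model can have a bias near zero and still a large RMSE if its positive and negative errors cancel, which is why the three metrics are read together.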
4. Results
In this study, the 129 test data points, which were not used to train the network, were used to evaluate the wave-breaking-index-estimation performance. The values of B, RMSE, and R were 0.004, 0.019, and 0.894 for the breaking-wave height and 0.005, 0.021, and 0.921 for the breaking-water depth, respectively. These results were compared with those of approaches that model a linear relationship using multiple regression analysis. In contrast to previous breaking-wave formulas, which nondimensionalize the wave height and water depth, the proposed model uses the dimensional values of the deep-water wave height, period, bottom slope, breaking-wave height, and breaking-water depth. This makes more accurate inference possible by excluding the errors from the secondary transformation of the raw wave data. Therefore, the existing breaking-wave formulas were reorganized so that the performance of the proposed model and that of the previous formulas in estimating the breaking-wave height and water depth could be evaluated and compared on the same basis (Table 2).
Figure 4 compares the performance of the proposed model with that of the three existing breaking-wave formulas for the breaking-wave height and water depth. Overall, no significant difference in performance was observed between the proposed model and the previous breaking-wave formulas. Excluding the equations of Rattanapitikon and Shibayama (CA_Hb_2 and CA_hb_2) in Figure 4, the others were considered to have suitable predictive performance, as the absolute values of B for both the breaking-wave height and water depth were less than or equal to 0.005.
A scatterplot of the entire test dataset is presented in Figure 5 to allow a more precise comparison between the proposed model and the linear regression equations (CA_Hb_3 and CA_hb_3) that Lee and Cho [42] obtained using a feed-forward neural network with a linear activation function. The equations of Lee and Cho (CA_Hb_3 and CA_hb_3) present, on average, the lowest absolute value of B for all the targets, but they have a higher RMSE and a lower R than the proposed model. This is presumably because, as seen in Figure 5a,b, the equations of Lee and Cho produced unreasonable estimations for some data. Three data points showed this issue: two were obtained from the study of Galvin [34] and the other from that of Ting and Kirby [51]. These data correspond to small-scale conditions with a period equal to or greater than 4 s. This implies that the equations of Lee and Cho (CA_Hb_3 and CA_hb_3) were ineffective in estimating the breaking-wave parameters under certain conditions. In contrast, the proposed model produced reasonable predictions for the test data under all conditions. This is presumably because the training and test data were separated as evenly as possible with respect to the data conditions, so that the proposed model could properly learn the irregularities inherent in the wave-flume laboratory experimental data. This contributed to enhancing the versatility of the proposed model.
Meanwhile, the equations of Le Mehaute and Koh (CA_Hb_1) and Xie et al. (CA_hb_1) present acceptable performance in predicting the breaking-wave height and water depth, respectively. However, each of these equations yields only one wave-breaking parameter, so they are less practical and applicable than the proposed model. In contrast, the equations of Rattanapitikon and Shibayama (CA_Hb_2 and CA_hb_2) show a lower RMSE and a higher R, but an equal or larger absolute value of B, compared with the proposed model. In conclusion, when B is regarded as the highest-priority criterion for assessing accuracy, given that the amount of training data is small, and when practical applicability is taken into account, the proposed model is better than the existing breaking-wave formulas.
For a fair comparison between the proposed and conventional methods, a new dataset was constructed from the test data by excluding the data that had been used to derive each previous breaking-wave formula, and the performance analysis was repeated using the same method as before. In contrast to the above case, in which the entire test dataset was applied, only the statistics-based nonlinear equations (CA_Hb_2 and CA_hb_2) and the linear regression equations obtained with the feed-forward neural network (CA_Hb_3 and CA_hb_3), which can calculate both breaking-wave parameters, were compared. The newly structured datasets contained 55 and 34 data points for the comparisons with Rattanapitikon and Shibayama [38] and Lee and Cho [42], respectively. Their compositions according to the bottom slope and deep-water wave height and period are presented in Figure 6.
The performances of the multilayer neural network and the equations of Rattanapitikon and Shibayama (CA_Hb_2 and CA_hb_2) for 55 of the 129 test data points (excluding the data used to derive the Rattanapitikon and Shibayama equations) are provided in Figure 7. Figure 8 presents the breaking-wave height and water depth predictions obtained using the proposed model and the Rattanapitikon and Shibayama equations for this dataset. First, the performance in estimating the breaking-wave height is compared in Figure 7a–c. The proposed model demonstrated a higher performance for B than the empirical equation of Rattanapitikon and Shibayama (CA_Hb_2). This is reflected in Figure 8a, where the majority of the values predicted by the empirical equation lie below the perfect agreement line, whereas the values predicted by the proposed model lie close to the line in all areas. The proposed model and the empirical equation had similar results for RMSE and R. The performances for the breaking-water depth are evaluated in Figure 7d–f. Unlike the breaking-wave height results, the values of B obtained using the proposed model and the empirical equation (CA_hb_2) were similar, whereas the empirical equation exhibited higher performance in terms of RMSE and R. This is presumably because, as shown in Figure 8b, the predicted values of the empirical equation did not deviate significantly from the perfect agreement line in the majority of the areas, but the proposed model tended to overestimate when the observed value was less than or equal to 0.1 m.
Figure 9 presents the error metrics for the proposed model and the equations of Lee and Cho (CA_Hb_3 and CA_hb_3), which were calculated using 34 of the 129 test data points (excluding the data used to derive the Lee and Cho equations). Figure 10 compares the estimations of the breaking-wave height and water depth obtained with the proposed model and the empirical equations for this dataset. First, the estimations for the breaking-wave height are compared in Figure 9a–c. The proposed model exhibited a slightly higher performance for B than the equation of Lee and Cho (CA_Hb_3). For the remaining evaluation indicators, the empirical equation demonstrated a higher performance. Next, the predicted results for the breaking-water depth are compared in Figure 9d–f. Unlike the results for the breaking-wave height, the equation of Lee and Cho (CA_hb_3) provided a smaller absolute value of B than the proposed model. However, the RMSE and R values of the proposed model were superior.
The proposed model did not demonstrate a significantly higher performance than the previous breaking-wave formulas because the amount of training data was small and the targets with observed values less than or equal to 0.1 m were overestimated, as pointed out above. This can be confirmed in Figure 8 and Figure 10.
To compare the performances with and without consideration of the nonlinearity between the inputs (bottom slope, deep-water wave height, and period) and the outputs (breaking-wave height and water depth), the same network architecture was also trained without the sigmoid activation function (denoted as NonAF), which corresponds to learning a linear relationship. The results are listed in Table 3. Based on the aggregate statistics in Table 3, the linear model derived from the proposed network architecture (NonAF) exhibits a slightly higher performance than the proposed model. This result was obtained because the proposed model tended to overestimate all of the targets with observed values less than or equal to 0.1 m, as shown in Figure 11a,b. In contrast, for observed values greater than 0.1 m, no significant difference in performance between the models was observed. However, this does not imply a linear relationship between the input wave characteristics and the wave-breaking parameters. The main reason is that most of the input wave characteristics came from one-dimensional, small-scale wave-flume laboratory experiments rather than field conditions. In addition, previous studies have demonstrated that wave breaking is an extremely unpredictable phenomenon. Above all, the amount of data used for training and testing was insufficient. These aspects are expected to improve as more data are acquired in the future.
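The reason a network without an activation function can only represent a linear relationship can be shown with a short sketch: two stacked activation-free layers collapse into a single affine map. The weights below are hypothetical, and the layer sizes (three inputs, two hidden units, two outputs) are chosen only for illustration.

```python
def affine(W, b, x):
    """Apply one layer without an activation function: W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def collapse(W1, b1, W2, b2):
    """Combine two activation-free layers into one equivalent affine map."""
    n_in = len(W1[0])
    n_hid = len(W1)
    W = [[sum(W2[k][j] * W1[j][i] for j in range(n_hid)) for i in range(n_in)]
         for k in range(len(W2))]
    b = [sum(W2[k][j] * b1[j] for j in range(n_hid)) + b2[k] for k in range(len(W2))]
    return W, b


if __name__ == "__main__":
    # Hypothetical weights: 3 inputs -> 2 hidden units -> 2 outputs.
    W1 = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.4]]
    b1 = [0.05, -0.10]
    W2 = [[1.5, -0.6], [0.4, 0.9]]
    b2 = [0.0, 0.2]
    x = [0.3, -1.2, 0.8]

    two_layers = affine(W2, b2, affine(W1, b1, x))
    W, b = collapse(W1, b1, W2, b2)
    one_layer = affine(W, b, x)
    print(two_layers, one_layer)  # the two results agree up to rounding
```

Adding more activation-free layers does not change this: the composition remains one affine map, so any additional expressive power must come from the nonlinear activation.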
5. Discussion and Conclusions
Wave breaking is a significant factor in predicting and responding to coastal hydrodynamic and environmental issues, so numerous theoretical studies and wave-flume laboratory experiments have been conducted to improve the prediction accuracy and understanding of wave breaking. However, the nonlinearity between parameters, such as the bottom slope, deep-water wave height, and period, has not been fully incorporated into the existing empirical equations. Therefore, this study proposed a multilayer neural network utilizing a nonlinear activation function and backpropagation in order to investigate the effects of nonlinearity. From 31 sources, 630 data points were collected to construct the proposed network for estimating the breaking-wave index. After excluding the data that were inapplicable for analysis, 584 data points were used for training and testing the proposed model. Approximately 78% of the total data (455) were selected for training the model such that the experimental conditions were evenly represented, and the remaining data (129) were used to evaluate the performance. The bottom slope, deep-water wave height, and period were used as the input variables to simultaneously estimate the breaking-wave height and wave-breaking location. The estimated breaking-wave indexes were evaluated using error metrics and compared with the existing wave-breaking formulas.
Consequently, with the bias taken as the primary criterion, the performance of the proposed model was better than that of the existing breaking-wave-index formulas, and the model was robustly applicable across laboratory experiment conditions, such as wave condition, bottom slope, and experimental scale. Furthermore, the input and target variables of the proposed model were not nondimensionalized, so the method directly estimated the breaking-wave height and water depth. It can therefore exclude errors from the secondary transformation of raw wave data and improve the prediction performance. However, it tended to overestimate the breaking-wave height and water depth for observed values less than or equal to 0.1 m. The proposed method is expected to show significantly better performance than the existing methods if sufficient data are available. Therefore, in future studies, more wave-flume laboratory experiments will be performed under various conditions to obtain data on the bottom slope, deep-water wave height and period, and breaking-wave height and water depth. In addition, the spatial distribution of the continuous water-surface elevation is important for accurately estimating the breaking-wave height and water depth, but it is difficult to obtain this information using conventional acoustic sensors. Thus, if continuous wave measurement in space and time becomes possible through visual intelligence applied to experimental video, more data can be obtained, and more accurate wave-breaker-index estimation and high applicability can be expected.
Contribution Points
The accuracy of estimating the breaking-wave height and water depth was improved by fully incorporating the nonlinear relationship between deep water wave condition, bottom slope, and wave-breaker index.
Furthermore, a single model was proposed for simultaneously estimating the breaking-wave height and water depth by setting the input variables as deep-water wave data. This improves the practical usability of the proposed model.
The performance of the proposed model is robustly applicable to laboratory experiment conditions, such as wave condition, bottom slope, and experimental scale.
The newly proposed model directly uses the breaking-wave height and water depth without nondimensionalization; thus, it excludes errors from the secondary transformation of raw wave data, and its applicability can be significantly improved.