Article

TPE-Optimized DNN with Attention Mechanism for Prediction of Tower Crane Payload Moving Conditions

by Muhammad Zeshan Akber *, Wai-Kit Chan, Hiu-Hung Lee and Ghazanfar Ali Anwar *
Centre for Advances in Reliability and Safety, Hong Kong Science and Technology Parks, Pak Shek Kok, New Territories, Hong Kong, China
* Authors to whom correspondence should be addressed.
Mathematics 2024, 12(19), 3006; https://doi.org/10.3390/math12193006
Submission received: 14 August 2024 / Revised: 19 September 2024 / Accepted: 25 September 2024 / Published: 26 September 2024

Abstract:
Accurately predicting the payload movement and ensuring efficient control during dynamic tower crane operations are crucial for crane safety, including the ability to predict payload mass within a safe or normal range. This research utilizes deep learning to accurately predict the normal and abnormal payload movement of tower cranes. A scaled-down tower crane prototype with a systematic data acquisition system is built to perform experiments and data collection. Data related to 12 test case scenarios are gathered, with each test case representing a specific combination of hoisting and slewing motions and payload-mass-to-counterweight ratio, defining the tower crane's operational variations. These comprehensive data are investigated using a novel attention-based deep neural network with Tree-Structured Parzen Estimator optimization (TPE-AttDNN). The proposed TPE-AttDNN achieved a prediction accuracy of 0.95 with a false positive rate of 0.08. These results clearly demonstrate the effectiveness of the proposed model in accurately predicting the tower crane payload moving condition. To ensure a more reliable performance assessment of the proposed AttDNN, we carried out ablation experiments that highlighted the significance of the model’s individual components.

1. Introduction

Cranes play a vital role in the industrial, construction, and logistics sectors as material-handling machines. Ensuring the safety, reliability, performance, and cost-effectiveness of material handling systems requires monitoring the static and dynamic load. Accurately determining and efficiently controlling the payload moving condition under dynamic crane operations is crucial for crane safety. For example, if a payload mass is incorrectly assessed, overloading can occur, leading to structural failures or tipping hazards [1,2,3]. It is essential to detect anomalies in payload mass variation to proactively address potential safety risks and protect both the construction site and personnel. Tower cranes, which are an extensively used crane type, specifically in modern construction projects, are subjected to diverse dynamics and working conditions. The challenges to safe payload movement using tower cranes arise from factors such as variations in slew and hoisting speeds, fluctuations in counterweight, positioning of jib/trolley/cable, and the dynamic behavior of the payload [4,5,6,7].
Accurate prediction of payload moving conditions is crucial for the safe operation of cranes. Safe movement of the payload is also associated with the prediction of payload mass within a safe or normal range. Accurate prediction of payload mass ensures that the crane operates within its designated limits, preventing overloading and safeguarding structural integrity. It also plays a vital role in lift planning and execution, enabling efficient transportation of materials without unnecessary delays. Moreover, correctly estimating the payload mass helps prevent dynamic instabilities, such as oscillations or uncontrolled movements, which pose significant safety risks [1,2,3,8]. Traditionally, payload mass assessment has relied on manual calculations and basic sensor technologies like load cells and pressure transducers. Some studies have also utilized analytical modeling to estimate static and dynamic loads in material handling and lifting systems [1]. While these methods have proven somewhat effective, they are not without limitations. Manual calculations are prone to human error and can be time-consuming, particularly in the dynamic environments commonly found on construction sites.
Crane load cells and scales are widely employed in industrial settings and are integrated into various load monitoring systems, such as Load Moment Indicators (LMI) and Rated Capacity Indicators (RCI) [2,3]. In hydraulic cranes, the payload weight is often determined based on hydraulic pressures in the lift cylinders [9]. When cranes use ropes, cables, or chains for load transportation, an additional strain-gauge-based sensor is typically installed on the hoisting equipment. However, due to the diverse configurations of cranes in industrial practice, there are situations where installing crane scales onto the hoisting equipment is not feasible. In such cases, or when hybrid payload monitoring requires complementary measurement devices, an alternative solution is to indirectly derive load weight information from other sensors [1]. This approach involves combining sensor data with analytical models.
Model-based methods for payload estimation in hydraulically actuated material handling machines commonly utilize information from hydraulic pressure sensors (pressure difference across the boom) and angular position sensors. The model parameters are estimated offline and/or online, typically using techniques like least squares or recursive least squares [10,11]. Various models have been developed in the literature to address the dynamics of hoisted loads, often employing the Lagrangian approach. Examples include dynamic models of hydraulic lattice mobile cranes [12], overhead crane hoists with crane structure dynamic models [13,14], and numerical simulations analyzing the relationship between payload weight and the deviation in the amplitude of the electric motor current in an overhead crane hoist with an asynchronous electric motor model [15]. In some cases, a digital-twin approach utilizing a CAD model of a knuckle boom crane has been proposed for real-time monitoring of payload weight [16]. This approach combines a virtual strain gauge sensor with a nonlinear finite element model of a crane to estimate load weight through inverse modeling, using input signals from physical strain gauges used in experimental setups. These methods, while effective to a certain degree, often lack the precision and adaptability required for dynamic and complex construction environments.
Moreover, several recent papers estimate payload weight using data-driven models identified with machine learning techniques from data collected by available sensors. An ANN model trained with the Levenberg–Marquardt algorithm was developed in [17] for payload weight detection in a four-wheel-drive loader, based on the differential pressure across the boom cylinder, boom and bucket strokes, and vibration acceleration of the frame. An ANN model is used to estimate payload mass for a hydraulic forestry crane based on the hydraulic pressure of the inner boom cylinder and the grapple position [18,19]. A vision system combined with a convolutional neural network has been reported to recognize wood log weight for a forestry crane [20]. However, a thorough analysis of the existing literature reveals that most of these studies have focused on machinery other than tower cranes. Consequently, the specific issue of safe payload movement assessment for tower cranes using advanced data-driven techniques remains largely unexplored and open for investigation. Furthermore, many of the proposed models do not take advantage of recent advancements in algorithm development, particularly in the realm of artificial intelligence (AI), such as the utilization of transformer architectures [21].
In summary, this research paper delves into utilizing AI models to predict the conditions of payload movement in tower crane operations. The proposed methodology makes a threefold contribution. Firstly, it involves the development of a scaled-down tower crane model for conducting experiments effectively within a controlled environment while adhering to scale conditions. Secondly, in terms of data collection, it recognizes the importance of key dynamic factors like counterweight, slew, and hoisting speed scenarios in gathering data from the scaled-down model. And thirdly, on the algorithmic front, we introduced a novel modification to construct the architecture of the DNN and proposed TPE-AttDNN (Tree-Structured Parzen Estimator optimized Attention-based deep neural network) for accurate and confident predictions. The findings of this study highlight the strength of data-driven AI models to effectively learn the complexities of dynamic tower crane operations and to achieve safe payload movement, leading to efficient and safe construction practices.

2. Overview of Research Methodology

In this section, we introduce the methodology for condition assessment of payload movement in a miniature Tower Crane (TC), and the overall flowchart of the developed methodology is shown in Figure 1. The proposed approach consists of four primary stages: (1) building an experimental setup for collecting the raw data, (2) data preparation and processing, (3) development of a deep learning model (TPE-AttDNN) for condition prediction of TC payload movement, and (4) model performance assessment and evaluation (comparative analysis and ablation experiment).

3. Experimental Setup and Data Collection

3.1. Scaled-Down Tower Crane Model and Data Acquisition System

Tower cranes, which are an extensively used crane type, specifically in modern construction projects, are subjected to diverse dynamics and working conditions. Figure 2 presents a typical tower crane structure with its key components. This study develops an AI-based model to predict the safe and unsafe conditions of tower crane payload movement. To achieve this, a comprehensive experimental setup has been developed.
The experimental setup of this study consists of three components: (1) a tower crane prototype, (2) a control box, and (3) a data collection point. The tower crane prototype is a scaled-down model of the commercially operating Potain MCT 805 M40 [22].
The prototype is equipped with two motors and one IMU sensor device. The motors actuate the hoisting and slewing motion of the payload. The sensor is mounted on the jib of the tower crane, detecting acceleration, angular velocity, angle, and magnetic field along the three axes X, Y, and Z, thus providing measurements for 12 distinct channels. The control box includes the hardware systems to control the operation of the prototype, varying the speed and time associated with the slew and hoisting motions. The data collection point is a lab computer with a configured user interface to collect and process data transferred from the sensor.

3.2. Experiment Design

The movement of a payload under the dynamic operation of a tower crane involves multiple degrees of freedom, considering hoisting (up and down motion), slewing (rotational motion of the jib), translational motion of the trolley, and payload pendulation angles in the x and y axes. In our research, we have covered four degrees of freedom, incorporating hoisting, slewing, and the two pendulation angles [23]. Therefore, considering the dynamic working conditions of tower cranes at a construction site, we designed our experiment with 12 test cases of tower crane operation, each representing a different combination of weight scenario and slew/hoisting speed. In each test case, the tower crane performed a sequence of actions to complete the cycle of a payload moving from point A to B, including hoisting up (lifting the load from point A), slewing anti-clockwise (moving towards point B), hoisting down (placing at point B), hoisting up (lifting the load from point B), slewing clockwise (moving back towards point A) and hoisting down (returning the payload to its original position at point A). With each sub-task allotted 5 s, a complete test case measures a signal of 30 s.
The 12 cases that were conducted represent various combinations of slew and hoisting speeds and corresponding payload weight, as detailed in Table 1. For each of these 12 cases, we gathered 20 observations. Therefore, we have a total of 240 observations, and each observation contains measurements related to 12 distinct signals denoting acceleration, angular velocity, angle, and magnetic field along the X, Y, and Z axes. Thus, the final raw dataset contains a total of 2880 signals, each representing 30 s measured at an output frequency of 10 Hz.
In the context of this study, the normal scenario refers to the TC working with a payload mass within the range of 33% to 67% of the counterweight; this range keeps the payload in equilibrium with the counterweight, promoting stability and safe functioning, and is therefore deemed safe for tower crane operations. Abnormal payload conditions occur when the tower crane moves without any payload attached (0%) or tries to lift a load at its maximum capacity (100% of the counterweight).

4. Data Preparation and Insights

4.1. Data Exclusion

During the data collection process, we configured the sensor output frequency to 10 Hz, and the duration of the TC payload moving cycle was set to 30 s. However, slight variations in time were recorded within each set of 20 observations, resulting in signals of different lengths. Most of the signals fell within the range of 301 to 320 data points. These slight inconsistencies could be attributed to the manual recording of each observation or to other signal measurement and transmission errors.
To ensure consistency and uniformity in the dataset, we performed data exclusion by removing signals with lengths below 301. As a result, 78 of the total 2880 signals were removed from the raw data. This step was taken to enhance the effectiveness of the AI models’ learning process and maintain a standardized dataset for analysis. By eliminating signals with shorter lengths, we aimed to mitigate any potential bias or irregularities introduced by the inconsistent signal durations. This data preprocessing step contributes to the accurate training and evaluation of the AI models and improves the reliability of the subsequent analysis.

4.2. Signal Length Adjustment

Following the data exclusion process, we proceeded with signal length adjustment to ensure data consistency. This adjustment involved truncating all signals to a uniform length of 301 data points. To achieve this, we selected the middle index of each signal and extracted a subset of readings centered around that timestamp. Specifically, we retained ±150 readings (150 to the left and 150 to the right) with respect to the middle timestamp.
For instance, if a signal initially had a length of 316, we identified its middle index, which in this case would be 158. To establish a standardized structure, we considered the readings indexed from 8 (158 − 150) up to, but not including, 309 (158 + 150 + 1), yielding exactly 301 points. By adopting this approach, we aimed to capture the maximum relevant information from the payload moving cycle while minimizing any minor uncertainties associated with signal measurement at the start and end of each observation, which could have been introduced due to manual recording. By ensuring that all signals had a consistent length of 301 data points, we were able to proceed with accurate analysis, training, and evaluation of AI models. This standardized structure enables reliable comparisons and facilitates meaningful insights from the collected data.
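A minimal sketch of this centering step, assuming each signal is stored as a NumPy array:

```python
import numpy as np

TARGET_LEN = 301  # uniform signal length after truncation
HALF = 150        # readings kept on each side of the middle index

def truncate_signal(signal: np.ndarray) -> np.ndarray:
    """Keep 301 readings centered on the middle index of the signal."""
    mid = len(signal) // 2
    # e.g., length 316 -> mid 158 -> keep indices 8 .. 308 (301 points)
    return signal[mid - HALF : mid + HALF + 1]

# Example: a raw signal of length 316 becomes exactly 301 points.
raw = np.arange(316)
assert len(truncate_signal(raw)) == TARGET_LEN
```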

4.3. Data Normalization

For data normalization, we performed min–max scaling to transform the data to a fixed range, typically between 0 and 1. This is done by subtracting the minimum value of the feature and dividing it by the range of the feature. The mathematical expression for min-max scaling is:
$$x_t = \frac{x - \min(x)}{\max(x) - \min(x)}$$
Here, $x$ and $x_t$ are the original and transformed values of the data point, respectively, and $\min(x)$ and $\max(x)$ are the minimum and maximum values of the feature. The min–max scaling improves the training stability and performance of neural networks by ensuring that features have comparable scales. This helps in faster convergence and prevents certain features from dominating the learning process due to differing magnitudes [24].
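A short per-channel sketch of this scaling, assuming the 12 sensor channels are stored column-wise:

```python
import numpy as np

def min_max_scale(x: np.ndarray) -> np.ndarray:
    """Min-max scaling applied per channel (column), mapping each to [0, 1]."""
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

# Each of the 12 sensor channels is rescaled independently.
signals = np.random.default_rng(0).normal(size=(301, 12))
scaled = min_max_scale(signals)
assert scaled.min() >= 0.0 and scaled.max() <= 1.0
```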

4.4. Correlation Analysis

We employed the Pearson correlation method to calculate the pairwise correlation between continuous variables. The results are presented in Figure 3. An absolute correlation value greater than 0.7 signifies a strong association between variables. For instance, Acceleration X(g) exhibits strong correlations with Angle Y(°) = −0.78, Magnetic field Y(µT) = 0.73, and Magnetic field Z(µT) = −0.74 for the case of COM3. The results show that no pair of features has a very high correlation; therefore, we did not drop any signal from further analysis.
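A brief sketch of such a pairwise correlation screen using pandas; the channel names and data here are illustrative stand-ins for the 12 recorded channels:

```python
import numpy as np
import pandas as pd

# Hypothetical channel names: 4 quantities x 3 axes = 12 channels.
channels = [f"{q} {ax}" for q in ("Acc", "AngVel", "Angle", "Mag") for ax in "XYZ"]
df = pd.DataFrame(np.random.default_rng(1).normal(size=(301, 12)), columns=channels)

corr = df.corr(method="pearson")                   # pairwise Pearson correlation matrix
strong = (corr.abs() > 0.7) & (corr.abs() < 1.0)   # flag |r| > 0.7 off the diagonal
print(f"strongly correlated pairs: {int(strong.to_numpy().sum()) // 2}")
```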

5. Model Development

5.1. Deep Neural Networks with Attention Mechanisms

Machine learning models have become increasingly popular in classification and regression tasks, with several algorithms being widely used, including K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). KNN is a non-parametric classifier that relies on instance-based learning, where the class of an unknown observation is determined by its proximity to similar observations. SVM, on the other hand, uses a hyperplane or a set of hyperplanes to separate classes with the goal of maximizing the margin between them. The type of hyperplane used can be linear, radial basis, polynomial, or sigmoid, depending on the kernel function employed. Meanwhile, RF and XGBoost use ensembles of decision trees to make predictions, combining the performance of individual trees to produce an outcome [25,26,27].
A comparison of these machine learning methods is presented in Table 2, highlighting their advantages and disadvantages. This study applies an attention-based deep neural network (AttDNN), which is particularly well-suited for handling high-dimensional, complex data such as sensor-based data, to perform anomaly prediction of the tower crane payload. In contrast, the other methods mentioned are more suitable for smaller, tabular datasets [28,29].
A typical deep neural network consists of interconnected layers of neurons, and the information between the layers is transformed through the weights associated with the neurons. Mathematically, the weighted input (a.k.a. the activation of the neuron, $a_i$) of any $i$th neuron is computed as follows:
$$a_i = \sum_{j} w_{ij}\, o_j \quad \text{and} \quad y_i = f(a_i)$$
For the $j$ neurons in the preceding layer, $w_{ij}$ is the weight connecting neuron $i$ to neuron $j$, $o_j$ is the output of neuron $j$, $f$ is the activation function, and $y_i$ is the resulting output of neuron $i$. In our study, we used the ReLU activation function for hidden layers and sigmoid for the output layer.
An important recent advancement in the field of neural networks is the introduction of the transformer architecture, which is based on the concept of self-attention mechanisms. The transformer model was initially introduced in the field of natural language processing (NLP) [30]. Since then, numerous studies have highlighted its superior performance compared to other deep learning models across various domains, including computer vision [31], NLP, audio processing [32,33], and even disciplines like chemistry [34], economics [34,35], and life sciences [36].
A typical Transformer architecture consists of two main components: an encoder and a decoder. However, it diverges from traditional autoencoder models by not relying on recurrence and convolutions to generate outputs. Instead, the transformer incorporates positional encoding and employs self-attention mechanisms in both the encoder and decoder. These modules are stacked together, forming a deep learning architecture that comprises multiple layers of self-attention and feed-forward neural networks [21,30]. A key novelty in the transformer model is the introduction of an attention mechanism with three key variations, notably self-attention, masked self-attention, and cross-attention. Also, multi-head attention is a variant of self-attention that involves performing multiple parallel self-attention operations, or attention heads, on the same input sequence. Figure 4 provides a visual representation of a multi-head attention block within the Transformer model.
Mathematically, the multi-head attention can be defined as follows.
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_n)\, W^O$$
where $\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V)$.
Here, $Q$, $K$, and $V$ represent the query, key, and value matrices estimated for an input $X$. The output matrix for a single attention layer (head) is calculated as:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^T}{\sqrt{d_k}}\right) V$$
Here, $d_k$ represents the dimension of the query and key vectors. In summary, an attention function takes a query and a collection of key-value pairs as input and produces an output. In this process, the query, keys, values, and output are all represented as vectors. The output is computed by taking a weighted sum of the values, where the weights are determined by a compatibility function that measures the relationship between the query and the corresponding key.
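As a minimal illustration of this computation, the following NumPy sketch implements single-head scaled dot-product attention exactly as in the equation above:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # query-key compatibility
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)            # softmax over keys
    return weights @ V                                   # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))    # 5 positions, d_k = 8
out = scaled_dot_product_attention(Q, K, V)              # shape (5, 8)
```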
Hence, the utilization of the attention mechanism is a crucial component of the transformer model. To leverage this mechanism, we have designed a deep neural network called AttDNN, as shown in Figure 5. The proposed architecture contains a multi-head attention block inside the stacking of dense blocks. Each dense block consists of a fully connected neural network layer followed by batch normalization and LeakyReLU activation. In this architecture, the parameters, including the number of dense blocks before and after the attention block (dbb_n and dba_n), the number of neurons (n) in each fully connected layer of the dense blocks, and the number of heads (h) in the attention block, are determined using the TPE optimization technique explained in Section 5.2.
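To make the stacking concrete, below is a minimal Keras sketch of this dense-block/attention architecture. The default values of n, dbb_n, dba_n, and h are placeholders to be replaced by the TPE-selected values (Section 5.2), and the reshaping of the feature vector into a sequence for the attention layer is an assumption of this sketch, not a detail specified above:

```python
import tensorflow as tf
from tensorflow.keras import layers

def dense_block(x, n):
    """One dense block: fully connected layer + batch normalization + LeakyReLU."""
    x = layers.Dense(n)(x)
    x = layers.BatchNormalization()(x)
    return layers.LeakyReLU()(x)

def build_attdnn(input_dim, n=50, dbb_n=4, dba_n=7, h=5):
    """Dense blocks stacked around a multi-head attention block (cf. Figure 5)."""
    inp = tf.keras.Input(shape=(input_dim,))
    x = inp
    for _ in range(dbb_n):                        # dense blocks before attention
        x = dense_block(x, n)
    # Assumption: treat the n-dimensional feature vector as a length-n sequence
    # of scalars so self-attention can attend across the learned features.
    seq = layers.Reshape((n, 1))(x)
    att = layers.MultiHeadAttention(num_heads=h, key_dim=8)(seq, seq)
    x = layers.Flatten()(att)
    for _ in range(dba_n):                        # dense blocks after attention
        x = dense_block(x, n)
    out = layers.Dense(1, activation="sigmoid")(x)  # normal vs. abnormal payload
    return tf.keras.Model(inp, out)

model = build_attdnn(input_dim=301)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```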

5.2. Tree-Structured Parzen Estimator (TPE) for AttDNN Optimization

This section discusses the details of the TPE technique used for hyperparameter tuning of the AttDNN. TPE is a Bayesian optimization method that is widely recognized for its exceptional performance in navigating intricate parameter search spaces. Notably, TPE offers significant advantages over other hyperparameter search techniques, such as Grid Search (GS) and Random Search (RS). It demonstrates lower computational complexity and is capable of effectively handling large parametric spaces, making it a robust choice for hyperparameter optimization of the AttDNN [37,38].
The TPE method involves estimating two probability density functions (PDFs), $p_b(x)$ and $p_w(x)$, which represent the better and worse groups of observations, respectively, over the domain variables. The observations are divided into these two groups using a pre-defined percentile threshold on the objective value, denoted $y^*$, and each group's density is modeled using simple Parzen estimators, also known as kernel density estimators (KDEs) [39,40]. For a given set of observations $D$, the TPE surrogate can be defined as:
$$p(x \mid y, D) = \begin{cases} p_b(x), & \text{if } y < y^* \\ p_w(x), & \text{if } y \geq y^* \end{cases}$$
The ratio of the two density functions, $p_b(x)$ and $p_w(x)$, is used to construct the acquisition function, which aims to find the optimal configuration by balancing exploration (trying new configurations) and exploitation (focusing on promising configurations). In TPE, a common acquisition function is the expected improvement (EI) [39,40]:
$$EI_{y^*}(x) := \int_{-\infty}^{y^*} (y^* - y)\, p(y \mid x)\, dy$$
For $\gamma = p(y < y^*)$ and $p(x) = \int p(x \mid y)\, p(y)\, dy = \gamma\, p_b(x) + (1 - \gamma)\, p_w(x)$, the EI is constructed as:
$$EI_{y^*}(x) = \frac{\gamma\, y^*\, p_b(x) - p_b(x) \int_{-\infty}^{y^*} p(y)\, dy}{\gamma\, p_b(x) + (1 - \gamma)\, p_w(x)} \propto \left(\gamma + \frac{p_w(x)}{p_b(x)}(1 - \gamma)\right)^{-1}$$
The implementation of TPE-based optimization of AttDNN is illustrated using the pseudocode in Figure 6.
where $z$ is the set of hyperparameters sampled from the search space, $s$ is the metric score of the AttDNN with $z$ on the validation dataset, and $H$ is the history of validation scores and selected $z$ values.
The optimization objective is set to maximize the accuracy of the model on the validation dataset, which guides the optimization process. When defining the search space, six hyperparameters are considered, and specific search ranges are assigned to each of them: neurons, the number of neurons in each fully connected layer of a dense block; num_denseblock_before, the number of dense blocks before the attention block; num_denseblock_after, the number of dense blocks after the attention block; num_heads, the number of heads in the attention block; num_epochs, the number of epochs for training; and batch_size, the minibatch size used in training. Together, these parameters control the width and depth of the network.
Table 3 lists these hyperparameters and their respective search ranges. To fit the TPE, a total of 300 trials or iterations are conducted. Upon completing these trials, the optimal configuration of hyperparameters is discovered, and the results are saved for further analysis and utilization.
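As a concrete illustration, the following sketch uses Optuna, one common open-source implementation of the TPE sampler (the specific implementation used in this study is not stated here), with a search space mirroring Table 3. The train_and_validate helper is a hypothetical stand-in for the actual AttDNN training pipeline:

```python
import optuna

def train_and_validate(z: dict) -> float:
    # Hypothetical stand-in for the real pipeline: build the AttDNN with
    # hyperparameters z, train it, and return validation accuracy. Here it
    # returns a dummy score so the sketch runs end to end.
    return 1.0 / (1.0 + abs(z["neurons"] - 66) + abs(z["num_heads"] - 5))

def objective(trial: optuna.Trial) -> float:
    """One TPE trial: sample a configuration z and score it on validation data."""
    z = {
        "neurons":               trial.suggest_int("neurons", 30, 70),
        "num_denseblock_before": trial.suggest_int("num_denseblock_before", 2, 10),
        "num_denseblock_after":  trial.suggest_int("num_denseblock_after", 2, 10),
        "num_heads":             trial.suggest_int("num_heads", 2, 8),
        "num_epochs":            trial.suggest_categorical("num_epochs", [50, 100, 150, 200, 250, 500]),
        "batch_size":            trial.suggest_categorical("batch_size", [32, 64, 128, 256]),
    }
    return train_and_validate(z)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=300)   # 300 trials, as in this study
print(study.best_params, study.best_value)
```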

5.3. Stratified Data Partitioning

A data partitioning ratio of 8:2:1 was used to split the dataset into training, validation, and testing subsets. The training and validation datasets were used to optimize the AttDNN model using the TPE algorithm, while the testing dataset was kept separate and reserved solely for evaluating the model’s generalized performance. This separation ensures that the reported performance metrics reflect how well the optimized AttDNN model can handle unseen data. Additionally, a stratified partitioning approach was employed, considering speed scenarios for hoisting and slewing motions, channel of measurement, and weight category. This approach ensures that the training and testing datasets have proportional representation across the different speed scenarios, the twelve channels, and the abnormal and normal weight cases, enhancing the reliability of the model’s predictions.
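The following scikit-learn sketch shows one way to realize such an 8:2:1 stratified split; the arrays here are random stand-ins, and in the actual pipeline the stratification key would combine speed scenario, sensor channel, and weight category as described above:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the processed dataset (shapes only; values are random).
rng = np.random.default_rng(0)
X = rng.normal(size=(2802, 301))           # 2880 raw signals minus 78 excluded
y = rng.integers(0, 2, size=2802)          # 0 = normal, 1 = abnormal
strata = y  # in the study: composite key of speed scenario + channel + weight

# 8:2:1 partitioning: first carve off 3/11 for validation + test ...
X_train, X_tmp, y_train, y_tmp, s_train, s_tmp = train_test_split(
    X, y, strata, test_size=3 / 11, stratify=strata, random_state=42)
# ... then split that remainder 2:1 into validation and test.
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=1 / 3, stratify=s_tmp, random_state=42)
```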

5.4. Performance Evaluation

For evaluating the performance of our models, we considered several performance metrics, including the false positive rate, false negative rate, accuracy, F1 score, Shannon entropy loss, and predictive variance. These metrics provide insights into different aspects of model performance and help assess its effectiveness in classification tasks. Additionally, to provide a comprehensive overview of the results, we visualized the performance using confusion matrices, as shown in Table 4. A typical confusion matrix is used for binary classification problems and consists of four prediction results:
  • True Positive (TP): Represents the number of positive class instances that are correctly predicted as positive by the model.
  • True Negative (TN): Indicates the number of negative class instances that are correctly predicted as negative by the model.
  • False Positive (FP): Refers to the number of negative class instances that are wrongly predicted as positive by the model.
  • False Negative (FN): Signifies the number of positive class instances that are wrongly predicted as negative by the model.
By analyzing the values in the confusion matrix, we can gain insights into the model’s performance in terms of correctly and incorrectly classified instances for each class. This evaluation approach helps us to understand the strengths and weaknesses of the model and make informed decisions regarding its effectiveness and potential improvements.
Based on the confusion matrix, we can calculate the false positive rate, false negative rate, accuracy, and F1 score of the model [25,41].
$$\text{False positive rate} = \frac{FP}{FP + TN}$$
$$\text{False negative rate} = \frac{FN}{TP + FN}$$
$$F1\ \text{score} = \frac{2\,TP}{2\,TP + FP + FN}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$
Additionally, to evaluate the uncertainty quantification of the model, we can use Shannon entropy and predictive variance.
$$\text{Shannon entropy} = H(p) = -p \log_2(p) - (1 - p)\log_2(1 - p)$$
$$\text{Predictive variance} = \sigma^2 = p(1 - p)$$
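As a worked check of these definitions, the sketch below computes the four metrics from confusion-matrix counts and the two uncertainty measures from sigmoid outputs; plugging in the TPE-AttDNN test counts from Table 5 (treating abnormal as the positive class) reproduces the values reported in Table 6:

```python
import numpy as np

def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Metrics derived from the confusion matrix, as defined above."""
    return {
        "FPR":      fp / (fp + tn),
        "FNR":      fn / (tp + fn),
        "F1":       2 * tp / (2 * tp + fp + fn),
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

def uncertainty(p: np.ndarray):
    """Shannon entropy and predictive variance of sigmoid outputs p in (0, 1)."""
    entropy = -p * np.log2(p) - (1 - p) * np.log2(1 - p)
    variance = p * (1 - p)
    return entropy, variance

# Example using the TPE-AttDNN test confusion matrix reported in Table 5.
print(classification_metrics(tp=138, tn=126, fp=11, fn=6))
# -> FPR ~0.0803, FNR ~0.0417, F1 ~0.9420, Accuracy ~0.9395
```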

6. Results and Discussions

In this research, a novel TPE-AttDNN (Tree-Structured Parzen Estimator optimized Attention-based Deep Neural Network) model was developed to predict the condition of payload movement in a miniature Tower Crane (TC). The model utilized sensor data obtained under varying hoisting and slew motions and counterweight scenarios. Normal conditions were defined as payload masses within 33% to 67% of the counterweight, ensuring stability. Abnormal conditions included TC movement without any payload or with a payload equal to 100% of the counterweight. To demonstrate the effectiveness of our AI approach, we performed a series of analyses, including time complexity analysis and convergence analysis of proposed TPE optimization, performance comparison of TPE-AttDNN with simple or Naïve DNN, and ablation experiments to understand the effect of the model components.

6.1. TPE Efficiency and Optimization Performance

The proposed TPE-AttDNN is optimized using 300 trials. The objective function for each TPE iteration is defined based on the accuracy metric obtained from the validation dataset. The training was performed on a 13th Gen Intel(R) Core(TM) i9-13900K processor with 32 CPUs running at ~3.0 GHz. On this system, the TPE-based optimization averaged 62 s per trial. For time complexity analysis, we computed the duration of all the trials and scaled them by the total number of trainable parameters in the network. Figure 7 illustrates the relationship between the number of trials and the time per trial, measured in milliseconds per parameter (ms/param). As the number of trials increases from 0 to 300, the time per trial also increases steadily and almost linearly. This indicates that the computational complexity and the time required for each trial rise proportionally with the number of trials conducted. To further understand the time complexity of TPE, we analyzed the number of neurons in both dense blocks against the number of trials conducted. The results are shown in Figure 8. From the figure, it is evident that the variation in the number of neurons does not significantly impact the results and follows a stable trend, possibly due to the random sampling nature of Bayesian TPE.
Hence, consistent with previous studies [42], the TPE optimization process of AttDNN exhibits linear time complexity. In contrast, hyperparameter tuning using grid search optimization may exhibit very large time complexity [43] with a large parametric search space, making TPE a more efficient choice for extensive hyperparameter optimization.
After discussing the time complexity of TPE, we now present the results of hyperparameter tuning of the AttDNN, visualized in Figure 9, which shows the variation of model accuracy with respect to the six hyperparameters: batch size, number of epochs, number of neurons, number of attention heads, and the numbers of dense blocks before and after the attention block. Each subplot demonstrates how varying these hyperparameters impacts accuracy, with the highest accuracy of 0.95 marked in each case. The best model is obtained when a mini-batch size of 128 is used and 500 epochs are set for training. With the number of neurons ranging from 30 to 70, the peak accuracy occurs at 66 neurons. The model performs best when the number of heads in the attention block is set to five. For the dense blocks before and after, the accuracy remains relatively stable across different values, reaching its highest with six dense blocks before and seven dense blocks after the attention block. Overall, the model converges with an increase in iterations, and the validation accuracy gradually stabilizes within a narrow range, demonstrating robust performance across these hyperparameters with optimal settings at specific values. In conclusion, the optimal TPE-AttDNN configuration has 12 hidden layers, consisting of four dense blocks before the attention block and seven after, with 68 neurons in each layer wrapped around an attention block with five heads, trained with a minibatch size of 128 until 500 epochs are reached.

6.2. Comparison of TPE-AttDNN with Naïve DNN

The performance of the proposed TPE-AttDNN is compared with a Naïve DNN. The Naïve DNN contains 12 hidden fully connected layers with 50 neurons in each layer. For the Naïve DNN, we kept a traditional setting: it includes no attention block or batch normalization, and the ReLU activation function is used instead of LeakyReLU. The training and validation loss of both models over 500 epochs is shown in Figure 10. It is evident that the TPE-AttDNN model shows lower training and validation loss compared to the Naïve-DNN model, indicating better performance and generalization. The TPE-AttDNN model converges faster and maintains a more stable loss throughout the epochs. The Naïve-DNN model exhibits higher and more fluctuating validation loss, suggesting overfitting or less effective learning.
Table 5 shows the confusion matrices for Naïve-DNN and TPE-AttDNN estimated on the testing dataset. Moreover, the results of other performance metrics, including false positive rate, false negative rate, accuracy, and F1 score, are shown in Table 6. Clearly, TPE-AttDNN outperforms Naïve-DNN and achieves better results for all metrics, with 0.94 overall prediction accuracy. The proposed model correctly classifies 126 of the 137 normal cases within the test dataset; in other words, only 11 false positives are observed out of 137 cases, giving a false positive rate of 0.08. In contrast, the Naïve-DNN achieves 0.80 accuracy, giving a higher number of false positives and false negatives.
To check the learning confidence of the model, the predictions are further evaluated for uncertainty quantification, and the results are shown in Figure 11. Based on the Shannon entropy and predictive variance, it is evident that the TPE-AttDNN model exhibits lower uncertainty in its predictions than the Naïve-DNN model. The reduced uncertainty in the TPE-AttDNN model’s predictions suggests it is a more reliable model for the given dataset.

6.3. Ablation Experiments

After the comparison of the proposed model with the Naïve-DNN, the efficiency of the proposed AttDNN is further evaluated by conducting ablation experiments on its key components: removing the dense blocks before the attention block, removing the dense blocks after the attention block, removing the attention block, omitting batch normalization in the dense blocks, and using ReLU instead of LeakyReLU activation in the dense blocks. The results of the ablation experiments are presented in Table 7.
The analysis reveals that the most critical component in the AttDNN model is the dense block following the attention block. When this dense block is omitted, the accuracy drops significantly from 0.9395 to 0.8256. This highlights its essential role in enhancing model performance. Furthermore, the absence of batch normalization in fully connected layers results in the second-worst performance, indicating its importance in stabilizing and improving training efficiency. Each component contributes uniquely to the overall architecture, as demonstrated by the variations in accuracy and Shannon Entropy across different configurations in Table 7. For instance, the complete AttDNN model achieves the highest accuracy and the lowest Shannon entropy, underscoring the synergistic effect of all components working together. This comprehensive evaluation affirms the necessity of each element in constructing an effective AttDNN model.

7. Conclusions and Future Works

Accurate prediction of payload is essential for ensuring crane safety, preventing overloading, and maintaining structural integrity. This research demonstrates the effectiveness of the attention-based deep neural network with Tree-Structured Parzen Estimator optimization (TPE-AttDNN) in predicting payload mass in tower cranes, addressing the challenges posed by the dynamic and complex functionality of these cranes. The TPE-AttDNN utilizes data collected by a sensor mounted on the jib of a scaled tower crane model. To comprehensively cover the operational variations of tower cranes, data are gathered by varying combinations of hoisting and slewing motions, as well as different payload and counterweight scenarios. The following conclusions are drawn based on the research.
  • The Bayesian-based TPE is a good choice for hyperparameter optimization of deep neural networks. Its computational efficiency, adaptability to various hyperparameter types, and scalability in high-dimensional spaces make it a favored method for optimizing intricate neural network architectures.
  • The proposed TPE-AttDNN significantly outperforms the simple Naïve-DNN model, achieving an overall accuracy of 0.95 and demonstrating superior performance across various metrics, including lower false positive rates and better F1 scores. These results highlight the proposed model’s effectiveness in predicting payload conditions.
  • The reduced predictive uncertainties of TPE-AttDNN, indicated by lower values for Shannon entropy and predictive variance, ensure reliable predictions. This will lead to more reliable decision-making in crane operations, decrease the risk of accidents, and enhance the efficiency of material handling.
  • Ablation experiments show that each component of the AttDNN model is crucial to its architecture, uniquely enhancing performance. The dense block following the attention block is particularly critical, as its removal significantly reduces accuracy. Batch normalization in fully connected layers is also vital for stabilizing and improving training efficiency.
We have identified several potential directions for future work. Firstly, we observed instances of overlapping signals between the Normal and Abnormal classes. Understanding the factors contributing to this overlap will help in developing more effective solutions. Secondly, our current work solely relies on data from a single sensor. To enhance the performance of our deep learning model, we plan to incorporate data from multiple sensors. By fusing information obtained from different sensors, we anticipate improved accuracy and a more comprehensive understanding of the system under study. Thirdly, it is recommended that future studies consider employing multiple ML methods to yield more credible and comprehensive results. Furthermore, considering the tower crane’s increased degrees of freedom during operation, our future work will encompass the analysis of hoisting, slewing, and trolley motion. Exploring advanced synthetic data generation methods, such as diffusion models, can further improve the model’s performance. Additionally, future studies should conduct sensitivity analyses based on data features, data preprocessing methods (e.g., varying signal lengths and frequencies), sensor channels, and tower crane working conditions. These future directions aim to address overlap issues, improve accuracy, analyze additional crane motions, and enhance data quality.

Author Contributions

Conceptualization, M.Z.A. and W.-K.C.; methodology, M.Z.A.; software, M.Z.A. and W.-K.C.; validation, M.Z.A., W.-K.C., H.-H.L. and G.A.A.; formal analysis, M.Z.A. and G.A.A.; investigation, M.Z.A., W.-K.C., H.-H.L. and G.A.A.; resources, W.-K.C. and H.-H.L.; data curation, M.Z.A. and G.A.A.; writing—original draft preparation, M.Z.A.; writing—review and editing, W.-K.C., H.-H.L. and G.A.A.; visualization, M.Z.A. and W.-K.C.; supervision, H.-H.L. and W.-K.C.; project administration, H.-H.L. and W.-K.C.; funding acquisition, H.-H.L. and W.-K.C. All authors have read and agreed to the published version of the manuscript.

Funding

The work presented in this article is supported by the Centre for Advances in Reliability and Safety (CAiRS) admitted under the AiR@InnoHK Research Cluster.

Data Availability Statement

Relevant data could be provided on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kusznir, T.; Smoczek, J. Soft-Computing-Based Estimation of a Static Load for an Overhead Crane. Sensors 2023, 23, 5842. [Google Scholar] [CrossRef] [PubMed]
  2. Neitzel, R.L.; Seixas, N.S.; Ren, K.K. A Review of Crane Safety in the Construction Industry. Appl. Occup. Environ. Hyg. 2001, 16, 1106–1117. [Google Scholar] [CrossRef] [PubMed]
  3. Fang, Y.; Cho, Y.K.; Chen, J. A Framework for Real-Time pro-Active Safety Assistance for Mobile Crane Lifting Operations. Autom. Constr. 2016, 72, 367–379. [Google Scholar] [CrossRef]
  4. Rauscher, F.; Sawodny, O. Modeling and Control of Tower Cranes with Elastic Structure. IEEE Trans. Control Syst. Technol. 2021, 29, 64–79. [Google Scholar] [CrossRef]
  5. Zhang, K.; Wang, H.; Zhou, Y.; Wang, K.; Fang, X.; Chen, B.; Mu, D. An Inclined Tower Crane Suited to Bridge Tower Construction. Sci. Rep. 2022, 12, 21934. [Google Scholar] [CrossRef]
  6. Sadeghi, H.; Zhang, X.; Mohandes, S.R. Developing an Ensemble Risk Analysis Framework for Improving the Safety of Tower Crane Operations under Coupled Fuzzy-Based Environment. Saf. Sci. 2023, 158, 105957. [Google Scholar] [CrossRef]
  7. Hou, C.; Liu, C.; Li, Z.; Zhong, D. An Improved Non-Singular Fast Terminal Sliding Mode Control Scheme for 5-DOF Tower Cranes with the Unknown Payload Masses, Frictions and Wind Disturbances. ISA Trans. 2024, 149, 81–93. [Google Scholar] [CrossRef]
  8. Ramli, L.; Mohamed, Z.; Abdullahi, A.M.; Jaafar, H.I.; Lazim, I.M. Control Strategies for Crane Systems: A Comprehensive Review. Mech. Syst. Signal Process. 2017, 95, 1–23. [Google Scholar] [CrossRef]
  9. Kalairassan, G.; Boopathi, M.; Mohan, R.M. Analysis of Load Monitoring System in Hydraulic Mobile Cranes. IOP Conf. Ser. Mater. Sci. Eng. 2017, 263, 062045. [Google Scholar] [CrossRef]
  10. Ferlibas, M.; Ghabcheloo, R. Load Weight Estimation on an Excavator in Static and Dynamic Motions. In Proceedings of the 17th Scandinavian International Conference on Fluid Power, SICFP’21, Linköping, Sweden, 1–2 June 2021; pp. 90–103. [Google Scholar] [CrossRef]
  11. Renner, A.; Wind, H.; Sawodny, O. Online Payload Estimation for Hydraulically Actuated Manipulators. Mechatronics 2020, 66, 102322. [Google Scholar] [CrossRef]
  12. Sun, G.; Kleeberger, M.; Liu, J. Complete Dynamic Calculation of Lattice Mobile Crane during Hoisting Motion. Mech. Mach. Theory 2005, 40, 447–466. [Google Scholar] [CrossRef]
  13. Bogdevičius, M.; Vika, A. Investigation of the Dynamics of an Overhead Crane Lifting Process in a Vertical Plane. Transport 2005, 20, 176–180. [Google Scholar] [CrossRef]
  14. Haniszewski, T. Modeling the Dynamics of Cargo Lifting Process by Overhead Crane for Dynamic Overload Factor Estimation. J. Vibroeng. 2017, 19, 75–86. [Google Scholar] [CrossRef]
  15. Semykina, I.Y.U.; Kipervasser, M.V.; Gerasimuk, A.V. Study of Drive Currents for Lifting Bridge Cranes of Metallurgical Enterprises for Early Diagnosis of Load Excess Weight. J. Min. Inst. 2021, 247, 122–131. [Google Scholar] [CrossRef]
  16. Moi, T.; Cibicik, A.; Rølvåg, T. Digital Twin Based Condition Monitoring of a Knuckle Boom Crane: An Experimental Study. Eng. Fail. Anal. 2020, 112, 104517. [Google Scholar] [CrossRef]
  17. Hindman, J.J. Dynamic Payload Estimation in Four Wheel Drive Loaders. Ph.D. Thesis, University of Saskatchewan, Saskatoon, SK, Canada, 2008. [Google Scholar]
  18. Starke, M.; Geiger, C. Field Setup and Assessment of a Cloud-Data Based Crane Scale (CCS) Considering Weight- and Local Green Wood Density-Related Volume References. Croat. J. For. Eng. 2022, 43, 29–45. [Google Scholar] [CrossRef]
  19. Geiger, C.; Starke, M.; Greff, D.; Geimer, M. The Potential of a Weight Detection System for Forwarders Using an Artificial Neural Network. In Proceedings of the 51st Symposium on Forest Mechanization—FORMEC, Improved Forest Mechanisation: Mobilizing Natural Resources and Preventing Wildfires, Madrid, Spain, 25–27 September 2018; pp. 25–27. [Google Scholar]
  20. Geiger, C.; Maier, N.; Kalinke, F.; Geimer, M. Assistance System for an Automated Log-Quality and Assortment Estimation Based on Data-Driven Approaches Using Hydraulic Signals of Forestry Machines. In Proceedings of the 12th International Fluid Power Conference (IFK 2020), Dresden, Germany, 12–14 October 2020. [Google Scholar]
  21. Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A Survey of Transformers. AI Open 2022, 3, 111–132. [Google Scholar] [CrossRef]
  22. Cranepedia Potain MCT 805 M40. 2024. Available online: https://cranepedia.com/ (accessed on 26 May 2024).
  23. Rehman, S.M.F.u.; Mohamed, Z.; Husain, A.R.; Ramli, L.; Abbasi, M.A.; Anjum, W.; Shaheed, M.H. Adaptive Input Shaper for Payload Swing Control of a 5-DOF Tower Crane with Parameter Uncertainties and Obstacle Avoidance. Autom. Constr. 2023, 154, 104963. [Google Scholar] [CrossRef]
  24. Shi, W.; Gong, Y.; Tao, X.; Wang, J.; Zheng, N. Improving CNN Performance Accuracies with Min–Max Objective. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 2872–2885. [Google Scholar] [CrossRef]
  25. Zhang, X.; Akber, M.Z.; Zheng, W. Predicting the Slump of Industrially Produced Concrete Using Machine Learning: A Multiclass Classification Approach. J. Build. Eng. 2022, 58, 104997. [Google Scholar] [CrossRef]
  26. Akber, M.Z. Improving the Experience of Machine Learning in Compressive Strength Prediction of Industrial Concrete Considering Mixing Proportions, Engineered Ratios and Atmospheric Features. Constr. Build. Mater. 2024, 444, 137884. [Google Scholar] [CrossRef]
  27. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning Data Mining, Inference, and Prediction, 10th ed.; Springer Science: Berlin, Germany, 2013; ISBN 978-0-387-84858-7. [Google Scholar]
  28. Raschka, S. Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning. arXiv 2020, arXiv:1811.12808. [Google Scholar]
  29. Zein, E.H.; Urvoy, T. Tabular Data Generation: Can We Fool XGBoost? In Proceedings of the NeurIPS 2022 First Table Representation Workshop, New Orleans, LA, USA, 2 December 2022.
  30. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30. [Google Scholar]
  31. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A Survey on Vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 87–110. [Google Scholar] [CrossRef] [PubMed]
  32. Dong, L.; Xu, S.; Xu, B. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 13 April 2018; pp. 5884–5888. [Google Scholar]
  33. Moussavou Boussougou, M.K.; Park, D.-J. Attention-Based 1D CNN-BiLSTM Hybrid Model Enhanced with FastText Word Embedding for Korean Voice Phishing Detection. Mathematics 2023, 11, 3217. [Google Scholar] [CrossRef]
  34. Schwaller, P.; Laino, T.; Gaudin, T.; Bolgar, P.; Hunter, C.A.; Bekas, C.; Lee, A.A. Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction. ACS Cent. Sci. 2019, 5, 1572–1583. [Google Scholar] [CrossRef]
  35. Sang, S.; Li, L. A Novel Variant of LSTM Stock Prediction Method Incorporating Attention Mechanism. Mathematics 2024, 12, 945. [Google Scholar] [CrossRef]
  36. Yuan, Y.; Zhang, Y.; Zhu, L.; Cai, L.; Qian, Y. Exploiting Cross-Scale Attention Transformer and Progressive Edge Refinement for Retinal Vessel Segmentation. Mathematics 2024, 12, 264. [Google Scholar] [CrossRef]
  37. Yang, L.; Shami, A. On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice. Neurocomputing 2020, 415, 295–316. [Google Scholar] [CrossRef]
  38. Zhang, X.; Dai, C.; Li, W.; Chen, Y. Prediction of Compressive Strength of Recycled Aggregate Concrete Using Machine Learning and Bayesian Optimization Methods. Front. Earth Sci. 2023, 11. [Google Scholar] [CrossRef]
  39. Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for Hyper-Parameter Optimization. Adv. Neural Inf. Process. Syst. 2011, 24. [Google Scholar]
  40. Watanabe, S. Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance. arXiv 2023, arXiv:2304.11127. [Google Scholar]
  41. Bekkar, M.; Djemaa, H.K.; Alitouche, T.A. Evaluation Measures for Models Assessment over Imbalanced Data Sets. J. Inf. Eng. Appl. 2013, 3, 10. [Google Scholar]
  42. Nomura, M. Simple and Scalable Parallelized Bayesian Optimization. arXiv 2020, arXiv:2006.13600. [Google Scholar]
  43. Yuanyuan, S.; Yongming, W.; Lili, G.; Zhongsong, M.; Shan, J. The Comparison of Optimizing SVM by GA and Grid Search. In Proceedings of the 2017 13th IEEE International Conference on Electronic Measurement & Instruments (ICEMI), Yangzhou, China, 20–22 October 2017; pp. 354–360. [Google Scholar]
Figure 1. Overall flowchart of the proposed methodology.
Figure 2. A typical tower crane.
Figure 3. Correlation of sensor channels.
Figure 4. A typical representation of the multi-head attention block [30].
Figure 5. Proposed architecture of attention-based deep neural network (AttDNN).
Figure 6. Pseudocode of the Tree-Structured Parzen Estimator optimizing AttDNN.
Figure 7. The time complexity of TPE-AttDNN for 300 trials.
Figure 8. Distribution of hyperparameters in TPE trials.
Figure 9. Six tuning parameters with validation accuracy.
Figure 10. Training and validation loss for Naïve-DNN and TPE-AttDNN over 500 epochs.
Figure 11. Results of uncertainty quantification based on Shannon entropy and predictive variance.
Table 1. Details of tower crane functional variations under 12 test case scenarios.

| Test Case ID | Slew Speed | Hoisting Speed | Payload to Counterweight Ratio |
|---|---|---|---|
| W0SSHS | Slow | Slow | 0% |
| W0SSHF | Slow | Fast | 0% |
| W0SFHF | Fast | Fast | 0% |
| W33SSHS | Slow | Slow | 33% |
| W33SSHF | Slow | Fast | 33% |
| W33SFHF | Fast | Fast | 33% |
| W67SSHS | Slow | Slow | 67% |
| W67SSHF | Slow | Fast | 67% |
| W67SFHF | Fast | Fast | 67% |
| W100SSHS | Slow | Slow | 100% |
| W100SSHF | Slow | Fast | 100% |
| W100SFHF | Fast | Fast | 100% |
Table 2. Advantages and disadvantages of different machine learning methods.

| Machine Learning Method | Advantages | Disadvantages |
|---|---|---|
| kNN | Fast training speed; handles small to medium-sized datasets; highly interpretable results; can handle non-linear data | May not perform well on high-dimensional and non-linear data; may not perform well with outliers and missing data |
| SVM | Handles high-dimensional data; robust to outliers and missing data; highly interpretable results; can handle non-linear data | Computationally expensive; may not perform well on non-linear data; not suitable for large datasets and may not perform well on small datasets |
| RF | Handles high-dimensional data; robust to overfitting; highly interpretable results | Computationally expensive; does not perform well on small datasets and may not be suitable for large datasets |
| XGBoost | Fast training speed; handles high-dimensional data; robust to overfitting; highly interpretable results; can handle non-linear data | May not perform well on complex data; sensitive to hyperparameter tuning; may not be suitable for large datasets; may not perform well on non-linear data |
| AttDNN | Handles complex data and dynamic systems; accurate and confident predictions; can handle high-dimensionality and non-linearity of data | Computationally expensive; requires large amounts of training data; difficult to interpret results |
Table 3. Hyperparameters of AttDNN to be optimized.

| Hyperparameter | Description | Data Type | Search Space |
|---|---|---|---|
| neurons | Number of neurons in fully connected layers | Integer | 30–70 |
| num_denseblock_before | Number of dense blocks before attention block | Integer | 2–10 |
| num_denseblock_after | Number of dense blocks after attention block | Integer | 2–10 |
| num_heads | Number of heads in the attention block | Integer | 2–8 |
| num_epochs | Number of training epochs | Categorical | [50, 100, 150, 200, 250, 500] |
| batch_size | Minibatch size used for training | Categorical | [32, 64, 128, 256] |
Table 4. The confusion matrix for binary (two-class) classification.

| | Actual P | Actual N |
|---|---|---|
| Predicted P | TP | FP |
| Predicted N | FN | TN |

Here, P—Positive, N—Negative, TP—True positive, FP—False positive, TN—True negative, FN—False negative.
Table 5. Confusion matrices for predictions on the test dataset for Naïve-DNN and TPE-AttDNN.

TPE-AttDNN:

| Actual Values | Predicted Normal | Predicted Abnormal |
|---|---|---|
| Normal | 126 | 11 |
| Abnormal | 6 | 138 |

Naïve-DNN:

| Actual Values | Predicted Normal | Predicted Abnormal |
|---|---|---|
| Normal | 120 | 17 |
| Abnormal | 38 | 106 |
Table 6. Accuracy of Naïve-DNN and TPE-AttDNN on the testing dataset.

| Model | False Positive Rate (FPR) | False Negative Rate (FNR) | Accuracy | F1 Score |
|---|---|---|---|---|
| TPE-AttDNN | 0.0803 | 0.0417 | 0.9395 | 0.9420 |
| Naïve-DNN | 0.1241 | 0.2639 | 0.8043 | 0.7940 |
Table 7. Ablation study on the performance of model components (✓ = component present, – = component removed or replaced).

| Number | Dense Block Before | Dense Block After | Attention Block | Batch Normalization | LReLU | ReLU | Accuracy | Shannon Entropy |
|---|---|---|---|---|---|---|---|---|
| 1 | – | ✓ | ✓ | ✓ | ✓ | – | 0.8826 | 0.1187 |
| 2 | ✓ | – | ✓ | ✓ | ✓ | – | 0.8256 | 0.2422 |
| 3 | ✓ | ✓ | – | ✓ | ✓ | – | 0.9146 | 0.0886 |
| 4 | ✓ | ✓ | ✓ | – | ✓ | – | 0.8648 | 0.1588 |
| 5 | ✓ | ✓ | ✓ | ✓ | – | ✓ | 0.8932 | 0.1588 |
| AttDNN | ✓ | ✓ | ✓ | ✓ | ✓ | – | 0.9395 | 0.0727 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
