1. Introduction
Underwater acoustic communication networks [1,2,3] can facilitate the exchange of information among multiple types of observation equipment in diverse underwater locations. This technology has been extensively applied in marine resource exploitation, marine environment monitoring, and related fields [4,5,6]. As a result, this topic has emerged as a significant area of research.
After extensive development, underwater acoustic communication technology has evolved to include a wide range of communication modes [7,8,9,10,11,12], including single-carrier coherent phase shift keying (PSK), noncoherent multifrequency shift keying (MFSK), and pseudorandom spread spectrum (PRSS). Each mode offers distinct advantages in terms of communication capability, reliability, and covertness. However, current underwater acoustic communication networks typically rely on a single mode, which may not adapt effectively to complex and dynamic marine environments or to the increasing diversity of applications. By dynamically selecting the optimal communication mode [13,14,15], an underwater network can maximize efficiency across environments and application requirements. In [14], a multimode underwater acoustic network was introduced, but the work only demonstrated multimode transmission without analyzing its impact on network performance. In [15], Liu proposed a multimode network in which the communication mode is selected based on the channel delay and SNR. However, that selection is determined mainly by the marine environment and the point-to-point received SNR; it takes the perspective of a single node rather than seeking a multimode scheme that maximizes the overall performance of the network. Throughput is a crucial metric for evaluating the transmission performance of an underwater acoustic network. Therefore, this paper analyzes the throughput performance of a multimode underwater acoustic communication network and determines the optimal multimode solution.
The long propagation delay of the underwater acoustic channel gives rise to an ambiguous channel state (busy/idle) and uncertain signal arrival times. This complicates the design of traditional protocols for underwater acoustic communication networks and significantly reduces network throughput. However, recent research [15,16,17,18] has presented an alternative viewpoint. Chitre [16] proposed a novel method for analyzing throughput in networks with large propagation delays. Rather than suppressing such delays, as traditional approaches do, the method exploits propagation delay to achieve higher network throughput. Building upon this approach, Said [17] proposed a method for analyzing the upper limit of throughput performance in linear underwater acoustic networks. In [18], the authors focused on the physical layer and examined the throughput performance of a time-slotted UASN with guard time. Kan [19] analyzed the throughput performance of network nodes with different packet sizes. The study of network performance relies heavily on the state of physical layer transmission. However, these studies oversimplify the physical layer of underwater acoustic communication by assuming that every node has identical transmission capability, without accounting for variations in channel characteristics and communication rates across links. Furthermore, refs. [15,16,17,18] use a traditional interference conflict model, which assumes that overlapping signals can never be decoded successfully. In reality, multiple communication signals can overlap and still be decoded successfully under certain conditions. This mismatch with real-world behavior leads to inaccurate bounds on network throughput.
Scholars have studied the influence of underwater acoustic channels on network transmission. Zhong [20] studied the difference between horizontal and vertical propagation in underwater acoustic channels and proposed a method for analyzing throughput performance based on this difference. However, that work did not recognize that the discrepancy between horizontal and vertical propagation is caused by the channel arrival structure, and it did not thoroughly analyze the channel structure or physical layer communication. Nan [21] and Hua [22] used the signal-to-noise ratio to assess the successful reception of physical layer underwater acoustic communication signals and designed a MAC protocol. Nevertheless, in an underwater acoustic environment, channel characteristics such as the Doppler shift and multipath spread significantly affect the transmission status of the physical layer. At present, the propagation characteristics of ocean channels are not considered in underwater acoustic networks. This can lead to inadequate descriptions of interference conflicts, which ultimately affects the performance analysis and protocol design of underwater acoustic networks.
Building upon the aforementioned research, this paper first analyzes the influence of underwater acoustic channels on the bit error rate and packet loss rate of underwater acoustic communication. It develops a decoding conflict model that follows the transmission laws of underwater acoustic channels, and thus models conflicts in underwater acoustic networks accurately. Subsequently, taking into account physical layer characteristics such as dynamic channels and multimode transmission, network transmission is described using the decoding conflict model. By integrating the Markov decision process, a high-precision performance analysis model for multimode underwater acoustic networks is formulated, and a corresponding performance analysis method is proposed. These efforts lay a theoretical foundation for underwater acoustic network design and subsequent protocol research.
The rest of the paper is organized as follows. The system model is introduced in Section 2. In Section 3, the Markov decision process modeling of the throughput performance limits of multimode underwater acoustic networks is presented, together with a solution method based on dynamic programming and greedy algorithms. Simulation results are presented and discussed in Section 4, and conclusions are drawn in Section 5.
2. System Model and Assumptions
In [15,16,17,18,19], the authors proposed a throughput performance analysis method based on the traditional interference conflict model. Based on this, this paper proposes a more accurate interference conflict model and throughput analysis method, which are briefly introduced below.
2.1. Integer Delay Matrices and Traditional Network Interference Conflict Model
In a linear acoustic communication network with nonzero propagation delay, the propagation delay between nodes $i$ and $j$ is defined as $d_{ij}$, where $\forall i, j$, $1 \le i, j \le N$. Assume that the channel is symmetric, i.e., $d_{ij} = d_{ji}$. The geometric relationship between network nodes is represented by the delay matrix $D$, which is defined by using a standard unit time slot length:

$$D_{ij} = \frac{d_{ij}}{\tau} = \frac{T_{\mathrm{Bellhop}}(p_i, p_j, E)}{\tau} \qquad (1)$$

where $p_i$ represents the spatial location of node $i$ in three-dimensional space; $E$ represents the current network environment parameters, primarily including the sound velocity profile and seabed type; $T_{\mathrm{Bellhop}}(p_i, p_j, E)$ represents the propagation delay calculated by the Bellhop model under the current environment and network node locations; and $\tau$ represents the length of a time slot. It is important to note that the elements of the delay matrix are propagation delays between links in units of time slot length $\tau$ and can be rational numbers, i.e., $D$ can be a noninteger delay matrix. However, with an appropriate choice of time slot length $\tau$, the given noninteger delay matrix can be approximated by an integer delay matrix $D$ as:
where $\lfloor a \rfloor$ is the largest integer not exceeding $a$.
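As a rough illustration of how a delay matrix in slot units might be assembled, the sketch below replaces the Bellhop propagation delay with a straight-line path at a constant sound speed of 1500 m/s. That substitution, the node coordinates, the slot length, and the function names are all assumptions made for this example. Each element is rounded down with the floor operator; whether the paper applies the floor to the raw delay or to a shifted value is not recoverable from the text, so the rounding rule is also an assumption.

```python
import math

SOUND_SPEED_M_S = 1500.0  # assumed constant; the paper uses the Bellhop model


def delay_matrix(positions, tau):
    """Return propagation delays between all node pairs in units of tau."""
    n = len(positions)
    return [
        [math.dist(positions[i], positions[j]) / SOUND_SPEED_M_S / tau
         for j in range(n)]
        for i in range(n)
    ]


def integer_delay_matrix(delays):
    """Approximate a noninteger delay matrix by flooring every element."""
    return [[math.floor(d) for d in row] for row in delays]


# Hypothetical node coordinates in metres (x, y, depth).
positions = [(0.0, 0.0, 0.0), (1800.0, 0.0, 0.0), (0.0, 3300.0, 0.0)]
D = delay_matrix(positions, tau=1.0)   # noninteger delays, in slots
D_int = integer_delay_matrix(D)        # integer approximation
print(D_int)
```

A smaller slot length reduces the relative rounding error but enlarges the integer delays, and with them the size of any schedule search.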
The conventional model of network interference collision proposes that when the timing of two packets reaching the same receiver overlaps, the receiving node is unable to correctly receive any packet.
The network transmission considered in this paper is unicast, which means that each sending node addresses exactly one receiving node, and the length of a message equals the slot length τ. Distances are normalized so that the distance between adjacent nodes is 1, the communication distance is RC = 1, and the interference distance is RI = k. This implies that each node can communicate only with its adjacent nodes, and that nodes within a distance of k from a sending node experience interference in reception.
2.2. Schedules and Network Throughput
The slot scheduling matrix for the network with N nodes and a period of T is denoted as SNT. The element $S_{j,t}$ of the matrix represents the state of node j at time slot t. In a half-duplex transceiver configuration with unicast transmission, $S_{j,t} = i > 0$ indicates that node j sends data to node i in time slot t; $S_{j,t} = -i < 0$ means that node j receives data from node i in time slot t; and $S_{j,t} = 0$ signifies that node j is silent during time slot t.
Assume the communication distance of the network is denoted as RC and the interference distance is denoted as RI. If node $j$ transmits information to node $i$ in slot $t$, node $i$ receives it in slot $t + D_{ij}$, and the conditions for receiving the signal from node $j$ are as follows:

$$S_{j,t} = i, \qquad S_{i,\,t+D_{ij}} = -j \qquad (3)$$

To ensure successful reception of the signal from node $j$, it is essential that no signal from any other node $k$ within the interference distance of node $i$ reaches node $i$ within the same time slot, as indicated in Formula (4):

$$S_{k,\,t+D_{ij}-D_{ik}} \le 0, \qquad \forall k \ne j \qquad (4)$$
Assume that the transmission rate of the link between nodes $i$ and $j$ is $\nu$ bps, so that the amount of information transmitted in one time slot is $\nu\tau$ bits. The corresponding block transmission delay $\mu$ equals the unit time slot length $\tau$. The average network throughput $Y$ is defined as the total amount of information successfully received by all nodes in the network within a unit of time, normalized by the link rate. The average throughput $Y$ of the scheduling matrix SNT with period $T$ is represented by the equation

$$Y = \frac{1}{T} \sum_{t=1}^{T} \sum_{j=1}^{N} I\left(S_{j,t} < 0\right)$$

where $I(\cdot)$ is a truth function that takes the value 1 when its argument is true and 0 when it is false. Considering the physical layer link transmission rate, the final network throughput is $\nu Y$.
In [16], the authors show that the normalized throughput of an N-node unicast network does not exceed N/2 when the propagation delay is non-zero.
The following two examples illustrate the fundamental principle of time slot scheduling in the context of long propagation delay. We consider a 3-node network with nodes positioned at the vertices of an equilateral triangle, where the normalized communication distance and interference distance are RC = 1 and RI = 1, respectively. The transfer time for a message block is μ = τ. The delay matrix D and scheduling matrix S for this network are as follows:
For the three-node network with an equilateral triangle topology, the three nodes can successfully transmit 6 units of information within a period of T = 4τ. The network throughput is Y = 6ντ/4τ = 1.5ν. When propagation delay is not considered, the three nodes complete their transmissions in three separate time slots, and the throughput is Y = 1ν. Therefore, the throughput of the three-node network with non-zero propagation delay is 50% higher than that of the network with zero propagation delay.
For the three-node network with an isosceles triangle topology, the delay matrix and scheduling matrix of the network are expressed as
The three nodes can successfully transmit 12 units of information within a period of T = 8τ. The network throughput is Y = 12ντ/8τ = 1.5ν.
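The equilateral-triangle figure of 1.5 can be checked numerically under the traditional collision model. The sketch below is an illustration under several assumptions: the send-only schedule is assembled from the four state values [0;1;0], [0;3;2], [2;0;0], and [3;0;1] reported in Section 4.1, all inter-node delays are taken as one slot, every node is assumed to lie within the interference distance of every other node, and the function name is hypothetical.

```python
def normalized_throughput(D, S):
    """Collision-free receptions per slot for a periodic send-only schedule.

    D[i][j] is the integer delay between nodes i and j (0-based indices).
    S[j][t] = i > 0 means node j+1 sends to node i (1-based id) in slot t;
    0 means the node is not sending in that slot.
    """
    n, period = len(S), len(S[0])
    received = 0
    for t in range(period):
        for j in range(n):
            dest = S[j][t]
            if dest <= 0:
                continue  # node j+1 is not sending in slot t
            i = dest - 1
            t_rx = (t + D[i][j]) % period  # arrival slot at the receiver
            if S[i][t_rx] > 0:
                continue  # half-duplex: the receiver is transmitting
            # Traditional model: any other signal arriving in t_rx collides.
            collision = any(
                S[k][(t_rx - D[i][k]) % period] > 0
                for k in range(n) if k not in (i, j)
            )
            if not collision:
                received += 1
    return received / period


# Equilateral triangle: unit delay on every link, period of four slots.
D_tri = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
S_tri = [[0, 0, 2, 3],   # node 1
         [1, 3, 0, 0],   # node 2
         [0, 2, 0, 1]]   # node 3
Y_delay = normalized_throughput(D_tri, S_tri)

# Zero propagation delay: the nodes take turns, one packet per slot.
D_zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
S_turns = [[2, 0, 0], [0, 3, 0], [0, 0, 1]]
Y_zero = normalized_throughput(D_zero, S_turns)
print(Y_delay, Y_zero)
```

Multiplying either value by the link rate ν gives the throughput in bps.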
2.3. Decoding Conflict Model and Multimode Underwater Acoustic Communication
The conventional model for network interference collisions is derived from terrestrial wireless networks and does not consider the characteristics of multipath propagation in underwater acoustic channels. As a result, it may provide an inaccurate description of interference conflicts. This paper presents a decoding conflict model designed to align with the propagation characteristics of underwater acoustic channels and thereby to model conflicts within underwater networks precisely. The decoding conflict model uses the bit error rate at receiving node i to determine whether node i can successfully receive data packets. If the bit error rate at receiving node i is below a certain threshold, then node i can successfully receive data packets from node j; otherwise, node i cannot receive the data successfully. It can be expressed by the following formula:
$$\mathrm{BER}\left(\mathrm{SINR}_i, p_i, p_j, f_d\right) \le \mathrm{BER}_{th}, \qquad \mathrm{SINR}_i = \frac{P_s}{P_I + P_n}$$

where $\mathrm{SINR}_i$ represents the received signal-to-interference-plus-noise ratio at receiving node $i$, $P_s$ is the received signal power, $P_I$ is the interference signal power, $P_n$ is the noise power, $p_i$ and $p_j$ denote the positions of nodes $i$ and $j$, and $f_d$ stands for the Doppler shift between the transmitting and receiving nodes. Thus, $\mathrm{BER}(\cdot)$ is the bit error rate obtained under the current conditions, and $\mathrm{BER}_{th}$ is the bit error rate requirement of the application layer.
The following formula describes the decoding conflict model through the received signal-to-interference-plus-noise ratio:

$$\mathrm{SINR}_i \ge \mathrm{SINR}_{\min}\left(A, p_i, p_j, f_d, \mathrm{BER}_{th}\right)$$

where $A$ is the processing algorithm of the current communication mode, and $\mathrm{SINR}_{\min}$ is the minimum received signal-to-interference-plus-noise ratio needed to achieve conflict-free transmission under the constraints of the current sending and receiving conditions and the transmission requirements of the application layer.
It is assumed that all network nodes support three communication modes, namely, PSK, MFSK, and PRSS, each with its own communication rate.
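The threshold form of the decoding conflict model can be sketched in a few lines. The example below is only an illustration: the function names, the decibel levels, and the 5 dB threshold are hypothetical, and the dependence of the threshold on node positions, Doppler, and processing algorithm is collapsed into a single number.

```python
import math


def db_to_linear(level_db):
    """Convert a power level in dB to linear scale."""
    return 10.0 ** (level_db / 10.0)


def sinr_db(signal_db, interferers_db, noise_db):
    """SINR in dB: wanted power over summed interference plus noise power."""
    interference = sum(db_to_linear(x) for x in interferers_db)
    noise = db_to_linear(noise_db)
    return 10.0 * math.log10(db_to_linear(signal_db) / (interference + noise))


def is_decodable(signal_db, interferers_db, noise_db, threshold_db):
    """Decoding conflict test: success when the SINR reaches the threshold."""
    return sinr_db(signal_db, interferers_db, noise_db) >= threshold_db


# Hypothetical received levels (dB) and a hypothetical 5 dB mode threshold.
weak_overlap = is_decodable(100.0, [80.0], 90.0, threshold_db=5.0)
strong_overlap = is_decodable(100.0, [97.0], 90.0, threshold_db=5.0)
print(weak_overlap, strong_overlap)
```

The first case shows the difference from the traditional model: the packets overlap in time, yet the wanted one is still counted as received because the interferer is weak.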
3. Throughput Performance Limits of Multimode Underwater Acoustic Networks
In Section 2.2, the communication capability of each link is assumed to be equal, resulting in a normalized throughput upper limit of N/2. However, this conclusion becomes problematic when the data transmission capabilities of different network links vary. For instance, in the scheduling result of the isosceles triangle network described in Section 2.2, where an MFSK communication system with a rate of 800 bps is employed at the physical layer, the upper limit of network throughput is 1.5 × 800 = 1200 bps. However, because nodes 1 and 2 are relatively close and the propagation loss between them is small, this link can adopt a high-rate communication mode, such as PSK, whose communication rate can reach 1800 bps. The propagation loss on the other two links is large, so they continue to use MFSK at its lower rate. In this case, the optimal scheduling result of the three-node network is given according to Formula (5), and its throughput limit is 1/2 × 800 + 1/2 × 800 + 1/2 × 1800 = 1700 bps. However, if only the link between node 1 and node 2 is used to transmit PSK signals, the network throughput is 2/2 × 1800 = 1800 bps, which is greater than the maximum throughput of 1700 bps for the three-node network. For multimode networks, therefore, the traditional throughput upper limit analysis is no longer applicable.
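The arithmetic behind this comparison can be written out directly. In the sketch below, the helper name and the idea of a common per-link time share are illustrative assumptions; the rates of 800 bps and 1800 bps and the shares of 1/2 and 2/2 come from the example above.

```python
def schedule_throughput(link_rates_bps, share_per_link):
    """Sum of link rates weighted by the fraction of time each link is used."""
    return sum(share_per_link * rate for rate in link_rates_bps)


# Three links, each active half of the time (normalized throughput 3/2).
single_mode = schedule_throughput([800, 800, 800], 0.5)    # MFSK on all links
multi_mode = schedule_throughput([800, 800, 1800], 0.5)    # PSK on the short link
# Only the short link is used, all of the time (normalized throughput 2/2).
short_link_only = schedule_throughput([1800], 1.0)
print(single_mode, multi_mode, short_link_only)
```

The schedule that is optimal in packets per slot is therefore not optimal in bits per second once link rates differ.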
The Markov decision process [23,24,25] can be described by a quintuple $\langle S, A, P, R, \gamma \rangle$. The specific meanings are shown in Table 1.
The policy $\pi$ is a mapping function from state to action, $\pi: S \to A$. That is, given the current state $s$, the next action $a = \pi(s)$ is determined according to the policy $\pi$. For every fixed policy $\pi$, the value function $V^{\pi}(s)$ satisfies the Bellman equation:

$$V^{\pi}(s) = R(s, \pi(s)) + \gamma \sum_{s' \in S} P(s' \mid s, \pi(s))\, V^{\pi}(s') \qquad (10)$$

The action value function can be defined by considering the value effect of the action $a$ taken:

$$Q^{\pi}(s, a) = R(s, a) + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^{\pi}(s')$$

The purpose of solving the Markov decision process is to find an optimal policy $\pi^{*}$ whose value function is the largest in every state $s$.
In this paper, the throughput performance limit problem of multimode networks is modeled as a Markov decision process, and the greedy algorithm and dynamic programming algorithm are combined to iteratively solve the network throughput and optimal scheduling results. The following is a detailed introduction.
3.1. Modeling of Throughput Performance Limits in Multimode Networks
3.1.1. System State Modeling
In the network node scheduling model introduced earlier, the state of node $j$ in time slot $t$ is denoted as $S_{j,t}$. In an $N$-node network, each node has $2N-1$ states, which include sending signals to other nodes ($N-1$ states), receiving signals from other nodes ($N-1$ states), and an idle state. The states of the sending and receiving nodes of each link correspond to each other; for example, on link $(j, i)$, node $j$ transmits signals that reach node $i$ after a propagation delay $D_{ij}$. This correspondence can be used to reduce the number of node states and simplify the MDP. Specifically, only the sending states and the idle state are treated as individual node states, while the receiving state of a node is jointly determined by the sending states of the source nodes and the propagation delays $D_{ij}$. Because the receiving states are no longer tracked, the number of states per node is reduced to $N$.
The propagation delay corresponding to the maximum transmission range in the network is denoted by $G$. The decision in time slot $t$ is related only to the node states in time slots $t-G$ to $t-1$, so the network state can be determined by the node states of the preceding $G$ slots. Therefore, the state $s$ can be modeled as an $N \times G$ matrix, where $s_{i,k}$ represents the state of the $i$-th node at slot $t - G - 1 + k$.
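To give a feel for the size of this state space, the sketch below enumerates the reduced states for a small network. The function names and the enumeration order are assumptions. For N = 3 and G = 1, this particular order happens to reproduce the state serial numbers 3, 8, 9, and 19 that Section 4.1 attaches to [0;1;0], [0;3;2], [2;0;0], and [3;0;1], but that the authors number their states this way is an inference rather than something the paper states.

```python
from itertools import product


def node_choices(n_nodes):
    """Reduced per-node states: 0 (idle) or the 1-based id of a destination."""
    return [
        [0] + [dest for dest in range(1, n_nodes + 1) if dest != node]
        for node in range(1, n_nodes + 1)
    ]


def enumerate_states(n_nodes, g):
    """All network states: one column of node states for each of g slots."""
    per_slot = list(product(*node_choices(n_nodes)))
    return list(product(per_slot, repeat=g))


states = enumerate_states(n_nodes=3, g=1)
serials = [states.index((column,))
           for column in [(0, 1, 0), (0, 3, 2), (2, 0, 0), (3, 0, 1)]]
print(len(states), serials)
```

The count grows as N^(N·G), for example 729 states for N = 3 and G = 2, which is why exhaustive evaluation is practical only for small networks.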
3.1.2. Action Modeling
Action modeling is similar to state modeling in that the number of states per node remains constant at $N$. Since state $s$ covers the time slots from $t-G$ to $t-1$, the signals transmitted during this period will be received in time slots $t-G+1$ to $t+G-1$, and action $a$ directly affects whether each node can receive them successfully. To evaluate the reward function, the network action must include the actions of all nodes in slots $t$ to $t+G-1$. Therefore, action $a$ can be represented as an $N \times G$ matrix, where $a_{i,k}$ indicates the action of node $i$ in slot $t+k-1$. Each node has $N$ actions, including $N-1$ sending actions and one idle action.
3.1.3. Probabilistic Modeling of System State Transition
The system state transition is deterministic: the next state $s'$ is completely determined by the current state $s$ and action $a$, so every transition probability is either 0 or 1.
3.1.4. Reward Function Modeling
The reward function is represented by the ability of the current state to successfully receive data in the future. Specifically, under the decoding conflict model, whether each node can successfully receive data is analyzed according to the current state and action .
(1) When a signal arrives at a node that is itself in the sending state, the amount of data successfully received is 0.
(2) If no signal addressed to the node arrives, the amount of data received is 0.
(3) In all other cases, the amount of successfully received data is determined by the decoding conflict model.
It can be expressed by Formula (12):
where
means that the signal sent by node
j in slot
reaches node
i in slot
,
is the number of supported communication systems,
is the processing algorithm of the
i-th communication mode and
is the rate of the
i-th communication system. If
, it indicates that the signal sent by node
in slot
reaches node
i in slot
, which is the interference signal of node
i,
is the interference signal energy, and
is the total interference signal energy.
After that, the amount of data received by each node at each moment is added up as the reward. The transmission scale of the signal in the network is
, so it is necessary to consider the benefits of the time scale of
.
The discount factor $\gamma$ is set to the constant 1.
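The three reward cases can be illustrated for a single receiver and a single slot. The sketch below is a simplification under stated assumptions: the SINR of the addressed signal is supplied directly, the reward is taken as the rate of the fastest mode whose threshold is met multiplied by the slot length, and the function name is hypothetical. The PSK and MFSK rates are those quoted in Section 3, while the PRSS rate and all three thresholds are invented for the example.

```python
# (mode name, rate in bps, decodable SINR threshold in dB)
# PSK and MFSK rates follow Section 3; the PRSS rate and all thresholds
# are hypothetical values chosen only for this illustration.
MODES = [("PSK", 1800.0, 12.0), ("MFSK", 800.0, 6.0), ("PRSS", 100.0, -5.0)]


def slot_reward(receiver_is_sending, addressed_sinr_db, tau=1.0, modes=MODES):
    """Bits received by one node in one slot under the three reward cases."""
    if receiver_is_sending:
        return 0.0  # case (1): a half-duplex node cannot receive while sending
    if addressed_sinr_db is None:
        return 0.0  # case (2): no signal addressed to this node arrives
    # Case (3): the fastest mode whose SINR threshold is met sets the reward.
    decodable_rates = [rate for _, rate, threshold in modes
                       if addressed_sinr_db >= threshold]
    return max(decodable_rates, default=0.0) * tau


print(slot_reward(False, 8.0), slot_reward(False, 15.0), slot_reward(True, 15.0))
```

Summing this quantity over all nodes and over the slots covered by a state-action pair gives one possible form of the total reward.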
3.2. The Method of Solving the Model
In this paper, we utilize a combination of dynamic programming and greedy algorithms to iteratively solve the Markov decision process, ultimately obtaining the throughput performance of a multimode underwater acoustic network. The key components of our approach include policy evaluation, policy improvement, and policy iteration, which are discussed in detail below.
3.2.1. Policy Evaluation—Estimating State Value
The Bellman equation of the state value function is shown in Equation (10). Under policy $\pi$, the value function of each state can be iterated continuously with the Bellman equation

$$V_{k+1}(s) = R(s, \pi(s)) + \gamma \sum_{s'} P(s' \mid s, \pi(s))\, V_{k}(s')$$

where $V_{k+1}(s)$ is the value function of state $s$ after iteration $k+1$, and $V_{k}(s')$ is the value function of state $s'$ after iteration $k$.
3.2.2. Policy Enhancement—Estimate Action Value
Policy evaluation yields the value function of the current policy. However, the current policy is not necessarily the best one, so it must be improved further.
Let us assume that in state $s$, we perform a new action $a$ and follow the original policy $\pi$ thereafter, calculating its action value as:

$$Q^{\pi}(s, a) = R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{\pi}(s')$$

The behavior pattern formed by this new action $a$ is called a new policy $\pi'$. To compare the quality of $\pi$ and $\pi'$, $Q^{\pi}(s, \pi'(s))$ and $V^{\pi}(s)$ are compared, and the better of the two is selected as the current policy. This process is policy improvement, as shown below:

$$\pi'(s) = \arg\max_{a \in A} Q^{\pi}(s, a)$$
3.2.3. Policy Iteration
When a policy $\pi$ is provided, a value function $V^{\pi}$ can be derived from it. Subsequently, a greedy policy $\pi'$ can be obtained from the value function. A new value function can then be obtained according to the new policy $\pi'$, leading to a new greedy policy. This iterative cycle eventually yields the optimal value function $V^{*}$ and the optimal policy $\pi^{*}$. The process of policy updating and improvement through cyclic iteration is referred to as policy iteration. The goal of policy iteration is to converge to an optimal policy by iteratively calculating the value function. The specific process is outlined below:

$$\pi_0 \xrightarrow{E} V^{\pi_0} \xrightarrow{I} \pi_1 \xrightarrow{E} V^{\pi_1} \xrightarrow{I} \cdots \xrightarrow{I} \pi^{*} \xrightarrow{E} V^{*}$$

where $\xrightarrow{E}$ indicates policy evaluation and $\xrightarrow{I}$ indicates policy improvement. The convergence condition is that the policies obtained in the last two iterations are the same.
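The evaluation and improvement loop can be sketched on a toy deterministic MDP. Everything in the example is an assumption made for illustration: the three states, the two actions, the rewards, and the function name are hypothetical, and a discount factor of 0.9 is used instead of the value 1 adopted in the paper so that the infinite-horizon evaluation converges.

```python
def policy_iteration(next_state, reward, gamma=0.9, tol=1e-9):
    """Policy iteration for a deterministic MDP given as lookup tables.

    next_state[s][a] and reward[s][a] give the successor state and the
    immediate reward of taking action a in state s.
    """
    n_states, n_actions = len(next_state), len(next_state[0])
    policy = [0] * n_states
    values = [0.0] * n_states
    while True:
        # Policy evaluation: sweep the Bellman equation until it settles.
        while True:
            delta = 0.0
            for s in range(n_states):
                a = policy[s]
                updated = reward[s][a] + gamma * values[next_state[s][a]]
                delta = max(delta, abs(updated - values[s]))
                values[s] = updated
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the action values.
        improved = [
            max(range(n_actions),
                key=lambda a: reward[s][a] + gamma * values[next_state[s][a]])
            for s in range(n_states)
        ]
        if improved == policy:  # same policy twice in a row: converged
            return policy, values
        policy = improved


# Toy problem: action 0 stays put with no reward, action 1 moves around
# the ring 0 -> 1 -> 2 -> 0 and collects rewards 1, 2, and 0.
toy_next = [[0, 1], [1, 2], [2, 0]]
toy_reward = [[0.0, 1.0], [0.0, 2.0], [0.0, 0.0]]
best_policy, best_values = policy_iteration(toy_next, toy_reward)
print(best_policy)
```

The stopping rule mirrors the convergence condition described above: iteration ends when two successive policies coincide.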
3.2.4. Algorithms for Performance Analysis
The throughput performance analysis algorithms for multimode networks, Algorithms 1 and 2, are shown below. Algorithm 1 is used to calculate the optimal action value function, and Algorithm 2 is used to transform the results of Algorithm 1 into throughput performance limits and optimal scheduling schemes for multimode networks.
Algorithm 1 Multimode network throughput performance analysis algorithm
Input: Network node sound source level (SL), propagation loss (TL), delay matrix (D), background noise level (NL), communication rate and decodable SNR threshold of each communication mode, time slot length τ
Output: Optimal policy
Procedure:
Set the value function of every system state to 0 and the current policy to a uniform distribution over all actions.
While true
1. For the current state, combine the parameters of the sonar equation (SL, TL, etc.) to calculate the SINR of each network node at the times affected by each action.
1.1 Using the reward function model corresponding to Formula (11), analyze the reward of each action in the current state in turn.
1.2 Update the value function of the current state according to the policy evaluation Formula (13).
2. Repeat step 1 for the next state until the value functions of all states have been evaluated.
3. According to the policy improvement Formula (15), determine the new optimal policy for every state in turn.
4. If the new policy is the same as the previous policy, then
4.1 Output the optimal policy.
4.2 Break.
End if
End while
Algorithm 2 Throughput and scheduling result analysis algorithm |
Input: Optimal policy , temporary scheduling matrix , status number Output: Multimode network throughput performance limit , optimal scheduling scheme Procedure: 1. 2. flag true 3. 4. While flag
4.1
4.2
4.3
4.4 if the network state saved in C has a loop, then
is the part of the loop in ;
Calculate the throughput of according to the formula (12), which is .
flag false.
End if
End while 5. , return 2 until all states are computed. 6. 7. |
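The loop search at the heart of Algorithm 2 can be sketched as follows: follow the optimal policy from a starting state until a state repeats, keep the repeating part as the schedule, and average the reward over it. The function names are hypothetical. The transitions 3 → 8 → 9 → 19 → 3 are the ones reported for the equilateral triangle network in Section 4.1, state 0 is an invented transient starting state, and the per-state rewards simply count the transmissions in each state value, on the assumption that all of them are received.

```python
def extract_cycle(start_state, next_state):
    """Follow deterministic transitions and return the loop that is reached."""
    first_visit = {}
    path = []
    state = start_state
    while state not in first_visit:
        first_visit[state] = len(path)
        path.append(state)
        state = next_state[state]
    return path[first_visit[state]:]  # drop the transient prefix


def cycle_throughput(cycle, reward_per_state):
    """Average reward per slot over one period of the schedule."""
    return sum(reward_per_state[s] for s in cycle) / len(cycle)


# Optimal-policy transitions between state serial numbers.
transitions = {0: 3, 3: 8, 8: 9, 9: 19, 19: 3}
# Packets sent in each state: [0;1;0], [0;3;2], [2;0;0], [3;0;1].
packets = {3: 1, 8: 2, 9: 1, 19: 2}

cycle = extract_cycle(0, transitions)
throughput = cycle_throughput(cycle, packets)
print(cycle, throughput)
```

Repeating the search from every starting state and keeping the best loop corresponds to the outer iteration of Algorithm 2; in a multimode network the packet counts would be replaced by the bits delivered in each state.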
4. Performance Evaluation
We evaluate the proposed methods in multimode networks via simulations, comparing the throughput of multimode networks under different network topologies and different channel qualities.
In the simulations, all network nodes support three communication modes, namely, PSK, MFSK, and PRSS, with a bandwidth of 3.2 kHz. Other parameters are shown in Table 2.
4.1. Performance Analysis of the Three-Node Single-Mode Network
Under the equilateral triangle network topology shown in Figure 1a, the proposed method is used to analyze the throughput performance of the single-mode network, and the results are compared with those of traditional methods to verify the correctness of the proposed method.
The results obtained after policy iteration are shown in the figure below, which illustrates up to three action choices for each state. The optimal scheduling scheme is determined through the analysis of Algorithm 1, as shown in Figure 2. When the initial state has serial number 3, adopting the optimal action (serial number 8) leads to a new state (serial number 8). The optimal action for this new state is then identified, and the process continues until a closed loop is formed. The scheduling result in Figure 2 can be represented by Formula (17): a cycle comprises four scheduling time slots with state serial numbers 3, 8, 9, and 19, whose state values are [0;1;0], [0;3;2], [2;0;0], and [3;0;1], respectively.
Algorithm 2 yields three optimal scheduling schemes.
Only one scheme is given in [14], and it is the same as the first scheme in this paper. This paper gives all three possible schemes, which verifies the validity of the proposed method.
4.2. Performance Analysis of Multimode Networks with Different SLs in Arbitrary Topologies
In the following, the throughput performance of the multimode network is analyzed using the method proposed in this paper under an irregular topology. The topology of an irregular triangular network is shown in
Figure 3.
The maximum time slot length of the irregular triangle network is G = 2. This network includes three transmission links
,
, and
, with corresponding link lengths rounded to
,
, and
respectively. The sound source level is measured at 186 dB. The optimal scheduling results are illustrated in
Figure 4, where different rows represent different nodes and different columns represent different time slots.
In the scheduling results:
- “Send_*” indicates that the node transmits signals during this time slot; the number in place of * is the destination node of the transmitted signals.
- “Rcv_*” indicates that the node receives signals during this time slot; the number in place of * is the source node of the received signals.
The corresponding scheduling result consists of eight slots with a throughput upper limit of 2400 bps per slot. To achieve maximum throughput, two communication modes, PSK and MFSK, are used within the network.
Figure 5 shows a comparison of the throughput analysis results between the proposed method and the traditional method at different sound source levels. The traditional method refers to the method presented in [16]. The throughput of the proposed method surpasses that of the method in [16], and it continues to improve as the number of iterations increases. After the fourth iteration, the algorithm converges to the upper limit of throughput at all sound source levels. When the throughput performance limits are compared across sound source levels, the results of the two methods are similar at low and high sound source levels. The largest disparity between the two methods appears at a sound source level of 179 dB, where the proposed method achieves a throughput 1.5 kbps per slot higher than that of the method in [16]. On average, the performance is improved by 68%.
Table 3 presents the results of the algorithm throughput analysis at various sound source levels. It is apparent that there is a direct correlation between the sound source level and the performance limit of network throughput and the communication rate of the associated communication link. When the sound source level is low, only the PRSS mode with the slowest speed can be selected for transmission. However, once the sound source level exceeds 188 dB, all links transition to using the PSK mode with the highest transmission speed. At this threshold, the upper limit of network throughput aligns with the conclusion of Chitre’s paper. It is evident that this paper’s algorithm takes into consideration physical layer characteristics such as link channel quality and multiple modes, resulting in a more accurate analysis of the upper limit of network throughput in an actual underwater acoustic environment.
4.3. Network Throughput under Different Hydrological Environments
Different hydrological conditions in the marine environment correspond to different propagation losses. The network throughput under three typical hydrological conditions is analyzed. The sound velocity profiles of these three conditions are depicted in
Figure 6a, and the corresponding transmission losses at different distances are obtained using the Bellhop model, as shown in
Figure 6. The different hydrological conditions are associated with different transmission losses, with hydrologic condition 1 having the smallest transmission loss and hydrologic condition 3 having the largest transmission loss. The propagation loss under hydrologic condition 1 is 0.48 dB and 1.31 dB smaller than that under hydrologic conditions 2 and 3, respectively.
The decodable SNR thresholds were analyzed for different systems and transmission distances using Monte Carlo simulation, yielding the specific results illustrated in
Figure 7. In the figure, link 3 represents the transmission link between node 1 and node 2, link 2 represents the transmission link between node 1 and node 3, and link 1 represents the transmission link between node 2 and node 3. The figure reveals the following findings:
(1) Among the communication modes, PRSS requires the smallest decodable SNR threshold, while PSK requires the largest.
(2) The largest threshold signal-to-noise ratio corresponds to hydrological condition 1, while hydrologic conditions 2 and 3 have smaller threshold signal-to-noise ratios. This difference is due to the complex channel multipath structure under hydrological condition 1, which increases the decoding threshold. On the other hand, hydrologic conditions 2 and 3 have simpler channel structures, resulting in smaller decoding thresholds.
(3) In the PRSS mode, decodable signal-to-noise ratio thresholds show minimal variation across different distances and hydrological conditions. However, significant differences exist within the PSK mode when considering various distances and hydrological conditions. This difference can be attributed to the increased sensitivity of the PSK mode to channel multipath structures compared to that of the PRSS mode.
The analysis of network throughput under different hydrological conditions is presented in
Figure 8. It is evident that the throughput performance limits of each sound source level vary under different hydrological conditions, providing results that are more consistent with real-world situations. Similar to
Figure 5, the throughput results under different hydrological conditions are consistent for smaller and larger sound source levels. This is because when the sound source level is large enough, all links support multimode communication, whereas when the sound source level is small, no communication is possible. The maximum difference between the throughput upper limits of the three hydrologic conditions is 0.5 kbits/slot. Compared to the traditional method, the average throughput performance is improved by 67.5%, 78.3%, and 67.5%, respectively, under the three hydrological conditions.
5. Conclusions
We have conducted simulations to demonstrate that considering the differences in the communication capabilities of different links can lead to more accurate throughput limits for multimode networks. The signal-to-noise ratio of received signals at different nodes can vary due to factors such as the network node sound source level, propagation loss, marine environment, and interference signals. Higher received signal-to-noise ratios in communication systems can support higher transmission rates, while lower ratios support lower rates. By leveraging this understanding, we remodeled the throughput analysis process and found that the upper limit of the throughput performance for multimode networks is greater than that of traditional methods. Additionally, we developed a decoding conflict model that incorporates the propagation characteristics of actual underwater acoustic channels, providing a more precise assessment of underwater acoustic signal conflicts.
In summary, the proposed methods offer an accurate evaluation of the throughput performance of multimode underwater acoustic networks. However, when the network topology becomes more complex or the number of nodes increases, the state and action matrices become very large, which renders the solution algorithm based on dynamic programming and greedy algorithms impractical. We therefore plan to improve the solution method so that it can handle arbitrary network topologies. We also intend to conduct sea trials to apply and validate the method described in this paper and to refine it further.