High-heat-flux removal from microdevices remains a major challenge even after more than four decades of research and development. Micro-cooling devices are investigated both experimentally and in combination with computational studies. The experimental results are not mutually consistent, but computational results appear to reduce the anomalies in experimental analyses of micro-devices, termed “scaling effects,” which are difficult to capture in experimental investigation [
1]. Various computation-intensive studies have explored fluid flow characteristics and heat-transfer effects. Micro-cooling devices encompass microchannels [
2,
3,
4,
5], porous materials [
6], and microjets [
7,
8,
9,
10,
11,
12,
13,
14] with a promising future in all types of designs. However, each has its own set of challenges, from the design stage to the manufacturing stage. Substrate temperature non-uniformity along the length of a microchannel heat sink is one of its inherent features. To counter it, Samal and Moharana [
15] proposed a novel recharging microchannel design, which improves temperature uniformity and reduces back axial conduction. Porous materials must contend with a high pressure drop [
6]. The microjet impingement needs to deal with the problem of material erosion [
16], but relaxes the constraints of temperature non-uniformity and pressure drop. Most investigations of micro-cooling assume uniform heat flux generation; however, in practical cases, microprocessor chips generate non-uniform heat flux [
17]. The flexibility in microjet arrangement allows it to be directly embedded within the device to alleviate the hotspots and thermal stresses.
Many investigations have examined different designs to improve heat transfer in the heat sink under constraints such as maximum pumping power, maximum pressure drop, constant coolant flow rate, constant heat flux, constant interface area, or constant cross-sectional area. Hajmohammadi et al. [
7] sought the optimal location and size of heat sink attachments that minimize the peak temperature in the substrate, with the total heat sink area held constant. They suggested that dividing a large heat sink into a group of smaller ones enhances thermal performance by reducing peak temperatures. Husain et al. [
8] numerically investigated a hybrid microchannel design comprising a microchannel pillar and jet impingement technique, which enhances the heat transfer rate because the stagnation effect under the jet diminishes due to channel flow. Zhang [
9] combined a slot jet with a microchannel design and analyzed the hybrid design numerically. Three channel shapes with the same cross-sectional area were considered: circular, trapezoidal, and rectangular. The results showed that the module with a circular channel had the highest pressure drop, while the trapezoidal channel had the lowest substrate temperature. The highest pressure drop in the circular shape was attributed to the strong vortices formed perpendicular to the direction of flow. Peng et al. [
10] numerically compared a traditional microchannel (TMC) heat sink with a multi-jet microchannel (MJMC) heat sink. They found that temperature uniformity improves significantly as the number of jets increases, and that the cooling performance of the MJMC surpasses that of the TMC. Husain et al. [
11] compared different spent-flow schemes with and without extraction ports. They reported that removing the fluid from the edges in unconfined flows gives higher temperature uniformity and lower pressure drop. Qidwai et al. [
4] found, in their study of a diverging channel, that temperature uniformity is obtained at the expense of the Nusselt number. Han et al. [
12] combined the microchannel with the microjet to capture the merits of both designs. The full-length trenches take away the spent fluid, which helps to reduce the crossflow effect. The maximum temperature distribution remains within the range of 25 °C. Deng et al. [
14] investigated slot jet array and hybrid microchannel slot jet array heat sinks for Reynolds numbers from 230 to 2760. They affirmed that the hybrid heat sink performs better for the same flow rate. Xie et al. [
18] argued that a uniform heat flux condition is acceptable only at the chip surface, rather than at the heat sink surface as assumed in many studies. The study included a combination of flow arrangements and different positions of heat-generating chips. The investigation highlighted that chip position significantly affects the temperature distribution. Lelea [
19] found that the position and number of jets strongly influence hydrothermal performance. For fixed pumping power, the minimum temperature is higher for heat sinks with multiple inlets, and the temperature difference across the substrate is reduced. Wiriyasart and Naphon [
20] conducted a parametric study of liquid jet impingement on different fin structures to provide guidelines for designing heat sinks with minimum thermal resistance.
In most of the investigations, hybrid designs are analyzed for uniform heat flux conditions. For non-uniform heat flux, Yoon et al. [
21] used thermal resistance as an indicator to obtain the optimal position of partial heating for uniform and non-uniform heat flux conditions. They concluded that partial heating located at the centre behind the heat sink is the optimal position irrespective of the non-uniformity of heat flux. Sharma et al. [
17] proposed a one-dimensional semi-empirical modeling approach for quickly determining the initial design of a microchannel heat sink targeting hot spots. Design parameters in a microchannel heat sink were optimized by Hadad et al. [
22] for non-uniform heat flux, using response surface methodology (RSM) together with the JAYA algorithm. Another work by Hadad et al. [
23] investigated the shape optimization of a water-cooled impingement micro-channel heat sink, including manifolds.
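Thermal resistance, used above as the indicator for ranking hot-spot positions, is commonly defined as the rise of the peak substrate temperature above the coolant inlet temperature per unit of heat input. A minimal sketch under that assumed definition follows; the function name and the candidate temperatures are hypothetical illustrations, not values from the cited studies.

```python
def thermal_resistance(t_max, t_inlet, heat_input):
    """Overall thermal resistance in K/W (assumed definition):
    (peak substrate temperature - coolant inlet temperature) / heat input."""
    if heat_input <= 0:
        raise ValueError("heat_input must be positive")
    return (t_max - t_inlet) / heat_input


# Hypothetical peak temperatures (K) for three hot-spot positions,
# all with a 300 K coolant inlet and 100 W of heat input.
candidates = {"upstream": 352.0, "centre": 345.0, "downstream": 358.0}
resistances = {name: thermal_resistance(t, 300.0, 100.0)
               for name, t in candidates.items()}

# The position with the lowest thermal resistance is preferred.
best_position = min(resistances, key=resistances.get)
```

In practice the peak temperatures would come from CFD or experiment; only the ranking step is shown here.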
1.2. Machine Learning-Based Surrogate Model Optimization Techniques
Machine learning algorithms evolve constantly, aiming for better prediction with the least time and effort. They have been implemented in several fields such as intelligent communications [
29], 3D printing technology [
30], hydrological science [
31], and many more. Many studies have taken advantage of machine learning to improve heat transfer in heat sinks. Some investigations focus on shape optimization, such as Krzywanski [
32], who introduced a design methodology for a bio-inspired falling-film heat exchanger using a combination of a Genetic Algorithm (GA) and an Artificial Neural Network (ANN). The study reported that the AGENN model can successfully determine the design parameters and operating conditions required to generate the desired total heat transfer rate of the heat exchanger. Han et al. [
33] applied surrogate-based shape optimization to the wing-body of an aircraft. They used Latin Hypercube sampling; however, new sample points were chosen with an infill criterion, so that new designs are generated from the known designs. To explore the sensitivity to changes in design parameters, the investigation used Maximum-Likelihood Estimation (MLE), whose only drawback is that a large number of sample points is needed to estimate the correlation. They suggested running surrogate-based models several times to reach the global optimum.
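The Latin Hypercube sampling step mentioned above can be sketched as follows. This is a minimal illustration, not the implementation of the cited study: the function name, the two design variables, and their bounds are assumptions, and the infill criterion is not shown.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Return n_samples design points. Each variable's range is split into
    n_samples equal strata, and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random location inside each stratum of the unit interval.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across variables.
        rng.shuffle(strata)
        columns.append([low + u * (high - low) for u in strata])
    # Transpose per-variable columns into a list of sample points.
    return [list(point) for point in zip(*columns)]


# Hypothetical design variables: jet diameter (mm) and channel width (mm).
samples = latin_hypercube(5, [(0.1, 0.5), (0.2, 1.0)], seed=0)
```

Each of the five points would then be evaluated (for example by CFD) to provide training data for the surrogate model.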
Xi et al. [
34] used the MATLAB nonlinear fitting function ‘nlinfit’ to fit a correlation for the Nusselt number to 90 experimental data points. A Back Propagation Neural Network (BPNN) was then trained together with a Genetic Algorithm (GA) to improve the heat transfer performance of ribbed channels. In the combined GA-BPNN algorithm, the GA optimized the weights and thresholds of the BPNN so that, together, they could reach the global optimum solution. Singh [
35] addressed the problem of turbine-blade cooling using a slot jet cooling mechanism. The location of slot jet impingement on a concave surface was optimized using a hybrid of a feed-forward Artificial Neural Network (ANN) and a Genetic Algorithm (GA). The ANN was trained on nonlinear data from 175 cases in the design space. The optimal solution predicted by the ANN-GA was then simulated in CFD, and the difference in output was negligible. Shi et al. [
36] proposed a methodology for modifying heat exchanger design using CFD coupled with the Radial Basis Neural Network surrogate model and genetic algorithm to achieve maximum flow uniformity.
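The surrogate-plus-GA workflow described in these studies can be sketched as follows. This is a minimal illustration and not the method of any cited work: `surrogate` is a hypothetical analytic stand-in for a trained ANN or RBF network, the two design variables are assumed to be normalized to [0, 1], and the tournament selection, blend crossover, and Gaussian mutation settings are assumed values.

```python
import random


def surrogate(x):
    """Hypothetical stand-in for a trained surrogate model: predicted peak
    temperature (arbitrary units) for two normalized design variables."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2


def genetic_minimize(f, n_vars, pop_size=40, generations=60, seed=0):
    """Minimize f over [0, 1]^n_vars with a simple real-coded GA."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_vars)] for _ in range(pop_size)]

    def tournament():
        # Selection: the better of two randomly drawn individuals.
        a, b = rng.sample(pop, 2)
        return a if f(a) < f(b) else b

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        children = [min(pop, key=f)[:]]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Crossover: random convex blend of the two parents.
            w = rng.random()
            child = [w * u + (1 - w) * v for u, v in zip(p1, p2)]
            # Mutation: Gaussian perturbation with probability 0.2 per gene,
            # clamped to the normalized design range.
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.05)))
                     if rng.random() < 0.2 else g
                     for g in child]
            children.append(child)
        pop = children
    return min(pop, key=f)


best_design = genetic_minimize(surrogate, n_vars=2)
```

Because every fitness evaluation calls the cheap surrogate rather than a CFD run, the GA can afford many generations; the predicted optimum would still need to be confirmed by a full simulation, as in the studies above.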
1.3. Optimization Techniques
A GA can be divided into three operations: mutation, crossover, and selection. Tan et al. [
37] noted that GA parameters are chosen by rule of thumb. There is no standard way to select a proper strategy, parameter, or combination of them, which may lead to poor convergence and trapping in local optima. Additionally, these algorithms may need to be run a thousand times before reaching the global optimum. Samii [
38] compared GA with particle swarm optimization (PSO) and concluded that PSO is simpler to use and converges better than GA, since in GA the crossover effect is almost eliminated when individuals possess the same genetic material. For structural optimization, the Eagle strategy combined with PSO offers the advantages of both global search and intensive local search [
39]. The Eagle strategy uses the Lévy walk method to identify the search domain, then PSO is applied in the second stage. This efficient combination reduces the computation time and increases the probability of achieving global optima.
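The PSO stage can be sketched as follows. This is a minimal, plain PSO under assumed settings (inertia 0.7, acceleration coefficients 1.5, 20 particles); the Lévy-walk first stage of the Eagle strategy is not implemented, and the `sphere` test function is a hypothetical objective chosen only for illustration.

```python
import random


def pso_minimize(f, bounds, n_particles=20, iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # personal best positions
    pbest_val = [f(p) for p in pos]        # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm best

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Hypothetical smooth objective with its minimum at (1, 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

Unlike a GA, this update has no crossover or mutation operators; each particle is steered only by its own best position and the best position found by the swarm, which leaves few parameters to tune.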
Bello-Ochende et al. [
40] used the Constructal theory in combination with CFD to reduce maximum wall temperature; however, the originator of the Constructal theory considers it to be a universally valid theory and not an optimality statement [
41].
Entropy Generation Minimization (EGM) coupled with CFD is another methodology, based on first principles, applied in many investigations [
42,
43].
Some studies preferred to use the simplified conjugate gradient method [
44,
45], which uses sensitivity analysis to establish the conjugate direction. However, in this method, the probability of becoming trapped in a local optimum is quite high [
46].