Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures

Shams, Mudassir; Carpentieri, Bruno

doi:10.3390/fractalfract9090607

Open AccessArticle

Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures

by

Mudassir Shams

^1,2 and

Bruno Carpentieri

^2,*

¹

Department of Mathematics, Faculty of Arts and Science, Balikesir University, 10145 Balıkesir, Turkey

²

Faculty of Engineering, Free University of Bozen-Bolzano, 39100 Bolzano, Italy

^*

Author to whom correspondence should be addressed.

Fractal Fract. 2025, 9(9), 607; https://doi.org/10.3390/fractalfract9090607

Submission received: 19 August 2025 / Revised: 12 September 2025 / Accepted: 18 September 2025 / Published: 19 September 2025

(This article belongs to the Special Issue Recent Advances in the Spatial and Temporal Discretizations of Fractional PDEs, Second Edition)

Download

Browse Figures

Versions Notes

Abstract

We present a hybrid parallel scheme for efficiently solving Caputo time-fractional partial differential equations (CTFPDEs) with integer-order spatial derivatives on multicore CPU and GPU platforms. The approach combines a second-order spatial discretization with the

L 1

time-stepping scheme and employs MATLAB parfor parallelization to achieve significant reductions in runtime and memory usage. A theoretical third-order convergence rate is established under smooth-solution assumptions, and the analysis also accounts for the loss of accuracy near the initial time

t = t_{0}

caused by weak singularities inherent in time-fractional models. Unlike many existing approaches that rely on locally convergent strategies, the proposed method ensures global convergence even for distant or randomly chosen initial guesses. Benchmark problems from fractional biological models—including glucose–insulin regulation, tumor growth under chemotherapy, and drug diffusion in tissue—are used to validate the robustness and reliability of the scheme. Numerical experiments confirm near-linear speedup on up to four CPU cores and show that the method outperforms conventional techniques in terms of convergence rate, residual error, iteration count, and efficiency. These results demonstrate the method’s suitability for large-scale CTFPDE simulations in scientific and engineering applications.

Keywords:

Caputo time-fractional PDEs; parallel computing; multicore architecture; biomedical fractional models; computational efficiency

1. Introduction

The past few decades have witnessed rapid progress in computational biology, personalized medicine, and medical diagnostics, driven in part by advances in the modeling and simulation of biological processes [1,2,3]. Classical approaches, typically formulated through integer-order differential equations, have played a central role in describing many physiological phenomena. Yet, such models often fall short when applied to complex biological systems characterized by long-range spatial interactions, memory effects, anomalous transport, and nonlocal behavior—features that classical partial differential equations (PDEs) do not adequately capture. To overcome these limitations, Caputo time-fractional-order partial differential equations (CTFPDEs) have emerged as a flexible and powerful framework for the simulation of real-world problems in biomedical engineering [4,5].

CTFPDEs, defined by derivatives of non-integer order in time or space, provide an enriched modeling framework capable of capturing hereditary and memory-dependent properties inherent in many biological tissues and systems [6]. They naturally generalize classical integer-order models, allowing for a more faithful interpretation of experimental data in contexts such as anomalous drug diffusion [7], electrical signal propagation in cardiac tissue [8], viscoelastic behavior of soft biological materials [9], and neural signal transmission [10]. Despite these advantages, CTFPDEs also give rise to significant analytical and computational challenges, particularly in the presence of nonlinear dynamics, multi-scale geometries, or patient-specific parameters. A representative form of such equations is

\{\begin{matrix} \frac{\partial^{σ} u}{\partial t^{σ}} + \frac{\partial^{2} u}{\partial x^{2}} + \frac{\partial u}{\partial x} + u (x, t) = f (x, t), & x \in [x^{[0]}, x^{[n]}], t \in [t^{[0]}, t^{[n]}], \\ u (x, 0) = g_{1} (x), \\ u (0, t) = g_{2} (t), u (L, t) = g_{3} (t) . \end{matrix}

(1)

where

σ \in (0, 1]

denotes the Caputo fractional order of the differential operator.

More generally, let

Σ = \sum_{i = 1}^{l} σ_{i}, σ_{i} \in (m_{i} - 1, m_{i}), m_{i} \in Z^{+}, i = 1, \dots, l .

The Caputo fractional derivative of order

Σ

with respect to the variables

(x_{1}, \dots, x_{l})

is defined as [11]:

{}^{C}D_{γ, x_{1}^{σ_{1}}, \cdot\cdot\cdot, x_{l}^{σ_{l}}}^{Σ} f (t_{1}, \dots, t_{l}) = \int_{γ}^{x_{l}} \dots \int_{γ}^{x_{1}} \frac{\prod_{i = 1}^{l} {(t_{i} - ξ_{i})}^{m_{i} - σ_{i} - 1} \frac{\partial^{m_{1} + \cdot\cdot\cdot + m_{l}} f (ξ_{1}, \dots, ξ_{l})}{\partial ξ_{1}^{m_{1}} \cdot\cdot\cdot \partial ξ_{l}^{m_{l}}}}{\prod_{i = 1}^{l} Γ (m_{i} - σ_{i})} d ξ_{1} \cdot\cdot\cdot d ξ_{l},

(2)

where

σ_{i}

denotes the fractional order associated with

x_{i}

,

m_{i} = ⌈ σ_{i} ⌉

, and

Γ (\cdot)

is the Gamma function.

As a special case with

l = 1

, we obtain the Caputo time-fractional derivative. For a function

u (x, t)

, the Caputo fractional derivative of order

σ

with respect to time t is defined as

{}^{C}D_{t}^{σ} u (x, t) = \frac{1}{Γ (m - σ)} \int_{0}^{t} {(t - τ)}^{m - σ - 1} \frac{\partial^{m} u}{\partial τ^{m}} (x, τ) d τ, m - 1 < σ < m, m = ⌈ σ ⌉ .

(3)

In particular, for

0 < σ < 1

, we recover the commonly used form

{}^{C}D_{t}^{σ} u (x, t) = \frac{1}{Γ (1 - σ)} \int_{0}^{t} {(t - τ)}^{- σ} \frac{\partial u}{\partial τ} (x, τ) d τ .

(4)

Equations (1)–(4) illustrate the mathematical structure of CTFPDEs frequently encountered in biomedical modeling. Due to their complexity, such equations generally do not admit closed-form solutions. In addition, biological systems often give rise to highly nonlinear, stiff, and spatially inhomogeneous CTFPDEs [12]. The presence of memory terms further increases the computational cost, rendering traditional methods impractical for realistic simulations—particularly in high-dimensional or real-time settings. This work is motivated by the need to bridge this gap, addressing the lack of scalable, parallel, and memory-efficient numerical solvers for CTFPDEs in biomedical imaging [13,14].

A wide range of methods has been proposed to tackle these challenges, spanning exact and semi-analytical approaches to fully numerical techniques. Analytical methods such as Laplace and Fourier transforms [15], Mittag-Leffler representations [16], and Green’s functions [17] offer valuable theoretical insights, but their applicability is largely confined to a limited class of linear, time-invariant CTFPDEs with idealized boundary conditions. These approaches become inadequate when applied to realistic biomedical models that involve nonlinear reaction–diffusion systems, irregular domains, or patient-specific heterogeneities [18,19]. To address these limitations, various analytical approximation techniques have been proposed, including the Adomian decomposition method [20], homotopy perturbation method [21], variational iteration method [22], and the fractional reduced differential transform method [23]. These methods are commonly used for solving nonlinear CTFPDEs and can provide closed-form-like series solutions. However, they often suffer from convergence issues, slow series truncation, and limited applicability—typically being effective only for weakly nonlinear problems.

In biological systems, nonlinearity often arises from feedback loops, saturation effects, or bifurcation dynamics, rendering analytical methods either divergent or symbolically intractable. As a result, research efforts have increasingly focused on numerical approaches, including the finite difference method [24], finite element method [25], spectral methods [26], convolution quadrature [27], and fractional Runge–Kutta methods [28]. These techniques offer broader applicability and have been employed in modeling cardiac conduction, tumor growth, neural activity, and drug delivery.

Nevertheless, numerical schemes for CTFPDEs are substantially more challenging than those for integer-order equations. The nonlocal nature of fractional derivatives necessitates storing the entire solution history, leading to significant memory consumption and computational overhead, particularly in long-time or high-dimensional simulations. Additionally, such schemes are subject to stability constraints, discretization errors, and complications in the implementation of fractional boundary conditions—especially in complex biomedical geometries [29,30]. Despite sustained research efforts, many of these challenges remain unresolved.

The limited scalability, high memory requirements, and inadequate real-time performance of existing methods provide the primary motivation for developing novel high-performance computational techniques. To overcome these challenges, we adopt parallel computational methodologies specifically designed for CTFPDEs. In contrast to conventional schemes that rely on sequential time-stepping or global matrix assembly, parallel methods leverage modern hardware architectures—such as multi-core processors and GPUs—to evolve solution components concurrently in space and time. These approaches are particularly well suited to CTFPDEs, as they

Alleviate memory bottlenecks by distributing historical data across processors;
Accelerate convergence through the concurrent evaluation of memory integrals and local operators;
Enable adaptive domain decomposition for efficient modeling of complex anatomical geometries;
Support real-time simulations for applications such as drug-response prediction, ECG signal reconstruction, and patient-specific medical imaging.

This study introduces a new class of parallel iterative methods for nonlinear fractional partial differential equations (FPDEs), specifically designed to address the stiffness, degeneracy and memory-dependent dynamics commonly encountered in biomedical applications. The proposed methodology is based on the following key components:

A generalized parallel fractional framework capable of handling Caputo-type time derivatives and nonlinear spatial operators;
A domain-decomposed architecture that enables concurrent computation of local solution components, thereby reducing overall simulation time;
A symbolic–numeric hybrid strategy, in which the nonlinear systems arising from CTFPDE discretizations are solved using Newton-type methods;
Comprehensive evaluation metrics, including convergence rate, CPU usage, residual error dynamics, memory efficiency, and biological interpretability.

The efficiency and applicability of the proposed schemes are demonstrated through three biomedical case studies:

A fractional cardiac conduction system with nonlinear reaction–diffusion terms;
A dynamical model of depression incorporating feedback mechanisms and long-term memory;
A sub-diffusion drug delivery model in layered biological tissues.

Overall, this study offers the following contributions in comparison to existing work:

Problem Scope: We address CTFPDEs derived from biomedical models that exhibit spatial heterogeneity and memory effects, extending beyond the idealized benchmark problems commonly considered in the literature.
Parallelization Strategy: The proposed schemes implement parfor-based parallelization in MATLAB, enabling efficient utilization of multi-core processors and yielding measurable reductions in computational time.
Benchmarking and Validation: Comprehensive tests are conducted, encompassing comparisons with analytical solutions, performance benchmarks, and application-driven case studies, in order to confirm the accuracy and robustness of the proposed approach.

The remainder of this paper is organized as follows. Section 2 introduces the class of CTFPDEs under consideration and outlines the proposed parallel scheme, including its construction and preliminary analysis. Section 3 presents the theoretical foundations of the method, covering discretization procedures, memory management, and convergence results. It also details the computational implementation, including parallel matrix assembly, solver design, and performance evaluation. Benchmarking against standard methods is provided, along with three biomedical case studies that demonstrate the effectiveness of the proposed approach. For each case study, we discuss the biological background, the formulation of the fractional model, and the simulation results, including error analysis, stability assessment, and physiological validation. Finally, Section 4 concludes the paper by summarizing the main findings and highlighting the translational potential of the method for clinical applications.

2. Construction and Analysis of the Next-Generation Computational Schemes

Building on the contributions outlined above, we now turn to the construction and analysis of the proposed computational framework. The numerical solution of CTFPDEs provides a practical approach to modeling complex phenomena that are analytically intractable, particularly in biomedical applications. Such methods are essential for capturing memory effects, nonlocal interactions, and anomalous diffusion that characterize many physiological processes.

Spatial and temporal discretization can be performed using finite difference, finite element, spectral, or convolution quadrature methods, with suitable adaptations for handling nonlinearities, heterogeneous coefficients, and irregular boundary conditions. These procedures typically lead to a system of nonlinear equations of the form

\begin{matrix} f_{1} (x_{1}, \cdot\cdot\cdot, x_{n}) = 0, \\ f_{2} (x_{1}, \cdot\cdot\cdot, x_{n}) = 0, \\ ⋮ \\ f_{n} (x_{1}, \cdot\cdot\cdot, x_{n}) = 0, \end{matrix}

(5)

where each function

f_{i}

maps a vector

x = {(x_{1}, x_{2}, \dots, x_{n})}^{t} \in R^{n}

to

R

. Defining

F (x) = {(f_{1} (x), \cdot\cdot\cdot, f_{n} (x))}^{t}

(6)

system (5) can be compactly written as

F (x) = 0

.

Numerical techniques for solving CTFPDEs via the nonlinear system formulation in (6) are often classified as local methods, most notably the classical Newton method [31] and its higher-order extensions [32,33,34]. These methods offer high accuracy and fast local convergence in the neighborhood of a solution, but they are highly sensitive to the choice of initial guess and generally lack guarantees of global convergence. Representative higher-order schemes include:

(i) The two-step, third-order method of Noor et al. [35]:

x^{[ℏ + 1]} = x^{[ℏ]} - \frac{F (x^{[ℏ]})}{F^{'} (x^{[ℏ]}) + 3 F^{'} (\frac{x^{[ℏ]} + 2 y^{[ℏ]}}{3})},

(7)

where

y^{[ℏ]} = x^{[ℏ]} - \frac{F (x^{[ℏ]})}{F^{'} (x^{[ℏ]})},

(8)

(ii) The third-order method of Dehgan [36]:

x^{[ℏ + 1]} = x^{[ℏ]} - \frac{\frac{1}{2} F (2 x^{[ℏ]} - y^{[ℏ]}) - \frac{3}{2} F (y^{[ℏ]})}{F^{'} (y^{[ℏ]})},

(9)

(iii) The super-cubic method of Darvishi et al. [37]:

x^{[ℏ + 1]} = x^{[ℏ]} - \frac{F (x^{[ℏ]})}{2} (\frac{1}{F^{'} (x^{[ℏ]})} + \frac{1}{F^{'} (y^{[ℏ]})}),

(10)

(iv) The third-order method of Sharma et al. [38]:

x^{[ℏ + 1]} = x^{[ℏ]} - \frac{1}{2} (- I + \frac{9}{4} \frac{F^{'} (x^{[ℏ]})}{F^{'} (y^{[ℏ]})} + \frac{3}{4} \frac{F^{'} (z^{[ℏ]})}{F^{'} (x^{[ℏ]})}) \frac{F (x^{[ℏ]})}{F^{'} (x^{[ℏ]})},

(11)

where

z^{[ℏ]} = x^{[ℏ]} - \frac{2}{3} \frac{F (x^{[ℏ]})}{F^{'} (x^{[ℏ]})};

(v) The fourth-order method of Cordero et al. [39]:

x^{[ℏ + 1]} = y^{[ℏ]} - (2 - \frac{F^{'} (y^{[ℏ]})}{F^{'} (x^{[ℏ]})}) \frac{F (x^{[ℏ]})}{F^{'} (x^{[ℏ]})} .

(12)

Several other single- and multi-step iterative methods for solving (6) have been proposed; see, for example, refs. [40,41,42,43] and the references therein. These conventional solvers, however, remain highly sensitive to the choice of initial guess and may fail to converge for strongly nonlinear or ill-conditioned systems. Their convergence is typically local and can be affected by parameter perturbations. For large-scale problems, repeated evaluations of the function and Jacobian lead to substantial computational overhead. Furthermore, their inherently sequential structure limits scalability, reducing their effectiveness for high-dimensional or time-dependent problems encountered in practice.

To address these limitations, we turn to parallel techniques that distribute computational tasks across multiple processors to achieve faster runtimes. Such approaches also enhance resilience and scalability, making them well suited for solving large nonlinear systems. One example is a generalized version of the classical Weierstrass–Durand–Kerner (WDK) method [44], which can be written as

x_{i}^{[ℏ + 1]} = x_{i}^{[ℏ]} - \frac{F (x_{j}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]})},

(13)

and which converges quadratically to the exact solution of (6). Here,

G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]})

computes the WDK updates to find all solutions simultaneously, with

x \in C^{n}

and

F : C^{n} \to C^{n}

We refer to method (13) as WDM^[C₂].

A generalized version of the Abreth–Ehrlich method [45] for solving (6), denoted ELM^[C₃], is given by

x_{i}^{[ℏ + 1]} = x_{i}^{[ℏ]} - \frac{F (x_{i}^{[ℏ]})}{(\frac{J_{F} (α_{i})}{F (x_{i}^{[ℏ]})} - \sum_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{d_{i}} (\frac{1}{x_{i}^{[ℏ]} - x_{t}^{[ℏ]}})) F (x_{i}^{[ℏ]})},

(14)

where

\frac{J_{F} (α_{i})}{F (x_{i}^{[ℏ]})} = \frac{J_{F} (α_{i}) F^{T} (x_{i}^{[ℏ]})}{F^{T} (x_{i}^{[ℏ]}) F (x_{i}^{[ℏ]})},

and

J_{F} (α_{i}) = [\begin{matrix} \frac{\partial f_{1}}{\partial x_{1}} & \frac{\partial f_{1}}{\partial x_{2}} & \dots & \frac{\partial f_{1}}{\partial x_{n}} \\ \frac{\partial f_{2}}{\partial x_{1}} & \frac{\partial f_{2}}{\partial x_{2}} & \dots & \frac{\partial f_{2}}{\partial x_{n}} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ \frac{\partial f_{n}}{\partial x_{1}} & \frac{\partial f_{n}}{\partial x_{2}} & \dots & \frac{\partial f_{n}}{\partial x_{n}} \end{matrix}] .

Cordero et al. [46] proposed a parallel scheme with convergence order

2 p

(for

p = 1

), expressed as

x_{i}^{[ℏ + 1]} = x_{i}^{[ℏ]} - \frac{F (x_{i}^{[ℏ]})}{F [x_{i}^{[ℏ]}, x_{i}^{[ℏ]} + β F (x_{i}^{[ℏ]})] - F (x_{i}^{[ℏ]}) \sum_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{d_{i}} (\frac{1}{x_{i}^{[ℏ]} - x_{t}^{[ℏ]}})},

(15)

where

x_{i}^{[ℏ]} \in R

We refer to method (15) as ACM^[C₂].

Building on these earlier methods, the present work seeks to improve efficiency, stability, and scalability in solving nonlinear systems arising from CTFPDEs. The proposed scheme is examined through memory complexity estimates, stability analysis, and convergence theorems, thereby ensuring both theoretical rigor and practical feasibility.

2.1. Construction of the Scheme

Motivated by the methods discussed above, the primary goal of this study is to develop more efficient variants of single-step and two-step parallel techniques. The proposed single-step scheme is defined as

x_{i}^{[ℏ + 1]} = x_{i}^{[ℏ]} - \frac{F (x_{j}^{[ℏ]})}{\underset{\overset{t = 1}{t \neq i}}{\overset{n}{Π}} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})},

(16)

where

z_{t}^{[ℏ]} = x_{t}^{[ℏ]} - \frac{F (x_{j}^{[ℏ]})}{F^{'} (x_{j}^{[ℏ]})} .

We refer to method (16) as

{CMM}_{1}^{[C_{3}]}

.

Building on this formulation, we construct a two-step parallel method given by

x_{i}^{[ℏ + 1]} = y_{i}^{[ℏ]} - (2 I - \frac{\underset{\overset{t = 1}{t \neq i}}{\overset{n}{Π}} G (y_{i}^{[ℏ]}, y_{t}^{[ℏ]})}{\underset{\overset{t = 1}{t \neq i}}{\overset{n}{Π}} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})}) \frac{F (x_{i}^{[ℏ]})}{\underset{\overset{t = 1}{t \neq i}}{\overset{n}{Π}} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})},

(17)

where

y_{i}^{[ℏ]} = x_{i}^{[ℏ]} - \frac{F (x_{j}^{[ℏ]})}{\underset{\overset{t = 1}{t \neq i}}{\overset{n}{Π}} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})} .

We refer to method (17) as

{CMM}_{2}^{[C_{3}]}

.

2.2. Theoretical Convergence Analysis

The convergence of parallel iterative algorithms is typically established through local convergence analysis, which guarantees convergence to the exact solution of (6) provided that the initial guess is sufficiently close. Such analysis not only determines the order of convergence but also offers valuable guidance for designing stable and efficient algorithms. In high-performance computing contexts, local convergence analysis plays a key role in preventing divergence, enhancing robustness, and ensuring consistent performance across processors, particularly when synchronization is required for large-scale nonlinear problems.

Theorem 1.

Let

α = (α_{1}, \dots, α_{n})

denote the solutions of nonlinear system (5). If the initial approximations

x_{1}^{[0]}, \dots, x_{n}^{[0]}

are sufficiently close to and distinct from the exact solutions, then method

{WDM}^{[C_{2}]}

converges with order 2.

Proof.

Define the errors

e_{i}^{[ℏ]} = x_{i}^{[ℏ]} - α_{i}, e_{i}^{[ℏ + 1]} = x_{i}^{[ℏ + 1]} - α_{i} .

Then,

x_{i}^{[ℏ + 1]} - α_{i} = x_{i}^{[ℏ]} - α_{i} - \frac{F (x_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]})},

(18)

which implies

e_{i}^{[ℏ + 1]} = e_{i}^{[ℏ]} - \frac{F (x_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]})} .

(19)

A Taylor expansion gives

F (x_{j}^{[ℏ]}) = J_{F} (α_{i}) e_{i}^{[ℏ]} + O (∥ e_{i}^{[ℏ]} ∥^{2}),

(20)

where

J_{F} (α_{i})

denotes the Jacobian of

F

at

α_{i}

, assumed to be nonsingular. Similarly,

\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]}) = D_{i} + O (max_{i} {∥ e_{i}^{[ℏ]} ∥}^{2}),

(21)

with

D_{i} \neq 0

. Substituting into (19) yields

e_{i}^{[ℏ + 1]} = e_{i}^{[ℏ]} - \frac{J_{F} (α_{i}) e_{i}^{[ℏ]} + O (∥ e_{i}^{[ℏ]} ∥^{2})}{D_{i} + O ({max}_{t} ∥ e_{t}^{[ℏ]} ∥)} + higher-order terms .

(22)

Assuming

∥ e_{i}^{[ℏ]} ∥ \approx ∥ e_{t}^{[ℏ]} ∥

, we obtain

\begin{matrix} e_{i}^{[ℏ + 1]} & = e_{i}^{[ℏ]} O (max_{t} ∥ e_{t}^{[ℏ]} ∥), \end{matrix}

(23)

\begin{matrix} = O (max_{t} {∥ e_{t}^{[ℏ]} ∥}^{2}) . \end{matrix}

(24)

Hence, the method converges quadratically. □

Theorem 2.

Let

α_{1}, \dots, α_{n}

denote the solutions of nonlinear system (5). If the initial approximations

x_{1}^{[0]}, \dots, x_{n}^{[0]}

are sufficiently close to and distinct from the exact solutions, then method

{CMM}_{1}^{[C_{3}]}

converges with order 3.

Proof.

Define the errors

e_{i}^{[ℏ]} = x_{i}^{[ℏ]} - α_{i}, e_{i}^{[ℏ + 1]} = x_{i}^{[ℏ + 1]} - α_{i} .

Then,

x_{i}^{[ℏ + 1]} - α_{i} = x_{i}^{[ℏ]} - α_{i} - \frac{F (x_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})},

(25)

which implies

e_{i}^{[ℏ + 1]} = e_{i}^{[ℏ]} - \frac{F (x_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})} .

A Taylor expansion yields

F (x_{j}^{[ℏ]}) = J_{F} (α_{i}) e_{i}^{[ℏ]} + R_{i} (e_{i}^{[ℏ]}),

(26)

where

$J_{F} (α_{i})$ is the Jacobian of $F$ at $α_{i}$ , assumed nonsingular;
$R_{i} (e_{i}^{[ℏ]})$ collects higher-order terms, with $R_{i} (e_{i}^{[ℏ]}) = O (∥ e_{i}^{[ℏ]} ∥^{2})$ .

Expanding

\prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})

around the solution and assuming it is nonzero and continuous near

ζ_{i}

, we obtain

\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]}) = D_{i}^{[ℏ]} + S_{j} (e_{i}^{[ℏ]}),

(27)

where

D_{i}^{[ℏ]} \neq 0

, and

S_{j} (e_{i}^{[ℏ]}) = O (max_{t} {∥ e_{t}^{[ℏ]} ∥}^{2}) .

(28)

Hence,

\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]}) = D_{i} + O (max_{t} {∥ e_{t}^{[ℏ]} ∥}^{2}) .

(29)

Since

J_{F} (α_{i})

is invertible and bounded, and

1 / D_{i}

is also bounded, it follows that

e_{i}^{[ℏ + 1]} = e_{i}^{[ℏ]} - \frac{J_{F} (α_{i}) e_{i}^{[ℏ]} + R_{i} (e_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})} + higher-order terms,

(30)

or equivalently,

e_{i}^{[ℏ + 1]} = e_{i}^{[ℏ]} - \frac{J_{F} (α_{i}) e_{i}^{[ℏ]} + R_{i} (e_{i}^{[ℏ]})}{D_{i}^{[ℏ]} + S_{t} (e_{t}^{[ℏ]})} + higher-order terms .

(31)

Assuming

∥ e_{i}^{[ℏ]} ∥ = ∥ e_{t}^{[ℏ]} ∥ = ∥ e^{[ℏ]} ∥

, we obtain

\begin{matrix} e_{i}^{[ℏ + 1]} & = e_{i}^{[ℏ]} O (max_{t} {∥ e_{t}^{[ℏ]} ∥}^{2}), \end{matrix}

(32)

\begin{matrix} = O (∥ e^{[ℏ]} ∥^{3}) . \end{matrix}

(33)

Therefore, method

{CMM}_{1}^{[C_{3}]}

converges cubically. □

Theorem 3.

Let

α_{1}, \dots, α_{σ}

be simple solutions of (5). If the initial approximations

x_{1}^{[0]}, \dots, x_{n}^{[0]}

are sufficiently close to and distinct from these roots, then method

{CMM}_{2}^{[C_{3}]}

converges with order 3.

Proof.

Define the errors

e_{i x}^{[ℏ]} = x_{i}^{[ℏ]} - α_{i}, e_{i y}^{[ℏ]} = y_{i}^{[ℏ]} - α_{i}, e_{i}^{[ℏ + 1]} = x_{i}^{[ℏ + 1]} - α_{i} .

Then,

y_{i}^{[j]} - α_{i} = x_{i}^{[j]} - α_{i} - \frac{F (x_{i}^{[j]})}{\underset{\overset{t = 1}{t \neq i}}{\overset{n}{Π}} G (x_{i}^{[j]}, z_{t}^{[j]})},

(34)

and a Taylor expansion gives

F (x_{j}^{[ℏ]}) = J_{F} (α_{i}) e_{i}^{[ℏ]} + R_{i} (e_{i}^{[ℏ]}),

(35)

where

$J_{F} (α_{i})$ is the Jacobian of $F$ evaluated at $α_{i}$ , assumed nonsingular;
$R_{i} (e_{i}^{[ℏ]})$ collects the higher-order terms, with $R_{i} (e_{i}^{[ℏ]}) = O (∥ e_{i}^{[ℏ]} ∥^{2})$ .

Expanding

\prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})

around the solution and assuming it is nonzero and continuous near

ζ_{i}

, we obtain

\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]}) = D_{i}^{[ℏ]} + S_{t} (e_{t}^{[ℏ]}),

(36)

where

D_{i}^{[ℏ]} \neq 0

, and

S_{j} (e_{i x}^{[ℏ]}) = O (max_{t} {∥ e_{t x}^{[ℏ]} ∥}^{2}) .

(37)

Hence,

\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]}) = D_{i} + O (max_{t} {∥ e_{t x}^{[ℏ]} ∥}^{2}) .

(38)

Since

J_{F} (α_{i})

is invertible and bounded, and

1 / D_{i}

is also bounded, it follows that

e_{i y}^{[ℏ]} = e_{i x}^{[ℏ]} - \frac{J_{F} (α_{i}) e_{i}^{[ℏ]} + R_{i} (e_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})} + higher-order terms .

(39)

e_{i y}^{[ℏ + 1]} = e_{i x}^{[ℏ]} - \frac{J_{F} (α_{i}) e_{i x}^{[ℏ]} + R_{i} (e_{i x}^{[ℏ]})}{D_{i}^{[ℏ]} + S_{t} (e_{t x}^{[ℏ]})} + higher-order terms .

(40)

Thus,

e_{i}^{[ℏ + 1]} = e_{i}^{[ℏ]} O (max_{t} {∥ e_{t}^{[ℏ]} ∥}^{2}) .

(41)

Assuming

∥ e_{i x}^{[ℏ]} ∥ = ∥ e_{t x}^{[ℏ]} ∥ = ∥ e^{[ℏ]} ∥

, we deduce

e_{i y}^{[ℏ]} = O (∥ e^{[ℏ]} ∥^{3}) .

(42)

Now, considering the second sub-step of scheme

{CMM}_{2}^{[C_{3}]}

, we obtain

x_{i}^{[ℏ + 1]} - α_{i} = y_{i}^{[ℏ]} - α_{i} - (2 I - \frac{\prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} G (y_{i}^{[ℏ]}, y_{t}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})}) \frac{F (x_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]})},

(43)

which yields

e_{i}^{[ℏ + 1]} = e_{i y}^{[ℏ]} - (2 I - \frac{\prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} G (y_{i}^{[ℏ]}, y_{t}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})}) \frac{F (x_{i}^{[ℏ]})}{\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} G (x_{i}^{[ℏ]}, x_{t}^{[ℏ]})} .

(44)

Since

\prod_{\begin{matrix} t = 1 t \neq i \end{matrix}}^{n} (\frac{G (y_{i}^{[ℏ]}, y_{t}^{[ℏ]})}{G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})}) \approx I,

we obtain

e_{i}^{[ℏ + 1]} = e_{i y}^{[ℏ]} - (2 I - \prod_{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}^{n} (\frac{G (y_{i}^{[ℏ]}, y_{t}^{[ℏ]})}{G (x_{i}^{[ℏ]}, z_{t}^{[ℏ]})})) \frac{J_{F} (ξ_{i}) e_{i}^{[ℏ]} + R_{i} (e_{i}^{[ℏ]})}{D_{i}^{[ℏ]} + S_{t} (e_{t}^{[ℏ]})} + higher-order terms,

(45)

and therefore,

e_{i}^{[ℏ + 1]} = e_{i y}^{[ℏ]} O (max_{t} {∥ e_{t x}^{[ℏ]} ∥}^{2}) .

(46)

Assuming

∥ e_{i x}^{[ℏ]} ∥ = ∥ e_{t y}^{[ℏ]} ∥ = ∥ e^{[ℏ]} ∥

, it follows that

e_{i}^{[ℏ + 1]} = O (∥ e^{[ℏ]} ∥^{3}) .

(47)

Hence, method

{CMM}_{2}^{[C_{3}]}

converges cubically. □

3. Computational Efficiency and Numerical Outcomes

Parallel methods are particularly valued in computational mathematics for their ability to approximate all solutions of (5) simultaneously. In contrast to classical root-finding techniques, which compute one solution at a time, parallel updating schemes improve the approximations of all solutions concurrently. This inherent parallelism reduces the overall computational time, which is especially advantageous for high-degree systems arising in engineering and biomedical applications.

The proposed method demonstrates global convergence in the neighborhood of the solutions and achieves rapid convergence to high-accuracy results when provided with suitable initial guesses. Its iterative formula is simple and does not require explicit Jacobian evaluations or matrix factorizations, both of which are computationally expensive for large nonlinear systems. Furthermore, the method can be implemented in either element-wise or diagonalized form, allowing additional optimization depending on the problem structure and the available computational resources. This combination of fast convergence, parallelizability, and low per-iteration cost makes parallel schemes especially well suited for solving large-scale nonlinear systems arising from fractional PDE discretizations in biomedical engineering, where robustness and efficiency are critical.

Computational efficiency: Parallel approaches enhance computational efficiency in biomedical FPDE applications by updating distinct solutions simultaneously, thereby reducing execution time while preserving accuracy and convergence stability in complex nonlinear models. The computational cost of parallel schemes depends on both the number of iterations and the dimension of (5). It can be quantified through the percentage computational efficiency, defined as [47]:

℘ [ς_{i}, ς_{j}] = [\frac{E (ς_{i})}{E (ς_{j})} - 1] \times 100,

(48)

where

E (ς_{i}) = \frac{log r}{W_{a s} A S + W_{m} M + W_{d} D},

(49)

and

W_{a s}

,

W_{m}

, and

W_{d}

denote the weights associated with addition/subtraction, multiplication, and division operations, respectively.

Figure 1 and Table 1 demonstrate that the proposed method outperforms existing approaches in terms of computational efficiency and the number of arithmetic operations required to reach a prescribed tolerance. Here,

φ^{[*]} = n^{2} + O (n)

, with all schemes requiring the same number of divisions per iteration.

Fractal analysis: Fractal analysis is employed to identify favorable regions in the complex plane for initial guesses, thereby enabling faster and more reliable convergence of iterative schemes. Fractal patterns were generated for a system of

2 \times 2

nonlinear equations by considering a mesh of

2000 \times 2000

points over the complex domain

[- 2, 2] \times [- 2, 2]

. Each grid point corresponds to a distinct initial guess, and the assigned color indicates the root to which the iterative process converges. The resulting fractal boundaries highlight the method’s sensitivity near basin edges, thus providing insight into its global convergence behavior. The dense and well-structured basins of attraction confirm the robustness and stability of the technique. Moreover, the observed regularity of the patterns suggests limited chaotic behavior, further supporting the efficiency and reliability of the proposed scheme. To illustrate fractal analysis, we considered the nonlinear system

\{\begin{matrix} x_{1}^{2} + x_{2}^{2} - e^{x_{1}} = 0, \\ x_{1} x_{2} - e^{x_{2}} + 1 = 0, \end{matrix}

(50)

which has the approximate solutions

(x_{1}, x_{2}) \approx (- 0.7035, 0.0)

and

(1.5407, 1.3119)

.

The dynamical results in Table 2 and Figure 2a–e clearly demonstrate that the proposed method outperforms existing schemes in terms of convergence efficiency for solving CTFPDEs. It requires fewer iterations, uses less memory, and consequently reduces the overall computational cost. Moreover, Table 2 shows that the number of basic arithmetic operations, computed using Equation (48), is significantly smaller compared with competing approaches (

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

,

{ELM}^{[C_{3}]}

). The elapsed time required to generate the corresponding fractals is also markedly lower, confirming the speed advantage of our scheme. These results indicate that the method not only accelerates convergence but also preserves numerical stability across all test cases. In contrast, earlier approaches produce less uniform basins with broader chaotic regions, whereas our scheme yields well-defined convergence zones and improved robustness against divergence in challenging scenarios. Consequently, the newly developed methods

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

provide a reliable and effective framework for accurately solving (50).

3.1. Implementation of Methodology, Convergence Enhancement, and Result Visualization

This section introduces the numerical framework for solving fractional-order partial differential equations (PDEs) arising in biomedical engineering applications. The approach comprises three main stages: discretization of the fractional PDEs using the L1 scheme, solution of the resulting nonlinear system via a parallel root-finding algorithm, and convergence enhancement through initial vector sampling combined with adaptive stopping criteria. All simulations were performed in MATLAB R2023b (The MathWorks, Natick, MA, USA) on a PC equipped with an Intel Core i7 processor (Intel Corporation, Santa Clara, CA, USA) and 16 GB RAM. Convergence was defined as the residual norm falling below tol. The following performance metrics were recorded:

-: Iteration count;
-: Percentage convergence (P-Con);
-: Computational time (CPU seconds);
-: Memory usage (MB);
-: Percentage convergence under random initial values.

Discretization of CTFPDEs Using the L1 Scheme: Consider a CTFPDE defined on the spatial domain

[0, L]

and time interval

[0, T]

, governed by a Caputo derivative of order

α \in (0, 1]

:

\{\begin{matrix} \frac{\partial^{σ} u}{\partial t^{σ}} + \frac{\partial^{2} u}{\partial x^{2}} + \frac{\partial u}{\partial x} + u (x, t) = f (x, t), x \in [x^{[0]}, x^{[n]}], t \in [t^{[0]}, t^{[n]}], \\ u (x, 0) = g_{1} (x), u (0, t) = g_{2} (t), u (L, t) = g_{3} (t) . \end{matrix}

(51)

Here,

\frac{\partial^{σ} u}{\partial t^{σ}}

denotes the Caputo fractional derivative in time,

\frac{\partial^{2} u}{\partial x^{2}}

represents diffusion, and

f (x, t)

is a nonlinear source term. At discrete times

t_{n} = n τ

, the Caputo derivative is approximated using the L1 scheme [48]:

\frac{\partial^{σ} u}{\partial t^{σ}} \approx \frac{1}{τ^{σ} Γ (2 - σ)} [b_{0} u_{i}^{n} - \sum_{k = 1}^{n - 1} (b_{n - k - 1} - b_{n - k}) u_{i}^{k} - b_{n - 1} u_{i}^{0}],

(52)

where

b_{k} = {(k + 1)}^{1 - σ} - k^{1 - σ}

. For compactness, this can be rewritten as

\frac{\partial^{α} u}{\partial t^{α}} \approx \frac{1}{τ^{σ} Γ (2 - σ)} \sum_{k = 0}^{n} w_{k}^{(n)} u_{i}^{k},

(53)

with

w_{0}^{(n)} = b_{0}

,

w_{k}^{(n)} = b_{n - k - 1} - b_{n - k}

for

1 \leq k < n

, and

w_{n}^{(n)} = - b_{n - 1}

. The spatial derivatives are approximated using standard central differences:

\frac{\partial^{2} u}{\partial x^{2}} \approx \frac{u_{i + 1}^{n} - 2 u_{i}^{n} + u_{i - 1}^{n}}{h^{2}}, \frac{\partial u}{\partial x} \approx \frac{u_{i + 1}^{n} - u_{i - 1}^{n}}{2 h} .

(54)

Combining the temporal and spatial discretizations reduces the CTFPDE to a system of nonlinear algebraic equations at each time step:

F (u) = 0,

(55)

where

u \in R^{n}

denotes the vector of unknowns at time

t_{n}

.

Implementation: To solve

F (u) = 0

, we employ a parallel iterative scheme designed to approximate all roots of the nonlinear system simultaneously. At iteration ℏ, the vector of approximations is

u^{[ℏ]} = {[u_{1}^{[ℏ]}, \dots, u_{n}^{[ℏ]}]}^{t} .

(56)

The scheme was implemented in two forms:

-: Criteria I: Element-wise scheme. Each solution component is updated independently in parallel:

$u_{i}^{[ℏ + 1]} = u_{i}^{[ℏ]} - \frac{F (u_{j}^{[ℏ]})}{\underset{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}{\overset{n}{Π}} (u_{i}^{[ℏ]}, u_{t}^{[ℏ]})} .$

(57)
-: Criteria II: Diagonalized scheme. The system is reformulated in matrix form and updated via

$u_{i}^{[ℏ + 1]} = u_{i}^{[ℏ]} - \frac{F (u_{j}^{[ℏ]})}{\underset{\begin{matrix} t = 1 \\ t \neq i \end{matrix}}{\overset{n}{Π}} G (u_{i}^{[ℏ]}, u_{t}^{[ℏ]})},$

(58)

where $G$ is a diagonal matrix that approximates a suitable operator to accelerate convergence.
-: Parallel implementation with MATLAB parfor. Both element-wise and diagonalized schemes were parallelized using MATLAB’s parfor construct, which distributes independent computations across multiple CPU cores. The main iteration loop was executed in parallel while ensuring data consistency and avoiding race conditions. Unlike OpenMP, MATLAB’s parfor replicates loop variables for each worker rather than automatically sharing memory, which may affect large-scale memory usage.
This parallelization reduces computational time while preserving the accuracy of the serial version. To quantify performance, we measured the serial CPU time (T_seri), parallel CPU time (T_para), and the speedup ratio, defined as

$Speedup Ratio = ϕ_{s p} = \frac{T_{seri}}{T_{para}},$

(59)

where T_seri corresponds to execution on a single core without MATLAB parfor, and T_para to execution on four cores with MATLAB parfor. Algorithm 1 and the flow chart in Figure 3 illustrate the complete implementation, including the computation of the COC and residual error for approximating the solution of (55). A higher speedup ratio indicates greater efficiency.

Acceleration of Convergence and Stopping Criteria. To enhance both convergence speed and reliability, the following steps were applied:

-: Initial Vector Sampling. For each numerical experiment, a single initial guess vector $x_{0}$ is drawn randomly from a feasible domain, with magnitude close to $10^{- 1}$ to improve the convergence rate. This unbiased initialization avoids selection bias and provides a fair evaluation of algorithmic robustness.
-: Selection Criterion. The iterative scheme is run on all sampled vectors, and the one yielding the highest accuracy is retained, measured by

$∥u_{i}^{[ℏ + 1]} - u_{i}^{[ℏ]}∥ < t o l,$

(60)

where $t o l = 10^{- 32}$ . This high precision is achieved in MATLAB using the vpa function with digits = 64.
-: Stopping Criteria. The iteration is terminated once any of the following conditions is satisfied:

${∥F (u_{i}^{[ℏ]})∥}_{2} < t o l, {∥F (u_{i}^{[ℏ]})∥}_{\infty} < t o l .$

(61)

Visualization and Validation. The approximate solution

u^{[ℏ]}

obtained from the iterative scheme is validated against exact or benchmark solutions. Both are plotted to illustrate the accuracy and convergence behavior. Algorithm 2 summarizes the complete approach, and Figure 3 presents the corresponding computational workflow.

3.2. Applications in Biomedical Engineering

Benchmark models play a central role in assessing the effectiveness and reliability of numerical methods for real-world biomedical problems. They provide a controlled setting in which computational performance can be tested under challenging conditions, such as highly nonlinear systems of equations. The inclusion of fractional-order parallel systems in these benchmarks allows researchers to evaluate accuracy, stability, and convergence when applied to medical and biological processes.

Algorithm 1 Parallel VPA Weierstrass method for solving fractional PDEs

Require:: Number of spatial nodes n, domain length L, time step $τ$ , fractional order $α$ , diffusion coefficient D, maximum number of iterations MaxIter, tolerance Tol, Weierstrass relaxation factor $λ_{W}$ , small epsilon $ε$ for VPA.
Ensure:: Approximate solution $u_{approx}$ at current time $t = τ$ , absolute error with respect to exact solution.
1:: Initialize spatial grid: $x \leftarrow linspace (0, L, n)$
2:: Compute spatial step: $h \leftarrow x_{2} - x_{1}$
3:: Initialize previous solution: $u_{prev} \leftarrow vpa (sin (π x))$
4:: Define symbolic variables for unknowns: $u = sym ([u_{1}, u_{2}, \dots, u_{n}],^{'} r e a l^{'})$
5:: Apply boundary conditions: $u_{1} = 0$ , $u_{n} = 0$
6:: Approximate Caputo derivative using L1 formula:

$L 1_{Caputo} = \frac{u - u_{prev}}{τ^{α} Γ (2 - α)}$
7:: Construct system of nonlinear equations for internal nodes $i = 2$ to $n - 1$ :

$F_{i} = L 1_{Caputo} (i) - D \frac{u_{i + 1} - 2 u_{i} + u_{i - 1}}{h^{2}}$
8:: Reduce system to internal unknowns: $F_{red} = [F_{2}, \dots, F_{n - 1}]$ , $v a r s = [u_{2}, \dots, u_{n - 1}]$
9:: Convert symbolic function to MATLAB function handle: $F_{func} (v a r s)$
10:: Initialize guess for unknowns: $r o o t s A p p r o x \leftarrow u_{prev} (2 : e n d - 1)$
11:: Initialize new roots: $n e w R o o t s \leftarrow r o o t s A p p r o x$
12:: for $i t e r = 1$ to MaxIter do
13:: Store previous iteration: $t e m p R o o t s \leftarrow r o o t s A p p r o x$
14:: Parallel Weierstrass Step:
15:: for all $i = 1$ to $n_{int}$ in parallel (parfor loop) do
16:: Initialize product term: $p r o d T e r m \leftarrow 1$
17:: Evaluate system: $F_{val} \leftarrow F_{func} (r o o t s A p p r o x)$
18:: for $j = 1$ to $n_{int}$ do
19:: if $j \neq i$ then
20:: $d i f f \leftarrow r o o t s A p p r o x (i) - r o o t s A p p r o x (j)$
21:: if $| d i f f | < ε$ then
22:: $d i f f \leftarrow ε$ ▹avoid division by zero
23:: end if
24:: $p r o d T e r m \leftarrow p r o d T e r m \cdot d i f f$
25:: end if
26:: end for
27:: Update root: $n e w R o o t s (i) \leftarrow r o o t s A p p r o x (i) - λ_{W} \cdot F_{val} (i) / p r o d T e r m$
28:: end for
29:: Convergence Check:
30:: if $max (| n e w R o o t s - r o o t s A p p r o x |) < Tol$ then
31:: $r o o t s A p p r o x \leftarrow n e w R o o t s$
32:: break
33:: end if
34:: Update roots: $r o o t s A p p r o x \leftarrow n e w R o o t s$
35:: end for
36:: Construct full approximate solution: $u_{approx} \leftarrow u_{prev}$
37:: $u_{approx} (2 : e n d - 1) \leftarrow r o o t s A p p r o x$
38:: Compute exact solution using Mittag-Leffler function:

$u_{exact} (i) = vpa (e^{- x - t^{σ}})$
39:: Compute absolute error: $E r r o r \leftarrow | u_{approx} - u_{exact} |$
40:: Output: $u_{approx}$ , $u_{exact}$ , $E r r o r$
41:: Optional: Generate 3D plots for visual comparison between $u_{approx}$ and $u_{exact}$ along with error graph.

Algorithm 2 Parallel scheme for solving (55) using MATLAB parfor parallelization on multiple cores

1:: Input: Nonlinear system $F (x)$ , initial guesses $X^{(0)} = [x_{1}^{(0)}, x_{2}^{(0)}, \dots, x_{n}^{(0)}]$
2:: Parameters: Tolerance $ε$ , maximum number of iterations $N_{max}$ , version (pointwise or diagonalized)
3:: Set $k \leftarrow 0$
4:: while $k < N_{max}$ do
5:: Compute $F (X^{(k)}) = [f_{1} (x^{(k)}), f_{2} (x^{(k)}), \dots, f_{n} (x^{(k)})]$
6:: if pointwise version then
7:: for $i = 1$ to n do
8:: Compute the product $P_{i} = \prod_{j \neq i} (x_{i}^{(k)} - x_{j}^{(k)})$
9:: Update $x_{i}^{(k + 1)} = x_{i}^{(k)} - \frac{f_{i} (x_{i}^{(k)})}{P_{i}}$
10:: end for
11:: else if diagonalized version then
12:: Construct diagonal matrix D with $D_{i i} = \prod_{j \neq i} (x_{i}^{(k)} - x_{j}^{(k)})$
13:: $X^{(k + 1)} = X^{(k)} - D^{- 1} F (X^{(k)})$
14:: end if
15:: if $∥ X^{(k + 1)} - X^{(k)} ∥ < ε$ then
16:: Converged: Return $X^{(k + 1)}$
17:: end if
18:: $k \leftarrow k + 1$
19:: end while
20:: Return: $X^{(k + 1)}$ , "Maximum iterations reached"

3.2.1. Drug Diffusion in Tissue with Nonlinear Reaction [49]

Drug diffusion in tissue with nonlinear reactions is a fundamental process in biomedical engineering, pharmacology, and treatment design. In therapeutic applications such as targeted drug delivery, cancer therapy, and tissue engineering, drugs diffuse through biological tissue while simultaneously undergoing metabolic reactions, receptor binding, or cellular uptake. These reactions are often nonlinear due to saturation effects, cooperative binding, or enzymatic kinetics (e.g., Michaelis–Menten).

To capture these dynamics, mathematical modeling is essential for describing the spatiotemporal evolution of drug concentration, predicting therapeutic efficacy, and minimizing side effects. The governing equations typically couple diffusion—the passive transport of drug molecules through the extracellular matrix—with nonlinear reaction terms representing biochemical interactions. Incorporating fractional-order derivatives enables the model to capture anomalous diffusion observed in heterogeneous tissues, where transport deviates from classical Fickian behavior due to structural barriers and memory effects.

Such models, when implemented numerically, enable realistic simulation of drug–tissue interactions, optimization of dosing strategies, and the design of treatment protocols tailored to patient-specific conditions.

A time-fractional reaction–diffusion equation is used to model the spatiotemporal evolution of the drug concentration

u (x, t)

in a one-dimensional tissue segment

x \in [0, L]

[50]:

\{\begin{matrix} \frac{\partial^{σ} u}{\partial t^{σ}} = \frac{\partial^{2} u}{\partial x^{2}} + \frac{\partial u}{\partial x} + u + λ u^{2} (1 - u) + f (x, t), & x \in [x^{[0]}, x^{[n]}], t \in [t^{[0]}, t^{[n]}], \\ u (x, 0) = e^{- x}, \\ u (0, t) = e^{- t^{σ}}, u (2, t) = e^{- 2 - t^{σ}}, \end{matrix}

(62)

where the source term

f (x, t)

is chosen such that the exact solution

u (x, t) = e^{- x - t^{σ}}

(63)

satisfies the PDE exactly. In particular, the source term is defined as

f (x, t) = λ (e^{- 3 x - 3 t^{σ}} - e^{- 2 x - 2 t^{σ}}) - e^{- x - t^{σ}} + Q_{1}^{[*]},

(64)

where

Q_{1}^{[*]} = Γ^{- 1} (- σ) \int_{0}^{t} τ^{σ - 2} e^{- x - τ^{σ}} {(t - τ)}^{- σ} d τ .

Remark 1.

The exact solution (81) is consistent with the initial and boundary conditions, while the source term (64) guarantees that the PDE residual vanishes. This construction provides a valid and unbiased benchmark for error assessment (see MATLAB symbolic script in Appendix A.1, Figure A1).

The problem setup is characterized as follows:

Spatial domain: $x \in [0, 2]$ , discretized into N intervals with spacing $h = L / N$ .
Time domain: $t \in [0, 1]$ , discretized into M intervals with spacing $τ = T / M$ .
Grid nodes: $x_{i} = i h, i = 0, \dots, N$ , and $t_{n} = n τ, n = 0, \dots, M$ .
Unknown: $u_{i}^{n} \approx u (x_{i}, t_{n})$ .

The Caputo fractional PDE is discretized in time using the L1 scheme:

\frac{\partial^{σ} u}{\partial t^{σ}} \approx \frac{1}{τ^{σ} Γ (2 - σ)} [b_{0} u_{i}^{n} - \sum_{k = 1}^{n - 1} (b_{n - k - 1} - b_{n - k}) u_{i}^{k} - b_{n - 1} u_{i}^{0}],

(65)

where

b_{k} = {(k + 1)}^{1 - σ} - k^{1 - σ}

. For convenience, this expression can be rewritten as

\frac{\partial^{σ} u}{\partial t^{σ}} \approx \frac{1}{τ^{σ} Γ (2 - σ)} \sum_{k = 0}^{n} w_{k}^{(n)} u_{i}^{k},

(66)

with weights defined as

w_{0}^{(n)} = b_{0}

,

w_{k}^{(n)} = b_{n - k - 1} - b_{n - k}

for

1 \leq k < n

, and

w_{n}^{(n)} = - b_{n - 1}

. The spatial derivatives are discretized using finite differences:

\frac{\partial^{2} u}{\partial x^{2}} \approx \frac{u_{i + 1}^{n} - 2 u_{i}^{n} + u_{i - 1}^{n}}{h^{2}},

(67)

\frac{\partial u}{\partial x} \approx \frac{u_{i + 1}^{n} - u_{i - 1}^{n}}{2 h} .

(68)

By substituting (66)–(68) into (62), the fully discretized system at each spatial point can be written as

\frac{1}{τ^{σ} Γ (2 - σ)} \sum_{k = 1}^{n - 1} w_{k}^{(n)} u_{i}^{k} = \frac{u_{i + 1}^{n} - 2 u_{i}^{n} + u_{i - 1}^{n}}{h^{2}} + \frac{u_{i + 1}^{n} - u_{i - 1}^{n}}{2 h} + u_{i}^{n} + λ {(u_{i}^{n})}^{2} (1 - u_{i}^{n}) + f_{i}^{n},

(69)

where

f_{i}^{n} = f (x_{i}, t^{n}) = λ exp (- 2 x_{i} - 2 t_{n}^{α}) (1 - exp (- x_{i} - t_{n}^{σ})) .

(70)

In matrix form, the system can be expressed as

\frac{1}{τ^{σ} Γ (2 - σ)} \sum_{k = 1}^{n - 1} w_{k}^{(n)} u^{k} = D_{1} u^{n} + D_{2} u^{n} + u^{n} + λ {(u^{n})}^{2} ⊙ (1 - u^{n}) + f^{n},

(71)

where

⊙ denotes element-wise operations;
$f^{n} = {[f_{1}^{n}, f_{2}^{n}, \dots, f_{N - 1}^{n}]}^{T}$ .

The discrete differential operators are defined as

D_{1} = \frac{1}{h^{2}} tridig (1, - 2, 1) = \frac{1}{h^{2}} [\begin{matrix} - 2 & 1 & 0 & \dots & 0 \\ 1 & - 2 & 1 & \dots & 0 \\ 0 & ⋱ & ⋱ & ⋱ & 0 \\ 0 & \dots & 1 & - 2 & 1 \\ 0 & \dots & 0 & 1 & - 2 \end{matrix}],

(72)

and

D_{2} = \frac{1}{2 h} tridig (- 1, 0, 1) = \frac{1}{2 h} [\begin{matrix} 0 & 1 & 0 & \dots & 0 \\ - 1 & 0 & 1 & \dots & 0 \\ 0 & ⋱ & ⋱ & ⋱ & 0 \\ 0 & \dots & - 1 & 0 & 1 \\ 0 & \dots & 0 & - 1 & 0 \end{matrix}] .

(73)

The numerical results obtained with the proposed parallel scheme

{CMM}_{1}^{[C_{3}]}

for solving the considered problem are summarized in Table 3. In these simulations, the initial guess vectors were chosen sufficiently close to the exact solution (within a tolerance of

0.01

) to ensure rapid convergence and to avoid the instability associated with poor initial approximations. Table 3 reports the computed solutions for different spatial grid sizes and fractional-order parameters

σ

, demonstrating the robustness and accuracy of the scheme across varying problem scales and fractional dynamics. Figure 4 and Figure 5 display the exact and approximate solutions of (71), together with the corresponding absolute errors.

To provide a comprehensive performance evaluation, Table 4 compares the proposed approach with other well-established methods from the literature for selected grid sizes (

120, 180

) and fractional parameter

σ \approx 1

, thus assessing performance in the near-integer order regime. This comparative analysis highlights the efficiency, precision, and computational advantages of the

{CMM}_{1}^{[C_{3}]}

scheme, confirming its suitability for high-accuracy fractional-order simulations in practical applications.

Table 4 presents the numerical results obtained with the proposed method, where the initial approximations were deliberately chosen close to the exact solutions to ensure stable and rapid convergence. The results confirm the accuracy and efficiency of the method across different test configurations. A comparative analysis with state-of-the-art parallel iterative schemes—

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

, and

{ELM}^{[C_{3}]}

—shows that the proposed

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

approaches consistently achieve substantially lower residual errors. This improvement underscores the enhanced stability and convergence properties of the scheme, making it a strong candidate for high-precision computations in nonlinear systems. The overall performance of the parallel schemes is summarized in Table 5, for a fractional parameter

σ = 0.9

and grid sizes of 120 and 180.

Table 5 demonstrates that, across all key performance metrics—number of iterations, maximum error, percentage convergence, total arithmetic operations, memory usage, and computational order of convergence (COC)—the proposed

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

methods significantly outperform the existing approaches

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

, and

{ELM}^{[C_{3}]}

. Random initial guess vectors were used to analyze the global convergence of parallel techniques for solving (71). This procedure ensured accuracy up to two decimal places relative to the exact solution. The adaptive selection process markedly improved robustness and efficiency across a wide range of problem instances. The random initial guess vectors employed for the bio-heat problem are reported in Table 6 and Table 7.

The results of the adaptive self-adjustment of initial guess values, as described in Algorithm 1, are presented in Table 8 for both Criteria I and II.

Table 8 demonstrates the accuracy of the proposed schemes in solving CTFPDEs for different parameter values. For grid sizes of 120 and 180, the methods

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

outperform earlier approaches (

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

, and

{ELM}^{[C_{3}]}

) in terms of accuracy. Table 8 also reports the overall consistency analysis, performed using randomly generated initial values. Furthermore, the results in Table 9 confirm that the proposed method achieves higher accuracy and stability than existing schemes. Across all evaluation metrics—including the average number of iterations, computational time (seconds), percentage convergence, and memory utilization—our approach consistently outperforms the alternatives under both Criterion I and Criterion II.

The results obtained from the MATLAB parfor parallel implementation are reported in Table 10.

The proposed MATLAB parfor-based parallelization of the schemes achieves significant acceleration across all test cases. As shown in Table 10, the parallel implementation attains a speedup ratio of

2.95

–

3 \times

on a four-core machine, indicating efficient utilization of the available cores. Notably, the maximum error and percentage convergence remain consistent with the serial approach, confirming that accuracy is preserved. Moreover, memory usage shows slight improvements due to optimized data handling within the parallel loops.

Physical Behavior of Drug Diffusion in Tissue with Nonlinear Reaction Model.

In the drug diffusion model, the fractional-order time derivative governs the temporal evolution of concentration, while the nonlinear reaction term regulates saturation effects. To ensure both accuracy and computational efficiency, numerical parameters were selected according to these structural properties, including time step and spatial discretization.

Step size and tolerance were chosen to balance accuracy and convergence.
Numerical results showed that the proposed approach preserved the model’s physical behavior over time and space.
The interaction between fractional order, nonlinearity, and numerical discretization directly affected computational cost and convergence speed.

3.2.2. Brain Signal Propagation with Nonlinear Blood Flow Effects [51]

Brain signal propagation with nonlinear blood flow effects is an emerging interdisciplinary research area at the intersection of neuroscience, biofluid mechanics, and mathematical physics. Neural signal transmission is governed by complex electrochemical processes along neuronal pathways, while cerebral blood flow delivers the oxygen and nutrients required for proper neuronal activity. The coupling between vascular hemodynamics, neuronal firing dynamics, and ionic transport produces inherently nonlinear interactions—both under physiological conditions and more prominently in pathological states such as stroke, epilepsy, or traumatic brain injury.

From a mathematical perspective, these interactions are often modeled by coupling nonlinear reaction–diffusion or fractional-order cable equations (describing axonal signal diffusion and membrane potential dynamics) with Navier–Stokes-type or Darcy–Forchheimer equations (representing nonlinear blood flow through microvasculature). Additional nonlinearities may arise from synaptic saturation, threshold-based action potentials, or rheological effects of blood such as shear-dependent viscosity. Fractional calculus has gained increasing importance in this context, as it captures memory effects in vascular and neural components, including anomalous diffusion in the extracellular space and delayed hemodynamic responses.

The resulting coupled models enable the simulation of realistic patterns of brain signal propagation, the prediction of delays induced by impaired blood flow, and the evaluation of therapeutic interventions such as modulation of neurovascular coupling. By integrating electrophysiological and hemodynamic processes into a unified mathematical framework, these models provide deeper insight into brain function and pathology, supporting advances in brain–computer interfaces, diagnostic imaging, and targeted therapeutic strategies [52].

Let

u (x, t)

denote the neural field (e.g., membrane potential, averaged activity, or signal amplitude) on a one-dimensional tissue domain

x \in [0, L]

. To capture anomalous temporal dynamics, we employ a Caputo fractional derivative of order

0 < σ \leq 1

. The governing FPDE is given by

\{\begin{matrix} \frac{\partial^{σ} u}{\partial t^{σ}} = v_{1}^{[*]} \frac{\partial^{2} u}{\partial x^{2}} + v_{2}^{[*]} \frac{\partial u}{\partial x} + u (x, t) + f (x, t), & x \in [0, 2], t \in [0, 1], \\ u (x, 0) = 2.09 + sin (π x), \\ u (0, t) = 0, u (2, t) = 2 (t^{σ} + 1.09) + sin (2 π), \end{matrix}

(74)

where the source term

f (x, t)

is chosen such that the exact solution

u (x, t) = (2 t^{σ} + 2.18) + sin (π x)

(75)

satisfies the PDE exactly (see MATLAB symbolic script in Appendix A.2, Figure A2). In particular, the source term is defined as

f (x, t) = t^{σ} ((π^{2} v_{1}^{[*]} - 1) sin (π x) - π v_{2}^{[*]} cos (π x)) - sin (π x) Q_{2}^{[*]},

(76)

where

Q_{2}^{[*]} = Γ^{- 1} (- σ) \int_{0}^{t} τ^{σ - 2} {(t - τ)}^{- σ} d τ .

Model parameters and discretization:

$v_{1}^{[*]}$ : effective diffusion coefficient of the electrical signal (axonal/dendritic spread).
$v_{2}^{[*]}$ : advective drift term (typically small; set $v_{2}^{[*]} = 0$ unless modeling directed flow).
$f (x, t)$ : external source term.
Spatial domain: $x \in [0, 2]$ , discretized into N intervals with spacing $h = L / N$ .
Temporal domain: $t \in [0, 1]$ , discretized into M intervals with spacing $τ = T / M$ .
Grid nodes: $x_{i} = i h$ , $i = 0, \dots, N$ ; $t_{n} = n τ$ , $n = 0, \dots, M$ .
Unknowns: $u_{i}^{n} \approx u (x_{i}, t_{n})$ .

Using approximations (66)–(68) in (74), we obtain the following nonlinear system of equations:

\frac{1}{τ Γ (2 - σ)} \sum_{k = 1}^{n - 1} w_{k}^{(n)} u_{i}^{k} = \frac{u_{i + 1}^{n} - 2 u_{i}^{n} + u_{i - 1}^{n}}{h^{2}} + \frac{u_{i + 1}^{n} - u_{i - 1}^{n}}{2 h} + u_{i}^{n} + f_{i}^{n},

(77)

where

f_{i}^{n} = f (x_{i}, t^{n}) = sin (u_{i}^{n}) .

(78)

In matrix form, the system can be expressed as

\frac{1}{τ Γ (2 - σ)} \sum_{k = 1}^{n - 1} w_{k}^{(n)} u^{k} = D_{1} u^{n} + D_{2} u^{n} + u^{n} + f^{n},

(79)

where

⊙ denotes element-wise operations;
$f^{n} = {[f_{1}^{n}, f_{2}^{n}, \dots, f_{N - 1}^{n}]}^{T}$ ;
$D_{1}$ and $D_{2}$ are defined in (72)–(73), respectively.

Table 11 summarizes the numerical results obtained with the proposed parallel scheme

{CMM}_{1}^{[C_{3}]}

for the considered problem. In these simulations, the initial guess vectors were chosen close to the exact solution (within a tolerance of

0.001

), ensuring efficient convergence while avoiding instabilities that may arise from poor initial approximations. To demonstrate the accuracy and robustness of the scheme across different problem scales and fractional dynamics, the table reports computed solutions for a range of spatial grid sizes and fractional-order parameters

σ

. Table 11 also provides a direct comparison between the proposed methodology and established approaches from the literature, offering a comprehensive performance assessment. Based on these results, Figure 6 and Figure 7 illustrate the exact and approximate solutions of (71), together with the corresponding absolute errors. For fractional parameters

σ \approx 1

, the comparison was carried out with grid sizes of 120 and 180, enabling evaluation in the near-integer regime. This analysis confirms the efficacy, accuracy, and computational advantages of the

{CMM}_{1}^{[C_{3}]}

scheme, validating its suitability for high-precision fractional-order simulations in practical applications.

Table 12 reports the numerical results obtained with the proposed parallel scheme

{CMM}_{2}^{[C_{3}]}

for the problem under consideration. In these simulations, the initial vector estimates were selected within

0.001

of the exact solution to ensure an efficient iterative process and to avoid instabilities that may arise from inaccurate initial approximations. To demonstrate the stability and reliability of the combined

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

framework across different problem scales and fractional dynamics, the table presents computed solutions for a range of spatial grid points and fractional-order parameters

σ

.

Table 13 further provides a performance assessment by comparing the proposed strategy against established methods from the literature. In particular, for fractional parameters

σ \approx 1

, grid sizes of 120 and 180 were considered, allowing evaluation in the near-integer regime. This comparison highlights the efficiency, accuracy, and computational advantages of the

{CMM}_{1}^{[C_{3}]}

scheme, confirming its suitability for high-precision fractional-order simulations in practical applications.

The overall performance of the proposed parallel schemes is summarized in Table 13, for a fractional parameter

σ = 0.9

and grid sizes of 120 and 180.

Table 13 clearly demonstrates that, across key performance metrics—including the number of iterations, maximum error, percentage convergence, total arithmetic operations, memory usage, and computational order of convergence (COC)—the proposed method significantly outperforms existing approaches. Random initial guess vectors were employed to analyze the global convergence of parallel techniques for solving (79). This procedure substantially enhanced robustness and efficiency across a wide range of problem instances. The random initial guess vectors used for the bio-heat problem are reported in Table 14 and Table 15.

The results of the adaptive self-adjustment of initial guess values, as outlined in Algorithm 2, are presented in Table 16 for both Criterion I and Criterion II.

Table 16 presents the accuracy of the proposed scheme in solving CTFPDEs for various parameter values. The results for grid sizes of 120 and 180 clearly demonstrate that the methods achieve higher accuracy than earlier approaches. Table 17 provides the overall consistency analysis, performed with a random set of initial values. The results in Table 17 confirm that the combined

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

scheme outperforms existing methods (

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

,

{ELM}^{[C_{3}]}

) in both accuracy and stability. Across all evaluation metrics—including the average number of iterations, computational time (seconds), percentage convergence, and memory usage—the proposed methods consistently surpass competing approaches under both Criterion I and Criterion II.

The results obtained from the MATLAB parfor-based parallel implementation are reported in Table 18.

All test cases demonstrate substantial acceleration with the proposed MATLAB parfor-based parallelization of the schemes. On a four-core machine, the parallel implementation achieved a speedup ratio of approximately

3.73

–

4 \times

, as reported in Table 18, confirming efficient utilization of the available cores. Importantly, both the maximum error and percentage convergence remained unchanged compared to the serial implementation, indicating that accuracy is fully preserved under parallel execution.

Physical Behavior of the Brain Signal Propagation Model. In the brain signal propagation model, the fractional-order derivative incorporates memory effects in neuronal and hemodynamic dynamics, while nonlinear blood flow terms account for feedback between vascular response and signal transmission. These structural properties were reflected in the choice of numerical parameters, such as time step and spatial discretization, which were selected to ensure both accuracy and computational efficiency.

Step size and tolerance were chosen to balance accuracy and convergence in the presence of fractional dynamics and nonlinear flow effects.
Numerical results showed that the proposed parallel scheme accurately reproduced signal propagation patterns while preserving key physiological features.
Fractional order, nonlinear blood flow, and discretization parameters jointly influenced computational cost and convergence rate.

3.2.3. Fractional Heart Tissue Electrical Conduction with Nonlinear Reaction [53]

The fractional heart tissue electrical conduction model with nonlinear response provides a refined mathematical framework for describing the propagation of electrical signals in cardiac tissue, incorporating memory effects and complex cell interactions through fractional calculus. Unlike classical integer-order models, this approach employs fractional derivatives—typically in the Caputo or Riemann–Liouville form—to capture anomalous diffusion and hereditary properties that reflect the heterogeneity and microstructural complexity of heart tissue.

This framework is particularly valuable for modeling the dynamics of gap junctions and ion channels, where nonlinear reaction terms reproduce the biochemical processes and voltage-dependent ionic currents responsible for generating and propagating action potentials. By coupling fractional reaction–diffusion equations with nonlinear source terms, the model effectively replicates realistic processes such as wavefront slowing, reentry phenomena, and the complex spatiotemporal patterns characteristic of cardiac arrhythmia. The fractional-order parameter serves as a tuning mechanism, interpolating between normal diffusion and sub-diffusion, thereby enabling a more accurate representation of delayed conduction and memory-dependent effects typical of diseased or fibrotic myocardium.

This enhanced modeling capability not only deepens the understanding of pathological conduction but also provides a powerful tool for designing targeted therapeutic strategies, including electrical pacing and ablation therapy in clinical electrophysiology.

In a one-dimensional tissue segment

x \in [0, 2]

, the spatiotemporal evolution of the fractional cardiac tissue electrical conduction with nonlinear reaction

u (x, t)

is modeled by a time-fractional reaction–diffusion equation [54]:

\{\begin{matrix} \frac{\partial^{σ} u}{\partial t^{σ}} = \frac{\partial^{2} u}{\partial x^{2}} + \frac{\partial u}{\partial x} + u (x, t) + u^{3} (x, t) + f (x, t), & x \in [0, 2], t \in [0, 2], \\ u (x, 0) = 0, \\ u (0, t) = t^{σ}, u (2, t) = 0 . \end{matrix}

(80)

where the source term

f (x, t)

is chosen so that the exact solution

u (x, t) = (1 - \frac{x}{2}) t^{σ}

(81)

satisfies the PDE exactly (see MATLAB symbolic script in Appendix A.3, Figure A3). In particular, the source term is defined as

f (x, t) = \frac{1}{8} ({(- 2 + x)}^{3} t^{3 σ} + 4 t^{σ} (- 1 + x)) + \frac{4 x - 8}{8} Q_{2}^{[*]},

(82)

where

Q_{2}^{[*]} = Γ^{- 1} (- σ) \int_{0}^{t} τ^{σ - 2} {(t - τ)}^{- σ} d τ .

(83)

Problem characterization:

Spatial domain: $x \in [0, 2]$ , discretized into N intervals with spacing $h = L / N$ .
Time domain: $t \in [0, 2]$ , discretized into M intervals with spacing $τ = T / M$ .
Grid nodes: $x_{i} = i h$ , $i = 0, \dots, N$ ; $t_{n} = n τ$ , $n = 0, \dots, M$ .
Unknowns: $u_{i}^{n} \approx u (x_{i}, t_{n})$ .

By applying approximations (66)–(68) to (80), the following nonlinear system of equations is obtained:

\frac{1}{τ Γ (2 - σ)} \sum_{k = 1}^{n - 1} w_{k}^{(n)} u_{i}^{k} = \frac{u_{i + 1}^{n} - 2 u_{i}^{n} + u_{i - 1}^{n}}{h^{2}} + \frac{u_{i + 1}^{n} - u_{i - 1}^{n}}{2 h} + u_{i}^{n} + {(u_{i}^{n})}^{3} + f_{i}^{n},

(84)

where

f_{i}^{n} = f (x_{i}, t^{n}) = \frac{Γ (1 + σ)}{1 + x_{i}^{2}} - {(t^{n})}^{σ} (\frac{6 x_{i}^{2} - 2}{{(1 + x_{i}^{2})}^{3}} - \frac{2 x_{i}}{{(1 + x_{i}^{2})}^{2}} + \frac{1}{1 + x_{i}^{2}}) .

(85)

The system can be written in matrix form as

\frac{1}{τ Γ (2 - σ)} \sum_{k = 1}^{n - 1} w_{k}^{(n)} u^{k} = D_{1} u^{n} + D_{2} u^{n} + u^{n} + {(u^{n})}^{3} + f^{n},

(86)

where

⊙ denotes element-wise operations;
$f^{n} = {[f_{1}^{n}, f_{2}^{n}, \dots, f_{N - 1}^{n}]}^{T},$

and the matrices

D_{1}

and

D_{2}

are defined in (72)–(73).

The numerical results obtained with the proposed parallel scheme

{CMM}_{1}^{[C_{3}]}

are summarized in Table 19. In these simulations, the initial guess vectors were selected within a tolerance of

0.001

from the exact solution to ensure efficient convergence and to avoid instabilities due to poor initialization. The table reports computed solutions for a range of spatial grid points and fractional-order parameters

σ

, illustrating the robustness and accuracy of the scheme across varying problem scales and fractional dynamics.

To assess performance, Table 19 also provides a direct comparison with well-established methods from the literature. Based on these results, Figure 8 and Figure 9 present the exact and approximate solutions of (86), together with the corresponding absolute errors, for selected grid sizes 120 and 180 with fractional parameter

σ \approx 1

. This near-integer regime highlights the efficiency, accuracy, and computational advantages of the

{CMM}_{1}^{[C_{3}]}

scheme, confirming its suitability for high-precision fractional-order simulations in practical applications.

Table 20 reports the numerical results obtained with the proposed parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

for the problem under consideration. In these simulations, the initial guess vectors were selected within a tolerance of

0.001

from the exact solution to ensure efficient convergence and to avoid instabilities that may arise from poor initial approximations. The table presents computed solutions for different spatial grid points and fractional-order parameters

σ

, demonstrating the stability and accuracy of the scheme across varying problem scales and fractional dynamics.

For a meaningful performance assessment, Table 21 provides a direct comparison with established methods from the literature, namely,

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

, and

{ELM}^{[C_{3}]}

. This comparison, conducted for grid sizes 120 and 180 with fractional parameter

σ \approx 1

, highlights the superior efficiency, precision, and computational advantages of the proposed

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

schemes, confirming their suitability for high-accuracy fractional-order simulations in practical applications.

The overall performance of the parallel schemes is summarized in Table 21, considering a fractional parameter

σ = 0.9

and grid sizes of 120 and 180.

Table 21 demonstrates that, across key performance metrics—including the number of iterations, maximum error, percentage convergence, total arithmetic operations, memory usage, and computational order of convergence (COC)—the proposed method significantly outperforms existing approaches. To analyze the global convergence of the parallel techniques for solving (86), random initial guess vectors were employed. This procedure substantially improves robustness and efficiency across a wide range of problem instances. The random initial guess vectors used for the bio-heat problem are reported in Table 22 and Table 23.

The outcomes of the adaptive self-adjustment of the initial guess values, as described in Algorithm 2, are reported in Table 24 for both Criterion I and Criterion II.

The accuracy of the proposed approaches,

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

, in solving CTFPDEs for various parameter values is reported in Table 24. For grid sizes of 120 and 180, the methods achieve higher accuracy than previously established approaches. Table 25 presents the overall consistency analysis, performed with a random set of initial values. The results clearly indicate that

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

outperform existing methods (

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

,

{ELM}^{[C_{3}]}

) in both accuracy and stability. Under both Criterion I and Criterion II, the schemes consistently deliver superior performance across all evaluation metrics, including the average number of iterations, computational time (seconds), percentage convergence, and memory usage.

The results obtained from the MATLAB parfor-based parallelization of the proposed schemes are presented in Table 26.

Physical Behavior of the Heart Tissue Electrical Conduction Model: In the fractional heart tissue model, the fractional-order time derivative captures memory effects in electrical conduction, while the nonlinear reaction term characterizes ionic current dynamics and cardiac cell excitability. To enable accurate and efficient simulation of cardiac electrical activity, numerical parameters such as the time step and spatial discretization were carefully selected in accordance with these structural properties.

Step size and tolerance were adjusted to balance precision and convergence in the presence of fractional dynamics and nonlinear ionic interactions.
Numerical results confirmed that the proposed approach accurately reproduced the spatiotemporal propagation of electrical signals in cardiac tissue.

3.3. Comparative Discussion of Biomedical Examples

To evaluate the performance of the proposed parallel approaches, we conducted a comparative study across the three biomedical models, using both deterministic closed-form initial guesses and random initial vectors, together with MATLAB’s parfor implementation. Table 2, Table 17 and Table 25 provide a summary of the main findings.

First, Table 3, Table 4 and Table 5, Table 11, Table 12 and Table 13 and Table 19, Table 20 and Table 21 show that the proposed

{CMM}_{2}^{[C_{3}]}

scheme consistently achieves lower error norms than existing iterative approaches for fixed fractional orders

σ

. In particular, when

σ \approx 1

, the method yields a notable reduction in residual error, confirming its stability in the classical limit. Moreover, the results in Table 4 indicate that the proposed methods are both faster and more accurate than earlier schemes, even without parallelization.

Second, Table 17 and Table 25 examine the effect of random initial vectors. The results demonstrate that the proposed strategies maintain robust convergence and consistency across trials, even with randomly generated initializations, highlighting improved algorithmic stability.

Finally, Table 9, Table 10, Table 16, Table 18, Table 24 and Table 26 assess the influence of MATLAB’s parfor-based parallelization. Although parfor differs from OpenMP in memory management, the implementation significantly reduces CPU time, particularly for problems with dense fractional memory terms. Importantly, accuracy is fully preserved under parallel execution, and the observed speedup confirms the scalability of the proposed methods to multi-core architectures.

Overall, the comparative results across the biomedical examples reveal several key trends:

Mathematical structure: Biomedical models with higher-order nonlinearities or significant fractional memory effects (e.g., cardiac conduction) tend to exhibit slower convergence. Nonetheless, the proposed methods ensure steady residual decay and maintain accuracy even under stiff nonlinear dynamics (see Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9).
Numerical parameters: Smaller step sizes and high-precision VPA computations reduce local truncation error and further improve accuracy. In addition, adaptive random initialization yields consistent results across all three biomedical applications.
Efficiency: The proposed parallel schemes, particularly ${CMM}_{2}^{[C_{3}]}$ , reliably reduce CPU time and maximum error compared to classical fractional iterative methods. Performance gains become increasingly pronounced for larger problem sizes and denser fractional memory terms.
Stability under perturbations: Convergence is preserved even when random initial vectors, parameter fluctuations, or noise in boundary conditions are introduced. This resilience is especially important for biomedical applications, where data uncertainty is inherent.
Parallel scalability: Although MATLAB’s parfor differs from OpenMP in memory distribution, the results confirm effective parallel speedup on multi-core architectures. The scalability of the proposed approach makes it suitable for high-dimensional and long-time fractional simulations.
Computational cost: The hierarchical formulation of the parallel schemes controls memory usage, delivering a favorable cost-to-accuracy ratio. Even for nonlinear biological PDEs with strong fractional effects, the proposed methods remain computationally competitive.
Biomedical relevance: Each biomedical application represents a distinct physiological process—drug transport, neuronal dynamics, and cardiac tissue excitation. In all cases, the proposed methods produced reliable results, underscoring their translational potential in real-world biomedical modeling.

This comparative analysis demonstrates that the proposed approaches not only enhance accuracy and efficiency but also provide a reliable foundation for solving a broad class of biomedical fractional models, thereby extending their practical applicability beyond existing methods.

4. Conclusions

In this paper, two parallel techniques for solving CTFPDEs were developed and analyzed. Theoretical results established second- and third-order convergence, while the implementation exploited diagonal and element-wise multiplications of the Weierstrass correction to efficiently approximate the solutions. Several benchmark problems from biomedical engineering were employed to evaluate the accuracy, stability, and consistency of the proposed schemes.

The numerical results, summarized in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 14, Table 15, Table 16, Table 17, Table 18, Table 19, Table 20, Table 21, Table 22, Table 23, Table 24, Table 25 and Table 26 and Figure 1, Figure 2, Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9, demonstrate that the new methods consistently outperform existing approaches in terms of residual errors (in both

ℓ_{2}

- and

ℓ_{\infty}

-norms), memory efficiency, computational time, and percentage convergence. The use of random initial test vectors combined with an adaptive optimization step further enhanced robustness by automatically selecting effective starting points and accelerating convergence. In addition, fractal-based dynamical analysis (Table 2 and Figure 2) confirmed the stability and reliability of the schemes across diverse problem scales.

Limitations. The present study has some limitations:
- Parallelization was implemented using MATLAB’s parfor construct rather than low-level OpenMP or GPU-based solutions, which may limit scalability for large-scale problems.
- Higher-dimensional cases may require additional stability and memory considerations; the numerical studies presented here were restricted to two-dimensional biological PDEs.
- Exact solutions were constructed for validation, whereas real biomedical data typically contain noise and parameter uncertainty, which were not considered in this work.
Future work. Several directions are envisioned for extending this study:
- Implementing the techniques in hybrid CPU–GPU environments to improve scalability and computational speed.
- Extending the framework to multidimensional biomedical models, such as three-dimensional heart wave propagation.
- Incorporating uncertainty quantification to account for variability and noise in physiological data.
- Exploring adaptive step-size control and machine learning-assisted initialization to further improve reliability and convergence.

In summary, the proposed parallel techniques provide a theoretically sound and computationally efficient framework for solving fractional-order PDEs in biomedical engineering. By combining rigorous convergence guarantees with practical efficiency, they offer a powerful tool for modeling complex physiological processes with memory effects, thereby extending the applicability of fractional models in biomedical research and clinical simulation.

Author Contributions

Conceptualization, M.S. and B.C.; methodology, M.S.; software, M.S.; validation, M.S.; formal analysis, B.C.; investigation, M.S.; resources, B.C.; writing—original draft preparation, M.S. and B.C.; writing—review and editing, B.C.; visualization, M.S. and B.C.; supervision, B.C.; project administration, B.C.; funding acquisition, B.C. All authors have read and agreed to the published version of the manuscript.

Funding

Bruno Carpentieri’s work is supported by the European Regional Development and Cohesion Funds (ERDF) 2021–2027 under Project AI4AM - EFRE1052. He is a member of the Gruppo Nazionale per il Calcolo Scientifico (GNCS) of the Istituto Nazionale di Alta Matematica (INdAM), and this work was partially supported by INdAM-GNCS under the Progetti di Ricerca 2024 program.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

In this article, the following abbreviations are used:

${CMM}_{1}^{[C_{3}]}$ – ${CMM}_{2}^{[C_{3}]}$	Newly developed schemes
n	Iterations
CPU-time	Computational time in seconds
$C O C$	Computational local convergence order

Appendix A. Symbolic Verification of Biomedical Models

This appendix presents the symbolic verification of the exact solutions corresponding to the three biomedical FPDE models discussed in the main text. The Caputo fractional derivative, spatial derivatives, and nonlinear terms were evaluated using MATLAB’s Symbolic Math Toolbox. In each case, the residuals vanished identically, confirming that the constructed source terms were consistent with the exact solutions. The prescribed initial and boundary conditions were likewise verified symbolically. All computations were carried out with MATLAB Symbolic Math Toolbox, employing the vpasolve, int, and diff operators.

Appendix A.1. Drug Diffusion in Tissue with Nonlinear Reaction

The exact solution

u (x, t) = e^{- x - t^{σ}}

was symbolically differentiated to evaluate the Caputo derivative and substituted into the governing FPDE. The resulting residual

R (x, t)

vanished identically, confirming that both the initial and boundary conditions were satisfied.

$Fractalfract 09 00607 g0a1$

Figure A1. Symbolic verification of the exact solution

u (x, t) = e^{- x - t^{σ}}

for the drug diffusion model. The computation checks the Caputo fractional derivative, PDE residual, and initial and boundary conditions using MATLAB’s symbolic toolbox.

Figure A1. Symbolic verification of the exact solution

u (x, t) = e^{- x - t^{σ}}

for the drug diffusion model. The computation checks the Caputo fractional derivative, PDE residual, and initial and boundary conditions using MATLAB’s symbolic toolbox.

$Fractalfract 09 00607 g0a1$

Appendix A.2. Brain Signal Propagation with Nonlinear Blood Flow Effects

For the exact solution

u (x, t) = (2 t^{σ} + 2.18) + sin (π x)

, symbolic evaluation of the Caputo derivative together with the spatial terms yielded

R (x, t) \equiv 0

. This confirmed that the governing FPDE was satisfied, and that the prescribed initial and boundary conditions held.

$Fractalfract 09 00607 g0a2$

Figure A2. Symbolic verification for

u (x, t) = (2 t^{σ} + 2.18) + sin (π x)

in the brain signal model.

Figure A2. Symbolic verification for

u (x, t) = (2 t^{σ} + 2.18) + sin (π x)

in the brain signal model.

$Fractalfract 09 00607 g0a2$

Appendix A.3. Fractional Heart Tissue Electrical Conduction with Nonlinear Reaction

For the exact solution

u (x, t) = (1 - \frac{x}{2}) t^{σ}

, symbolic evaluation showed that the residual vanished identically, confirming that both the initial and boundary conditions were satisfied.

$Fractalfract 09 00607 g0a3$

Figure A3. Symbolic verification for

u (x, t) = (1 - \frac{x}{2}) t^{σ}

in the heart tissue conduction model.

Figure A3. Symbolic verification for

u (x, t) = (1 - \frac{x}{2}) t^{σ}

in the heart tissue conduction model.

$Fractalfract 09 00607 g0a3$

Appendix A.4. Remarks

Across all three biomedical FPDE models, symbolic verification confirmed full consistency between the exact solutions, the constructed source terms, and the imposed initial and boundary conditions. These results provide a rigorous benchmark for validating the numerical experiments reported in the main text.

References

Rathore, A.S.; Mishra, S.; Nikita, S.; Priyanka, P. Bioprocess control: Current progress and future perspectives. Life 2021, 11, 557. [Google Scholar] [CrossRef]
Su, W.H.; Chou, C.S.; Xiu, D. Deep learning of biological models from data: Applications to ODE models. Bull. Math. Biol. 2021, 83, 19. [Google Scholar] [CrossRef]
Cruz, D.A.; Kemp, M.L. Hybrid computational modeling methods for systems biology. Prog. Biomed. Eng. 2021, 4, 012002. [Google Scholar] [CrossRef]
Enderle, J.D.; Ropella, K.M.; Kelsa, D.M.; Hallowell, B. Ensuring that biomedical engineers are ready for the real world. IEEE Eng. Med. Biol. Mag. 2002, 21, 59–66. [Google Scholar] [CrossRef] [PubMed]
Gu, X.M.; Wu, S.L. A parallel-in-time iterative algorithm for Volterra partial integro-differential problems with weakly singular kernel. J. Comput. Phys. 2020, 417, 109576. [Google Scholar] [CrossRef]
Wen, J.; Tian, Y.E.; Skampardoni, I.; Yang, Z.; Cui, Y.; Anagnostakis, F.; Mamourian, E.; Zhao, B.; Toga, A.W.; Zalesky, A.; et al. The genetic architecture of biological age in nine human organ systems. Nat. Aging 2024, 4, 1290–1307. [Google Scholar] [CrossRef]
Ghezal, A.; Al Ghafli, A.A.; Al Salman, H.J. Anomalous Drug Transport in Biological Tissues: A Caputo Fractional Approach with Non-Classical Boundary Modeling. Fractal Fract. 2025, 9, 508. [Google Scholar] [CrossRef]
Sachse, F.B.; Moreno, A.P.; Seemann, G.; Abildskov, J.A. A model of electrical conduction in cardiac tissue including fibroblasts. Ann. Biomed. Eng. 2009, 37, 874–889. [Google Scholar] [CrossRef]
Dai, X.; Wu, D.; Xu, K.; Ming, P.; Cao, S.; Yu, L. Viscoelastic Mechanics: From Pathology and Cell Fate to Tissue Regeneration Biomaterial Development. Acs Appl. Mater. Interfaces 2025, 17, 8751–8770. [Google Scholar] [CrossRef]
Peng, C.; Guo, T.; Xie, C.; Bai, X.; Zhou, J.; Zhao, X.; He, E.; Xia, F. mBGT: Encoding brain signals with multimodal brain graph transformer. IEEE Trans. Consum. Electron. 2024, 71, 5812–5823. [Google Scholar] [CrossRef]
Kaltenbacher, B.; Rundell, W. Inverse Problems for Fractional Partial Differential Equations; American Mathematical Society: Providence, RI, USA, 2023; Volume 230. [Google Scholar]
Mubaraki, A.M.; Nuruddeen, R.I.; Gomez-Aguilar, J.F. Closed-form asymptotic solution for the transport of chlorine concentration in composite pipes. Phys. Scr. 2024, 99, 075201. [Google Scholar] [CrossRef]
Lin, C.H.; Liu, C.H.; Chien, L.S.; Chang, S.C. Accelerating pattern matching using a novel parallel algorithm on GPUs. IEEE Trans. Comput. 2012, 62, 1906–1916. [Google Scholar] [CrossRef]
Zhao, Y.L.; Gu, X.M.; Ostermann, A. A preconditioning technique for an all-at-once system from Volterra subdiffusion equations with graded time steps. J. Sci. Comput. 2021, 88, 11. [Google Scholar] [CrossRef]
He, J.H.; Anjum, N.; He, C.H.; Alsolami, A.A. Beyond laplace and fourier transforms: Challenges and future prospects. Therm. Sci. 2023, 27 Pt B, 5075–5089. [Google Scholar] [CrossRef]
Haubold, H.J.; Mathai, A.M.; Saxena, R.K. Mittag-Leffler functions and their applications. J. Appl. Math. 2011, 2011, 298628. [Google Scholar] [CrossRef]
Duffy, D.G. Green’s Functions with Applications; Chapman and Hall/CRC: Boca Raton, FL, USA, 2015. [Google Scholar]
Hedin, L. New method for calculating the one-particle Green’s function with application to the electron-gas problem. Phys. Rev. 1965, 139, A796. [Google Scholar] [CrossRef]
Rainer, B.; Kaltenbacher, B. Existence, uniqueness, and numerical solutions of the nonlinear periodic Westervelt equation. ESAIM Math. Model. Numer. Anal. 2025, 59, 2279–2304. [Google Scholar] [CrossRef]
Kumar, M.; Umesh. Recent development of Adomian decomposition method for ordinary and partial differential equations. Int. J. Appl. Comput. Math. 2022, 8, 81. [Google Scholar] [CrossRef]
Nadeem, M.; He, J.H.; Islam, A. The homotopy perturbation method for fractional differential equations: Part 1 Mohand transform. Int. J. Numer. Methods Heat Fluid Flow 2021, 31, 3490–3504. [Google Scholar] [CrossRef]
Shihab, M.A.; Taha, W.M.; Hameed, R.A.; Jameel, A.; Ibrahim, S.M. Implementation of variational iteration method for various types of linear and nonlinear partial differential equations. Int. J. Electr. Comput. Eng. 2023, 13, 2131–2141. [Google Scholar] [CrossRef]
Kamil Jassim, H.; Vahidi, J. A new technique of reduce differential transform method to solve local fractional PDEs in mathematical physics. Int. J. Nonlinear Anal. Appl. 2021, 12, 37–44. [Google Scholar]
Li, C.; Zeng, F. Finite difference methods for fractional differential equations. Int. J. Bifurc. Chaos 2012, 22, 1230014. [Google Scholar] [CrossRef]
Sacchetti, A.; Bachmann, B.; Löffel, K.; Künzi, U.M.; Paoli, B. Neural networks to solve partial differential equations: A comparison with finite elements. IEEE Access 2022, 10, 32271–32279. [Google Scholar] [CrossRef]
Sheng, C.; Cao, D.; Shen, J. Efficient spectral methods for PDEs with spectral fractional Laplacian. J. Sci. Comput. 2021, 88, 4. [Google Scholar] [CrossRef]
Rieder, A. A p-version of convolution quadrature in wave propagation. SIAM J. Numer. Anal. 2025, 63, 1729–1756. [Google Scholar] [CrossRef]
Figueroa, A.; Jackiewicz, Z.; Löhner, R. Explicit two-step Runge-Kutta methods for computational fluid dynamics solvers. Int. J. Numer. Methods Fluids 2021, 93, 429–444. [Google Scholar] [CrossRef]
Salem, M.G.; Abouelregal, A.E.; Elzayady, M.E.; Sedighi, H.M. Biomechanical response of skin tissue under ramp-type heating by incorporating a modified bioheat transfer model and the Atangana–Baleanu fractional operator. Acta Mech. 2024, 235, 5041–5060. [Google Scholar] [CrossRef]
Li, J.M.; Wang, X.J.; He, R.S.; Chi, Z.X. An efficient fine-grained parallel genetic algorithm based on gpu-accelerated. In Proceedings of the 2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007), Dalian, China, 18–21 September 2007; pp. 855–862. [Google Scholar]
Kelley, C.T. Solving Nonlinear Equations with Newton’s Method; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2003. [Google Scholar]
Langtangen, H.P. Solving Nonlinear ODE and PDE Problems; Center for Biomedical Computing, Simula Research Laboratory and Department of Informatics, University of Oslo: Oslo, Norway, 2016. [Google Scholar]
Nofal, T.A. Simple equation method for nonlinear partial differential equations and its applications. J. Egypt. Math. Soc. 2016, 24, 204–209. [Google Scholar] [CrossRef]
Ramos, H.; Monteiro, M.T.T. A new approach based on the Newton’s method to solve systems of nonlinear equations. J. Comput. Appl. Math. 2017, 318, 3–13. [Google Scholar] [CrossRef]
Noor, M.A.; Waseem, M. Some iterative methods for solving a system of nonlinear equations. Comput. Math. Appl. 2009, 57, 101–106. [Google Scholar] [CrossRef]
Dehghan, M.; Shirilord, A. Three-step iterative methods for numerical solution of systems of nonlinear equations. Eng. Comput. 2022, 38, 1015–1028. [Google Scholar] [CrossRef]
Darvishi, M.T.; Barati, A. Super cubic iterative methods to solve systems of nonlinear equations. Appl. Math. Comput. 2007, 188, 1678–1685. [Google Scholar] [CrossRef]
Sharma, J.R.; Guha, R.K.; Sharma, R. An efficient fourth order weighted-Newton method for systems of nonlinear equations. Numer. Algorithms 2013, 62, 307–323. [Google Scholar] [CrossRef]
Cordero, A.; Martínez, E.; Torregrosa, J.R. Iterative methods of order four and five for systems of nonlinear equations. J. Comput. Appl. Math. 2009, 231, 541–551. [Google Scholar] [CrossRef]
Hueso, J.L.; Martínez, E.; Teruel, C. Convergence, efficiency and dynamics of new fourth and sixth order families of iterative methods for nonlinear systems. J. Comput. Appl. Math. 2015, 275, 412–420. [Google Scholar] [CrossRef]
George, S.; Sadananda, R.; Padikkal, J.; Argyros, I.K. On the order of convergence of the Noor–Waseem method. Mathematics 2022, 10, 4544. [Google Scholar] [CrossRef]
Solaiman, O.S.; Hashim, I. An iterative scheme of arbitrary odd order and its basins of attraction for nonlinear systems. Comput. Mater. Contin. Comput. 2021, 66, 1427–1444. [Google Scholar] [CrossRef]
Bate, I.; Murugan, M.; George, S.; Senapati, K.; Argyros, I.K.; Regmi, S. On extending the applicability of iterative methods for solving systems of nonlinear equations. Axioms 2024, 13, 601. [Google Scholar] [CrossRef]
Petković, M.; Carstensen, C.; Trajković, M. Weierstrass formula and zero-finding methods. Numer. Math. 1995, 69, 353–372. [Google Scholar] [CrossRef]
Ehrlich, L.W. A modified Newton method for polynomials. Commun. ACM 1967, 10, 107–108. [Google Scholar] [CrossRef]
Cordero, A.; Torregrosa, J.R.; Triguero-Navarro, P. Jacobian-Free Vectorial Iterative Scheme to Find Simple Several Solutions Simultaneously. Math. Methods Appl. Sci. 2025, 48, 5718–5730. [Google Scholar] [CrossRef]
Petković, M. Computational efficiency of simultaneous methods. In Iterative Methods for Simultaneous Inclusion of Polynomial Zeros; Springer: Berlin/Heidelberg, Germany, 2006; pp. 221–249. [Google Scholar]
Zhang, R.; Bai, H.; Zhao, F. L1-Finite Difference Method for Inverse Source Problem of Fractional Diffusion Equation. J. Phys. Conf. Ser. 2020, 1624, 032001. [Google Scholar] [CrossRef]
King, C. Non-Linear Reaction-Diffusion of Oxygen in Biological Systems. Ph.D. Thesis, Washington State University, Pullman, WA, USA, 2023. [Google Scholar]
Pujol, M.J.; Grimalt, P. A non-linear model of cerebral diffusion: Stability of finite differences method and resolution using the Adomian method. Int. J. Numer. Methods Heat Fluid Flow 2003, 13, 473–485. [Google Scholar] [CrossRef]
Akay, M. (Ed.) Nonlinear Biomedical Signal Processing, Volume 2: Dynamic Analysis and Modeling; John Wiley & Sons: Hoboken, NJ, USA, 2000; Volume 2. [Google Scholar]
Toronov, V.; Myllylä, T.; Kiviniemi, V.; Tuchin, V.V. Dynamics of the brain: Mathematical models and non-invasive experimental studies. Eur. Phys. J. Spec. Top. 2013, 222, 2607–2622. [Google Scholar] [CrossRef]
David, S.A.; Valentim, C.A.; Debbouche, A. Fractional modeling applied to the dynamics of the action potential in cardiac tissue. Fractal Fract. 2022, 6, 149. [Google Scholar] [CrossRef]
Magin, R.L.; Ovadia, M. Modeling the cardiac tissue electrode interface using fractional calculus. J. Vib. Control 2008, 14, 1431–1442. [Google Scholar] [CrossRef]

$Fractalfract 09 00607 g001$

Figure 1. Percentage computational efficiency of the proposed parallel schemes, plotted as a function of problem dimension. Each subplot compares one scheme against another (as indicated in the legends), showing relative efficiency growth with increasing dimensions.

$Fractalfract 09 00607 g001$

$Fractalfract 09 00607 g002$

Figure 2. Fractal behavior of the iterative schemes for solving problem (50) with different values of

μ

. Subfigures (a–e) show the fractal graphs associated with

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

,

{ELM}^{[C_{3}]}

,

{CMM}_{1}^{[C_{3}]}

, and

{CMM}_{2}^{[C_{3}]}

, respectively.

Figure 2. Fractal behavior of the iterative schemes for solving problem (50) with different values of

μ

. Subfigures (a–e) show the fractal graphs associated with

{WDM}^{[C_{2}]}

,

{ACM}^{[C_{2}]}

,

{ELM}^{[C_{3}]}

,

{CMM}_{1}^{[C_{3}]}

, and

{CMM}_{2}^{[C_{3}]}

, respectively.

$Fractalfract 09 00607 g002$

$Fractalfract 09 00607 g003$

Figure 3. Flow chart of the hybrid parallel scheme for solving problem (55), illustrating the pointwise and diagonalized update strategies.

$Fractalfract 09 00607 g003$

$Fractalfract 09 00607 g004$

Figure 4. Approximate solutions of problem (62) obtained with the parallel scheme for different values of

σ

.

Figure 4. Approximate solutions of problem (62) obtained with the parallel scheme for different values of

σ

.

$Fractalfract 09 00607 g004$

$Fractalfract 09 00607 g005$

Figure 5. Absolute error surface plots of the parallel scheme for problem (62), shown for different values of

σ

.

Figure 5. Absolute error surface plots of the parallel scheme for problem (62), shown for different values of

σ

.

$Fractalfract 09 00607 g005$

$Fractalfract 09 00607 g006$

Figure 6. Approximate solution surfaces of the parallel scheme for problem (74), shown for different values of

σ

.

Figure 6. Approximate solution surfaces of the parallel scheme for problem (74), shown for different values of

σ

.

$Fractalfract 09 00607 g006$

$Fractalfract 09 00607 g007$

Figure 7. Absolute error surfaces of the parallel scheme for problem (74), shown for different values of

σ

.

Figure 7. Absolute error surfaces of the parallel scheme for problem (74), shown for different values of

σ

.

$Fractalfract 09 00607 g007$

$Fractalfract 09 00607 g008$

Figure 8. Approximate solution surfaces of the parallel scheme for problem (80), shown for different values of

σ

.

Figure 8. Approximate solution surfaces of the parallel scheme for problem (80), shown for different values of

σ

.

$Fractalfract 09 00607 g008$

$Fractalfract 09 00607 g009$

Figure 9. Absolute error surfaces of the parallel scheme for problem (80), shown for different values of

σ

.

Figure 9. Absolute error surfaces of the parallel scheme for problem (80), shown for different values of

σ

.

$Fractalfract 09 00607 g009$

Table 1. Arithmetic operation counts per iteration for different parallel iterative methods. Here,

φ^{[*]}

denotes the dominant operation count proportional to the problem size, and ℘ indicates the percentage computational efficiency relative to

{CMM}_{2}^{[C_{3}]}

.

Table 1. Arithmetic operation counts per iteration for different parallel iterative methods. Here,

φ^{[*]}

denotes the dominant operation count proportional to the problem size, and ℘ indicates the percentage computational efficiency relative to

{CMM}_{2}^{[C_{3}]}

.

Metric	${WDM}^{[C_{2}]}$	${ACM}^{[C_{2}]}$	${ELM}^{[C_{3}]}$	${CMM}_{1}^{[C_{3}]}$
Additions/Subtractions	$3 φ^{[*]}$	$4 φ^{[*]}$	$5 φ^{[*]}$	$4 φ^{[*]}$
Multiplications	$2 φ^{[*]}$	$3 φ^{[*]}$	$2 φ^{[*]}$	$2 φ^{[*]}$
Efficiency $℘ [ς_{i}, {CMM}_{2}^{[C_{3}]}]$	$35 %$	$45 %$	$35 %$	$27 %$

Table 2. Numerical results of the dynamical analysis for solving (50) using different parallel iterative schemes. The metrics include the number of iterations (n), maximum error, percentage convergence (Per-C), arithmetic operations per iteration (

[+, -, \times, \div]

), memory usage (MB), and elapsed time (s).

Table 2. Numerical results of the dynamical analysis for solving (50) using different parallel iterative schemes. The metrics include the number of iterations (n), maximum error, percentage convergence (Per-C), arithmetic operations per iteration (

[+, -, \times, \div]

), memory usage (MB), and elapsed time (s).

Method	n	Max-Error	Per-C	Ops [ $+, -, \times, \div$ ]	Memory (MB)	Elapsed Time (s)
${WDM}^{[C_{2}]}$	20	$1.5 \times 10^{- 3}$	$11.09 %$	19	87.657	87.657
${ACM}^{[C_{2}]}$	9	$1.5 \times 10^{- 7}$	$19.76 %$	65	55.657	55.657
${ELM}^{[C_{3}]}$	12	$1.5 \times 10^{- 15}$	$55.76 %$	53	47.764	47.764
${CMM}_{1}^{[C_{3}]}$	8	$1.5 \times 10^{- 18}$	$63.54 %$	54	45.567	45.567
${CMM}_{2}^{[C_{3}]}$	7	$1.5 \times 10^{- 35}$	$87.87 %$	36	34.453	34.453

Table 3. Maximum error norms of parallel scheme

{CMM}_{2}^{[C_{3}]}

without MATLAB parfor for different fractional orders

σ

. Results are reported for varying grid sizes.

Table 3. Maximum error norms of parallel scheme

{CMM}_{2}^{[C_{3}]}

without MATLAB parfor for different fractional orders

σ

. Results are reported for varying grid sizes.

Grid Points	${∥ . ∥}_{2}$ -Norm	${∥ . ∥}_{\infty}$ -Norm	CPU Time (s)
	$σ = 0.1$
30, 50	$8.543 \times 10^{- 5}$	$5.034 \times 10^{- 6}$	0.056
60, 90	$5.024 \times 10^{- 6}$	$5.053 \times 10^{- 6}$	0.120
120, 180	$2.753 \times 10^{- 6}$	$1.235 \times 10^{- 7}$	0.250
	$σ = 0.3$
30, 50	$9.541 \times 10^{- 16}$	$5.376 \times 10^{- 11}$	0.060
60, 90	$5.540 \times 10^{- 13}$	$1.897 \times 10^{- 13}$	0.125
120, 180	$1.587 \times 10^{- 12}$	$8.760 \times 10^{- 15}$	0.260
	$σ = 0.5$
30, 50	$7.565 \times 10^{- 17 †}$	$1.008 \times 10^{- 15}$	0.065
60, 90	$1.554 \times 10^{- 18 †}$	$5.744 \times 10^{- 17 †}$	0.130
120, 180	$3.577 \times 10^{- 17 †}$	$9.898 \times 10^{- 16 †}$	0.270
	$σ = 0.7$
30, 50	$5.340 \times 10^{- 22 †}$	$1.744 \times 10^{- 21 †}$	0.070
60, 90	$6.589 \times 10^{- 23 †}$	$2.395 \times 10^{- 22 †}$	0.135
120, 180	$7.550 \times 10^{- 21 †}$	$7.890 \times 10^{- 21 †}$	0.280
	$σ = 0.9$
30, 50	$3.567 \times 10^{- 26 †}$	$1.535 \times 10^{- 29 †}$	0.075
60, 90	$4.776 \times 10^{- 27 †}$	$3.876 \times 10^{- 27 †}$	0.145
120, 180	$1.755 \times 10^{- 27 †}$	$4.890 \times 10^{- 26 †}$	0.301

^† All computations were performed using MATLAB VPA with digits = 64 and a numerical tolerance of 10⁻³⁰.

Table 4. Error comparison between parallel schemes for solving (71) with

σ \approx 1

, under Criterion I and Criterion II, without MATLAB parfor.

Table 4. Error comparison between parallel schemes for solving (71) with

σ \approx 1

, under Criterion I and Criterion II, without MATLAB parfor.

Metric	${WDM}^{[C_{2}]}$	${ACM}^{[C_{2}]}$	${ELM}^{[C_{3}]}$	${CMM}_{1}^{[C_{3}]}$	${CMM}_{2}^{[C_{3}]}$
		Criterion I
${∥ . ∥}_{2}$ -norm	$1.10 \times 10^{- 8}$	$1.35 \times 10^{- 8}$	$4.80 \times 10^{- 9}$	$1.89 \times 10^{- 19 †}$	$1.39 \times 10^{- 27 †}$
${∥ . ∥}_{\infty}$ -norm	$4.51 \times 10^{- 5}$	$6.72 \times 10^{- 6}$	$5.84 \times 10^{- 9}$	$7.06 \times 10^{- 23 †}$	$9.44 \times 10^{- 24 †}$
		Criterion II
${∥ . ∥}_{2}$ -norm	$5.70 \times 10^{- 6}$	$3.00 \times 10^{- 8}$	$1.65 \times 10^{- 7}$	$1.50 \times 10^{- 21 †}$	$1.04 \times 10^{- 25 †}$
${∥ . ∥}_{\infty}$ -norm	$9.41 \times 10^{- 6}$	$5.35 \times 10^{- 7}$	$3.00 \times 10^{- 10}$	$1.14 \times 10^{- 23 †}$	$4.10 \times 10^{- 26 †}$

^† All computations were performed using MATLAB variable precision arithmetic (VPA) with digits = 64 and a numerical tolerance of

10^{- 30}

.

Table 5. Overall performance of parallel schemes for solving (71) without MATLAB parfor. Here, “Basic Ops” denotes the number of basic arithmetic operations (

+, -, \times, \div

).

Table 5. Overall performance of parallel schemes for solving (71) without MATLAB parfor. Here, “Basic Ops” denotes the number of basic arithmetic operations (

+, -, \times, \div

).

Metric	Iter. (n)	Max Error	Conv. (%)	Basic Ops	Memory (MB)	COC
${WDM}^{[C_{2}]}$	13	$4.51 \times 10^{- 5}$	11.09	47	87.657	2.0014
${ACM}^{[C_{2}]}$	13	$6.72 \times 10^{- 6}$	19.76	51	55.657	2.0346
${ELM}^{[C_{3}]}$	11	$1.65 \times 10^{- 7}$	55.76	50	47.764	1.9993
${CMM}_{1}^{[C_{3}]}$	9	$1.89 \times 10^{- 11}$	63.54	54	45.567	3.1164
${CMM}_{2}^{[C_{3}]}$	9	$4.10 \times 10^{- 12}$	87.87	36	34.453	3.0087

Table 6. Discretization details of the bio-heat transfer problem used for generating random initial guess vectors.

Application	Domain $(x, t)$	Recorded Variables	Approx. Data Size
Bio-heat transfer	$x \in [0, 1], t \in [0, 1]$	$x_{i}, t^{k}, u_{i - 1}^{k - 1}, u_{i}^{k - 1}, u_{i + 1}^{k - 1}$	$\sim M \times N$

Table 7. Example of a random initial guess vector drawn from 100 MATLAB-generated test samples.

$x_{i}$	$t^{k}$	$u_{i - 1}^{k - 1}$	$u_{i}^{k - 1}$	$u_{i + 1}^{k - 1}$	$u_{i}^{k}$
0.03333	0.02000	0.98020	0.96722	0.95444	0.96724
0.73333	0.04000	0.47924	0.47237	0.46561	0.47240
0.46667	0.08000	0.62101	0.61653	0.61208	0.61655
⋮ (remaining entries omitted)

Table 8. Maximum error outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (71).

Table 8. Maximum error outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (71).

$σ$	0.1	0.3	0.5	0.7	0.9	Mem U
		Using Criterion I
${CMM}_{1}^{[C_{3}]}$	$1.09 \times 10^{- 3}$	$0.67 \times 10^{- 7}$	$2.43 \times 10^{- 11}$	$0.16 \times 10^{- 15}$	$2.52 \times 10^{- 19 †}$	$57.155$
${CMM}_{2}^{[C_{3}]}$	$3.10 \times 10^{- 4}$	$5.67 \times 10^{- 6}$	$0.45 \times 10^{- 10}$	$9.05 \times 10^{- 13}$	$0.41 \times 10^{- 17 †}$	$56.007$
		Using Criterion II
${CMM}_{1}^{[C_{3}]}$	$0.09 \times 10^{- 3}$	$2.76 \times 10^{- 5}$	$3.07 \times 10^{- 9}$	$0.69 \times 10^{- 11}$	$1.17 \times 10^{- 16 †}$	$56.644$
${CMM}_{2}^{[C_{3}]}$	$6.25 \times 10^{- 5}$	$5.13 \times 10^{- 7}$	$4.55 \times 10^{- 14}$	$1.13 \times 10^{- 19 †}$	$1.67 \times 10^{- 21 †}$	$49.177$

^† All computations were performed using MATLAB variable precision arithmetic (VPA) with digits = 64, employing a numerical tolerance of 10⁻³⁰.

Table 9. Consistency analysis using random initial guess vectors for parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented with MATLAB parfor for solving (71).

Table 9. Consistency analysis using random initial guess vectors for parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented with MATLAB parfor for solving (71).

Metric	Average Iterations	COC	CPU Time (s)	Percentage Convergence	Memory Usage (MB)
Using Criterion I
${CMM}_{1}^{[C_{3}]}$	17	$4.0975$	$1.3425$	$76.77 %$	$41.14$
${CMM}_{2}^{[C_{3}]}$	16	$6.0126$	$1.6766$	$97.56 %$	$46.60$
Using Criterion II
${CMM}_{1}^{[C_{3}]}$	15	$6.0126$	$1.6766$	$97.56 %$	$47.12$
${CMM}_{2}^{[C_{3}]}$	16	$6.0126$	$1.6766$	$97.56 %$	$46.40$

Table 10. Outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented with MATLAB parfor for solving (71). Reported errors are capped at double-precision limits; see footnote for full VPA residuals.

Table 10. Outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented with MATLAB parfor for solving (71). Reported errors are capped at double-precision limits; see footnote for full VPA residuals.

Metric	$T_{seri}$ (s)	$T_{para}$ (s)	$ϕ_{speed}$	Maximum Error	Percentage Convergence	Memory Usage (MB)
Using Criterion I
${CMM}_{1}^{[C_{3}]}$	1.45	0.52	2.79	$\leq 2.12 \times 10^{- 17 †}$	86.01%	35.03
${CMM}_{2}^{[C_{3}]}$	3.87	1.35	2.87	$\leq 2.22 \times 10^{- 17 †}$	87.56%	36.11
Using Criterion II
${CMM}_{1}^{[C_{3}]}$	8.59	2.95	2.91	$\leq 2.22 \times 10^{- 16 †}$	91.57%	29.11
${CMM}_{2}^{[C_{3}]}$	9.14	3.10	2.95	$\leq 2.22 \times 10^{- 16 †}$	92.12%	23.12

^† All computations used MATLAB VPA (digits = 64). Raw VPA residuals for the maximum error were

2.13 \times 10^{- 28}

,

9.85 \times 10^{- 29}

(Criterion I) and

1.573 \times 10^{- 27}

,

1.567 \times 10^{- 30}

(Criterion II). For consistent reporting across floating-point environments, the displayed values are

min {VPA residual, 2.22 \times 10^{- 16}}

, ensuring comparability with IEEE double precision.

Table 11. Maximum error norm of parallel scheme

{CMM}_{2}^{[C_{3}]}

without MATLAB parfor function for different

σ

values.

Table 11. Maximum error norm of parallel scheme

{CMM}_{2}^{[C_{3}]}

without MATLAB parfor function for different

σ

values.

Grid Points	${∥ . ∥}_{2}$ -Norm	${∥ . ∥}_{\infty}$ -Norm	C-Time
	$σ = 0.1$
30, 50	$7.71 \times 10^{- 4}$	$6.00 \times 10^{- 3}$	0.058
60, 90	$1.99 \times 10^{- 4}$	$7.07 \times 10^{- 3}$	0.115
120, 180	$1.87 \times 10^{- 3}$	$6.24 \times 10^{- 2}$	0.235
	$σ = 0.3$
30, 50	$1.00 \times 10^{- 6}$	$3.37 \times 10^{- 6}$	0.065
60, 90	$7.98 \times 10^{- 7}$	$1.15 \times 10^{- 5}$	0.134
120, 180	$6.48 \times 10^{- 5}$	$1.18 \times 10^{- 5}$	0.260
	$σ = 0.5$
30, 50	$7.12 \times 10^{- 9}$	$1.00 \times 10^{- 11}$	0.070
60, 90	$1.50 \times 10^{- 8}$	$5.47 \times 10^{- 9}$	0.135
120, 180	$4.55 \times 10^{- 7}$	$9.90 \times 10^{- 8}$	0.270
	$σ = 0.7$
30, 50	$1.60 \times 10^{- 15}$	$1.74 \times 10^{- 14}$	0.075
60, 90	$6.00 \times 10^{- 13}$	$2.33 \times 10^{- 11}$	0.140
120, 180	$7.55 \times 10^{- 12}$	$7.90 \times 10^{- 15}$	0.283
	$σ = 0.9$
30, 50	$5.65 \times 10^{- 29 †}$	$1.03 \times 10^{- 27 †}$	0.080
60, 90	$4.38 \times 10^{- 25 †}$	$3.18 \times 10^{- 24 †}$	0.146
120, 180	$1.76 \times 10^{- 25 †}$	$4.14 \times 10^{- 23 †}$	0.311

^† All computations were performed using MATLAB VPA with digits = 64, employing a numerical tolerance of

10^{- 30}

.

Table 12. Error comparison between parallel schemes for solving (79) with

σ \approx 1

using Criteria I and II without using the parfor function in MATLAB.

Table 12. Error comparison between parallel schemes for solving (79) with

σ \approx 1

using Criteria I and II without using the parfor function in MATLAB.

Metric	${WDM}^{[C_{2}]}$	${ACM}^{[C_{2}]}$	${ELM}^{[C_{3}]}$	${CMM}_{1}^{[C_{3}]}$	${CMM}_{2}^{[C_{3}]}$
		Using Criterion I
${∥ . ∥}_{2}$ -norm	$1.03 \times 10^{- 5}$	$8.39 \times 10^{- 9}$	$6.17 \times 10^{- 11}$	$6.00 \times 10^{- 25 †}$	$5.54 \times 10^{- 20 †}$
${∥ . ∥}_{\infty}$ -norm	$2.05 \times 10^{- 6}$	$2.63 \times 10^{- 8}$	$8.70 \times 10^{- 12}$	$1.47 \times 10^{- 22 †}$	$6.01 \times 10^{- 27 †}$
		Using Criterion II
${∥ . ∥}_{2}$ -norm	$1.53 \times 10^{- 6}$	$3.17 \times 10^{- 7}$	$1.05 \times 10^{- 12}$	$9.30 \times 10^{- 19 †}$	$5.04 \times 10^{- 23 †}$
${∥ . ∥}_{\infty}$ -norm	$6.04 \times 10^{- 4}$	$3.40 \times 10^{- 10}$	$8.50 \times 10^{- 11}$	$1.31 \times 10^{- 25 †}$	$4.50 \times 10^{- 29 †}$

^† All computations were performed using MATLAB variable precision arithmetic (VPA) with digits = 64, employing a numerical tolerance of

10^{- 30}

.

Table 13. Overall performance of parallel schemes for solving (79) without MATLAB parfor parallelization.

Metric	Iterations (n)	Max- Error	Percentage Convergence	Basic Ops $[\pm, \times, \div]$	Memory Usage (MB)	COC
${WDM}^{[C_{2}]}$	23	$4.55 \times 10^{- 5}$	35.09%	47	76.147	2.00
${ACM}^{[C_{2}]}$	21	$5.08 \times 10^{- 8}$	41.96%	51	67.347	2.03
${ELM}^{[C_{3}]}$	16	$6.51 \times 10^{- 10}$	65.13%	50	53.704	2.01
${CMM}_{1}^{[C_{3}]}$	11	$7.52 \times 10^{- 13}$	77.04%	54	49.500	3.12
${CMM}_{2}^{[C_{3}]}$	10	$8.55 \times 10^{- 15}$	93.98%	36	44.413	3.00

Table 14. Discretization details of the bio-heat transfer problem used for generating random initial guess vectors.

Application	Domain $(x, t)$	Recorded Variables	Approx. Data Size
Bio-heat transfer	$x \in [0, 1], t \in [0, 1]$	$x_{i}, t^{k}, u_{i - 1}^{k - 1}, u_{i}^{k - 1}, u_{i + 1}^{k - 1}$	$\sim M \times N$

Table 15. Example of a random initial guess vector drawn from 100 MATLAB-generated test samples.

$x_{i}$	$t^{k}$	$u_{i - 1}^{k - 1}$	$u_{i}^{k - 1}$	$u_{i + 1}^{k - 1}$	$u_{i}^{k}$
0.03333	0.02000	2.1967	2.1976	2.1984	2.1980
0.73333	0.04000	2.8693	2.8712	2.8730	2.8721
0.46667	0.08000	2.6587	2.6599	2.6610	2.6605
⋮ (remaining entries omitted)

Table 16. Maximum error outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (79).

Table 16. Maximum error outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (79).

$σ$	0.1	0.3	0.5	0.7	0.9	Mem U
		Using Criteria-I
${CMM}_{1}^{[C_{3}]}$	$2.50 \times 10^{- 6}$	$7.55 \times 10^{- 7}$	$4.00 \times 10^{- 13}$	$7.56 \times 10^{- 21 †}$	$4.25 \times 10^{- 27 †}$	$67.100$
${CMM}_{2}^{[C_{3}]}$	$4.50 \times 10^{- 6}$	$8.45 \times 10^{- 9}$	$9.80 \times 10^{- 18}$	$6.54 \times 10^{- 25 †}$	$6.55 \times 10^{- 30 †}$	$69.537$
		Using Criteria-II
${CMM}_{1}^{[C_{3}]}$	$2.50 \times 10^{- 4}$	$3.56 \times 10^{- 6}$	$7.50 \times 10^{- 14}$	$3.56 \times 10^{- 19 †}$	$7.52 \times 10^{- 23 †}$	$58.641$
${CMM}_{2}^{[C_{3}]}$	$5.52 \times 10^{- 5}$	$7.54 \times 10^{- 8}$	$3.65 \times 10^{- 15}$	$4.65 \times 10^{- 23 †}$	$1.14 \times 10^{- 26 †}$	$51.112$

^† All computations were performed using MATLAB variable precision arithmetic (VPA) with digits = 64, employing a numerical tolerance of

10^{- 30}

.

Table 17. Consistency analysis using random initial guess vectors for parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented with MATLAB parfor for solving (79).

Table 17. Consistency analysis using random initial guess vectors for parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented with MATLAB parfor for solving (79).

Metric	Average Iterations	COC	CPU Time (s)	Percentage Convergence	Memory Usage (MB)
Using Criterion I
${CMM}_{1}^{[C_{3}]}$	17	$2.0905$	$2.3995$	$76.47 %$	$41.14$
${CMM}_{2}^{[C_{3}]}$	16	$3.0336$	$2.1887$	$97.64 %$	$46.60$
Using Criterion II
${CMM}_{1}^{[C_{3}]}$	15	$2.0126$	$2.1166$	$91.33 %$	$47.12$
${CMM}_{2}^{[C_{3}]}$	16	$3.3268$	$2.0066$	$94.57 %$	$46.40$

Table 18. Outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (79). Errors are reported with a double-precision cap; see footnote for VPA residuals.

Table 18. Outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (79). Errors are reported with a double-precision cap; see footnote for VPA residuals.

Metric	$T_{seri}$ (s)	$T_{para}$ (s)	$ϕ_{speed}$	Maximum Error	Percentage Convergence	Memory Usage (MB)
		Using Criterion I
${CMM}_{1}^{[C_{3}]}$	$2.34$	$1.67$	$1.40$	$\leq 2.22 \times 10^{- 16 †}$	$91.47 %$	$25.93$
${CMM}_{2}^{[C_{3}]}$	$3.91$	$1.05$	$3.73$	$\leq 2.22 \times 10^{- 16 †}$	$93.17 %$	$20.12$
		Using Criterion II
${CMM}_{1}^{[C_{3}]}$	$7.61$	$2.45$	$3.11$	$\leq 2.22 \times 10^{- 16 †}$	$95.57 %$	$19.19$
${CMM}_{2}^{[C_{3}]}$	$8.81$	$2.81$	$3.13$	$\leq 2.22 \times 10^{- 16 †}$	$97.78 %$	$17.20$

^† All computations used MATLAB VPA (with digits = 64). Raw VPA residuals for Max-Error were

1.3 \times 10^{- 31}

,

4.1 \times 10^{- 27}

(Criterion I) and

1.1 \times 10^{- 26}

,

5.1 \times 10^{- 23}

(Criterion II). For consistent reporting across floating-point environments, the displayed Max-Error values are

min {VPA residual, 2.22 \times 10^{- 16}}

, ensuring comparability with IEEE double precision.

Table 19. Maximum error norms of parallel scheme

{CMM}_{2}^{[C_{3}]}

without MATLAB parfor for different

σ

values.

Table 19. Maximum error norms of parallel scheme

{CMM}_{2}^{[C_{3}]}

without MATLAB parfor for different

σ

values.

Grid Points	${∥ . ∥}_{2}$ -Norm	${∥ . ∥}_{\infty}$ -Norm	CPU Time (s)
	$σ = 0.1$
$30, 50$	$9.030 \times 10^{- 4}$	$1.117 \times 10^{- 3}$	$0.049$
$60, 90$	$1.051 \times 10^{- 4}$	$9.853 \times 10^{- 4}$	$0.124$
$120, 180$	$7.790 \times 10^{- 4}$	$5.201 \times 10^{- 3}$	$0.258$
	$σ = 0.3$
$30, 50$	$1.901 \times 10^{- 8}$	$9.376 \times 10^{- 9}$	$0.065$
$60, 90$	$5.587 \times 10^{- 9}$	$7.807 \times 10^{- 7}$	$0.129$
$120, 180$	$1.007 \times 10^{- 8}$	$7.056 \times 10^{- 8}$	$0.208$
	$σ = 0.5$
$30, 50$	$6.155 \times 10^{- 15}$	$1.110 \times 10^{- 14}$	$0.071$
$60, 90$	$5.009 \times 10^{- 12}$	$5.004 \times 10^{- 11}$	$0.135$
$120, 180$	$6.509 \times 10^{- 15}$	$9.008 \times 10^{- 14}$	$0.286$
	$σ = 0.7$
$30, 50$	$1.117 \times 10^{- 19 †}$	$1.727 \times 10^{- 21 †}$	$0.081$
$60, 90$	$1.009 \times 10^{- 17 †}$	$3.345 \times 10^{- 20 †}$	$0.142$
$120, 180$	$5.450 \times 10^{- 19 †}$	$2.010 \times 10^{- 19 †}$	$0.299$
	$σ = 0.9$
$30, 50$	$3.337 \times 10^{- 30 †}$	$7.598 \times 10^{- 29 †}$	$0.087$
$60, 90$	$7.060 \times 10^{- 29 †}$	$8.760 \times 10^{- 28 †}$	$0.237$
$120, 180$	$7.150 \times 10^{- 27 †}$	$4.840 \times 10^{- 25 †}$	$0.358$

^† All computations were performed using MATLAB VPA with digits = 64, employing a numerical tolerance of

10^{- 30}

.

Table 20. Error comparison of parallel schemes for solving (86) with

σ \approx 1

, using Criterion I and Criterion II without MATLAB parfor.

Table 20. Error comparison of parallel schemes for solving (86) with

σ \approx 1

, using Criterion I and Criterion II without MATLAB parfor.

Metric	${WDM}^{[C_{2}]}$	${ACM}^{[C_{2}]}$	${ELM}^{[C_{3}]}$	${CMM}_{1}^{[C_{3}]}$	${CMM}_{2}^{[C_{3}]}$
		Using Criterion I
${∥ . ∥}_{2}$ -norm	$1.05 \times 10^{- 5}$	$9.16 \times 10^{- 8}$	$5.43 \times 10^{- 11}$	$5.76 \times 10^{- 23 †}$	$6.37 \times 10^{- 29 †}$
${∥ . ∥}_{\infty}$ -norm	$7.11 \times 10^{- 4}$	$2.98 \times 10^{- 6}$	$9.75 \times 10^{- 11}$	$4.87 \times 10^{- 19 †}$	$1.07 \times 10^{- 27 †}$
		Using Criterion II
${∥ . ∥}_{2}$ -norm	$3.53 \times 10^{- 4}$	$1.41 \times 10^{- 7}$	$9.39 \times 10^{- 14}$	$2.30 \times 10^{- 23 †}$	$1.04 \times 10^{- 26 †}$
${∥ . ∥}_{\infty}$ -norm	$1.14 \times 10^{- 6}$	$7.07 \times 10^{- 13}$	$4.00 \times 10^{- 26}$	$1.40 \times 10^{- 28 †}$	$3.45 \times 10^{- 21 †}$

^† All computations were performed using MATLAB variable precision arithmetic (VPA) with digits = 64, employing a numerical tolerance of

10^{- 30}

.

Table 21. Overall performance of parallel schemes for solving (86) without MATLAB parfor parallelization.

Metric	Iterations (n)	Max- Error	Percentage Convergence	Basic Ops $[\pm, \times, \div]$	Memory Usage (MB)	COC
${WDM}^{[C_{2}]}$	23	$7.65 \times 10^{- 7}$	35.05%	47	91.147	2.00
${ACM}^{[C_{2}]}$	17	$6.50 \times 10^{- 7}$	39.56%	51	86.000	1.99
${ELM}^{[C_{3}]}$	16	$5.24 \times 10^{- 9}$	57.74%	50	68.764	1.86
${CMM}_{1}^{[C_{3}]}$	15	$3.64 \times 10^{- 10}$	76.55%	54	64.007	3.01
${CMM}_{2}^{[C_{3}]}$	9	$5.70 \times 10^{- 14}$	93.86%	36	59.453	3.11

Table 22. Discretization details of the bio-heat transfer problem used for generating random initial guess vectors.

Application	Domain $(x, t)$	Recorded Variables	Data Size (Approx.)
Bio-heat transfer	$x \in [0, 1], t \in [0, 1]$	$x_{i}, t^{k}, u_{i - 1}^{k - 1}, u_{i}^{k - 1}, u_{i + 1}^{k - 1}$	$\sim M \times N$

Table 23. Example of a random initial guess vector drawn from 100 MATLAB-generated test samples.

$x_{i}$	$t^{k}$	$u_{i - 1}^{k - 1}$	$u_{i}^{k - 1}$	$u_{i + 1}^{k - 1}$	$u_{i}^{k}$
0.03333	0.02000	0.00000	0.00000	0.00000	0.01967
0.73333	0.04000	0.00000	0.01967	0.00000	0.02533
0.46667	0.08000	0.02533	0.00000	0.00000	0.06133
⋮ (remaining entries omitted)

Table 24. Maximum error outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (86).

Table 24. Maximum error outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (86).

$σ$	0.1	0.3	0.5	0.7	0.9	Mem U
		Using Criterion I
${CMM}_{1}^{[C_{3}]}$	$8.70 \times 10^{- 4}$	$7.13 \times 10^{- 8}$	$6.87 \times 10^{- 11}$	$3.47 \times 10^{- 23}$	$8.11 \times 10^{- 27}$	$71.906$
${CMM}_{2}^{[C_{3}]}$	$7.64 \times 10^{- 5}$	$9.51 \times 10^{- 9}$	$3.34 \times 10^{- 15}$	$2.21 \times 10^{- 19}$	$7.00 \times 10^{- 25}$	$56.617$
		Using Criterion II
${CMM}_{1}^{[C_{3}]}$	$4.53 \times 10^{- 3}$	$4.32 \times 10^{- 6}$	$1.80 \times 10^{- 12}$	$7.98 \times 10^{- 23 †}$	$2.70 \times 10^{- 26 †}$	$66.755$
${CMM}_{2}^{[C_{3}]}$	$3.05 \times 10^{- 5}$	$1.72 \times 10^{- 7}$	$5.77 \times 10^{- 14}$	$3.60 \times 10^{- 26 †}$	$3.41 \times 10^{- 29 †}$	$51.975$

^† All computations were performed using MATLAB variable precision arithmetic (VPA) with digits = 64, employing a numerical tolerance of

10^{- 30}

.

Table 25. Consistency analysis using random initial guess vectors in parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented in MATLAB parfor for solving (86).

Table 25. Consistency analysis using random initial guess vectors in parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented in MATLAB parfor for solving (86).

Method	Average Iterations	COC	CPU Time (s)	Percentage Convergence	Memory Usage (MB)
Using Criterion I
${CMM}_{1}^{[C_{3}]}$	23	$2.0995$	$2.3005$	$74.74 %$	$87.20$
${CMM}_{2}^{[C_{3}]}$	19	$3.0746$	$2.8745$	$97.91 %$	$76.60$
Using Criterion II
${CMM}_{1}^{[C_{3}]}$	14	$2.1187$	$2.6146$	$87.06 %$	$81.12$
${CMM}_{2}^{[C_{3}]}$	8	$3.0199$	$2.6006$	$93.33 %$	$64.40$

Table 26. Outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (86). Errors are reported with a double-precision cap; see footnote for VPA residuals.

Table 26. Outcomes of parallel schemes

{CMM}_{1}^{[C_{3}]}

–

{CMM}_{2}^{[C_{3}]}

implemented via MATLAB parfor for solving (86). Errors are reported with a double-precision cap; see footnote for VPA residuals.

Metric	$T_{seri}$ (s)	$T_{para}$ (s)	$ϕ_{speed}$	Maximum Error	Percentage Convergence	Memory Usage (MB)
		Using Criterion I
${CMM}_{1}^{[C_{3}]}$	$3.87$	$1.99$	$1.94$	$\leq 2.22 \times 10^{- 16 †}$	$80.14 %$	$45.13$
${CMM}_{2}^{[C_{3}]}$	$4.90$	$1.37$	$3.59$	$\leq 2.22 \times 10^{- 16 †}$	$83.55 %$	$39.23$
		Using Criterion II
${CMM}_{1}^{[C_{3}]}$	$7.59$	$3.15$	$2.41$	$\leq 2.22 \times 10^{- 16 †}$	$89.34 %$	$31.88$
${CMM}_{2}^{[C_{3}]}$	$8.15$	$2.17$	$3.76$	$\leq 2.22 \times 10^{- 16 †}$	$93.01 %$	$29.69$

^† All computations used MATLAB VPA (with digits = 64). Raw VPA residuals for Max-Error were

8.3 \times 10^{- 27}

,

2.0 \times 10^{- 29}

(Criterion I) and

6.0 \times 10^{- 28}

,

1.9 \times 10^{- 25}

(Criterion II). For consistent reporting across floating-point environments, the displayed Max-Error values are

min {VPA residual, 2.22 \times 10^{- 16}}

, ensuring comparability with IEEE double precision.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Shams, M.; Carpentieri, B. Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures. Fractal Fract. 2025, 9, 607. https://doi.org/10.3390/fractalfract9090607

AMA Style

Shams M, Carpentieri B. Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures. Fractal and Fractional. 2025; 9(9):607. https://doi.org/10.3390/fractalfract9090607

Chicago/Turabian Style

Shams, Mudassir, and Bruno Carpentieri. 2025. "Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures" Fractal and Fractional 9, no. 9: 607. https://doi.org/10.3390/fractalfract9090607

APA Style

Shams, M., & Carpentieri, B. (2025). Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures. Fractal and Fractional, 9(9), 607. https://doi.org/10.3390/fractalfract9090607

Article Menu

Efficient Hybrid Parallel Scheme for Caputo Time-Fractional PDEs on Multicore Architectures

Abstract

1. Introduction

2. Construction and Analysis of the Next-Generation Computational Schemes

2.1. Construction of the Scheme

2.2. Theoretical Convergence Analysis

3. Computational Efficiency and Numerical Outcomes

3.1. Implementation of Methodology, Convergence Enhancement, and Result Visualization

3.2. Applications in Biomedical Engineering

3.2.1. Drug Diffusion in Tissue with Nonlinear Reaction [49]

3.2.2. Brain Signal Propagation with Nonlinear Blood Flow Effects [51]

3.2.3. Fractional Heart Tissue Electrical Conduction with Nonlinear Reaction [53]

3.3. Comparative Discussion of Biomedical Examples

4. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Abbreviations

Appendix A. Symbolic Verification of Biomedical Models

Appendix A.1. Drug Diffusion in Tissue with Nonlinear Reaction

Appendix A.2. Brain Signal Propagation with Nonlinear Blood Flow Effects

Appendix A.3. Fractional Heart Tissue Electrical Conduction with Nonlinear Reaction

Appendix A.4. Remarks

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI