1. Introduction
Coronavirus disease 2019 (COVID-19) has become the most severe infectious lung disease affecting humanity and has spread to more than 200 countries and regions. As of 26 February 2022, more than 443 million cases had been confirmed globally, and more than 5.9 million people had died of the disease. The COVID-19 pandemic is currently considered one of the most significant health challenges, endangering both human health and economic development. The pathogen causing the new coronavirus pneumonia is severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, hereinafter referred to as COVID-19) [1], which is extremely contagious. Furthermore, it has recently been discovered that the virus has mutated into a series of new strains called VUI2020/01, Delta and Omicron, which are more contagious and have a higher transmission rate than the original coronavirus strain [2]. Infections with these new mutant viruses have been reported almost all over the world, which further complicates prevention and control. In this grim situation, the top priority should be to detect infections as quickly as possible and exclude noncarriers so that targeted preventive measures can be implemented based on the testing results.
The testing speed for COVID-19 depends not only on the specific testing protocol but also on the selected testing method. Improvements in testing protocols, such as whole-genome sequencing [3,4], RT-PCR [5,6,7,8,9], CRISPR [10,11,12,13], and RT-LAMP [14,15], can shorten the testing time of individual samples. However, the highly contagious coronavirus has also created a large-scale, urgent need for widespread testing. Hence, relying only on improved testing protocols will not significantly improve testing efficiency when testing equipment and/or testing products are limited. Under these circumstances, the choice of testing method becomes crucial: a method suited to the characteristics of COVID-19 can greatly reduce testing times while ensuring testing accuracy.
In nucleic acid screening, the conventional method tests samples one by one and is known as individual-sample testing. Individual-sample testing ensures timely and effective results for relatively small sample sizes, but not for large ones. Screening a large number of samples of a highly infectious disease (e.g., millions of samples) in this way requires considerable manpower, material and financial resources. Because the process is time-consuming, it delays the formulation and implementation of countermeasures. On 6 August 2020, Liverpool, in the northwest of the United Kingdom, launched the first COVID-19 testing pilot project in England to avoid overwhelming local hospitals during the second wave of the epidemic. The project used conventional individual-sample testing and was expected to complete nucleic acid testing of 500,000 residents of Liverpool within two weeks. However, due to the enormous workload, only 200,000 samples were tested in the two weeks, which squandered an excellent opportunity to prevent the worsening of the epidemic. In this case, pooled testing would have been a better option, because it significantly reduces testing times and costs when the prevalence of the disease is low [16,17].
Pooled testing refers to mixing multiple samples before testing. If the result is negative, it indicates that none of the individuals in the pool are infected; if it is positive, individual-sample testing must be carried out to identify the infected person [18,19]. Majid et al. [20] recognized that pooled testing is not only feasible for low-income countries but also makes efficient use of scarce testing kits. Additionally, Ball et al. [21] indicated that pooled testing for SARS-CoV-2 could provide a solution to the UK testing strategy.
China was the first country to successfully control the epidemic, although Wuhan, with a population of 11 million, was once the most severely affected city in China. In April 2020, Wuhan’s nucleic acid testing used individual-sample testing. With an average daily capacity of approximately 46,000 tests, testing the entire city’s population was estimated to take nearly eight months, which was unfavourable for the rapid identification of infected persons. Accordingly, in May 2020, the “five-in-one” pooled testing method was adopted to improve testing efficiency, and approximately 10 million nucleic acid samples were tested in 19 days. Drawing on Wuhan’s successful experience, Qingdao adopted the “ten-in-one” pooled testing method in October 2020, and 10,899,145 nucleic acid samples were tested within five days. Pooled testing therefore significantly shortens the testing time and improves testing efficiency, and pooling ten samples instead of a maximum of five, as in Wuhan, improved the efficiency further. Choosing a scientific, effective and rapid testing method thus allows COVID-19 nucleic acid testing and screening to be completed quickly and accurately, providing a basis for formulating and implementing subsequent prevention and treatment measures.
In this paper, Qingdao’s “ten-in-one” pooled testing method is optimised, and a more practical and efficient “five-pointed star” (pentagram) pooled testing method is proposed for the first time. Moreover, the theoretical basis and applicable conditions of pooled testing are analysed to provide theoretical references for rapid nucleic acid testing in epidemic areas.
3. Optimisation of the “Ten-in-One” Testing Method: The Pentagram Mini-Pooled Testing Method
As noted above, the “five-in-one” pooled testing method used in Wuhan and the “ten-in-one” pooled testing method used in Qingdao offer other countries new strategies for large-scale nucleic acid testing during the COVID-19 epidemic. These two methods, which pool different numbers of samples, significantly increased the speed of the first round of testing. However, both still use individual-sample testing in the second round. In Qingdao’s “ten-in-one” method in particular, when a pooled sample tests positive in the first round, all 10 individual specimens must be tested individually in the second round. This not only raises the cost of testing, by significantly increasing the number of testing kits and the workload on the health sector, but also reduces the efficiency of nucleic acid testing and of controlling the spread of the epidemic. Accordingly, some scholars have explored second-round methods for pooled testing to improve testing efficiency.
As early as the 1940s, Dorfman [23], an economist at Harvard University, proposed pooled testing for screening syphilis carriers among soldiers in World War II. Dorfman suggested that, after the individual blood sera were drawn, they be pooled in groups of N and that the groups rather than the individual sera be subjected to chemical analysis. For pools that test positive in the first round, the individuals constituting the pool must be retested to determine which members are infected. Subsequently, other scholars improved the pooled testing method. Many people in South Africa are infected with human immunodeficiency virus (HIV) every year, and the prevalence is almost 17% among adults 15–49 years of age; many scholars in South Africa have therefore investigated HIV testing methods. van Zyl et al. [24] adopted matrix strategies to reduce the costs of virologic monitoring. Taking nine specimens as an example, the specimens are labelled to form a 3 × 3 matrix, and each row and each column of the matrix is tested by “three-in-one” pooled testing. If one row pool and one column pool are positive, their intersection in the matrix identifies the positive specimen. This method requires only six tests on nine specimens to identify one or more confirmed patients. Recently, scientists in Rwanda proposed a hypercube algorithm testing method, suitable for areas with low infection rates, to suppress infections of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [25]. This method quickly identifies and isolates individuals infected with the virus. Taking 27 ($N = 3^3 = 27$) samples as an example, the 3 × 3 × 3 hypercube of 27 samples is sliced into 3 slices in each of the 3 principal directions. Each slice contains nine samples, and “nine-in-one” pooled testing is performed on each slice to identify the coordinates of the positive sample. Thus, in this example, only nine tests are needed to uniquely identify a single infected person among 27 people.
In the optimised testing methods described above, the number of pooled samples in the first round is $n^2$ or $n^3$ (where $n$ is a natural number), usually $3^2$ or $3^3$. Such numbers can be divided equally along multiple directions and arranged in a multidimensional matrix, so that positive patients can be identified through the intersection of different positive pooled samples. However, the ten samples of Qingdao’s “ten-in-one” pooled testing cannot be arranged into an n × n matrix of pools. To reduce the number of tests in the second round of “ten-in-one” pooled testing without increasing the time and cost, this paper proposes pentagram mini-pooled testing to identify a single positive patient, as shown in Figure 4.
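To make the intersection idea concrete, the following is a minimal Python sketch, not taken from the cited works, that pools a 3 × 3 matrix by rows and columns and a 3 × 3 × 3 hypercube by slices, and then recovers a single positive specimen from the positive pools. The function names (`slice_pools`, `locate_single_positive`) and the zero-based coordinates are illustrative assumptions; error-free tests and exactly one positive specimen are assumed.

```python
from itertools import product


def slice_pools(side, dims):
    """Return {(axis, index): [cells]} for a grid of side**dims specimens.

    Each pool is one slice of the grid: all cells whose coordinate on
    `axis` equals `index`.
    """
    cells = list(product(range(side), repeat=dims))
    pools = {}
    for axis in range(dims):
        for index in range(side):
            pools[(axis, index)] = [c for c in cells if c[axis] == index]
    return pools


def locate_single_positive(side, dims, infected_cell):
    """Simulate the pooled results for one infected cell and decode it.

    Returns (number of pools tested, decoded coordinates).
    """
    pools = slice_pools(side, dims)
    positive = [key for key, members in pools.items() if infected_cell in members]
    # Exactly one slice per axis is positive; its index is the coordinate.
    coords = [None] * dims
    for axis, index in positive:
        coords[axis] = index
    return len(pools), tuple(coords)


print(locate_single_positive(3, 2, (1, 2)))     # 3 x 3 matrix
print(locate_single_positive(3, 3, (0, 2, 1)))  # 3 x 3 x 3 hypercube
```

With a side length of 3, the two-dimensional case uses 6 pools for 9 specimens and the three-dimensional case uses 9 pools for 27 specimens, which matches the test counts quoted for the matrix and hypercube methods.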
Since pooled testing is suitable for areas with low infection rates, it is improbable that a random “ten-in-one” pool contains many positive individuals. That is, if a “ten-in-one” pooled testing result is positive, the most probable case is that only one person is infected. Therefore, the pentagram mini-pooled testing (see Figure 4) proposed in this paper focuses on the circumstance in which only one of the ten specimens in the positive pooled sample is infected. As mentioned above, when a “ten-in-one” pooled test is positive, the ten people should be isolated and retested in a second round. Pentagram mini-pooled testing works as follows in the second round. Before testing, the people are numbered from one to ten, and two throat swab samples are collected from each person. During testing, the twenty swabs are pooled into six samples according to the pentagram mini-pooled testing method: five “three-in-one” pooled samples (S1–S5) and one “five-in-one” pooled sample (S6), as shown in Figure 5. Each “three-in-one” pool combines the three specimens whose serial numbers lie at the vertices of one of the five orange triangles of the pentagram. For example, one swab sample is taken from each of subjects 1, 6, and 7, and these three swab samples are placed into the S1 collection tube for “three-in-one” pooled testing. The S2–S5 collection tubes are obtained by analogy. The “five-in-one” pooled sample tube (i.e., the S6 collection tube) holds one swab sample from each of subjects 1 to 5. Nucleic acid testing is then performed on the six pooled samples (S1–S6). After the testing results are returned, the positive patients are identified by comparing them with the predicted testing results, as shown in Figure 5.
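The claim that a positive “ten-in-one” pool most probably contains a single infected person can be checked numerically. The sketch below is illustrative only: it assumes that the ten people are infected independently with the same probability p, and the function name `prob_single_given_positive` and the chosen rates are assumptions, not values from this paper.

```python
def prob_single_given_positive(p, x=10):
    """P(exactly one infected | pool of x is positive), independent infections."""
    none_infected = (1 - p) ** x
    exactly_one = x * p * (1 - p) ** (x - 1)
    return exactly_one / (1 - none_infected)


for p in (0.0001, 0.001, 0.01):
    print(f"p = {p}: {prob_single_given_positive(p):.4f}")
```

Under this assumption, the conditional probability stays above 0.95 for infection rates up to 0.01, which is why the second round can reasonably be designed around the single-positive case.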
Table 1 shows the predicted results of pentagram mini-pooled testing. Because each subject contributes swabs to exactly two of the six pools, a single infected subject produces exactly two positive pools. When only one of S1–S6 is positive, an error has occurred in the nucleic acid sample collection or testing process (the probability of this outcome should be zero), and swab collection and nucleic acid testing must be repeated for all 10 subjects. When two of the pooled samples S1–S6 are positive and the other four are negative, exactly one confirmed patient is identified. When more than two pooled samples in S1–S6 are positive, there are at least two positive patients in the group; in that case, only the subjects in the positive mini-pools need to undergo a third round of individual-sample testing. The relationship between the number of positive testing samples and the number of confirmed patients is expressed by Formula (1). In summary, the pentagram mini-pooled testing method can accurately identify the only positive patient among 10 people by testing six mixed samples. Compared with the individual-sample testing used in the second round of Qingdao’s “ten-in-one” pooled testing method, the proposed method significantly improves the efficiency of the second round of nucleic acid testing, and the accuracy of the testing results is guaranteed.
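The decoding rules for the six pooled samples can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the membership of S1 and S6 follows the text, whereas the membership of S2–S5 is an assumed continuation of the same pattern and may be numbered differently in the original figure. The function names and status labels are hypothetical, tests are assumed to be error-free, and treating zero positive pools like a single positive pool is an additional assumption.

```python
# Pool membership. S1 and S6 follow the description in the text; S2-S5 are an
# ASSUMED continuation of the same pattern ("by analogy").
POOLS = {
    "S1": {1, 6, 7},
    "S2": {2, 7, 8},
    "S3": {3, 8, 9},
    "S4": {4, 9, 10},
    "S5": {5, 10, 6},
    "S6": {1, 2, 3, 4, 5},
}
SUBJECTS = range(1, 11)


def pools_of(subject):
    """Names of the pools that contain a given subject."""
    return frozenset(name for name, members in POOLS.items() if subject in members)


def simulate(infected):
    """Pools that would test positive for a set of infected subjects."""
    return {name for name, members in POOLS.items() if members & set(infected)}


def decode(positive):
    """Map the set of positive pools to (status, subjects concerned)."""
    positive = frozenset(positive)
    if len(positive) == 2:
        match = {s for s in SUBJECTS if pools_of(s) == positive}
        if match:
            return "single", match
    if len(positive) > 2:
        # At least two patients: retest everyone in a positive mini-pool.
        suspects = set().union(*(POOLS[name] for name in positive))
        return "multiple", suspects
    # Fewer than two positives, or two pools sharing no subject: inconsistent.
    return "resample", set(SUBJECTS)


print(decode(simulate({7})))
print(decode(simulate({1, 7})))
```

Under the assumed layout, every subject appears in exactly two pools and no two subjects share the same pair, so a single infected subject is always identified from the two positive pools.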
This method is suitable for rapidly screening large-scale low-risk populations and local epidemic rebound areas. It ensures that infected people can be quickly identified in the second round of pooled testing, hence it quickly interrupts the possible spread of the epidemic. However, for detecting high-risk populations, such as symptomatic patients in fever clinics and those who were in close contact with confirmed patients, individual testing should be adopted.
4. Theoretical Basis and Applicable Conditions for Pooled Testing
There are certain applicable conditions for large-scale screening of COVID-19 using the pooled testing method. Both the applicability of this method and the selection of a reasonable number of pooled samples depend strongly on the infection rate of the region [16,26]. To verify the efficacy and applicable conditions of pooled testing, the quantitative relationship between the number of pooled samples and the virus infection rate is determined. It is assumed that
$N$ is the total population of a certain city;
$p$ is the infection rate;
$x$ is the number of people in each set of pooled testing samples; and $Y$ is the total testing time (measured as the number of tests), comprising the first round of pooled testing ($Y_1$) and the second round of individual-sample testing ($Y_2$).
$p$ is an unknown and dynamically changing value, but it can be estimated based on the number of confirmed cases.
If $x = 1$, each person is tested once. Then, the value of $Y$ is equal to $N$.
When $x \neq 1$, the testing should be performed in two rounds.
The first round is pooled testing, in which every $x$ people form a set, so the testing time required for the first round is $Y_1 = N/x$. The first-round result for a set is either positive or negative, and a negative result is obtained only if everyone in the set is uninfected. Therefore, $(1 - p)^x$ is the probability of getting a negative result for a set in the first round, and $1 - (1 - p)^x$ is the probability that the result of the set testing is positive in the same round. Hence, the number of sets with positive testing results from the first round can be defined as follows:
$$\frac{N}{x}\left[1 - (1 - p)^x\right] \tag{2}$$
The second-round testing is for the sets that showed positive testing results in the first round. The total number of people in these sets is as follows:
$$x \cdot \frac{N}{x}\left[1 - (1 - p)^x\right] = N\left[1 - (1 - p)^x\right] \tag{3}$$
Therefore, the time for testing the second round can be determined as:
$$Y_2 = N\left[1 - (1 - p)^x\right] \tag{4}$$
After the two-round testing is completed, the total testing time $Y$ can be computed as:
$$Y = Y_1 + Y_2 = \frac{N}{x} + N\left[1 - (1 - p)^x\right] \tag{5}$$
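As a worked illustration of the two-round model, the short Python sketch below evaluates the first-round time N/x, the second-round time N[1 − (1 − p)^x] implied by the probabilities above, and their sum. The function names and the example values (a city of 10 million, p = 0.01, x = 10) are illustrative assumptions, and the results are expected values under independent infections.

```python
def first_round_tests(N, x):
    """Y1: one pooled test per set of x people."""
    return N / x


def second_round_tests(N, p, x):
    """Y2: expected number of people in positive sets, each retested once."""
    return N * (1 - (1 - p) ** x)


def total_tests(N, p, x):
    """Y = Y1 + Y2; with x = 1 every person is simply tested once."""
    if x == 1:
        return N
    return first_round_tests(N, x) + second_round_tests(N, p, x)


N, p, x = 10_000_000, 0.01, 10
print(round(first_round_tests(N, x)))
print(round(second_round_tests(N, p, x)))
print(round(total_tests(N, p, x)))
```

For these example values the total comes to roughly 1.96 million tests, compared with 10 million for individual-sample testing.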
This work considers a large city in which 10 million people need to be tested, in order to analyse quantitatively the relationship among the total population $N$, the infection rate $p$ and the number of samples $x$ per pool. Specifically, the infection rate $p$ is set to 0.0001, 0.001, 0.01, and 0.1, and each value is then studied in turn. Figure 6, Figure 7, Figure 8 and Figure 9 show the testing times (first round, second round and total) for the different infection rates.
In Figure 6, Figure 7, Figure 8 and Figure 9, the first-round testing time $Y_1$ is inversely proportional to the number of samples $x$: as $x$ increases, $Y_1$ gradually decreases. Notably, this reduction does not depend on the infection rate $p$. In contrast, the second-round testing time $Y_2$ depends exponentially on $x$ and is strongly affected by the infection rate $p$. When the infection rate is very small ($p = 0.0001$), $Y_2$ still increases with $x$, although the trend is barely visible. As $p$ gradually increases, $Y_2$ increases sharply. Consequently, if pooled testing is adopted with a fixed number of samples per pool, the time required for the second round grows dramatically as the number of infected people in an area increases. This raises the number of nucleic acid tests and consumes more resources and time. If the virus cannot be controlled quickly, a vicious circle forms, in which an increase in the number of infected people leads to a sharp increase in testing time, which in turn puts tremendous pressure on testing personnel and the medical system. Therefore, the best opportunity to screen infected people and control the epidemic is to perform nucleic acid testing while the infection rate is still low.
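The sensitivity of the second round to the infection rate can be tabulated directly. The sketch below is illustrative: it prints the expected share of the population that must be retested, 1 − (1 − p)^x, which is the second-round time divided by N under the two-round model. The function name `retest_fraction` and the pool sizes 5, 10 and 20 are assumptions chosen for the example.

```python
def retest_fraction(p, x):
    """Expected share of the population retested in the second round."""
    return 1 - (1 - p) ** x


rates = (0.0001, 0.001, 0.01, 0.1)
for x in (5, 10, 20):
    row = ", ".join(f"p={p}: {retest_fraction(p, x):.4f}" for p in rates)
    print(f"x={x:2d} -> {row}")
```

The first-round share 1/x is the same in every column, so the growth across each row comes entirely from the second round.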
From the above, it can be concluded that, as the infection rate $p$ increases, the second-round testing time $Y_2$ increases dramatically, which significantly affects the total testing time $Y$. To analyse the relationship between $Y$ and the number of samples $x$ under different infection rates, $Y$ is plotted for $p$ = 0.0001, 0.001, 0.01, 0.1, and 0.2 (see Figure 10). When the infection rate is low ($p \le 0.001$), $Y$ drops sharply as $x$ increases at first and then stabilises as $x$ continues to increase; Figure 10 shows that a further increase in $x$ barely reduces the testing time. When the infection rate is higher ($p \ge 0.01$), $Y$ also decreases significantly as $x$ increases in the initial stage. However, as $x$ continues to increase, $Y$ rises until it approaches or exceeds the total testing time of traditional individual-sample testing. Consequently, pooled testing is not efficient in these situations.
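The point at which pooling stops paying off can be located numerically. The following sketch is a hedged illustration based on the two-round model (total time N/x plus N[1 − (1 − p)^x]); it lists the pool sizes for which the total is still below N. The function names and the search limit `x_max` are assumptions.

```python
def relative_cost(p, x):
    """Total pooled tests divided by N; values below 1 beat individual testing."""
    return 1 / x + 1 - (1 - p) ** x


def beneficial_pool_sizes(p, x_max=1000):
    """Pool sizes x >= 2 whose expected total is below individual testing."""
    return [x for x in range(2, x_max + 1) if relative_cost(p, x) < 1]


for p in (0.01, 0.1, 0.2):
    sizes = beneficial_pool_sizes(p)
    print(p, (min(sizes), max(sizes)) if sizes else None)
```

In this model the range of useful pool sizes narrows quickly as the infection rate grows: at p = 0.2, pools larger than ten already cost more than testing everyone individually.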
As shown in Figure 10, the total testing time $Y$ initially decreases rapidly. This is because, when the number of samples $x$ is small, increasing $x$ dramatically reduces the first-round testing time $N/x$, while the testing time for the positive sets in the second round remains relatively small, so $Y_2$ accounts for only a small share of $Y$. In contrast, when $x$ is large, the first-round testing time $N/x$ is greatly reduced, but the testing time for the positive sets in the second round increases significantly because of its exponential dependence on $x$, which adds to the total testing time. Therefore, there is a reasonable range for the number of samples $x$ under each infection rate $p$. From Figure 10, it can be observed that, as the infection rate $p$ increases, the most reasonable number of samples $x$ decreases.
According to Equation (5), this study calculates the number of tests needed in pooled testing for a city of 10 million people under different infection rates, together with the most reasonable $x$, the number of tests in each round and the reduction rate relative to traditional individual-sample testing (see Table 2).
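A calculation of this kind can be reproduced with a brute-force search over the pool size. The sketch below is illustrative and recomputes the quantities from the two-round model rather than copying any entry of Table 2; the function names, the search limit `x_max` and the list of infection rates are assumptions.

```python
def total_tests(N, p, x):
    """Expected total number of tests: N/x pooled tests plus retests."""
    return N / x + N * (1 - (1 - p) ** x)


def best_pool_size(N, p, x_max=1000):
    """Pool size in 2..x_max that minimises the expected total."""
    return min(range(2, x_max + 1), key=lambda x: total_tests(N, p, x))


N = 10_000_000
print(f"{'p':>7} {'x':>4} {'round 1':>10} {'round 2':>10} {'total':>10} {'reduction':>9}")
for p in (0.0001, 0.001, 0.01, 0.1, 0.2, 0.3):
    x = best_pool_size(N, p)
    y1 = N / x
    y2 = N * (1 - (1 - p) ** x)
    reduction = 1 - (y1 + y2) / N
    print(f"{p:>7} {x:>4} {y1:>10.0f} {y2:>10.0f} {y1 + y2:>10.0f} {reduction:>9.2%}")
```

For p = 0.01 the search returns a pool size of 11 with a reduction of about 80.44%, and for p = 0.1 it returns a pool size of 4, in line with the values discussed in the text.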
Figure 11 shows the optimum number of samples as a function of the infection rate, and Figure 12 shows the corresponding minimum number of tests. The ratio of the second-round testing time to the first-round testing time is given in Figure 13, while Figure 14 shows how the reduction rate relative to individual-sample testing varies with the infection rate. Figure 11 shows that, when the infection rate $p$ is smallest, the optimal value of $x$ (i.e., the value corresponding to the minimum of $Y$) is largest. As $p$ gradually increases, the optimal $x$ gradually decreases. When $p$ reaches 0.01, the “eleven-in-one” pooled testing method becomes the most efficient. When $p = 0.1$, the optimum falls to the “four-in-one” pooled testing method. As $p$ increases, the corresponding minimum number of tests also increases (see Figure 12), and $Y_2$ increases sharply, so the share of second-round testing grows markedly relative to the first round (see Figure 13). In addition, Figure 14 shows that, as $p$ increases, the advantage of pooled testing over individual-sample testing gradually decreases. When the infection rate is low (e.g., $p = 0.001$), pooled testing should be used, as it reduces the total testing time by more than 90% compared with individual-sample testing. When $p$ is 0.01, the “eleven-in-one” pooled testing method reduces the total testing time $Y$ by 80.44%. When $p$ reaches 0.3, the pooled testing time is close to that of individual-sample testing, and the reduction rate is minimal. As $p$ increases further, the total time required for pooled testing surpasses that of individual-sample testing, which increases the sampling and testing costs. Therefore, pooled testing has significant advantages over individual-sample testing when $p$ is low, but these advantages diminish as the infection rate rises.
This also explains theoretically why the “ten-in-one” pooled testing method was used in Qingdao, whereas the “five-in-one” pooled testing method was used in Wuhan. The infection rate in Qingdao was low, and “ten-in-one” pooled testing significantly reduced the testing times, so it took only five days to conduct nucleic acid testing on more than 10 million people. In contrast, because the infection rate in Wuhan was higher, the “five-in-one” pooled testing method, with fewer samples per pool, was adopted.
In conclusion, pooled testing has undeniable advantages when the infection rate $p$ is low, these advantages gradually decrease as $p$ increases, and pooled testing is no longer applicable once $p$ exceeds a certain value ($p > 0.3$). However, COVID-19 continues to spread around the world. Until vaccines are developed and widely used, nucleic acid testing and screening for infected persons remains one of the most effective control measures. Therefore, regions with few infected people should adopt pooled testing early to reduce the number of tests and the time required, which helps to control COVID-19. Conversely, a higher infection rate sharply increases testing times and thus the difficulty and cost of testing.
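The limiting infection rate of about 0.3 can be cross-checked against the two-round model. In that model, pools of size x beat individual testing only while 1/x + 1 − (1 − p)^x < 1, that is, while p < 1 − x^(−1/x). The sketch below is illustrative; the function name `max_rate_for_pool_size` and the search range of pool sizes are assumptions.

```python
def max_rate_for_pool_size(x):
    """Largest p for which pools of size x still beat individual testing.

    Follows from 1/x + 1 - (1 - p)**x < 1, i.e. p < 1 - x**(-1/x).
    """
    return 1 - x ** (-1 / x)


# Highest threshold over all pool sizes, and the pool size that attains it.
threshold, size = max((max_rate_for_pool_size(x), x) for x in range(2, 101))
print(f"pooling stops paying off above p = {threshold:.4f} "
      f"(last useful pool size: {size})")
```

Under these assumptions the threshold is slightly above 0.3 and is attained by pools of three, which is consistent with the limit stated above.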