1. Introduction
Determining a patient’s blood type (blood typing) is one of the most important and essential steps to be performed before treating injured people. It must be accomplished precisely, accurately, and in a timely manner to save lives and avoid serious consequences, especially in emergency situations. Traditional methods of blood typing depend on observing the agglutination of blood (i.e., blood cells sticking together as clusters) with the naked eye after applying antigens and antibodies, as shown in Figure 1. However, these traditional methods are performed manually by lab technicians and are subject to human error and delays. Hence, there is increasing interest in determining blood type using automated methods, which give faster, more precise, and more accurate results [1,2,3]. Image analysis techniques have been adopted in automatic blood typing to provide fast and objective decisions [4,5,6,7,8,9].
The majority of automatic blood typing approaches include three main stages: preprocessing, feature extraction, and classification. In the preprocessing stage, many techniques can be used to prepare the input image for further processing, such as noise reduction or removal, color space transformation, brightness correction, morphological modification, and isolation of the region of interest (ROI) (in this case, the blood spot or the part of it containing the agglutinated clusters) from the background. In the feature extraction stage, the ROI is transformed into a set of descriptors called features (typically a number or a sequence of numbers) that are used for identification in the classification stage, where the ROI is recognized as agglutinated or not. The preprocessing and classification techniques are largely shared among the studies reviewed below; we therefore concentrate on the feature extraction stage, since it represents the fundamental step for blood discrimination in the automatic typing process.
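To make the three stages concrete, the following Python sketch outlines one possible pipeline, assuming OpenCV and NumPy; the Otsu segmentation, the standard-deviation feature, and the decision threshold are illustrative placeholders rather than the method of any particular study cited here.

```python
import cv2
import numpy as np

def blood_typing_pipeline(image_bgr, threshold=30.0):
    # Stage 1: preprocessing - denoise, convert color space, isolate the ROI
    gray = cv2.cvtColor(cv2.medianBlur(image_bgr, 5), cv2.COLOR_BGR2GRAY)
    _, roi_mask = cv2.threshold(gray, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Stage 2: feature extraction - summarize the ROI as a descriptor
    feature = float(np.std(gray[roi_mask > 0]))  # e.g., the standard deviation
    # Stage 3: classification - agglutinated or not (threshold is illustrative)
    return feature > threshold
```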
Ferraz et al. [1] used the standard deviation (SD) calculated from blood images to determine their group. Four images are captured for each blood sample using a CCD (charge-coupled device) camera after mixing it with A, B, AB, and D antigens. The images are then analyzed through a sequence of operations using IMAQ Vision, image processing software originally developed by National Instruments; the operations include color conversion and processing, manual thresholding, and morphological operations [1]. The approach was improved in [4], but was tested using only 24 blood samples. The standard deviation feature was also used by Talukder et al. [5] and Dong et al. [6], extracted from the color image and from the green band of the blood image, respectively. Another study used a support vector machine (SVM) to detect agglutination based on the standard deviation as well [7]. Dhande et al. [8] isolate the blood spot from the background based on its luminance after transforming the RGB (red, green, blue) input image into the HSV (hue, saturation, value) color model according to static color values; they then detect the blobs in the ROI and classify them (as agglutination or not) according to their area. However, these approaches suffer from several limitations, such as requiring manual operations and a relatively long processing time of 2 min. Moreover, Section 2.2 shows that the standard deviation is not a discriminant feature; it cannot be used alone to discriminate between normal and agglutinated blood spots. Furthermore, the use of the SVM by Panpatte et al. [7] is unnecessary, because classification of the ROI based on a single feature (the standard deviation) can easily be performed with a single threshold value. Finally, the approach presented by Dhande et al. [8] requires a special environment and configuration for the blood slide, the light intensity, and the angle at the time of photo capture, because it depends on a static fixed value for the luminance.
Researchers in [9] detect the contours in the input image and identify agglutination based on the number of components, where a threshold of 5 is applied to the connected components. However, this technique may produce false blobs in the background. Other approaches adopted electrical circuits to perform automatic blood typing. In the approach of [2], light generated by an LED is passed through blood samples using optical fiber cables, with a diode serving as a photodetector; different blood groups are discriminated according to the voltage variations at the photodetector. A similar approach was proposed in [3] based on an infrared (IR) light source, where the blood sample is placed between the IR transmitter and receiver and the blood type is determined from the intensity of the received IR light. Fernandes et al. [10] propose a portable device for blood typing that identifies the spectral differences between agglutinated and nonagglutinated samples, where the result requires up to 5 min to be ready; the device was tested using 50 blood samples. A hardware implementation of a blood typing system on a Raspberry Pi single-board computer is presented by Cruz et al. [11]; the system detects the contours in the blood spot and reports agglutination if the number of detected contours exceeds a given threshold. A total of 75 blood samples were used in system construction and evaluation. These approaches require additional special hardware; cost aside, such hardware may not be available everywhere or at all times.
Many other techniques can be used to classify medical images into normal and infected (i.e., with or without agglutination, in our case). Frequency domain analysis was used by Yayla et al. [12] to classify nanometer-sized objects in images provided by a biosensor: Fourier and wavelet features are extracted from the input images and analyzed, with decision tree and random forest classifiers. Yang et al. [13] proposed a classification method based on the wavelet transform for feature extraction, with classification performed using an interpolation scheme. Wavelet decomposition was further improved by an adaptive framework that enhances performance in terms of downsampling balance and signal compression [14]. The wavelet transform was also used by Liu et al. [15] to reduce haze and enhance texture details of images in the frequency domain. Wavelet features can be represented using sophisticated mathematical models such as fractal descriptors [16,17,18]. However, these techniques require domain-dependent knowledge whenever the model is updated or scaled.
Although blood type determination based on feature engineering (i.e., extracting features that depend on domain knowledge) is still the most prevalent in research studies, other studies used image matching algorithms such as the scale invariant feature transform (SIFT) and speed-up robust feature (SURF). For example, SIFT was used in [19] to transform the green component of the image into a collection of local feature vectors, after several preprocessing steps such as thresholding and morphological operations; the SVM algorithm is then used for classification. The proposed approach was evaluated using only 30 blood samples. Furthermore, Sahastrabuddhe and Ajij [20] employed the SURF algorithm to detect agglutination in blood using only 84 blood samples. However, the SIFT and SURF algorithms are slow compared with newer image matching algorithms such as oriented fast and rotated brief (ORB) [21]. In addition to the limitations discussed for each group of studies above, all of them used a small number of blood samples and did not provide an objective evaluation of accuracy.
As discussed earlier, traditional blood typing is performed manually as follows: (1) taking a blood sample from the patient, (2) mixing it with different antibodies on a slide, (3) observing the agglutination, and (4) determining the blood type. In this paper, we propose a system to automate the last two steps, (3) and (4). The system can handle a large number of input images captured by a mobile phone camera, minimizing human error. It can also handle difficult cases that are ambiguous even under manual inspection, where observers may fail to detect agglutination and/or determine the blood type, as may state-of-the-art approaches. Our contribution can be summarized as follows:
The ORB matching algorithm was adopted to provide an accurate, fast, and automated blood typing system.
The system is able to detect various agglutination patterns regardless of variations in photo brightness.
The system was evaluated using 1500 images of blood spots that cover all possible agglutination patterns.
The evaluation includes a detailed analysis of the accuracy and processing time of the different approaches.
We begin this paper by providing an overview of the blood typing process, the problem statement, and the challenges, which are explained thoroughly in Section 2. The principle of operation of the image matching techniques used is reviewed in Section 3. The experimental setup, including blood image capturing and analysis, is provided in Section 4. Results and their analysis are provided in Section 5. Finally, conclusions are drawn in Section 6.
3. Overview of Image Matching Techniques
Image feature point matching is the process of searching for and identifying corresponding point pairs between images. In this research, three image matching algorithms—scale invariant feature transform (SIFT), speed-up robust feature (SURF), and oriented fast and rotated brief (ORB)—are used to determine blood type based on the presence or absence of agglutination.
SIFT was proposed in 2004 to detect features of an image that are invariant to image scale and rotation [23]. SURF was introduced in 2006 as an improved algorithm with lower computational complexity than SIFT [24]. ORB, first presented in 2011 [25], is based on the FAST (features from accelerated segment test) keypoint detector and the BRIEF (binary robust independent elementary features) visual descriptor; it provides a fast and efficient alternative to SIFT and SURF. This section provides a brief overview of these matching algorithms.
3.1. SIFT
The scale invariant feature transform (SIFT) algorithm takes an image and transforms it into a large collection of local feature vectors, each of which is invariant to scaling or rotation of the image.
SIFT features are highly distinctive, so individual features can be matched against a large database of objects, and SIFT provides many features even for small objects. The principle of operation of the SIFT algorithm is illustrated by the flow chart shown in Figure 8 and can be summarized in the following steps:
Step 1: Scale space extrema detection using DoG. The first step in extracting image keypoints is to identify candidate keypoints and their locations at different scales using a variable-scale Gaussian kernel $G(x, y, \sigma)$, which is convolved with the input image $I(x, y)$ to produce the scale space of the image $L(x, y, \sigma)$, as in Equation (1):

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \tag{1}$$

where $*$ is the convolution operation. All scales must be examined to identify scale-invariant features. SIFT uses the difference of Gaussians (DoG) function, defined as $D(x, y, \sigma)$, to detect stable keypoint locations efficiently by computing the difference of two nearby scales separated by a constant multiplicative factor $k$ [23], as in Equation (2):

$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma). \tag{2}$$
A group of scaled images is called an octave. Each octave corresponds to a doubling of $\sigma$, and the value of $k$ is selected so that the same number of DoG images is generated per octave. Each pixel in a DoG image is compared with its eight neighbors at the same scale and the nine corresponding neighbors at each of the two neighboring scales. The pixel is selected as a keypoint if it is a local maximum or minimum among the 26 surrounding pixels.
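As an illustration of Step 1, the following is a minimal sketch that builds one octave of Gaussian and DoG images, assuming OpenCV; the values of sigma and the number of levels are illustrative defaults rather than tuned parameters.

```python
import cv2

def dog_octave(image, sigma=1.6, levels=6):
    """Build one octave of Gaussian images and their DoG differences."""
    k = 2 ** (1.0 / (levels - 3))  # constant multiplicative factor between scales
    gaussians = [cv2.GaussianBlur(image, (0, 0), sigma * k ** i)
                 for i in range(levels)]
    # Equation (2): each DoG image is the difference of two adjacent scales
    dogs = [cv2.subtract(gaussians[i + 1], gaussians[i])
            for i in range(levels - 1)]
    return gaussians, dogs
```

Scanning each DoG image against its two neighboring scales then yields the 26-pixel extremum test described above.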
Step 2: Keypoints localization. Extrema detection generates too many keypoints, some of which are unstable. The next step aims to eliminate low-contrast candidates and candidates poorly localized along edges. This is accomplished using a Taylor series expansion of the DoG function $D$ [26], as in Equation (3):

$$D(\mathbf{x}) = D + \frac{\partial D^{T}}{\partial \mathbf{x}} \mathbf{x} + \frac{1}{2} \mathbf{x}^{T} \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \mathbf{x}, \tag{3}$$

where $\mathbf{x}$ is the offset from the candidate point, as in Equation (4):

$$\mathbf{x} = (x, y, \sigma)^{T}. \tag{4}$$

The location of the maximum or minimum, $\hat{\mathbf{x}}$, is determined by Equation (5):

$$\hat{\mathbf{x}} = -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}. \tag{5}$$

The keypoint is accepted if $|D(\hat{\mathbf{x}})|$ is above a threshold value. To discard candidates along edges, the Hessian matrix $H$ is computed at the location and scale of the keypoint. $H$ is given by Equation (6):

$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}. \tag{6}$$

The eigenvalues $\alpha$ and $\beta$ of $H$ are proportional to the principal curvatures of $D$ and are used to detect corners and reject edge keypoints. However, the eigenvalues are not explicitly computed; instead, the trace and determinant of $H$ are used [27], as in Equations (7) and (8):

$$\operatorname{Tr}(H) = D_{xx} + D_{yy} = \alpha + \beta, \tag{7}$$

$$\operatorname{Det}(H) = D_{xx} D_{yy} - (D_{xy})^{2} = \alpha \beta, \tag{8}$$

where keypoints for which the curvature ratio, measured by $\operatorname{Tr}(H)^{2} / \operatorname{Det}(H)$, exceeds a threshold are rejected.
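The edge-rejection test of Step 2 reduces to a few arithmetic operations. The following is a minimal sketch, assuming the second derivatives of the DoG image at the candidate are already available; the default curvature ratio r = 10 is the value suggested by Lowe [23].

```python
def is_edge_like(dxx, dxy, dyy, r=10.0):
    """Reject a candidate whose principal-curvature ratio exceeds r."""
    trace = dxx + dyy                # Equation (7)
    det = dxx * dyy - dxy * dxy      # Equation (8)
    if det <= 0:
        return True                  # curvatures differ in sign: not stable
    return trace * trace / det >= (r + 1) ** 2 / r
```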
Step 3: Keypoints orientation assignment. This step is needed to obtain rotation-invariant keypoints. To assign one or more orientations to each candidate, the gradient magnitude $m(x, y)$ and direction $\theta(x, y)$ of the smoothed image $L$ at the scale of the keypoint are calculated [28] using Equations (9) and (10):

$$m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^{2} + \big(L(x, y+1) - L(x, y-1)\big)^{2}}, \tag{9}$$

$$\theta(x, y) = \tan^{-1}\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}. \tag{10}$$

A weighted direction histogram over the neighborhood pixels of the keypoint is then created using 36 bins to cover $360^{\circ}$ ($10^{\circ}$ per bin). The maximum direction is selected as the orientation of the keypoint, together with any directions whose local peaks are within 80% of the maximum peak.
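The following is a minimal sketch of the 36-bin orientation histogram of Step 3, assuming a smoothed grayscale NumPy patch L centered on the keypoint; the Gaussian weighting of the contributions used by SIFT is omitted for brevity.

```python
import numpy as np

def orientation_histogram(L):
    """36-bin gradient orientation histogram over a smoothed patch."""
    L = L.astype(np.float64)
    dx = L[1:-1, 2:] - L[1:-1, :-2]               # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]               # L(x, y+1) - L(x, y-1)
    mag = np.sqrt(dx ** 2 + dy ** 2)              # Equation (9)
    theta = np.degrees(np.arctan2(dy, dx)) % 360  # Equation (10)
    hist = np.zeros(36)
    bins = (theta // 10).astype(int) % 36         # 10 degrees per bin
    np.add.at(hist, bins, mag)                    # magnitude-weighted votes
    return hist
```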
Step 4: Keypoints descriptors. This step uses gradient orientation histograms for a robust representation. The descriptor of each keypoint is created by computing the relative orientations and magnitudes in a $16 \times 16$ neighborhood region. This region is divided into $4 \times 4$ subregions, and a weighted histogram (with eight bins) is created for each subregion. Each descriptor therefore contains a $4 \times 4$ array of 16 histograms around the keypoint, leading to a SIFT feature vector with $4 \times 4 \times 8 = 128$ elements [29].
Step 5: Keypoints matching using Euclidean distance. Image matching is carried out by searching for corresponding features between two images according to the nearest neighbor procedure. The nearest neighbor is found by calculating the Euclidean distance between each feature and those in the training database, where the matching feature is the one with the minimum distance. However, many features from an image will not have any correct match in the training database, so it is useful to have a way to discard features without a good match: the ratio of the distances between the best and second-best matches must be lower than a threshold. Rejecting all matches whose distance ratio is greater than 0.8 eliminates 90% of the false matches while discarding less than 5% of the correct matches, as proposed in [23].
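A minimal sketch of Step 5 with OpenCV follows, assuming OpenCV 4.4 or later (where SIFT ships in the main module) and two grayscale images loaded beforehand; the 0.8 ratio is Lowe's suggested threshold.

```python
import cv2

def sift_match(img1, img2, ratio=0.8):
    """Match SIFT features between two images using Lowe's ratio test."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)     # Euclidean distance
    knn = matcher.knnMatch(des1, des2, k=2)  # best and second-best match
    # Keep a match only if it is clearly better than the runner-up
    return [m for m, n in knn if m.distance < ratio * n.distance]
```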
3.2. SURF
The speed-up robust feature (SURF) algorithm follows largely the same principles and steps of operation as the SIFT algorithm, but it is faster and achieves comparable or better performance. The principle of operation of the SURF algorithm is illustrated by the flow chart shown in Figure 9 and can be summarized in the following steps:
Step 1: Integral image generation. SURF uses an integral image instead of the image itself for fast calculation of box-type convolution filters and Haar wavelet responses. The integral image (also called the summed area table) at position $X = (x, y)^{T}$ is the sum of all the values of the original image above and to the left of the pixel at location $X$, as in Equation (11):

$$I_{\Sigma}(X) = \sum_{i=0}^{i \le x} \sum_{j=0}^{j \le y} I(i, j), \tag{11}$$

where $I$ is the original image and $I_{\Sigma}$ is the integral image [24].
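A minimal sketch of the integral image and a constant-time box sum follows, assuming a NumPy grayscale image; it illustrates why SURF's box filters cost the same to evaluate at any scale.

```python
import numpy as np

def integral_image(img):
    """Summed area table: ii[y, x] = sum of img[:y + 1, :x + 1]."""
    return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] using at most four integral-image lookups."""
    total = ii[y1 - 1, x1 - 1]
    if y0 > 0:
        total -= ii[y0 - 1, x1 - 1]
    if x0 > 0:
        total -= ii[y1 - 1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]
    return total
```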
Step 2: Keypoint detection using a fast-Hessian detector. The SURF detector differs from the SIFT detector in that it uses the fast Hessian instead of DoG. The Hessian matrix $H(X, \sigma)$ of a given point $X = (x, y)$ in the image $I$ at scale $\sigma$ is defined in Equation (12):

$$H(X, \sigma) = \begin{bmatrix} L_{xx}(X, \sigma) & L_{xy}(X, \sigma) \\ L_{xy}(X, \sigma) & L_{yy}(X, \sigma) \end{bmatrix}, \tag{12}$$

where $L_{xx}(X, \sigma)$ is the convolution of the Gaussian second-order derivative $\frac{\partial^{2}}{\partial x^{2}} g(\sigma)$ with the image $I$ at point $X$, and similarly for $L_{xy}(X, \sigma)$ and $L_{yy}(X, \sigma)$. SURF approximates the second-order derivatives using box filters and evaluates them on integral images to speed up the calculations [30]. The approximated convolution results are denoted by $D_{xx}$, $D_{xy}$, and $D_{yy}$. The determinant of the approximated Hessian matrix is calculated as in Equation (13):

$$\det(H_{\text{approx}}) = D_{xx} D_{yy} - (0.9\, D_{xy})^{2}, \tag{13}$$

where the factor 0.9 is needed for energy conservation between the Gaussian kernels and their box-filter approximations [31]. SURF generates a scale-space image pyramid by convolving the image with box filters of increasing size; each octave contains the convolution results of four upscaled box filters. Table 2 shows the parameters of the first three octaves. Note that the scale $s$ corresponding to a filter of a given size is $s = 1.2 \times \frac{\text{filter size}}{9}$, since the initial $9 \times 9$ box filter corresponds to a Gaussian with $\sigma = 1.2$.
Step 3: Keypoints localization. To localize the keypoints in the image at the correct scale, the nonmaximum suppression method over a $3 \times 3 \times 3$ neighborhood is applied [26]. A point is selected as a keypoint if its Hessian determinant is the maximum within this neighborhood.
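A minimal sketch of this localization step follows, assuming SciPy is available and that the Hessian determinant responses have been stacked into a NumPy array of shape (scales, height, width).

```python
import numpy as np
from scipy.ndimage import maximum_filter

def local_maxima_3d(det_h, threshold=0.0):
    """Keypoints are points that are maxima in their 3x3x3 neighborhood."""
    peaks = (det_h == maximum_filter(det_h, size=3)) & (det_h > threshold)
    return np.argwhere(peaks)  # (scale, y, x) candidate keypoints
```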
Step 4: Keypoints orientation assignment using Haar wavelet. To obtain rotation-invariant keypoints, SURF calculates Haar wavelet responses within a circular neighborhood of radius $6s$ around the keypoint, where $s$ is the detected scale of the keypoint. The responses in both the $x$ direction ($d_{x}$) and the $y$ direction ($d_{y}$) are calculated, and integral images are used for fast filtering. The sums of the $x$ responses and $y$ responses within every sliding orientation window of size $\frac{\pi}{3}$ are calculated to construct an orientation vector. The dominant orientation of the keypoint is that of the longest such vector.
Step 5: Keypoints descriptors. SURF creates the keypoint descriptor again using Haar wavelets, this time over a square region of size $20s$ around the keypoint, oriented along the assigned orientation. This region is divided into $4 \times 4$ sectors. For each sector, the Haar wavelet responses in both directions are calculated, and each sector is represented by a vector $v = \left(\sum d_{x}, \sum d_{y}, \sum |d_{x}|, \sum |d_{y}|\right)$ of length 4 that contains the sums of the Haar wavelet responses and of their absolute values in the $x$ and $y$ directions [32]. The resulting SURF keypoint descriptor has a length of $4 \times 4 \times 4 = 64$.
Step 6: Keypoints matching using Euclidean distance. To find the matching features between two images, the same procedure used in SIFT is applied: the Euclidean distance between each keypoint descriptor and the descriptors in the training database is calculated, and the matching feature is initially the one with the minimum distance. To discard features that have no good match in the database, the ratio of the distances between the best and second-best matches must again be lower than a threshold.
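A minimal sketch of SURF matching follows, assuming opencv-contrib-python is installed with the nonfree algorithms enabled (SURF is patent-encumbered and lives in cv2.xfeatures2d rather than the main OpenCV module); the Hessian threshold is an illustrative default.

```python
import cv2

def surf_match(img1, img2, ratio=0.8, hessian_threshold=400):
    """Match SURF features between two grayscale images with a ratio test."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    kp1, des1 = surf.detectAndCompute(img1, None)
    kp2, des2 = surf.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)     # Euclidean distance, as in SIFT
    knn = matcher.knnMatch(des1, des2, k=2)
    return [m for m, n in knn if m.distance < ratio * n.distance]
```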
3.3. ORB
The oriented fast and rotated brief (ORB) algorithm is based on the features from accelerated segment test (FAST) and binary robust independent elementary features (BRIEF) algorithms. ORB offers rotational invariance and reduced sensitivity to noise, and it is faster than the SIFT and SURF algorithms [25]. The principle of operation of the ORB algorithm is illustrated by the flow chart shown in Figure 10 and is summarized in the following steps:
Step 1: Keypoints detection using FAST. The ORB algorithm relies on FAST to find feature points quickly. FAST considers a candidate pixel $P$ as the center of a circle of neighboring pixels (ORB uses the FAST-9 variant), taking the intensity $I_{P}$ of the candidate pixel, the intensity $I_{n}$ of each neighboring pixel $n$, and a threshold value $t$ into account. Each pixel $n$ on the circle can have one of the three states in Equation (14) [33]:

$$S_{n} = \begin{cases} \text{darker}, & I_{n} \le I_{P} - t \\ \text{similar}, & I_{P} - t < I_{n} < I_{P} + t \\ \text{brighter}, & I_{P} + t \le I_{n} \end{cases} \tag{14}$$

In a first rapid test over the four compass pixels of the circle, $P$ can be a feature point only if at least three of them are brighter than $I_{P} + t$ or darker than $I_{P} - t$; otherwise, $P$ cannot be a feature point. Candidates that pass this test are confirmed by checking the full circle.
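A minimal sketch of the per-pixel circle test of Equation (14) follows, assuming a NumPy grayscale image and the 16-pixel Bresenham circle of radius 3 commonly used by FAST; only the classification of the circle pixels is shown, not the full contiguous-segment detector.

```python
import numpy as np

# Offsets of the 16 circle pixels relative to the candidate pixel P
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def classify_circle(img, y, x, t):
    """Return the state ('darker'/'similar'/'brighter') of each circle pixel."""
    p = int(img[y, x])
    states = []
    for dy, dx in CIRCLE:
        n = int(img[y + dy, x + dx])
        if n <= p - t:
            states.append("darker")      # Equation (14), first case
        elif n >= p + t:
            states.append("brighter")    # Equation (14), third case
        else:
            states.append("similar")     # Equation (14), second case
    return states
```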
Step 2: Corners detection using Harris. FAST produces large responses along edges; therefore, the Harris measure is used to retain true corners. The main idea is to take a small window around each feature point $P$ and measure the amount of change after shifting the window in eight directions. The sum of squared differences (SSD) between the pixel values before and after the shift is computed to identify windows for which the SSD is large in all directions. The eigenvalues $\lambda_{1}$ and $\lambda_{2}$ of the structure matrix of the window around each feature point $P$ determine whether it is a corner: if both $\lambda_{1}$ and $\lambda_{2}$ are high, then $P$ is a corner [34].
Step 3: Scale pyramid transform. A scale pyramid is used to produce multiscale features in order to obtain scale invariance. The image pyramid is a multiscale representation of a single image, consisting of a series of versions of the original image at different resolutions, where each level contains a downsampled version of the image at the previous level. When ORB builds the pyramid, the FAST algorithm is applied to each level, so keypoints are detected effectively at different scales. In this way, ORB is partially scale invariant.
Step 4: Orientation assignment using IC. Since FAST does not produce orientations, the intensity centroid (IC) technique is used to find the orientation of each FAST feature point [35]. The moments of a patch are defined in Equation (15):

$$m_{pq} = \sum_{x, y} x^{p} y^{q} I(x, y). \tag{15}$$

With these moments, the centroid (the “center of mass”) of the patch is calculated as in Equation (16):

$$C = \left(\frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}}\right). \tag{16}$$

Then, a vector from the corner’s center $O$ to the centroid $C$ is constructed, and the orientation of the patch is given by Equation (17):

$$\theta = \operatorname{atan2}(m_{01}, m_{10}), \tag{17}$$

where $\operatorname{atan2}$ is the quadrant-aware version of $\arctan$.
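A minimal sketch of the intensity centroid orientation, Equations (15) to (17), follows, assuming a square NumPy patch centered on the keypoint with coordinates measured from the patch center.

```python
import numpy as np

def ic_orientation(patch):
    """Orientation (radians) of a patch from its first-order image moments."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0          # x coordinates relative to the center O
    ys = ys - (h - 1) / 2.0          # y coordinates relative to the center O
    m10 = (xs * patch).sum()         # Equation (15) with p = 1, q = 0
    m01 = (ys * patch).sum()         # Equation (15) with p = 0, q = 1
    return np.arctan2(m01, m10)      # Equation (17), quadrant-aware arctan
```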
Step 5: Keypoints binary descriptors using BRIEF and rBRIEF. Binary robust independent elementary features (BRIEF) converts each keypoint detected by the FAST algorithm into a binary feature vector known as a descriptor. BRIEF chooses pairs of random pixels around the keypoint and compares their intensities: if the intensity at the first position is lower than that at the second, the corresponding bit is set to 1; otherwise, it is set to 0. This process is repeated until the vector describing the keypoint is 256 bits long. Consider an image patch $p$ around a keypoint; the binary test $\tau$ is defined in Equation (18):

$$\tau(p; x, y) = \begin{cases} 1, & p(x) < p(y) \\ 0, & p(x) \ge p(y) \end{cases} \tag{18}$$

where $p(x)$ and $p(y)$ are the intensity values at positions $x$ and $y$. The feature is defined as a vector of $n$ binary tests, as in Equation (19):

$$f_{n}(p) = \sum_{1 \le i \le n} 2^{i-1}\, \tau(p; x_{i}, y_{i}). \tag{19}$$
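A minimal sketch of the binary tests of Equations (18) and (19) follows, assuming a smoothed NumPy patch; a fixed random seed stands in for the sampling pattern, which real BRIEF chooses once and reuses for every keypoint.

```python
import numpy as np

def brief_descriptor(patch, n_bits=256, seed=7):
    """Binary descriptor from random intensity comparisons, Equation (18)."""
    rng = np.random.default_rng(seed)
    h, w = patch.shape
    # Two sets of random test locations (x_i, y_i) inside the patch
    rows = rng.integers(0, h, size=(2, n_bits))
    cols = rng.integers(0, w, size=(2, n_bits))
    a = patch[rows[0], cols[0]]
    b = patch[rows[1], cols[1]]
    return (a < b).astype(np.uint8)  # tau = 1 where p(x) < p(y)
```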
The BRIEF algorithm begins by filtering the image patch around the keypoint with a Gaussian kernel to prevent the descriptor from being sensitive to noise; this filtering step significantly increases the stability and repeatability of the descriptors [36]. Rotation-aware BRIEF (rBRIEF) is a modified version of BRIEF that takes the orientation of the keypoints into account. For any feature set of $n$ binary tests at locations $(x_{i}, y_{i})$, the $2 \times n$ matrix $S$ is defined in Equation (20):

$$S = \begin{pmatrix} x_{1} & \cdots & x_{n} \\ y_{1} & \cdots & y_{n} \end{pmatrix}. \tag{20}$$

ORB defines the rotation matrix $R_{\theta}$ with the orientation $\theta$ of the feature point [25] to construct a steered version $S_{\theta}$ of $S$, as in Equation (21):

$$S_{\theta} = R_{\theta} S, \tag{21}$$

where

$$R_{\theta} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \tag{22}$$
Step 6: Keypoints matching using Hamming distance. The matching between images is computed using the Hamming distance between each keypoint descriptor and the descriptors in the database. The Hamming distance between two descriptors $D_{1}$ and $D_{2}$ is computed using Equation (23) [37]:

$$d(D_{1}, D_{2}) = \sum_{i=0}^{255} x_{i} \oplus y_{i}, \tag{23}$$

where $D_{1}$ and $D_{2}$ are 256-bit vectors with bits $x_{i}$ and $y_{i}$, respectively, and $\oplus$ denotes the exclusive-OR operation. The match is decided based on the minimum Hamming distance. To discard features that have no good match in the database, the ratio of the distances between the best and second-best matches must be lower than a threshold; rejecting all matches whose distance ratio exceeds the threshold eliminates most of the false matches.
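A minimal sketch of ORB detection and Hamming-distance matching with a ratio test follows, assuming OpenCV and two grayscale images loaded beforehand; the feature count and ratio are illustrative defaults.

```python
import cv2

def orb_match(img1, img2, ratio=0.8, n_features=500):
    """Match ORB features between two images using the Hamming distance."""
    orb = cv2.ORB_create(nfeatures=n_features)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    # Hamming distance suits ORB's binary descriptors, Equation (23)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn = matcher.knnMatch(des1, des2, k=2)
    return [m for m, n in knn if m.distance < ratio * n.distance]
```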
6. Conclusions and Future Work
In this paper, an automated system adopting the ORB algorithm was created to provide accurate diagnostic support for blood type determination. The designed system proved to be invariant to brightness variations and able to detect different agglutination patterns, providing results in a short time and with high accuracy. The system was evaluated using a total of 500 different images that were partitioned into 1500 images of blood spots, and it was compared with systems based on the standard deviation, the SIFT algorithm, and the SURF algorithm in terms of both accuracy and average processing time. The ORB algorithm was further optimized to improve its accuracy at the cost of an additional 83 ms of processing time.
As for future work, a user-friendly mobile application will be developed based on the optimized ORB algorithm to dispense entirely with the PC for analyzing blood images. This application will help laboratory technicians and civil defense paramedics to determine blood type automatically, with high accuracy and in a short time, avoiding human error. The application could be connected to a centralized database to collect medical information for people in Palestine; the database could then be accessed by hospitals and medical centers to provide vital information in a short amount of time.