Article

Deep Learning Model-Based Real-Time Inspection System for Foreign Particles inside Flexible Fluid Bags

by Chae Whan Lim 1 and Kwang Chul Son 2,*
1 Micro-Degree College, Shinhan University, 95, Hoam-ro, Uijeongbu-si 11644, Republic of Korea
2 Department of Smart Electrical and Electronics, Kwangwoon University, 20, Gwangun-ro, Nowon-gu, Seoul 01897, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(17), 7960; https://doi.org/10.3390/app14177960
Submission received: 31 July 2024 / Revised: 30 August 2024 / Accepted: 3 September 2024 / Published: 6 September 2024

Abstract:
Intravenous fluid bags are essential in hospitals, but foreign particles can contaminate them during mass production, posing significant risks. Although produced in sanitary environments, contamination can cause severe problems if products reach consumers. Traditional inspection methods struggle with the flexible nature of these bags, which deform easily, complicating particle detection. Recent deep learning advancements offer promising solutions for quality inspection, but high-resolution image processing remains challenging. This paper introduces a real-time deep learning-based inspection system addressing bag deformation and memory constraints for high-resolution images. The system uses object-level background rejection, filtering out objects similar to the background to isolate moving foreign particles. To further enhance performance, the method aggregates object patches, reducing unnecessary data and preserving spatial resolution for accurate detection. During aggregation, candidate objects are tracked across frames, forming tracks re-identified as bubbles or particles by the deep learning model. Ensemble detection results provide robust final decisions. Experiments demonstrate that this system effectively detects particles in real time with nearly 98% accuracy, leveraging deep learning advancements to tackle the complexities of inspecting flexible fluid bags.

1. Introduction

During the mass production of intravenous fluid bags, it is crucial to prevent contamination by foreign particles because this can lead to critical situations for patients, such as severe illness or death [1,2,3]. Foreign particles can inevitably enter the bags due to environmental factors, material handling, and assembly processes. For instance, dust, fiber fragments, hairs, print fragments from the surface of the bags, tubes or plastic pieces from the cutting process, insects infiltrating the manufacturing space, and cut pieces of the fluid bag components can all inadvertently mix into the fluid inside the bags, as shown in Figure 1. Although production occurs in highly sanitary and clean environments, such contamination, though rare, can cause significant problems if the contaminated products reach consumers. These issues include the injection of foreign substances into the human body, loss of trust in intravenous fluid products, severe damage to the manufacturer’s brand value, and even the suspension of production and sales permits.
Due to these risks, many intravenous fluid bag manufacturers worldwide are making considerable efforts to prevent such contamination [4]. A common method for preventing foreign substance contamination is to deploy numerous inspectors to manually examine each product visually, identifying and removing defective fluid bags. However, since humans handle this task, fatigue can set in after long periods, leading to errors in identifying defects. To mitigate this, inspections are typically conducted in shifts, with inspectors working for 1 h followed by a break. This manual inspection process is inefficient and unreliable, significantly increasing manufacturing costs and prompting many companies to seek automated solutions to replace manual inspections.
Consequently, many studies have sought to replace the manual inspection process with an automated one. Various sensor technologies, such as laser sensors, static division sensors, visual cameras, and optoelectronic methods, have been employed to enhance the accuracy and efficiency of the inspection process. These sensors perform well in specific inspection tasks but suffer from issues such as high initial setup costs, material constraints, sensitivity to alignment, and sensitivity to environmental factors such as temperature and lighting conditions [5]. In contrast, computer vision and machine learning techniques have shown promising results in many areas of quality inspection across manufacturing industries owing to their non-destructiveness, high precision, speed, and consistency [6,7]. Because of these merits, many studies have addressed particle inspection for glass bottles or vials.
Mostly, they detect the moving objects from the input sequence of images, extract features of the objects, and classify them as particles or bubbles using various machine learning algorithms. To detect moving objects, Ishii et al. [7] focused on pixel difference values between frame images, while Xiaoshi et al. [8] and Zhou et al. [9] modeled the background using a Gaussian Mixture Model to extract moving objects. Some studies tracked moving objects across frames to utilize motion trajectories as features or to make robust classifications over multiple observations [5,10,11]. Many studies sought to extract various features such as size, color, shape, velocity, the mean gray value, geometric invariant moments, the wavelet packet energy spectrum, and the histogram of oriented gradients (HOGs) to improve classification performance between bubbles and particles [8,9,10,11,12,13,14]. Lu et al. [15] introduced the strategy of using different features for slow-moving and fast-moving objects. For better classification, they used various machine learning methods, such as dimensional reduction techniques or dictionary algorithms [8,10,11,13]. Later, deep learning methods were also used for the detection and classification of particles [14,16]. Yi et al. [14] used a deep learning method with adaptive convolution and multiscale attention to locate and classify particles.
Many studies show that machine learning methods in computer vision are used to classify the objects detected during the inspection. However, the main difficulty in machine learning is finding good features that can effectively discriminate between particles and bubbles. Various features have been proposed, and significant efforts have been made to develop good features through transformation and complex manipulation. Many experiments are necessary to identify good features and their combinations. Across these studies, the features considered for inspection fall into two categories: the behavior and the shape of the objects. The features related to the behavior of objects inside the moving fluid include observation position, moving direction, velocity, acceleration, and trajectory patterns; these are used to discriminate heavy particles such as metal or rubber from air bubbles and show good detection performance in real time [10]. However, these features are not effective when the particle material is not heavy, as such particles often behave like air bubbles. Thus, recent studies have focused more on the shape of the particles rather than their behavior. Features such as size, aspect ratio, color, texture, brightness, and shape patterns are considered, but the issue of finding good descriptors that can discriminate between particles and bubbles remains. In contrast to this machine learning-based approach, a deep learning-based approach does not require manually engineered features; instead, performance generally improves as more data become available, which is advantageous for this application because abundant data on bubbles can easily be collected [16,17]. If a good deep learning model is chosen, good features are formed naturally through the deep learning training process. Thus, this paper focuses on deep learning technology and explains how it is adapted for particle detection.

1.1. Challenges of Flexible Fluid Bag Inspection

Detecting foreign particles inside intravenous fluid bags presents a variety of challenges that are not encountered with more rigid containers such as glass bottles or vials, which are uniformly transparent and do not change the shape of objects. Unlike bottles, which are unwrinkled, straight, and uniform, fluid bags are flexible and can change shape depending on how they are held. This flexibility can cause irregular refraction and reflection in images captured by cameras, leading to difficulties in detection. The same shape of air bubbles or foreign particles can appear distorted depending on their position, and objects moving at a constant speed can seem to accelerate or decelerate due to these refractive distortions. The light around the fluid bags is reflected on the irregular surface of the bags, introducing various shapes of saturated regions in the image.
Additionally, when we use a high-resolution camera to magnify the bags, making small particles appear bigger and clearer, the surface of a fluid bag is not as clear as glass but has fine lines, patterns, and variations in thickness, which can further distort the shapes of internal air bubbles and foreign particles, making it difficult to differentiate between them. Even when fluid bags are manufactured in cleanrooms and carefully handled, close inspection often reveals numerous dust particles, fiber fragments, and small specks on the bag’s surface. These external contaminants are shown to be similar to internal contaminants, making it hard to distinguish between them based solely on appearance.
Moreover, the flexibility of fluid bags means that, when the fluid inside sloshes around, the bag itself can move, causing particles or dust adhered to the surface to appear as moving objects, complicating detection. This leads to situations where not only internal contaminants but also numerous external dust particles are identified as potential contaminants, increasing the computational burden of subsequent processes and potentially leading to misidentification.
The print on the surface of fluid bags also poses a challenge for detection, as it can obscure or be confused with foreign particles or air bubbles when viewed with backlighting. Small air bubbles inside the fluid bag can appear as tiny dots under backlighting, similar to small contaminants, making it difficult to distinguish between them. Large numbers of these tiny air bubbles can be generated when the fluid bag is vigorously shaken, posing a significant challenge for real-time detection systems due to the sheer volume of data to be processed. Thus, detecting foreign particles inside flexible fluid bags is particularly challenging due to the bag’s irregular shape, its continuous deformation caused by its flexibility, the difficulty of distinguishing between surface and internal contaminants, interference from surface print, and the presence of numerous small air bubbles. Addressing these issues requires solving problems related to shape distortion, continuous deformation, differentiation of surface contaminants, and the increased number of air bubbles.

1.2. Illumination

Traditional computer vision inspection systems need careful lighting designs because their performance closely depends on the configuration of the lighting apparatus [6,17]. The number of light sources, the direction of light being shed on the target, and the relative position and direction of the camera all significantly affect the resultant image quality and detection performance. Due to the flexibility and glassy, polished surface of fluid bags, it is crucial to position the lights to avoid producing reflections on the surface of the bags.

1.3. Deep Learning-Based Particle Detection

In the realm of computer vision, object detection has seen significant advancements with the advent of deep learning models. These models have revolutionized the ability to accurately identify and localize objects within images. Several deep learning models have been developed for object detection, each with unique strengths and applications. Among the models, YOLO is commonly used in vision inspection applications, as it is extremely fast and suitable for real-time applications [18], but it is not chosen for particle inspection because it shows lower accuracy in detecting small objects due to spatial constraints in the grid structure used for prediction. There might be better models for small object detection [19,20], but Faster RCNN [21] is chosen for this study due to its high performance in regard to object detection, and its efficient implementation is provided by Detectron2 from Facebook AI Research (FAIR).
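For reference, a minimal Detectron2 setup for a Faster RCNN detector might look like the following Python sketch; the backbone, pretrained weights, number of classes, and score threshold are illustrative assumptions and not the configuration used in this study.

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Start from a standard Faster R-CNN configuration in the Detectron2 model zoo.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2          # assumed classes after fine-tuning: air bubble and foreign particle
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # assumed detection confidence threshold
predictor = DefaultPredictor(cfg)            # predictor(image) returns the detected instances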
Even with the selection of a high-performing deep learning model, there are several issues to consider when applying it to foreign substance detection. First, to detect small foreign particles, a high-resolution camera is used, leading to a mismatch with the typical input sizes (256 × 256 to 600 × 800) [19,20,21] that many deep learning models are designed to process. Simply reducing the input image size to fit the model is not feasible because it would also reduce the size of the small foreign particles, making detection difficult.
Furthermore, there is the challenge of applying deep learning detection to video data. Traditional foreign particle detection methods involve rapidly rotating the container and then stopping it to create rotational movement of the fluid inside, allowing for the observation of moving foreign substances over several seconds. For deep learning-based detection, it is essential to complete the detection process quickly on video data spanning a specific duration. The deep learning model must be able to process at least 10 frames per second for effective foreign substance detection. This paper aims to explain how to successfully address these two challenges—real-time processing and preventing performance degradation—by developing and using object patch aggregation images (OPAIs).
This paper proposes a visual inspection system for flexible fluid bags (VISFFB) to overcome the challenges posed by their flexible nature, as illustrated in Figure 2. The system comprises several components, including a rotary inspection mechanism for loading, inspecting, and discharging five bags in parallel, allowing for moving them to the next stage by rotating the main rotor; an illumination setup designed to distinguish between foreign particles and air bubbles during inspection; and a deep learning-based algorithm for real-time detection of foreign particles. This paper explores the design of the mechanical structure to address difficulties associated with flexible fluid bags, the image acquisition system, and the deep learning detection method. Specifically, it introduces an OPAI algorithm to solve issues related to real-time processing and performance degradation, as well as a tracking and ensemble detection algorithm using OPAIs.
This paper is organized as follows. Section 2 describes the overall configuration of the VISFFB and the detailed proposals for resolving issues caused by flexible bags, including effective lighting arrangements. Section 3 explains the use of the proposed OPAIs to solve deep learning challenges. Section 4 discusses methods for enhancing detection performance through tracking and ensemble detection of particles using OPAIs. Section 5 presents experimental results obtained by applying the proposed methods. Finally, Section 6 provides the conclusion.

2. Overview of VISFFB

2.1. Mechanical Structure of VISFFB

The proposed VISFFB performs visual inspection on flexible fluid bags at the end stage of the fluid bag manufacturing process. During the manufacturing process in a production line, a new fluid bag is produced every second and dispatched on the conveyor belt. To provide enough time for the inspection of each bag, the VISFFB loads five bags from the conveyor belt at the same time, and this parallelization allows 5 s of inspection time for each bag. The main rotor, equipped with five grippers on each side of its hexagonal structure, rotates to move the bags to the next stage, where each bag is rotated for visual inspection. For the first 3 s, the system moves the bags to the inspection position while circulating each bag to make the internal fluid move. Once the fluid starts rotating, the bag stops rotating, and for the last 2 s, visual inspection is performed to detect foreign particles in the bag.
To enhance detection accuracy, two cameras are assigned to each bag, inspecting both the upper and lower parts. The inspection algorithm detects moving objects inside the fluid bags, differentiating between foreign particles and air bubbles. Each moving object is assigned a track ID for tracking, allowing for multiple analyses of objects with the same track ID across sequence images to determine if they are foreign particles.
This visual inspection process is repeated for each bag at the second visual inspection position of the main rotor. The combined results of these two inspections determine whether the bag is free of foreign particles or defective. After completing the visual inspections, the main rotor moves the bags to the discharge stage, where they are mechanically sorted based on the inspection results. Normal bags are transferred to the discharge conveyor belt, and defective bags are directed to the defective container.

2.2. Effective Gripper Structure for Flexible Fluid Bags

Flexible fluid bags can change shape significantly depending on where and how they are held. These changes are influenced by gravity, the restoring force to their original shape, pressure exerted at the gripping points, and the pressure distribution caused by the internal fluid. Such variations in the shape of the bags can lead to diverse forms of lens distortion of internal objects, visually altering their shape and distance, which negatively impacts the detection of foreign particles. Additionally, the way the bag is held can affect the rotation of the internal fluid, which is crucial for visual inspection.
Holding a bag by the upper and lower edges or grasping it at points or along lines inside the bag can hinder the rotation of the internal fluid. In this paper, a new gripper is proposed to grip the left and right edges of the fluid bag, applying linear pressure along the edges, as illustrated in Figure 3. This approach flattens the crumpled surface and gathers the fluid toward the center, shaping the fluid bag into a cylindrical form. The taut surface reduces the lens distortion of internal objects, and this shape facilitates smoother rotation of the internal fluid.
The proposed gripper in this study allows the fingers to hold the fluid bag at each edge and enables the upper part of the gripper to rotate around a central axis while holding the bag. The air cylinder opens and closes the gripper’s fingers through its vertical movement, enabling it to grasp the fluid bag. The top of the gripper is fixed to the rotary mechanism, allowing the fluid bag along with the gripper to rotate. This design ensures that the bag remains stable and can rotate freely, increasing the likelihood of internal particles moving during visual inspection. Consequently, this enhances the detection of foreign particles through visual inspection.

2.3. Illumination and Image Acquisition

In a visual inspection system, lighting configurations vary greatly depending on the target object and inspection environment. In this study, the objective is to detect floating foreign particles inside flexible fluid bags, requiring careful consideration of the bag material and the positioning of lights to effectively distinguish foreign particles from air bubbles.
Direct lighting from the camera side is problematic for this application because severe surface reflections from the fluid bag make it difficult to view the interior. To address this, a planar light source is positioned behind the fluid bag, eliminating surface reflection issues and providing clear visibility of the bag’s interior. This setup ensures that air bubbles, which generally transmit light, appear bright, while foreign particles, which often do not transmit light, appear dark in the captured images.
However, even with this backlighting configuration, small air bubbles at the edges of the fluid bag can appear to be similar to dark foreign particles due to reduced light transmission resulting from the interaction between the bag’s surface and the direction of the light beam. To resolve this, as shown in Figure 4, an additional light source is placed beneath the fluid bag, directing light upwards.
This approach takes advantage of the convex lens properties of air bubbles, causing them to refract the light from a wide-angle direction and appear bright when the additional light source is positioned differently from the backlight. This differentiation from foreign particles is illustrated in Figure 5.
High-resolution cameras are installed both above and below the fluid bag to visually detect even the smallest foreign particles. The upper camera covers the top portion of the fluid bag, while the lower camera covers the bottom portion, ensuring comprehensive coverage.
As the gripper rotates the fluid bag and brings it to a halt in front of the cameras, the fluid inside continues to rotate, causing any foreign particles to move. The cameras capture a sequence of images over approximately 2 s, and then a deep learning-based detection algorithm processes these image sequences to effectively distinguish between foreign particles and air bubbles. This study demonstrates that the use of strategic lighting and high-resolution imaging, combined with advanced deep learning algorithms, enhances the detection of foreign particles in flexible fluid bags.

3. Object Detection in High-Resolution Images Using OPAIs

When the rotating flexible fluid bag is stopped, the objects moving inside it must be identified as either air bubbles or foreign particles using a deep learning object detection model. However, several issues arise during this process. Firstly, when the rapidly rotating fluid bag stops, the bag itself halts, but the internal fluid continues to rotate. This causes the flexible fluid bag, although held tightly, to wobble non-uniformly due to the fluid’s movement. Many visual inspection methods reduce computational load by removing the background and identifying candidate objects using difference images between previous and current frames or motion-compensated difference images. However, the non-uniform wobbling of the fluid bag makes it difficult to detect these target substances or candidates through simple difference or motion-compensated difference images. The entire fluid bag wobbles non-uniformly, causing even particles on the bag’s surface to be extracted as candidate objects, similar to particles moving inside the bag.
In this study, to remove the background of such a non-uniformly wobbling flexible fluid bag and accurately identify only the actual moving target or candidate regions, the algorithm first obtains candidate regions based on difference images. The algorithm uses the similarity to adjacent objects within the vibration distance range in the previous frame to determine if the object is attached to the bag’s outer surface, such as surface reflections, dust, or printed patterns on the surface. Particles or air bubbles moving significantly will show low similarity to objects within the vibration distance range in previous frames. Therefore, we apply a method to filter out non-moving objects based on this similarity, extracting only the few actual moving air bubbles or foreign particles inside the fluid bag. The similarity measure used here is the normalized correlation coefficient.
Secondly, when attempting to detect objects in high-resolution images using a deep learning model, the following issues arise:
1. Increased memory consumption
Convolutional Neural Network (CNN) models, which are commonly used for object detection, generate multiple feature maps due to the convolution operations, significantly increasing memory usage. Thus, many object detection models are designed to handle smaller input image sizes, such as 224 × 224, 256 × 256, 512 × 512, and 600 × 800 pixels. Handling larger resolution images directly would require substantial memory, leading to exponentially higher costs. Additionally, as memory usage increases, so does the computational complexity, resulting in longer execution times.
2. Complicated handling of the input image
To process larger images, deep learning models may need to become deeper and more complex, which complicates training and tuning. Training and optimizing deeper models with larger input data can take significantly more time, making practical application challenging.
Due to these challenges, directly applying deep learning models to high-resolution images is difficult. Various methods are employed to adjust the input image size to fit the model’s capacity, such as the following:
  • Image resizing:
Reducing the size of the high-resolution image to match the input size for the deep learning model.
  • Image dividing:
Dividing the high-resolution image into smaller patches and feeding them separately into the model.
  • Sliding Window:
A sliding window moves across the high-resolution image, processing each section independently.
However, these methods still require the entire image data to be processed by the deep learning network, leading to slow processing speeds for high-resolution images. To address this issue, this paper proposes a highly effective method for reducing the size of high-resolution input images to match the input size that deep learning models can handle, thus optimizing processing efficiency. Figure 6 shows the flowchart for the recognition of foreign particles. Key steps are discussed in the following sections.
The steps for preprocessing input frames are based on down-sampling and differencing consecutive input images to detect candidate objects and reject background and noise. To reduce the computational burden of preprocessing and subsequent tracking, the two consecutive input frames $f_{t-1}(i, j)$ and $f_t(i, j)$ of size $M \times N$, located at coordinates $(i, j)$, are down-sampled by a factor of $k$, resulting in down-sampled images $g_{t-1}(m, n)$ and $g_t(m, n)$ of size $\frac{M}{k} \times \frac{N}{k}$ at down-sampled coordinates $(m = i/k, n = j/k)$. The differential image, containing blobs of significant values, can then be obtained, where the blobs are thought to be moving objects:

$$D_t(m, n) = \left| g_t(m, n) - g_{t-1}(m, n) \right|$$

$$O_t(m, n) = g_t(m, n) \quad \text{if } D_t(m, n) > T$$

$$BK_t(m, n) = g_t(m, n) \quad \text{if } D_t(m, n) \le T$$

where $D_t(m, n)$ is the result of the inter-frame difference, $O_t(m, n)$ is the object area, $BK_t(m, n)$ is the background area, and $T$ is a threshold value greater than the noise level in the input images.
When we examine $O_t(m, n)$, the object area can consist of various components, such as moving particles, moving bubbles, locally moving background, and noise. Thus, $O_t$ can be defined as the union of all these sets:

$$O_t = MP_{t-1} \cup MP_t \cup MB_{t-1} \cup MB_t \cup WBK_{t-1} \cup WBK_t \cup N_{t-1} \cup N_t$$

where $MP_t$ denotes the moving particles observed at time $t$, $MB_t$ the moving bubbles, $WBK_t$ the wobbling background, and $N_t$ the noise. To determine whether a fluid bag is contaminated by particles, we need to verify the existence of $MP_t$ only, as all the other components act as performance-degradation factors that make detection harder.
Thus, we apply several filters that remove these performance-degrading factors for more precise detection. The wobbling background components $WBK_{t-1}$ and $WBK_t$ result from the non-uniform movement caused by the flexibility of the fluid bags, which leads to very short movements, on the order of ±1 to ±3 pixels, in diverse directions. Based on this, we apply a similarity measurement, the normalized correlation coefficient (NCC), to $O_t$ to detect and filter out $WBK_{t-1}$ and $WBK_t$, resulting in the moving objects $MO_t$:

$$MO_t = O_t - \widehat{WBK}_t - \widehat{WBK}_{t-1}$$

where $\widehat{WBK}_t$ is an estimate of the wobbling background at time $t$, defined as

$$\widehat{WBK}_t = \{\, g_t(m, n) \mid (m, n) \in \arg(O_t),\; r(O_t(m, n), O_{t-1}(m - d_x, n - d_y)) > T_{WBK},\; d_x \in [-3, 3],\; d_y \in [-3, 3] \,\}$$

where $T_{WBK}$ is the threshold for the wobbling background and the normalized correlation coefficient $r$ between two images $A$ and $B$ is defined as

$$r = \frac{\sum (A(m, n) - \bar{A})(B(m, n) - \bar{B})}{\sqrt{\sum (A(m, n) - \bar{A})^2 \sum (B(m, n) - \bar{B})^2}}$$

where $\bar{A}$ and $\bar{B}$ are the mean values of images $A$ and $B$, respectively.
In the case of moving objects, $MP_{t-1}(m, n)$ and $MB_{t-1}(m, n)$ are fake objects that no longer exist in the current input frame $g_t(m, n)$. These can be discriminated and filtered out by examining the local power in $g_t(m, n)$ and $g_{t-1}(m, n)$ around $(m, n)$, using the fact that their local power is not observed in $g_t(m, n)$ but is observed in $g_{t-1}(m, n)$. The observed true moving objects $OTMO_t$ can be found as follows:

$$OTMO_t = MO_t - MP_{t-1} - MB_{t-1} + 2N$$

Through this step-by-step filtering, the preprocessing leaves only the moving objects of the current input frame, $MP_t(m, n)$ and $MB_t(m, n)$, once a threshold based on the noise signal strength is applied. Then, finally, tracking and a deep learning model can be applied to discriminate moving particles from bubbles and produce the inspection results.

$$TMO_t(m, n) = \begin{cases} 255, & \text{if } OTMO_t(m, n) > \mathrm{Var}(N_t) + \mathrm{Var}(N_{t-1}) \\ 0, & \text{otherwise} \end{cases}$$

$$TMO_t = MP_t \cup MB_t$$
After the preprocessing step, the inspection problem turns into resolving the classification problem between particles and bubbles. It is well known that this problem can be better addressed using a deep learning model if more data are acquired. In the subsequent sections, we will explain the algorithm more specifically to enhance performance.
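To make the preprocessing concrete, the following Python sketch outlines the down-sampling, inter-frame differencing, and NCC-based rejection of the wobbling background described above, assuming OpenCV and NumPy; the function names, down-sampling factor, thresholds, and search range are illustrative assumptions rather than the authors' exact implementation.

import numpy as np
import cv2

def ncc(a, b):
    # Normalized correlation coefficient between two equally sized grayscale patches.
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def reject_background(f_prev, f_curr, k=4, T=20, T_wbk=0.8, search=3):
    # Return bounding boxes (in full-resolution coordinates) of candidate moving objects.
    g_prev = cv2.resize(f_prev, None, fx=1.0 / k, fy=1.0 / k)   # down-sample by factor k
    g_curr = cv2.resize(f_curr, None, fx=1.0 / k, fy=1.0 / k)
    diff = cv2.absdiff(g_curr, g_prev)                          # inter-frame difference D_t
    mask = (diff > T).astype(np.uint8)                          # object area O_t
    num, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    moving = []
    for i in range(1, num):                                     # label 0 is the background
        x, y, w, h, _ = stats[i]
        patch = g_curr[y:y + h, x:x + w]
        # Wobbling-background test: a patch matching the previous frame within a
        # +/- search-pixel neighborhood is treated as WBK_t and filtered out.
        best = 0.0
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                yy, xx = y + dy, x + dx
                if yy >= 0 and xx >= 0 and yy + h <= g_prev.shape[0] and xx + w <= g_prev.shape[1]:
                    best = max(best, ncc(patch, g_prev[yy:yy + h, xx:xx + w]))
        if best <= T_wbk:                                       # low similarity: genuinely moving object
            moving.append((x * k, y * k, w * k, h * k))
    return moving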

3.1. Object Patch Aggregation Image (OPAI) Algorithm

Typically, the objects to be detected in high-resolution images occupy a very small area compared to the entire image. Consequently, pixel data from non-target areas waste the computational resources of the deep learning network, and these non-target areas often constitute a significant proportion of the image. By applying a nonlinear transformation that reduces the number of pixels in non-target areas while retaining the pixels of the objects to be detected, the size of the high-resolution image can be reduced. This approach focuses on retaining the critical object pixels, allowing for the creation of a smaller input image that the deep learning model can handle.
Figure 7 demonstrates an example where objects within a large image can be effectively gathered into smaller OPAIs. Four objects are distributed within a 637 × 885 image, and each object’s area is extracted and repositioned within a 256 × 256 image. The algorithm for creating a smaller OPAI from the original image involves using the area size of the objects to determine the paste positions within the OPAI, starting from the top-left corner.
The process involves pasting the object regions into the OPAI sequentially and updating the start position for each subsequent object. If the next object fits within the remaining space of the OPAI, it is copied to that location. If the space is insufficient, the start position is moved down by the maximum height of the pasted objects and the x-coordinate is reset to the left edge to continue the process. If the new start position exceeds the height of the OPAI, a new OPAI is created and the process continues from the top-left corner of the new image. The algorithm can be summarized as follows:
(1)
Calculate the paste positions (Stick_pos) for each object within the original image.
(2)
Sequentially paste object regions into the OPAI.
(3)
Update Stick_pos after each object is pasted.
(4)
If the space is insufficient, move down and reset the x-coordinate of Stick_pos.
(5)
If Stick_pos exceeds the OPAI height, create a new OPAI and continue.
In this example, the OPAI is about 12% of the size of the original large image, a drastic reduction in pixel data that offers a correspondingly large reduction in deep learning model computation.
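As an illustration, the packing procedure above can be sketched in Python as follows; the canvas size and variable names are assumptions, and the sketch omits the boundary margins around each patch discussed in Section 3.2.

import numpy as np

def build_opais(patches, canvas_size=(256, 256)):
    # Pack object patches row by row into one or more OPAI canvases.
    h_c, w_c = canvas_size
    opais = [np.zeros((h_c, w_c), dtype=np.uint8)]
    x, y, row_h = 0, 0, 0                 # Stick_pos and the height of the current row
    placements = []                       # (OPAI index, x, y) for each pasted patch
    for patch in patches:
        ph, pw = patch.shape[:2]
        if x + pw > w_c:                  # not enough horizontal space: start a new row
            x, y, row_h = 0, y + row_h, 0
        if y + ph > h_c:                  # canvas exhausted: open a new OPAI
            opais.append(np.zeros((h_c, w_c), dtype=np.uint8))
            x, y, row_h = 0, 0, 0
        opais[-1][y:y + ph, x:x + pw] = patch
        placements.append((len(opais) - 1, x, y))
        x += pw                           # update Stick_pos for the next patch
        row_h = max(row_h, ph)
    return opais, placements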

3.2. Enhancing the Performance of Object Detection in OPAIs

The OPAI generation algorithm is used to create OPAIs for foreign particles and air bubbles inside a flexible fluid bag, as shown in Figure 8. These images are created by extracting patches from different parts of various images and sequentially placing them onto a blank OPAI canvas. This results in OPAIs that contain foreign particles and air bubbles observed under various conditions of focus, lighting, and background.
However, when examining these generated OPAIs, a discontinuity at the patch boundaries is noticeable due to the black background. This abrupt change in characteristics at the boundaries creates noise that can interfere with the features of foreign particles or air bubbles during the convolution filtering process in the CNN. This noise can severely affect the network’s ability to distinguish between the characteristics of air bubbles and particles, ultimately degrading the model’s performance in differentiating between the two.
To mitigate the impact of these discontinuities on object detection performance, it is crucial to make the patch size significantly larger than the objects so that the discontinuity noise does not affect the actual object features. By considering the size of the CNN convolutional kernel, the patch boundaries are designed to ensure that the discontinuity does not encroach upon the objects themselves.
Additionally, certain considerations must be made to prevent performance degradation when using OPAIs for deep learning training and inference. When training the object detection network with OPAIs, the network is optimized for such a structured input. Consequently, during inference, the input images must be formatted similarly to OPAIs to maintain performance. This approach ensures the mitigation of the discontinuities’ impact on the CNN model’s ability to accurately detect and distinguish between foreign particles and air bubbles.

4. Tracking and Ensemble Detection of Particles Using OPAIs

Among the tracking algorithms used in vision applications like remote surveillance, the Simple Online and Realtime Tracking (SORT) algorithm stands out for its simplicity, real-time processing capability, and stable tracking performance [22]. SORT uses the Kalman filter to predict the new position and size of objects based on past tracking information, such as position, velocity, and size. It then uses the Intersection over Union (IoU) metric to compare these predictions with newly observed objects, matching them with the highest IoU value to continue the tracking process. Although SORT is advantageous for real-time applications due to its low computational load and straightforward implementation, its performance can degrade in complex environments due to frequent ID switching. This issue arises because SORT relies solely on information about the object’s position, velocity, and size without considering the object’s appearance.
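For reference, the IoU used by SORT for this matching step can be written as the following small helper; the (x1, y1, x2, y2) box format is an assumption.

def iou(box_a, box_b):
    # Intersection over Union of two boxes given as (x1, y1, x2, y2).
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0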
To address this limitation, the Deep Simple Online and Realtime Tracking (DeepSORT) algorithm incorporates a deep appearance descriptor derived from a CNN-based model [23]. This descriptor captures the visual features of objects, enhancing tracking performance by differentiating between objects based on their appearance. Consequently, DeepSORT provides more robust and accurate tracking, especially in complex environments. However, using deep learning inference to generate appearance descriptors for candidate objects in each frame can be computationally expensive.
In this study, we modify DeepSORT to leverage appearance information without the computational overhead of deep learning inference. Instead of using a deep appearance descriptor, we employ the normalized correlation coefficient to measure the similarity of object shapes. This adjustment allows us to maintain the benefits of appearance-based tracking while reducing the computational load, making the algorithm more suitable for real-time applications.

4.1. Modification of DeepSORT for Particle Tracking

The DeepSORT tracking algorithm associates newly observed objects with existing tracked targets based on both their position and movement information, as well as the similarity of their appearance. This similarity is assessed by computing deep learning feature maps for both the candidate object and the tracked target. The cosine similarity between these feature maps is then calculated and used to link the observed object to the appropriate track.
To understand the computational cost of this process, consider the following:
The output feature map size $\Phi$ is given by

$$\Phi = \left( \frac{M + 2p - l}{s} + 1 \right) \times \left( \frac{N + 2p - l}{s} + 1 \right)$$

where $M$ is the height of the input image, $N$ is the width of the input image, $l$ is the kernel size of the convolution, $s$ is the stride, and $p$ is the padding.
For each convolution operation, $l \times l$ multiplications and additions are performed for each position in the feature map. Therefore, for one filter, the number of computations is

$$\Phi \times l^2$$

For $F$ filters, the total number of computations is

$$\Phi \times l^2 \times F$$

Assuming $s = 1$ and $p = \frac{l - 1}{2}$, the output size $\Phi$ approximates $M \times N$. Therefore, the computational cost for calculating the CNN-based feature map is

$$O(M \times N \times l^2 \times F)$$
In contrast, the computational cost of the normalized correlation coefficient (NCC) $r$ between two images, as defined above, requires the following operations:
- Mean calculation: $2 \times M \times N$ operations;
- Subtractions: $2 \times M \times N$ operations;
- Multiplications for the numerator: $M \times N$ operations;
- Squaring for the denominator: $2 \times M \times N$ operations;
- Summations: $2 \times M \times N$ operations;
- Square roots: one operation;
- Final division: one operation;
- Total operations for the NCC: approximately $9 \times M \times N$ operations.
The computational cost for calculating the CNN-based feature map is significantly higher than that for computing the normalized correlation coefficient. Consequently, while DeepSORT uses a deep learning appearance descriptor, this study proposes using the normalized correlation coefficient instead to assess object similarity, reducing the computational burden while maintaining an effective object tracking performance. This approach leverages the benefits of appearance-based tracking without the high computational cost associated with deep learning inference in each frame, making it more suitable for real-time applications.
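As a rough illustration, assuming 64 × 64 object patches, a 3 × 3 kernel, and 128 filters (assumed values for illustration only), a single convolutional layer requires on the order of 64 × 64 × 9 × 128 ≈ 4.7 million operations, whereas the NCC requires only about 9 × 64 × 64 ≈ 37,000 operations, a difference of roughly two orders of magnitude per comparison.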
Furthermore, the IoU (Intersection over Union) metric used in DeepSORT to calculate the cost between the predicted position and size of a target, based on the Kalman filter, and the observed object is not suitable for objects moving inside a flexible fluid bag. The movement distance of objects such as foreign particles or air bubbles is significantly larger than their size, making it highly unlikely for the predicted and actual observed object regions to overlap, resulting in an IoU value close to zero most of the time. Therefore, the IoU cannot be used to associate the tracked trajectory with the newly observed target object.
To address this, this paper introduces a cost function $C_{size}(A, B)$ based on the object's size, considering that the size of a moving object does not change significantly between frames. The cost function decreases when the sizes of the two objects are similar and increases when the sizes differ. It is defined as follows:

$$C_{size}(A, B) = \left| \frac{Size(A)^2 - Size(B)^2}{Size(A) \cdot Size(B)} \right|$$

where $A$ and $B$ are the two images to be compared and $Size(\cdot)$ is a function that returns the size of an image. This cost function ensures that, when the difference in the sizes of two images is small, the cost becomes small. Size changes in smaller objects are more sensitively accounted for, while size changes in larger objects result in a relatively smaller increase in cost.
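A small Python sketch of how this size cost could be combined with the NCC appearance similarity for track-to-detection association is given below; the cost weights, the patch resizing before the NCC, and the use of Hungarian matching are assumptions for illustration, and ncc() refers to the helper defined in the earlier preprocessing sketch.

import numpy as np
import cv2
from scipy.optimize import linear_sum_assignment

def size_cost(size_a, size_b):
    # C_size(A, B) = |(Size(A)^2 - Size(B)^2) / (Size(A) * Size(B))|
    return abs(size_a ** 2 - size_b ** 2) / (size_a * size_b)

def associate(track_patches, det_patches, w_size=0.5, w_app=0.5):
    # Build a combined cost matrix from size similarity and appearance (NCC) similarity.
    cost = np.zeros((len(track_patches), len(det_patches)))
    for i, tp in enumerate(track_patches):
        for j, dp in enumerate(det_patches):
            c_size = size_cost(tp.size, dp.size)                     # pixel counts used as object size
            dp_resized = cv2.resize(dp, (tp.shape[1], tp.shape[0]))  # align shapes before the NCC
            c_app = 1.0 - ncc(tp, dp_resized)                        # ncc() from the earlier sketch
            cost[i, j] = w_size * c_size + w_app * c_app
    rows, cols = linear_sum_assignment(cost)                         # optimal one-to-one matching
    return list(zip(rows, cols))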

4.2. Re-Identification of the Targets

To determine whether a moving object inside a flexible fluid bag is a foreign particle or an air bubble, continuous tracking allows for multiple observations of the target object, leading to a more stable and accurate assessment.
To achieve this, each newly observed object from the input frame $g_t$ in every frame is compared with the tracked objects and linked to the correct track by assigning a track ID. The observed object patch images are cropped from the original input frame $f_t$ and registered under the track information. The registered object patch images are taken from the original-sized input frame because even small objects need to be recognizable by the deep learning inference.
Tracking is performed over a sequence of frames, typically around 2 s long. Tracks that are too short, having fewer than three frames of observation, are removed from the list as they are not reliable for judgment. Only tracks of a certain length are retained for further analysis using a deep learning model to determine whether they are foreign particles or air bubbles.
After 2 s of tracking, all the collected object patch images are aggregated and converted into OPAIs. These OPAIs are then fed into the deep learning model to identify whether the objects are foreign particles or air bubbles. The deep learning model’s inference results for each object patch can indicate either an air bubble, a foreign particle, or no detection.
The judgment for each track is made by grouping the inference results of the object patches within the OPAI by track ID. If the number of patches identified as air bubbles or foreign particles exceeds a specific ratio of the total track length, the track is ultimately judged as containing air bubbles or foreign particles. The decision logic for each track is as follows:
If Bubble_Count < Enough_Track_Length:
   If there is at least one bubble:
    Declare the track is a bubble
   Else if there is no bubble:
    Declare the track is a particle
Else:
   If Bubble_Count/Track_Length < Bubble_Ratio
    Declare the track is a particle
   Else
    Declare the track is a bubble
If all tracks within a particular flexible fluid bag are judged to contain only air bubbles, the fluid bag is considered normal. Conversely, if even one track is identified as containing foreign particles, the fluid bag is deemed to be contaminated. This approach enhances the robustness and accuracy of judgment by leveraging the power of deep learning and continuous observation across multiple frames. Using OPAIs and deep learning models ensures precise differentiation between air bubbles and foreign particles, which is crucial for maintaining the quality and safety of medical products.
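The per-track decision above, together with the bag-level verdict, can be expressed as a short Python sketch; the values of Enough_Track_Length and Bubble_Ratio are assumptions, as they are tuning parameters not specified here.

def judge_track(results, enough_track_length=5, bubble_ratio=0.5):
    # results: per-patch inference labels for one track ("bubble", "particle", or None).
    track_length = len(results)
    bubble_count = sum(1 for r in results if r == "bubble")
    if bubble_count < enough_track_length:
        # Few bubble observations: a single bubble inference is enough to clear the track.
        return "bubble" if bubble_count >= 1 else "particle"
    # Otherwise decide by the proportion of bubble inferences over the whole track.
    return "particle" if bubble_count / track_length < bubble_ratio else "bubble"

def judge_bag(track_results):
    # A bag is defective if any retained track is judged to contain a foreign particle.
    return "defective" if any(judge_track(r) == "particle" for r in track_results) else "normal"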

5. Experimental Results and Discussions

To evaluate the performance of the proposed method, the VISFFB was configured, as shown in Figure 2. For acquiring high-resolution images of the flexible fluid bags, two MX-A201R-26 CMOS sensor cameras with a resolution of 3627 × 5488, each equipped with a V3528-MPY lens, were positioned at each inspection location. These cameras capture 70 × 100 mm² areas of the upper and lower parts of the bags. The cameras are connected to a single PC, where detection of moving objects and differentiation between foreign particles and air bubbles are performed using a deep learning-based object detection algorithm, Faster RCNN. The PC is equipped with an Intel Core i9-10900KF CPU @ 3.70 GHz, 64 GB of RAM, and an NVIDIA RTX 3090 GPU, running Windows 10. High-resolution sequence images are acquired from both cameras through a frame grabber, allowing the foreign particle detection algorithm to run simultaneously and in real time on the sequence images from both cameras.
The test bags were 500 mL flexible non-PVC transparent bags filled with 5% dextrose solution. The dataset included 35 defective fluid bags containing a few foreign particles of 150–200 µm in size and 100 normal fluid bags with no foreign particles. The fluid bags were loaded in sets of five for inspection, with each fluid bag being examined individually. The main rotor’s two inspection positions allow for parallel examination of these sets of fluid bags. Figure 9 illustrates the network setup and communication structure for aggregating inspection results and controlling the main rotor. Results from each PC are sent to the main control PC, which compiles the inspection results to identify defective fluid bags. The control PC then manages the discharge paths for normal and defective fluid bags and rotates the main rotor to direct each fluid bag to the correct discharge conveyor or defective product container.
As shown in Figure 10, the system includes two cameras at each of the five inspection positions, with lights being positioned opposite the cameras and beneath the fluid bags. This setup ensures a clear distinction between foreign particles and air bubbles by minimizing surface reflections and enhancing the visibility of internal objects. The rear lighting reduces surface reflections, while the lower lighting highlights air bubbles. This causes the air bubbles to refract light and appear bright in the camera images, whereas foreign particles do not refract light, making them more distinguishable.
In Figure 11, image (a) shows a fluid bag captured with only rear lighting, while image (b) shows the same fluid bag with both rear and lower lighting. Comparing these images reveals that lower lighting makes the air bubbles more visible by refracting the lower light. Image (c) further demonstrates this effect, highlighting small air bubbles even within dark printed areas, showing that lower lighting effectively distinguishes air bubbles from foreign particles. Foreign particles do not refract or transmit the lower light, resulting in minimal change between images with and without lower lighting, thereby improving the distinction between foreign particles and air bubbles.
To extract moving objects, binary differencing or motion-compensated differencing methods are commonly used for bottles or vials. However, for flexible fluid bags, these methods often identify more regions than the actual moving objects due to their non-uniform movements. This results in decreased processing speed and particle detection performance.
Figure 12 shows the results of applying these methods to consecutive frames of a flexible fluid bag. The candidate regions identified as moving objects are labeled on the actual input image for comparison. The comparison includes simple differencing, global motion-compensated differencing, and the proposed near-similarity-based object-level background removal for candidate moving objects. The number of candidate regions extracted from simple differencing is 1249, while global motion-compensated differencing reduces this number to 763, primarily in the central part of the image. However, many candidate regions still appear in the peripheral areas. The proposed method, which applies near-similarity-based object-level background removal, dramatically reduces the number of candidate moving objects to five. By performing visual inspections only on these filtered candidate regions, the processing time is significantly reduced, enhancing the feasibility of real-time processing.
If we construct OPAIs by collecting images of moving objects observed during the tracking process of the internal movement of the fluid bag, we can effectively convert the vast amount of data from high-resolution input frame sequences into smaller OPAIs that consist only of the data necessary for actual inspection.
To effectively enhance the detection performance for OPAIs, data for training were collected using the VISFFB system by capturing inspection images of fluid bags (both those without internal foreign particles and those containing various types of internal foreign particles). Under the same conditions as the actual inspection, with back and bottom lighting setup, videos of the fluid bags were recorded while rotating the bags to introduce the movement of the internal fluid. These recordings were then processed to detect and track the moving objects, resulting in the creation of an OPAI dataset. Out of the 941 OPAIs thus generated, 851 were designated as training data, while the remaining 90 were set aside for validation. The images were used to train the system to distinguish between bubbles and foreign particles. In the training dataset, 11,541 bubbles and 8208 foreign particles were labeled, while, in the validation set, 1077 bubbles and 814 foreign particles were labeled.
Figure 13 shows the results of creating OPAIs for fluid bags containing foreign particles and another fluid bag without foreign particles, showing air bubbles. In the left column, the results of the constructed OPAIs from each fluid bag are displayed, and above each object patch image, the tracking ID for the registered object during the tracking process is indicated. The deep learning inference results for the object images are shown as circles proportional to the size of the object. A blue circle indicates that the object is inferred to be a foreign particle, while a white circle indicates an air bubble. If there is no circle, it indicates that the object was not inferred to be either a foreign particle or an air bubble.
As shown in the right column, by comprehensively judging the presence of foreign particles based on the proportion of inference results from various images of moving objects observed during the tracking of each fluid bag, we can achieve a more stable and superior performance in determining the presence of foreign particles. In other words, even if the moving object is out of focus or there are observational images that are difficult to judge due to lens distortion from the crumpling of the fluid bag, we can make better decisions by combining all the judgment results throughout the tracking process based on multiple inference results.
Additionally, the input frame sequence used in this experiment was a high-resolution video input at 10 fps for 2 s, which was compiled into 600 × 800 OPAIs through object tracking. Therefore, the data reduction rate for deep learning inference is calculated as (600 × 800)/(3627 × 5488 × 20) = 0.12%, resulting in a data compression effect of 829.374 times. By performing deep learning model inference on the accumulated and reduced data throughout the tracking period in a very short time, real-time processing of the identification of moving objects within the fluid bag becomes possible.
The proposed system operated in a multi-threaded environment, allowing it to simultaneously perform video acquisition from two cameras, moving object detection, tracking, OPAI generation, and deep learning inference on a single computer. In this multi-threaded environment, within a two-second timeframe, high-resolution videos from each camera were processed to extract and track moving objects, resulting in the generation of 600 × 800 OPAIs. For each generated OPAI, Faster RCNN took 250 to 300 ms to perform inference in the multi-threaded environment, and processing OPAIs from both cameras one by one took 500 to 600 ms. When two OPAIs were generated from each camera, the inference process took 1 to 1.2 s. As a result, it was confirmed that a total of 3 to 3.4 s was required to determine the presence of foreign particles inside a fluid bag. This enabled the proposed system, which was configured to inspect five bags simultaneously in parallel, to output inspection results within 5 s, allowing for the inspection of each bag to be completed in less than 1 s. Consequently, it was possible to perform real-time inspections of fluid bags moving on a conveyor belt at a rate of one bag per second.
For the evaluation of the performance of the proposed algorithm, we tested it on 35 fluid bags (500 mL each) containing easily floating foreign particles sized 150–200 µm and 100 normal fluid bags without any foreign particles. For quantitative performance comparison, we examined the test results using metrics such as Recall, Precision, and Accuracy, as shown below:
$$Recall = \frac{TP}{TP + FN}$$

$$Precision = \frac{TP}{TP + FP}$$

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

where $TP$ is the number of true positives, $TN$ is the number of true negatives, $FP$ is the number of false positives, and $FN$ is the number of false negatives.
The results of the foreign particle detection for the test samples are summarized in a confusion matrix, as shown in Figure 14. Based on these results, with TP = 34, FN = 1, TN = 98, and FP = 2 (34 of the 35 defective bags and 98 of the 100 normal bags were classified correctly), the algorithm achieved a Recall of 97.1%, a Precision of 94.4%, and an Accuracy of 97.8%. The proposed system tracked moving objects across multiple frames, allowing it to repeatedly perform deep learning inference on the observed objects in each frame to determine whether they were air bubbles or foreign particles. This approach ensured that, even if the shape of an object was not clearly distinguishable in some frames due to printing artifacts or surrounding structures, the system achieved very high accuracy and reliability in determining the presence of foreign particles by making a comprehensive judgment based on observations from other perspectives.
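As a quick check, these metrics follow directly from the confusion-matrix counts; a minimal sketch, assuming the counts above:

tp, fp, tn, fn = 34, 2, 98, 1
recall = tp / (tp + fn)                      # 34 / 35 = 0.971
precision = tp / (tp + fp)                   # 34 / 36 = 0.944
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 132 / 135 = 0.978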
The performance of foreign particle detection observed in previous research is shown in Table 1, alongside our results. All prior studies involved rigid body forms such as bottles, vials, or ampoules, which are generally small in size. These small fluid containers allow foreign particles to remain within a focal range, clearly showing the shape of the moving object in every sequential frame. In contrast, the flexible fluid bags used in this study have a thickness of 5 to 6 cm in the center, resulting in areas that are out of the camera’s focus. This often causes the moving objects to appear blurry and unclear depending on their position. Additionally, unlike the clean surfaces of small bottles that do not have printed content and do not fold or crumple, the flexible fluid bags have crumpled surfaces and printed text, leading to lens distortion. This makes it extremely challenging to achieve a good detection performance for foreign particles in flexible fluid bags. Despite these difficulties, we compared our research results with those of previous studies. Even under these challenging conditions, the test results of the proposed algorithm demonstrated comparable or superior performance in terms of accuracy compared to existing algorithms, proving the effectiveness of the proposed algorithm.

6. Conclusions

In this paper, we present a deep learning-based system for detecting foreign particles in flexible fluid bags. For effective inspection, we proposed a gripper structure that securely holds the bags and a mechanical structure that can load five bags at a time and rotate the main rotor for real-time inspections. The flexibility of the fluid bags causes non-uniform fluid movement, making it difficult to detect moving objects in the images. To address this, we introduced an algorithm that identifies moving objects by removing the background based on the near-similarity of objects.
We also proposed the use of OPAIs, which contain only the data necessary for inspection: image patches of the moving-object areas cropped from the high-resolution sequences, with all other data discarded. This approach addresses both the mismatch between the high-resolution frames and the input size required for deep learning inference and the need to process numerous frames in real time.
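The sketch below shows one simple way such an aggregated image could be assembled. The 600 × 800 canvas matches the OPAI size mentioned in the previous section, while the 100-pixel cell size and the plain row-by-row tiling are illustrative assumptions.

```python
# Sketch: tiling object patches into a single OPAI canvas (illustrative).
import numpy as np

def build_opai(patches, canvas_hw=(600, 800), cell=100):
    """Tile grayscale object patches row by row into one fixed-size canvas."""
    canvas = np.zeros(canvas_hw, dtype=np.uint8)
    rows, cols = canvas_hw[0] // cell, canvas_hw[1] // cell
    placements = []                                  # (row, col, source patch index)
    for idx, patch in enumerate(patches[: rows * cols]):
        r, c = divmod(idx, cols)
        p = patch[:cell, :cell]                      # assume patches are at most cell-sized
        canvas[r * cell:r * cell + p.shape[0],
               c * cell:c * cell + p.shape[1]] = p
        placements.append((r, c, idx))
    return canvas, placements
```

Keeping the placement table alongside the canvas allows detections made on the OPAI to be mapped back to the original frame, object track, and position.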
To enhance the reliability and performance of foreign particle detection, we tracked each target across multiple frames instead of relying on a single observation. By compiling the images observed during tracking into OPAIs and running deep learning inference on them, we enabled a comprehensive judgment of each target, improving detection performance and the reliability of the final decision. The test results demonstrated that the proposed algorithm achieved stable detection performance for flexible fluid bags, even when compared with existing results on bottles and vials.
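The sketch below illustrates the track-level decision in its simplest form: each observation of a tracked object contributes one per-frame label, and the final label is the majority vote. The actual system may aggregate the Faster R-CNN outputs differently (e.g., by score), so this is only an assumed, minimal form of the ensemble decision.

```python
# Sketch: ensemble decision over the per-frame classifications of one track.
from collections import Counter

def classify_track(per_observation_labels):
    """Aggregate per-frame labels ('particle' / 'bubble') for one tracked object."""
    if not per_observation_labels:
        return "unknown"
    return Counter(per_observation_labels).most_common(1)[0][0]

# A blurry frame may be misread, but the track as a whole is still resolved.
print(classify_track(["bubble", "particle", "particle", "particle"]))  # -> particle
```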
We consider the following directions for future research. In this study, we focused on determining the presence of foreign particles among moving objects and proposed a gripper structure and operation designed to induce the movement of foreign particles inside the fluid bag. However, depending on their characteristics, some particles may be too heavy to move easily with the rotating fluid or may adhere to the inner surface of the bag and not detach readily. Additional research is therefore needed on gripper designs that can effectively induce the movement of such particles. Moreover, since the OPAI structure proposed in this study has demonstrated the effectiveness of deep learning-based real-time processing, further work should focus on deep learning models optimized for the detection performance and inference speed of OPAIs, as well as on OPAI generation methods that could further enhance detection performance.

Author Contributions

Conceptualization, C.W.L.; methodology, C.W.L. and K.C.S.; software, C.W.L.; validation, C.W.L. and K.C.S.; formal analysis, C.W.L. and K.C.S.; investigation, C.W.L.; resources, C.W.L.; data curation, C.W.L.; writing—original draft preparation, C.W.L.; writing—review and editing, C.W.L.; visualization, C.W.L.; supervision, K.C.S.; project administration, K.C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Acknowledgments

The present research was conducted with a Research Grant from Kwangwoon University in 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Langille, S.E. Particulate Matter in Injectable Drug Products. PDA J. Pharm. Sci. Technol. 2013, 67, 186–200. [Google Scholar] [CrossRef] [PubMed]
  2. Bowen, J.H.; Woodard, B.H.; Barton, T.K.; Ingram, P.; Shelburne, J.D. Infantile Pulmonary Hypertension Associated with Foreign Body Vasculitis. Am. J. Clin. Pathol. 1981, 75, 609–614. [Google Scholar] [CrossRef] [PubMed]
  3. Dewan, P.R.; Ehall, H.; Edwards, G.; Middleton, D.J.; Terlet, J. Plastic Particle Migration During Intravenous Infusion Assisted by a Peristaltic Finger Pump in an Animal Model. Pediatr. Surg. Int. 2002, 18, 310–314. [Google Scholar] [PubMed]
  4. Doessegger, L.; Mahler, H.-C.; Szczesny, P.; Rockstroh, H.; Kallmeyer, G.; Langenkamp, A.; Herrmann, J.; Famulare, J. The Potential Clinical Relevance of Visible Particles in Parenteral Drugs. J. Pharm. Sci. 2012, 101, 2635–2644. [Google Scholar] [CrossRef] [PubMed]
  5. Zhang, H.; Li, X.; Zhong, H.; Yang, Y.; Wu, Q.M.J.; Ge, J.; Wang, Y. Automated Machine Vision System for Liquid Particle Inspection of Pharmaceutical Injection. IEEE Trans. Instrum. Meas. 2018, 67, 1278–1297. [Google Scholar] [CrossRef]
  6. Shirmohammadi, S.; Ferrero, A. Camera as the Instrument: The Rising Trend of Vision Based Measurement. IEEE Instrum. Meas. Mag. 2014, 17, 41–47. [Google Scholar] [CrossRef]
  7. Ishii, A.; Mizuta, T.; Todo, S. Detection of Foreign Substances Mixed in a Plastic Bottle of Medicinal Solution Using Real-time Video Image Processing. In Proceedings of the Fourteenth International Conference on Pattern Recognition, Brisbane, QLD, Australia, 20 August 1998; Volume 2, pp. 1646–1650. [Google Scholar]
  8. Shi, X.; Tang, Z.; Wang, Y.; Xie, H.; Xu, L. HOG-SVM Impurity Detection Method for Chinese Liquor (Baijiu) Based on Adaptive GMM Fusion Frame Difference. Foods 2022, 11, 1444. [Google Scholar] [CrossRef] [PubMed]
  9. Zhou, F.; Su, Z.; Chai, X.; Chen, L. Detection of Foreign Matter in Transfusion Solution Based on Gaussian Background Modeling and an Optimized BP Neural Network. Sensors 2014, 14, 19945–19962. [Google Scholar] [CrossRef] [PubMed]
  10. Kappeler, A.; Katsaggelos, A.K.; Bertos, G.A.; Ashline, K.A.; Zupec, N.A. Visual Inspection System for Automated Detection of Particulate Matter in Flexible Medical Containers. U.S. Patent 20150213616A1, 30 July 2015. [Google Scholar]
  11. Ge, J.; Xie, S.; Wang, Y.; Liu, J.; Zhang, H.; Zhou, B.; Weng, F.; Ru, C.; Zhou, C.; Tan, M.; et al. A System for Automated Detection of Ampoule Injection Impurities. IEEE Trans. Autom. Sci. Eng. 2017, 14, 1119–1128. [Google Scholar] [CrossRef]
  12. Wang, Y.; Zhou, B.; Zhang, H.; Ge, J. A Vision-based Intelligent Inspector for Wine Production. Int. J. Mach. Learn. Cybern. 2012, 3, 193–203. [Google Scholar] [CrossRef]
  13. Feng, M.; Wang, Y.; Wu, C. Foreign Particle Inspection for Infusion Fluids via Robust Dictionary Learning. In Proceedings of the 10th International Conference on Intelligent Systems and Knowledge Engineering, ISKE 2015, Taipei, Taiwan, 24–27 November 2015; pp. 24–27. [Google Scholar]
  14. Yi, J.; Zhang, H.; Mao, J.; Chen, Y.; Zhong, H.; Wang, Y. Pharmaceutical Foreign Particle Detection: An Efficient Method Based on Adaptive Convolution and Multiscale Attention. IEEE Trans. Emerg. Top. Comput. Intell. 2022, 6, 1302–1313. [Google Scholar] [CrossRef]
  15. Lu, G.; Zhou, Y.; Yu, Y.; Du, S. A Novel Approach for Foreign Substances Detection in Injection Using Clustering and Frame Difference. Sensors 2011, 11, 9121–9135. [Google Scholar] [CrossRef] [PubMed]
  16. Kazmi, M.; Hafeez, B.; Aftab, F.; Shahid, J.; Qazi, S.A. A Deep Learning-Based Framework for Visual Inspection of Plastic Bottles. IEEE Access 2023, 11, 125529–125542. [Google Scholar] [CrossRef]
  17. Prunella, M.; Scardigno, R.M.; Buongiorno, D.; Brunetti, A.; Longo, N.; Carli, R.; Dotoli, M.; Bevilacqua, V. Deep Learning for Automatic Vision-Based Recognition of Industrial Surface Defects: A Survey. IEEE Access 2023, 11, 43370–43423. [Google Scholar] [CrossRef]
  18. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  19. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37. [Google Scholar]
  20. Huang, B.; Liu, J.; Zhang, Q.; Liu, K.; Liu, X.; Wang, J. Improved Faster R-CNN Network for Liquid Bag Foreign Body Detection. Processes 2023, 11, 2364. [Google Scholar] [CrossRef]
  21. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
  22. Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.T.; Upcroft, B. Simple Online and Realtime Tracking. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468. [Google Scholar]
  23. Wojke, N.; Bewley, A.; Paulus, D. Simple Online and Realtime Tracking with a Deep Association Metric. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3645–3649. [Google Scholar]
Figure 1. Examples of foreign particles observed during the manufacturing process of fluid bags.
Figure 2. The proposed system for foreign particle inspection.
Figure 3. The proposed gripper designed for holding a flexible fluid bag.
Figure 4. The proposed configuration of the illumination and image acquisition system.
Figure 5. Comparison of the refractive and blocking effects based on the substance: (a) bubble showing refraction of light; (b) particle blocking light.
Figure 6. Flowchart of detecting foreign particles using a deep learning model on OPAIs: (a) preprocessing of input frames; (b) tracking and constructing the OPAI; (c) detection of particles or bubbles.
Figure 7. Example of the original image and its OPAI. (a) Original big image with objects (637 × 885); (b) OPAI of the left (256 × 256).
Figure 8. Examples of OPAI. (a) OPAI example of particles; (b) OPAI example of bubbles.
Figure 9. Network and communication structure for information and control.
Figure 10. The proposed configuration of cameras and lights for visual inspection of flexible fluid bags.
Figure 11. Comparison of the effect of lower light for a flexible fluid bag. Lower light improves the distinguishability between bubbles and particles. (a) Image captured using only backlight; (b) image captured using both backlight and lower light. (c) Comparison of the bubble and particle regions without lower light and with lower light: The left images are copied from (a), and the right images are copied from (b) at the same positions. The red circles indicate bubbles, and the blue rectangle indicates a particle.
Figure 12. Comparison of the background rejection using different differential methods. (a) Consecutive input frames of a flexible fluid bag. (b) Comparison of binarization results from frame differencing: from the top, without compensation, with global motion compensation, and with the proposed near-similarity-based background removal. (c) Comparison of the number of objects labeled: based on the results shown on the left, there are 1249 objects, 763 objects, and 5 objects for each method from the top.
Figure 13. Examples of the constructed OPAIs and the inference results of each OPAI.
Figure 14. Confusion matrix of particles and bubbles.
Table 1. Performance comparison of the proposed system and other inspection systems.

| Author | Year | Inspection Object | Body Condition | Body Size | Object Type | Accuracy |
|---|---|---|---|---|---|---|
| Ishii [7] | 1998 | Plastic bottles (500 samples) | Evenly transparent and clean | Big (1000 mL) | Rigid body | 90% |
| Lu [15] | 2011 | Ampoules (200 samples) | Evenly transparent and clean | Small | Rigid body | 97.18% |
| Wang [12] | 2012 | Wine bottles (200 samples) | Evenly transparent and clean | Big | Rigid body | 96.60% |
| Zhou [9] | 2014 | Plastic bottles (200 samples) | Evenly transparent and clean | Medium (250 mL) | Rigid body | 98.904% |
| Feng [13] | 2015 | Bottles | Evenly transparent and clean | Medium | Rigid body | 97.18% |
| Ge [11] | 2017 | Ampoules (1000 samples) | Evenly transparent and clean | Small | Rigid body | 94.14% |
| Zhang [5] | 2018 | Plastic bottles (500 samples) | Evenly transparent and clean | Small | Rigid body | 97.25% |
| Yi [14] | 2022 | Vials, bottles | Evenly transparent and clean | Small | Rigid body | 94.4% |
| Shi [8] | 2022 | Wine bottles (1584 samples) | Evenly transparent and clean | Big | Rigid body | 97.5% |
| Ours | 2024 | Flexible bags (135 samples) | Unevenly transparent and printed on the surface | Big (500 mL) | Flexible body | 97.8% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
