Optimization of the Factory Layout and Production Flow Using Production-Simulation-Based Reinforcement Learning

Choi, Hyekyung; Yu, Seokhwan; Lee, DongHyun; Noh, Sang Do; Ji, Sanghoon; Kim, Horim; Yoon, Hyunsik; Kwon, Minsu; Han, Jagyu

doi:10.3390/machines12060390


by Hyekyung Choi ¹, Seokhwan Yu ¹, DongHyun Lee ¹, Sang Do Noh ¹,*, Sanghoon Ji ², Horim Kim ², Hyunsik Yoon ², Minsu Kwon ² and Jagyu Han ²

¹ Department of Industrial Engineering, Sungkyunkwan University, Seobu-ro, Jangan-gu, Suwon-si 2066, Gyeonggi-do, Republic of Korea

² Samsung Display, 1 Samsung-ro, Giheung-gu, Yongin-si 11773, Gyeonggi-do, Republic of Korea

* Author to whom correspondence should be addressed.

Machines 2024, 12(6), 390; https://doi.org/10.3390/machines12060390

Submission received: 29 April 2024 / Revised: 24 May 2024 / Accepted: 31 May 2024 / Published: 5 June 2024

(This article belongs to the Special Issue Digital Twin-Driven Smart Production, Logistics, and Supply Chains)


Abstract:

Poor layout designs in manufacturing facilities severely reduce production efficiency and increase short- and long-term costs. Analyzing and deriving efficient layouts for novel line designs or improvements to existing lines considering both the layout design and logistics flow is crucial. In this study, we performed production simulation in the design phase for factory layout optimization and used reinforcement learning to derive the optimal factory layout. To facilitate factory-wide layout design, we considered the facility layout, logistics movement paths, and the use of automated guided vehicles (AGVs). The reinforcement-learning process for optimizing each component of the layout was implemented in a multilayer manner, and the optimization results were applied to the design production simulation for verification. Moreover, a flexible simulation system was developed. Users can efficiently review and execute alternative scenarios by considering both facility and logistics layouts in the workspace. By emphasizing the redesign and reuse of the simulation model, we achieved layout optimization through an automated process and propose a flexible simulation system that can adapt to various environments through a multilayered modular approach. By adjusting weights and considering various conditions, throughput increased by 0.3%, logistics movement distance was reduced by 3.8%, and the number of AGVs required was reduced by 11%.

Keywords: production simulation; reinforcement learning; factory layout design; automated guided vehicles; simulation analysis

1. Introduction

The integrated optimization of the factory layout and logistics flow at the design stage is critical for increasing manufacturing efficiency and productivity and has attracted considerable research attention. Smart manufacturing systems and factories are developed to respond flexibly to customer demands and market fluctuations, and analyzing and monitoring them using simple simulations remains challenging. The rapid evolution of technology and changing customer demands have resulted in product diversification and frequent product and design changes. Consequently, the frequency of production line rearrangements and reconfigurations has increased considerably [1,2]. To analyze such dynamic manufacturing systems accurately and reliably, simulation models should be modified whenever the manufacturing system design or the production line configuration changes. However, modifying and reconstructing a simulation model can be challenging. Considerable data obtained from manufacturing sites are lost during refinement and transmission, and representing these data in real time in the virtual environment of a simulation model is difficult. Consequently, addressing these problems with conventional simulation models alone is challenging.

Advancements in this domain are consistent with Industry 5.0 principles, which emphasize human-centric and sustainable manufacturing systems. Industry 5.0 is focused on enhancing human–machine collaboration, promoting sustainable production processes, and integrating advanced technologies such as artificial intelligence and machine vision [3]. Integrating machine vision systems is crucial for achieving zero-defect manufacturing, ensuring high-quality and defect-free production through real-time monitoring and adaptive control in manufacturing processes [4]. For example, a systematic literature review on machine vision in the automotive industry highlights its role in various applications, including autonomous robots, augmented reality, and predictive maintenance, contributing to the development of an intelligent factory [5].

The proposed technology achieves operational efficiency by incorporating data-driven changes into the design of manufacturing systems and production line configurations. This study involves reviewing, validating, and analyzing operational situations to predict problems in advance. Additionally, optimization systems that analyze and evaluate the design of manufacturing systems quantitatively and realistically based on various simulation results are increasingly being preferred over conventional theories and experience-based designs.

Production simulation, virtual models, and the comprehensive representation of a physical system are widely used to understand performance parameters, enhance processes, and effectively elevate value-added activities [6]. As a digital counterpart to the physical system, production simulation is particularly pertinent during the design phase for optimizing efficiency [7]. Digital information plays a crucial role in asset management for optimizing production performance. Marginal improvements in throughput, product quality, and equipment reliability in high-productivity manufacturing environments can prove to be crucial [8]. Production simulation leverages collected information to simulate the manufacturing environment to empower stakeholders to make decisions on efficiency improvement, accuracy enhancement, and economies of scale [9]. Moreover, the interoperability among emerging technologies has considerable potential for the development of effective platforms and applications. This interoperability allows for the monitoring and control of manufacturing systems and facilitates their transition to sustainable manufacturing systems [10,11]. Modern manufacturing landscapes are characterized by the necessity for optimized layouts in production facilities to enhance efficiency and mitigate both short- and long-term costs. Unjustified layout designs can impede production workflows, which results in suboptimal performance. To address these challenges, in this study, we focused on the development of an advanced approach for factory layout optimization using production simulation technology and reinforcement learning.

The proposed methodology incorporates a cyber range, which allows for the comprehensive testing and validation of production simulations under various scenarios. This holistic approach encompasses the modeling and simulation of both facility layout and logistics flow, considering crucial components such as automated guided vehicles (AGVs), to ensure a factory-wide perspective. Reinforcement learning, a powerful machine-learning technique, was used in a multilayer manner to optimize various layout components. The results of optimizing each element were applied to production simulation for verification to create a closed-loop system that continuously refined the layout based on real-time data. A flexible simulation system was constructed to address the multifaceted objectives of layout analysis and optimization. This system facilitates the evaluation of alternative scenarios and emphasizes the redesign and reuse of simulation models. By considering both facility and logistics layouts, users can efficiently review and execute alternative scenarios to ensure adaptability to various manufacturing environments. This study covers conceptualization, methodology, job allocation, and reinforcement learning. It is consistent with the broad themes of Industry 4.0 and investigates the integration of production simulation technology to optimize layouts for discrete manufacturing workshops. The proposed approach contributes to the advancement of layout modeling, simulation, and optimization and provides a flexible and adaptive solution for the evolving requirements of modern manufacturing.

The rest of this paper is organized as follows: Section 2 reviews the existing literature and identifies research gaps related to factory layout optimization and production simulation; Section 3 details the proposed methodology, including the application of reinforcement learning and the optimization framework; Section 4 presents a case study and the resulting improvements in factory efficiency; Section 5 discusses the findings and compares them with existing methods to provide insights for generalization; and Section 6 presents a conclusion for the paper by summarizing key contributions, implications, and future research directions.

2. Related Work

Cyber-physical systems that connect physical and virtual spaces are critical for implementing smart manufacturing systems during Industrial Revolution 4.0. Production simulations are key to implementing such systems, and highly accurate representations are possible through connections for manufacturing execution systems (MESs) [12]. Production simulations reduce the gaps between sites through the vertical integration and horizontal co-ordination of instances at manufacturing sites and enhance process and system efficiency by improving inefficient processes at manufacturing sites. Optimal scenarios are analyzed and predicted during the design process based on situations that can occur during operations, including representing complex operational situations, predicting problems, establishing response measures when problems occur, and determining optimal design plans [13,14]. Therefore, production simulation technology is used to obtain layout design plans that respond flexibly to various requirements.

2.1. Utilizing Production Simulation for Factory Layout Design Optimization

Design errors lead to significant delays in the design and engineering of complex production processes, such as assembly lines [15]. In interviews with manufacturing engineers, Biesinger et al. detailed how production simulation technology can reduce layout design costs and improve product quality [16]. Conventional design methods offer limited flexibility for adjusting and optimizing layouts that have already been designed. Layout analysis that incorporates production simulations facilitates such adjustments and thereby improves design efficiency when deriving optimal layouts for manufacturing worksites consisting of multiple areas [17]. However, this approach is time-consuming. Liu et al. used the configuration–motion–control–optimization methodology to reduce design times and create designs for improving resource optimization and equipment utilization [18]. Guo et al. reported the effectiveness of analyzing and discovering flaws in designed layouts using a modular approach for flexible factory designs, which were categorized into data, equipment, buffers, storage, and AGVs [7]. These techniques reduce the layout design time in the initial design stage and support a systematic design. Furthermore, they verify the design plans in advance to minimize problems that could occur when layouts are adjusted or changed because of improper designs. Production simulations help discover problems early in the design stage and provide opportunities for improvement by evaluating the performance of the design plans in advance [19]. Problems associated with operating a factory can be minimized by reducing the expected costs and time spent in the production stage through layout improvements. Choi et al. used an integrated intelligent layout design framework to design the line balancing and cell/buffer locations of assembly line layouts and verified whether these designs were optimized [20]. Layouts are designed considering equipment arrangement costs and worksite area utilization rates to improve the production line balance (LOB) in large-scale mixed production lines in which product changes are frequent [21]. Therefore, layout designs that use production simulations in the design stage can address the problems that can occur at manufacturing sites by optimizing material handling, buffer locations, and logistics paths.

2.2. Advantages of Production Simulation in Manufacturing Optimization

Production simulations differ from conventional simulations because production simulations exchange and synchronize operation data collected on-site through sensors and IoT information technology. This difference allows future predictions to reflect the current situation through synchronization with the analysis target to overcome the limitations of system performance evaluations and analyses based on historical data [22]. If operating data generated by the factory are inadequately reflected in the simulation model, then discrepancies can occur in key performance indicator (KPI) predictions. Production simulations provide an alternative for overcoming disruptions in production schedules, including complex and urgent production processes. Production simulation models for complex aerospace part production lines can verify the production capacity of designed layouts and improve production capacity through real-time monitoring and production schedule adjustments [23]. Agostino et al. modified production plans by incorporating real-time operating data into a production simulation model to reduce the number of delayed tasks, demonstrating that production simulations can be used for optimizing manufacturing sites [24]. In another study, a model was synchronized with MES data to reduce uncertainty in task schedule predictions. Production schedules were adjusted according to the status of materials and assembly factory sequences, and feasible schedules were sent to the worksite [25]. Production simulations synchronize operations data and reflect modified layout conditions. Production simulations are advantageous when the locations of industrial robots in production lines are changed, and models that reflect these changes are automatically generated [26]. On production lines in which products change frequently, engineering costs can be reduced by automatically generating production simulation models that reflect updated equipment layout plans [27]. 
Thus, the automatic generation of models allows continuous collaboration among designers, on-site engineers, and layout engineers [28]. Production simulations considerably reduced the production process cost and time to pre-emptively resolve defects and problems that are difficult to predict during factory operations. Mykoniatis and Harris used a hybrid modular production simulation emulator approach to achieve optimal control system process variable settings and productivity before producing actual products [29]. Kumbhar et al. converted automated manufacturing sites into production simulations to pre-emptively detect bottlenecks in unspecified locations and provide diagnosed predictions to on-site managers [30]. Bottlenecks could be prevented by providing prediction information to support on-site managers’ decision-making. Furthermore, researchers increased the efficiency of logistics operations by incorporating CPS-based manufacturing site layout information and logistics movement equipment information into production simulations [31]. They enhanced the process management capacities of manufacturing sites by incorporating data collected from production line sensors and finding factors that cause lead time delays [32].

2.3. Leveraging Reinforcement Learning for Advanced Manufacturing Factory Layout Optimization

Artificial intelligence (AI) in the manufacturing industry is a continuously evolving field and is increasingly being used to enhance the efficiency of dynamically changing manufacturing site operations and support decision-making. To design manufacturing cell-level layouts, genetic algorithms, a class of heuristic algorithms, have been used to examine the workload, equipment capacity, and demand [33], and mixed-integer models for multilayer layout designs of cellular manufacturing systems have been presented to improve the design times and costs of layouts that consider process flows, productivity, and equipment utilization [34]. However, the layout arrangement problem is NP-hard, meaning that no polynomial-time algorithm for finding an optimal solution is known. Furthermore, solving this problem using the aforementioned heuristic algorithms is difficult. Because layout arrangement optimization is time-consuming, applying a reinforcement-learning algorithm that searches for optimal solutions by interacting with the environment becomes critical [35]. Furthermore, compared with metaheuristic approaches, reinforcement learning allows for design-time reductions through suitable environmental settings [36]. A Markov decision process is obtained by inserting an agent that acts with intention into a Markov reward process, in which rewards are defined over the set of all possible states and state transition probabilities [37]. Thus, a series of decision-making processes were developed to allow the agent to send and receive information from the environment and learn to make superior choices. Because of the Markov property, the next state depends only on the current state and action; given the current state, it is independent of past states. In a factory, various parameters exist, and the state spaces that comprise these parameters are independent of each other. Therefore, to optimize the factory layout, the model should ensure that, given the current state, past states do not stochastically influence state transitions over time. Unger and Börner proposed a method in which a Markov decision process algorithm is used to design layouts with optimal material flows according to arrangements based on the location and rotation of equipment [38]. Klar et al. used a double deep Q-learning algorithm to determine optimal equipment arrangement plans while minimizing logistics transport times [39] and transport costs between equipment to generate optimal paths in situations in which equipment was arranged [40]. Feldkamp et al. revealed that logistics transport and production lead times can be improved using data obtained from simulations and a deep-reinforcement-learning algorithm [41]. However, the learning outcomes vary depending on the method used to define rewards, and simulations are therefore needed to reflect dynamic changes at manufacturing sites.
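
For reference, the Markov property discussed in this subsection is commonly written as follows. The symbols $S_t$ and $A_t$ (state and action at step $t$) and $s'$ (a candidate next state) are notation assumed here for illustration and are not taken from the cited works. The identity says that the history influences the next state only through the current state and action.

```latex
\Pr\left(S_{t+1} = s' \mid S_t, A_t\right)
  = \Pr\left(S_{t+1} = s' \mid S_0, A_0, S_1, A_1, \dots, S_t, A_t\right)
```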

To overcome these limitations, this study proposed an optimized layout analysis method that leverages production simulations and reinforcement learning. By incorporating layers that optimize equipment locations, logistics flows, and AGV quantities, the proposed method provides a comprehensive framework for factory layout and logistics optimization. This approach not only addressed the gaps in the existing research but also provided a robust solution that can adapt to various manufacturing environments and operational conditions to ensure enhanced efficiency and productivity.

3. Optimal Design System Using Design-Stage Production Simulation to Design, Analyze, and Diagnose Logistics and Process Layouts

The proposed digital twin (DT)-based layout and logistics optimization analysis and design system supports decision-making when designing novel production lines or improving existing production lines. If sufficient preliminary verification of the production line design plans is not performed in the manufacturing preparation stage, then problems such as layout changes or process rearrangements can occur during the manufacturing operation stage. Relocation costs occur during layout rearrangement in the manufacturing operations stage [42], and analyzing indicators for future periods, such as growth, demand, and productivity, is difficult because of the uncertainty of future predictions. Factory productivity can be improved by rationalizing existing individual workplaces and plans that are based on the intuition and experience of manufacturing engineers. However, because equipment/logistics control and operations are organically fused and the scale of factories is increasing, pre-emptive analysis and evaluation methodologies are required that can analyze the interactions between major elements and ascertain the optimal design plans and conditions. To strengthen competitiveness in terms of productivity, this study proposed a system that supports adequate advance verification of the problems that can occur during the process design stage and minimizes waste in the operation stage.

3.1. Framework for Process and Production Flow Optimization

In the proposed layout design and analysis optimization system, production simulations are used to analyze various scenarios regarding the equipment arrangement and logistics paths that are considered during the design stage. Furthermore, the optimal quantity and utilization of logistics movement vehicles were devised to minimize logistics costs. Figure 1 depicts the proposed DT-based analysis and optimization framework. The framework consists of an information model, an interface module, a production simulation module, and an optimization module.

Reference Information. Data required in the manufacturing stage are categorized into plant, process, product, layout, equipment, and logistics information to prevent the database from becoming disordered and to efficiently manage available resources during the automatic generation and reuse of production simulations. Furthermore, the layout and logistics line information are integrated into a single set of information based on the reference information, and a single complete design scenario is sent to the interface module.
Interface Module. This module provides an interface for the design information of the production simulation and optimization modules. The production simulation interface sends the simulation master data to the DT simulation, receives simulation result data, and sends the data to the optimization interface module. The optimization interface module sends the master data to the optimization module, receives the optimization results, and sends the data to the DT simulation module.
Production Simulation Module. DT simulation is a discrete-event simulation model; when the module receives data, it generates an event in the DT base model and automatically designs a layout scenario based on the internal API. The DT library handles information regarding equipment, logistics paths, and AGVs, whereas the facility object handles information for each type of equipment. Synchronization was performed in the simulation engine, and the results were sent to the DT interface module at the end of the simulation.
Optimization Module. The optimization module has three layers to satisfy multiple objectives and optimize each layout element, as well as the entire layout. The topmost layer explores the location layout of the equipment in the available space, and the middle layer explores the logistics path rearrangements based on the results of the aforementioned layout. The bottommost layer determines the optimal AGV utilization to reduce logistics costs based on the aforementioned layout.

3.2. Production-Simulation-Based Layout Design and Analysis Sub-Framework

A universally applicable design-stage production simulation is critical for a system that performs layout design analysis and optimizes various layout scenarios. However, when existing simulation tools are used to create a model for a particular layout, reusing the model for a different layout or changing it to satisfy new requirements incurs costs. In this study, the data in the initial base model were reset based on the master data, and an internal API was used to construct a data-based layout design system to ensure the flexibility of the simulation design and usage. As depicted in Figure 2, when the simulation master data are sent from the DT interface module to the DT, the space information of the base model is initialized, and the layout analysis decision-making elements are constructed within the base model through the objects stored in the DT library. The elements that constitute the layout include equipment locations, logistics tracks, suitable quantities of AGVs within the factory, and logistics flows. When the layout is constructed, data synchronization and simulation are executed. The simulation results are sent outside the DT and passed through the DT interface module to the master data of the optimization module. Thus, engineers can verify various KPIs and set suitable weight values. This process is performed each time the data are sent from the DT interface module to the DT simulation module, and various layouts can be analyzed and optimized by repeating the DT simulation process.

3.3. Production Simulation Internal Production System Logic

The production system within the production simulation consists of in-line equipment and an AGV-based logistics system. Equipment is generated by importing the parameters that determine the twin’s equipment elements, such as the equipment-related process types, process times, equipment efficiency, mean time to repair (MTTR), and physical dimensions, from external data storage. Furthermore, AGV-based logistics follows rules based on the built-in ordering system and the shortest-path driving system, and tasks are assigned starting with the top AGV that is in an idle state so that the optimal quantity of AGVs can be determined during operations. When work on a product is completed, the transport task is assigned to an idle AGV, and the AGV moves the product to the next piece of equipment while maintaining a uniform equipment workload. The AGV movement path is based on a straight-line path, and the AGV travels along the shortest movement distance to its destination. Furthermore, speed adjustments are made to prevent collisions at logistics intersections, and the movement direction is set in the AGV object settings. Constructing the production system within the production simulation enables accurate analyses and evaluations and ensures user convenience.
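
The dispatching rule described above can be sketched as follows. This is a minimal illustration and not the simulation tool's actual logic: the `AGV` class, the `dispatch` and `agvs_used` functions, and the use of Euclidean distance for the straight-line path are all assumptions made for this example. Because tasks always go to the first idle AGV in list order, lower-ranked AGVs are only used when the higher-ranked ones are busy, so the number of AGVs that ever received a task hints at the fleet size actually needed.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class AGV:
    """Hypothetical AGV record; field names are illustrative only."""
    name: str
    x: float = 0.0
    y: float = 0.0
    idle: bool = True
    distance: float = 0.0  # cumulative travel distance
    tasks: int = 0         # number of transport tasks served


def dispatch(fleet, pickup, dropoff):
    """Assign a transport task to the first idle AGV in list order.

    Returns the chosen AGV, or None when every AGV is busy.
    Distances are straight-line (Euclidean), which is an assumption.
    """
    for agv in fleet:
        if agv.idle:
            to_pickup = hypot(pickup[0] - agv.x, pickup[1] - agv.y)
            to_dropoff = hypot(dropoff[0] - pickup[0], dropoff[1] - pickup[1])
            agv.distance += to_pickup + to_dropoff
            agv.x, agv.y = dropoff
            agv.idle = False
            agv.tasks += 1
            return agv
    return None


def agvs_used(fleet):
    """Count AGVs that served at least one task."""
    return sum(1 for agv in fleet if agv.tasks > 0)


if __name__ == "__main__":
    fleet = [AGV("agv1"), AGV("agv2"), AGV("agv3")]
    chosen = dispatch(fleet, pickup=(3.0, 4.0), dropoff=(3.0, 0.0))
    print(chosen.name, chosen.distance, agvs_used(fleet))
```

In a full simulation, an AGV would be marked idle again when it finishes its delivery; here that step is left to the caller.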

3.4. Reinforcement-Learning-Based Process and Production Flow Optimization

Production-simulation-based optimization has been investigated using various methods, and reinforcement-learning-based optimization in particular has attracted considerable research attention. Studies have focused on various methods of multi-objective reinforcement learning because decision-making systems are required for resolving the complex problems of production systems. Reinforcement learning is a machine-learning approach for policy decisions that consists of a series of tasks in which one or more agents explore an environment, ascertain the current state, and maximize reward accumulation. This study optimized process equipment locations and logistics movement paths and reduced logistics costs by using a reinforcement-learning algorithm known as Q-learning. In Algorithm 1, the Main Control Center initializes the episode and Q-table, calls the optimization layers, and manages the results. As depicted in Figure 1, three sub-frameworks (factory equipment layout optimization, logistics path optimization, and logistics utilization optimization) exist, and the overall system is used for improving irrational equipment layouts and logistics path inefficiencies. Algorithm 2 learns the layout optimization by considering equipment selection, equipment locations, distances between similar equipment, and layout constraints. Equipment of the same type is set up to form partial clusters within constraints such as pillars, and a single cluster includes two or three pieces of equipment. An action that explores equipment cluster arrangements is performed, and the optimal logistics movement tracks for the resulting equipment cluster arrangements are analyzed, as depicted in Algorithm 3. After the optimal logistics movement paths are analyzed, the minimum number of AGVs is determined based on AGV utilization, as presented in Algorithm 4.

Algorithm 1 Multi-Objective Reinforcement Learning.

Initialize episode counter and Q-table
for each episode do
    Reset environment and obtain initial state S
    while not terminal do
        Choose action A based on policy derived from Q (e.g., epsilon-greedy)
        Take action A, observe reward R and next state S′
        Update Q-table using Q-learning update rule:
            Q(S, A) ← Q(S, A) + α(R + γ max_{a′} Q(S′, a′) − Q(S, A))
        S ← S′
    end while
end for
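
The loop in Algorithm 1 can be made concrete with a small tabular Q-learning sketch. The environment below (a five-cell corridor with a goal at the right end) is a toy stand-in chosen for this example and is not the production simulation used in the study. The function names, the hyperparameter values, and the `max_steps` guard are likewise assumptions introduced for illustration.

```python
import random
from collections import defaultdict

GOAL = 4            # hypothetical corridor: cells 0..4, goal at the right end
ACTIONS = (-1, +1)  # move left or move right


def corridor_reset():
    """Return the initial state (leftmost cell)."""
    return 0


def corridor_step(state, action):
    """Toy deterministic environment standing in for a simulation run."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    reward = 1.0 if done else -0.01  # small step cost favors short routes
    return next_state, reward, done


def q_learning(reset, step, actions, episodes=500, max_steps=200,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)], zero-initialized
    for _ in range(episodes):
        state = reset()
        for _ in range(max_steps):  # guard against endless episodes
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = step(state, action)
            # Terminal states have no future value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
            td_error = reward + gamma * best_next - q[(state, action)]
            q[(state, action)] += alpha * td_error
            state = next_state
            if done:
                break
    return q


def greedy_action(q, state, actions):
    """Action with the highest learned Q-value in the given state."""
    return max(actions, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    table = q_learning(corridor_reset, corridor_step, ACTIONS)
    print([greedy_action(table, s, ACTIONS) for s in range(GOAL)])
```

In the setting of this paper, `corridor_step` would be replaced by a call that applies a layout action and returns the reward computed from the simulation KPIs; that coupling is not shown here.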

Algorithm 2 Facility Layout Optimization.

Input: Initial layout configuration
for each iteration do
    Evaluate current layout state
    Choose layout adjustment action (relocate equipment)
    Apply action and observe resulting state
    Update Q-table with new state and reward
end for

Algorithm 3 Logistic Path Optimization.

Input: Optimized equipment layout
for each iteration do
    Evaluate current logistics path state
    Choose path adjustment action (reroute path)
    Apply action and observe resulting state
    Update Q-table with new state and reward
end for

Algorithm 4 AGV Utilization Optimization.

Input: Optimized layout and logistics paths
for each iteration do
    Evaluate current AGV utilization state
    Choose AGV adjustment action (adjust number/reallocate tasks)
    Apply action and observe resulting state
    Update Q-table with new state and reward
end for

In the Q-learning framework, agents operate hierarchically across three layers, with each layer responsible for optimizing a different aspect of the factory layout and operations. First, in the facility layout optimization layer, the state of the agent is defined by the current configuration of equipment on the factory floor, specifically the equipment positions. The actions involve relocating pieces of equipment to different positions within the available space to enhance space utilization and operational efficiency. This layer ensures that equipment is placed in a manner that minimizes unnecessary movement and maximizes accessibility. Next, in the logistics routing optimization layer, the state encompasses the existing logistics paths used for material movement within the factory, that is, the routes that materials take from one piece of equipment to another. The actions in this layer involve modifying these logistics paths to optimize material flows. For example, paths can be rerouted to reduce travel distances, avoid congestion points, and improve overall material handling efficiency. Finally, in the AGV utilization optimization layer, the state includes the current utilization status of AGVs, such as the number of AGVs in use, their task assignments, and their distribution throughout the factory. The actions involve adjusting the number of AGVs in operation and reallocating their tasks to ensure balanced workload distribution and efficient material transport. This layer is focused on optimizing the use of AGVs to minimize idle time and enhance productivity. Throughout these layers, the Q-learning algorithm iteratively updates Q-values based on the rewards received from performing these actions. The hierarchical structure allows each layer to concentrate on specific optimization tasks, contributing to an overall optimized factory layout and operation.
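
As an illustration of how the state and actions of the facility layout layer might be encoded, the sketch below represents a layout as a tuple of grid cells and an action as the relocation of one piece of equipment to a free cell. The grid discretization, the `blocked` set (standing in for constraints such as pillars), and the function names are assumptions made for this example; the paper does not specify this encoding.

```python
from typing import FrozenSet, Iterator, Tuple

Cell = Tuple[int, int]      # (row, column) on a hypothetical floor grid
Layout = Tuple[Cell, ...]   # layout[i] is the cell occupied by equipment i
Action = Tuple[int, Cell]   # (equipment index, target cell)


def legal_actions(layout: Layout, rows: int, cols: int,
                  blocked: FrozenSet[Cell]) -> Iterator[Action]:
    """Yield every relocation of one piece of equipment to a free cell.

    A cell is free when no equipment occupies it and it is not blocked
    (for example, by a pillar).
    """
    occupied = set(layout)
    for index in range(len(layout)):
        for r in range(rows):
            for c in range(cols):
                cell = (r, c)
                if cell not in occupied and cell not in blocked:
                    yield index, cell


def apply_action(layout: Layout, action: Action) -> Layout:
    """Return the new layout after moving one piece of equipment."""
    index, cell = action
    moved = list(layout)
    moved[index] = cell
    return tuple(moved)


if __name__ == "__main__":
    start: Layout = ((0, 0), (0, 1))
    pillars = frozenset({(1, 1)})
    for act in legal_actions(start, rows=2, cols=2, blocked=pillars):
        print(act, "->", apply_action(start, act))
```

Because layouts are plain tuples, they are hashable and can be used directly as keys of a Q-table. The logistics path and AGV layers could be encoded analogously, with their own state tuples and action sets.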

Figure 3 depicts the class architecture of the optimization algorithm. The main control center manages the overall episodes of the algorithm and stores the layout information and layout evaluation scores. The state of the algorithm is defined as the arrangement of each equipment type, and the location co-ordinates of the logistics lines are converted into matrix form. The equipment arrangement, logistics paths, and number of AGVs were optimized by the three layers through optimization actions, and a production simulation was used to calculate the reward of the factory layout derived from the results of the actions.

To address the trade-off between multiple objective functions in our multi-objective reinforcement learning framework, we used a weighted-sum approach in which each objective function is assigned a weight reflecting its relative importance. This approach allows us to aggregate the objectives into a single scalar reward function. To obtain Pareto-optimal solutions, we generated a Pareto front by varying the weights and exploring various combinations of objective trade-offs. The resulting front enables us to identify a set of optimal solutions that provide a balanced trade-off among conflicting objectives, giving decision-makers the flexibility to select the most suitable solution based on their specific priorities and constraints.
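The Pareto-front idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dominates` and `pareto_front` are hypothetical, and only two of the five KPIs are used for brevity, namely the throughput and AGV-quantity values reported in Tables 4, 6, and 8. The AGV quantity is negated so that both objectives are maximized.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Keep the names of candidates that no other candidate dominates."""
    return [
        name
        for name, score in candidates.items()
        if not any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# (throughput in ea, negated AGV quantity), taken from Tables 4, 6, and 8.
layouts = {
    "Initial": (512, -9),
    "Case 1": (514, -10),
    "Case 2": (505, -8),
    "Case 3": (509, -8),
}

front = pareto_front(layouts)
print(front)
```

For these two objectives alone, Case 2 is dominated by Case 3, which reaches a higher throughput with the same number of AGVs; the remaining layouts are mutually non-dominated. Including the other KPIs could change which layouts lie on the front.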

The KPIs used to evaluate a layout through simulation are the throughput (P), area utilization (A), AGV utilization (O), logistics movement distance (L), and minimum AGV quantity (Q_{min}). Here, P, A, O, and L are calculated using Equations (1)–(4), and the Q_{min} reward is calculated using Equation (5) to ensure that an additional reward (0.1) is received each time the quantity is reduced by one AGV. The overall reward is given by Equation (6). In the simulation system, the logistics movement distance is calculated based on the movement of the AGVs. By tracking the AGVs' movement within the simulation, the system measures the time taken by the AGVs to travel and converts this into the total distance covered. This approach ensures that the measured logistics movement distance reflects the actual operational conditions and can be optimized effectively. The simulation results are stored in data storage, and layouts with higher evaluated rewards are found as learning progresses. From a given state (S_{t}), an action (A_{t}) is performed on the three elements: the equipment layout, the production flow, and the AGVs. The next state (S_{t+1}) and reward (R) are calculated by simulating the changed layout. Actions are selected according to the epsilon-greedy policy; in the greedy step, the action with the maximum Q(s, a) value is performed, as presented in Equation (7). This process is repeated, and the Q-table is updated according to Equation (8). Learning is repeated until the termination condition is met.

P = \frac{p - p_{min}}{p_{min} - p_{max}} \times 10

(1)

A = \frac{a - a_{min}}{a_{min} - a_{max}} \times 10

(2)

O = \frac{o - o_{min}}{o_{min} - o_{max}} \times 10

(3)

L = \frac{l - l_{min}}{l_{min} - l_{max}} \times 10

(4)

P_{1} = w_{1} \left( \frac{1}{p_{min} - p_{max}} \times 10 \right) \geq 0.1, \quad \text{subject to } 0 \leq w_{1} \leq 1

(5)

R = [(P \cdot w_{1}) + (A \cdot w_{2}) + (O \cdot w_{3}) + (L \cdot w_{4})] + (10 - Q_{min}) \cdot 0.1, \quad \text{subject to } \sum_{i=1}^{4} w_{i} = 1

(6)

\pi(s) = \arg\max_{a \in A} Q(s, a)

(7)

Q(s_{t}, a_{t}) = Q(s_{t}, a_{t}) + 0.5 \left( R + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_{t}, a_{t}) \right)

(8)
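As an illustration of Equation (6), the following minimal sketch computes the composite reward from KPI scores that are assumed to have already been normalized by Equations (1)–(4). The function name `composite_reward`, its parameter names, and the example scores are hypothetical; only the weighted sum, the constant 10, and the bonus of 0.1 per AGV come from the equation.

```python
from typing import Dict

KPI_KEYS = ("P", "A", "O", "L")  # throughput, area utilization, AGV utilization, logistics distance


def composite_reward(
    scores: Dict[str, float],
    weights: Dict[str, float],
    agv_quantity: int,
    reference_agvs: int = 10,     # the constant 10 in Equation (6)
    bonus_per_agv: float = 0.1,   # supplementary reward per AGV saved
) -> float:
    """Weighted sum of normalized KPI scores plus the AGV-quantity bonus of Equation (6)."""
    if abs(sum(weights[k] for k in KPI_KEYS) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    weighted = sum(weights[k] * scores[k] for k in KPI_KEYS)
    return weighted + (reference_agvs - agv_quantity) * bonus_per_agv


# Hypothetical normalized scores; the weights are those listed for Case Study 1.
example_scores = {"P": 5.0, "A": 2.0, "O": 3.0, "L": 4.0}
case1_weights = {"P": 1.0, "A": 0.0, "O": 0.0, "L": 0.0}
reward = composite_reward(example_scores, case1_weights, agv_quantity=9)
print(reward)
```

With all weight on throughput, the reward reduces to the throughput score plus the bonus for using one AGV fewer than the reference quantity.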

The update process for the learning’s states, actions, rewards, and Q-values is described in Algorithm 1. For the state of each layout, the equipment cluster locations and track co-ordinates were converted into a matrix form, and the layout states were classified using clustering algorithms. Table 1 lists the clustering algorithms used in this study.
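The action selection and update steps of Equations (7) and (8) can be sketched as follows. This is a generic tabular Q-learning fragment under stated assumptions rather than the authors' code: the learning rate of 0.5 is taken from Equation (8), whereas the discount factor `GAMMA = 0.9`, the function names, and the toy states and actions are assumptions introduced for illustration.

```python
import random
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

ALPHA = 0.5  # learning rate, as in Equation (8)
GAMMA = 0.9  # discount factor; not stated in the text, assumed here

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def choose_action(q: QTable, state: Hashable, actions: Sequence[Hashable],
                  epsilon: float, rng: random.Random) -> Hashable:
    """Epsilon-greedy selection: explore with probability epsilon, else apply Equation (7)."""
    if rng.random() < epsilon:
        return rng.choice(list(actions))
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Sequence[Hashable],
             alpha: float = ALPHA, gamma: float = GAMMA) -> float:
    """Apply the update of Equation (8) in place and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Toy usage with hypothetical cluster-index states and action labels.
q_table: QTable = defaultdict(float)
toy_actions = ["move_equipment", "keep"]
first = q_update(q_table, state=0, action="move_equipment", reward=2.0,
                 next_state=1, actions=toy_actions)
print(first)
```

In this sketch the state is any hashable label, which matches the use of cluster indices as discrete layout states; the mapping from a layout matrix to a cluster index is outside the scope of the fragment.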

3.5. Factory Layout Analysis and Optimization System Process

As depicted in Figure 4, when data are requested from the DT control center, they are sent to the equipment cluster optimization layer based on the design scenario data. In the equipment cluster optimization layer, simulations and an optimization algorithm are used to optimize the equipment arrangement and layout within the factory so that efficient production and resource use can be achieved. The equipment cluster design step simulates and optimizes resource movement and workflow within the factory and evaluates the efficiency of the production lines. The results of this simulation are sent to the logistics logic optimization layer, which determines the optimal paths and strategy for logistics and material movement and optimizes the movement paths of AGVs and other means of transport. Therefore, the efficiency and safety of logistics and material movement can be improved. Next, the results are sent to the AGV quantity optimization layer, where the AGV utilization is optimized. This layer adjusts the AGV quantity and arrangement to ensure the efficient operation of the AGVs in the production environment and to resolve bottlenecks in the production process. The optimization exploration is performed iteratively for these tasks, and the design layout data returned from all layers are output as an optimal factory design layout scenario. Thus, a factory layout that maximizes production efficiency and uses resources optimally can be designed.
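The flow through the three layers can be sketched as a simple loop. This sketch makes strong simplifying assumptions: each layer is replaced by a greedy candidate search instead of a Q-learning agent, and `toy_score` is a hypothetical stand-in for the reward returned by the production simulation. The names `Layout`, `optimize`, `equipment_layer`, `path_layer`, and `agv_layer` are illustrative and do not appear in the original system.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Layout:
    equipment: Tuple[str, ...]  # order of equipment clusters along the line
    paths: Tuple[str, ...]      # logistics tracks present in the layout
    agv_count: int              # number of AGVs in operation


Layer = Callable[[Layout], List[Layout]]


def equipment_layer(layout: Layout) -> List[Layout]:
    """Propose layouts obtained by swapping adjacent equipment clusters."""
    eq = layout.equipment
    return [replace(layout, equipment=eq[:i] + (eq[i + 1], eq[i]) + eq[i + 2:])
            for i in range(len(eq) - 1)]


def path_layer(layout: Layout) -> List[Layout]:
    """Propose adding a central logistics track if it is not present yet."""
    if "central" in layout.paths:
        return []
    return [replace(layout, paths=layout.paths + ("central",))]


def agv_layer(layout: Layout) -> List[Layout]:
    """Propose using one AGV fewer or one AGV more."""
    return [replace(layout, agv_count=n)
            for n in (layout.agv_count - 1, layout.agv_count + 1) if n >= 1]


def toy_score(layout: Layout) -> float:
    """Hypothetical stand-in for the simulation-based reward."""
    out_of_order = sum(1 for a, b in zip(layout.equipment, layout.equipment[1:]) if a > b)
    return (1.0 if "central" in layout.paths else 0.0) - abs(layout.agv_count - 8) - out_of_order


def optimize(layout: Layout, layers: Sequence[Layer],
             evaluate: Callable[[Layout], float], rounds: int = 3) -> Tuple[Layout, float]:
    """Pass the best layout through each layer in turn, keeping any improvement."""
    best, best_score = layout, evaluate(layout)
    for _ in range(rounds):
        for propose in layers:
            for candidate in propose(best):
                score = evaluate(candidate)
                if score > best_score:
                    best, best_score = candidate, score
    return best, best_score


start = Layout(equipment=("A", "C", "B"), paths=("loop",), agv_count=10)
best_layout, best_value = optimize(start, [equipment_layer, path_layer, agv_layer], toy_score)
print(best_layout, best_value)
```

The ordering of the layers mirrors the process described above: equipment arrangement first, logistics paths second, and AGV quantity last, repeated over several rounds.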

4. Case Study

The target factory produces small-sized panels through a modular display manufacturing process and contains large-scale equipment such as processing, inspection, and conveyance equipment. In the modular process, a task can be performed only after the preceding process for the product has been completed. Therefore, a subsequent process cannot be performed first, and the processes that each product has undergone must be tracked. When a product moves on to the next process, a complex sequence of steps must be analyzed, involving a loader that receives the object, a device that sends the object receipt signal to the MES, and an unloader that sets the object down. Consequently, significant costs are incurred when the designed factory layout is changed, and a decision-making support method is needed that can handle dynamic situations such as changes to processes, production planning, mixed production flows, and logistics.

4.1. DT Simulation and DT-Based Analysis

The system components included the interface, DT simulation, and optimization modules. Table 2 summarizes the development environment information. The DT interface module was written in VBA (version 7.1.1056) for Microsoft Office Excel, and the optimization interface module was loaded in a Python environment. Figure 5 depicts the DT simulation user-screen configuration. The simulation-based model simulates the operating logic and properties of each object and synchronizes reference information, such as the object's types, properties, and specifications, based on the data schema, and the models are generated automatically. Therefore, the operating state of the logistics system was simulated in a virtual environment, and the simulation results were statistically analyzed to derive layout design KPIs, such as equipment utilization, AGV utilization, and cost reduction. Siemens Plant Simulation 16.1 was used for the DT simulator, and the internal API was designed in SimTalk. In Figure 5, ① depicts the library that manages the information on the production simulation's equipment, processes, and operating logic; ② presents the production line's task information and constraints and the base model; and ③ displays the menu that controls and operates the simulation model and the menu for viewing the analysis of the simulation results. Equipment and logistics tracks were generated in the DT simulation, and the simulation results were analyzed, as depicted in Figure 6. The utilization rate of each piece of equipment, the utilization of each AGV, overall productivity, and AGV movement distances were analyzed; throughput, area utilization, AGV operation rate (%), total logistics distance (s), and AGV quantity were considered to be KPIs. Moreover, to connect to the optimization algorithm, a separate DT simulation version that includes a system that automates model generation and result analysis was constructed and used in the optimization process.

4.2. Optimizing Layouts Using Artificial Intelligence: Leveraging DT for Factory Layout Optimization

For manufacturing systems with complex and organically linked production line designs, equipment/logistics control, and operations, comprehensive analysis and evaluation technology is necessary to ascertain optimal operation plans and conditions in advance. Additionally, this technology can improve the overall production process and enhance productivity by providing efficient logistics movement paths. To construct such an analysis and optimization system, this study derived optimal AI-based equipment/logistics layout design plans and demonstrated production simulation technology. The factory layout optimization component performs logistics layout optimization, which includes equipment arrangements and logistics paths that minimize AGV movement times and maximize the operating efficiency of the overall system. Furthermore, to minimize factory operating costs, this study calculated the optimal number of AGVs to be assigned to the layout. Figure 7 depicts the layout of the initial factory. In Figure 7, the AGVs follow a one-way circular path, moving in a counterclockwise direction. The equipment for each process was arranged in a row, considering the material logistics flows. The AGV movement path for logistics was designed as a clockwise circular track allowing only one-way movement. Spatial constraints such as pillars exist, and equipment of the same type is placed between pillars in clusters of two or three.

4.3. Approach for Multi-Objective Optimization: Reward Formulation Case

For the predefined factory layout evaluation indicators, improving the five KPIs (throughput, area utilization, AGV operation, logistics movement distance, and AGV quantity) uniformly is critical for achieving optimization. In contrast to single-objective optimization, which has a unique optimal solution, multi-objective optimization determines a set of optimal solutions by considering various aspects of the problem. Throughput is a crucial indicator directly related to a company's profitability, and the area and AGV utilization influence cost reduction. Furthermore, the logistics movement distance is an important indicator of resource efficiency. Thus, optimal solutions can be obtained while simultaneously considering various objectives by setting appropriate weight values for each KPI. A decision that ultimately satisfies the objectives can be made by analyzing each case while considering these aspects. In this study, an optimization algorithm was applied by assigning weight values according to the relative importance of the indicators, and the results for various scenarios were analyzed. Therefore, optimal strategies were developed for each case, and a data-based optimal layout design system was constructed.

4.3.1. Reward Formulation Case Study 1

In Case 1, the multi-objective reinforcement learning described in Section 4.3 was performed with a high weight value assigned to throughput, which is the KPI directly related to profitability. Table 3 lists the weight values, and Figure 8 depicts the layout. As presented in Figure 8, a row arrangement is obtained for the equipment when the equipment, logistics movement paths, and AGV cost minimization are considered in the existing layout. Regarding logistics movement, a central logistics track was added to reduce the movement inefficiencies that occurred on the one-way circulating track. To evaluate the layout derived through reinforcement learning, each KPI was analyzed, as presented in Table 4. The throughput increased by 0.3%, and the logistics movement distance decreased by 3.8%. However, in terms of factory area costs and logistics costs, area utilization and AGV operation efficiency were reduced.
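The improvement ratios for Case 1 can be reproduced with a short calculation. The sketch below assumes that the ratio is the relative change with respect to the initial layout, with the sign flipped for KPIs where a lower value is better (logistics distance, AGV quantity); this definition is inferred from the tabulated values and is not stated explicitly in the text. The function name `improvement_ratio` is hypothetical.

```python
def improvement_ratio(initial: float, optimized: float, higher_is_better: bool = True) -> float:
    """Relative change in percent, signed so that a positive value means an improvement."""
    change = (optimized - initial) / initial * 100.0
    return change if higher_is_better else -change


# Case 1 values from the performance comparison table.
throughput_gain = improvement_ratio(512, 514)                                   # ea
distance_gain = improvement_ratio(613_961, 590_076, higher_is_better=False)     # sec
agv_gain = improvement_ratio(9, 10, higher_is_better=False)                     # ea

print(f"{throughput_gain:.2f} {distance_gain:.2f} {agv_gain:.2f}")
```

The computed values are approximately 0.39%, 3.89%, and −11.11%, which agree with the reported 0.3, 3.8, and −11 if the published figures are truncated to the displayed digits.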

4.3.2. Reward Formulation Case Study 2

In Case 2, high weight values were assigned to the logistics KPIs (Table 5), and multi-objective reinforcement learning was performed. Whereas learning in Case 1 was focused on productivity, in this case the logistics-related indicators, including the logistics movement distance, logistics costs, and logistics efficiency, were treated as critical for learning. As in Case 1, the layout derived from reinforcement learning was the result of considering equipment arrangements, logistics movement paths, and AGV cost minimization for the existing layout. As presented in Figure 9, an interspersed arrangement of A- and C-type equipment appeared in the row-arranged logistics system, which increased movement efficiency. Additionally, a central logistics path was added to reduce the movement distances caused by the interspersed arrangement. The KPI analysis results for evaluating the layout are presented in Table 6, which confirms that the logistics operations are efficient because fewer AGVs are required than in the existing layout. However, the interspersed arrangement of A- and C-type equipment was ineffective in improving overall throughput.

4.3.3. Reward Formulation Case Study 3

In Case 3, even weight values were assigned to the throughput and logistics KPIs to perform multi-objective reinforcement learning. The weight values assigned to throughput and the logistics KPIs are presented in Table 7. As in the previous cases, equipment arrangement, logistics movement paths, and AGV cost minimization were considered. Accordingly, the equipment arrangement consisted of an interspersed arrangement of B- and C-type equipment, and a central logistics track was added to increase logistics movement efficiency. Figure 10 shows the layout optimization result for Case 3, and Table 8 presents the results of the analysis of each KPI. As shown in Table 8, high efficiency can be achieved with a small number of AGVs. The area utilization indicator decreased because of the addition of the logistics movement path; however, compared with Cases 1 and 2, the KPIs changed more uniformly, and the layout performed well in terms of both productivity and logistics.

5. Discussion

In this section, a comparative analysis of the three case studies presented in Section 4 is presented and the strengths and weaknesses of each approach are discussed comprehensively.

In the first case study, the focus was on maximizing throughput. A high weight was assigned to throughput in the reinforcement-learning reward, which resulted in a layout that increased throughput slightly, by 0.3%. However, this layout reduced area utilization and AGV operational efficiency. The logistics movement distance was shortened by 3.8%, which indicated improved efficiency in the material flow.

In the second case study, logistics-related indicators such as logistics movement distance and costs were prioritized. This approach improved logistics operations, as evidenced by a 20% increase in AGV operational efficiency and an 11% reduction in the number of AGVs required. However, the throughput decreased by 1.3%, indicating that the trade-off for improved logistics is a slight reduction in productivity.

The third case study balanced the weights of the throughput and logistics KPIs. This balanced approach yielded more uniform results across the KPIs, with an 18% increase in AGV operational efficiency; the logistics movement distance, however, increased by 5.0% relative to the initial layout (Table 8). Although throughput decreased slightly by 0.5%, the performance changes were more evenly distributed than in the other two cases. Table 9 presents a detailed comparison of the KPIs across the three case studies.

6. Conclusions

This study proposed a system that optimizes manufacturing processes and production flows using production simulation technology. A layout design and analysis framework was constructed using a reinforcement-learning-based multi-objective optimization method, which was applied to cases to analyze its usefulness. The proposed system enables the optimization of equipment arrangements and logistics paths, which improves productivity, reduces logistics costs, and maximizes operational efficiency. Moreover, this method provides clear and quantitative analyses compared with conventional design methods that rely on intuition and experience.

Based on data, the proposed production-simulation-based factory layout optimization framework can verify production line design plans in advance, thereby improving productivity by preventing various problems that can occur in the production preparation stage and minimizing inefficiencies during the operation stage. The RL-based optimization method can be an effective approach for resolving complex problems in production line designs and logistics systems. By addressing multiple objectives through three case analyses, this approach provided manufacturing companies with the flexibility to respond to and utilize various operating objectives.

The results of this study demonstrated that the proposed approach can be effectively applied for designing processing and logistics systems in the manufacturing field and various other industrial sectors. The combination of production simulation technology and reinforcement learning provides novel possibilities for productivity improvement, operation optimization, and cost reduction. In the future, the scope of applications for this technology will be extended, which can contribute to the development of optimization solutions across diverse industrial fields.

This study addressed these problems by creating production simulations that focus on design data. In the future, plans for addressing problems in dynamic manufacturing environments by integrating real-time data are essential. Therefore, a real-time data collection and analysis methodology should be developed by linking IoT technology to production simulation systems. The development of custom optimization algorithms that consider the unique characteristics of various manufacturing processes and logistics systems is another critical area of research. Sophisticated and practical optimization solutions can be realized by developing algorithms that effectively account for the various constraints and objectives in complex manufacturing environments.

Finally, a systematic analysis and evaluation of the proposed system’s application at industrial sites and its success cases highlighted the effectiveness of the system. This process is vital for laying the foundation for the application of this system in the industry. Establishing standards for selecting and analyzing application cases, deriving practical improvement plans for each case, and constructing a continuous monitoring system are crucial for applying the proposed system to various industrial fields and maximizing its value and effectiveness.

Author Contributions

Funding acquisition, S.J., H.K., H.Y., M.K. and J.H.; investigation, S.Y. and D.L.; methodology, H.C.; supervision, S.D.N. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by Samsung Display Co., Ltd. (Project No. IAP2212004).

Data Availability Statement

The datasets generated and/or analyzed during the current study are not publicly available due to the reason of non-availability. These data are available from the corresponding author upon request.

Acknowledgments

This research was financially supported by the Ministry of Trade, Industry, and Energy (MOTIE) and the Korea Institute for Advancement of Technology (KIAT) through the International Cooperative Research and Development Program (Project No. 0022929). This work was also supported by the project for Smart Manufacturing Innovation R&D funded by the Korea Ministry of SMEs (Project No. RS-2022-00140261).

Conflicts of Interest

The authors declare no conflicts of interest. Sanghoon Ji, Horim Kim, Hyunsik Yoon, Minsu Kwon, and Jagyu Han report the following conflicts of interest. They are employees of SAMSUNG DISPLAY, which may have an interest in the research reported as part of this manuscript.

Figure 1. Production-simulation-based factory layout optimization framework.

Figure 2. Production-simulation-based factory layout analyses sub-framework.

Figure 3. System optimization UML class diagram.

Figure 4. Production-simulation-based factory layout optimization processor.

Figure 5. Factory layout optimization simulation model.

Figure 6. Simulation output: facility and AGV utilization reports.

Figure 7. Initial factory layout diagram.

Figure 8. Case 1 factory layout diagram.

Figure 9. Case 2 factory layout diagram.

Figure 10. Case 3 factory layout diagram.

Table 1. Clustering algorithm hyperparameters.

Algorithm	Hyperparameter	Optimal Value
KMeans	Board_Size	4 × 8 matrix
	cluster	391
DBSCAN	Board_Size	4 × 8 matrix
	eps	22.00
	Min_samples	2
	cluster	391
SOM	Board_Size	4 × 8 matrix
	Learning_rate	0.5
	Sigma	0.3
	cluster	26

Table 2. Software components and development environments.

Component	Subcomponent	Item	Contents
Interface module	Production simulation interface	Development environment	Microsoft Office Excel
		Programming language	VBA 7.1.1056
	Optimization interface	Development environment	Python environment
		Programming language	Python
DT simulation module		Development environment	Siemens Plant Simulation 16.1
		Programming language	SimTalk 2.0
Optimization module		Development environment	PyCharm 2022.1.2
		Programming language	Python 3.10.7

Table 3. Key performance indicators (KPIs) and weights for Case Study 1.

KPI	Unit	Weight
Throughput	ea	1
Area utilization	m²	0
AGV operation	%	0
Logistics distance	sec	0
AGV quantity	ea	Supplementary reward

Table 4. Performance comparison and improvement ratios for Case Study 1.

KPI	Unit	Initial Layout	Case 1	Improvement Ratio (%)
Simulation time	min	44,200	44,200	-
Throughput	ea	512	514	0.3
Area utilization	m²	1650	1760	−6.6
AGV operation	%	45.47	39.34	−13
Logistics distance	sec	613,961	590,076	3.8
AGV quantity	ea	9	10	−11
Composite score	-	9.51	10.32	8.5
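
The improvement ratios in Table 4 can be checked with a short calculation. The sketch below recomputes them from the initial and Case 1 values. The helper name and the direction flags are assumptions: the flags are inferred from the signs printed in the table (for example, a larger area or a larger AGV count is reported as a negative improvement) and are not stated by the table itself.

```python
def improvement_ratio(initial, optimized, higher_is_better=True):
    """Percentage change from the initial layout, signed so that positive means better."""
    change = (optimized - initial) / initial * 100.0
    return change if higher_is_better else -change


# (initial, Case 1, higher_is_better) per KPI; numeric values copied from Table 4.
# The direction flags are inferred from the signs in the table, not given by it.
TABLE_4 = {
    "Throughput": (512, 514, True),
    "Area utilization": (1650, 1760, False),
    "AGV operation": (45.47, 39.34, True),
    "Logistics distance": (613_961, 590_076, False),
    "AGV quantity": (9, 10, False),
    "Composite score": (9.51, 10.32, True),
}

ratios = {name: improvement_ratio(*row) for name, row in TABLE_4.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:+.2f}%")
```

The recomputed values agree with the table once the table's apparent truncation is taken into account: 2/512 is about 0.39%, which the table reports as 0.3.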

Table 5. Key performance indicators (KPIs) and weights for Case Study 2.

KPI	Unit	Weight
Throughput	ea	0.3
Area utilization	m²	0.1
AGV operation	%	0.2
Logistics distance	sec	0.4
AGV quantity	ea	Supplementary reward

Table 6. Performance comparison and improvement ratios for Case Study 2.

KPI	Unit	Initial Layout	Case 2	Improvement Ratio (%)
Simulation time	min	44,200	44,200	-
Throughput	ea	512	505	−1.3
Area utilization	m²	1650	1760	−6.6
AGV operation	%	45.47	54.72	20
Logistics distance	sec	613,961	656,664	−6.9
AGV quantity	ea	9	8	11
Composite score	-	9.787	9.819	0.3

Table 7. KPIs and weights for Case Study 3.

KPI	Unit	Weight
Throughput	ea	0.6
Area utilization	m²	0.1
AGV operation	%	0.1
Logistics distance	sec	0.2
AGV quantity	ea	Supplementary reward

Table 8. Performance comparison and improvement ratios for Case Study 3.

KPI	Unit	Initial Layout	Case 3	Improvement Ratio (%)
Simulation time	min	44,200	44,200	-
Throughput	ea	512	509	−0.5
Area utilization	m²	1650	1760	−6.6
AGV operation	%	45.47	53.75	18
Logistics distance	sec	613,961	645,021	−5.0
AGV quantity	ea	9	8	11
Composite score	-	9.893	10.153	2.6

Table 9. Performance comparison and improvement ratios of case studies.

Case	Evaluation Criteria	Productivity	Area Utilization	AGV Operation	Logistics Distance	AGV Quantity	Composite Score
Case 1	Weight	1	0	0	0	+0.1/ea	-
	Initial layout	512	1650	45.47	613,961	9	9.51
	Optimized layout	514	1760	39.34	590,076	10	10.32
	Improvement ratio (%)	+0.3	−6.6	−13	+3.8	−11	+8.5
Case 2	Weight	0.3	0.1	0.3	0.3	+0.1/ea	-
	Initial layout	512	1650	45.47	613,961	9	9.787
	Optimized layout	505	1760	54.72	656,664	8	9.819
	Improvement ratio (%)	−1.3	−6.6	+20	−6.9	+11	+0.3
Case 3	Weight	0.5	0.1	0.2	0.2	+0.1/ea	-
	Initial layout	512	1650	45.47	613,961	9	9.893
	Optimized layout	509	1760	53.75	645,021	8	10.153
	Improvement ratio (%)	−0.5	−6.6	+18	−5	+11	+2.6
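
Table 9 lists one weight per KPI and a supplementary reward of +0.1 per AGV. One way such a score could be assembled is a weighted sum of normalized KPI scores plus a per-AGV bonus. The sketch below is an assumption for illustration only: the function and parameter names, the reading of "+0.1/ea" as a bonus per AGV saved, and the normalized input scores are all hypothetical, so it is not expected to reproduce the composite scores reported in the tables.

```python
from math import isclose


def composite_score(kpi_scores, weights, agvs_saved=0, bonus_per_agv=0.1):
    """Weighted sum of normalized KPI scores plus a supplementary AGV reward."""
    if set(kpi_scores) != set(weights):
        raise ValueError("kpi_scores and weights must cover the same KPIs")
    if not isclose(sum(weights.values()), 1.0):
        raise ValueError("weights should sum to 1")
    weighted = sum(weights[name] * kpi_scores[name] for name in weights)
    return weighted + bonus_per_agv * agvs_saved


# Case 1 weights as listed in Table 9 (throughput only).
case1_weights = {
    "throughput": 1.0,
    "area_utilization": 0.0,
    "agv_operation": 0.0,
    "logistics_distance": 0.0,
}

# Hypothetical normalized scores, not values from the article.
example_scores = {
    "throughput": 10.0,
    "area_utilization": 5.0,
    "agv_operation": 5.0,
    "logistics_distance": 5.0,
}

# Case 1 uses one more AGV than the initial layout (9 to 10), hence agvs_saved=-1.
example_total = composite_score(example_scores, case1_weights, agvs_saved=-1)
print(example_total)
```

Keeping the AGV term outside the weighted sum mirrors how the tables list AGV quantity as a supplementary reward rather than as a weighted KPI.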

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
