Chip-Level Defect Analysis with Virtual Bad Wafers Based on Huge Big Data Handling for Semiconductor Production
Abstract
1. Introduction
- Extremely imbalanced data;
- Significant big data processing.
2. Related Work
2.1. Semiconductor Defect
2.2. Huge Data Handling
3. Proposed Methodology
3.1. Data Extraction
3.2. Data Filtering
- Chips with no history of defects in the EDS test process;
- Chips belonging to the same product as the defective chip.
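The two filtering criteria above can be read as a simple predicate over chip records. The sketch below is only an illustration of that reading: the `Chip` fields (`chip_id`, `product`, `eds_defect_history`) and the sample IDs are hypothetical names, not the schema used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    chip_id: str              # hypothetical identifier
    product: str              # hypothetical product code
    eds_defect_history: bool  # True if the chip ever failed an EDS test


def select_reference_chips(chips, defective_chip):
    """Keep chips with no EDS defect history that share the defective chip's product."""
    return [
        c for c in chips
        if not c.eds_defect_history and c.product == defective_chip.product
    ]


defective = Chip("W01-C07", "P-A", True)
population = [
    defective,
    Chip("W01-C08", "P-A", False),  # kept: clean history, same product
    Chip("W02-C01", "P-A", True),   # dropped: has EDS defect history
    Chip("W03-C05", "P-B", False),  # dropped: different product
]
reference = select_reference_chips(population, defective)
print([c.chip_id for c in reference])
```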
3.3. Data Sampling
3.4. Over-Sampling
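SMOTE, which appears in this paper's abbreviation list, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that core step, assuming Euclidean distance and a small `k`; the function name, parameters, and sample values are illustrative and not taken from the paper.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base (excluding base itself),
        # ordered by squared Euclidean distance
        others = [p for p in minority if p is not base]
        neighbours = sorted(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base))
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature measurements of defective (minority) chips.
defective = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)]
new_points = smote(defective, n_new=6)
print(len(new_points))
```

Because each synthetic point lies on a segment between two real minority points, it stays inside the range of the observed defective chips.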
3.5. Classification
- LR provides measures that quantify the correlation.
- LR is easier to interpret and explain.
- LR is resistant to overfitting in low-dimensional data.
- LR works well with continuous and numerical variables.
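To make the role of LR concrete, here is a minimal sketch of fitting a one-variable logistic regression to a single test item and measuring its accuracy. It uses plain batch gradient descent in pure Python for illustration only; the data values, learning rate, epoch count, and function names are assumptions, and the paper's actual tooling is not shown in this section.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def accuracy(xs, ys, w, b):
    """Share of chips whose predicted class (threshold 0.5) matches the label."""
    preds = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)


# Hypothetical standardized values of one test item; label 1 = defective chip.
values = [-1.5, -1.0, -0.8, -0.2, 0.1, 0.9, 1.2, 1.6]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(values, labels)
acc = accuracy(values, labels, w, b)
print(f"w={w:.2f}, b={b:.2f}, accuracy={acc:.2f}")
```

The sign and size of the fitted coefficient `w` are what make the model easy to interpret: a positive `w` means larger values of the test item are associated with defective chips.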
3.6. Critical Test Item Selection
- Exclude test items for which 80% or fewer of the defective chips have measured values (e.g., a test item with values for only 7 of 10 defective chips is excluded).
- Exclude test items where defective chips are concentrated in the lower positions, particularly for smaller-is-better test items (e.g., fail bit count).
- Rank the remaining test items by the harmonic mean of their LR correlation measure and accuracy values.
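The three rules above can be sketched as a filter followed by a sort. This is an illustrative reading, not the paper's implementation: the dictionary keys are hypothetical, `measure` is a placeholder for the LR-derived value that is combined with accuracy, and `defects_in_lower_range` is assumed to be computed elsewhere.

```python
def harmonic_mean(a, b):
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


def select_critical_items(items, min_coverage=0.8, top_n=10):
    kept = []
    for it in items:
        coverage = it["n_with_value"] / it["n_defective"]
        # Rule 1: drop items where 80% or fewer defective chips have values.
        if coverage <= min_coverage:
            continue
        # Rule 2: drop smaller-is-better items whose defective chips sit low.
        if it["smaller_is_better"] and it["defects_in_lower_range"]:
            continue
        # Rule 3: score by the harmonic mean of the measure and accuracy.
        kept.append({**it, "score": harmonic_mean(it["measure"], it["accuracy"])})
    return sorted(kept, key=lambda it: it["score"], reverse=True)[:top_n]


def item(name, n_with_value, smaller, lower, measure, acc):
    return {"name": name, "n_defective": 10, "n_with_value": n_with_value,
            "smaller_is_better": smaller, "defects_in_lower_range": lower,
            "measure": measure, "accuracy": acc}


# Hypothetical candidates, each evaluated on 10 defective chips.
candidates = [
    item("Item 1", 10, False, False, 0.90, 0.95),
    item("Item 2", 7, False, False, 0.95, 0.99),   # rule 1: 7 of 10
    item("Item 3", 8, False, False, 0.95, 0.99),   # rule 1: exactly 80%
    item("Item 4", 10, True, True, 0.95, 0.99),    # rule 2
    item("Item 5", 9, False, False, 0.50, 0.90),
]
ranked = select_critical_items(candidates)
print([(it["name"], round(it["score"], 2)) for it in ranked])
```

The harmonic mean penalizes an item that scores well on only one of the two values, which is why Item 5 falls well below Item 1 despite its high accuracy.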
Algorithm 1 Critical Test Item Selection
Input: Defective chips
Output: Rank and statistical results of the top 10 test items
3.7. Latent Defect Wafer Discovery
Algorithm 2 Latent Defective Wafer Discovery
Input: Selected test item, defective chips C
Output: Latent fail wafer list
4. Proposed System Architecture
4.1. Distributed and Asynchronous Architecture
4.2. Data Multicast
5. Performance Evaluation
5.1. Experimental Design
5.2. Experimental Results
5.2.1. Case Study: Metal Bridge Defect
5.2.2. Time Complexity
5.2.3. System Performance Evaluation
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
FAB | Fabrication |
EDS | Electrical Die Sorting |
vBADs | Virtual Bad Wafers |
MPP | Massive Parallel Processing |
EFA | Electrical Failure Analysis |
PFA | Physical Failure Analysis |
LR | Logistic Regression |
SMOTE | Synthetic Minority Over-Sampling Technique |
WBM | Wafer Bin Map |
ETL | Extract–Transform–Load |
DCI | Defect Correlation Index |
UDF | User-Defined Function |
MRR | Mean Reciprocal Rank |
Category | Parameter | Value |
---|---|---|
MPP DB | The number of physical servers | 3 ea |
MPP DB | The number of data nodes (segments) | 250 ea |
MPP DB | Cluster’s total SSD size | 210 TB |
MPP DB | Cluster’s total memory size | 72 TB |
MPP DB | The number of cluster’s total CPU cores | 512 cores |
Hadoop | The number of physical servers | 16 ea |
Hadoop | The number of data nodes | 50 ea |
Hadoop | Cluster’s total SSD size | 4.8 TB |
Hadoop | Cluster’s total memory size | 4 TB |
Hadoop | The number of cluster’s total CPU cores | 360 cores |
Hadoop | External storage (NAS) | 686 TB |
Rank | Test Step | Test Item | p-Value | Accuracy | ||
---|---|---|---|---|---|---|
1 | Hot Test | Voltage X | 0.99 | 0.001 | 0.99 | 0.99 |
2 | Repair Test | Test A | 0.54 | 0.683 | 0.91 | 0.68 |
3 | Hot Test | Test B | 0.53 | 0.004 | 0.94 | 0.68 |
4 | Hot Test | Test C | 0.47 | 0.769 | 0.90 | 0.61 |
5 | Repair Test | Test D | 0.47 | 0.010 | 0.87 | 0.61 |
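In the table above, the last unlabeled column agrees, within rounding, with the harmonic mean of the first unlabeled column and the Accuracy column, which matches the ranking rule in Section 3.6. This is an observation from the printed values rather than a statement from the paper; the quick check below uses the rows exactly as printed, and the tuple layout is an assumption made for illustration.

```python
def harmonic_mean(a, b):
    return 2 * a * b / (a + b)


# (test item, first unlabeled column, accuracy, last unlabeled column)
rows = [
    ("Voltage X", 0.99, 0.99, 0.99),
    ("Test A", 0.54, 0.91, 0.68),
    ("Test B", 0.53, 0.94, 0.68),
    ("Test C", 0.47, 0.90, 0.61),
    ("Test D", 0.47, 0.87, 0.61),
]

# Largest gap between the recomputed harmonic mean and the printed value.
max_diff = max(
    abs(harmonic_mean(measure, acc) - printed)
    for _, measure, acc, printed in rows
)
print(f"largest deviation from printed value: {max_diff:.4f}")
```

Deviations below 0.01 are what one would expect if the printed inputs were themselves rounded to two decimals.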
Rank | FAB Step | Equip. | Bad Ratio (%) | Good Ratio (%) | Ratio Gap (%) |
---|---|---|---|---|---|
402 | 70 | E073 | 100 (1/1) | 23 (23/100) | 77.00 |
403 | 88 | E198 | 100 (1/1) | 23 (23/100) | 77.00 |
404 | 10 | E051 | 100 (1/1) | 23 (23/100) | 77.00 |
405 | 26 | E032 | 100 (1/1) | 24 (24/100) | 76.00 |
406 | 92 | E017 | 100 (1/1) | 23 (23/100) | 77.00 |
Rank | FAB Step | Equip. | Bad Ratio (%) | Good Ratio (%) | Ratio Gap (%) |
---|---|---|---|---|---|
1 | 10 | E051 | 100 (90/90) | 23 (23/100) | 77.00 |
2 | 03 | E489 | 98.88 (89/90) | 32 (32/100) | 66.88 |
3 | 05 | E131 | 100 (90/90) | 33 (33/100) | 67.00 |
4 | 77 | E123 | 100 (90/90) | 52.52 (52/99) | 47.48 |
5 | 36 | E001 | 100 (90/90) | 62 (62/100) | 38.00 |
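In both equipment tables, the Ratio Gap column is consistent with Bad Ratio minus Good Ratio, in percentage points. A minimal sketch of that arithmetic, using the E051 and E001 rows as printed; the function name and the tuple layout are illustrative assumptions.

```python
def ratio_gap(bad_hits, bad_total, good_hits, good_total):
    """Return (bad ratio, good ratio, gap), all in percent."""
    bad_ratio = 100.0 * bad_hits / bad_total
    good_ratio = 100.0 * good_hits / good_total
    return bad_ratio, good_ratio, bad_ratio - good_ratio


# (bad wafers through equipment, bad wafers, good wafers through equipment, good wafers)
rows = {
    "E051": (90, 90, 23, 100),
    "E001": (90, 90, 62, 100),
}
gaps = {equip: ratio_gap(*counts)[2] for equip, counts in rows.items()}
ranked = sorted(gaps, key=gaps.get, reverse=True)
print(gaps, ranked)
```

With a single bad wafer, the bad ratio of any equipment can only be 0 (0/1) or 100 (1/1), so every equipment that wafer passed through starts from the same bad ratio and the gap depends on the good ratio alone.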
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kim, J.; Joe, I. Chip-Level Defect Analysis with Virtual Bad Wafers Based on Huge Big Data Handling for Semiconductor Production. Electronics 2024, 13, 2205. https://doi.org/10.3390/electronics13112205