Acquisition Method of User Requirements for Complex Products Based on Data Mining
Abstract
1. Introduction
- The product life cycle generates a large body of design knowledge in technical reports, patent documents, maintenance records, and user evaluations [13]. The design information hidden in these sources is weakly structured, large in volume, and poorly utilized. How can valid data be selected, identified, and mined?
- In user-oriented product evaluation, users who rely on subjective experience find it difficult to grasp the relationship between requirement importance and customer satisfaction [14]. How can the importance of product requirements be reflected objectively so that user requirements can be determined?
- Patent data are an important carrier of research and development achievements [15]. Mining the patent texts of complex products is a quick way to obtain professional and technical knowledge. In this study, patent data were used to extract usable information and transform it into design knowledge, from which a product design knowledge base was constructed. The Latent Dirichlet Allocation (LDA) topic model was used to identify product keywords. The similarity of product requirement characteristics was calculated from the document–topic probability distribution, and keywords with similar attributes were clustered. The K-means algorithm was then used for a secondary clustering of topic words with similar characteristics, refining the key functional requirement categories and making the design requirements explicit.
- The topic words extracted by patent data mining are often highly specialized, so the functional requirements obtained by clustering lack regularity and systematization, which hinders the identification of innovative designs. Rough set theory needs no prior information to determine importance and does not depend on subjective judgment; its results are stable and determine the importance of design requirements objectively. In this paper, rough set theory was used to calculate the importance of product innovation design requirements so that requirements can be screened and ranked quickly. Combined with user satisfaction, importance–performance analysis (IPA) was then carried out to support innovation design decisions for complex products.
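
The secondary clustering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the LDA step is not shown, the topic-probability vectors are hypothetical values standing in for LDA output, and the farthest-first initialization is an assumption chosen only to keep the example deterministic. The keywords are borrowed from the correlation-basis table later in the article.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain K-means; returns one cluster label per input point."""
    # Farthest-first initialization (an assumption, used for determinism).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


# Hypothetical topic-probability vectors (three topics) for six topic words.
topic_vectors = {
    "automation": (0.8, 0.1, 0.1),
    "adaptability": (0.7, 0.2, 0.1),
    "energy consumption": (0.1, 0.8, 0.1),
    "maintenance cost": (0.2, 0.7, 0.1),
    "recyclable": (0.1, 0.1, 0.8),
    "reconfigurable": (0.1, 0.2, 0.7),
}

words = list(topic_vectors)
labels = kmeans(list(topic_vectors.values()), k=3)

clusters = {}
for word, label in zip(words, labels):
    clusters.setdefault(label, []).append(word)

for label, members in sorted(clusters.items()):
    print(label, members)
```

Words whose probability mass sits on the same topic end up in the same cluster, which is the sense in which the secondary clustering refines requirement categories.
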
2. Literature Review
2.1. Patent Data Mining
2.2. User Demand Acquisition
3. Acquisition Method of Data-Driven User Requirements
3.1. Acquisition Process of Data-Driven User Requirements
3.2. Information Collection and Preprocessing
- Obtain the initial corpus. The target product is selected from among many complex devices, and patent texts from Web data sources are collected as samples to construct the initial corpus.
- Eliminate the stop words. The preliminarily collected patent abstracts are noisy and have a low value density. They contain verbs, adjectives, adverbs, quantifiers, pronouns, and similar words, such as “proposed, constructed, reliable, one, based”, which occur frequently but do not reflect the subject content. To reduce this interference and keep the word segmentation accurate, a stop-word list is built according to the characteristics of the product and saved as a UTF-8 TXT file. The list is then used with the natural language processing software jieba to filter the text, removing redundant words and improving the data analysis results.
- Construct a custom dictionary. Because word segmentation software segments text according to a general-purpose vocabulary, professional terms may be segmented incorrectly. The patent texts of complex products used in this paper contain a large number of professional terms, so a custom dictionary is constructed to improve segmentation accuracy. Taking a gun drill machine as an example, without the custom dictionary the software splits “ball screw, coupling sleeve” into “ball, screw, coupling, sleeve”, and the meaning of the professional terms is lost. Adding the professional vocabulary to the custom dictionary before segmentation avoids losing key requirements, improves the accuracy of acquisition, and keeps the results in line with professional background knowledge.
- Conduct word segmentation and build the available knowledge base. After cleaning and denoising, the patent abstracts are segmented to obtain the available corpus. If the resulting vocabulary does not meet the professional requirements, the initial corpus is reprocessed and the stop-word list and the custom dictionary are updated. These steps are iterated until the segmentation is accurate, yielding the target data set and, finally, the available knowledge base.
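
The effect of the stop-word list and the custom dictionary can be illustrated with a small standard-library sketch. It is a stand-in for the jieba-based pipeline and does not call jieba: the function `segment`, the sample sentence, and the two word lists are hypothetical, and the tokenizer is a simple longest-match over whitespace-separated words.

```python
def segment(text, custom_terms, stop_words):
    """Tokenize text, keep multi-word custom terms intact, drop stop words."""
    words = text.lower().replace(",", " ").split()
    # Longest custom term, measured in words (1 if the dictionary is empty).
    max_len = max((len(term.split()) for term in custom_terms), default=1)
    tokens, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first so professional terms
        # are not split into their component words.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if n == 1 or phrase in custom_terms:
                tokens.append(phrase)
                i += n
                break
    return [token for token in tokens if token not in stop_words]


# Hypothetical inputs; the terms and stop words echo the examples in the text.
sentence = "A reliable ball screw based on the proposed coupling sleeve"
stop_words = {"a", "reliable", "based", "on", "the", "proposed"}
custom_terms = {"ball screw", "coupling sleeve"}

without_dictionary = segment(sentence, set(), stop_words)
with_dictionary = segment(sentence, custom_terms, stop_words)

print(without_dictionary)
print(with_dictionary)
```

Without the custom dictionary the professional terms fall apart into “ball”, “screw”, “coupling”, and “sleeve”; with it they survive as single tokens. In the iterative procedure described above, both word lists would be revised after each pass over the corpus.
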
3.3. Text Clustering Algorithm and Requirement Topic Extraction
3.4. Demand Importance Based on Rough Set Theory
3.5. Importance Performance Analysis
- Collect users’ performance evaluations of the design requirements: each user assigns each requirement a value between 0 and 1 according to their satisfaction with it.
- For each design requirement, average all user ratings to obtain the vertical-axis input of the IPA diagram. For the ith design requirement, the satisfaction ratings of the n users are summed and divided by n to give the average performance of that requirement, as shown in Formula (6).
- Use the rough-set-based importance of the design requirements from Section 3.4 as the horizontal-axis input of the IPA diagram. Deriving importance from the relationships in the requirements data avoids subjective interference.
- Build a two-axis coordinate system with the rough-set importance on the horizontal axis and the average user score of each requirement on the vertical axis, forming an R-IPA diagram, as shown in Figure 3. According to the region of the R-IPA diagram in which each innovative design requirement falls, analyze the direction of product innovation and make reasonable design and development decisions.
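
The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the requirement names, ratings, and importance values are hypothetical, the crosshair is placed at the mean of each axis (the text does not say where the quadrant boundaries lie), and the quadrant names follow the classic importance–performance grid.

```python
from statistics import mean


def average_performance(ratings):
    """Map each requirement to the mean of its user ratings (each in 0..1)."""
    return {req: mean(scores) for req, scores in ratings.items()}


def ipa_quadrants(importance, performance):
    """Label each requirement by its quadrant in the importance-performance grid."""
    # Assumption: quadrant boundaries are the mean importance and mean performance.
    importance_cut = mean(importance.values())
    performance_cut = mean(performance.values())
    labels = {}
    for req in importance:
        high_importance = importance[req] >= importance_cut
        high_performance = performance[req] >= performance_cut
        if high_importance and high_performance:
            labels[req] = "keep up the good work"
        elif high_importance:
            labels[req] = "concentrate here"
        elif high_performance:
            labels[req] = "possible overkill"
        else:
            labels[req] = "low priority"
    return labels


# Hypothetical satisfaction ratings from two users for four requirements.
ratings = {
    "DR_a": [0.9, 0.7],
    "DR_b": [0.2, 0.4],
    "DR_c": [0.8, 0.6],
    "DR_d": [0.3, 0.1],
}
# Hypothetical rough-set importance values (horizontal axis).
importance = {"DR_a": 0.40, "DR_b": 0.35, "DR_c": 0.15, "DR_d": 0.10}

performance = average_performance(ratings)
quadrants = ipa_quadrants(importance, performance)

for req, label in quadrants.items():
    print(req, round(performance[req], 2), label)
```

Requirements that are important but score poorly fall into the "concentrate here" quadrant, which is where design effort would normally be directed first.
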
4. Case Study
4.1. Information Collection and Preprocessing
4.2. Topic Clustering of Requirements Information
4.3. Requirements Importance Determination
4.4. Analysis and Discussion
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhan, Y.; Tan, K.H.; Huo, B. Bridging customer knowledge to innovative product development: A data mining approach. Int. J. Prod. Res. 2019, 57, 6335–6350.
- Zhang, Z.; Peng, Q.; Gu, P. Improvement of user involvement in product design. Procedia CIRP 2015, 36, 267–272.
- Li, Y.; Sha, K.; Li, H.; Wang, Y.; Dong, Y.N.; Feng, J.; Zhang, S.; Chen, Y. Improving the elicitation of critical customer requirements through an understanding of their sensitivity. Res. Eng. Des. 2023, 16, 1–20.
- Iwasaki, K.; Kuriyama, Y.; Kondoh, S.; Shirayori, A. Structuring engineers’ implicit knowledge of forming process design by using a graph model. Procedia CIRP 2018, 67, 563–568.
- Li, J.; Nie, Y.; Zhang, X.; Wang, K.; Tong, S.; Eynard, B. A framework method of user-participation configuration design for complex products. Procedia CIRP 2018, 70, 451–456.
- Tao, F.; Cheng, Y.; Zhang, L.; Nee, A.Y.C. Advanced manufacturing systems: Socialization characteristics and trends. J. Intell. Manuf. 2017, 28, 1079–1094.
- Peng, Y.; Huang, X.; Zhao, Y. An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges. IEEE Trans. Circuits Syst. Video Technol. 2017, 99, 2372–2385.
- Salminen, J.; Rao, R.G.; Jung, S.G.; Chowdhury, S.A.; Jansen, B.J. Enriching Social Media Personas with Personality Traits: A Deep Learning Approach Using the Big Five Classes. In Proceedings of the International Conference on Human-Computer Interaction, Copenhagen, Denmark, 19–24 July 2020.
- Li, H.; Mi, S.; Li, Q.; Wen, X.; Qiao, D.; Luo, G. A scheduling optimization method for maintenance, repair and operations service resources of complex products. J. Intell. Manuf. 2020, 31, 1673–1691.
- Guo, Q.; Xue, C.; Yu, M.; Shen, Z. A new user implicit requirements process method oriented to product design. J. Comput. Inf. Sci. Eng. 2019, 19, 11.
- Xie, N.; Chen, D.; Fan, Y.; Zhu, M. The acquisition method of the user’s Kansei needs based on double matrix recommendation algorithm. J. Intell. Fuzzy Syst. 2021, 41, 2.
- Han, X.; Li, R.; Wang, J.; Qin, S.; Ding, G. Identification of key design characteristics for complex product adaptive design. Int. J. Adv. Manuf. Technol. 2018, 95, 1215–1231.
- Cong, Y.; Yu, S.; Chu, J.; Su, Z.; Huang, Y.; Li, F. A small sample data-driven method: User needs elicitation from online reviews in new product iteration. Adv. Eng. Inform. 2023, 56, 101953.
- Zhou, Q.; He, L. Research on customer satisfaction evaluation method for individualized customized products. Int. J. Adv. Manuf. Technol. 2019, 104, 3229–3238.
- Choi, J.; Lee, J.; Yoon, J. Anticipating promising services under technology capability for new product-service system strategies: An integrated use of patents and trademarks. Comput. Ind. 2021, 133, 103542.
- Liu, L.; Li, Y.; Xiong, Y.; Cavallucci, D. A new function-based patent knowledge retrieval tool for conceptual design of innovative products. Comput. Ind. 2020, 115, 103154.
- Jia, L.; Wu, C.; Zhu, X.; Tan, R. Design by analogy: Achieving more patentable ideas from one creative design. Chin. J. Mech. Eng. 2018, 31, 10.
- Bai, Y.; Chou, L.; Zhang, W. Industrial innovation characteristics and spatial differentiation of smart grid technology in China based on patent mining. J. Energy Storage 2021, 43, 103289.
- Kim, J.; Lee, J.; Kim, G.; Park, S.; Jang, D. A Hybrid Method of Analyzing Patents for Sustainable Technology Management in Humanoid Robot Industry. Sustainability 2016, 8, 474.
- Wu, Y.; Ji, Y.; Gu, F. Identifying firm-specific technology opportunities in a supply chain: Link prediction analysis in multilayer networks. Expert Syst. Appl. 2023, 213, 119053.
- Pölzlbauer, G.; Auer, E. Applied patent mining with topic models and meta-data: A comprehensive case study. World Pat. Inf. 2021, 67, 102065.
- Kim, S.; Yoon, B. Patent infringement analysis using a text mining technique based on SAO structure. Comput. Ind. 2021, 125, 103379.
- Srinivasan, V.; Song, B.; Luo, J.; Subburaj, K.; Elara, M.R.; Blessing, L.; Wood, K. Does analogical distance affect performance of ideation? J. Mech. Des. 2018, 140, 71101.
- Zhang, Z.; Guo, J.; Zhang, H.; Zhou, L.; Wang, M. Product selection based on sentiment analysis of online reviews: An intuitionistic fuzzy TODIM method. Complex Intell. Syst. 2022, 8, 3349–3362.
- Blei, D.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
- Leng, B.; Zeng, J.; Yao, M.; Xiong, Z. 3D object retrieval with multitopic model combining relevance feedback and LDA model. IEEE Trans. Image Process. 2015, 24, 94–105.
- Lienou, M.; Maitre, H.; Datcu, M. Semantic annotation of satellite images using latent dirichlet allocation. IEEE Geosci. Remote Sens. Lett. 2010, 7, 28–32.
- Likas, A.; Vlassis, N.; Verbeek, J.J. The global k-means clustering algorithm. Pattern Recogn. 2003, 36, 451–461.
- Qi, J.; Zhang, Z.; Jeon, S.; Zhou, Y. Mining customer requirements from online reviews: A product improvement perspective. Inf. Manag. 2016, 53, 951–963.
- Ireland, R.; Liu, A. Application of data analytics for product design: Sentiment analysis of online product reviews. CIRP J. Manuf. Sci. Technol. 2018, 23, 128–144.
- Kim, H.; Noh, Y. Elicitation of design factors through big data analysis of online customer reviews for washing machines. J. Mech. Sci. Technol. 2019, 33, 2785–2795.
- Chen, Z.; Ming, X.; Zhang, X.; Yin, D.; Sun, Z. A rough-fuzzy DEMATEL-ANP method for evaluating sustainable value requirement of product service system. J. Clean. Prod. 2019, 228, 485–508.
- Ziarko, W. Variable precision rough set model. J. Comput. Syst. Sci. 1993, 46, 39–59.
- Ślęzak, D. Rough Sets and Bayes Factor; Springer: Berlin/Heidelberg, Germany, 2005; pp. 202–229.
- Martilla, J.A.; James, J.C. Importance-Performance Analysis. J. Mark. 1977, 1, 77–79.
- Mikulić, J.; Prebežac, D. Accounting for dynamics in attribute-importance and for competitor performance to enhance reliability of BPNN-based importance–performance analysis. Expert Syst. Appl. 2012, 39, 5144–5153.
- DiPietro, R.B.; Levitt, J.A.; Taylor, S.; Nierop, T. First-time and repeat tourists’ perceptions of authentic Aruban restaurants: An importance-performance competitor analysis. J. Destin. Mark. Manag. 2019, 14, 100366.
- Cao, B.; Liu, X.F.; Liu, J.; Tang, M. Domain-aware Mashup service clustering based on LDA topic model from multiple data sources. Inf. Softw. Tech. 2017, 90, 40–54.
- Jiang, Y.; Li, M.; Dennis, A.; Liao, X.; Ampaw, E.M. The Hotspots and Trends in the Literature on Cleaner Production: A Visualized Analysis Based on Citespace. Sustainability 2022, 14, 9002.
- Chen, W.; Shi, X.; Fang, X.; Yu, Y.; Tong, S. Research Context and Prospect of Green Railways in China Based on Bibliometric Analysis. Sustainability 2023, 15, 5773.
- Blei, D.; Ng, A.; Jordan, M. Latent Dirichlet Allocation. Advances in Neural Information Processing Systems 14. In Proceedings of the Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, Vancouver, BC, Canada, 3–8 December 2001.
- Yang, Q. LDA-based Topic Mining Research on China’s Government Data Governance Policy. Soc. Secur. Adm. Manag. 2022, 3, 2.
- Triayudi, A.; Haerani, R. Data Mining K-Means Algorithm for Performance Analysis. J. Phys. Conf. Ser. 2022, 2394, 1.
- Tang, C.; Wen, T.; Liang, Z.; Xu, X.; Mou, W. Fast acquisition method using modified PCA with a sparse factor for burst DS spread-spectrum transmission. ICT Express 2022, in press.
- Nguyen, H.T.; Safder, U.; Kim, J.; Heo, S.; Yoo, C. An adaptive safety-risk mitigation plan at process-level for sustainable production in chemical industries: An integrated fuzzy-HAZOP-best-worst approach. J. Clean. Prod. 2022, 10, 339.
- Mukherjee, P.; Pattnaik, P.K.; Al-Absi, A.A.; Kang, D.-K. Recommended System for Cluster Head Selection in a Remote Sensor Cloud Environment Using the Fuzzy-Based Multi-Criteria Decision-Making Technique. Sustainability 2021, 13, 10579.
Algorithm | Precision | Recall | F-Measures |
---|---|---|---|
LDA | 0.74 | 0.70 | 0.719 |
K-means | 0.62 | 0.59 | 0.604 |
LDA + K-means | 0.83 | 0.80 | 0.814 |
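
The F-measure column in the table above can be checked against the precision and recall columns. The sketch below assumes the usual balanced F-measure, the harmonic mean of precision and recall; the excerpt does not state which variant the authors used, so this is a consistency check rather than their formula.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the table above.
reported = {
    "LDA": (0.74, 0.70, 0.719),
    "K-means": (0.62, 0.59, 0.604),
    "LDA + K-means": (0.83, 0.80, 0.814),
}

recomputed = {name: f_measure(p, r) for name, (p, r, _) in reported.items()}

for name, (_, _, f_reported) in reported.items():
    print(name, f_reported, round(recomputed[name], 4))
```

The recomputed values are about 0.7194, 0.6046, and 0.8147. Each differs from the reported figure by less than 0.001, which is consistent with the reported values being truncated or derived from unrounded precision and recall.
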
Innovative Design Features | Correlation Basis |
---|---|
Adaptability | High degree of automation, strong adaptability to processing objects |
Reliability | Maintaining cutting accuracy, low failure, feed rate, spindle speed |
Economy | Low energy consumption, low maintenance cost, easy disassembly and assembly |
Environmental protection | Environmentally friendly, reconfigurable, and recyclable materials |
Economy | DR1 | DR2 | DR3 | DR4 | DR5 | DR6 | DR7 | DR8 |
---|---|---|---|---|---|---|---|---|
DR1 | 0 | 6 | 9 | 7 | 9 | 9 | 8 | 6 |
DR2 | 4 | 0 | 7 | 6 | 8 | 8 | 6 | 5 |
DR3 | 2 | 4 | 0 | 4 | 6 | 6 | 5 | 2 |
DR4 | 3 | 5 | 6 | 0 | 7 | 7 | 7 | 5 |
DR5 | 2 | 3 | 4 | 4 | 0 | 5 | 5 | 2 |
DR6 | 2 | 3 | 4 | 4 | 5 | 0 | 6 | 2 |
DR7 | 3 | 5 | 4 | 4 | 5 | 4 | 0 | 3 |
DR8 | 7 | 6 | 8 | 6 | 9 | 9 | 8 | 0 |
Innovative Design Requirements | DR1 | DR2 | DR3 | DR4 | DR5 | DR6 | DR7 | DR8 | PNi | ωi |
---|---|---|---|---|---|---|---|---|---|---|
DR1 | [0, 0] | [6, 7] | [7, 8] | [6, 7] | [7, 8] | [7, 8] | [7, 8] | [6, 7] | [46, 53] | 0.177419 |
DR2 | [4, 5] | [0, 0] | [5, 6] | [4, 5] | [6, 7] | [5, 6] | [5, 6] | [3, 4] | [32, 45] | 0.137993 |
DR3 | [2, 3] | [4, 5] | [0, 0] | [4, 5] | [5, 6] | [4, 5] | [5, 6] | [3, 4] | [27, 34] | 0.109319 |
DR4 | [4, 5] | [5, 6] | [4, 5] | [0, 0] | [5, 6] | [5, 6] | [4, 5] | [4, 5] | [31, 38] | 0.123656 |
DR5 | [2, 3] | [4, 5] | [3, 4] | [4, 5] | [0, 0] | [4, 5] | [4, 5] | [3, 4] | [24, 31] | 0.098566 |
DR6 | [2, 3] | [4, 5] | [4, 5] | [4, 5] | [4, 5] | [0, 0] | [5, 6] | [3, 4] | [26, 33] | 0.105735 |
DR7 | [2, 3] | [4, 5] | [4, 5] | [3, 4] | [3, 4] | [3, 4] | [0, 0] | [3, 4] | [22, 29] | 0.091398 |
DR8 | [4, 5] | [5, 6] | [6, 7] | [5, 6] | [6, 7] | [7, 8] | [7, 8] | [0, 0] | [40, 47] | 0.155914 |
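
The ωi column in the table above can be reproduced from the PNi intervals. The sketch below takes the midpoint of each interval and divides it by the sum of all midpoints. This rule is inferred from the tabulated numbers, since the weighting formula itself is not part of this excerpt, so it should be read as an observation about the table rather than as the authors' stated formula.

```python
# PNi intervals (lower, upper) copied from the table above.
pn_intervals = {
    "DR1": (46, 53),
    "DR2": (32, 45),
    "DR3": (27, 34),
    "DR4": (31, 38),
    "DR5": (24, 31),
    "DR6": (26, 33),
    "DR7": (22, 29),
    "DR8": (40, 47),
}

# Midpoint of each interval, then normalize so the weights sum to 1.
midpoints = {req: (low + high) / 2 for req, (low, high) in pn_intervals.items()}
total = sum(midpoints.values())
weights = {req: mid / total for req, mid in midpoints.items()}

# Rank requirements from most to least important.
ranking = sorted(weights, key=weights.get, reverse=True)

for req in ranking:
    print(req, round(weights[req], 6))
```

Under this rule the weights match the tabulated ωi values to six decimal places, with DR1 and DR8 ranked highest. One caveat: the upper bounds in the DR2 row sum to 39, not the 45 shown in its PNi cell, while the tabulated ω for DR2 matches the interval [32, 45].
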
User Requirements | Average Performance | User Requirements | Average Performance |
---|---|---|---|
DR1 | 0.506403 | DR5 | 0.508715 |
DR2 | 0.546499 | DR6 | 0.488646 |
DR3 | 0.450823 | DR7 | 0.547021 |
DR4 | 0.434433 | DR8 | 0.506403 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hao, J.; Gao, X.; Liu, Y.; Han, Z. Acquisition Method of User Requirements for Complex Products Based on Data Mining. Sustainability 2023, 15, 7566. https://doi.org/10.3390/su15097566