Patent Keyword Extraction for Sustainable Technology Management
Abstract
1. Introduction
2. Related Work and Literature Review
2.1. Patent Analysis
2.2. Keyword Extraction
2.3. Papers vs. Patents
2.4. Document Similarity
3. Patent Keyword Extraction for Sustainable Technology Management
3.1. Quantitative Analysis for Sustainable Technology Management
3.2. Proposed Methodology for Keyword Extraction
4. Experimental Results and Analysis
4.1. Papers and Patents Data Acquisition
4.2. Verification of Document Similarity Between Papers and Patents
4.3. Performance Evaluation of Existing Method of Keyword Extraction
4.4. Statistical Exploration of a New Method for Keyword Extraction
4.5. Patent Keywords Extraction
5. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
Analysis Data | Database | Search Words | Period |
---|---|---|---|
Patents (30) | WIPSON | “Text mining” | 1 January 200–31 December 2015 |
Papers (30) | Google Scholar | “Text mining” | 1 January 200–31 December 2015 |
Group | N | Mean | SD | SE |
---|---|---|---|---|
Patents | 30 | 0.6473 | 0.0694 | 0.0127 |
Papers | 30 | 0.6179 | 0.0550 | 0.0100 |
Papers and Patents | 60 | 0.6450 | 0.0636 | 0.0082 |
Total | 120 | 0.6388 | 0.0638 | 0.0058 |
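The SE column above is consistent with the usual standard error of the mean, SE = SD / sqrt(N). The following is a minimal Python sketch that recomputes it from the reported N and SD values; the function name `standard_error` is our own label for illustration and does not come from the article.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)


# (group, N, SD) copied from the table above.
rows = [
    ("Patents", 30, 0.0694),
    ("Papers", 30, 0.0550),
    ("Papers and Patents", 60, 0.0636),
    ("Total", 120, 0.0638),
]

for group, n, sd in rows:
    print(f"{group}: SE = {standard_error(sd, n):.4f}")
```

Rounded to four decimals, the recomputed values agree with the SE column for all four rows.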
Source | SS | DF | MS | F | P |
---|---|---|---|---|---|
Between group | 0.018 | 2 | 0.009 | 2.201 | 0.115 |
Within group | 0.466 | 117 | 0.004 | | |
Total | 0.484 | 119 | | | |
No. | Author Keywords | The Number of Keywords | Precision (Accuracy) | TF-IDF Threshold |
---|---|---|---|---|
1 | Data; Mapping; Analysis; Clustering; Techniques; Competitive; Intelligence; Assigned; Classifications; Results; Validation; Linguistic; Technology; Packaging; Patent | 15 | 0.133 | 0.0760 |
2 | Data; Visualization tools; Competitive; Intelligence; Property; Review | 6 | 0 | 0.0995 |
3 | New; Business; Areas; Technological; Strength; Patent; Information; Data; Envelopment; Analysis; Text; Mining | 12 | 0.167 | 0.0608 |
4 | Text; Mining; Feature; Extraction; Categorization; Clustering; Customer; Relationship; Management | 9 | 0.111 | 0.0944 |
5 | Text; Mining; Taxonomy; Construction; Term; Extraction | 6 | 0 | 0.0680 |
6 | Text; Mining; Natural; Language; Processing; Information; Extraction; Summarization; Image; Question; Answering; Literature-Based; Discovery; Evaluation; User; Orientation | 16 | 0.125 | 0.0843 |
7 | M&A; Target; Selection; Technology; Acquisition; Patent; Analysis; Subject–Action–Object; Similarity | 9 | 0.222 | 0.0796 |
8 | Chance; Discovery; Text; Mining; Patent; Analysis; Significant-Rare | 7 | 0.143 | 0.1072 |
9 | Text; Mining; Hong; Kong; Hotels; Competitor; Intelligence; Marketing | 8 | 0 | 0.0944 |
10 | Design; Rationale; Representation; Discovery; Text; Mining; Patent | 7 | 0 | 0.0860 |
11 | Patent; Analysis; Competitor; Company; Ranking; Social; Network | 7 | 0.143 | 0.0562 |
12 | Conjoint; Analysis; Hybrid; Approach; Morphology; Patent; Information; Technology; Forecasting | 9 | 0.222 | 0.0550 |
13 | Technological; Opportunity; Discovery; Morphology; Analysis; Text; Mining | 7 | 0.143 | 0.0771 |
14 | Patent; Analysis; Text; Mining; Visualization; Techniques; Natural; Language; Processing | 9 | 0 | 0.0718 |
15 | Patent; Content; Representation; Retrieval; Extraction; Paraphrasing; Summarization; Visualization; Navigation; Valuing; PATExpert; Classification; Translation; Documentation; Ontologies; Knowledge; Base | 17 | 0.176 | 0.0673 |
16 | Monitoring; Technology; Intelligence; Patent; Analysis; Formal; Concept; Dynamic; Lattice | 9 | 0.222 | 0.0629 |
17 | Knowledge; Discovery; Text; Mining; Patent; Databases; Linguistic; Preprocessing; Correspondence; Analysis; Cluster | 11 | 0.182 | 0.0832 |
18 | Information; Retrieval; Text; Mining; Performance; Medical; Documentation | 7 | 0 | 0.0608 |
19 | Patent; Analysis; Knowledge; Discovery; Information; Visualization; Self; Organizing; Map; Citation; Networks; Nanoscale; Science; Engineering; Nanotechnology; Technological; Innovation; International; Interactions | 19 | 0.158 | 0.0393 |
20 | Text; Mining; Knowledge; Discovery; Post; Project; Reviews; Manufacturing; Construction | 9 | 0.222 | 0.0697 |
21 | Text; Mining; Data; Visualization; Tools; Patent; Information; Intellectual; Property; Analysis; Landscape; Business | 12 | 0.083 | 0.0665 |
22 | Patent; Mining; Retrieval; Vector; Space; Model | 6 | 0.167 | 0.0858 |
23 | Chemical; Named; Entity; Recognition; Conditional; Random; Fields; Text; Mining | 9 | 0.222 | 0.0912 |
24 | Text; Mining; Word; Distribution; Zipf’s; Law; STN; AnaVist; Thomson; Aureka; OmniViz; Stopwords; Patent; Mapping | 14 | 0.071 | 0.0790 |
25 | Business; Intelligence; Competitive; Advantage; Data; Mining; Information; Systems; Knowledge; Discovery | 10 | 0 | 0.0626 |
26 | Open; Source; Text; Information; Mining; Analysis; Multilinguality; Automated; Media; Monitoring | 10 | 0.200 | 0.1086 |
27 | Technical; Intelligence; Bibliometrics; Foresight; Management; Rapid; Analyses; Mining; Text; Knowledge; Discovery; Databases | 12 | 0 | 0.1216 |
28 | Text; Mining; Theory; Application | 4 | 0 | 0.1193 |
29 | Text; Mining; Summarization; Feature; Extraction; Patent; Classification; Clustering | 8 | 0.125 | 0.0626 |
30 | Text; Mining; Information; Retrieval | 4 | 0 | 0.1384 |
Average | | 9.5 | 0.108 | 0.0810 |
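The table above reports a per-document TF-IDF threshold, but the exact weighting and normalization are not shown here. As a rough illustration only, the sketch below assumes a common textbook form (term frequency as count divided by document length, inverse document frequency as log(N / df)) and a hypothetical threshold of 0.1. The corpus, the function names `tf_idf` and `keywords_above`, and the threshold are all assumptions for illustration, not the article's data or settings.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def keywords_above(weights: dict[str, float], threshold: float) -> set[str]:
    """Keep the terms whose weight is strictly above the threshold."""
    return {term for term, w in weights.items() if w > threshold}


# Toy corpus (illustrative only).
docs = [
    "patent text mining keyword extraction".split(),
    "text mining for hotel marketing".split(),
    "patent analysis with text clustering".split(),
]
scores = tf_idf(docs)
print(keywords_above(scores[0], threshold=0.1))
```

In this toy corpus, "text" occurs in every document and so receives a weight of zero, which is the kind of effect that can push frequent but important domain terms below a threshold.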
Group | N | Mean | SD | SE |
---|---|---|---|---|
Abstract | 30 | 0.1079 | 0.0864 | 0.0158 |
Introduction | 30 | 0.2318 | 0.1376 | 0.0251 |
Conclusion | 30 | 0.1980 | 0.1554 | 0.0284 |
Total | 90 | 0.1792 | 0.1387 | 0.0146 |
Source | SS | DF | MS | F | P |
---|---|---|---|---|---|
Between group | 0.246 | 2 | 0.123 | 7.303 | 0.001 |
Within group | 1.466 | 87 | 0.017 | | |
Total | 1.712 | 89 | | | |
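The entries of a one-way ANOVA table are linked by MS = SS / DF and F = MS(between) / MS(within). The sketch below recomputes these quantities from the rounded SS and DF values printed above; the helper name `f_statistic` is ours. Because the inputs are already rounded to three decimals, the result can only be expected to match the reported F approximately.

```python
def f_statistic(ss_between: float, df_between: int,
                ss_within: float, df_within: int) -> tuple[float, float, float]:
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# SS and DF copied from the table above.
ms_b, ms_w, f = f_statistic(0.246, 2, 1.466, 87)
print(f"MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f:.2f}")
```

The recomputed F is close to the reported 7.303; the small gap comes from using rounded sums of squares.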
(I) Component | (J) Component | Mean Difference (I − J) | SE | P |
---|---|---|---|---|
Abstract | Introduction | −0.1239 | 0.0297 | 0.000 |
Abstract | Conclusion | −0.0901 | 0.0325 | 0.024 |
Introduction | Abstract | 0.1239 | 0.0297 | 0.000 |
Introduction | Conclusion | 0.0338 | 0.0379 | 0.754 |
Conclusion | Abstract | 0.0901 | 0.0325 | 0.024 |
Conclusion | Introduction | −0.0338 | 0.0379 | 0.754 |
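The mean differences in this post hoc table follow directly from the group means reported for the Abstract, Introduction, and Conclusion (0.1079, 0.2318, and 0.1980). A short Python sketch of that arithmetic, with `mean_differences` as a helper name of our own:

```python
from itertools import permutations

# Group means copied from the descriptive statistics table.
means = {"Abstract": 0.1079, "Introduction": 0.2318, "Conclusion": 0.1980}


def mean_differences(group_means: dict[str, float]) -> dict[tuple[str, str], float]:
    """Difference mean(I) - mean(J) for every ordered pair of groups."""
    return {
        (i, j): round(group_means[i] - group_means[j], 4)
        for i, j in permutations(group_means, 2)
    }


diffs = mean_differences(means)
for (i, j), d in diffs.items():
    print(f"{i} - {j}: {d:+.4f}")
```

The sketch covers only the differences; the SE and P columns depend on the post hoc procedure used, which is not identified in the table.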
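No. | Author Keywords | Simultaneous Appearance in Three Components: Keywords | Simultaneous Appearance in Three Components: Precision | Simultaneous Appearance in Three Components: Accuracy | Simultaneous Appearance in Two Components: Keywords | Simultaneous Appearance in Two Components: Precision | Simultaneous Appearance in Two Components: Accuracy |
---|---|---|---|---|---|---|---|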
1 | 15 | 8 | 0.500 | 0.267 | 18 | 0.222 | 0.267 |
2 | 6 | 15 | 0.200 | 0.333 | 15 | 0.000 | 0.000 |
3 | 12 | 19 | 0.579 | 0.917 | 27 | 0.000 | 0.000 |
4 | 9 | 7 | 0.286 | 0.222 | 18 | 0.111 | 0.222 |
5 | 6 | 21 | 0.143 | 0.500 | 26 | 0.077 | 0.333 |
6 | 16 | 5 | 0.400 | 0.125 | 24 | 0.167 | 0.250 |
7 | 9 | 20 | 0.150 | 0.333 | 33 | 0.121 | 0.444 |
8 | 7 | 2 | 1.000 | 0.286 | 18 | 0.167 | 0.429 |
9 | 7 | 9 | 0.556 | 0.714 | 196 | 0.000 | 0.000 |
10 | 7 | 13 | 0.231 | 0.429 | 18 | 0.111 | 0.286 |
11 | 7 | 23 | 0.261 | 0.857 | 18 | 0.000 | 0.000 |
12 | 9 | 29 | 0.241 | 0.778 | 57 | 0.035 | 0.222 |
13 | 7 | 15 | 0.400 | 0.857 | 30 | 0.033 | 0.143 |
14 | 9 | 11 | 0.273 | 0.333 | 19 | 0.158 | 0.333 |
15 | 16 | 10 | 0.400 | 0.250 | 19 | 0.105 | 0.125 |
16 | 9 | 28 | 0.286 | 0.889 | 37 | 0.000 | 0.000 |
17 | 11 | 6 | 0.500 | 0.273 | 19 | 0.105 | 0.182 |
18 | 7 | 5 | 0.800 | 0.571 | 18 | 0.056 | 0.143 |
19 | 19 | 19 | 0.526 | 0.526 | 39 | 0.051 | 0.105 |
20 | 9 | 20 | 0.400 | 0.889 | 32 | 0.000 | 0.000 |
21 | 12 | 13 | 0.615 | 0.667 | 23 | 0.043 | 0.083 |
22 | 6 | 11 | 0.455 | 0.833 | 18 | 0.000 | 0.000 |
23 | 9 | 12 | 0.500 | 0.667 | 18 | 0.111 | 0.222 |
24 | 14 | 12 | 0.333 | 0.286 | 20 | 0.000 | 0.000 |
25 | 10 | 11 | 0.182 | 0.200 | 24 | 0.292 | 0.700 |
26 | 10 | 8 | 0.250 | 0.200 | 14 | 0.429 | 0.600 |
27 | 12 | 9 | 0.444 | 0.333 | 59 | 0.068 | 0.333 |
28 | 4 | 5 | 0.400 | 0.500 | 11 | 0.090 | 0.250 |
29 | 8 | 15 | 0.333 | 0.625 | 27 | 0.000 | 0.000 |
30 | 4 | 16 | 0.063 | 0.250 | 20 | 0.050 | 0.250 |
Average | 9.5 | 13.2 | 0.388 | 0.497 | 24.6 | 0.087 | 0.197 |
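The figures in this table are consistent with precision being the share of extracted keywords that match an author keyword, and accuracy being the share of author keywords that were recovered. For example, row 1 fits 4 matches out of 8 extracted keywords (0.500) and out of 15 author keywords (0.267). That reading is inferred from the numbers rather than stated in the table, so the sketch below should be treated as an assumption. It also assumes that "simultaneous appearance in three components" means a term occurs in all of the abstract, introduction, and conclusion. The term sets are toy data.

```python
def common_keywords(*components: set[str]) -> set[str]:
    """Terms that appear in every one of the given components."""
    return set.intersection(*components)


def precision(extracted: set[str], author: set[str]) -> float:
    """Matched keywords divided by the number of extracted keywords."""
    return len(extracted & author) / len(extracted) if extracted else 0.0


def accuracy(extracted: set[str], author: set[str]) -> float:
    """Matched keywords divided by the number of author keywords."""
    return len(extracted & author) / len(author) if author else 0.0


# Toy term sets (illustrative only).
abstract = {"patent", "keyword", "extraction", "text", "mining"}
introduction = {"patent", "keyword", "technology", "text", "mining"}
conclusion = {"patent", "keyword", "text", "method"}
author = {"patent", "text", "mining", "similarity"}

extracted = common_keywords(abstract, introduction, conclusion)
print(sorted(extracted))
print(f"precision = {precision(extracted, author):.3f}")
print(f"accuracy = {accuracy(extracted, author):.3f}")
```

Requiring a term to appear in all three components yields a shorter keyword list than requiring only two, which matches the smaller average keyword count in the three-component columns.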
No. | Author Keywords | Keywords | Accuracy | Precision |
---|---|---|---|---|
1 | 15 | 21 | 0.533 | 0.381 |
2 | 6 | 7 | 0.833 | 0.714 |
3 | 12 | 12 | 0.500 | 0.500 |
4 | 9 | 12 | 0.333 | 0.250 |
5 | 6 | 7 | 0.667 | 0.571 |
6 | 16 | 17 | 0.250 | 0.235 |
7 | 9 | 9 | 0.333 | 0.333 |
8 | 7 | 9 | 0.571 | 0.444 |
9 | 7 | 8 | 0.286 | 0.250 |
10 | 7 | 7 | 0.143 | 0.143 |
11 | 7 | 9 | 0.429 | 0.333 |
12 | 9 | 9 | 0.444 | 0.444 |
13 | 7 | 7 | 0.286 | 0.286 |
14 | 9 | 11 | 0.444 | 0.364 |
15 | 16 | 18 | 0.313 | 0.278 |
16 | 9 | 10 | 0.778 | 0.700 |
17 | 11 | 17 | 0.455 | 0.294 |
18 | 7 | 12 | 0.714 | 0.417 |
19 | 19 | 21 | 0.421 | 0.391 |
20 | 9 | 11 | 0.444 | 0.364 |
21 | 12 | 14 | 0.583 | 0.500 |
22 | 6 | 6 | 0.333 | 0.333 |
23 | 9 | 9 | 0.222 | 0.222 |
24 | 14 | 20 | 0.286 | 0.200 |
25 | 10 | 10 | 0.400 | 0.400 |
26 | 10 | 16 | 0.300 | 0.188 |
27 | 12 | 15 | 0.333 | 0.267 |
28 | 4 | 6 | 0.500 | 0.333 |
29 | 8 | 8 | 0.375 | 0.375 |
30 | 4 | 4 | 0.250 | 0.250 |
Average | 9.5 | 11.4 | 0.425 | 0.359 |
Measure | TF-IDF | Co-Word Analysis | Frequency Analysis |
---|---|---|---|
Number of keywords | 9.5 | 13.2 | 11.4 |
Precision (%) | 10.8 | 38.8 | 35.9 |
Accuracy (%) | 10.8 | 49.7 | 42.5 |
No. | Keywords | Precision | Accuracy |
---|---|---|---|
1 | 8 | 0.500 | 0.267 |
2 | 8 | 0.375 | 0.500 |
3 | 14 | 0.786 | 0.917 |
4 | 5 | 0.400 | 0.222 |
5 | 13 | 0.231 | 0.500 |
6 | 3 | 0.667 | 0.125 |
7 | 10 | 0.300 | 0.333 |
8 | 2 | 1.000 | 0.286 |
9 | 5 | 0.800 | 0.571 |
10 | 6 | 0.167 | 0.143 |
11 | 17 | 0.353 | 0.857 |
12 | 12 | 0.583 | 0.778 |
13 | 7 | 0.571 | 0.571 |
14 | 5 | 0.600 | 0.333 |
15 | 8 | 0.500 | 0.250 |
16 | 17 | 0.471 | 0.889 |
17 | 5 | 0.600 | 0.273 |
18 | 4 | 1.000 | 0.571 |
19 | 12 | 0.667 | 0.421 |
20 | 9 | 0.778 | 0.778 |
21 | 8 | 0.875 | 0.583 |
22 | 7 | 0.571 | 0.667 |
23 | 5 | 0.600 | 0.333 |
24 | 7 | 0.571 | 0.286 |
25 | 3 | 0.667 | 0.200 |
26 | 5 | 0.400 | 0.200 |
27 | 5 | 0.800 | 0.333 |
28 | 4 | 0.500 | 0.500 |
29 | 10 | 0.400 | 0.500 |
30 | 7 | 0.143 | 0.250 |
Average | 7.7 | 0.562 | 0.448 |
Measure | TF-IDF | Co-Word Analysis | Frequency Analysis | Combined Analysis |
---|---|---|---|---|
Number of keywords | 9.5 | 13.2 | 11.4 | 7.7 |
Precision (%) | 10.8 | 38.8 | 35.9 | 56.2 |
Accuracy (%) | 10.8 | 49.7 | 42.5 | 44.8 |
Measure | Difference of Means | T | P-Value |
---|---|---|---|
Precision | −0.174 | −3.203 | 0.002 |
Accuracy | 0.051 | 0.825 | 0.413 |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Kim, J.; Choi, J.; Park, S.; Jang, D. Patent Keyword Extraction for Sustainable Technology Management. Sustainability 2018, 10, 1287. https://doi.org/10.3390/su10041287