Article
Peer-Review Record

Effective Identification of Technological Opportunities for Radical Inventions Using International Patent Classification: Application of Patent Data Mining

Appl. Sci. 2022, 12(13), 6755; https://doi.org/10.3390/app12136755
by Wendan Yang 1,2, Guozhong Cao 1,2,*, Qingjin Peng 3, Junlei Zhang 2,4 and Chuan He 1,2
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 14 June 2022 / Revised: 26 June 2022 / Accepted: 27 June 2022 / Published: 3 July 2022

Round 1

Reviewer 1 Report

The authors investigate features of the international patent classification system as a basis for developing a tool with which to identify technological opportunities.

A well written paper with great detail.

Descriptions of each patent are learnt using NLP, K-Means clustering and IPC Network Analysis.

Improvements can be made with respect to the K-Means Clustering.

1. How was the number of clusters determined?

2. The clusters do not appear to be distinct. See figure 12. Need to improve the clusters to be more distinct. Try using PCA on the input data and then input the PCA variables into the K-Means cluster algorithm, to help make the clusters more distinct.

3. Perform post analysis on the cluster analysis. For example, perform an ANOVA test using each input variable to validate that the means for each cluster are significantly different.
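The suggestion in comment 2 can be sketched as follows. This is a minimal illustration, assuming scikit-learn and a synthetic feature matrix standing in for the patent data; the function name `cluster_with_pca`, the number of components, and the data are hypothetical and are not taken from the manuscript.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def cluster_with_pca(features, n_clusters, n_components, seed=0):
    """Project the features onto principal components, then run K-Means on them."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    return reduced, labels


# Synthetic stand-in for a patent feature matrix (assumption: 3 groups, 20 features).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 20))
features = np.vstack([c + rng.normal(size=(50, 20)) for c in centers])

reduced, labels = cluster_with_pca(features, n_clusters=3, n_components=2)

# The silhouette score is one common way to quantify how distinct the clusters are.
print("silhouette:", silhouette_score(reduced, labels))
```

Comparing the silhouette score with and without the PCA step would show whether the projection actually makes the clusters more distinct for a given data set.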

 

Author Response

Authors’ response to the reviewer 1 comments

Point 1: The authors investigate features of the international patent classification system as a basis for developing a tool with which to identify technological opportunities. A well written paper with great detail. Descriptions of each patent are learnt using NLP, K-Means clustering and IPC Network Analysis. Improvements can be made with respect to the K-Means Clustering. 1. How was the number of clusters determined?

Authors’ response: We are very grateful for your valuable comments and questions. Before using the K-Means clustering, we actually used the “perplexity” of Latent Dirichlet Allocation to determine an ideal number of clusters. Although different methods are available for determining the optimal number of clusters, such as PCA (Principal Components Analysis), PAM (Partitioning Around Medoids), the Calinski criterion, and the Gap Statistic, the perplexity was chosen since it is the most widely applied and well-developed method.

We have added the explanation for the number decision. Details are as follows.

(Section 3.2.3, Paragraph 6): The number of clusters should be determined before using the K–means clustering. The Latent Dirichlet Allocation’s perplexity is selected to estimate the appropriate number of clusters since it is most extensively used and well–developed.
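A perplexity-based choice of the number of clusters could look roughly like the sketch below. It assumes scikit-learn's `LatentDirichletAllocation` rather than the authors' own tooling, and the tiny document list, the candidate counts, and the helper name `perplexity_by_topic_count` are hypothetical. In practice perplexity is usually evaluated on held-out documents; the training data is reused here only to keep the example short.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer


def perplexity_by_topic_count(documents, candidate_counts, seed=0):
    """Fit one LDA model per candidate topic count and record its perplexity."""
    counts = CountVectorizer().fit_transform(documents)
    scores = {}
    for k in candidate_counts:
        lda = LatentDirichletAllocation(n_components=k, random_state=seed)
        lda.fit(counts)
        scores[k] = lda.perplexity(counts)
    return scores


# Hypothetical keyword strings standing in for patent descriptions.
docs = [
    "unmanned aerial vehicle rotor propeller motor",
    "camera gimbal stabilizer image sensor",
    "battery charging power management circuit",
    "rotor motor propeller blade folding arm",
    "image sensor camera lens gimbal",
    "power battery circuit charging station",
]

scores = perplexity_by_topic_count(docs, candidate_counts=[2, 3, 4])
best_k = min(scores, key=scores.get)  # lower perplexity is treated as better
print(scores, best_k)
```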

 

Point 2: 2. The clusters do not appear to be distinct. See figure 12. Need to improve the clusters to be more distinct. Try using PCA on the input data and then input the PCA variables into the K-Means cluster algorithm, to help make the clusters more distinct.

Authors’ response: Thanks for your suggestions. We have studied the suggested PCA carefully and found it has the potential to decrease the dimensions of highly complex data. However, the effect of the PCA is duplicated with that of the “perplexity” that we used.

We have added some descriptions and a new Figure 12 in Section 4.3 (Paragraph 1) to explain it as follows.

Next, a coherence model in Gensim is used to estimate the appropriate number of clusters. The perplexity for different numbers of clusters is shown in Figure 12. A smaller perplexity value indicates a better number of clusters. The cluster K value is thus set to 12.

 

To improve the distinction of clusters, noisy data should be removed from keywords extracted from patents. The Natural Language Processing technology and deep learning algorithms can be used to make the clusters more distinct.

We have added future research tasks in Section 6 (Paragraph 3) as follows.

In addition, the Natural Language Processing can extract meaningful keywords from the background art of patents, but the results contain some noisy data. In the future, methods to facilitate the extraction of features and reduce noise will be introduced by adding updated natural language processing technology and using deep learning algorithms, such as Subject–Action–Object analysis, dependency parsing, and structure topic model.

 

Point 3: 3. Perform post analysis on the cluster analysis. For example, perform an ANOVA test using each input variable to validate that the means for each cluster are significantly different.

Authors’ response: Following the suggestion, we have performed an ANOVA test to validate the means for each cluster. However, its significance is less than 0.05. To this end, methods to facilitate the extraction of features and reduce noise will be introduced to obtain better analysis results. We have added future research tasks in Section 6 (Paragraph 3) as follows.

(Section 4.3, Paragraph 3): To obtain better analysis results, we will also add multiple criteria to the input variables and run post-analysis on the results, such as a MANOVA test.
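The per-variable ANOVA check requested by the reviewer can be sketched as below. This is an illustration only, assuming SciPy's `f_oneway` and synthetic data; the helper `anova_per_feature` and the feature matrix are hypothetical. By the usual convention, a p-value below the chosen threshold (for example 0.05) is read as evidence that at least one cluster mean differs for that variable.

```python
import numpy as np
from scipy.stats import f_oneway


def anova_per_feature(features, labels):
    """Return one one-way ANOVA p-value per feature column, comparing cluster means."""
    groups = [features[labels == c] for c in np.unique(labels)]
    p_values = []
    for j in range(features.shape[1]):
        _, p = f_oneway(*(g[:, j] for g in groups))
        p_values.append(p)
    return np.array(p_values)


# Synthetic stand-in: 3 clusters of 40 samples each, 2 input variables.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 40)
features = rng.normal(size=(120, 2))
features[:, 0] += labels * 3.0  # column 0 differs by cluster; column 1 does not

p_values = anova_per_feature(features, labels)
print(p_values)
```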

Author Response File: Author Response.docx

Reviewer 2 Report

The manuscript "Effective identification of technological opportunities for radical inventions using International Patent Classification: application of patent data mining" is an interesting material, up-to-date, and of scientific value.

The authors correctly conducted a literature review, focusing on works strongly related to their manuscript.

In the introduction, however, I lacked a sentence in the description of the structure, and what methods will be used.

The framework presented in the paper is clear, as is the description of the procedure.

The authors took care of the detail.

In the case study, I also miss an explanation of why the authors chose just such a time range.

The structure of the article does not raise any objections, and neither did the selected methods and the results. The research procedure is transparent.

In the conclusions, I propose to add the information to whom the survey is addressed, and who can use the results. Please also emphasize your innovation.

Author Response

Authors’ response to the reviewer 2 comments

Point 1: The manuscript "Effective identification of technological opportunities for radical inventions using International Patent Classification: application of patent data mining" is an interesting material, up-to-date, and of scientific value. The authors correctly conducted a literature review, focusing on works strongly related to their manuscript. In the introduction, however, I lacked a sentence in the description of the structure, and what methods will be used.

Authors’ response: We are very grateful for your valuable comments. We have expanded the introduction section to better describe the used methods as follows.

(Section 1, Paragraph 5): Having recognized these gaps, this paper proposes a method for identifying technological opportunities of RIs based on IPC. To identify technological opportunities for RIs, it is necessary to distinguish them from general technological opportunities based on the characteristics of RIs. RIs show differentiated creativity because they come from solving inventive problems using new technology. Thus, the value of difficulty (VOD) of the problems solved and the value of technological novelty (VON) of new technology are applied to distinguish technological opportunities for RIs from others. Using IPC symbols and the order of filing dates of the patent set, technology manifested in a patent set with higher VONs is searched by a map of technology changes over time. Using the unstructured patent data, R&D themes of a patent set are decided by natural language processing and K–means clustering, and analyzed by complex network analysis to determine the VOD of each R&D theme. Finally, patents for technological opportunities of RIs are identified using an IPC–based coordinate system based on VON and VOD.
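The final step, placing items in a coordinate system spanned by VON and VOD, can be illustrated with a small sketch. The thresholds, the example values, and the function `classify_region` are hypothetical; the response does not state how the axes are divided or how regions are labeled in the manuscript.

```python
def classify_region(von, vod, von_threshold, vod_threshold):
    """Label a point by whether its VON and VOD reach the (assumed) thresholds."""
    von_level = "high" if von >= von_threshold else "low"
    vod_level = "high" if vod >= vod_threshold else "low"
    return f"{von_level} VON, {vod_level} VOD"


# Hypothetical (VON, VOD) pairs for three patents; thresholds are assumptions.
patents = {"P1": (0.9, 0.8), "P2": (0.2, 0.7), "P3": (0.6, 0.1)}
regions = {
    name: classify_region(von, vod, von_threshold=0.5, vod_threshold=0.5)
    for name, (von, vod) in patents.items()
}
print(regions)
```

Under this reading, items that score high on both axes would be the candidates of interest for radical inventions.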

 

Point 2: The framework presented in the paper is clear and the description of the procedure. The authors took care of the detail. In the case study, I also miss an explanation of why the authors chose just such a time range.

Authors’ response: Thanks for the comments.

As November 12, 2019 was the start date of this study, this specified time is mentioned for retrieval of the patent set. At the beginning of the study, a technical roadmap was developed for the study, including the research content, research methods and patent set to verify feasibility and effectiveness of the research results. In addition, it took about 2 years for Chinese patents to be granted after the initial disclosure. Therefore, to ensure the integrity of the annual statistics, we chose such a time range when studying granted patents.

We have revised the text with the retrieval date to make it clear as follows.

(Section 4, Paragraph 1): 19235 patents are gathered from Patsnap (https://analytics.zhihuiya.com) using the search query of “DESC: ("unmanned aerial vehicle" OR "UAV") AND PATENT_TYPE: (granted) AND APD: [20000101 TO 20191112]” (date: November 12, 2019).

As for DJI’s inventions, we have updated patent data in the text to describe the current trend as follows.

(Section 4.1): Patents belonging to DJI are gathered from the SIPO using the search query “Inventor = (Shenzhen Dajiang Innovation) AND DESC = ("unmanned aerial vehicle" OR "UAV" OR "camera drone") AND Invention_Type = (Invention) AND Application_Date ≤ (20220622)” (the retrieval date is June 23, 2022); 720 patents were collected.

Figure 10 illustrates the number of inventions about UAV and other products (camera stabilizer, advanced educational robots, pro accessories, etc.) that DJI applied for from 2010 to 2022. According to the graph, the number of UAV inventions DJI applied for yearly is more than that of other products, and UAV inventions account for more than 82% of the total. As presented in the graph, the number of UAV invention applications from 2010 to 2013 was only 1, 0, 3 and 3, respectively. However, there was substantial growth in the number of inventions DJI applied for: 50 in 2014 and 56 in 2015. The change in the number of inventions in 2013 and 2014 matches the fact that DJI first launched a drone named ‘Phantom’ in early 2013 and ‘Inspire’ in 2014, followed by the ‘Spark’ series and ‘Mavic’ series in the following years. It is clear from the data that the number of annual applications remained high for the next two years (197 inventions in 2016 and 250 inventions in 2017). In other words, DJI continues to maintain and even enhance its innovation capabilities. In addition, the numbers of patents filed in 2021 and 2022 shown in the figure are zero, which does not mean that DJI applied for no patents in these two years. It is more likely that the patents filed by DJI in 2021 and 2022 have not been granted yet, as it takes about 2 years for Chinese patents to be granted after the initial disclosure. Likewise, the downward trend in the number of patents shown in the graph from 2018 to 2020 does not reflect a recent lack of innovation at DJI, as there may be patents filed during this period but not yet granted. Finally, 720 inventions of UAV constitute a patent set. Their application dates, IPC symbols and description parts were extracted.

 

Point 3: The structure of the article does not raise any objections, and neither did the selected methods and the results. The research procedure is transparent. In the conclusions, I propose to add the information to whom the survey is addressed, and who can use the results. Please also emphasize your innovation.

Authors’ response: Thanks for the comment and suggestion. We have added a Discussion section (Section 5) to better explain to whom the survey is addressed, who can use the results, and what our study contributes, as follows.

The framework of identifying technological opportunities for radical inventions (RIs) is established in a novel way. Previous studies did not provide a clear understanding of the conditions under which radical inventions (RIs) form, and usually identified a large number of general technological opportunities before screening them out. This kind of post-screening is costly, with limited guidance for improving the effectiveness of the identification process. The method proposed in this study considers characteristics of RIs at an early stage and uses the value of difficulty (VOD) of the problems solved and the value of technological novelty (VON) of new technology applied to directly identify technological opportunities for RIs without extra cost on processing noisy data. It is helpful for managers and designers to quickly and accurately identify technological opportunities for RIs.

Patent maps are being developed for evaluating the technological novelty of patents. Previous studies have tended to deal with technical details in patent texts. However, after extracting a large number of micro-technical details from the patent text, they are still summarized into macro-technical points to represent the patent technology, which is not only time-consuming but also misleading. In comparison, using patent filing dates and international patent classification codes (IC), we built a novel patent map to determine the VON of patent technology. This method allows innovation managers to quickly locate novel patents, using the high VON of an IC to judge the high novelty of the patents classified under that IC. At the same time, the method can help industrial firms to improve the utilization of structured patent data.

The use of natural language processing, K-means clustering and complex network analysis in unstructured patent data has been applied to various tasks. Natural language processing is used to pre-process unstructured patent data into structured data and to combine the analysis results of K-means clustering and complex network analysis to get the desired information intelligently. In this study, we use the intelligent computation of VOD for each patent. With the spread of artificial intelligence, theoretical research and practical applications of intelligent technological opportunity analysis using patent data mining are continuously being developed. This will reduce the difficulty of technology development and encourage more firms into radical innovation.

Author Response File: Author Response.docx

Reviewer 3 Report

The paper seems original, as evidenced by the small percentage 9% of similarity in anti-plagiarism software (see the PDF file in the attachment)

A few recommendations to improve the final aspect of the manuscript for publication:

I would recommend to the authors to be clearer why their study (and Figures 10 and 11, Tables 5 and 6, etc.) stops with the data in 2016. However, 6 years have passed since then, we are in 2022 with actual good and bad things, with periods of pandemic, war, etc. It would be interesting to know how these situations affected the field of study, and what is the current trend.

I recommend improving / changing the background of Figure 1, especially in Square III and Square IV for better text visibility for readers.

I also recommend enlarging figure 8 (actually consisting of 4 charts), the text is too small to be clearly visible. The same recommendation remains for figures 11,13,14.

I recommend that authors re-format tables 5 and 6 so that they are not broken on different pages. It is a pity for the information provided.

The paper seems to have a good theoretical foundation, as evidenced by the fairly large number of bibliographic references, and a section of Related Works (including Technological opportunities identification for radical inventions and Patents-based identification methods for technological opportunities). From this point of view, I would like to read more clearly in this article with what authors contribute in addition to existing studies, in the Results / Discussion section.

Comments for author File: Comments.pdf

Author Response

Authors’ response to the reviewer 3 comments

Point 1: The paper seems original, as evidenced by the small percentage 9% of similarity in anti-plagiarism software (see the PDF file in the attachment)

A few recommendations to improve the final aspect of the manuscript for publication:

I would recommend to the authors to be clearer why their study (and Figures 10 and 11, Tables 5 and 6, etc.) stops with the data in 2016. However, 6 years have passed since then, we are in 2022 with actual good and bad things, with periods of pandemic, war, etc. It would be interesting to know how these situations affected the field of study, and what is the current trend.

Authors’ response: Thank you very much for taking time to review this manuscript.

The data used in this study (and in Figures 10 and 11, Tables 5 and 6, etc.) stopped in 2016 for two reasons: (1) November 12, 2019 was the start date of this study, or the specified time for retrieval of the patent set. At the beginning of the study, we developed a technical roadmap for the study, including the research content, research methods and the patent set to verify the feasibility and effectiveness of the research results. (2) It takes about 2 years for Chinese patents to be granted after the initial disclosure. As shown in Figure 8(a), the number of patents from 2017 to 2019 declined due to incomplete data during the invention examination process (Section 4, Paragraph 2, Lines 359-360). Therefore, the latest patent filing date is November 12, 2017, when studying granted patents. To ensure the integrity of the annual statistics, it was finally decided to limit the patent filing dates to 2016.

We have revised the text with the retrieval date to make it clear as follows.

(Section 4, Paragraph 1): 19235 patents are gathered from Patsnap (https://analytics.zhihuiya.com) using the search query of “DESC: ("unmanned aerial vehicle" OR "UAV") AND PATENT_TYPE: (granted) AND APD: [20000101 TO 20191112]” (date: November 12, 2019).

As for DJI’s inventions, we have updated the patent data to describe the current trend as follows.

(Section 4.1): Patents belonging to DJI are gathered from the SIPO using the search query “Inventor = (Shenzhen Dajiang Innovation) AND DESC = ("unmanned aerial vehicle" OR "UAV" OR "camera drone") AND Invention_Type = (Invention) AND Application_Date ≤ (20220622)” (date: June 23, 2022); 720 patents were collected.

Figure 10 illustrates the number of inventions about UAV that DJI applied for from 2010 to 2022. As presented in the graph, the number of UAV invention applications from 2010 to 2013 was only 1, 0, 3 and 3, respectively. However, there was substantial growth in the number of inventions DJI applied for: 50 in 2014 and 56 in 2015. The change in the number of inventions in 2013 and 2014 matches the fact that DJI first launched a drone named ‘Phantom’ in early 2013 and ‘Inspire’ in 2014, followed by the ‘Spark’ series and ‘Mavic’ series in the following years. It is clear from the data that the number of annual applications remained high for the next two years (197 inventions in 2016 and 250 inventions in 2017). In other words, DJI continues to maintain and even enhance its innovation capabilities. In addition, the numbers of patents filed in 2021 and 2022 shown in the figure are zero, which does not mean that DJI applied for no patents in these two years. It is more likely that the patents filed by DJI in 2021 and 2022 have not been granted yet, as it takes about 2 years for Chinese patents to be granted after the initial disclosure. Likewise, the downward trend in the number of patents shown in the graph from 2018 to 2020 does not reflect a recent lack of innovation at DJI, as there may be patents filed during this period but not yet granted.

Finally, 720 inventions of UAV constitute a patent set. Their application dates, IPC symbols and description parts were extracted.

 

We also updated Table 5, Table 6 and their descriptions in Section 4.4 (Paragraphs 2-3).

 

Point 2: I recommend improving / changing the background of Figure 1, especially in Square III and Square IV for better text visibility for readers.

I also recommend enlarging figure 8 (actually consisting of 4 charts), the text is too small to be clearly visible. The same recommendation remains for figures 11,13,14.

Authors’ response: We are very grateful for your valuable suggestions. We have improved Figure 1 and Figure 8 to make them readable. We have also improved Figures 11-15.

 

Point 3: The paper seems to have a good theoretical foundation, as evidenced by the fairly large number of bibliographic references, and a section of Related Works (including Technological opportunities identification for radical inventions and Patents-based identification methods for technological opportunities). From this point of view, I would like to read more clearly in this article with what authors contribute in addition to existing studies, in the Results / Discussion section.

Authors’ response: Thanks for the comment and suggestion. We have added a Discussion section (Section 5) to better explain the research contributions of the present study, as follows.

The framework of identifying technological opportunities for radical inventions (RIs) is established in a novel way. Previous studies did not provide a clear understanding of the conditions under which radical inventions (RIs) form, and usually identified a large number of general technological opportunities before screening them out. This kind of post-screening is costly, with limited guidance for improving the effectiveness of the identification process. The method proposed in this study considers characteristics of RIs at an early stage and uses the value of difficulty (VOD) of the problems solved and the value of technological novelty (VON) of new technology applied to directly identify technological opportunities for RIs without extra cost on processing noisy data. It is helpful for managers and designers to quickly and accurately identify technological opportunities for RIs.

Patent maps are being developed for evaluating the technological novelty of patents. Previous studies have tended to deal with technical details in patent texts. However, after extracting a large number of micro-technical details from the patent text, they are still summarized into macro-technical points to represent the patent technology, which is not only time-consuming but also misleading. In comparison, using patent filing dates and international patent classification codes (IC), we built a novel patent map to determine the VON of patent technology. This method allows innovation managers to quickly locate novel patents, using the high VON of an IC to judge the high novelty of the patents classified under that IC. At the same time, the method can help industrial firms to improve the utilization of structured patent data.

The use of natural language processing, K-means clustering and complex network analysis in unstructured patent data has been applied to various tasks. Natural language processing is used to pre-process unstructured patent data into structured data and to combine the analysis results of K-means clustering and complex network analysis to get the desired information intelligently. In this study, we use the intelligent computation of VOD for each patent. With the spread of artificial intelligence, theoretical research and practical applications of intelligent technological opportunity analysis using patent data mining are continuously being developed. This will reduce the difficulty of technology development and encourage more firms into radical innovation.

Author Response File: Author Response.docx
