A Dual-Level Prediction Approach for Uncovering Technology Convergence Opportunities: The Case of Electric Vehicles

Yi, Sang Kwon; Song, Chie Hoon

doi:10.3390/su17083607


by Sang Kwon Yi and Chie Hoon Song ^*

Department of Technology Management, Gyeongsang National University, Jinju-daero 501, Jinju 52828, Republic of Korea

^* Author to whom correspondence should be addressed.

Sustainability 2025, 17(8), 3607; https://doi.org/10.3390/su17083607

Submission received: 20 February 2025 / Revised: 31 March 2025 / Accepted: 14 April 2025 / Published: 16 April 2025

(This article belongs to the Section Economic and Business Aspects of Sustainability)


Abstract

The transition to electric vehicles is a critical step toward achieving carbon neutrality and environmental sustainability. This shift relies on advancements across multiple technological domains, driving the need for strategic technology intelligence to anticipate emerging technology convergence opportunities. To address this challenge, this study aimed at providing an analytical framework for identifying technology convergence opportunities using node2vec graph embedding. A dual-level prediction framework that combines similarity-based scoring and machine learning-based classification was proposed to systematically identify new potential technology linkages between previously unrelated technology areas. The patent co-classification network was used to generate graph embeddings, which were then processed to calculate edge similarity among unconnected nodes and to train the classifier model. A case study in the EV market demonstrated the framework can reliably predict future patterns across disparate technology domains. Consequently, advancements in battery protection, thermal management, and composite materials emerged as relevant for future technology development. These insights not only deepen our understanding of future innovation trends but also provide actionable guidance for optimizing R&D investments and shaping policy strategies in the evolving electric vehicle market. The findings contribute to a systematic approach to forecasting technology convergence, supporting innovation-driven growth in the evolving EV sector.

Keywords:

link prediction; node2vec; patent classification; patent analysis; technology management; edge similarity; node embedding

1. Introduction

Electric vehicles (EVs) have emerged as a pivotal solution for decarbonizing transportation, which is responsible for roughly one-quarter of global CO₂ emissions [1,2]. EVs, especially when powered by renewable energy, offer a promising path to sustainability. However, their adoption is challenged by high costs, limited range, and insufficient charging infrastructure, demanding innovative technological advancements [3]. Governments worldwide are enacting policies to accelerate the shift from combustion engines to EVs. For example, the European Union has agreed on legislation to ban the sale of internal combustion engine vehicles that run on fossil fuels starting in 2035 [4]. The International Energy Agency’s “Global EV Outlook 2024” forecasts that by 2035, over 25% of vehicles could be electric under current policies, with China leading growth [5]. This transition demands technological leadership, prompting companies like Sony and Honda to collaborate in ventures such as Sony Honda Mobility.

EV technology draws from multiple domains (automotive, electronics, materials, etc.), making technology convergence a key driver of innovation. In fact, EVs are a convergence product at the intersection of traditional automobile engineering and advanced electronics. Identifying where these diverse technologies could intersect next is crucial for guiding future EV development. Patent data provide a rich basis for such analysis. For example, patent classification co-occurrence networks have been widely used to reveal interrelationships between technological domains [6]. By examining which technology classes appear together in patent documents, researchers can detect nascent convergence patterns that signal new development opportunities. This process, known as strategic technology intelligence, enables firms to identify technological opportunities and potential threats that could affect their future growth [7]. It encompasses several key activities, such as monitoring technological advancements or forecasting future technological developments, which are critical for firms to stay competitive in this dynamic market.

Technology convergence, which involves the merging of two or more distinct technological domains into new interdisciplinary fields, provides the impetus for companies to explore new markets beyond their familiar technology areas [8], serving as key to overcoming EV challenges. For instance, integrating automotive engineering with electronics and materials science can enhance battery efficiency and vehicle safety. Decision-making based on convergence dynamics and forecasting of promising technologies during the R&D planning and demonstration phase is an important factor in creating a sustainable innovation ecosystem [9]. Patent information is commonly utilized as an indicator to analyze technology convergence [10]. The patent classification system categorizes patent documents by subject matter, providing a structured approach to understanding technological domains. By examining the co-occurrence relationship among these classifications, researchers can identify potential technology convergence patterns [11].

Previous studies in the EV field have primarily focused on trend analysis [12,13] or tracing technology transfer networks using patent data [14]. Although these studies effectively captured the overall direction of technological evolution, these retrospective approaches rarely predict future developments, limiting their ability to guide proactive R&D. Moreover, there is a lack of methodological research that applies machine learning techniques to identify new technology development opportunities from a convergence perspective. Some efforts, such as Feng et al. (2020) [15], have used resource allocation for convergence prediction but were constrained by narrow timeframes and reliance on a single indicator. Another study has used graph embedding methods to analyze R&D collaborations [16], but their application to predicting EV technology convergence remains limited. Wang and Li (2024) [17] combined link-prediction with node2vec graph embedding for technology forecasting, but they applied it to a co-word network, overlooking the convergence perspective. This study addresses these gaps by introducing a dual-level prediction framework using node2vec graph embedding and machine learning to forecast technology convergence opportunities in the EV sector.

Consequently, this study aims to achieve the following objectives:

(1): Characterize the current EV technology landscape using patent data, by identifying and summarizing the trends in patent applications, prominent technological domains (CPC classifications), and key patent assignees;
(2): Develop a predictive analytical framework (dual-level prediction approach) for uncovering potential technology convergence opportunities within the EV sector;
(3): Empirically identify new and promising technological convergence opportunities in the EV patent landscape using the proposed dual-level predictive approach.

To achieve these objectives, this study constructed a patent co-classification network using Cooperative Patent Classification (CPC) codes and applied the node2vec algorithm to generate graph embeddings. These embeddings are utilized in a dual-level prediction framework combining similarity-based scoring and machine learning classification, systematically exploring latent connections across disparate technological domains. This framework is called dual-level, because it is designed to enhance reliability by utilizing node2vec embeddings in two complementary ways: (a) similarity-based link prediction, (b) link prediction framed as a binary classification problem. Lastly, promising new links are suggested by applying the analysis framework.

By explicitly targeting EV technologies, our framework generates industry-specific predictive insights. The integration of unsupervised embeddings and supervised machine learning helps to filter out less relevant connections, while the machine learning-based classification provides a robust validation of these predictions. The contribution of this study lies in offering a robust methodology and in providing strategic foresight for optimizing R&D investments and policy decisions in the EV market. Hence, we expect that the proposed dual-level approach will help R&D managers to streamline technology intelligence processes by providing actionable insights and advanced tools for making more informed decisions regarding which technologies to develop or invest in. The findings can be used to prioritize new areas of technology convergence, ultimately stimulating the progress of EV technology.

The remainder of this paper is organized as follows. Section 2 describes the data collection, the analytical framework, and the applied methods in detail. In Section 3, we present the empirical findings by highlighting predicted areas of technology convergence. Section 4 discusses the main findings and what they might indicate for the future development of EVs. In Section 5, we present the conclusions and outlook for future research.

2. Data and Methods

2.1. Data

The analyzed patent data were extracted from the United States Patent and Trademark Office (USPTO) via WINTELIPS, a comprehensive commercial patent search and analysis platform. Various studies have utilized WINTELIPS for patent analysis across different fields [18,19]. Following prior research, we used a combination of keyword search and patent classification codes to obtain relevant patent data [13,20,21]. Both positive and negative keywords were defined, with positive keywords designated for inclusion in patent titles, abstracts, and claims, while negative keywords were specified for exclusion. For example, terms like “bike” or “bicycle” were used to filter out unrelated patent information. To focus specifically on EV-related patents, we restricted the search to selected International Patent Classification (IPC) codes.

Ultimately, the following search query was used for data retrieval: (((“electric” ADJ2 (car OR vehicle OR automobile OR mobility)) NOT (bike OR bicycle OR motorcycle OR “fuel cell” OR hybrid)).TI. OR ((“electric” ADJ2 (car OR vehicle OR automobile OR mobility)) NOT (bike OR bicycle OR motorcycle OR “fuel cell” OR hybrid)).AB. OR ((“electric” ADJ2 (car OR vehicle OR automobile OR mobility)) NOT (bike OR bicycle OR motorcycle OR hybrid)).CLA.) AND (B60L-003* OR B60L-011* OR B60L-013* OR B60L-015* OR B60K-001* OR B60L-050* OR B60W-010/08 OR B60W-010/24 OR B60W-010/26).IPC. AND (@AD >=20100101 <= 20241231). (Note: TI = Title, AB = Abstract, CLA = Claims, IPC = International Patent Classification, AD = Application Date).

The search period was limited from 2010 to 2024 to focus on the most recent developments. Data retrieval took place in January 2025, resulting in the identification of 7343 unique patent documents, including both registered patents and published patent applications. These documents were exported for further data processing.

2.2. Analysis Framework

In this study, we followed the six-step analysis outlined in Figure 1 to identify new technology convergence opportunities using link prediction. In the first step, patent data are collected and preprocessed for subsequent analysis. The preprocessing involves handling missing values and standardizing data formats. The second step delivers descriptive statistics of the analyzed patent data, including the distribution of patent applications, key technological domains involved, and key patent assignees. The third step encompasses the construction of patent co-classification networks using CPC codes, which serve as input for the node2vec algorithm. CPC, which provides a more extensive classification scheme than IPC, allows a finer-grained representation of technological domains [22]. The fourth step involves the transformation of a network into graph embeddings via node2vec. This process generates low-dimensional vector representations of each node, effectively capturing the complex relationships and structural features within the network. The fifth step consists of two inter-linked subprocesses. The first subprocess involves similarity-based link prediction, identifying potential new connections by selecting links with high similarity scores that do not exist in the original network. The second subprocess treats link prediction as a binary classification problem, using a machine learning algorithm to predict whether a link will form between two unconnected nodes. The prediction results from both subprocesses are aggregated to identify potential new connections. Finally, in the sixth step, potential technology convergence opportunities are identified from the aggregated prediction results. The final prediction results represent the intersection of both subprocesses. In summary, the described analysis framework enables effective identification of technology convergence opportunities by integrating similarity-based and classification-based link prediction methods. The dual-level approach enhances prediction reliability, leveraging the strengths of both methods. Data analysis was conducted using Python 3.10.16. Next, we provide a description of the theoretical aspects of the final five steps.

2.3. Descriptive Statistics

Descriptive statistics deal with summarizing the basic features of the patent dataset, including the distribution of patent applications over time, key technological domains involved, and key patent assignees. These statistics provide essential context for understanding the scope and relevance of our analysis. They lay the groundwork for the subsequent network construction.

2.4. Patent Co-Classification Network

This study adopted the concept of knowledge flow for constructing the co-classification networks. Knowledge flows in patent analysis refer to the transfer or exchange of technological knowledge between distinct entities, such as technological domains. These are inferred from patterns in how patent documents are classified under multiple classification codes [23]. Co-classification networks were constructed based on CPC code co-occurrences, reflecting knowledge flows across technological domains. Analyzing their relationships can provide important clues to how knowledge disseminates across different technological fields. Previous research indicated that merging various technological disciplines can be seen as recombinant innovation, which initiates technological transitions [24]. Moreover, technologically novel patents tend to include combinations of IPC codes that were not previously linked, fostering the development of new technological pathways [25]. Uniquely blending existing knowledge can potentially achieve higher levels of performance, as the current technological knowledge base influences the ways in which technologies combine and diffuse. CPC codes have a hierarchical structure that allows for detailed categorization of technologies. We relied on the group-level CPC codes to create a co-classification network, where nodes denote these codes and links indicate the strength of their interactions. The strength of these interactions is measured via the frequency of co-occurring CPC codes within the same patent documents.
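The network construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy patent records, the assumption that full CPC symbols look like `B60L-0050/60`, and the helper names `group_level` and `build_cooccurrence` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def group_level(cpc_code: str) -> str:
    """Truncate a full CPC symbol (assumed format 'B60L-0050/60') to its group-level part."""
    return cpc_code.split("/")[0]


def build_cooccurrence(patents):
    """Count, for every pair of group-level codes, the number of patents listing both."""
    weights = Counter()
    for codes in patents:
        # Deduplicate within a patent so each document contributes at most 1 per pair.
        groups = sorted({group_level(code) for code in codes})
        for a, b in combinations(groups, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical toy records: one list of CPC symbols per patent document.
patents = [
    ["B60L-0050/60", "Y02T-0010/70", "H01M-0010/625"],
    ["B60L-0050/64", "Y02T-0010/70"],
    ["B60L-0053/14", "Y02T-0010/7072", "Y02T-0090/14"],
]

edges = build_cooccurrence(patents)
for (a, b), w in sorted(edges.items()):
    print(a, b, w)
```

Each key of `edges` is an undirected link between two group-level codes, and its value is the edge weight, that is, the number of patents in which the two codes co-occur.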

2.5. Node2vec

Node2vec is a popular algorithm for generating low-dimensional, continuous vector representations (also called embeddings) of nodes in a network graph. First introduced in 2016 [26], node2vec extends the idea of the word2vec model from natural language processing (NLP) to graph-structured data. The algorithm maps each node v to a d-dimensional vector f(v) such that similar nodes are embedded closer together in the vector space. Treating graph nodes as “words” and their neighborhoods as “contexts”, node2vec learns meaningful embeddings through a process analogous to training word embeddings in NLP. Node2vec captures structural similarities in a data-driven manner, making it highly adaptable to various network types. Moreover, it can effectively preserve both local and global structural properties of the network.

Node2vec generates node sequences through biased random walks governed by two hyperparameters, p and q. The return parameter p controls the likelihood of immediately revisiting the previous node in the walk, while the in-out parameter q balances the exploration between nodes near the starting node and more distant nodes. Consequently, a higher p value reduces the tendency to step back to the previous node, promoting broader exploration. A higher q value encourages the walk to remain within the local neighborhood of the starting node, preserving local structural characteristics. This flexibility in random walk strategies allows node2vec to generate diverse sequences of nodes that reflect different aspects of the graph’s structure. By tuning p and q, node2vec can generate random walks that favor local structures, global structures, or a combination of both. Once the node sequences are generated through biased random walks, node2vec employs the Skip-Gram model to learn meaningful embeddings. The Skip-Gram model, which is a neural network architecture designed to predict the context of a given word, is adapted in node2vec to work with graph structures. In this context, it aims to maximize the probability of observing neighboring nodes given a particular node. In this study, we used the resulting embeddings for the link prediction task, which investigates the likelihood of forming connections between previously unconnected nodes based on both their local interactions and global roles.
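To make the role of p and q concrete, the following sketch implements a single biased random walk on an unweighted graph using the standard node2vec transition weights: 1/p for stepping back to the previous node, 1 for moving to a common neighbor of the previous and current node, and 1/q for moving further away. The toy graph, the function name `biased_walk`, and the parameter values are illustrative assumptions, and the Skip-Gram training step is omitted.

```python
import random


def biased_walk(adj, start, length, p, q, rng):
    """Generate one second-order (node2vec-style) random walk on an unweighted graph.

    adj maps each node to the set of its neighbors.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = sorted(adj[cur])
        if not neighbours:
            break  # dead end: no further step possible
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbours:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from the previous node
            else:
                weights.append(1.0 / q)  # move outward, away from the previous node
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}

rng = random.Random(0)
walk = biased_walk(adj, "A", length=10, p=4.0, q=0.5, rng=rng)
print(walk)
```

With p = 4.0 the walk rarely backtracks, and with q = 0.5 outward moves are twice as likely as moves to a common neighbor, so the walk tends to drift away from its starting region. In a full pipeline, many such walks per node would be passed to a Skip-Gram model as "sentences".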

2.6. Link Prediction

Link prediction, which describes the task of identifying potential connections between previously unconnected nodes within a network, has become an increasingly important area of research in various disciplines, ranging from social networks to biological systems [27,28]. In the domain of patent analysis, link prediction has been employed for various purposes, including but not limited to predicting the technological convergence patterns [29], comprehending new word combinations [30] and exploring partner selection [31]. The application of link prediction in patent analysis has provided valuable insights into emerging technological trends and potential collaboration opportunities, enabling researchers and R&D specialists to make informed decisions.

This study proposes a dual-level link prediction framework, which combines prediction results from two different methodological approaches to arrive at more reliable predictions. The first approach calculates cosine similarity scores between pairs of nodes to evaluate their potential to form a new link. Node embeddings generated by the node2vec algorithm are directly used to compute cosine similarity scores between pairs of unconnected nodes. Links with high similarity scores but no existing edges in the original network were identified as potential new connections. A threshold value for similarity can be set to filter out less relevant predictions. To uncover promising areas for future convergence, only CPC combinations whose first three characters differ (CPC class level) are considered.
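The similarity-based level can be illustrated with a short sketch. It assumes that node embeddings are already available as plain lists of floats; the three-dimensional toy vectors, the threshold of 0.8, and the function names are hypothetical and chosen only for readability.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def candidate_links(embeddings, existing_edges, threshold):
    """Return unconnected, cross-class code pairs whose similarity reaches the threshold."""
    results = []
    for a, b in combinations(sorted(embeddings), 2):
        if (a, b) in existing_edges or (b, a) in existing_edges:
            continue  # already linked in the original network
        if a[:3] == b[:3]:
            continue  # same CPC class (first three characters), not a convergence case
        score = cosine(embeddings[a], embeddings[b])
        if score >= threshold:
            results.append((a, b, score))
    return sorted(results, key=lambda item: -item[2])


# Hypothetical toy embeddings (real node2vec vectors would have many more dimensions).
embeddings = {
    "B60L-0050": [1.0, 0.0, 0.0],
    "B60K-0001": [1.0, 0.0, 0.0],
    "H01M-0010": [0.9, 0.1, 0.0],
    "H02J-0007": [0.0, 1.0, 0.0],
}
existing_edges = {("B60L-0050", "H02J-0007")}

candidates = candidate_links(embeddings, existing_edges, threshold=0.8)
for a, b, score in candidates:
    print(a, b, round(score, 3))
```

In this toy example, the pair B60K-0001 and B60L-0050 is skipped despite identical vectors because both codes share the class prefix B60, which mirrors the cross-class restriction described above.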

The second approach treats link prediction as a binary classification problem. In the binary classification approach, positive links (class ‘1’) are defined as existing edges in the co-classification network, representing CPC code pairs that co-occur in at least one patent document, indicating a known technological relationship. Negative links (class ‘0’) are defined as pairs of CPC codes with no co-occurrence in the dataset, i.e., unconnected nodes. For the binary classification-based link prediction, the input features consist of edge embeddings derived from the node2vec-generated node embeddings. Specifically, for each pair of CPC codes, the node embeddings of the two nodes are combined using the Hadamard product to produce a 100-dimensional edge embedding [32]. The Hadamard product combines the embeddings of two nodes into a single vector representation via element-wise multiplication. This edge embedding serves as the sole input feature vector for training the machine learning models, encoding the structural and relational properties of the patent co-classification network. In total, four different machine learning algorithms are employed and compared. These algorithms are trained on the derived edge embeddings to identify and predict potential links within the network. Through this comparison, we selected the most suitable model based on performance metrics, such as the F1-score. The employed evaluation metrics are calculated as follows:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(1)

\text{Precision} = \frac{TP}{TP + FP}

(2)

\text{Recall} = \frac{TP}{TP + FN}

(3)

\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}

(4)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
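As a worked example of Equations (1) to (4), the short function below computes all four metrics from confusion-matrix counts. The counts used here are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 40 true positives, 45 true negatives, 5 false positives, 10 false negatives.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these counts, accuracy is 85/100 = 0.85, precision is 40/45, recall is 40/50 = 0.80, and the F1-score is 80/95, which is about 0.842.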

To evaluate the performance of our approach, we split the patent data into training and test sets. The training set was used to learn the node embeddings and train the model, while the test set was used to assess link prediction performance. Our training and evaluation methodology includes (1) a time-based split, (2) progressive validation and model update, and (3) simulation.

First, we trained the model using data from 2015 to 2020 and validated it on data from 2021 to 2023. This required chronological splitting of the dataset. Second, we assessed the model’s performance individually for each year from 2021 to 2023. We incrementally updated the model using data from each year and validated it on the subsequent year’s data. Third, we reserved data from 2024, and the best-performing model was selected for predicting future technological convergences.
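One way to read the time-based split and progressive validation is as an expanding training window, sketched below. This interpretation is an assumption: the text does not specify whether earlier years are retained when the model is updated, and the function name `progressive_splits` and the toy records are hypothetical.

```python
def progressive_splits(first_train_year, last_train_year, validation_years):
    """Yield (training_years, validation_year) pairs with an expanding training window."""
    train_years = list(range(first_train_year, last_train_year + 1))
    for year in validation_years:
        yield list(train_years), year
        train_years.append(year)  # assumption: the validated year is added before the next round


def select_by_year(records, years):
    """Keep only the records whose application year falls in the given years."""
    wanted = set(years)
    return [record for record in records if record["year"] in wanted]


splits = list(progressive_splits(2015, 2020, [2021, 2022, 2023]))
for train_years, validation_year in splits:
    print(f"train {train_years[0]}-{train_years[-1]} -> validate {validation_year}")

# Hypothetical toy records with an application year each.
records = [{"id": 1, "year": 2016}, {"id": 2, "year": 2021}, {"id": 3, "year": 2024}]
initial_training = select_by_year(records, splits[0][0])
```

Under this reading, the 2024 data never enter any training or validation round and remain available for the final simulation.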

Both training and test sets contain positive and negative edges. Positive edges are the graph’s actual edges, while negative edges can either include all non-existing edges or be randomly sampled from non-existing edges. To handle class imbalance and consider structural information between nodes, negative edges are sampled using distance-based methods instead of random selection. This distance-based sampling approach ensures that the model encounters more challenging negative examples, enabling it to better distinguish between likely and unlikely connections. In the case of random sampling, it is more likely that the model learns less meaningful patterns, as it predominantly encounters trivial negatives that fail to capture the complex structural relationships present in the network [33]. Figure 2 illustrates the conceptual process of transforming a network into node embeddings, which are then used to generate edge embeddings. These edge embeddings serve as input for a binary classification model that predicts links between previously unconnected CPC codes. The heatmap visualizes similarity scores derived from these embeddings, indicating how closely connected the different node pairs are. After training, the binary classifier predicts potential new connections. The final network visualization highlights these newly inferred connections using red dashed edges, illustrating how previously unlinked CPC codes are now connected based on learned representations.
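The sketch below illustrates one possible form of distance-based negative sampling together with the Hadamard edge embedding. The exact sampling rule is not specified in the text, so the choice made here (drawing non-edges whose shortest-path distance equals a fixed value, such as 2) is an assumption, as are the toy graph and all function names.

```python
import random
from collections import deque


def bfs_distances(adj, source, max_depth):
    """Shortest-path distances from source, explored up to max_depth hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def sample_negatives(adj, n_samples, distance, rng):
    """Sample unconnected pairs lying exactly `distance` hops apart (assumed rule, distance >= 2)."""
    pool = set()
    for u in adj:
        for v, d in bfs_distances(adj, u, distance).items():
            if d == distance:
                pool.add(tuple(sorted((u, v))))
    pool = sorted(pool)
    return rng.sample(pool, min(n_samples, len(pool)))


def hadamard(u, v):
    """Element-wise product of two node embeddings, used as the edge embedding."""
    return [a * b for a, b in zip(u, v)]


# Hypothetical toy graph: a simple path A - B - C - D.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
negatives = sample_negatives(adj, n_samples=10, distance=2, rng=random.Random(0))
print(sorted(negatives))
print(hadamard([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Pairs that are two hops apart share at least one neighbor and are therefore harder negatives than randomly chosen distant pairs, which is the motivation given above for avoiding purely random sampling.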

2.7. Uncovering Technology Convergence Opportunities

In this analysis step, the prediction results from both approaches are combined to enhance robustness in identifying potential technology convergence opportunities. This integrated approach leverages both the structural information captured in the embeddings and the power of machine learning classification to uncover latent relationships and predict future connections within the technological landscape. By merging these two methodologies, we achieve a more comprehensive understanding of how different technologies may converge, enabling the identification of emerging trends. This holistic approach offers flexibility in understanding the dynamics of technological innovation, as it allows for the adjustment of threshold values. Moreover, our framework can be tailored to specific research needs and can accommodate the integration of methodological variations.
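The combination step can be sketched as a set intersection with adjustable thresholds. The scores, probabilities, threshold values, and the function name `dual_level_candidates` below are hypothetical and serve only to show the mechanics.

```python
def dual_level_candidates(similarity_scores, classifier_probs, sim_threshold, prob_threshold):
    """Keep only the pairs flagged by both the similarity level and the classifier level."""
    by_similarity = {pair for pair, score in similarity_scores.items() if score >= sim_threshold}
    by_classifier = {pair for pair, prob in classifier_probs.items() if prob >= prob_threshold}
    return sorted(by_similarity & by_classifier)


# Hypothetical outputs of the two prediction levels for three unconnected code pairs.
similarity_scores = {
    ("B60L-0050", "H01M-0010"): 0.91,
    ("B60K-0001", "H05K-0007"): 0.62,
    ("B60L-0053", "G06Q-0050"): 0.88,
}
classifier_probs = {
    ("B60L-0050", "H01M-0010"): 0.95,
    ("B60K-0001", "H05K-0007"): 0.97,
    ("B60L-0053", "G06Q-0050"): 0.40,
}

final_links = dual_level_candidates(
    similarity_scores, classifier_probs, sim_threshold=0.8, prob_threshold=0.5
)
print(final_links)
```

Raising either threshold shrinks the final candidate set, which is how the adjustable threshold values mentioned above trade coverage for reliability.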

3. Results

3.1. Descriptive Statistics

In this section, the analyzed patent data are examined to highlight the overall trends and distinctive features that shape the EV patent landscape. Specifically, the analysis traces the evolution of patent filing trends over time, examines the most frequently occurring CPC codes, and evaluates the distribution of patent assignees. This approach not only provides a historical perspective on the growth of EV innovations but also offers insights into the key technological areas and the distribution of assignees shaping the development of the field. Figure 3 illustrates the trend in patent applications from 2010 to 2024, with a total of 7343 patent documents examined during this period.

Overall, the number of patent applications increased steadily until 2018. The sharp decline in 2019 could be related to COVID-19 and the associated global economic slowdown. However, following the recovery from the pandemic, the number of patent applications resumed its gradual upward trend. The apparent decrease in 2023 and 2024, on the other hand, can likely be interpreted as an artifact, since applications from these years that have not yet been published are missing from the counts.

Figure 4 delineates the top 20 most frequently occurring CPC codes, providing insight into the dominant technological domains. Given the hierarchical structure and inherent complexity of the CPC scheme, we opted to utilize group-level CPC categories to emphasize the key technology areas. (Note: Interested readers can look up the explanations of CPC codes at the following website: https://www.uspto.gov/web/patents/classification/cpc/html/cpc.html, accessed on 2 February 2025). A total of 1453 unique CPC codes were identified, each contributing at different rates to the depiction of the EV patent landscape. Since most patent documents are classified under more than one CPC code, the overall sum of these codes exceeds the actual number of patents analyzed. The most frequently occurring codes were “Y02T-0010”, followed by “B60L-0050”, “Y02T-0090”, “B60L-0053” and “B60L-2240”. These codes belong to subclass Y02T, which stands for “Climate change mitigation technologies related to transportation”, and subclass B60L, which pertains to “Propulsion of electrically propelled vehicles”. As the EV domain is driven by advancements in sustainable transportation and innovative electric propulsion systems, it is natural that a significant portion of the associated technology areas is centered around technologies that reduce the environmental impact of transportation and enhance energy efficiency. In particular, Y02T codes capture solutions aimed at minimizing the environmental impact of transportation, while B60L codes reflect ongoing progress in electric drive systems and associated power management. In a similar vein, Y02E (“Reduction in greenhouse gas emissions”) encompasses a range of technologies focused on lowering carbon footprints across transportation systems. Moreover, H02J codes address innovations in power conversion and inverter technologies essential for effective energy management in EV drivetrains, while H01M codes pertain to advancements in battery systems and energy storage solutions, which are vital for extending range and ensuring reliable power delivery. Together, these classifications offer a comprehensive view of the strategic technological priorities driving the evolution of the EV sector.

Table 1 lists the top 20 patent assignees in the field of EVs. As expected, most of the top assignees are well-established automotive manufacturers and technology companies actively involved in electric vehicle innovation. Ford Global Tech, which specializes in intellectual property management, leads the ranking with 632 patents, followed by Toyota Motor Corp, Honda Motor, Hyundai Motor and Kia, and GM Global Technology Operations. These companies have been both traditional leaders in the automotive industry and key drivers of EV development. Since Hyundai Motor and Kia are both under Hyundai Motor Group, they frequently collaborate and file patents together to achieve technological synergy, cost efficiency, and competitive advantage in the rapidly evolving EV market. For example, joint patents ensure unified technological standards across the organizations, fostering interoperability and streamlining innovation of production lines. Beyond traditional automobile manufacturers, the ranking includes several technology and electronics companies, such as Murata Manufacturing Co. and Qualcomm. Their presence underscores the increasing integration of advanced battery management systems, power electronics, and wireless communication technologies in EVs. It signals the convergent nature of the EV patent landscape, as the inclusion of semiconductor and electronics manufacturers highlights the critical role of digital technologies in modern EVs. Emerging players like Thunder Power New Energy Vehicle Development Company and NIO USA Inc. demonstrate the rising influence of EV startups in pushing next-generation EV advancements, including battery innovation and autonomous mobility solutions.

Interestingly, when we analyze the assignee’s country of origin from Table 1, most of the top patent holders are based in Japan, the United States, South Korea, Germany and China. Table 2 shows the global distribution of patents by frequency for the top 10 countries. This distribution reflects the dominance of these countries in the global EV industry. Rather than each country specializing in a single technology domain, the patent landscape reveals significant overlap, with companies across different regions actively developing electric powertrain systems, battery management technologies, and smart mobility solutions. This convergence underscores the highly competitive and collaborative nature of EV innovation.

3.2. Co-Occurrence Network

Based on the patent co-classification relationships, we constructed six distinct co-classification networks, which serve as input for the node2vec algorithm in the subsequent analysis step. The first network spans the entire analysis period and is used to generate embeddings for similarity-based link prediction. The resulting network consists of 1452 unique nodes, each representing a specialized technology domain, and 35,962 edges. The second network, incorporating patent data filed between 2015 and 2020, was used to train machine learning models and includes 1155 nodes. The third through fifth networks represent the co-classification network for the years 2021, 2022 and 2023, respectively. These three networks are used to evaluate the models’ predictive performance in a progressive validation process, testing their ability to identify links in each subsequent year. Notably, the number of unique nodes decreased gradually from 501 in 2021 to 489 in 2022 and 416 in 2023. The sixth network was constructed using the patent data from 2024, comprising 312 unique nodes and 3843 edges. Overall, the generated networks span different periods, with varying node counts each year. The decrease in node numbers is influenced by the number of patent applications present in the respective year.

Figure 5 visualizes the resulting network for the entire analysis period, providing a static snapshot of interconnected technology domains. To enhance the clarity of the visualization, only the top 10% of nodes are displayed. The layout was generated using the Fruchterman–Reingold algorithm, which brings highly connected nodes closer to the center. A greater node size represents a higher degree centrality, indicating the node’s importance within the network. Accordingly, Y02T-0010 (Road transport of goods or passengers) and B60L-0050 (Electric propulsion with power supplied within the vehicle) have prominent positions due to their high centrality. These prominent nodes align with the most frequently occurring CPC codes. The average number of edges per node is approximately 49.53, while the overall density is 0.034. This points to a specific structural characteristic of the network, which is sparse but unevenly connected, with some nodes having significantly more connections than others. This resembles a hub-and-spoke structure, in which hubs act as key intermediaries for connectivity within the network.
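The two summary statistics follow directly from the node and edge counts reported in Section 3.2, assuming an undirected network without self-loops. The short check below reproduces them; the variable names are illustrative only.

```python
# Node and edge counts of the full-period network as reported in Section 3.2.
n_nodes = 1452
n_edges = 35_962

# In an undirected graph every edge contributes to the degree of two nodes.
avg_degree = 2 * n_edges / n_nodes

# Density: observed edges divided by the number of possible node pairs.
density = 2 * n_edges / (n_nodes * (n_nodes - 1))

print(f"{avg_degree:.2f} {density:.3f}")  # prints "49.53 0.034"
```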

Table 3 summarizes the top 10 most frequently interconnected node pairs, with co-occurrence measured as the absolute number of times a pair of nodes is connected. The highest number of co-occurrences was found between Y02T-0010 and Y02T-0090. Given the context of Y02T, which relates to climate change mitigation technologies in transportation, this co-occurrence indicates a trend in the transportation sector toward increasingly integrative solutions, combining direct emission reduction strategies with systemic improvements to maximize environmental benefit. Moreover, most of the interactions involve Y02T-0010, which is classified as “Road transport of goods or passengers”, highlighting its central role in innovations that focus on reducing emissions and enhancing sustainability in road transportation.

3.3. Node2vec Embedding

To obtain node2vec embeddings, it is necessary to set the conditions for the biased random walk. The key parameters include “number of walks”, “walk length”, “embedding dimension”, “p (return parameter)” and “q (in-out parameter)” [26]. These settings influence the balance between capturing local and global network features, and they ultimately affect the quality of the embeddings for downstream tasks.

Number of walks: The number of random walks initiated from each node. Increasing this number provides a more robust sampling of the network structure, but at a higher computational cost.
Walk length: The number of steps in each random walk. Longer walks capture more of the global network structure, whereas shorter walks focus on local connectivity.
Embedding dimension: The size of the vector representing each node. A higher dimension may capture more detailed features but also increases the risk of overfitting.
Return parameter (p): It controls the likelihood of immediately revisiting a node during a walk. Lower values of p increase the likelihood of returning to the previous node, thereby favoring a more localized exploration of the network.
In-out parameter (q): It balances the search between breadth-first and depth-first strategies. A lower value encourages outward exploration (depth-first), while a higher value biases the walk toward the local neighborhood (breadth-first).
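How these parameters interact can be illustrated with a small pure-Python sketch of the biased random walk on an unweighted, undirected graph. It is an assumption-laden illustration, not the implementation used in this study: the toy graph, the function name `biased_walk`, and the small values of `num_walks` and `walk_length` are hypothetical, and the walk length is counted in visited nodes. The embedding dimension does not appear here because it only enters when the walks are passed to the skip-gram model.

```python
import random

# Hypothetical toy graph as adjacency sets (undirected, unweighted).
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}


def biased_walk(adj, start, walk_length, p, q, rng):
    """One second-order random walk that visits walk_length nodes."""
    walk = [start]
    while len(walk) < walk_length:
        current = walk[-1]
        neighbors = sorted(adj[current])
        if not neighbors:
            break  # dead end: stop early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # step back to the previous node
            elif candidate in adj[previous]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


rng = random.Random(42)
num_walks, walk_length = 3, 6  # toy values for illustration only
walks = [
    biased_walk(adj, node, walk_length, p=1.0, q=1.0, rng=rng)
    for node in sorted(adj)
    for _ in range(num_walks)
]
print(len(walks), walks[0])
```

The resulting walks play the role of sentences for a skip-gram model, which then learns one vector per node.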

In this study, we based our settings on prior research findings and set the embedding dimension to 100. According to the original node2vec paper [26], performance tends to saturate once the representation dimensionality reaches around 100. Lombardo and Poggi [34] investigated the relationship among “walk length”, “number of walks” and “computation time”, demonstrating a linear increase in processing time as these parameters increased from 1 to 30. Their experiments fixed p = q = 1 to ensure uniform exploration strategies (balancing depth-first and breadth-first approaches). Moreover, Peng et al. [35] found that prediction performance was not significantly affected by varying p and q values. This collective evidence indicates that, depending on the research objectives and priorities, certain parameter settings may be more appropriate. For example, one may choose to optimize parameters such as walk length and number of walks while keeping p and q at their default values. We chose a neutral setting with no bias toward either exploration strategy; therefore, we adopted p = q = 1. To limit the computational cost, the walk length was set to 20 and the number of walks to 200.
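Why p = q = 1 is a neutral choice can be seen from the search bias defined in the original node2vec paper [26]. For a walk that has just moved from node t to node v, the unnormalized probability of stepping to a neighbor x of v is the edge weight multiplied by a bias term that depends on the shortest-path distance d_tx between t and x:

```latex
\alpha_{pq}(t, x) =
\begin{cases}
  1/p & \text{if } d_{tx} = 0, \\
  1   & \text{if } d_{tx} = 1, \\
  1/q & \text{if } d_{tx} = 2,
\end{cases}
\qquad
p = q = 1 \;\Rightarrow\; \alpha_{pq}(t, x) = 1 \quad \text{for all } x .
```

With both parameters equal to 1 the bias term vanishes, so the walk reduces to an ordinary random walk driven only by the edge weights.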

3.4. Similarity-Based Link Prediction

The first network was used to perform the similarity-based link prediction. For similarity calculation, we only considered pairs of nodes that are not directly connected and exceeded a certain similarity threshold. Typically, setting a higher threshold yields a smaller set of candidate pairs with greater confidence, while a lower threshold includes a broader range of potential relationships, allowing for more exploratory analyses. Based on these considerations, we set the threshold at 0.75, which resulted in the proposal of 54 unconnected edges. Table 4 shows an exemplary selection of these candidate edges, along with their corresponding similarity scores. Notably, the highest similarity score was observed between H01L-0029 and H10D-0064, warranting further investigation. The CPC scheme has undergone multiple revisions since its introduction in 2013, reflecting technical advancements and evolving patent landscapes. While H10D-0064 pertains to “Electrodes of devices having potential barriers”, H01L-0029 is not searchable within the current revised version. Its definition can be found in the original version as “Semiconductor devices adapted for rectifying, amplifying, oscillating or switching, or capacitors or resistors with at least one potential-jump barrier or surface barrier”. The observed similarity between these classifications may stem from overlapping functional or structural attributes inherent in semiconductor device design. In the context of electric vehicles, this convergence is particularly useful, as modern EV powertrains rely on high-performance semiconductor devices for efficient energy conversion and motor control. By integrating advanced electrode designs with optimized semiconductor components, manufacturers can achieve improved thermal management, reduced energy losses, and enhanced durability of power electronics.
B60R-0017 pertains to “Arrangements or adaptations of lubricating systems or devices”, focusing on vehicle-specific lubrication mechanisms. F16N-2200, representing “Condition of lubricant”, encompasses broader technical details for monitoring lubricant properties (e.g., oxidation, viscosity, contamination) across industrial applications. The convergence between these two codes might involve designing adaptive lubrication systems for electric vehicles, capable of assessing lubricant condition in real time. EVs still require lubrication for components like reduction gearboxes and adaptive lubrication mechanisms enable proactive maintenance strategies, improving performance and longevity in EV applications. The identified candidate node pairs will be merged with the results from classification-based link prediction to highlight the potential areas of technology convergence.
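The candidate selection described in this section can be sketched as follows, assuming that similarity is measured as the cosine similarity between node embeddings. The three-dimensional vectors, the edge list, and the function names are hypothetical and chosen only for illustration; they do not reproduce the embeddings or the scores in Table 4.

```python
import math
from itertools import combinations

# Hypothetical 3-d embeddings; the study uses 100-d node2vec vectors.
embeddings = {
    "H01L-0029": [0.9, 0.1, 0.2],
    "H10D-0064": [0.8, 0.2, 0.1],
    "B60R-0017": [0.1, 0.9, 0.3],
    "F16N-2200": [0.2, 0.8, 0.4],
    "Y02T-0010": [0.5, 0.5, 0.5],
}
# Hypothetical existing edges: the hub is already linked to every other node.
edges = [("Y02T-0010", code) for code in embeddings if code != "Y02T-0010"]


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def candidate_links(embeddings, edges, threshold=0.75):
    """Unconnected node pairs whose similarity exceeds the threshold."""
    existing = {frozenset(edge) for edge in edges}
    scored = []
    for a, b in combinations(sorted(embeddings), 2):
        if frozenset((a, b)) in existing:
            continue  # already connected, so not a prediction target
        score = cosine(embeddings[a], embeddings[b])
        if score > threshold:
            scored.append((a, b, score))
    return sorted(scored, key=lambda item: item[2], reverse=True)


candidates = candidate_links(embeddings, edges)
for a, b, score in candidates:
    print(a, b, round(score, 3))
```

Raising the threshold shortens the candidate list, while lowering it admits more exploratory pairs, which mirrors the trade-off discussed above.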

3.5. Binary Classification-Based Link Prediction

The second network was used to train and compare different machine learning models. Edge embeddings are used as inputs for training the classifier. To generate negative edges for training, we opted for the distance-based sampling approach described in Section 2.6. In any sparse graph, the negative edges vastly outnumber the positive edges, leading to a severe imbalance that can overwhelm the training process if all available negatives are used. To address this issue, a distance-based method is employed to prioritize negative samples that are closer to the positive examples, making them more challenging and thus enhancing the model’s ability to learn finer distinctions and achieve more robust performance. By limiting the number of negatives, we not only mitigate the imbalance but also reduce the computational burden during training, which improves the model’s ability to generalize and accurately predict links. Table 5 summarizes the number of training and test samples, along with the distribution of positive and negative edges for the binary classification task. The training set is used for initial model training, while the test sets (2021–2023) are used for progressive validation, again utilizing the distance-based approach to ensure a balanced dataset.
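One possible reading of this preparation step is sketched below. Two choices are assumptions made for illustration and are not confirmed by the text: edge embeddings are formed with the Hadamard (element-wise) product, one of the binary operators discussed in the node2vec paper [26], and "close" negatives are taken to be unconnected pairs that lie exactly two hops apart. The toy graph, the embeddings, and all names are hypothetical.

```python
import random
from collections import deque

# Hypothetical toy training network (adjacency sets) and 2-d node embeddings.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}
emb = {"A": [0.9, 0.1], "B": [0.8, 0.3], "C": [0.7, 0.2],
       "D": [0.3, 0.8], "E": [0.1, 0.9]}


def hadamard(u, v):
    """Edge embedding as the element-wise product of the two node vectors."""
    return [a * b for a, b in zip(u, v)]


def hops_from(source, max_hops):
    """Breadth-first search returning hop distances up to max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Positive samples: every existing edge, stored once as a sorted pair.
positives = sorted({tuple(sorted((a, b))) for a in adj for b in adj[a]})

# Candidate negatives: unconnected pairs that are still close (2..max_hops hops).
max_hops = 2
near_misses = sorted({
    tuple(sorted((a, b)))
    for a in adj
    for b, d in hops_from(a, max_hops).items()
    if d >= 2
})

# Draw at most as many negatives as positives to keep the classes balanced.
rng = random.Random(0)
negatives = rng.sample(near_misses, min(len(positives), len(near_misses)))

X = [hadamard(emb[a], emb[b]) for a, b in positives + negatives]
y = [1] * len(positives) + [0] * len(negatives)
print(len(positives), len(negatives))
```

In this toy graph only three near-miss pairs exist, so the negative class stays smaller than the positive class; on a network of realistic size the pool is large enough to draw one negative per positive.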

Following the procedure outlined in Section 2.6, four different machine learning algorithms were employed to evaluate the effectiveness of our approach: Logistic Regression (LR), Random Forest (RF), eXtreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LGBM). Logistic Regression serves as a baseline due to its simplicity and helps identify linear relationships in the data. Random Forest, an ensemble method, is known for its robustness and ability to handle high-dimensional data. XGB and LGBM are advanced boosting algorithms that have demonstrated superior performance in various machine learning tasks. By comparing the performance of these diverse algorithms, we aim to obtain the most robust and accurate model for our specific prediction task. To ensure a fair comparison, we trained and evaluated each model using the same dataset and performance metrics.

Subsequently, models were trained and evaluated using data from 2015 to 2020, with performance validated progressively on co-classification networks for 2021, 2022, and 2023. This step-by-step validation approach allows us to capture emerging trends and shifts in the patent landscape over time. Following each validation phase, the models were updated to incorporate the latest insights. In this manner, we can ensure that our predictive framework remains dynamic and responsive to ongoing technological advancements in the field. Table 6 summarizes the model performance across these validation periods, using accuracy, precision, recall, and F1-score as evaluation metrics.
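The progressive validation loop and the four evaluation metrics can be sketched as follows. The sketch rests on two assumptions that the text does not spell out: updating a model after each phase is implemented as refitting on all data seen so far, and a simple nearest-centroid classifier stands in for LR, RF, XGB, and LGBM so that the example stays self-contained. The class `CentroidClassifier`, the function names, and the toy data are hypothetical.

```python
def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = link)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


class CentroidClassifier:
    """Hypothetical stand-in model: predicts the class with the nearest mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((i - j) ** 2 for i, j in zip(a, b))
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]


def progressive_validation(train, yearly_tests, make_model):
    """Test on each year in order, then fold that year into the training data."""
    X_seen, y_seen = list(train[0]), list(train[1])
    results = {}
    for year in sorted(yearly_tests):
        model = make_model().fit(X_seen, y_seen)
        X_test, y_test = yearly_tests[year]
        results[year] = scores(y_test, model.predict(X_test))
        # Assumed update rule: refit on everything observed so far.
        X_seen += X_test
        y_seen += y_test
    return results


# Hypothetical 2-d edge embeddings with labels (1 = link, 0 = no link).
train = ([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]], [1, 1, 0, 0])
yearly_tests = {
    2021: ([[0.7, 0.9], [0.3, 0.1], [0.6, 0.5]], [1, 0, 0]),
    2022: ([[0.9, 0.7], [0.2, 0.3]], [1, 0]),
    2023: ([[0.8, 0.8], [0.1, 0.1]], [1, 0]),
}
results = progressive_validation(train, yearly_tests, CentroidClassifier)
for year, metrics in results.items():
    print(year, {name: round(value, 3) for name, value in metrics.items()})
```

Any estimator that offers `fit` and `predict` methods could be passed as `make_model` in place of the stand-in.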

The results revealed that Random Forest achieved the highest overall F1-score (89.25%) in 2023 and demonstrated superior performance on average in most metrics, except recall. Logistic Regression performed the worst among all models, with the lowest accuracy, precision, recall, and F1-score on average. This underperformance can be attributed to its inherent limitation in capturing complex, non-linear relationships within the data. In the context of link prediction in co-classification networks, relationships between patent classifications are likely influenced by non-linear associations and hierarchical dependencies. Moreover, the relatively low recall of Logistic Regression indicates that it struggles to identify positive instances, potentially missing many relevant links. XGBoost and LightGBM also showed competitive performance, highlighting the advantages of boosting methods in link prediction. Most models showed slight improvement in their performance metrics from 2021 to 2023, particularly in recall scores, with XGBoost showing the most substantial improvement in this metric.

Overall, the consistent improvement in evaluation metrics across models over time implies that the training process effectively incorporated new information, enhancing the ability to identify relevant connections in the evolving patent landscape.

3.6. Final Prediction Results

In this analysis step, we combined prediction results from Section 3.4 and Section 3.5 to derive novel technology convergence opportunities. This hybrid modeling approach, which integrates similarity-based methods with machine learning-based prediction, provides a more robust framework for identifying potential technology convergence patterns. We selected the best-performing model from 2023 based on F1-score (“RandomForest”) and used it to predict new technology linkages in 2024. High-confidence candidate edges were synthesized by combining machine learning-driven predictions with similarity-based scores. By leveraging insights from previous years, the model was able to anticipate potential linkages that had not yet been observed in historical data. To ensure higher confidence in the prediction results, only connections with a probability score greater than 0.8 were considered. In total, 18 new links were predicted as relevant for the future. Table 7 summarizes these predicted links between non-connected nodes, along with the corresponding similarity scores and prediction probabilities. Figure 6 highlights the predicted links for the future interval, with green links indicating those interactions.
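The merging step can be illustrated with a short sketch, assuming that a candidate is retained only if it passes both the similarity threshold from Section 3.4 and the probability cut-off stated above. The function name `combine`, the placeholder node names, and all scores are hypothetical and do not reproduce Table 7.

```python
def combine(similarity, probability, sim_threshold=0.75, prob_threshold=0.8):
    """Keep pairs that pass both the similarity and the probability filter."""
    selected = []
    for pair, sim in similarity.items():
        prob = probability.get(pair)
        if prob is None:
            continue  # the classifier did not score this pair
        if sim > sim_threshold and prob > prob_threshold:
            selected.append((pair, sim, prob))
    # Rank by classifier probability, highest first.
    return sorted(selected, key=lambda row: row[2], reverse=True)


# Hypothetical scores for unconnected pairs (sorted tuples of node names).
similarity = {
    ("N1", "N2"): 0.91,
    ("N3", "N4"): 0.80,
    ("N5", "N6"): 0.78,
    ("N7", "N8"): 0.70,
}
probability = {
    ("N1", "N2"): 0.97,
    ("N3", "N4"): 0.85,
    ("N5", "N6"): 0.60,  # similar, but the classifier is not confident
    ("N7", "N8"): 0.95,  # confident, but below the similarity threshold
}

final_links = combine(similarity, probability)
for pair, sim, prob in final_links:
    print(pair, sim, prob)
```

Requiring agreement between both levels is what allows the dual-level design to filter out pairs that only one of the two methods supports.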

According to Table 7, several distinct combinatory technology convergence patterns emerged. For example, the combination of H01L-0029 and H10D-0064 showed the highest confidence. This points to a focus on semiconductor device engineering, which could result in more efficient inverters and converters, improving overall EV powertrain efficiency as well as battery management. The top 10 predictions all showed remarkably high prediction probabilities (>0.9), coupled with similarity scores above 0.75, suggesting strong potential for technology convergence. The convergence between B23K-0015 (Electron-beam welding or cutting) and F28F-2013 (Heat conductive materials) could lead to better thermal management. In particular, the combination of precise welding and advanced heat exchanger design can synergize to create improved battery pack designs. Moreover, the convergence between B32B-0019 (Layered products comprising natural mineral fibers or particles) and D10B-2505 (Industrial textiles) could lead to advanced composite materials that offer superior passive fire protection capabilities in EV batteries. In a similar vein, the convergence between B32B-0019 and D03D-0001 (Woven fabrics designed to make specified articles) could offer enhanced thermal protection systems for battery assemblies. Further promising convergence patterns include B82Y-0030 (Nanotechnology for materials or surface science) and C09D-0127 (Coating compositions based on homopolymers or copolymers of compounds). This combination could generate specialized composite materials that are critical for protecting battery modules, power electronics, and other sensitive EV components from extreme temperatures and environmental degradation. Similarly, the combination of G01J-2001 (Photometry, e.g., photographic exposure meter) and F25D-0017 (Arrangements for circulating cooling fluids) suggests advancements in thermal management and control systems for EVs through non-invasive monitoring techniques.
The integration of optical sensors in cooling systems could enable predictive maintenance and early detection of potential overheating issues, improving overall vehicle safety. Notably, subclass B32B (Layered products) appeared multiple times as a converging node, indicating its potential as a technological convergence hub. This subclass could play an integral role in innovating the energy efficiency, safety, and sustainability-related aspects of EVs. Table 8 synthesizes the key technology convergence opportunities from Table 7, grouping them by thematic areas and linking them to their potential impact on EV development.

4. Discussion

The prediction results align with the research objectives by identifying key areas of technology convergence, such as battery protection and thermal management. Hence, future technology convergence is most likely to occur at the intersection of nanotechnology, sensor systems, and advanced materials in the EV sector. From the perspective of technology management and R&D policy design, these convergence patterns reveal important strategic directions for innovation [36,37]. Understanding where technology convergence might emerge is crucial for guiding strategic investments, fostering innovation, and shaping research agendas that support the evolution of next-generation EV technologies. However, previous studies have focused on describing the existing technological landscape rather than predicting future convergence trends [38]. While retrospective analyses provide valuable insights, they often lack the forward-looking perspective necessary for proactively identifying synergistic technology relationships. To bridge this gap, our study proposed a dual-level predictive analytical framework to systematically forecast potential technology linkages. The similarity-based method excels at efficiently pinpointing structurally promising pairs, while the classification method refines these predictions with data-driven precision. The findings unlock important knowledge residing in patent information and yield the following implications:

(1): In terms of theoretical implications, this study proposed a novel analysis framework for predicting potential convergent technological linkages. By integrating similarity-based scoring with a classifier model, the framework can mitigate false positives and allow for a systematic identification of emerging technology patterns. The high flexibility in the execution of the method (for instance, the ability to adjust threshold values) further enhances its adaptability and enables researchers to tailor the model to various research purposes. This study adopted a rather conservative perspective in adjusting the hyperparameters of the random walk, but a more aggressive optimization strategy may be applied to explore broader network contexts. Furthermore, the framework can also be adapted to other technology domains, thereby broadening its applicability for additional innovation studies [17,39].
(2): In terms of managerial implications, this study can serve as a strategic tool for decision-makers by providing actionable insights into emerging convergent technologies. It enables them to anticipate novel combinatory innovations, thereby aligning their strategic initiatives with future technological investments. The proposed framework can help decision-makers to allocate resources more efficiently and mitigate risks associated with market uncertainties. By offering a data-driven view of emerging trends and surfacing early signals of technology convergence, the framework enables R&D managers to design targeted policies that facilitate technology buy-in or transfer. This strategic foresight translates into tangible benefits, such as developing a practical roadmap for organizations seeking to navigate complex technological landscapes and ensuring a competitive edge at the forefront of innovation in the evolving EV sector [40].
(3): In terms of uncovering technology convergence opportunities, the predictions highlighted critical technology areas that exhibit strong potential for convergence. The analysis revealed that successful EV technology development will increasingly depend on the ability to integrate diverse technological domains, particularly in areas of thermal management, materials engineering, and protective systems [41]. The findings align with the growing demand in the EV market for the development of integrated solutions in battery protection and thermal management [42]. Thus, this understanding can guide R&D strategies and resource allocation for future technology development.
(4): From an economic sustainability perspective, the emergence of new technological convergence opportunities, such as integrating advanced thermal management systems and novel composite materials, has the potential to extend vehicle lifespan and lower the overall total cost of ownership of EVs. Such improvements could directly impact consumer acceptance rates and adoption speed. The insights gained from our predictive framework offer R&D departments and policymakers tools to preemptively align innovation and investment priorities, thus optimizing economic returns from technology development activities. Furthermore, business strategies in the EV sector could benefit substantially from our predictive analysis of convergence. By understanding which technologies are likely to converge, automotive and technology companies can proactively manage their R&D portfolio, strategically entering new market segments or adjusting existing product lines. For instance, battery manufacturers, semiconductor producers, and automotive OEMs can leverage convergence insights to form strategic alliances or joint ventures, enabling cost-sharing, risk mitigation, and faster time-to-market for innovative EV solutions.
(5): In terms of sustainability impacts, the findings align with the broader goals of sustainable development by fostering innovations that reduce greenhouse gas emissions and improve energy efficiency. This could lead to the development of new business models centered around green technologies, which are increasingly favored by consumers and policymakers.
(6): The findings of this study highlight several key areas where policymakers can intervene to accelerate EV adoption and innovation. By identifying high-potential technology convergence opportunities (e.g., thermal management systems and advanced composite materials), policymakers can prioritize funding for research programs that target these areas. For instance, directing funding toward projects that combine advanced composite materials with thermal management systems could enhance battery durability and vehicle efficiency [43]. Policymakers can also foster interdisciplinary collaboration between industries by offering tax credits, subsidies or grants for joint R&D projects. Governments can create favorable regulatory environments for convergent technologies by updating standards for EV components. For example, mandating safety protocols for advanced battery materials or thermal management systems could accelerate their adoption [44]. In particular, technology convergence often necessitates the creation of new standards to ensure compatibility across different systems. Hence, policymakers should work with industry stakeholders to develop and implement standards that accommodate these convergent technologies, ensuring seamless integration and market adoption. Lastly, expanding programs like the U.S. National Electric Vehicle Infrastructure (NEVI) Formula Program to include funding for convergent technologies could favor the development of innovative solutions that address multiple challenges simultaneously, such as enhancing both charging efficiency and battery longevity [45].

In sum, the proposed analysis framework can broaden the existing toolbox of generating technology intelligence and resolve uncertainty in technology convergence forecasting. The systematic approach supports strategic decision-making and R&D prioritization, ultimately fostering innovation and sustainable growth in the evolving EV sector.

5. Conclusions

This study provided a flexible framework for navigating and anticipating technology convergence opportunities through a data-driven approach. A case study in the EV field demonstrated the effectiveness of this framework, highlighting its ability to identify emerging synergies between distinct technological domains with high confidence. The derived insights enable organizations to strategically allocate R&D resources, accelerate innovation cycles, and capitalize on technology development opportunities.

Despite its contributions, this study has certain limitations that need to be considered and that offer avenues for future research. First, the predictive output may vary depending on the parameter settings during the node2vec transformation. Hence, additional studies could compare different parameter values to determine under which conditions the model yields more reliable performance and to better understand the impact on the exploration and exploitation trade-off. Second, expanding the applicability of the analysis framework to other domains beyond EVs would allow for a broader validation of its effectiveness and adaptability. Hence, future research could be extended to related areas, such as battery management systems, renewable energy, electronics or hybrid electric vehicles, for further testing and refining the methodology. Moreover, the proposed framework could be transferred to other tasks, such as R&D partner recommendation or patent citation prediction, by adjusting the input features. Third, accurately defining the analysis scope is crucial, as the quality of the patent data used can impact the accuracy of predictions. Including irrelevant patents can introduce noise, while excluding pertinent ones can omit significant data. To ensure comprehensive coverage across the EV sector, the patent retrieval strategy could be refined or extended to include patent data registered in other key patent jurisdictions. Lastly, to gain a more intuitive understanding of potential technology convergence dynamics, additional data sources beyond patent classifications, such as patent content or market data, need to be incorporated to provide a more comprehensive view of emerging technological trends. Integrating patent text analysis could enhance the interpretability of prediction results. We therefore encourage scholars to carefully examine further application cases to facilitate the diffusion and continuous improvement of this framework.

Author Contributions

Conceptualization, S.K.Y. and C.H.S.; Methodology, S.K.Y. and C.H.S.; Software, S.K.Y. and C.H.S.; Validation, S.K.Y. and C.H.S.; Formal analysis, S.K.Y. and C.H.S.; Investigation, S.K.Y. and C.H.S.; Data curation, S.K.Y. and C.H.S.; Writing—original draft preparation, S.K.Y. and C.H.S.; Writing—review and editing, S.K.Y. and C.H.S.; Visualization, S.K.Y. and C.H.S.; Supervision, C.H.S.; Project administration, C.H.S.; Funding acquisition, C.H.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the research grant of the Gyeongsang National University in 2024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

3rd Party Data. Restrictions apply to the availability of these data. Data were obtained from WINTELIPS and are available at https://www.wintelips.com/ (accessed on 26 January 2025) with the permission of WINTELIPS.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IEA	International Energy Agency
EVs	Electric vehicles
IPC	International Patent Classification
CPC	Cooperative Patent Classification
R&D	Research and development
NEVI	National Electric Vehicle Infrastructure

References

Ritchie, H. Cars, Planes, Trains: Where Do CO₂ Emissions from Transport Come From? Available online: https://ourworldindata.org/co2-emissions-from-transport (accessed on 3 January 2025).
Bonsu, N.O. Towards a Circular and Low-Carbon Economy: Insights from the Transitioning to Electric Vehicles and Net Zero Economy. J. Clean. Prod. 2020, 256, 120659.
Alanazi, F. Electric Vehicles: Benefits, Challenges, and Potential Solutions for Widespread Adaptation. Appl. Sci. 2023, 13, 6016.
European Parliament. EU Ban on the Sale of New Petrol and Diesel Cars from 2035 Explained. Available online: https://www.europarl.europa.eu/topics/en/article/20221019STO44572/eu-ban-on-sale-of-new-petrol-and-diesel-cars-from-2035-explained (accessed on 3 January 2025).
IEA. Global EV Outlook 2024; IEA: Paris, France, 2024; Available online: https://www.iea.org/reports/global-ev-outlook-2024 (accessed on 15 January 2025).
Block, A.; Song, C.H. Exploring the Characteristics of Technological Knowledge Interaction Dynamics in the Field of Solid-State Batteries: A Patent-Based Approach. J. Clean. Prod. 2022, 353, 131689.
Yoon, J.; Kim, K. TrendPerceptor: A Property–Function Based Technology Intelligence System for Identifying Technology Trends from Patents. Expert Syst. Appl. 2012, 39, 2927–2938.
Song, C.H.; Elvers, D.; Leker, J. Anticipation of Converging Technology Areas—A Refined Approach for the Identification of Attractive Fields of Innovation. Technol. Forecast. Soc. Change 2017, 116, 98–115.
Aaldering, L.J.; Leker, J.; Song, C.H. Analyzing the Impact of Industry Sectors on the Composition of Business Ecosystem: A Combined Approach Using ARM and DEMATEL. Expert Syst. Appl. 2018, 100, 17–29.
Curran, C.-S.; Leker, J. Patent Indicators for Monitoring Convergence—Examples from NFF and ICT. Technol. Forecast. Soc. Change 2011, 78, 256–273.
Oh, S.; Choi, J.; Ko, N.; Yoon, J. Predicting Product Development Directions for New Product Planning Using Patent Classification-Based Link Prediction. Scientometrics 2020, 125, 1833–1876.
Ma, S.-C.; Xu, J.-H.; Fan, Y. Characteristics and Key Trends of Global Electric Vehicle Technology Development: A Multi-Method Patent Analysis. J. Clean. Prod. 2022, 338, 130502.
Borgstedt, P.; Neyer, B.; Schewe, G. Paving the Road to Electric Vehicles—A Patent Analysis of the Automotive Supply Industry. J. Clean. Prod. 2017, 167, 75–87.
Li, X.; Yuan, X. Tracing the Technology Transfer of Battery Electric Vehicles in China: A Patent Citation Organization Network Analysis. Energy 2022, 239, 122265.
Feng, S.; An, H.; Li, H.; Qi, Y.; Wang, Z.; Guan, Q.; Li, Y.; Qi, Y. The Technology Convergence of Electric Vehicles: Exploring Promising and Potential Technology Convergence Relationships and Topics. J. Clean. Prod. 2020, 260, 120992.
Lee, J.; Park, S.; Lee, J. Exploring Potential R&D Collaboration Partners Using Embedding of Patent Graph. Sustainability 2023, 15, 14724.
Wang, L.; Li, M. An Exploration Method for Technology Forecasting That Combines Link Prediction with Graph Embedding: A Case Study on Blockchain. Technol. Forecast. Soc. Change 2024, 208, 123736.
Bae, J.; Chung, Y.; Lee, J.; Seo, H. Knowledge Spillover Efficiency of Carbon Capture, Utilization, and Storage Technology: A Comparison among Countries. J. Clean. Prod. 2020, 246, 119003.
Lee, C.-H. Global Patent Analysis of Battery Recycling Technologies: A Comparative Study of Korea, China, and the United States. World Electr. Veh. J. 2024, 15, 260.
Aaldering, L.J.; Leker, J.; Song, C.H. Competition or Collaboration?—Analysis of Technological Knowledge Ecosystem within the Field of Alternative Powertrain Systems: A Patent-Based Approach. J. Clean. Prod. 2019, 212, 362–371.
Caferoglu, H.; Elsner, D.; Moehrle, M.G. The Interplay Between Technology and Pre-Industry Convergence: An Analysis in the Technology Field of Smart Mobility. IEEE Trans. Eng. Manag. 2023, 70, 1504–1517.
Kriesch, L.; Losacker, S. A Global Patent Dataset of Bioeconomy-Related Inventions. Sci. Data 2024, 11, 1308.
Song, C.H. Examining the Patent Landscape of E-Fuel Technology. Energies 2023, 16, 2139.
Frenken, K.; Izquierdo, L.R.; Zeppini, P. Branching Innovation, Recombinant Innovation, and Endogenous Technological Transitions. Env. Innov. Soc. Transit. 2012, 4, 25–35.
Verhoeven, D.; Bakker, J.; Veugelers, R. Measuring Technological Novelty with Patent-Based Indicators. Res. Policy 2016, 45, 707–723.
Grover, A.; Leskovec, J. Node2vec: Scalable Feature Learning for Networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 855–864.
Sulaimany, S.; Khansari, M.; Masoudi-Nejad, A. Link Prediction Potentials for Biological Networks. Int. J. Data Min. Bioinform. 2018, 20, 161–184.
Daud, N.N.; Ab Hamid, S.H.; Saadoon, M.; Sahran, F.; Anuar, N.B. Applications of Link Prediction in Social Networks: A Review. J. Netw. Comput. Appl. 2020, 166, 102716.
Cho, J.H.; Lee, J.; Sohn, S.Y. Predicting Future Technological Convergence Patterns Based on Machine Learning Using Link Prediction. Scientometrics 2021, 126, 5413–5429.
Denter, N.M.; Aaldering, L.J.; Caferoglu, H. Forecasting Future Bigrams and Promising Patents: Introducing Text-Based Link Prediction. Foresight 2022. ahead-of-print.
Chen, W.; Qu, H.; Chi, K. Partner Selection in China Interorganizational Patent Cooperation Network Based on Link Prediction Approaches. Sustainability 2021, 13, 1003.
Kazemi, B.; Abhari, A. Content-Based Node2Vec for Representation of Papers in the Scientific Literature. Data Knowl. Eng. 2020, 127, 101794.
Yang, Z.; Ding, M.; Zhou, C.; Yang, H.; Zhou, J.; Tang, J. Understanding Negative Sampling in Graph Representation Learning. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1666–1676.
Lombardo, G.; Poggi, A. ActorNode2Vec: An Actor-Based Solution for Node Embedding over Large Networks. Intell. Artif. 2020, 14, 103–114.
Peng, J.; Guan, J.; Shang, X. Predicting Parkinson’s Disease Genes Based on Node2vec and Autoencoder. Front. Genet. 2019, 10, 226.
Giachetti, C.; Dagnino, G.B. The Impact of Technological Convergence on Firms’ Product Portfolio Strategy: An Information-Based Imitation Approach. RD Manag. 2017, 47, 17–35.
Agarwal, N.; Brem, A. Strategic Business Transformation through Technology Convergence: Implications from General Electric’s Industrial Internet Initiative. Int. J. Technol. Manag. 2015, 67, 196–214. [Google Scholar] [CrossRef]
Yuan, X.; Li, X. Mapping the Technology Diffusion of Battery Electric Vehicle Based on Patent Analysis: A Perspective of Global Innovation Systems. Energy 2021, 222, 119897. [Google Scholar] [CrossRef]
Abbas, K.; Abbasi, A.; Dong, S.; Niu, L.; Yu, L.; Chen, B.; Cai, S.-M.; Hasan, Q. Application of Network Link Prediction in Drug Discovery. BMC Bioinform. 2021, 22, 187. [Google Scholar] [CrossRef]
Kim, J.; Geum, Y. How to Develop Data-Driven Technology Roadmaps:The Integration of Topic Modeling and Link Prediction. Technol. Forecast. Soc. Change 2021, 171, 120972. [Google Scholar] [CrossRef]
Feng, S.; Magee, C.L. Technological Development of Key Domains in Electric Vehicles: Improvement Rates, Technology Trajectories and Key Assignees. Appl. Energy 2020, 260, 114264. [Google Scholar] [CrossRef]
Kim, J.; Oh, J.; Lee, H. Review on Battery Thermal Management System for Electric Vehicles. Appl. Therm. Eng. 2019, 149, 192–212. [Google Scholar] [CrossRef]
Zhang, J.; Shao, D.; Jiang, L.; Zhang, G.; Wu, H.; Day, R.; Jiang, W. Advanced Thermal Management System Driven by Phase Change Materials for Power Lithium-Ion Batteries: A Review. Renew. Sustain. Energy Rev. 2022, 159, 112207. [Google Scholar] [CrossRef]
Doughty, D.H. Vehicle Battery Safety Roadmap Guidance; National Renewable Energy Laboratory: Golden, CO, USA, 2012. [Google Scholar] [CrossRef]
IEA. Global EV Outlook 2023; IEA: Paris, France, 2023; Available online: https://www.iea.org/reports/global-ev-outlook-2023 (accessed on 15 January 2025).

Figure 1. Analysis procedure.

Figure 2. Conceptual flow of the dual-level link prediction approach.

Figure 3. Patent development trend of EV technology.

Figure 4. Top 20 frequently occurring CPC codes (Group-level CPC codes) (Note: Y02T-0010 = Road transport of goods or passengers; B60L-0050 = Electric propulsion with power supplied within the vehicle; Y02T-0090 = Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation; B60L-0053 = Methods of charging batteries, specially adapted for electric vehicles; B60L-2240 = Control parameters of input or output; Target parameters; B60L-0058 = Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles; Y02E-0060 = Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation; B60L-0003 = Electric devices on electrically propelled vehicles for safety purposes; Monitoring operating variables, e.g., speed, deceleration or energy consumption; H02J-0007 = Circuit arrangements for charging or depolarizing batteries or for supplying loads from batteries; B60K-0001 = Arrangement or mounting of electrical propulsion units; B60L-0015 = Methods, circuits, or devices for controlling the traction-motor speed of electrically propelled vehicles; H01M-2220 = Batteries for particular applications; H01M-0010 = Secondary cells; Manufacture thereof; B60K-2001 = One motor mounted directly on a propulsion axle for rotating right and left wheels of this axle; B60L-2210 = Converter types; B60L-2250 = Driver interactions; H01M-0050 = Constructional details or processes of manufacture of the non-active parts of electrochemical cells other than fuel cells, e.g., hybrid cells; B60L-2200 = Type of vehicles; Y04S-0030 = Systems supporting specific end-user applications in the sector of transportation; B60L-2260 = Operating Modes).

Figure 5. Network visualization for the EV patent landscape. (Note: The higher the node centrality, the darker the node color appears).

Figure 6. Visualization of predicted technology convergence opportunities. (Note: The green links represent predicted connections between nodes that were not previously connected).

Table 1. Top 20 patent assignees in the EV field.

Rank	Assignee Name	Values	Rank	Assignee Name	Values
1	Ford Global Tech LLC	632	10	Robert Bosch	66
2	Toyota Motor Corp	347	12	Sony Group	65
3	Honda Motor	227	13	NIO USA Inc.	64
4	Hyundai Motor Company | Kia Corporation	218	14	Kubota Ltd.	60
5	GM Global Technology Operations LLC	184	15	Mitsubishi Electric Corp	60
6	Nissan Motor Co., Ltd.	89	16	Porsche Automobil Holding SE	59
7	Murata Manufacturing Co.	88	17	Proterra Inc.	59
8	Thunder Power New Energy Vehicle Development Company	81	18	Qualcomm	58
9	BYD Co., Ltd.	76	19	Subaru Corporation	57
10	Hyundai Motor Company	66	20	LS Electric Co., Ltd.	55

Table 2. Global distribution of patent applications.

Country	Number of Patent Applications
USA	3060
Japan	1745
South Korea	606
Germany	531
China	382
Taiwan	122
Sweden	118
Canada	117
France	95
Italy	60

Table 3. Top 10 interconnected node pairs sorted by co-occurrence frequency.

Node Pairs		Number of Co-Occurrences
Y02T-0010	Y02T-0090	3251
B60L-0053	Y02T-0010	2989
B60L-0053	Y02T-0090	2756
B60L-0050	Y02T-0010	2730
B60L-2240	Y02T-0010	2244
B60L-0058	Y02T-0010	2163
Y02E-0060	Y02T-0010	1723
H02J-0007	Y02T-0010	1563
B60L-0003	Y02T-0010	1513
B60L-0050	B60L-0058	1448

Note: Y02T-0010 = Road transport of goods or passengers; B60L-0050 = Electric propulsion with power supplied within the vehicle; Y02T-0090 = Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation; B60L-0053 = Methods of charging batteries, specially adapted for electric vehicles; B60L-2240 = Control parameters of input or output; Target parameters; B60L-0058 = Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles; Y02E-0060 = Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation; B60L-0003 = Electric devices on electrically propelled vehicles for safety purposes; Monitoring operating variables, e.g., speed, deceleration or energy consumption; H02J-0007 = Circuit arrangements for charging or depolarizing batteries or for supplying loads from batteries.
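The co-occurrence counts in Table 3 record how often two group-level CPC codes are assigned to the same patent. A minimal sketch of that counting step is given below, assuming each patent is represented as a list of its CPC group codes. The function name `count_cooccurrences` and the three toy patents are hypothetical and are not drawn from the study's data.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(patents):
    """Count how often each unordered pair of CPC codes appears on the same patent."""
    counts = Counter()
    for codes in patents:
        # Deduplicate codes within a patent and sort them so that
        # (a, b) and (b, a) are counted as the same pair.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy input, for illustration only.
toy_patents = [
    ["Y02T-0010", "Y02T-0090", "B60L-0053"],
    ["Y02T-0010", "B60L-0053"],
    ["Y02T-0010", "Y02T-0090"],
]
pair_counts = count_cooccurrences(toy_patents)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

The resulting pair counts can serve as edge weights of an undirected co-classification network in which each CPC code is a node.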

Table 4. Top 10 candidate node pairs for similarity-based link prediction (sorted by similarity score).

Unconnected Node Pairs		Similarity Score
H01L-0029	H10D-0064	0.914
B60R-0017	F16N-2200	0.820
A45F-0003	B32B-0019	0.814
A45F-0003	B32B-2255	0.813
B67D-0007	F17C-2260	0.807
F01K-0003	G21D-0005	0.805
F28F-2013	B23K-0015	0.804
A45F-0003	B32B-2571	0.793
D03D-0015	B32B-0019	0.790
B67D-0007	F17C-0013	0.790

Note: H01L-0029 = Semiconductor devices adapted for rectifying, amplifying, oscillating or switching, or capacitors or resistors with at least one potential-jump barrier or surface barrier; H10D-0064 = Electrodes of devices having potential barriers; B60R-0017 = Arrangements or adaptations of lubricating systems or devices; F16N-2200 = Condition of lubricant; A45F-0003 = Traveling or camp articles; B32B-0019 = Layered products comprising natural mineral fibers or particles; B32B-2255 = Coating on the layer surface; B67D-0007 = Apparatus or devices for transferring liquids from bulk storage containers or reservoirs into vehicles or into portable containers; F17C-2260 = Purposes of gas storage and gas handling; F01K-0003 = Plants characterized by the use of steam or heat accumulators; G21D-0005 = Arrangements of reactor and engine in which reactor-produced heat is converted into mechanical energy; F28F-2013 = Arrangements for modifying heat-transfer; B23K-0015 = Electron-beam welding or cutting; B32B-2571 = Protective equipment; D03D-0015 = Woven fabrics characterized by the material, structure or properties of the fibers, filaments, yarns, threads or other warp or weft elements used; F17C-0013 = Details of vessels or of the filling or discharging of vessels.
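The scores in Table 4 compare the embedding vectors of node pairs that are not yet connected in the network. The sketch below illustrates the idea under the assumption that cosine similarity is the measure, which is a common choice for node2vec embeddings but is not confirmed here. The node labels, three-dimensional vectors, and function names are hypothetical stand-ins for the learned embeddings.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_unconnected_pairs(embeddings, edges):
    """Score every node pair without an existing edge, highest similarity first."""
    existing = {frozenset(edge) for edge in edges}
    scored = [
        (a, b, cosine_similarity(embeddings[a], embeddings[b]))
        for a, b in combinations(sorted(embeddings), 2)
        if frozenset((a, b)) not in existing
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical embeddings and edge list, for illustration only.
toy_embeddings = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.9, 0.1, 0.0],
    "C": [0.0, 1.0, 0.0],
}
toy_edges = [("A", "C")]  # A and C are already connected, so they are skipped
ranking = rank_unconnected_pairs(toy_embeddings, toy_edges)
for a, b, score in ranking:
    print(f"{a} - {b}: {score:.3f}")
```

Scoring only the unconnected pairs keeps the ranking focused on candidate links rather than on relationships the network already contains.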

Table 5. Summary of training and test samples for binary classification.

Dataset	Time Period	Number of Nodes	Training Samples	Test Samples	Positive Edges	Negative Edges
Training set	2015–2020	1155	52,524	-	26,262	26,262
Test set 1	2021	501	-	17,270	8635	8635
Test set 2	2022	489	-	16,026	8013	8013
Test set 3	2023	416	-	12,542	6271	6271

Table 6. Evaluation metrics for the binary classification-based link prediction.

Model	Metric	Year 2021	Year 2022	Year 2023	Average
Logistic regression	Accuracy	0.6847	0.6770	0.7002	0.6873
	Precision	0.7089	0.6895	0.7348	0.7111
	Recall	0.6268	0.6438	0.6265	0.6324
	F1-score	0.6653	0.6659	0.6764	0.6692
Random forest	Accuracy	0.8819	0.8768	0.8885	0.8824
	Precision	0.8894	0.8709	0.8611	0.8738
	Recall	0.8724	0.8848	0.9263	0.8945
	F1-score	0.8808	0.8778	0.8925	0.8837
XGBoost	Accuracy	0.8717	0.8659	0.8708	0.8695
	Precision	0.8560	0.8402	0.8257	0.8406
	Recall	0.8938	0.9037	0.9399	0.9125
	F1-score	0.8745	0.8708	0.8791	0.8748
LightGBM	Accuracy	0.8618	0.8573	0.8609	0.8600
	Precision	0.8479	0.8332	0.8230	0.8347
	Recall	0.8819	0.8934	0.9196	0.8983
	F1-score	0.8646	0.8623	0.8687	0.8652
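The F1-scores in Table 6 are the harmonic mean of the reported precision and recall, F1 = 2PR/(P + R). As a quick consistency check, the sketch below recomputes the 2021 F1-score for two of the models from the tabulated precision and recall. Because the inputs are already rounded to four decimals, small differences in the last digit are possible. The helper name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the year 2021, copied from Table 6.
reported_2021 = {
    "Logistic regression": (0.7089, 0.6268, 0.6653),
    "Random forest": (0.8894, 0.8724, 0.8808),
}

for model, (p, r, f1_reported) in reported_2021.items():
    f1 = f1_score(p, r)
    print(f"{model}: recomputed F1 = {f1:.4f}, reported F1 = {f1_reported:.4f}")
```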

Table 7. Final prediction results sorted by prediction probability.

Predicted Technology Convergence Opportunities		Similarity Score	Prediction Probability
H01L-0029	H10D-0064	0.9143	1.00
F28F-2013	B23K-0015	0.8036	0.99
D10B-2505	B32B-0019	0.7746	0.98
D03D-0001	B32B-2571	0.7561	0.96
D03D-0001	B32B-0019	0.7666	0.96
B32B-2457	A45F-0003	0.7515	0.95
B82Y-0030	C09D-0127	0.7771	0.95
B32B-2250	D10B-2331	0.7506	0.94
D10B-2505	B32B-2260	0.7526	0.94
D10B-2401	B32B-0019	0.7837	0.91
B32B-0009	D10B-2401	0.7554	0.90
G01J-2001	F25D-0017	0.7758	0.90
D03D-0015	B32B-0019	0.7900	0.89
D10B-2401	B32B-2571	0.7575	0.88
Y02P-0020	B01J-0006	0.7530	0.88
G01H-0001	H04R-2410	0.7663	0.86
C01B-0032	C09D-0127	0.7770	0.86
D03D-0015	B32B-2260	0.7563	0.84

Note: H01L-0029 = Semiconductor devices adapted for rectifying, amplifying, oscillating or switching, or capacitors or resistors with at least one potential-jump barrier or surface barrier; H10D-0064 = Electrodes of devices having potential barriers; F28F-2013 = Arrangements for modifying heat-transfer; B23K-0015 = Electron-beam welding or cutting; B32B-2571 = Protective equipment; D10B-2505 = Industrial textiles; B32B-0019 = Layered products comprising natural mineral fibers or particles; D03D-0001 = Woven fabrics designed to make specified articles; B32B-2457 = Electrical equipment; A45F-0003 = Traveling or camp articles; B82Y-0030 = Nanotechnology for materials or surface science; C09D-0127 = Coating compositions based on homopolymers or copolymers of compounds having one or more unsaturated aliphatic radicals; B32B-2250 = Layers arrangement; D10B-2331 = Fibers made from polymers obtained otherwise than by reactions only involving carbon-to-carbon unsaturated bonds; B32B-2260 = Layered product comprising an impregnated, embedded, or bonded layer wherein the layer comprises an impregnation, embedding, or binder material; D10B-2401 = Physical properties; B32B-0009 = Layered products comprising a particular substance; G01J-2001 = Photometry; F25D-0017 = Arrangements for circulating cooling fluids; Arrangements for circulating gas; D03D-0015 = Woven fabrics characterized by the material, structure or properties of the fibers, filaments, yarns, threads or other warp or weft elements used; Y02P-0020 = Technologies relating to chemical industry; B01J-0006 = Calcining; Fusing; G01H-0001 = Measuring vibrations in solids by using direct conduction to the detector; H04R-2410 = Microphones; C01B-0032 = Carbon; Compounds thereof.

Table 8. Summary of key technology convergence areas in the EV field.

Convergence Area	Predicted CPC Pairs	Potential Impact on EVs
Thermal management	F28F-2013, B23K-0015	Enhanced battery pack cooling via advanced heat exchangers and welding techniques
Battery protection	H01L-0029, H10D-0064	Improved powertrain efficiency with semiconductor-based energy management
Composite materials	B32B-0019, D10B-2505	Fire-resistant materials for battery safety
Composite materials	B32B-0019, D03D-0001	Thermal protection systems for battery assemblies
Nanotechnology and Coatings	B82Y-0030, C09D-0127	Durable coatings for battery and electronics protection

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

