1. Introduction
Weather prediction has long been a cornerstone of our ability to understand and adapt to the ever-changing dynamics of our environment [1]. As the global climate continues to undergo profound shifts, the importance of accurate and timely weather forecasting has never been more apparent [2]. The consequences of climate change, including extreme weather events, rising temperatures, and shifting precipitation patterns, underscore the critical need for reliable weather prediction systems that can help us mitigate risks, protect lives, and safeguard vital resources [3,4].
In recent years, the intersection of meteorology and technology has ushered in a new era of weather prediction. Advanced computational models, satellite imaging, and data analytics have revolutionized our ability to anticipate and respond to weather-related challenges [5]. Modern numerical weather prediction (NWP) has its roots in the 1920s [6]. Yet, with the emergence of data-driven techniques, especially deep learning models, we have not only reduced computational demands but also maintained or even improved the accuracy of predictions in comparison with traditional NWP methodologies [7]. This transformation has not only improved our understanding of atmospheric processes but has also paved the way for innovative solutions that can help us navigate an increasingly unpredictable climate.
The graph in Figure 1 illustrates the projected growth of the weather forecasting service market from 2022 to 2032. The market size, measured in billions of USD, is seen to exhibit a consistent upward trajectory over the decade. Starting at 1.9 billion in 2022, it is projected to soar to 4.6 billion by 2032 [8]. This steady increase underscores the escalating significance and investments directed toward weather forecasting services. The rise can be attributed not only to traditional methods but also to the increasing application of advanced technologies in the field. Leading tech giants such as Google, NVIDIA, and Microsoft have ventured into this domain, leveraging deep learning models to revolutionize weather forecasting [9,10,11]. Such initiatives from major corporations validate the importance of weather prediction in today’s age and showcase the immense potential of combining meteorology with cutting-edge technology.
Within this dynamic landscape, the patenting of weather prediction technologies has emerged as a key indicator of innovation and progress. Patents not only serve as a testament to human ingenuity but also play a pivotal role in the dissemination and commercialization of cutting-edge weather forecasting methodologies and tools [12]. Analyzing patent data within the field of weather prediction is thus a valuable endeavor, particularly given the market growth shown in Figure 1 (2022–2032), which reflects increased investment in, and commercialization of, these methodologies and tools.
The primary objective of this paper is to conduct a comprehensive analysis of patent data within the field of weather prediction. By exploring patent grant durations, the geographical distribution of innovations, emerging technologies such as the “transformer” model, and keyword trends, this study seeks to provide insights into the dynamics of innovation in the weather prediction domain, which directly impacts society. Two approaches were used in the experiments: trend analysis with text mining, and prediction of patent grant duration with machine learning algorithms.
Moreover, to complement our understanding and ensure a more holistic view, we integrated datasets from arXiv. arXiv, renowned for its rich repository of AI-related papers, provides an invaluable context to our study [13]. By incorporating scholarly articles from arXiv, we could juxtapose our patent trends against a backdrop of cutting-edge academic discourse, especially in artificial intelligence.
By shedding light on the burgeoning role of artificial intelligence, showcasing predictive modeling possibilities for patent grant durations, and identifying future research directions, this paper contributes to bridging gaps in patent analysis. Ultimately, the research carries significance in its potential to drive innovation and enhance our capacity to address real-world weather-related challenges, thus benefiting both scientific advancements and society at large.
2. Literature Review
Numerous studies have delved into discerning patent trends using text-mining techniques. In this section, we begin by exploring the methodologies of these experiments, shedding light on their primary findings and outcomes. We also address several papers that focus on weather prediction.
Rezende et al. presented a data-mining framework for patents by leveraging natural language processing. This technique focuses on analyzing technological trends and comparing patent similarities. Using the US Patent and Trademark Office’s data, a decline in flash memory and PDA technologies was observed from 2010 to 2018. Furthermore, to determine patent similarity, methods such as LSA, Word2vec, and WMD were compared with the Jaccard index. LSA and WMD showed comparable results, whereas Jaccard’s indications differed from the aforementioned methods [14]. Gim et al. introduced a trend analysis method, leveraging ETI relations to discern patterns from patent datasets. Results from a real IoT patent dataset revealed 98.6% expansion relations and 1.4% transition relations among ETI relations. The study also confirms that the proposed dictionaries enhance accuracy in automatically extracting ETI relations from patent networks [15]. Han et al. developed a unique interactive method, PatStream, for analyzing patent dataset trends. This system integrates multiple views linked by brushing and offers a streamgraph view for spotting technological trends. Additional views present deeper insights such as IPC distribution, patent applicants, and innovation scores. Backed by advanced natural language processing, PatStream aids in concept extraction and patent similarity evaluations. A use case from the “inductive sensor” field demonstrated PatStream’s efficiency in giving a quick technology overview and aiding in technology management decisions [16].
Hotte and Jee devised a framework that combines patent analysis and Twitter data mining to monitor emerging technologies. The research involved tracking the technology’s evolution through patents and gauging Twitter users’ perceptions and expectations about it. By comparing results from both data sources, CCATs demonstrated strong technological complementarities with mitigation, as over 25% of CCATs offer mitigation benefits, and this result offers insights into the development trends of the technology [17]. Touboul et al. examined global innovation rates in climate adaptation technology, spotlighting leading nations and technology diffusion patterns. The findings indicate slower progress in adaptation technology innovation compared with low-carbon technologies since 2005, especially in slower-innovating sectors such as agriculture. A significant portion of this innovation is centralized in China, Germany, Japan, South Korea, and the U.S., which make up almost two-thirds of global patented inventions for climate adaptation. Notably, technology transfer through patents is minimal, particularly in areas such as agriculture and flood protection, with negligible transfers to low-income countries. This creates a prominent disparity between countries’ adaptation necessities and the available technological solutions [18].
Our work differs from the aforementioned studies in two critical ways. Firstly, we narrow our focus to specific AI technologies within patents related to weather prediction. Secondly, we assess the influence of AI-related keywords on the time it takes for a patent to be granted. This specialized approach sets our research apart from the related works introduced above.
4. Results
Trend analysis was also conducted for certain keywords (“machine learning”, “deep learning”, and “transformer”) with two different datasets from arXiv and Google Patents, which are exhibited in Figure 8, Figure 9 and Figure 10. We calculated the Pearson correlation coefficient, which measures the linear relationship between two datasets; a value close to 1 indicates a strong positive correlation. High correlation values (e.g., >0.9 for “machine learning” and “deep learning”) between academic research (arXiv) and patent trends suggest synchronized advancements in both sectors. Such synchrony implies that technologies or methodologies represented by these keywords are mature, with academia and industry innovating concurrently. On the other hand, a moderate correlation (e.g., 0.5 for “transformer”) might hint at a disparity between academic research and its industrial application.
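The correlation step described above can be sketched as follows. This is a minimal illustration, assuming each dataset has been reduced to yearly counts of documents that mention a keyword; the counts below are hypothetical placeholders, not values from the arXiv or patent datasets, and the function name `pearson` is ours.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical yearly keyword counts for 2018-2022 (assumed, for illustration only).
arxiv_counts = [12, 20, 35, 50, 71]
patent_counts = [3, 6, 9, 15, 20]

r = pearson(arxiv_counts, patent_counts)
print(f"Pearson r = {r:.3f}")
```

With only a handful of yearly points, the coefficient is sensitive to individual years, so a value such as 0.5 for a recent keyword should be read with that caveat in mind.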
It is essential to recognize that pioneering research, including cutting-edge models harnessing transformers for weather prediction, typically undergoes a gradual evolution from academic inception to practical applications that meet the criteria for patenting. For instance, FourCastNet, one of the leading models that utilizes transformers for weather prediction, was unveiled on arXiv in 2022, signaling that it may necessitate additional time before integrating into the patent landscape. Similarly, groundbreaking innovations such as ClimaX, a transformative leap in weather prediction, emerged as recently as 2023, making their immediate appearance in patents less probable [10,11]. These intricacies contribute to the observed delay in the emergence of “transformer” keywords within our patent dataset, underscoring the dynamic nature of technological adoption in the field of weather prediction.
However, the strong correlation between academic research trends and practical applications, evident for the keywords “machine learning” and “deep learning”, suggests how patents mentioning “transformer” may develop in the future. These two keywords yielded correlation scores exceeding 0.9, signifying a robust alignment between the academic and industrial sectors in the assimilation and advancement of these technologies. In contrast, as mentioned earlier, “transformer” represents a relatively recent development within the broader domain of deep learning, potentially resulting in a slower pace of industry adoption compared with its academic prominence. Consequently, whereas “machine learning” and “deep learning” flourish in both academic research and patent filings, “transformer” is gradually gaining ground, hinting at an impending shift in the technological landscape of weather prediction patents. These dynamics underscore the importance of sustained monitoring and analysis to capture and understand emerging trends.
In this experiment, we analyzed patent data to investigate the publication timelines of patents related to machine learning, deep learning, and neural networks, contrasting them with patents in other domains. We focused on patents filed after 2015 and examined the time difference between the priority date and publication date as a key metric. Our analysis revealed a noteworthy and somewhat unexpected pattern. As Figure 11 shows, patents associated with those keywords exhibited a shorter mean time difference between priority and publication dates compared with patents without these keywords. This implies that, on average, patents in the domain of artificial intelligence (AI) and deep learning technologies tend to be published more rapidly following their initial filing. One plausible explanation is that AI technologies, marked by their fast-paced innovation and applicability, may encourage inventors to expedite the patent publication process to stake their claims in a competitive landscape.
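The comparison above amounts to grouping patents by keyword presence and averaging the priority-to-publication lag. The sketch below illustrates that computation under stated assumptions: keywords are matched in the patent title, "filed after 2015" is read as a priority year later than 2015, and the records are hypothetical rather than drawn from the dataset.

```python
from datetime import date
from statistics import mean

AI_KEYWORDS = ("machine learning", "deep learning", "neural network")

def has_ai_keyword(title):
    """True if the title mentions any AI keyword (title matching is an assumption)."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in AI_KEYWORDS)

def mean_lag_days(records, min_priority_year=2016):
    """Mean priority-to-publication lag in days for AI and non-AI patents."""
    lags = {"ai": [], "other": []}
    for title, priority, publication in records:
        if priority.year < min_priority_year:
            continue  # keep only patents filed after 2015 (assumed reading)
        group = "ai" if has_ai_keyword(title) else "other"
        lags[group].append((publication - priority).days)
    return {group: mean(values) for group, values in lags.items() if values}

# Hypothetical records: (title, priority date, publication date).
records = [
    ("Rainfall nowcasting using deep learning", date(2019, 3, 1), date(2020, 3, 1)),
    ("Neural network for storm tracking", date(2018, 6, 15), date(2019, 12, 15)),
    ("Weather radar antenna mount", date(2017, 1, 10), date(2019, 1, 10)),
    ("Barometric sensor housing", date(2016, 5, 5), date(2018, 11, 5)),
]

result = mean_lag_days(records)
print(result)
```

A difference in means alone does not establish significance; with real data, a distributional comparison would be a natural next check.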
Moreover, we undertook a comprehensive data preprocessing and feature engineering process to prepare patent data for the task of predicting patent grant durations. Our workflow involved several critical steps. First, we calculated the time span between the priority date and the grant date, a fundamental metric for understanding the patent grant process. To ensure data quality, we converted date columns to datetime objects and handled missing values. Notably, we introduced a categorical representation of grant durations, categorizing spans into “Short Turnaround”, “Medium Turnaround”, and “Long Turnaround”. Additionally, we utilized text vectorization through TF-IDF to convert patent titles into numerical features suitable for machine learning. To facilitate predictive modeling, we converted the categorical grant duration labels into integers using label encoding. Finally, we partitioned the dataset into training and test sets for model evaluation. These data preprocessing and feature engineering steps lay the foundation for subsequent machine learning tasks, enabling the prediction of patent grant durations based on patent titles and categorical duration labels. As Figure 12 depicts, various machine learning algorithms including “GaussianNB”, “SVC”, “Gradient Boosting”, “XGBoost”, “LightGBM”, and “Extra Tree” were utilized as classifiers, and LightGBM attained the highest accuracy of 77.02%. For comparison, we also utilized bidirectional encoder representations from transformers (BERT), but it achieved only about 66.5% accuracy. Even though the highest accuracy was below 80%, this experiment showed the possibility of predicting grant duration using only the title of the patent.
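The preprocessing steps described above can be sketched in plain Python. This is an illustration only, not the implementation used in the experiments: the day thresholds for the three duration classes are assumed (the text does not state them), the TF-IDF weighting is a basic unsmoothed variant, the patents are hypothetical, and no classifier is trained here.

```python
import math
import random
from collections import Counter
from datetime import date

def duration_label(priority, grant, short_max=1000, medium_max=2000):
    """Bin the priority-to-grant span in days; thresholds are assumed values."""
    days = (grant - priority).days
    if days <= short_max:
        return "Short Turnaround"
    if days <= medium_max:
        return "Medium Turnaround"
    return "Long Turnaround"

def label_encode(labels):
    """Map each distinct label to an integer, in sorted label order."""
    mapping = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping

def tfidf(titles):
    """Basic TF-IDF: relative term frequency times log(N / document frequency)."""
    docs = [title.lower().split() for title in titles]
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(doc_freq)
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty titles
        vectors.append(
            [counts[t] / total * math.log(n_docs / doc_freq[t]) for t in vocab]
        )
    return vectors, vocab

def train_test_split(n_rows, test_fraction=0.25, seed=0):
    """Shuffle row indices reproducibly and split them into train and test."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * (1 - test_fraction))
    return indices[:cut], indices[cut:]

# Hypothetical patents: (title, priority date, grant date).
patents = [
    ("Deep learning rainfall forecast", date(2018, 1, 1), date(2020, 1, 1)),
    ("Weather radar signal processing", date(2015, 1, 1), date(2019, 6, 1)),
    ("Storm forecast neural network", date(2014, 1, 1), date(2020, 1, 1)),
    ("Rainfall gauge calibration", date(2017, 1, 1), date(2018, 1, 1)),
]

labels = [duration_label(priority, grant) for _, priority, grant in patents]
y, mapping = label_encode(labels)
X, vocab = tfidf([title for title, _, _ in patents])
train_idx, test_idx = train_test_split(len(patents))
print(labels, y, len(vocab), train_idx, test_idx)
```

The feature rows `X[i]` and integer labels `y[i]` selected by `train_idx` would then be passed to a classifier of choice, and accuracy would be measured on the rows selected by `test_idx`.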
5. Conclusions
This paper explored patent data analysis, uncovering intriguing patterns and trends in the field of weather prediction. Notably, a significant proportion of weather prediction innovations adhered to a common timeline for patenting, with a distinctive peak in grant durations observed between 1500 and 2000 days. This standard timeline reflects the complexities of the patent review process in this domain. The years from 2010 to 2023 witnessed a consistent upward trend in patent grants for weather prediction technologies, with 2020 to 2022 marking periods of heightened innovation. The dataset showcased a global landscape, with the United States, China, and Japan as prominent contributors to weather prediction innovation.
One of the most striking findings was the growing influence of artificial intelligence (AI), exemplified by the emergence of AI-related keywords such as “machine learning” and “neural network”. These trends signify a paradigm shift in weather prediction, where data-driven methodologies are increasingly leveraged for improved predictive accuracy [26]. Patents featuring AI-related keywords demonstrated a shorter span between priority and publication than those lacking such keywords. Furthermore, this research examined correlations between keyword trends in data sourced from arXiv and Google Patents. Keywords such as “machine learning” and “deep learning” produced correlation scores exceeding 0.9, indicating that patent trends closely mirror those on arXiv. The “transformer” keyword currently displays a lower correlation score, which can be attributed to its novelty in the tech landscape and the resulting lag in its representation in patents [27].
Although this study provides valuable insights into patent dynamics in weather prediction and the growing influence of artificial intelligence (AI), it is essential to acknowledge its limitations. First and foremost, the predictive modeling aspect of the study, aimed at forecasting patent grant durations based solely on patent titles, exhibited a moderate level of accuracy. The highest accuracy achieved, 77.02%, leaves room for improvement. To address this limitation and further enhance our understanding of patent grant durations, future research could explore the integration of multiple data modalities, such as patent text, images, and structured data [28,29,30]. Additionally, qualitative analysis through interviews or surveys with patent examiners, applicants, and legal experts could provide valuable insights into the non-textual factors influencing patent grant durations [31,32].
In essence, this study offers valuable insights into the patent dynamics of weather prediction, illuminating the rising prominence of artificial intelligence (AI) and the promising prospects of predictive modeling in patent analysis. By interpreting patent grant durations, understanding innovation hotspots, and tracking recent technologies, this research contributes to a deeper comprehension of the dynamic weather prediction landscape. Moreover, it provides actionable knowledge that can inform decision-makers in academia, industry, and policy, facilitating informed choices about research directions and resource allocation [33]. As we bridge existing gaps and improve predictive accuracy, the findings presented in this study hold the potential to drive innovation in weather prediction technologies and ultimately enhance our ability to address weather-related challenges in the real world.