*7.2. Performance Comparison*

From an investor's perspective, the accuracy of a classifier is only of secondary importance compared to its usefulness as a support tool for investment decisions. Figure 12 shows the Year-End and Year-Highest return distributions for positive and negative predictions made by the random forest and SVM models. The target return group "Under 0%" is assumed not to be of interest to investors, since correctly predicting that a stock may reach a target price that is lower than the current price is likely of limited investment value. These observations are therefore not included in the return distributions presented in Figure 12.

**Figure 12.** Actual return distribution by prediction (excl. "Under 0%" target return group).

For the Year-End target, the random forest, which was the most accurate model for this target, showed the most interesting distributions. In particular, positive predictions of the random forest not only have a clearly higher median and mean than all returns (in grey); their first quartile also exceeds zero (3.2%). This means that less than 25% of the stocks for which the model predicted that the target price would be reached experienced a negative return over the subsequent year. In contrast, the negative predictions lead to a median year-end return close to zero. Thus, close to 50% of these observations were characterized by a negative return, whereas overall this is the case for only about 39.4% of observations. For the SVM, the average year-end return is lower than that of all observations and the third quartile for negative predictions is larger than for positive ones, indicating that the top 25% of returns for negative predictions are actually higher than for positive predictions. It is noteworthy that for both the random forest and the SVM the distribution of negative predictions is wider, reflecting that negative predictions are associated with a wide variety of returns.
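The quartile comparison above can be reproduced with a few lines of code. The following is a minimal sketch, assuming each observation is available as a pair of model prediction (1 = target price predicted to be met, 0 = not met) and realized year-end return. The function name `summarize` and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean, median, quantiles


def summarize(returns):
    """Quartiles, mean, and share of negative returns for a list of returns."""
    # method="inclusive" treats the list as the full population of interest.
    q1, q2, q3 = quantiles(returns, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "mean": mean(returns),
        "share_negative": sum(r < 0 for r in returns) / len(returns),
    }


# Hypothetical (prediction, year-end return) pairs; NOT data from the study.
observations = [
    (1, 0.05), (1, 0.12), (1, 0.30), (1, -0.02), (1, 0.45),
    (0, -0.20), (0, 0.01), (0, 0.60), (0, -0.35), (0, 0.10),
]

# Group the realized returns by the model's prediction.
by_prediction = {}
for prediction, year_end_return in observations:
    by_prediction.setdefault(prediction, []).append(year_end_return)

summary = {pred: summarize(rets) for pred, rets in by_prediction.items()}

for pred, stats in sorted(summary.items()):
    print(pred, {name: round(value, 3) for name, value in stats.items()})
```

A first quartile above zero for the positive predictions corresponds to a `share_negative` below 0.25 in this summary, which is the reading used in the paragraph above.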

For the Year-Highest returns, the distributions look clearly different from those for the Year-End returns. Both the random forest and the SVM show higher median and average returns than overall. Moreover, the positive predictions are characterized by a larger variation of the returns. Again, the random forest shows better performance in terms of the actual returns. However, it should be kept in mind that these are the Year-Highest returns, which means that the corresponding high stock prices are reached at some point during the year, likely not at year-end and not necessarily for a prolonged period of time. Thus, achieving such returns might be extremely challenging. In this regard, the Year-End returns might be of greater interest to investors since they only require the implementation of a buy-and-hold strategy and do not necessarily require additional monitoring. The subsequent analysis will thus focus on the Year-End returns achieved using the most accurate model, the random forest. Figure 13 depicts the Year-End return by target return group achieved with negative and positive predictions of the random forest.

**Figure 13.** Year-End return distribution by random forest prediction and target return group.

It is apparent that the median and average returns by year-end are considerably higher for positive predictions of the random forest for stocks in the "30% to 70%" and "Above 70%" target return groups. The shares of these predictions compared to all predictions made are overall very low, 1.5% and 0.4%, respectively. However, they appear of interest as they suggest a potentially higher return for stocks with high target prices for which the random forest predicts that the target price will be met. Even within the "Above 70%" target return group, positive predictions are rare, with a share of only 4.1% (0.4% overall). Thus, positive predictions for "Above 70%" target returns are very rare but appear to be associated with very high average and median returns.

This finding was manually verified for the companies in this group (positive prediction and "Above 70%" target return) that were characterized by the highest returns (200% or higher). Among the 12 companies contained in this subset, these extremely high positive returns were observed during recoveries of stock prices that had previously been more than 90% below their all-time highs (e.g., Vestas Wind Systems A/S in 2012, SunPower Corp. in 2012 and 2019, Enphase Energy in 2017, First Solar in 2012). Apart from that, some companies simply experienced a stock price surge to new all-time highs after 2020, which was an exceptional year due to the COVID-19 pandemic (e.g., Enphase Energy, Sunrun Inc, Bloom Energy Corp., Sunnova Energy International). Thus, the results appear plausible, but this does not necessarily mean that they are repeatable.

Figure 14 allows a more detailed look at the positive return predictions of the random forest in terms of hits and misses.

It is unsurprising that when the model correctly predicts a target price being met (i.e., a hit), the returns achieved are higher than when a misclassification occurs (i.e., a miss). Moreover, it is intuitive that correctly predicting higher return groups leads on average to higher returns. Having said that, it is noteworthy that the actual returns in the "30% to 70%" and the "Above 70%" target return groups are very high, on average 195.2% and 296.5%, respectively. However, the magnitude of the returns associated with misses appears even more interesting. The average returns are in general negative, but their magnitude decreases for higher target return groups. In other words, the higher the target return group, the smaller the consequences of misclassifications. This appears plausible given that higher average target returns reflect a higher confidence of analysts in a company's stock. Moreover, a higher target return also means that the range of positive returns a stock can achieve while not meeting the target price is larger. The extreme case is the "Above 70%" target return group, for which the average return of misclassifications is still positive, with an average return of 18.6% and a median return of even 28%. The low or even positive average returns for misclassifications are one of the contributing factors for the overall high average returns of positive predictions for high return groups. Lastly, it is noteworthy that the share of hits for the positive predictions (= precision) is often around 70% and appears rather consistent throughout the return groups. This indicates that, independently of the magnitude of the return group, the positive predictions of the random forest model are largely correct.

*Sustainability* **2021**, *13*, x FOR PEER REVIEW 23 of 29

**Figure 14.** Average Year-End return for hits and misses of positive predictions of the random forest.
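The hit/miss breakdown behind Figure 14 can be illustrated with a short sketch. It assumes each observation carries its target return group, the model's prediction, whether the target price was actually met, and the realized year-end return. The function `hits_and_misses` and all sample rows are hypothetical; they only show how precision and the average returns of hits and misses would be computed per group.

```python
from collections import defaultdict
from statistics import mean


def hits_and_misses(rows):
    """Per target return group: precision and average return of hits and misses.

    rows: iterable of (group, predicted_met, actually_met, year_end_return).
    Only positive predictions (predicted_met is True) are considered.
    """
    buckets = defaultdict(lambda: {"hit": [], "miss": []})
    for group, predicted_met, actually_met, year_end_return in rows:
        if not predicted_met:
            continue  # negative predictions are outside the scope of this breakdown
        key = "hit" if actually_met else "miss"
        buckets[group][key].append(year_end_return)

    result = {}
    for group, bucket in buckets.items():
        n_positive = len(bucket["hit"]) + len(bucket["miss"])
        result[group] = {
            # precision = share of positive predictions that were hits
            "precision": len(bucket["hit"]) / n_positive,
            "avg_hit": mean(bucket["hit"]) if bucket["hit"] else None,
            "avg_miss": mean(bucket["miss"]) if bucket["miss"] else None,
        }
    return result


# Hypothetical rows; NOT data from the study.
rows = [
    ("Above 70%", True, True, 2.0),
    ("Above 70%", True, True, 3.0),
    ("Above 70%", True, False, 0.2),
    ("Above 70%", False, False, -0.1),
    ("30% to 70%", True, True, 1.5),
    ("30% to 70%", True, False, -0.1),
]

breakdown = hits_and_misses(rows)
for group, stats in breakdown.items():
    print(group, stats)
```

Since a miss in a high target return group only means that the return stayed below a high threshold, `avg_miss` can be positive there, as in the hypothetical "Above 70%" rows.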

From an investor's point of view, it should be kept in mind that clean energy stocks represent a relatively new asset class that tends to be very volatile [64]. Moreover, the performance of clean energy companies is linked to the (crude) oil price: the oil price has a unidirectional short-term causality on the price of alternative energy companies [65], and the volatility of the oil price affects the profitability of these stocks [66]. Apart from that, previous research found that the volatility of the oil market (e.g., measured by OVX) impacts the volatility of clean energy companies [67] and vice versa [68] and that this spillover effect of volatility is stronger than the spillover effect of returns [69]. Moreover, during the COVID-19 pandemic, the volatility spillovers appear to have intensified [66]. Apart from the (crude) oil market, technology stocks and investor sentiment towards renewable energy have been shown to affect the stocks of cleantech companies as well [69,70]. Finally, it is noteworthy that hedging against adverse movements of clean energy stocks can be possible using the volatility index VIX or crude oil [64] and that clean energy companies can be part of profitable hedging strategies themselves [68] as well as contributing to portfolio diversification, e.g., in times of extreme market events (e.g., a pandemic) [66].

#### **8. Conclusions**

In this paper, the accuracy and predictive power of mean target prices for the stocks of companies contained in the Standard and Poor's Global Clean Energy (USD) index were investigated. This study shows that the mean target prices for these stocks during the timeframe from 2009 to 2020 are on average 22.2% above the current stock price. This is in line with recent research works that cover time periods after 2000, whereas studies covering partially or entirely the 1990s show higher implied returns for target prices. The Year-End accuracy of 46.6% (41.5% excl. 2020) shows that less than half of the mean target prices were met by year-end, whereas the Year-Highest accuracy of 68.1% (62.5% excl. 2020) highlights that close to two thirds of mean target prices are met at some point during the 12 months. These results are similar to those found in recent research, illustrating that the accuracy for global clean energy stocks is not considerably different from that of different cross-sections of stocks in different stock markets. In line with previous research, the average accuracy of target prices decreases as the implied target return increases, meaning that relatively higher target prices are less likely to be met.
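To make the quantities in this summary concrete, the sketch below computes an implied target return and the two accuracy measures for a handful of stocks. It rests on assumptions that are not confirmed by the text: a target price counts as met when the price is greater than or equal to it, the Year-End check uses the last price of the 12-month window, and the Year-Highest check uses the maximum price in that window. All function names and price paths are hypothetical.

```python
def implied_return(target_price, current_price):
    """Return implied by a target price relative to the current price."""
    return target_price / current_price - 1.0


def year_end_met(target_price, prices):
    """Assumed definition: the last price of the 12-month window reaches the target."""
    return prices[-1] >= target_price


def year_highest_met(target_price, prices):
    """Assumed definition: the highest price within the window reaches the target."""
    return max(prices) >= target_price


def accuracy(flags):
    """Share of target prices that were met."""
    return sum(flags) / len(flags)


# Hypothetical stocks: (current price, mean target price, price path over 12 months).
stocks = [
    (100.0, 120.0, [105.0, 125.0, 110.0]),  # target met during the year only
    (50.0, 60.0, [55.0, 58.0, 62.0]),       # target met at year-end
    (20.0, 25.0, [18.0, 19.0, 21.0]),       # target never met
]

implied = [implied_return(target, current) for current, target, _ in stocks]
year_end_accuracy = accuracy([year_end_met(t, p) for _, t, p in stocks])
year_highest_accuracy = accuracy([year_highest_met(t, p) for _, t, p in stocks])

print([round(r, 3) for r in implied])
print(round(year_end_accuracy, 3), round(year_highest_accuracy, 3))
```

Under these definitions every Year-End hit is also a Year-Highest hit, because the last price can never exceed the maximum of the window. The Year-Highest accuracy can therefore never be lower than the Year-End accuracy.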

Subsequently, a random forest and an SVM classification model were trained using both the Year-End and the Year-Highest target for the mean target prices and were compared to a random model. The random forest leads in both cases to the highest classification accuracy, but both the SVM and the random forest are highly significantly more accurate than the random model. Unsurprisingly, the best average accuracy of 73.24% for the Year-End target is lower than the best average accuracy of 81.15% for the Year-Highest target. This appears to reflect that meeting a target price at any point during the 12-month period is easier to predict than meeting the target price only at a single point, at the end of the 12-month period. The analysis of the variables shows that for all models the mean target price is the most relevant variable, whereas the number of target prices appears to be relevant as well. This is in line with previous research that suggested that the implied return of target prices and the number of analysts covering a stock are linked to the accuracy of target prices. A detailed analysis of the results in terms of these two variables for the Year-End target indicates that the random forest is particularly accurate for the high target returns ("30% to 70%" and "Above 70%"), especially when the number of target prices is high (coverage of at least 15 analysts). For these subsets, only a few positive predictions are made, but those are correct in the vast majority of cases. Thus, it is unsurprising that the actual mean and median returns for high target return groups are considerably higher than for all observations. These high actual returns are based on extremely high mean and median returns for actual hits and close-to-zero or even positive returns when positive predictions for high target returns are incorrect. Consequently, following the rare positive predictions of the random forest for the highest target return groups ("30% to 70%" and "Above 70%") may represent potentially attractive investment opportunities.

Some limitations apply to the results of this study. First, the results are obtained for a selection of clean energy stocks and may not be generalizable to stocks in other sectors or even to all clean energy stocks. Moreover, the results are in line with recent research but differ clearly from older research, highlighting that the implied returns and accuracies may differ across time periods and may also be different in the future. For future research, a set of global stocks from a wider range of sectors can be investigated to confirm the findings. Moreover, additional variables linked to the company and the past stock performance can be included in the classification model, and investment strategies following the corresponding model predictions can be presented.

**Author Contributions:** Conceptualization, C.L. and A.L.; methodology, C.L.; software, C.L.; validation, C.L.; formal analysis, C.L.; investigation, C.L. and A.L.; data curation, C.L. and A.L.; writing original draft preparation, C.L. and A.L.; writing—review and editing, C.L.; visualization, C.L. and A.L.; project administration, C.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Kone Foundation, the Finnish Academy of Science and Letters, and the Finnish Strategic Research Council, grant number 313396/MFG40 Manufacturing 4.0.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data used in this study were obtained from the commercial Database "Datastream". The information on the location of companies' headquarters and current market capitalization are obtainable free of charge from the website finance.yahoo.com (accessed on 19 July 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.
