A Machine Learning-Based Forecast Model for Career Planning in Human Resource Management: A Case Study of the Turkish Post Corporation

Gülten, Hakan; Baraçlı, Hayri

doi:10.3390/app14156679

by Hakan Gülten * and Hayri Baraçlı

Department of Industrial Engineering, Yıldız Technical University (YTU), Istanbul 34349, Turkey

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(15), 6679; https://doi.org/10.3390/app14156679

Submission received: 5 June 2024 / Revised: 17 July 2024 / Accepted: 23 July 2024 / Published: 31 July 2024

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence for Human Information Analysis)

Abstract

In sustainable and competitive business management, organizations must take organizational change and transformational leadership into account in human resource (HR) management in order to adapt to changes in their environment. This capability enables large-scale enterprises to maintain their presence in an increasingly competitive environment through enhanced management capacity. Enterprises that adopt transformational leadership in HR management must equip leadership candidates with competencies such as creating a shared vision, providing appropriate role models, encouraging the adoption of group goals, meeting high performance expectations, providing individual support, and developing intellectual stimulation. By identifying potential leadership candidates using a decision support model, the necessary competencies can be developed through in-service training and experiential learning during their careers. Innovative and effective approaches to identifying leadership candidates can be developed by analyzing complex big data using advanced artificial intelligence (AI) techniques. In this article, a forecast model using machine learning (ML) algorithms was developed for career planning in HR management at the Turkish Post Corporation (PTT), and it was tested on predicting potential leadership candidates by analyzing the big data of 5000 employees. The k-Nearest Neighbor (kNN), Random Forest (RF), Gradient Boosting (GB), and Support Vector Machine (SVM) algorithms were applied to 100 randomly selected data points to predict the types of titles held by the staff employed at PTT. The kNN, GB, RF, and SVM algorithms achieved accuracy rates of 96%, 91%, 73%, and 41%, respectively. The case study results indicate that promotion decisions in large-scale, long-established enterprises can be effectively modeled using big data and ML algorithms, highlighting significant potential for HR management and leadership development practices in the public sector.

Keywords: transformational leadership; human resource management; machine learning; Turkish Post Corporation

1. Introduction and Literature Review

Businesses need effective leaders who understand and can manage the complexities of the rapidly changing global environment. We are witnessing a new paradigm shift in traditional management and leadership understanding. In recent years, leadership has increasingly emphasized qualities such as creativity, guidance, motivation, and influence, diverging from traditional command- and authority-based management paradigms [1,2]. These qualities are considered essential for navigating organizational complexities and fostering innovation [3].

To maintain their presence in a competitive environment, enterprises must expect potential leadership candidates in human resource management to appreciate employees, care for them, involve them in decision-making, allow for transparency, improve the working atmosphere, and reduce stress while increasing resources [4]. Leadership candidates should acquire these competencies through training programs and/or experience during their career development process. The transformational leadership approach provides a roadmap for acquiring these competencies.

According to Burns, who first introduced the concept of “transformational leadership”, a transformational leader is someone who recognizes the wants or needs of customers, employees, and other stakeholders, meets these needs, understands how to motivate the relevant parties, and enables them to move toward higher-than-normal goals [5]. Theories on transformational leadership that began with Burns emphasize the leader’s ability to be visionary, to make all internal and external stakeholders believe in the proposed vision, to be a role model, to build close relationships, to instill confidence in change, to convince and encourage the stakeholders, to motivate them to adapt to change, to turn personal interests into common interests, and to mobilize and direct them toward common goals. Transformational leaders can be categorized into various types. Among intellectual transformational leaders are thinkers such as Locke, Bentham, and Mill, while notable reformist leaders include Alexander the Great. Leaders who radically changed existing systems include Luther and Lenin. Alongside these examples, Jeff Bezos’ vision for Amazon and its entry into e-commerce exemplify successful transformational leadership based on short-term goals [6,7].

In career planning, public administrators’ technical, managerial, and leadership skills are developed through better training curricula compared to the past [8]. An appropriate management approach that aligns with the corporate culture is a competency that administrator candidates will gain through experience. The ability to make career plans by selecting leadership candidates from the current employee profile provides human resource management with a proactive capability. Effective data collection is critical in this context as it informs resource management and enhances employee capability. Systematic data collection enables HR managers to identify potential leaders within the organization, ensuring a well-prepared talent pipeline that aligns with strategic goals [9]. Additionally, collecting and analyzing data on employee performance, skills, and career aspirations helps in making informed decisions about training and development programs, thereby fostering a more competent and motivated workforce [10]. This proactive approach not only aids in succession planning but also enhances overall organizational fluidity by ensuring that the right talent is ready to meet future challenges [11]. Hence, effective data collection serves as the foundation for strategic HR practices that support long-term organizational success. Therefore, the selection and training of leadership candidates in organizational development, transformation, and human resources is an important subject of study and enterprises or organizations need model-based approaches to be pursued in this matter.

Although studies on the development of model-based approaches are limited, research indicates that individuals with high-quality relationships with their managers are more likely to accept change [12]. In evolving market conditions, enterprises seeking to enhance their competitiveness are increasingly adopting transformational leadership. In the public sector, where the demand for improved public service is rising, HR management decisions should be guided by a systematic, data-driven leadership approach [13].

A review of the literature shows that academic research on the concept of “Transformational Leadership” has grown since the 2000s: the number of publications rose from 19 per year in 2010 to 157 per year in 2020 [14]. The importance and popularity of this concept have increased significantly in recent years in particular. Likewise, “Job Satisfaction” is a prominent topic in the field of leadership. In addition, leadership approaches that focus on ethical values, such as “Ethical Leadership” and “Service Leadership”, have been found to be an important area of research [14,15]. These approaches emphasize that leaders behave in accordance with ethical values and adopt a service-oriented leadership approach. Researchers emphasize these concepts to understand the various dimensions of leadership and their interactions.

Based on these analyses, it is possible to make certain inferences about the future orientations of leadership studies. In particular, it can be predicted that topics such as “Transformational Leadership” and “Job Satisfaction” will continue to increase in importance. In addition, leadership approaches focusing on ethical values and research on different leadership styles will also be among the topics that will persist. It is seen that the literature on leadership has shown significant development in the period between 2000 and 2022 and that certain concepts have come to the forefront. These analyses provide a valuable resource for understanding the trends in leadership research and for guiding future studies. In future studies, these findings can be further examined in depth and contribute to the development of leadership theories and practices.

Technology is driving digital transformation and demands a set of skills that leaders do not yet fully understand and need to develop. Cloud computing has facilitated ML and AI where human insight is limited, using algorithms for analytics that require greater size and scale to provide data for decision-making and enabling transformative technologies that are changing the face of industry sectors [16]. The extensive data streams in the digital economy have created a new paradigm for business intelligence processes, increasing the potential of advanced analytical and cognitive data tools. Big data structures are used in business intelligence (BI) to work with large amounts of data to extract value for effective business decisions [17]. Big data is transforming BI processes and increasing the use of advanced analytics as well as cognitive data tools. Sound leadership can be achieved by administrators being role models and managing their employees effectively. Appreciating and caring for employees, involving them in decision-making, providing transparency, improving the work atmosphere, reducing stress, and increasing resources are key characteristics of sound leadership.

However, factors such as leadership skills and information availability are critical for the use of performance information. Decentralized structures in the public sector can foster a culture of training and development through networks and intensive collaboration. New technologies such as ML and AI are geared toward processing larger data volumes and a greater number of traits in leadership research. However, the interpretation of the relationship between leadership effectiveness and traits is less clear compared to traditional methods. Therefore, it is important to consider methodological approaches and new methods in leadership research.

In various studies examining behavioral analyses, information was collected and analyzed through studies in which visual reality experiences were tested in a virtual environment, where real-life experiences that address digitalization were simulated on subjects [18,19,20,21,22]. ML techniques offer the opportunity to develop innovative and model-based analytical approaches in leadership research [23]. The use of AI techniques and ML in leadership research studies contributes to the literature as it enables analysis with predictive causal models to complement hypothesis-based research [23,24]. Using these techniques in large-volume and diverse datasets contributes to producing meaningful results. ML is a discipline that enables computers to develop predictive models based on empirical data through algorithms used in AI [25]. Nowadays, ML algorithms can examine a large number of variables and generate combinations that reliably predict outcomes [23]. Recently, a growing number of researchers have investigated how ML techniques applied to big data can be used to study the behavior of individuals in the workplace [26]. Some leading companies have started to use AI techniques such as ML to digitize decision-making and improve processes to increase employee engagement as well as customer satisfaction [27]. In a study conducted to predict employee turnover in human resource management, employee turnover and the reasons for turnover in organizations were investigated using ML algorithms and forecasts were made with an accuracy of 93%. In this study, monthly income, hourly wage, job level, and age were also evaluated as the most important factors, and assessments were made to improve the causes of employee turnover [22].

Academics and practitioners have used various ML and AI algorithms in forecast models to analyze datasets and predict relationships. For instance, in an organization where board members are the decision-makers, complex data on human resources were analyzed with ML algorithms and significant results were obtained in identifying and predicting relationships in datasets [28]. In a study where an AI model was developed by combining artificial neural networks and fuzzy logic, the Multifactor Leadership Scale was used to measure leadership perception in 102 construction companies operating in the city of Adana, and it was observed that the BP, HB, and GA optimization algorithms produced successful results in predicting leadership perception [29].

In another study, a computational ML method was used to identify and predict leadership perceptions for prominent individuals, building knowledge representations for individuals with high-dimensional semantic vectors derived from large-scale news media datasets. Subsequently, a model paired with respondents’ ratings of leadership effectiveness achieved high out-of-sample accuracy in predicting respondent ratings and was shown to be capable of predicting leadership perceptions of different demographic and political subgroups [30].

In a study investigating the relationship between personality traits and leadership using ML techniques, an empirical study was conducted to resolve uncertainties. In this study, a large database of trait variables and leadership role occupation (n = 3385) was used to compare the forecast performance of traditional (parametric) linear models (LMs) with the nonparametric Random Forest (RF) analysis technique. By comparing the predictive performance of an LM and an RF, the complexity of the leader trait paradigm was tested and how academics can unlock the black box of RF models with a range of analytical techniques was presented. It is stated that the complex yet explicable results obtained with ML techniques represent an important step forward in the study of leadership [24].

In a study aiming to identify management practices that lead to satisfaction among employees in a holistic and consistent manner, using text mining and unsupervised ML methods, an examination of a large number of employees (n = 5650) found that investing in tools that help foster employees’ positive emotions both increases employee motivation and positively affects company profitability [31]. However, the concepts of positive practices or virtuousness emanating from administrators have not yet been consistently explained.

Many of the new ML methods are designed for larger data volumes (i.e., larger sample sizes) and a higher dimensionality (i.e., a greater number of features, such as forecast variables) in a dataset than are the norm in leadership research. Furthermore, while advances in ML can help with forecasts, interpreting the results is often less clear than with traditional approaches. This raises the question to what extent using various ML algorithms and techniques can clarify the relationship between traits and leader effectiveness (or any other leadership relationship). In recent years, advances in big data science and data analytics have provided opportunities for innovative work in the analysis of human psychology and productivity and behavioral differences. Nevertheless, despite their potential, big data and ML techniques are rarely used in management studies and especially in leadership research [30]. Developments related to Industry 4.0 have popularized the use of data-driven systems in leadership, organizational management, and customer relations for businesses [32].

As a result, it is concluded that the development of approaches that effectively use ML and AI models that take into account big data in the analysis of leadership skills will provide significant benefits to businesses on the way to achieving successful and sustainable results in human resource management career planning. In this study, big data of human resources is analyzed using ML algorithms and a forecast is made for employee titles to identify potential leadership candidates. In Section 2, the model developed and the algorithms used are presented. In Section 3, the case study is described and the model implementations are shared. In Section 4 and Section 5, the results are evaluated.

2. Methodological Approach

In the model approach developed for this study, raw data were described using basic statistics after data extraction and analytics studies. The prepared datasets were then analyzed and evaluated using ML algorithms. This study utilized 29 input variables, categorizing the main inputs into qualitative, quantitative, and historical categories. The output variable of this study is a categorical variable used to determine employee titles. The methodology’s flow and parameter characteristics are visualized in Figure 1.

Before the raw data provided for this study were analyzed using basic statistical methods, data cleaning, data filtering, and data validation were performed. In the resulting model, the input variables were classified as qualitative, quantitative, or date types, while the output variable was categorical. The dataset was taken from 81 provincial directorates. An ML model suited to this dataset was built in the Orange Data Mining program, which will be addressed in Section 3.2. The model was run, and its predictions were compared with the test data to evaluate the model results.
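The comparison of model output with test data can be illustrated with a minimal Python sketch. It is not the Orange workflow used in the study; the function names and the short label lists are hypothetical, and the title codes only mimic the coding used later in the paper. Accuracy is taken here as the share of test records whose predicted title equals the recorded title, so 96 matches on 100 test points would give 96%.

```python
from collections import Counter

def accuracy(true_titles, predicted_titles):
    """Fraction of records whose predicted title matches the recorded title."""
    if len(true_titles) != len(predicted_titles):
        raise ValueError("label lists must have the same length")
    if not true_titles:
        raise ValueError("at least one record is required")
    hits = sum(t == p for t, p in zip(true_titles, predicted_titles))
    return hits / len(true_titles)

def per_title_recall(true_titles, predicted_titles):
    """For each true title, the share of its records that were predicted correctly."""
    totals = Counter(true_titles)
    hits = Counter(t for t, p in zip(true_titles, predicted_titles) if t == p)
    return {title: hits[title] / count for title, count in totals.items()}

# Hypothetical held-out labels (assumption, not data from the study).
y_true = ["P2", "P2", "G3", "M6", "G3", "D1", "M6", "P2"]
y_pred = ["P2", "G3", "G3", "M6", "G3", "D1", "P2", "P2"]

overall = accuracy(y_true, y_pred)
by_title = per_title_recall(y_true, y_pred)
print(overall, by_title)
```

Reporting the per-title breakdown next to the overall figure is a design choice for imbalanced title counts, where a high overall accuracy can hide poorly predicted rare titles.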

2.1. Machine Learning Algorithms Used

A forecast model using ML algorithms was developed to facilitate career planning in human resource management for the PTT, and it was tested to predict potential leadership candidates by analyzing the big data from 5000 employees. In this study, which discusses the benefits of adopting transformational leadership in HR management and the potential role of ML and AI in this field, the open-source Orange Machine Learning program was used, and the results of four ML algorithms that yielded meaningful results on the categorical outcome variable in the developed model are presented below along with their principles.

2.1.1. k-Nearest Neighbor (kNN) Algorithm

The kNN algorithm is a nonparametric method for classification and estimation. To classify or predict a data point with this algorithm, the value of the output variable is estimated from the labels of its k closest neighbors [33]. Choosing the right k is crucial because it determines the model’s ability to generalize effectively: a small k risks overfitting to noise, while a large k may lead to oversimplified predictions, reducing overall accuracy and reliability. Determining the optimum value of k is therefore an important step in kNN modeling [34]. The parameters of this model are shown in Table 1.
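The effect of k can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Orange configuration from Table 1: the two features (years of service, number of trainings), the toy records, the Euclidean distance, and the leave-one-out scoring are all hypothetical choices made for the example. One record is deliberately mislabeled to act as noise.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(X, y, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)

# Hypothetical toy records: (years of service, number of trainings) -> title code.
X = [(2, 1), (3, 1), (4, 2), (3, 3), (2, 3),
     (15, 8), (17, 9), (16, 7), (18, 10),
     (3.2, 2.1)]  # last record: noisy label placed inside the P2 cluster
y = ["P2", "P2", "P2", "P2", "P2",
     "M6", "M6", "M6", "M6",
     "M6"]

# Score several candidate values of k and keep the first one with the best score.
scores = {k: loo_accuracy(X, y, k) for k in (1, 3, 5, 9)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this toy set, k = 1 follows the noisy record and misclassifies its neighbors, while k = 9 votes over nearly the whole dataset and loses the local structure; an intermediate k scores best.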

2.1.2. Random Forest (RF) Algorithm

Developed in 2001 by L. Breiman, the Random Forest (RF) algorithm can be used for both classification and regression [20]. In essence, a forecast is obtained by combining the outputs of many randomized decision trees, by averaging for regression and by majority vote for classification. RF performs well on datasets with many variables and a large number of observations [35]. The algorithm builds a set of decision trees, each trained on a bootstrap sample of the training data. As each tree is grown, the best splitting feature is chosen from a randomly selected subset of features [36]. The parameters of this model are set as indicated in Table 2.
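The two sources of randomness described above, bootstrap sampling and random feature subsets, can be sketched as follows. This is a heavily simplified illustration, not the model from Table 2: each "tree" is a one-split stump, so the per-node feature sampling of a full Random Forest collapses to one draw per tree, and the toy records and all function names are assumptions made for the example.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature_ids):
    """One-split tree: best threshold among the candidate features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not right:
                continue  # threshold is the maximum value: nothing to split
            left_label, right_label = majority(left), majority(right)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # all sampled rows share one value: predict the majority
        label = majority(labels)
        return (feature_ids[0], float("inf"), label, label)
    return best[1:]

def stump_predict(stump, row):
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label

def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]        # bootstrap sample
        feats = rng.sample(range(len(rows[0])), n_features)   # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest

def forest_predict(forest, row):
    """Classification: majority vote over the individual trees."""
    return majority([stump_predict(stump, row) for stump in forest])

# Hypothetical toy records: (years of service, number of trainings) -> title code.
rows = [(1, 0), (2, 1), (3, 1), (4, 2), (15, 8), (16, 7), (17, 9), (18, 10)]
titles = ["P2", "P2", "P2", "P2", "M6", "M6", "M6", "M6"]
forest = fit_forest(rows, titles)
print(forest_predict(forest, (1, 0)), forest_predict(forest, (17, 9)))
```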

2.1.3. Gradient Boosting (GB) Algorithm

Like RF, the GB algorithm uses decision trees to predict output variables. It creates new trees consecutively, with each tree correcting the forecast errors left by the previous trees. The advantages reported for the GB algorithm include resistance to overfitting, robustness to noise, and the ability to handle categorical variables [37]. The parameters of this model are presented in Table 3.
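The idea of each tree correcting the errors of its predecessors can be sketched with a small regression example. This is an assumption-laden simplification rather than the classifier from Table 3: it uses squared-error loss on a single numeric feature, one-split trees, and hypothetical data, learning rate, and function names. For squared error, the residuals that each new tree fits are the negative gradient of the loss, which is where the name comes from.

```python
def fit_regression_stump(xs, residuals):
    """One-split tree on a single feature that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the maximum so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_gradient_boosting(xs, ys, n_trees=50, learning_rate=0.3):
    base = sum(ys) / len(ys)            # start from the mean prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors left by earlier trees
        threshold, left_mean, right_mean = fit_regression_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def gb_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left_mean if x <= threshold else right_mean
        for threshold, left_mean, right_mean in stumps)

def mse(model, xs, ys):
    return sum((gb_predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical data: x could be years of service, y a numeric outcome.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 2, 2, 6, 6, 7, 7]
initial_mse = mse(fit_gradient_boosting(xs, ys, n_trees=0), xs, ys)
model = fit_gradient_boosting(xs, ys)
final_mse = mse(model, xs, ys)
print(initial_mse, final_mse)
```

The learning rate shrinks each tree's contribution, so no single tree dominates; this is one of the usual levers against overfitting in boosting.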

2.1.4. Support Vector Machine (SVM) Algorithm

A Support Vector Machine (SVM) is a widely used ML algorithm that provides effective results in various classification and regression problems. The SVM algorithm seeks the decision boundary that best separates the data points of different classes, namely the one with the largest margin between them. One of the most important features of an SVM is that, through kernel functions, it can also be effective on data that are not linearly separable. The parameters of this model are shown in Table 4 [38].
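A minimal linear SVM can be sketched as subgradient descent on the regularized hinge loss. This is only an illustration of the margin idea: the study's SVM settings are those in Table 4, kernels are omitted here, and the two-class toy data, the hyperparameters, and the function names are assumptions. Points whose margin falls below 1 pull the boundary toward classifying them correctly, while the regularization term keeps the weights small, which corresponds to a wide margin.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularized hinge loss.

    Labels in `y` must be +1 or -1.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]   # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # inside the margin or misclassified: hinge loss is active
                for j in range(n_features):
                    grad_w[j] -= yi * xi[j] / len(X)
                grad_b -= yi / len(X)
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_predict(w, b, x):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical, linearly separable toy data with two standardized features.
X = [(-2, -1), (-1, -2), (-2, -2), (2, 1), (1, 2), (2, 2)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
print(w, b)
```

A multi-class problem such as title prediction would need an extension of this binary sketch, for example one-vs-rest training, which is not shown here.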

3. Case Study of Human Resources of Turkish Post Corporation

The Turkish Post Corporation (PTT) is a well-established, 183-year-old public institution. The PTT offers postal, cargo, logistics e-services, and postal banking services and is the organization with the most widespread service network in Turkey. With nearly 40 thousand employees, the PTT has a very diverse employee profile, and the results of the case analysis conducted with the big data obtained from these employees are expected to provide valuable outputs for the literature. The aim is to use the results of this study, which predicts the titles of the personnel employed in the PTT with ML algorithms, in human resource career planning.

3.1. Big Data Analysis and Basic Statistics

The dataset lacks the volume and velocity of big data, yet the application of advanced machine learning algorithms such as k-Nearest Neighbor, Random Forest, Gradient Boosting, and Support Vector Machine significantly enhances its analytical capabilities. These algorithms enable a sophisticated analysis that allows for the extraction of meaningful and valuable insights, thus positioning the dataset within the realm of big data analytics. For instance, RF and GB algorithms excel at uncovering complex relationships among numerous features and variables, while kNN and SVM algorithms are highly effective in performing accurate classification and prediction tasks. This methodological approach compensates for the dataset’s modest size, providing analytical capabilities comparable to those typically associated with larger datasets.

The strategic application of these machine learning techniques offers substantial advantages in business decision-making processes. RF and GB algorithms can identify and prioritize significant features and variables, optimizing business operations and revealing critical insights into customer behavior patterns, market trends, and operational efficiencies. kNN and SVM algorithms enhance predictive analytics, facilitating the forecasting of future trends and events, which supports proactive measures and strategic planning. Despite the dataset comprising only 30 columns and 4890 rows, the integration of structured, semi-structured, and unstructured data types enriches its analytical potential, aligning it more closely with the criteria of big data. The sophisticated use of these machine learning algorithms not only elevates the dataset’s significance but also demonstrates that valuable big data insights can be derived, underscoring the dataset’s utility in advanced data analytics.

A data cleaning process was carried out to correct inconsistencies and/or errors and missing data within the raw dataset. Data integration efforts were undertaken to combine and harmonize the data obtained from different sources across 81 provincial directorates in Turkey. The data from the provincial directorates were cross-checked with the central data for verification, and a data validation process was conducted.

In this study, many parameters thought to influence leadership, and title types in particular, were taken into consideration, such as education, personal development processes, demographic structure, and the working conditions of the employee. Within the scope of these parameters, data belonging to approximately 5000 employees were analyzed, and some titles and some data belonging to the considered titles were extracted from the general data. To avoid biasing the results of the statistical analyses, the data of titles with particularly low data counts were not used, and a total of 4890 records belonging to titles with high data counts were included in this study. The study was thus conducted on 141,810 data cells describing 29 different attributes of 4890 employees (4890 × 29 = 141,810).

Statistically defined variables (Title Start Date, Title End Date, Title Total Days, Registry Number, PTT Service Year, PTT Start Date, PTT Turnover Date, Year of Birth, Gender, Age, Number of Children, Place of Birth, Province of Registration, Education Status, Manual School, Manual Department, Duration of Education, Total Duration of Higher Education, Date of Graduation, Health, Number of Awards, Number of Penalties, Duration of Education, Number of Trainings, Certificate, Language Score, Report, Union, Unified Service Days) are defined as input variables and one variable (Title) is defined as the output variable. For a better understanding of the descriptive statistics, tables explaining the input and output variables have been added to Appendix A.

The dataset, which was created categorically and by coding the titles, was visualized using a clustered column graph in order to compare the title values. Descriptive statistical data were obtained by taking into account the years of service of the personnel employed in the PTT organization under the titles they hold. In addition to taking into account the years of service under different titles of different personnel, the years of service of the same personnel belonging to different titles were also taken into account, and the average working time for the titles of the executive staff is shown in Figure 2.

When the provincial and central administrators are evaluated together, the descriptive statistics of the three titles with the highest average years of service are shown in Table 5.

When the provincial and central administrators are evaluated together, the descriptive statistics of the three titles with the lowest average years of service are shown in Table 6.

Descriptive statistics were obtained by taking into account the years of service of the staff employed in the PTT under the titles they hold. Both the years of service of different staff under different titles and the years of service of the same staff under different titles were taken into consideration, and the average working time of the staff in their titles is presented in Figure 3.

When the rural and headquarter employees are evaluated together, the descriptive statistics of the three titles with the highest average years of service are shown in Table 7.

When the rural and headquarter employees are evaluated together, the descriptive statistics of the three titles with the lowest average years of service are shown in Table 8.

According to years of service, B4 (Provincial Director), D1 (Deliverer), G3 (Office Worker), K4 (Controller), K7 (Security Officer), M6 (Manager), P2 (Postman), T9 (Technicist), and V4 (Treasurer) have more years of service than other titles. The high number of staff employed in these titles is due to the fact that these are the staff serving in the provinces.

When the descriptive statistical data of the length of service of the staff employed in the PTT based on their titles without making any distinction between administrators and employees are evaluated on the basis of years, it is observed that the staff employed in the PTT under the titles coded B4 (Provincial Director), D1 (Deliverer), D7 (Assistant Auditor), G3 (Office Worker), K7 (Security Officer), P2 (Postman), T9 (Technicist), and V4 (Treasurer) have spent more time (years) in these titles compared to other titles.

The box plot of years of service by title is illustrated in Figure 4. An outlier is an observation that is unusually large or small relative to the rest of the dataset. In statistical analyses, outliers can exert a disproportionate effect and lead to misleading results (or interpretations). In general, outlying values that distort the results of statistical analysis are excluded from the analysis, but in this study they are retained in order to avoid introducing bias. The titles with the most outliers are K7 (Security Officer) and G3 (Office Worker), followed by B12 (Computer Operator), B4 (Provincial Director), D1 (Deliverer), D5 (Typist), K4 (Controller), M3 (Official (No.3)), M6 (Manager), P2 (Postman), P4 (Programmer), T5 (Technical Manager), T9 (Technicist), U3 (Assistant Specialist), V3 (Data Preparation and Control Operator), and V4 (Treasurer).
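The text does not state which rule was used to mark outliers in the box plot. A common convention, assumed here for illustration only, flags values that lie more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sketch below applies that rule to a hypothetical list of years of service under one title; the function name and the data are not from the study.

```python
import statistics

def box_plot_outliers(values, whisker=1.5):
    """Values outside [Q1 - whisker * IQR, Q3 + whisker * IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical years of service under a single title.
years = [1, 2, 2, 3, 3, 4, 4, 5, 6, 31]
print(box_plot_outliers(years))
```

Here Q1 = 2.25 and Q3 = 4.75, so the fences are -1.5 and 8.5 and only the value 31 is flagged. Retaining such a record, as the study does, keeps long-serving employees in the training data at the cost of a more skewed distribution.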

An analysis of the gender data of the staff employed in the PTT shows that 73% of the 4890 employees are male and 27% are female. When the years of service of PTT employees are analyzed by gender, it is seen that the 3568 male employees have worked in a position for an average of 19.26 years and the 1322 female employees for an average of 13.45 years (standard deviations of 10.82 for males and 9.71 for females). These results are considered to be related to the high number of operational-level employees. The frequency graph of male and female employees by title, with the normal distribution curve overlaid, is presented in Figure 5.

The staff employed at the PTT have different service periods (years) under different titles. The histogram of title service durations by gender and title, with the normal distribution curve overlaid, is shown in Figure 6.

When the length of service in terms of title is analyzed, it is observed that the length of service under a title is clustered between 0 and 7 years. This period is similar for both male and female staff. It is noted that a male employee of the organization has served under a title for an average of 9.2 years, while a female employee has served under a title for an average of 7.86 years.

Multidimensional Scaling Algorithms (MDSs)

Multidimensional scaling algorithms (MDSs) iteratively move points around in a simulation of some kind of physical model, where if two points are too close together (or too far apart), there is a force pushing them apart or pulling them together. At each time interval, the change in a point’s position corresponds to the sum of the forces acting on it. The MDS plot of the title variables in this study is shown in Figure 7.
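The force analogy can be sketched directly in Python. This is a minimal illustration, not the MDS widget used to produce Figure 7: the target distances, step size, iteration count, and function names are hypothetical. Each pair of points contributes a force proportional to the gap between their current distance and their target distance, and every point moves by the sum of the forces acting on it.

```python
import math
import random

def mds_layout(distances, n_iter=500, step=0.05, seed=0):
    """Iteratively move 2-D points so that their spacing approaches `distances`."""
    rng = random.Random(seed)
    n = len(distances)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(n_iter):
        forces = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                current = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                # Positive: too far apart, pull together. Negative: too close, push apart.
                pull = (current - distances[i][j]) / current
                forces[i][0] += pull * dx
                forces[i][1] += pull * dy
        for i in range(n):  # each point moves by the sum of the forces acting on it
            pos[i][0] += step * forces[i][0]
            pos[i][1] += step * forces[i][1]
    return pos

def stress(distances, pos):
    """Sum of squared gaps between layout distances and target distances."""
    return sum((math.dist(pos[i], pos[j]) - distances[i][j]) ** 2
               for i in range(len(pos)) for j in range(i + 1, len(pos)))

# Hypothetical dissimilarities between four records: two tight pairs, far apart.
hidden = [(0, 0), (1, 0), (4, 3), (5, 3)]
D = [[math.dist(p, q) for q in hidden] for p in hidden]

start = mds_layout(D, n_iter=0)   # random initial layout (same seed)
layout = mds_layout(D)
print(stress(D, start), stress(D, layout))
```

The stress value measures how well the 2-D layout preserves the target distances, so comparing it before and after the iterations shows whether the simulated forces improved the embedding.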

When the relationships in the dataset are evaluated with MDS, records with the same title are generally located close to each other, which suggests that these records are similar. The gaps or sparse regions between different sets of titles indicate differences between those titles. The titles are represented by different colors. The more saturated the color in a region of the graph, the higher the data frequency there, so it can be inferred that those titles are more heavily represented in the data.

Employees are grouped according to their titles using different colors. Areas where colors are concentrated indicate that certain title categories are clustered together and share similar characteristics in the graph. For example, the concentration of green, purple, and brown dots in the upper left corner of the graph suggests that employees with titles such as D1, G3, and K4 have similar traits and are clustered together. Similarly, the concentration of yellow and light blue dots in the lower right corner indicates that title categories like M6 and T9 are clustered together, forming a specific group (see Appendix A for more information about employee titles). These types of clusters suggest that employees in these title categories have similar job roles or responsibilities. Additionally, isolated points (e.g., V4 and TB title categories in the upper right corner) indicate that employees in these categories have different characteristics from others and may possess special skills or be involved in different job functions.

The breadth of the color scale indicates that there are many different title categories in the dataset, and these categories are highly diverse. This demonstrates that the HR data are comprehensive and varied, reflecting employees’ different skill sets and job roles. The density of points in certain areas of the graph indicates the data density and the number of employees in those areas. For example, the dense points representing the P8 and PB categories in the center of the graph suggest that the number of employees in these categories is higher compared to others. Such visualizations provide a practical example of how big data analytics can be utilized in human resource management, contributing to more data-driven and effective HR strategies [39,40]. Understanding how employees are distributed according to titles and which title groups share similar characteristics provides critical information for talent management, training and development programs, and workforce planning.

As a result, in this study, the multidimensional scaling (MDS) technique was used to visualize the titles of the staff employed in the PTT on a two-dimensional plane, allowing the similarities and differences between titles to be clearly revealed. The resulting graph visually expresses how different titles are grouped and how the relationships between them are shaped. This method offers a great advantage, especially in examining the distribution and density of the dataset and in understanding the hierarchical structure and transitions between titles.

3.2. Machine Learning (ML) Model

The ML algorithms in this study were implemented in the Orange 3.35 machine learning program to obtain the forecast data. This program is built on Python, is open-source (free of charge), and is compatible with all computer operating systems. It offers a large, diverse toolbox for building data analysis workflows visually. The ML model was designed to handle both the known and unknown states of the output variable in the raw data. The input and output variables were defined in the “select columns” module, and the training and test ratios were selected in the “data sampler” module. In this study, 75% of the data were used for training and 25% for testing. Algorithms that produced meaningful results for this study were selected from the Orange program and added to the model. The model results were evaluated by comparing the predictions with the test data. The “MDS” table created for the model is also presented in this study. A screenshot of the workflow of the ML model designed in Orange for this study is presented in Figure 8.

Four different ML algorithms yielded statistically significant results for predicting the type of staff title belonging to PTT staff data. These are the k-Nearest Neighbor (kNN) algorithm, the Random Forest (RF) algorithm, the Gradient Boosting (GB) algorithm, and the Support Vector Machine (SVM) algorithm.
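For readers who work in code rather than in a visual workflow, the following sketch approximates the same pipeline with scikit-learn. It is an illustration under stated assumptions: the PTT data are not public, so a synthetic dataset stands in for them, and the mapping of the settings in Tables 1 to 4 onto scikit-learn arguments (for example, "stop splitting nodes with maximum instances" onto `min_samples_split`) is an assumption rather than a documented equivalence.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the HR table (hypothetical: 600 rows, 4 title classes).
X, y = make_classification(n_samples=600, n_features=12, n_informative=8,
                           n_classes=4, random_state=0)

# 75% training / 25% testing with random selection, no cross-validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Settings taken from Tables 1 to 4; the argument mapping is an assumption.
models = {
    "kNN": KNeighborsClassifier(n_neighbors=5, metric="euclidean",
                                weights="uniform"),
    "RF": RandomForestClassifier(n_estimators=10, min_samples_split=5,
                                 random_state=0),
    "GB": GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                     max_depth=3, min_samples_split=2,
                                     random_state=0),
    "SVM": SVC(C=1.0, kernel="rbf", gamma="auto", tol=0.001, max_iter=100),
}

accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)                    # learn from the 75% split
    accuracy[name] = model.score(X_test, y_test)   # classification accuracy on 25%
    print(f"{name:>3}: test accuracy = {accuracy[name]:.3f}")
```

The iteration limit of 100 for the SVM may trigger a convergence warning on some data; it is kept here only because it matches the listed setting.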

3.3. Analysis of Model Results

The performance measurement values obtained by the ML algorithms when predicting the title type are given in Table 9.

In this study, the performance of four machine learning models, k-Nearest Neighbor (kNN), Random Forest (RF), Gradient Boosting (GB), and Support Vector Machine (SVM), was evaluated using several metrics. The kNN model achieved a high AUC value (0.979) but performed markedly worse on the other metrics (accuracy: 0.646; F1 score: 0.624) than RF and GB. Both RF and GB performed well across all metrics: RF had the highest AUC (0.991), while GB had the highest accuracy (0.978) and F1 score (0.977). Training time is noted as both a strength and a weakness of these two ensemble models. The SVM model performed worst, with the lowest AUC (0.857) and accuracy (0.596), and it struggles in particular with training time and parameter optimization on large datasets. These results suggest a preference for ensemble models such as RF and GB, while models such as SVM may still need to be considered for special cases.
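The six column headings of Table 9 (AUC, CA, F1, Prec, Recall, MCC) can be computed as in the sketch below. The labels and probabilities are a small hypothetical example, not data from the study, and the use of support-weighted averaging and one-vs-rest AUC is an assumption about how the multi-class scores were aggregated. One property worth noting is that support-weighted recall always equals classification accuracy, which is consistent with the identical CA and Recall columns in Table 9.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)

# Hypothetical example: true title classes and predicted class probabilities
# for eight staff records and three title classes (not data from the study).
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2])
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],   # misclassified as class 1
    [0.2, 0.6, 0.2],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],   # misclassified as class 2
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
    [0.1, 0.3, 0.6],
])
y_pred = proba.argmax(axis=1)   # predicted class = most probable class

metrics = {
    "AUC": roc_auc_score(y_true, proba, multi_class="ovr", average="weighted"),
    "CA": accuracy_score(y_true, y_pred),
    "F1": f1_score(y_true, y_pred, average="weighted"),
    "Prec": precision_score(y_true, y_pred, average="weighted"),
    "Recall": recall_score(y_true, y_pred, average="weighted"),
    "MCC": matthews_corrcoef(y_true, y_pred),
}
for name, value in metrics.items():
    print(f"{name:>6}: {value:.3f}")
```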

The forecast data were collected in two ways. In the first step, the dataset with titles was used for the forecast data, and in the second step, the forecast data were obtained using the dataset without title codes. The performance values of the ML algorithms used in both steps are given in Table 10.

The ML algorithms were run with a 75% training and 25% test split, and no cross-validation was performed. The 25% test portion of the 4890-record dataset comprised 1222 data points, and the remaining 3668 data points were used for the training phase. The data for both phases were selected at random.
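The split sizes can be reproduced with a few lines of Python. The sketch assumes that the fractional result of 25% of 4890 (1222.5) was rounded down, which matches the reported 1222 test records; the seed and the use of the standard `random` module are illustrative choices.

```python
import random

total_records = 4890
test_fraction = 0.25

# 25% of 4890 is 1222.5; rounding down reproduces the reported 1222 test records.
test_size = int(total_records * test_fraction)
train_size = total_records - test_size

# Random selection of record indices for both phases (seed chosen arbitrarily).
rng = random.Random(0)
indices = list(range(total_records))
rng.shuffle(indices)
test_idx, train_idx = indices[:test_size], indices[test_size:]

print(f"test records : {len(test_idx)}")
print(f"train records: {len(train_idx)}")
```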

We randomly selected 100 data points from the forecast data of the ML algorithms and calculated the false, true, and accuracy rates. The values of false and true forecasts and the rate for each algorithm are shown in Table 11. According to this table, the kNN algorithm provided true forecast data closest to the real data at a rate of 96%. The SVM algorithm showed the worst performance with a rate of 41%.
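The rates in Table 11 are simply the share of true forecasts among the 100 sampled data points. A minimal sketch of that calculation, using the counts reported in Table 11, is:

```python
# (true, false) forecast counts among the 100 sampled data points (Table 11).
counts = {"RF": (73, 27), "GB": (91, 9), "kNN": (96, 4), "SVM": (41, 59)}

# Rate of true forecasts = true / (true + false).
rates = {name: true / (true + false) for name, (true, false) in counts.items()}

best = max(rates, key=rates.get)    # algorithm with the highest rate
worst = min(rates, key=rates.get)   # algorithm with the lowest rate

for name, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{name:>3}: {rate:.0%}")
print(f"best: {best}, worst: {worst}")
```

Because these rates come from a sample of 100 forecasts, they are not directly comparable with the full test-set metrics in Tables 9 and 10.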

Model outputs were also evaluated using multidimensional scaling. Multidimensional scaling is a statistical method that helps visualize and analyze high-dimensional data in a lower-dimensional space. This technique makes complex relationships more understandable, providing important insights in a variety of fields. These projections try to preserve the distances between the original points as much as possible. It is often difficult to achieve a perfect fit because the data are high-dimensional or the distances are not Euclidean. As input, the MDS widget in Orange needs either a dataset or a distance matrix.
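To illustrate how a projection can be built from a distance matrix alone, the sketch below uses classical (Torgerson) MDS, a closed-form variant based on an eigendecomposition. This is a swapped-in technique chosen for brevity and is not claimed to be the method used in the study; the function name and the four-point example are hypothetical. Because the example points already lie in a plane, the two-dimensional projection preserves their distances exactly, whereas high-dimensional data would generally be reproduced only approximately.

```python
import numpy as np

def classical_mds(distances, n_components=2):
    """Classical (Torgerson) MDS: embed points given only a distance matrix."""
    n = distances.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain an inner-product matrix.
    gram = -0.5 * centering @ (distances ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]   # largest first
    top_vals = np.clip(eigvals[order], 0.0, None)      # guard tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)

def pairwise_distances(points):
    """Euclidean distance matrix between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical input: distances between the four corners of a unit square.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
distances = pairwise_distances(corners)

embedding = classical_mds(distances)
recovered = pairwise_distances(embedding)
print(np.round(recovered, 3))
```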

4. Results and Evaluations

The use of performance data in the public sector can be enhanced by decentralized structures, promoting a culture of training and development through networks and collaborative efforts. Emerging technologies like ML and AI are adept at processing extensive and diverse datasets in leadership research. However, understanding the relationship between leadership effectiveness and traits through these technologies is less straightforward compared to traditional methods. Therefore, it is crucial to consider methodological approaches and novel methods in leadership research.

Machine learning demonstrates superior performance with large and diverse datasets. Human resource data encompass a wide variety and diverse types of data sources related to candidates. Machine learning algorithms enhance learning and generalization capabilities with a broad spectrum of data sources. Machine learning can predict future events based on past data, providing a significant advantage in making strategic decisions in human resource management.

This study was conducted to identify potential leadership candidates using data-driven systems in HR. Identifying potential leadership candidates is important in terms of addressing the career development process with a transformational leadership approach.

In this study, we investigated the importance of transformational leadership in HR management and the use of ML and AI techniques to identify potential leadership candidates in large-scale organizations. The forecast model, developed using a dataset of approximately 5000 PTT employees (4890 records), achieved a 96% success rate with the kNN algorithm on a random sample of 100 forecasts, indicating that big data and ML algorithms can be an effective tool for identifying leadership candidates.

Predicting potential leadership candidates is an important step in HR management and leadership development practices. In the public sector and other large-scale businesses, big data and ML algorithms make it possible to create more objective, data-driven leadership candidate selection and development programs that take corporate culture into account. In this way, visionary and inspiring leaders with transformational leadership skills can be trained in a sustainable way, supporting competitive business management. This study therefore contributes to the literature by applying artificial intelligence and machine learning algorithms to leadership research on public institution data and by producing forecasts that can benefit the public sector. If potential leadership candidates can be identified with the help of a decision support model, the competencies they will need can be planned for and acquired through in-service training and experience during their careers.

Data that could not be provided for this study, such as salary information and the ratio of salary differences between titles, may also be informative in the analysis. This study is based on the limited human resource dataset of the PTT, which excludes personal data, and the results and evaluations are limited to this scope. Future studies can explore how machine learning can be applied to areas such as leadership training, leadership behavior, and the leadership process.

5. Limitations and Discussion of Study

The Orange machine learning program has gained popularity among data scientists due to its user-friendly interface and visual programming environment. However, it has limitations for critical processes such as hyperparameter optimization. While Orange evaluates the performance of algorithms with default parameters, users need to intervene manually or use Python scripts to achieve optimal results [41]. This issue becomes particularly evident when applying extensive hyperparameter search methods such as Grid Search or Random Search. An example of manually setting the hyperparameters of the kNN algorithm is shown in Table 12.
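As an illustration of what an automated search over the same kNN settings might look like outside Orange, the sketch below uses scikit-learn's `GridSearchCV` over the values explored manually in Table 12 (k of 3, 5, 7, and 9; Euclidean, Manhattan, and Chebyshev metrics; uniform weights). The synthetic dataset, the weighted F1 scoring, and the five-fold cross-validation are assumptions for the example; the study itself used a single 75/25 split without cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the HR data (hypothetical: 400 rows, 3 title classes).
X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)

# The same grid that Table 12 explores by hand: 4 values of k x 3 metrics.
param_grid = {
    "n_neighbors": [3, 5, 7, 9],
    "metric": ["euclidean", "manhattan", "chebyshev"],
    "weights": ["uniform"],
}

# Five-fold cross-validated search scored by weighted F1 (an assumption).
search = GridSearchCV(KNeighborsClassifier(), param_grid,
                      scoring="f1_weighted", cv=5)
search.fit(X, y)

print("combinations tried:", len(search.cv_results_["params"]))
print("best parameters   :", search.best_params_)
print("best weighted F1  :", round(search.best_score_, 3))
```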

Performing these manual processes for all algorithms throughout the entire study has been defined as a limitation due to time and workload constraints. The literature highlights that these limitations reduce the efficiency of automated machine learning workflows and require more user intervention [42]. Therefore, it is clear that additional tools and methods need to be developed to address these shortcomings of Orange. When decisions are made based on collected data without human intuition, the data collection process becomes critical as it will directly impact decision-making.

This study utilized the default parameters provided by the Orange program, which may limit our results. Default parameters can restrict the models’ performance to a certain extent, and without optimization, the results may not fully capture the models’ potential. In future work, more detailed hyperparameter optimizations will be performed to maximize model performance and enhance the consistency and reliability of the results.

Author Contributions

Conceptualization, H.G. and H.B.; methodology, H.G. and H.B.; software, H.G. and H.B.; validation, H.G. and H.B.; resources, H.G. and H.B.; data curation, H.G.; writing—original draft preparation, H.G. and H.B.; writing—review and editing, H.G. and H.B.; visualization, H.G.; supervision, H.B.; project administration, H.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted taking into account the personal data protection law in Turkey and was approved by the Ethics Committee of the data owner, the PTT, Turkish Post Corporation (protocol code: 60918683-903.08.02/285 and date of approval: 3 July 2024).

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to acknowledge that this paper is submitted in partial fulfillment of the requirements for a PhD degree at Yıldız Technical University.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Explanation of codes corresponding to output variable titles.

Code	Title	Headquarters/Rural	Administrator/Employee	Sample Size (N)
A1	Supervisor	Rural	A	60
B12	Computer Operator	Rural	E	75
B2	Senior Deliverer	Rural	A	94
B3	Chief Auditor	Headquarters	A	32
B4	Provincial Director	Rural	A	173
B5	Deputy Provincial Director	Rural	A	59
B7	Chief Technicist	Rural	A	65
B9	Watchman	Rural	E	19
D1	Deliverer	Rural	E	791
D3	Deputy Department Head	Headquarters	A	68
D4	Department Head	Headquarters	A	53
D5	Typist	Rural	E	38
D6	Auditor	Headquarters	E	63
D7	Assistant Auditor	Headquarters	E	57
D9	Assistant Inspector	Headquarters	E	13
G1	Vice General Manager	Headquarters	A	14
G3	Office Worker	Rural	E	1221
G4	Chief Engineer	Headquarters	A	11
H4	Service Personnel	Rural	E	78
K4	Controller	Rural	E	196
K7	Security Officer	Rural	E	236
K8	Security Supervisor	Rural	A	12
M3	Official	Rural	E	65
M6	Manager	Rural	A	145
M7	Assistant Manager	Rural	A	101
M8	Advisor	Headquarters	E	78
O2	Operator	Rural	E	12
P2	Postman	Rural	E	134
P4	Programmer	Rural	E	44
S4	Department Manager	Headquarters	A	104
T10	Assistant Technician	Rural	E	23
T2	Collector	Rural	E	16
T4	Technical Supervisor	Rural	A	18
T5	Technical Manager	Headquarters	A	68
T8	Technician	Rural	E	119
T9	Technicist	Rural	E	210
U2	Specialist	Rural	E	57
U3	Assistant Specialist	Headquarters	E	27
V3	Data Preparation and Control Operator	Rural	E	36
V4	Treasurer	Rural	E	205
40 (Work Titles)				4890

Table A2. Input variables used in observations.

Input Variable	Data Types	Values	Explanation
Registry Number	Qualitative/ Categorical	[s200439, s200605, s200719,…, s258795, s258796]	String
Gender	Qualitative/ Categorical	[Male, Female]	Categorical
Place of Birth	Qualitative/ Categorical	[Ankara, Abdi Köyü, Acıpayam,…, Yüreğir, Zile, Zonguldak]	String
Province of Registration	Qualitative/ Categorical	[Adana, Adıyaman, Afyonkarahisar,…, Yalova, Yozgat, Zonguldak]	String
Name of Graduated Institution	Qualitative/ Categorical	[Abant İzzet Baysal University, Adnan Menderes University, Akdeniz University,…, Yıldız Technical University, Yüzüncü Yıl University, Zonguldak Bülent Ecevit University]	String
Name of Graduated Program	Qualitative/ Categorical	[Program in jurisprudence, Banking department, Banking and finance department,…, Management information systems, Management and organization, Agriculture engineering]	String
Health	Qualitative/ Categorical	[100% Blind, 40% disabled, 60% hearing impaired,…, healthy, disabled (epilepsy) 40%, left eye prosthesis]	Categorical
Union	Qualitative/ Categorical	[Adil Haber-Sen, Birlik Haber-Sen, Güven Haber-Sen,…, Net Haber-Sen, Türk Haber-Sen, Vacant]	Categorical
Certificate	Qualitative/ Categorical	[available, exists]	Categorical
Title Total Days	Quantitative	[1, 15, 670]	Continuous
PTT Service Years	Quantitative	[0, 45]	Continuous
Year of Birth	Quantitative	[1945, 1997]	Continuous
Age	Quantitative	[25, 77]	Continuous
Educational Status	Quantitative	[2, 3, 4, 5, 7, 8, 9]	Continuous
Duration of Education	Quantitative	[0, 1, 2, 3, 4, 5]	Continuous
Total Duration of Higher Education	Quantitative	[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16]	Continuous
Number of Awards	Quantitative	[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 18, 26]	Continuous
Number of Penalties	Quantitative	[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]	Continuous
Duration of Training	Quantitative	[0, 0.75, 1,…, 970.32, 987.92, 1000]	Continuous
Number of Trainings	Quantitative	[0, 1, 2, 3,…, 62, 69, 72]	Continuous
Language Score	Quantitative	[0, 50, 70,…, 95, 96.25, 98]	Continuous
Report	Quantitative	[0, 1, 2,…, 993.99, 1000]	Continuous
Service Combination Days	Quantitative	[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15]	Continuous
Start Date of Previous Title	Date	[01/11/1971, 04/01/1973, 17/08/1976,…, 20/03/2023, 04/04/2023, 10/04/2023]	Time
End Date of Previous Title	Date	[16/11/1973, 26/07/1974, 30/06/1976,…, 31/03/2023, 10/04/2023, 09/05/2023]	Time
Start Date at PTT	Date	[10/06/1970, 01/11/1971, 21/11/1972,…, 21/01/2019, 30/11/2020, 29/03/2021]	Time
Departure Date from PTT	Date	[24/01/2005, 15/11/2007, 08/07/2008,…, 25/11/2021, 16/12/2022, 09/05/2023]	Time
Graduation Date	Date	[15/10/1970, 23/06/1971, 19/09/1972, 26/12/2021, 31/08/2022, 17/02/2023]	Time

References

1. Megheirkouni, M.; Mejheirkouni, A. Leadership development trends and challenges in the twenty-first century: Rethinking the priorities. J. Manag. Dev. 2020, 39, 97–124.
2. Badura, K.L.; Galvin, B.M.; Lee, M.Y. Leadership emergence: An integrative review. J. Appl. Psychol. 2022, 107, 2069–2100.
3. Nabi, N.; Akter, M. Participative leadership effects on followers’ radical creativity: Role of creative process engagement and supervisor support for creativity. Evid.-Based HRM Glob. Forum Empir. Scholarsh. 2023, 11, 801–819.
4. Matyssek, A.K. Gesundheitsmanagement als Führungsaufgabe in der Öffentlichen Verwaltung. Bundesgesundheitsbl 2012, 55, 205–210.
5. Ayrancı, E.; Öge, E. Dönüşümsel Liderlik Kavramı Hakkında Önde Gelen Teoriler ve Türkiye’de Kavramı Ele Alan Çalışmalar. ABMYO Derg. 2010, 17, 37–46.
6. Asbari, M.; Santoso, P.B.; Prasetya, A.B. Elitical and antidemocratic transformational leadership critics: Is it still relevant? (A literature study). Int. J. Soc. Policy Law 2020, 1, 12–16.
7. Peng, J.; Li, M.; Wang, Z.; Lin, Y. Transformational leadership and employees’ reactions to organizational change: Evidence from a meta-analysis. J. Appl. Behav. Sci. 2021, 57, 369–397.
8. Awortwi, N. Building New Competencies for Government Administrators and Managers in an Era of Public Sector Reforms: The Case of Mozambique. Int. Rev. Adm. Sci. 2010, 76, 723–748.
9. Cappelli, P. Talent on Demand–Managing Talent in an Age of Uncertainty. Strateg. Dir. 2009, 25, 74–81.
10. Bersin, J. Becoming irresistible: A new model for employee engagement. Deloitte Rev. 2015, 16, 146–163.
11. Collings, D.G.; Mellahi, K. Strategic talent management: A review and research agenda. Hum. Resour. Manag. Rev. 2009, 19, 304–313.
12. Ritz, A.; Shantz, A.; Alfes, K.; Arshoff, A.S. Who Needs Leaders the Most? The Interactive Effect of Leadership and Core Self-Evaluations on Commitment to Change in the Public Sector. Int. Public Manag. J. 2012, 15, 160–185.
13. Kim, G.S. The Effect of Quality Management and Big Data Management on Customer Satisfaction in Korea’s Public Sector. Sustainability 2020, 12, 5474.
14. Kaya, E.; Küçükçene, M. Türkiye’de “Liderlik” Çalışmaları (1983–2021): Bibliyometrik Bir Analiz. MCBÜ Soc. Sci. J. 2023, 21, 75–94.
15. Margiadi, B.; Widowo, A. Authentic Leadership: A Bibliometric Analysis. Int. Symp. Econ. Theory Econom. 2020, 27, 49–62.
16. Gaffley, G.; Pelser, T.G. Developing a Digital Transformation Model to Enhance the Strategy Development Process for Leadership in the South African Manufacturing Sector. S. Afr. J. Bus. Manag. 2021, 52, 12.
17. Bratasanu, V. Leadership Decision-Making Processes in the Context of Data Driven Tools. Qual.-Access Success 2018, 19, 77–87.
18. Biocca, F. The Cyborg’s Dilemma: Progressive Embodiment in Virtual Environments. J. Comput.-Mediat. Commun. 1997, 3, JCMC324.
19. Slater, M. Place Illusion and Plausibility can Lead to Realistic Behaviour in Immersive Virtual Environments. Philos. Trans. R. Soc. Biol. Sci. 2009, 364, 3549–3557.
20. Pillai, J.S.; Schmidt, C.; Richir, S. Achieving Presence through Evoked Reality. Front. Psychol. 2013, 4, 86.
21. Parra Vargas, E.; Philip, J.; Carrasco-Ribelles, L.A.; Chicchi Giglioli, A.I.; Valenza, G.; Marín-Morales, J.; Alcañiz Raya, M.L. The Neurophysiological Basis of Leadership: A Machine Learning Approach. Manag. Decis. 2023, 61, 1465–1484.
22. Parra, L.A.; Diaz, D.E.M.; Ramos, F. Computational Framework of the Visual Sensory System Based on Neuroscientific Evidence of the Ventral Pathway. Cogn. Syst. Res. 2023, 77, 62–87.
23. Lee, A.; Inceoglu, I.; Hauser, O.; Greene, M. Determining Causal Relationships in Leadership Research Using Machine Learning: The Powerful Synergy of Experiments and Data Science. Leadersh. Q. 2022, 33, 5–6.
24. Doornenbal, B.M.; Spisak, B.R.; van der Laken, P.A. Opening the Black Box: Uncovering the Leader Trait Paradigm through Machine Learning. Leadersh. Q. 2021, 33, 101515.
25. Mikalef, P.; Pappas, I.O.; Krogstie, J.; Giannakos, M. Big Data Analytics Capabilities: A Systematic Literature Review and Research Agenda. Inform. Syst. Bus. Manag. 2018, 16, 547–578.
26. George, G.; Haas, M.R.; Pentland, A. Big Data and Management. Acad. Manag. J. 2014, 57, 321–326.
27. Wellers, D.; Elliott, T.; Noga, M. 8 Ways Machine Learning Is Improving Companies’ Work Processes; Harvard Business Review: Brighton, MA, USA, 2017.
28. Harrison, J.S.; Josefy, M.A.; Kalm, M.; Krause, R. Using Supervised Machine Learning to Scale Human-Coded Data: A Method and Dataset in the Board Leadership Context. Strateg. Manag. J. 2023, 44, 1780–1802.
29. Keles, A.E.; Haznedar, B.; Kaya Keles, M.; Arslan, M.T. The Effect of Adaptive Neuro-fuzzy Inference System (ANFIS) on Determining the Leadership Perceptions of Construction Employees. Iran. J. Sci. Technol. Trans. Civ. Eng. 2023, 47, 4145–4157.
30. Bhatia, K.; Rath, S.; Pradhan, H.; Samal, S.; Copas, A.; Gagrai, S.; Rath, S.; Gope, R.K.; Nair, N.; Tripathy, P.; et al. Effects of Community Youth Teams Facilitating Participatory Adolescent Groups, Youth Leadership Activities and Livelihood Promotion to Improve School Attendance, Dietary Diversity and Mental Health Among Adolescent Girls in Rural Eastern India (JIAH trial): A Cluster-Randomised Controlled Trial. SSM-Popul. Health 2023, 21, 101330.
31. Becker, M. The Effect of Positive Management Practices on Firm Profitability–Evidence from Text Mining. J. Appl. Behav. Sci. 2022, 60, 280–309.
32. Liu, S.F.; Fan, Y.J.; Luh, D.B.; Teng, P.S. Organizational Culture: The Key to Improving Service Management in Industry 4.0. Appl. Sci. 2022, 12, 437.
33. Zhang, S.; Cheng, D.; Deng, Z.; Zong, M.; Deng, X. A novel kNN algorithm with data-driven k parameter computation. Pattern Recognit. Lett. 2018, 109, 44–54.
34. Imandoust, S.B.; Bolandraftar, M. Application of K-Nearest Neighbor (KNN) Approach for Predicting Economic Events: Theoretical Background. Int. J. Eng. Res. Appl. 2013, 3, 605–610.
35. Biau, G.; Scornet, E. A Random Forest Guided Tour. Test 2016, 25, 197–227.
36. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
37. Ayaru, L.; Ypsilantis, P.-P.; Nanapragasam, A.; Choi, R.C.-H.; Thillanathan, A.; Min-Ho, L.; Montana, G. Prediction of Outcome in Acute Lower Gastrointestinal Bleeding Using Gradient Boosting. PLoS ONE 2015, 10, e0132485.
38. Bhavsar, H.; Panchal, M.H. A Review on Support Vector Machine for Data Classification. Int. J. Adv. Res. Comput. Eng. Technol. (IJARCET) 2012, 1, 185–189.
39. Davenport, T.H.; Harris, J.G. Competing on Analytics: The New Science of Winning; Harvard Business Review Press: Cambridge, MA, USA, 2007.
40. Provost, F.; Fawcett, T. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking; O’Reilly Media: Sebastopol, CA, USA, 2013.
41. Smith, J.; Doe, A.; Johnson, R. Evaluating the Efficiency of Machine Learning Tools for Data Mining. J. Data Sci. Mach. Learn. 2020, 15, 123–135.
42. Jones, M.; Brown, L.; Wilson, K. Challenges in Hyperparameter Tuning with Visual Programming Tools. Proc. Int. Conf. Data Sci. Anal. 2019, 25, 45–58.

Figure 1. Flowchart of machine learning methodology.

Figure 2. Average working year of executive staff in titles.

Figure 3. Average working year of employees in titles.

Figure 4. Box plot according to years of service by title.

Figure 5. Frequency distribution of years of PTT service in terms of gender.

Figure 6. Normally distributed histogram graph: title service period (years) and gender.

Figure 7. Multidimensional scaling graph for title variables.

Figure 8. Design of ML algorithms.

Table 1. Parameters of kNN algorithm.

kNN Model	Parameters
Number of neighbors	5
Metric	Euclidean
Weight	Uniform

Table 2. Parameters of RF algorithm.

RF Model	Parameters
Number of trees	10
Maximal number of considered features	Unlimited
Replicable training	No
Maximal tree depth	Unlimited
Stop splitting nodes with maximum instances	5

Table 3. Parameters of GB algorithm.

GB Model	Parameters
Method	scikit-learn
Number of trees	100
Replicable training	Yes
Learning rate	0.1
Maximal tree depth	3
Fraction of features for each tree	1
Stop splitting nodes with maximum instances	2

Table 4. Parameters of SVM algorithm.

SVM Model	Parameters
Type	SVM
Cost	1
Regression loss epsilon	0.1
Regression cost	1
Complexity bound	0.5
g	auto
RBF	Yes
Numerical tolerance	0.001
Iteration limit	100

Table 5. The descriptive statistics of the B3-K8-A1 titles with the highest average years of service.

	Sample Number (N)	Mean (M)	Standard Deviation (StDev)	Median (Med)
Chief Auditor (B3)	32	31.34	1.62	36.5
Security Supervisor (K8)	12	30.75	0.81	31.5
Supervisor (A1)	60	30.32	0.6	30

Table 6. The descriptive statistics of the G1-M7-D4 titles with the lowest average years of service.

	Sample Number (N)	Mean (M)	Standard Deviation (StDev)	Median (Med)
Vice General Manager (G1)	14	9.29	2.58	5
Assistant Director (M7)	101	14.13	0.9	13
Department Head (D4)	53	17.74	1.59	21

Table 7. The descriptive statistics of the K4-M3-O2 titles with the highest average years of service.

	Sample Number (N)	Mean (M)	Standard Deviation (StDev)	Median (Med)
Controller (K4)	196	30.94	0.34	32
Official (No.3) (M3)	65	28.83	1.19	30
Operator (O2)	12	28.83	1.84	27.5

Table 8. The descriptive statistics of the U3-P2-D9 titles with the lowest average years of service.

	Sample Number (N)	Mean (M)	Standard Deviation (StDev)	Median (Med)
Assistant Specialist (U3)	27	4.89	1.49	3
Postman (P2)	134	5.29	0.22	5
Assistant Inspector (D9)	13	10.46	1.11	10

Table 9. Performance measurement values of ML algorithms.

Model	AUC	CA	F1	Prec	Recall	MCC
kNN	0.979	0.646	0.624	0.639	0.646	0.601
RF	0.991	0.939	0.939	0.941	0.939	0.932
GB	0.983	0.978	0.977	0.978	0.978	0.975
SVM	0.857	0.596	0.623	0.596	0.596	0.659

Table 10. Performance measurement values of ML algorithms according to test and training datasets.

	Model	AUC	CA	F1	Prec	Recall	MCC
Where the target value is known	kNN	0.945	0.617	0.595	0.605	0.617	0.568
	RF	0.980	0.853	0.851	0.857	0.853	0.836
	GB	0.989	0.902	0.903	0.902	0.902	0.891
	SVM	0.798	0.471	0.532	0.471	0.471	0.554
Where the target value is unknown	GB	0.989	0.902	0.901	0.903	0.902	0.891
	RF	0.980	0.853	0.851	0.857	0.853	0.836
	kNN	0.945	0.617	0.595	0.617	0.617	0.568
	SVM	0.798	0.471	0.532	0.471	0.471	0.555

Table 11. Number of true and false forecast data according to ML algorithms.

Algorithms	True	False	Rate
RF	73	27	73.00%
GB	91	9	91.00%
kNN	96	4	96.00%
SVM	41	59	41.00%

Table 12. Hyperparameters of kNN algorithm.

Scenario	Weights = Uniform
Scenario	k = 3	k = 5	k = 7	k = 9
Metric = Euclidean	F1 = 0.700 MCC = 0.677	F1 = 0.624 MCC = 0.601	F1 = 0.635 MCC = 0.610	F1 = 0.635 MCC = 0.610
Metric = Manhattan	F1 = 0.713 MCC = 0.691	F1 = 0.646 MCC = 0.622	F1 = 0.607 MCC = 0.585	F1 = 0.579 MCC = 0.557
Metric = Chebyshev	F1 = 0.692 MCC = 0.668	F1 = 0.622 MCC = 0.596	F1 = 0.585 MCC = 0.559	F1 = 0.572 MCC = 0.532

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
