Mapping the Landscape of Data Science Education in Higher General Education in Taiwan: A Comprehensive Syllabi Analysis

Hsu, Yu-Chia

doi:10.3390/educsci14070763

by

Yu-Chia Hsu

Department of Sport Information and Communication, National Taiwan University of Sport, Taichung 404401, Taiwan

Educ. Sci. 2024, 14(7), 763; https://doi.org/10.3390/educsci14070763

Submission received: 25 April 2024 / Revised: 17 June 2024 / Accepted: 1 July 2024 / Published: 12 July 2024

Abstract:

The evolving landscape of data science education poses challenges for instructors in general education classes. With the expansion of higher education dedicated to cultivating data scientists, integrating data science education into university curricula has become imperative. However, addressing diverse student backgrounds underscores the need for a systematic review of course content and design. This study systematically reviews 60 data science course syllabi in general education across all universities in Taiwan. Utilizing content analysis, bibliometric, and text-mining methods, this study quantifies key metrics found within syllabi, including instructional materials, assessment techniques, learning objectives, and covered topics. The study finds that textbooks are rarely shared across courses and that instructional materials focus particularly on Python programming. Assessment methods primarily involve participation, assignments, and projects. Analysis based on Bloom’s Taxonomy suggests a focus on learning objectives of moderate complexity. The topics covered prioritize, in descending order, big data competency, analytical techniques, programming competency, and teaching strategies. This study contributes to current knowledge by tackling the challenge of delineating the specific content of data science. It also provides references for streamlining the integration of multiple disciplines within introductory courses while ensuring flexibility for students with varying programming and statistical proficiencies.

Keywords:

curriculum analysis; data science literacy; bibliometric; text mining

1. Introduction

The era of “Everything Is Data” and the tremendous growth of data from connected devices and social media have created a high demand for big data capabilities [1] and data science skills across disciplines and organizations [2]. Higher education institutions have responded by offering more data science courses to equip students with essential data competencies. Besides the growing number of standalone data science programs in various university departments, there is increasing demand for integrating domain expertise into classes. Data science education from computer science and statistics departments is becoming more widely included in general education to enhance interdisciplinary and composite competencies. This equips students with the ability to navigate today’s data-rich world early in their education [3]. However, students in general education courses may still face gaps between their domain and the prerequisite skills required for data science.

Despite the increasing number of university data science programs, data science is typically taught at the graduate level or in higher-level courses. These multidisciplinary programs have been evaluated through syllabi analyses, revealing early trends [2,4] and common mandatory courses such as linear algebra, data mining, programming, and statistics [5]. While some undergraduate programs are more diverse, including business courses [6], data science is relatively new in general education and is often offered as an introductory course.

Several frameworks describe components of data science education, including those focused on mathematical and computational foundations [7], developing undergraduate data acumen [8], and preparing professionals with best practices [9]. Other frameworks include the EDISON Data Science Framework [10], the ACM Data Science Knowledge Areas [11] and the IPSJ Data Science Curriculum Standard [12], which are designed for university-level education majoring in data science. Whilst these frameworks have been used to evaluate data science programs quantitatively [2,4], they focus more on preparing students for careers in data science than promoting data literacy in general education. Additionally, these frameworks are too broad and complex for an introductory course, and their coverage must be narrowed.

Due to the lack of a clear definition of what data science actually entails [13], a potential barrier for instructors in general education classes is how to concentrate several courses across multiple disciplines into one introductory course and how to accommodate students with different programming and statistical skills. Excessive emphasis on programming, mathematics, and statistics often leads to learning difficulties [14]. Thus, integrating data science into specific domains or daily life is crucial [3], but guidelines for this are lacking [15]. Previous studies have yielded limited information on the constituent elements of data science curricula in general education, including requisite components, content emphases, pedagogical approaches, and instructor qualifications. Creating a captivating course that caters to diverse student interests and reflects current data technology trends remains a challenge.

The purpose of this research is to scrutinize the present practices in data science education and offer perspectives for curriculum advancement across diverse academic domains, while answering the following research questions.

Instructional materials: What type of instructional materials are often used? Which software and programming languages are prevalent?
Assessment techniques: How are the learning outcomes evaluated? What are the commonly used assessment techniques?
Learning objectives: What is the level of complexity of the learning objectives? To what extent are the expected learning outcomes aligned with the cognitive processes identified in Bloom’s taxonomy?
Topics covered: Which topics are covered most frequently in these courses?

To answer these questions, this study performed a comprehensive quantitative examination of university general education syllabi through a census of all 76 universities in Taiwan. Firstly, a systematic approach [16] was adopted to extract data from a large number of open syllabi. A total of 60 related course syllabi were collected from general education programs for the entire academic year of 2021–2022. Secondly, the extracted data were used to examine how data science courses are being taught in general education using content analysis, bibliometric [17], and text mining techniques [18]. Four elements of the syllabi, namely, course materials, learning outcome assessment, learning objectives, and weekly schedules with topics, were reviewed to assess the presence and content of the mainstream core of data science courses. Thirdly, popular topics, the utilization of skills and software, and the emphasis on cultivating competitiveness within the course were discussed in light of their significant trade-offs. The remainder of this paper is organized as follows: Section 2 reviews the related literature, Section 3 details the research methodology, Section 4 presents the results from various analyses, Section 5 discusses the findings in comparison to prior work, Section 6 concludes by summarizing the study, and Section 7 addresses the limitations and suggestions for future research.

2. Literature Review

2.1. Data Science Competencies

Educators view data science as a competency or a literacy to be cultivated that can help organizations create value. In general education, data analysis competencies and data science literacy are all related to data science education despite some differences in their terminologies. How these competencies or literacies should be organized and systematically described has emerged as a research topic in recent years.

Klee et al. [19] used a systematic review approach to summarize how data analysis capabilities contribute to the value of a business. They associated three data analytics competencies with two work-practice-level business value propositions and posited six propositions regarding the beneficial influence of data analytics competencies on business value. These include enhancing both inductive and deductive analytical work, balancing the stress dynamics between human actors and algorithms, organizing data sources and analytical processes, providing a framework for weighing the strengths and weaknesses of human actors and algorithms, enhancing in-depth analytical activities, and promoting technical and technological principles. Similarly, Hattingh et al. [20] used the term “data science competencies” and conducted a systematic literature review to identify those competencies that are necessary for data scientists to perform their jobs effectively. They proposed a unified model of data science competency based on the results of their theme analysis. This model comprised six broad competencies, namely, organizational, technical, analytical, ethical and regulatory, cognitive, and social competencies. Pratsri et al. [21] synthesized data science competency for higher education students by collecting data from textbooks, academic documents, and research articles. They concluded that data science competency for higher education students includes five competencies: programming skills, elementary statistics, fundamentals of data science, data preparation, and big data analytics.

To educate non-technology-related departments on data science literacy, Overton & Kleinschmit [15] integrated this subject into established domain programs. In the case of public administration programs, they designed a data science literacy framework to help educators understand the breadth of their data science skills and to offer recommendations on how the data science program can integrate its competencies into their current courses. A wide range of skills anchored within data science were organized into four domains, namely, data tasks, computation, statistics and application and systems integration knowledge. By incorporating these domains, data science can achieve its primary goal of extracting knowledge from data to generate insights that can be used to make informed decisions without the need for any statistical or computational sophistication [22].

The above studies were conducted from the perspective of the competency model [23], which is in line with the concept of the educational goal of cultivating literacy in general education. However, the domains or frameworks derived from the systematic literature review still fall short for a limited-hour introductory course in a classroom setting. A comprehensive syllabus review for the actual courses must be conducted to narrow this gap.

2.2. Teaching Practices in Data Science Courses

A number of specific options for the content, coverage, and pedagogy of data science education have been defined in detail within the research community. The development of introductory data science courses has recently received scholarly attention, with some sample works listed in Table 1. As can be seen from the table, recent studies on this topic have called for a data science instruction reform to ensure that the curriculum being taught is appropriate and consistent with its goals when catering to students from diverse backgrounds.

Data science curriculum development has received wide attention in the field of statistics education. The data science course developed by Baumer [24] is an introductory course under the Statistics and Data Science program that incorporates introductory statistics and introductory programming as a starting point and requires experience in basic statistics and programming. This one-semester course is organized into a series of two- to three-week modules that include data visualization, data manipulation/data wrangling, computational statistics, machine/statistical learning, and other topics. Moreover, as the first course of the degree program in data science, Yan and Davis [25] designed their course around the concept of data science life cycle, which draws upon activity theory to emphasize the use of tools to transform real data and to answer highly motivated questions. Typically, one of the goals of these programs is to cultivate data scientists, and the students enrolled in these programs are not limited to those majoring in statistics and computer science but also include other students majoring in humanities, social sciences, and natural sciences who are interested in data science.

A distinct approach to data science course development operates independently from the core program. It caters to students from diverse academic backgrounds and is frequently available in the first year without any prerequisite coursework. Such courses prioritize distinct pedagogical strategies and innovations. For instance, the course designed by Çetinkaya-Rundel and Ellison [26] focuses on modern and multivariate exploratory data analysis (especially data visualization) and the data analysis cycle, the importance of collaboration, best practices, and tools for reproducible computing, the model-based perspective, and an effective communication of findings. Lasser et al. [27] designed a complete course that includes case studies, project work, and open online teaching resources based on contemporary datasets. For general education, Schuff [28] designed and structured a data science course by emphasizing practical data literacy through current events, readily available analysis tools, and the methods of scientific inquiry. The course is designed to inspire an “evidence-based” mindset, encouraging students to identify and use data relevant to them in their field of study and the larger world around them. Meanwhile, in terms of pedagogy, Asamoah et al. [29] introduced a synthetic interdisciplinarity approach where the course is delivered by two instructors with expertise in management information system and computer science. Wong and Kawash [30] incorporated experiential learning into the course through teamwork and analysis of actual data. However, as with other general education disciplines, data science lacks a common core content within general education.

Table 1. Summary of studies on the development of introductory data science courses.

Author(s)	Type of Study	Course and Targeted Students	Course Highlights	Feedback
Baumer [24]	Course design	This course is part of the Statistics and Data Science undergraduate program and is offered in a liberal arts environment with prerequisites in introductory statistics and some programming skills.	The course is organized into a series of 2–3-week modules: data visualization, data manipulation/data wrangling, computational statistics, machine/statistical learning, and additional topics.	Useful information was believed to have been learned by the students through informal and formal evaluations.
Schuff [28]	Course design	A general education course for non-technology audiences at Temple University in the Northeastern United States.	The design of the course set out to inspire an “evidence-based” mindset, encouraging students to identify and use data relevant to them in their field of study and the larger world around them. The course is divided into four multi-week modules: data in our daily lives, telling stories with data, working with data in the real world, and analyzing data.	“Data literacy” is the true core skill for undergraduate students, not sophisticated analytics techniques.
Yan & Davis [25]	Course design	A first course in data science as part of a 4-year undergraduate degree program in data science. Approximately 40% of the students major in data science, with the rest coming from a variety of disciplines.	The course is designed around a concept called the data science life cycle. The course philosophy, based on activity theory, emphasizes the use of tools to transform real data in order to answer highly motivated questions about the data.	This course is better motivated with many realistic applications. Students generally express satisfaction with the in-class discussion, the hands-on exercises on examples discussed in class, the exam question on data visualization, and their ability to select problems for their projects.
Çetinkaya-Rundel & Ellison [26]	Course design	Introduction to Data Science and Statistical Thinking is designed for undergraduate students aspiring to major in statistics or data science as well as those pursuing humanities, social science, and natural science fields.	The course emphasizes modern and multivariate exploratory data analysis, data visualization, analysis cycle, collaboration, best practices, and tools for reproducible computing, model-based perspective, and effective communication of findings.	The course has served as a bridge between the statistics and the computer science curriculum, accelerating the development of an interdepartmental data science major, while also meeting the introductory statistics requirement of many majors.
Lasser et al. [27]	Course design	This course is designed as a service course to introduce students from a variety of disciplines to data science.	A complete course is outlined, which incorporates case studies, project work, and open-access online teaching resources based on contemporary data sets.	Students reported very high levels of interest and said they learned a lot. They rated the required workload as neither too low nor too high and gave the course a very good overall evaluation.
Asamoah et al. [29]	Pedagogy	This course is designed for a cross-disciplinary group of business, liberal arts, engineering, and computer science students.	A synthetic interdisciplinarity approach was used to teach the course by two instructors from Management Information System and Computer Science.	Interdisciplinarity ensures positive learning experiences and results in high learning outcomes.
Wong & Kawash [30]	Pedagogy	An introductory course, Thinking with Data, for first-year multidisciplinary students with no prerequisite.	Experiential learning is incorporated into the course through teamwork and real-life data analysis.	The students’ learning experience was enhanced by hands-on activities during tutorials.

2.3. Research and Methodology of Syllabus Analysis

The syllabus is an epitome of a contract between the student and the teacher [31]. This document should contain sufficient details about the course objectives and goals and provide guidelines on learning activities, available instructional resources, and assessment methods [31,32,33]. From the syllabus, one can understand the structure of a course, the exact units of knowledge covered, and how individual learning objects are packaged for students [34]. The syllabus is particularly valuable, as it provides a cognitive map of the subject in the form of key topics, connections amongst topics, and legitimate sources [35]. In educational research, syllabi analysis is a common method for determining curriculum requirements [36]. The syllabi collected from a number of courses have been used in many fields as research materials to evaluate whether learner-centeredness pedagogy is reflected [37,38], to examine hidden gendering [39], and to determine how the syllabi can be used as tools for socialization [40].

A corpus of 9 million English-language syllabi from 140 countries has been collected and analyzed in the Open Syllabus Project over the years to support novel teaching and learning applications. Through the sharing of syllabi, instructors can peer review one another, explore topics and fields of teaching, improve their materials and pedagogy, and support their efforts to align higher education with the needs of the job market.

A content analysis of the syllabus is helpful for course development, especially for emerging fields. Carey and Najarian Souza [41] drew 25 syllabi from relevant courses to assess the presence and content of the emerging core of the sociology of disability. Grounded theory [42] was employed to scrutinize the themes and assess the consensus among these syllabi. The analysis revealed minimal convergence in prescribed readings or authors, but notably strong agreement existed concerning the course descriptions and topics. This consensus was particularly notable in relation to specific viewpoints and fundamental concepts. Overton and Kleinschmit [43] categorized the research method course content in 52 syllabi from 31 Master of Public Administration programs to identify the data skills currently being taught and which type of content should be included in their research method courses. By assessing the course content covered each week, they approximated the topics covered and the amount of time spent on each topic. Karanja and Malone [44] examined how to improve the project management curriculum by assessing the nature of the learning outcomes in the course syllabi and their alignment with Bloom’s taxonomy framework [45,46]. They conducted document and content analyses to generate quantitative statistics on the learning outcomes documented in 56 syllabi based on the action verbs used to describe the learning outcomes, the relationships between cognitive processes and knowledge domains, and the types of assessments. Friedman [47] conducted an empirical study based on 35 syllabi of big data to examine how leading academic institutions and instructors highlight their understanding of big data using the rubric of Palmer et al. with the aim of discovering common practices and directions. The results show that most of these syllabi cover eight common topics, with the top three being big data infrastructure, data-driven applications, and data mining. 
To demonstrate certain topics such as data mining, advanced statistics, and visualization, many instructors use their preferred applications, such as R or Python. These syllabi did not share a common textbook. They also often divided the main concepts of big data into smaller content items based on how the instructor interprets the subject matter.

Some syllabus analyses have adopted quantitative text-analytic methods, such as text mining, that differ from qualitative content analysis. For instance, Urs and Minhaj [4] conducted an impressionistic study of the evolution of data science based on Kuhn’s four stages of paradigm shift [48]. By performing a curriculum analysis, they painted the field as it emerges from word frequency patterns, ranked and clustered the course titles based on text mining, mapped the curriculum to the data science landscape, and then projected it onto the EDISON Data Science Framework and the ACM Data Science Knowledge Areas.

Given that the syllabus of graduate degree curricula often lists some required academic readings, the bibliometric techniques adopted for citation research have also been used to examine syllabi. For instance, Fréchet et al. [16] built a text-analysis course syllabus by calculating the Source-Ranking Index, which represents the importance of 10 course topics, based on the recurrence and chronology of the cited literature in the syllabi. Herzog et al. [17] proposed a new approach for studying the curricula using bibliometric methods. They modified bibliometric algorithms to fit the curriculum context and to assess the interdisciplinarity in higher education curricula. They used course syllabi as data objects and then coded the required references listed in these syllabi via disciplinary affiliation for analysis.

3. Methodology

This study investigates several aspects relevant to data science courses within the context of university general education, including instructional materials, assessment techniques, learning objectives, and topics covered. Differing from previous investigations, this study opted for a full census rather than a sample survey to provide an all-encompassing evaluation of general education course syllabi within Taiwanese universities. Consequently, the research does not target the curriculum or programs of any particular department or school.

To collect a wide array of syllabi in general education, the course information website (https://course-tvc.yuntech.edu.tw, accessed on 6 May 2023) was used as a data source. This website was built as part of a project similar to Open Syllabus that was set up under the official policy of the Ministry of Education of Taiwan to create a publicly accessible database of university courses. This database covers every course offered by all 76 universities in Taiwan every semester and stores over 6 million course syllabi in Chinese as of 2022.

The syllabus collection was conducted based on a systematic review approach, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [49]. Figure 1 shows the flow diagram with the number of syllabi identified, included, and excluded, and the reasons for exclusion. For a broader review of data science-related general education courses, this study included those titled Data Science or Big Data. It employed three commonly translated Chinese terms as keywords to search within course titles. Only the syllabi for the courses offered by the Center for General Education were considered in this study, and the syllabi for courses offered by specialized departments or programs were excluded. For the full academic year 2021, a total of 954 data science-related courses were offered in 380 academic departments at 64 universities. After several rounds of filtering, 91 undergraduate general education courses were deemed eligible for this study. A few inappropriate and incomplete syllabi were eliminated, leaving this research with a total of 60 syllabi from 28 universities as units of analysis. Three syllabi, provided in Supplementary File S1 online, serve as an illustration of the source material.

An in-depth analysis of data science syllabi in general education was conducted using descriptive content analysis. This quantitative approach, as outlined by [50], involves measuring and drawing inferences from the text. Aligned with the research questions, the analysis focused on four key dimensions: instructional materials, assessment techniques, cognitive domain levels within learning objectives, and topics covered in each syllabus. The initial two dimensions—instructional materials and assessment techniques—underwent content analysis. This process involved creating a coding scheme, categorizing text, and determining frequencies and percentages within each category. The coding analysis for the 60 course syllabi based on these aspects can be found in Supplementary File S2 online. The other two dimensions—cognitive domain levels within the learning objectives and topics covered—were scrutinized through text mining and bibliometric techniques.

Learning objectives are a crucial element when developing course syllabi and lesson plans, serving to explicitly communicate what the student needs to do in order to demonstrate learning by utilizing measurable verbs and phrases [51]. The Taxonomy of Educational Objectives, also known as Bloom’s Taxonomy, was created to classify educational aims [45], and has been extensively applied across various fields. This theory is based on the notion that there are ascending degrees of observable behaviors that reveal the depth of cognitive activity. The initial framework was widely embraced but was later re-examined by a group of cognitive psychologists, curriculum theorists, instructional researchers, and testing assessment specialists in 2000 [46]. The revised Bloom’s Taxonomy utilizes measurable verbs and verbal nouns to label the six major categories and subcategories. These categories are arranged in ascending order of cognitive complexity, namely remember, understand, apply, analyze, evaluate, and create. Table 2 presents a detailed list of target verbs mainly utilized in this study, although it may not be exhaustive. The related discriminations from similar action verbs are cross-referenced with the list of action verbs compiled by [52]. Subsequently, this study extracted the measurable verbs from the learning objectives specified in the syllabus and assessed the complexity of these objectives by categorizing them into six categories based on their corresponding verbs.
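The verb-based categorization described above can be sketched as a simple lookup. This is only an illustrative sketch: the English verbs in `VERB_TO_LEVEL` are hypothetical stand-ins, not the study's Table 2 (which lists the target verbs actually used), and the function name `classify_objectives` is an assumption introduced here.

```python
from collections import Counter

# Revised Bloom's Taxonomy categories in ascending order of cognitive complexity.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]

# Hypothetical verb-to-level mapping for illustration only (not the study's Table 2).
VERB_TO_LEVEL = {
    "define": "remember",
    "list": "remember",
    "explain": "understand",
    "describe": "understand",
    "use": "apply",
    "implement": "apply",
    "compare": "analyze",
    "examine": "analyze",
    "assess": "evaluate",
    "judge": "evaluate",
    "design": "create",
    "develop": "create",
}


def classify_objectives(verbs):
    """Count extracted verbs per Bloom level; verbs not in the mapping are skipped."""
    counts = Counter({level: 0 for level in BLOOM_LEVELS})
    for verb in verbs:
        level = VERB_TO_LEVEL.get(verb.lower())
        if level is not None:
            counts[level] += 1
    return counts


# Hypothetical verbs extracted from the learning objectives of a few syllabi.
extracted = ["Explain", "use", "implement", "compare", "design", "memorize"]
bloom_counts = classify_objectives(extracted)
for level in BLOOM_LEVELS:
    print(level, bloom_counts[level])
```

In practice, verbs that fall outside the mapping (here, "memorize") would be the ones sent to manual review rather than silently dropped.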

The syllabi utilized in this study were written in Chinese, which hindered the application of text mining tools to the topics covered but did not impede the automatic extraction of measurable verbs from the learning objectives. Specifically, a word in Chinese text consists of one or several characters, so a complete sentence must be decomposed into basic words before word frequencies can be calculated. For the analysis of learning objectives, the Jieba Chinese word segmentation tool (https://github.com/fxsjy/jieba, accessed on 8 May 2023) was utilized to extract measurable verbs. Subsequently, these verbs underwent manual review, filtering, and categorization into the six categories of Bloom’s Taxonomy.

An alternative method was employed to analyze the topics covered on a weekly basis in the course, diverging from the conventional content analysis approaches. Specifically, the bibliometric approach [54] was integrated with text mining techniques to glean more nuanced insights beyond simple topic frequency, so that the interactions between topics could be mapped onto a knowledge graph and visually presented. Other studies have used the domain coverage of a data science framework as the reference for the coding categories in content analysis. Instead of pre-determining these categories, this study adopted a text mining approach similar to unsupervised learning, in which the co-occurrence and relatedness of words between syllabi determine the popularity of topics and the distance on the map, respectively.
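The idea of deriving topic popularity and relatedness from co-occurrence can be illustrated with a short sketch. It only counts raw occurrences and pairwise co-occurrences across syllabi; it is not the algorithm of the mapping tool used in the study, which applies its own normalization and layout. The topic terms and the name `term_statistics` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical weekly-topic terms for three syllabi, already translated
# into English and normalized to consistent terminology.
syllabi_terms = [
    {"big data", "python", "visualization"},
    {"big data", "machine learning", "python"},
    {"big data", "visualization"},
]


def term_statistics(documents):
    """Return (occurrences, cooccurrences) counted once per syllabus."""
    occurrences = Counter()
    cooccurrences = Counter()
    for terms in documents:
        occurrences.update(terms)
        # Sort so each unordered pair always has the same key.
        for pair in combinations(sorted(terms), 2):
            cooccurrences[pair] += 1
    return occurrences, cooccurrences


occurrences, cooccurrences = term_statistics(syllabi_terms)
print(occurrences.most_common())
print(cooccurrences.most_common())
```

Occurrence counts correspond to topic popularity, while pairs that co-occur in many syllabi would be placed closer together on a map.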

To analyze the coverage of topics, the VOS Viewer [55] was utilized as the primary tool. The weekly topics written in Chinese were translated into English, maintaining consistent terminology throughout. Subsequently, VOS Viewer was employed to conduct an analysis of word frequencies and generate a knowledge graph. This approach was adopted to circumvent the usage of the Chinese word segmentation tool, which exhibited unsatisfactory performance in the initial trials.

4. Results

4.1. Instructional Materials

Each textbook, whether required, recommended, or listed as a reference in the syllabi, constituted an individual unit of analysis for instructional materials. Among the 60 syllabi examined, a total of 173 instructional material units were identified. Notably, 26 syllabi featured only one textbook, while 9 courses incorporated more than five materials. Table 3 provides a detailed breakdown of these instructional materials by their titles. The table also indicates the quantity of materials incorporated within each syllabus and the frequency of syllabi adopting each material type. To illustrate, if a syllabus includes three Python-related textbooks, the Python category is tabulated as three distinct materials associated with one syllabus.
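The two tallies described above (number of materials versus number of syllabi per category) can be sketched as follows. The syllabus identifiers and coded categories are hypothetical and serve only to show the counting rule.

```python
from collections import Counter

# Hypothetical coded data: syllabus id -> category of each listed material.
coded_materials = {
    "S01": ["Python", "Python", "Python"],
    "S02": ["Lecture notes"],
    "S03": ["Popular book", "Python"],
}

material_counts = Counter()  # every listed material counts once
syllabus_counts = Counter()  # each category counts once per syllabus

for categories in coded_materials.values():
    material_counts.update(categories)
    syllabus_counts.update(set(categories))

for category in sorted(material_counts):
    print(category, material_counts[category], syllabus_counts[category])
```

Here syllabus S01 contributes three materials but only one syllabus to the Python category, matching the tabulation rule in the text.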

By number of materials, the most frequently listed type in the syllabi was popular books, followed by Python-related materials, articles, and technical reports. By number of syllabi, the most prevalent categories of instructional materials were Python, lecture notes, and popular books. Notably, 26 articles and technical reports were listed by only three of the identified courses. These findings indicate that data science courses predominantly focus on analytical techniques, emphasizing the establishment of a comprehensive knowledge base encompassing both conceptual fundamentals and practical technical skills.

Observationally, the prevailing trend in these courses involves the utilization of lecture notes provided directly by instructors rather than relying on traditional textbooks. This practice stands in contrast to the common approach found among instructors in various disciplines who typically integrate foundational textbooks as central course materials [56]. Furthermore, the additional articles and resources suggested for students are not explicitly enumerated within the syllabi; instead, they are conveniently hosted within a learning management system like Moodle. This approach contrasts with practices observed in American universities, where instructors often use personal blogs as platforms for sharing additional learning materials [57].

4.2. Assessment Techniques

A course’s overall grade is determined through diverse assessment techniques with varying weights. As detailed in Table 4, six categories of assessment techniques are presented. The percentage was derived by dividing the number of syllabi employing a particular assessment technique by the total number of syllabi analyzed.
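The percentage calculation described above can be sketched as follows. The four syllabi and their technique labels are hypothetical and do not reproduce the values in Table 4.

```python
from collections import Counter

# Hypothetical data: the assessment techniques named in each syllabus.
syllabi_techniques = [
    ["participation", "assignment", "project"],
    ["participation", "exam"],
    ["participation", "assignment", "exam"],
    ["project", "assignment"],
]


def adoption_rates(syllabi):
    """Percentage of syllabi that use each technique at least once."""
    total = len(syllabi)
    counts = Counter()
    for techniques in syllabi:
        counts.update(set(techniques))  # a syllabus counts once per technique
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


rates = adoption_rates(syllabi_techniques)
```

Because a syllabus can use several techniques, the resulting percentages are not expected to sum to 100.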

According to Table 4, data science courses in general education primarily use classroom participation, hands-on assignments, and term projects to grade their students. More than half of the courses also use exams to check the effectiveness of learning. General education focuses not only on the learning of basic knowledge but also on teamwork and communication through course activities. Some courses also adopt problem-based learning. Therefore, participation in course activities is highly valued in these courses. Given that data analysis involves hands-on skills, many of the course assignments are related to programming and are required to be demonstrated in the classroom. Most courses include a term project in which a written or oral report is delivered by a team of students, illustrating how learning can be applied to real-life problems.

4.3. Bloom’s Taxonomy of Cognitive Learning Objectives

A text mining process was conducted to extract the action verbs used to describe the learning objectives from the 60 data science course syllabi, resulting in a total of 68 verbs. The top 20 most frequent action verbs are listed in Table 5, along with their corresponding numerical occurrences and weighted percentage values. Among these 68 verbs, 29 of them fall under the category of measurable verbs in Bloom’s Taxonomy. Some common verbs such as “learn”, “research”, and “understand” were found but were not in alignment with the criteria, and thus were removed. Table 6 catalogues the 29 verbs, offering a comprehensive breakdown of data science learning objectives in relation to cognitive processes across six levels, along with their respective frequency of occurrence. The data suggest that the majority of the learning objectives were expressed in the context of applying and analyzing cognitive processes, with 116 (44.8%) and 93 (35.9%) occurrences, respectively. Conversely, the higher-order cognitive processes of evaluating and creating were less frequently employed to describe the learning objectives, with only 5 (1.9%) and 31 (12%) occurrences, respectively. The lower-order cognitive processes of remembering and understanding were similarly less frequently used to describe learning goals: both were used seven times (2.7%). The findings of the analysis conducted based on Bloom’s Taxonomy reveal that the learning objectives for data science in general education tend to be predominantly of moderate complexity.
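The percentages reported above can be re-derived from the occurrence counts given in the text. The following sketch only performs that arithmetic; the function name is illustrative.

```python
# Occurrence counts per cognitive level, as reported in the paragraph above.
bloom_counts = {
    "remembering": 7,
    "understanding": 7,
    "applying": 116,
    "analyzing": 93,
    "evaluating": 5,
    "creating": 31,
}


def level_shares(counts):
    """Share of all verb occurrences that falls in each level, in percent."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


total_occurrences = sum(bloom_counts.values())
shares = level_shares(bloom_counts)
# Combined share of the two middle levels (applying and analyzing).
middle_share = round(
    100 * (bloom_counts["applying"] + bloom_counts["analyzing"]) / total_occurrences, 1
)
```

The six counts sum to 259 occurrences, and applying and analyzing together account for roughly four fifths of them, which is the basis for describing the objectives as being of moderate complexity.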

4.4. A Knowledge Map of the Topics Covered

The topics covered in the course were scrutinized based on weekly listings in the syllabi, with a standard semester duration of 18 weeks in Taiwanese universities. Common themes in the 1st, 9th, and 18th weeks, including course introductions, midterm, and final exams, were excluded from the analysis. Employing bibliometrics and text mining for syllabus examination, the weekly topics were amalgamated into a paragraph akin to an abstract, with the course title serving as the literature title. Subsequently, the bibliometric tool VOSviewer facilitated the analysis. The identification of common topics involved calculating the co-occurrence of terms and total link strength in the syllabi. Co-occurrence denotes the number of syllabi featuring a term at least once, while total link strength signifies the co-word links between a topic in one syllabus and others across all syllabi. Utilizing these data, a network topology knowledge map was generated through cluster analysis.
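The two measures defined above can be illustrated with a simplified sketch. It assumes binary counting (a term counts once per syllabus) and a fixed, hand-picked vocabulary matched by substring; the tool used in the study performs its own term extraction, so this is an approximation of the idea rather than a reproduction of its procedure. The toy syllabi are hypothetical.

```python
from collections import Counter
from itertools import combinations

EXCLUDED_WEEKS = {1, 9, 18}  # course introduction, midterm, final exam


def syllabus_to_document(weekly_topics):
    """Merge weekly topics (week number -> text) into one paragraph, skipping excluded weeks."""
    kept = (text for week, text in sorted(weekly_topics.items())
            if week not in EXCLUDED_WEEKS)
    return " ".join(kept)


def terms_in(document, vocabulary):
    """Vocabulary terms that appear at least once (naive substring match)."""
    lowered = document.lower()
    return {term for term in vocabulary if term in lowered}


def occurrences_and_link_strength(documents, vocabulary):
    occurrences = Counter()   # number of syllabi featuring each term
    link_weights = Counter()  # number of syllabi in which a pair of terms co-occurs
    for doc in documents:
        present = sorted(terms_in(doc, vocabulary))
        occurrences.update(present)
        for a, b in combinations(present, 2):
            link_weights[(a, b)] += 1
    total_link_strength = Counter()
    for (a, b), weight in link_weights.items():
        total_link_strength[a] += weight
        total_link_strength[b] += weight
    return occurrences, total_link_strength


# Hypothetical syllabi, for illustration only.
syllabi = [
    {1: "Course introduction", 2: "Big data applications", 3: "Python basics", 9: "Midterm exam"},
    {1: "Introduction", 2: "Python and machine learning", 18: "Final exam"},
    {2: "Big data trends", 3: "Machine learning with Python"},
]
vocabulary = ["big data", "python", "machine learning"]

documents = [syllabus_to_document(s) for s in syllabi]
occurrences, total_link_strength = occurrences_and_link_strength(documents, vocabulary)
```

In this toy example, `python` appears in all three syllabi and is linked to both other terms, so it has the highest values on both measures.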

Figure 2 depicts the knowledge map derived from the analysis. Each circle denotes a topic, with similar topics grouped into clusters marked by distinct colors. Larger circles signify higher co-occurrence across syllabi, emphasizing the importance of the topic in the course. The links and distances between circles indicate the relevance strength of topics, varying based on whether they are from the same or different syllabi. In Figure 2, four distinct clusters are evident, representing major thematic areas across syllabi. These clusters, in descending order of prominence, include big data (red), data analysis (green), data (blue), and exercise (yellow).

The red cluster, situated on the left side, densely covers topics related to big data, applications, life, review, change, design, research, discussion, trends, Excel, decision-making, and big data projects. From the conceptual framework encapsulated by these terms, the “big data competency domain” was formulated. This domain underscores the emphasis within data science general education courses on honing applied skills, translating insights into trends, fostering capabilities in decision-making and discussions, engaging in research and development, and facilitating design and integration with real-life contexts.

The green cluster at the bottom emphasizes data analysis methodologies, encompassing topics like data analysis, machine learning, Python, case studies, applications, use, big data analysis, and classification, forming the “analytical techniques domain”.

The blue cluster at the top right of the graph mainly comprises the small-sized circles of function, exception handling, matplotlib, list, loops, and deep learning. This cluster is relatively separate from the others, but the largest circle located at the center of the graph, data, belongs to this cluster. These topics, centered around programming languages and specific packages for data manipulation and plotting, constitute the “programming competency domain”. These findings indicate a level of specialization in certain general education data science courses, reminiscent of computer programming courses. Notably, Python, encompassing pandas and matplotlib, emerges as the primary programming language taught in these courses, while terminologies associated with R, another language commonly taught in data science, are absent.

The smallest cluster, marked in yellow at the graph’s center, encompasses exercise and case study, reflecting the pedagogy and learning approach. This is referred to as the “teaching strategies domain”, suggesting that insights in many data science general education courses are cultivated through case studies, while analytical expertise, such as programming, is emphasized through practical exercises.

To highlight distinctions among the four domains, Table 7 presents the top 15 emphasized terms, including their co-occurrences and total link strength, which are calculated through bibliometric techniques. Among these most emphasized topics, six belong to the big data competency domain, six belong to the analytical techniques domain, two belong to the teaching strategies domain, and one belongs to the programming competency domain. Table 7 also shows a total of 52 topics identified in this study, with the big data competency and analytical techniques domains accounting for 17 and 16 topics, respectively, followed by the programming competency (11) and teaching strategies (8) domains. Unsurprisingly, programming skills account for the lowest percentage in introductory data science courses, and the teaching strategies domain is also less frequently mentioned in the syllabi. Among the topic terms, data has the highest co-occurrence, appearing in 29 of the 60 syllabi. In other words, the core essence of the data science course is data. Meanwhile, the co-occurrence of big data is 21, higher than the 10 for data science, which implies that big data may be of more interest to students from different backgrounds.

5. Discussion

This study, which is centered on university courses in Taiwan, provides insights that can be compared with analogous studies in different regions, shedding light on variations and establishing common ground for pedagogical considerations. The study uncovers a significant difference in the utilization rates of R and Python as tools, with rates standing at 6.7% and 28.3%, respectively. This disparity contrasts sharply with the observations of Schwab-McCoy et al. [14] in their examination of university courses in the United States, where the adoption rates of R and Python were 52.2% and 34.8%, respectively. The preference for using R or Python may be influenced by the backgrounds of instructors, whether they are statisticians or computer scientists. Data science courses in Taiwan clearly lean towards nurturing computer science talent, reflecting Taiwan’s emphasis on information and communication technology over mathematics and statistics in education. This is further supported by the higher number of university students majoring in computer-related disciplines [58].

The analysis of assessment techniques shows that the three most used are participation, assignments, and projects. Participation involves aspects like class attendance and interaction with instructors, serving as an engagement indicator despite data science courses being electives. Attendance is considered crucial for student success, reflecting their commitment [59]. To boost interest, many data science courses incorporate pedagogies like problem-based learning [60], gamification [61], and experiential learning [30]. While the syllabi may not explicitly outline these approaches, the emphasis on participation suggests their value. Projects often entail collaborative final assignments, fostering communication and coordination skills that are vital for this interdisciplinary field [29]. While general education does not aim to shape everyone into data scientists, it lays the foundation for potential interdisciplinary collaborations in the future.

Analyzing learning objectives in tandem with Bloom’s taxonomy of cognitive processes aids in inferring the course difficulty. This study reveals that data science in general education prioritizes applying and analyzing knowledge over simple recall and comprehension. This emphasis implies an expectation for students to apply their knowledge in problem-solving, decision-making, and drawing conclusions, which requires higher-level cognitive processing and makes the subject more challenging. The perceived difficulty of a course tends to increase with the elevation of cognitive processes. For instance, a data science course emphasizing the creation of intricate models and algorithms for prediction and analysis may be perceived as more challenging compared to a course concentrating solely on basic data operations and visualization.

To accommodate the diverse undergraduate backgrounds of students, it is essential to tailor data science education to their proficiency levels by defining suitable learning objectives. This customization aims to prevent demotivation stemming from excessively challenging courses, whether in general education or professional contexts. Furthermore, competencies emphasized in general education, such as critical thinking, problem-solving, and effective team communication, can be seamlessly integrated into data science education through designated activities such as discussions, presentations, and projects. This integration facilitates the achievement of desired educational outcomes.

Comparing the studies on course topics reveals both shared elements and distinctive features, despite variations in research methodologies and subjects. The 18-week course analysis employed a text mining and bibliometrics approach, yielding a knowledge map encompassing the big data competency, analytical techniques, programming competency, and teaching strategies domains. In contrast, Schwab-McCoy et al.’s [14] questionnaire-based study identified 34 predetermined topics, with top themes including data visualization, data cleaning, professional ethics, and the data science life cycle. Notably, the programming competency domain incorporates matplotlib and pandas, aligning with Schwab-McCoy et al.’s emphasis on data visualization and cleaning. This underscores a consensus in data science instruction regarding the utilization of programming languages for data manipulation. While the study’s big data competency domain centers on conceptual and practical applications, it partially relates to the data science life cycle. Notably, professional ethics was absent in the findings, warranting attention from Taiwan’s educators to align with the growing emphasis on ethics education in data science [62].

Additionally, topics covered in general education data science courses markedly differ from those in non-general education introductory courses. Friedman [47] conducted an analysis of 35 U.S. higher education big data course syllabi, primarily sourced from departments in Information Technology (e.g., Computer Science, Engineering, and Information Science) and Business. Among these, key topics in non-general education courses centered on technical architectures and algorithms, including big data infrastructure, Data-Driven Application Systems (Hadoop MapReduce and Spark, R, and Python), and Statistical Analytics. This indicates a focus on technology-centric data science education, emphasizing aspects like storage media, network infrastructure, and Hadoop MapReduce. In contrast, general education courses go beyond practical applications and data processing skills. They prioritize developing students’ data science literacy, aiming to enhance their ability to comprehend user demands and improve data and model interpretation. Consequently, data science courses in general education align more with human-centered data science education [63,64].

6. Conclusions

This research offers a comprehensive analysis of the current landscape of data science courses within the realm of general education and their focus on competency development. Conducting an extensive survey encompassing 60 data science syllabi from all universities in Taiwan, this study meticulously compiled information on instructional materials, assessment techniques, learning objectives, and covered topics.

The instructional materials employed in these courses exhibited significant diversity, extending beyond conventional professional textbooks. Notably, a considerable proportion included popular books, while instances of shared textbooks were minimal. The predominant type of textbook frequently centered around the utilization of Python programming for data analysis.

Contemporary classroom assessment techniques have transcended traditional examination formats. The majority of courses prioritize assessment methods such as participation, assignments, and projects for grading. These multifaceted techniques enable students to cultivate competencies beyond foundational knowledge, encompassing interactive communication, teamwork, practical skills, and real-world applications.

Analyzing learning objectives through Bloom’s Taxonomy levels unveils a focus on moderate complexity, which is primarily centered on applying and analyzing cognitive processes. Students are expected to apply their data science knowledge to real-world problem-solving and analytical reasoning, surpassing mere recall or understanding.

A bibliometric and text mining analysis of the syllabi resulted in the identification of four clusters, forming the basis for defining four domains within data science syllabi in general education: the big data competency, analytical techniques, programming competency, and teaching strategies domains. Within these domains, the big data competency domain exhibited the highest consensus among the syllabi and emerged as the most emphasized aspect in the general education curriculum. In contrast, the general education curriculum revealed the least consensus regarding the skills essential for data scientists in the analytical techniques and programming competency domains. Certain courses diverged from introductory levels, placing emphasis on advanced Python programming. Nevertheless, the significance of data visualization [65,66] within various frameworks of data science education is only modestly reflected in the programming competency domain. Conversely, issues pertaining to ethics [67,68,69] were not explicitly addressed, possibly due to the relatively limited instructional time dedicated to these topics. It is conceivable that these subjects are explored in greater depth in courses beyond the introductory level.

This novel endeavor, scrutinizing syllabi through the lens of general education competencies, yields valuable insights for the ongoing improvement of courses across universities. The outcomes of this study hold significance for academics and professionals alike, aiding in the development of introductory data science and training courses catering to diverse disciplinary backgrounds.

7. Limitations and Future Research

This paper acknowledges several limitations, suggesting avenues for future research. The inclusion of all university general education syllabi in Taiwan, while comprehensive, did not filter out those with poor writing quality. Some syllabi provided limited content descriptions, featuring repetitive topics and instances where different instructors offered identical courses. These observations may lead to slight discrepancies in quantitative measurements.

The study focused on the instructional materials, assessment techniques, learning objectives, and covered topics, excluding learning outcomes and pedagogy due to inconsistencies in formatting and detail levels. Learning outcomes often presented broad concepts applicable to diverse courses, such as the capacity to analyze big data and enhance employability, while pedagogical methods were minimally specified. Consequently, the analysis results may align more with the overarching objectives common to many courses, including teamwork and hands-on experience, rather than being distinctly tailored to data science courses. Regarding pedagogy, only three syllabi explicitly mention the use of problem-based learning, with no specific disclosure of student-centered, lecture, experiential learning, or game-based learning pedagogies in the remaining syllabi.

To address this challenge, future research endeavors should consider integrating syllabus quality assessments, utilizing scales like Palmer et al.’s scoring rubric for evaluation and selection [47,70]. Additionally, forthcoming studies may explore and compare the focal points of data science education across different countries, regions, and academic systems.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/educsci14070763/s1, File S1: Sample syllabus; File S2: Coding of the analysis.

Funding

This research was funded by the Ministry of Education of Taiwan (grant number PBM1123243).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The author declares no conflicts of interest.

References

Jiang, H.; Chen, C. Data Science Skills and Graduate Certificates: A Quantitative Text Analysis. J. Comput. Inf. Syst. 2022, 62, 463–479. [Google Scholar] [CrossRef]
Oliver, J.C.; McNeil, T. Undergraduate Data Science Degrees Emphasize Computer Science and Statistics but Fall Short in Ethics Training and Domain-Specific Context. PeerJ Comput. Sci. 2021, 7, e441. [Google Scholar] [CrossRef] [PubMed]
Dichev, C.; Dicheva, D. Towards Data Science Literacy. Procedia Comput. Sci. 2017, 108, 2151–2160. [Google Scholar] [CrossRef]
Urs, S.R.; Minhaj, M. Evolution of Data Science and Its Education in iSchools: An Impressionistic Study Using Curriculum Analysis. J. Assoc. Inf. Sci. Technol. 2022, 74, 606–622. [Google Scholar] [CrossRef]
Aasheim, C.; Williams, S.; Rutner, P.; Gardiner, A. Data Analytics vs. Data Science: A Study of Similarities and Differences in Undergraduate Programs Based on Course Descriptions. J. Inf. Syst. Educ. 2015, 26, 103–116. [Google Scholar]
Anderson, P.; McGuffee, J.; Uminsky, D. Data Science as an Undergraduate Degree. In Proceedings of the 45th ACM technical Symposium on Computer Science Education, Atlanta, GA, USA, 5–8 March 2014; pp. 705–706. [Google Scholar]
De Veaux, R.D.; Agarwal, M.; Averett, M.; Baumer, B.S.; Bray, A.; Bressoud, T.C.; Bryant, L.; Cheng, L.Z.; Francis, A.; Gould, R.; et al. Curriculum Guidelines for Undergraduate Programs in Data Science. Annu. Rev. Stat. Appl. 2017, 4, 15–30. [Google Scholar] [CrossRef]
National Academies of Sciences, Engineering, and Medicine; Division of Behavioral and Social Sciences and Education; Board on Science Education; Division on Engineering and Physical Sciences; Committee on Applied and Theoretical Statistics; Board on Mathematical Sciences and Analytics; Computer Science and Telecommunications Board; Committee on Envisioning the Data Science Discipline: The Undergraduate Perspective. Data Science for Undergraduates: Opportunities and Options; National Academies Press (US): Washington, DC, USA, 2018; ISBN 978-0-309-47559-4. [Google Scholar]
Donoho, D. 50 Years of Data Science. J. Comput. Graph. Stat. 2017, 26, 745–766. [Google Scholar] [CrossRef]
Demchenko, Y.; Jose, C.G.J.; Brewer, S.; Wiktorski, T. EDISON Data Science Framework (EDSF): Addressing Demand for Data Science and Analytics Competences for the Data Driven Digital Economy. In Proceedings of the 2021 IEEE Global Engineering Education Conference (EDUCON), Vienna, Austria, 21–23 April 2021; pp. 1682–1687. [Google Scholar]
ACM Data Science Task Force. Computing Competencies for Undergraduate Data Science Curricula; ACM: New York, NY, USA, 2021; ISBN 978-1-4503-9060-6. [Google Scholar]
Kakeshita, T.; Ishii, K.; Ishikawa, Y.; Matsubara, H.; Matsuo, Y.; Murata, T.; Nakano, M.; Nakatani, T.; Okumura, H.; Takahashi, N.; et al. Development of IPSJ Data Science Curriculum Standard. In Digital Transformation of Education and Learning—Past, Present and Future, Proceedings of the IFIP TC 3 Open Conference on Computers in Education, OCCE 2021, Tampere, Finland, 17–20 August 2021; Springer Nature Switzerland AG: Cham, Switzerland, 2022; Volume 642, p. 156. [Google Scholar]
Irizarry, R.A. The Role of Academia in Data Science Education. Harv. Data Sci. Rev. 2020, 2. [Google Scholar] [CrossRef]
Schwab-McCoy, A.; Baker, C.M.; Gasper, R.E. Data Science in 2020: Computing, Curricula, and Challenges for the next 10 Years. J. Stat. Data Sci. Educ. 2021, 29, S40–S50. [Google Scholar] [CrossRef]
Overton, M.; Kleinschmit, S. Data Science Literacy: Toward a Philosophy of Accessible and Adaptable Data Science Skill Development in Public Administration Programs. Teach. Public Admin. 2021, 40, 014473942110049. [Google Scholar] [CrossRef]
Fréchet, N.; Savoie, J.; Dufresne, Y. Analysis of Text-Analysis Syllabi: Building a Text-Analysis Syllabus Using Scaling. APSC 2020, 53, 338–343. [Google Scholar] [CrossRef]
Herzog, P.S.; Ai, J.; Ashton, J. Applying Bibliometric Techniques: Studying Interdisciplinarity in Higher Education Curriculum. Computation 2022, 10, 26. [Google Scholar] [CrossRef]
Föll, P.; Thiesse, F. Exploring Information Systems Curricula: A Text Mining Approach. Bus. Inf. Syst. Eng. 2021, 63, 711–732. [Google Scholar] [CrossRef]
Klee, S.; Janson, A.; Leimeister, J.M. How Data Analytics Competencies Can Foster Business Value—A Systematic Review and Way Forward. Inf. Syst. Manag. 2021, 38, 200–217. [Google Scholar] [CrossRef]
Hattingh, M.; Marshall, L.; Holmner, M.; Naidoo, R. Data Science Competency in Organisations: A Systematic Review and Unified Model. In Proceedings of the South African Institute of Computer Scientists and Information Technologists 2019 on ZZZ–SAICSIT ’19, Skukuza, South Africa, 17–18 September 2019; pp. 1–8. [Google Scholar]
Pratsri, S.; Nilsook, P.; Wannapiroon, P. Synthesis of Data Science Competency for Higher Education Students. Int. J. Educ. Inf. Technol. 2022, 16, 101–109. [Google Scholar] [CrossRef]
Cao, L. Data Science: A Comprehensive Overview. ACM Comput. Surv. 2018, 50, 1–42. [Google Scholar] [CrossRef]
Hubwieser, P.; Sentance, S. Taxonomies and Competency Models. In Computer Science Education: Perspectives on Teaching and Learning in School; Bloomsbury Academic: London, UK, 2018; pp. 221–242. [Google Scholar]
Baumer, B. A Data Science Course for Undergraduates: Thinking with Data. Am. Stat. 2015, 69, 334–342. [Google Scholar] [CrossRef]
Yan, D.; Davis, G.E. A First Course in Data Science. J. Statist. Educ. 2019, 27, 99–109. [Google Scholar] [CrossRef]
Çetinkaya-Rundel, M.; Ellison, V. A Fresh Look at Introductory Data Science. J. Stat. Data Sci. Educ. 2021, 29, S16–S26. [Google Scholar] [CrossRef]
Lasser, J.; Manik, D.; Silbersdorff, A.; Säfken, B.; Kneib, T. Introductory Data Science across Disciplines, Using Python, Case Studies, and Industry Consulting Projects. Teach. Stat. 2021, 43, S190–S200. [Google Scholar] [CrossRef]
Schuff, D. Data Science for All: A University-Wide Course in Data Literacy. In Analytics and Data Science: Advances in Research and Pedagogy; Deokar, A.V., Gupta, A., Iyer, L.S., Jones, M.C., Eds.; Annals of Information Systems; Springer International Publishing: Cham, Switzerland, 2018; pp. 281–297. ISBN 978-3-319-58097-5. [Google Scholar]
Asamoah, D.A.; Doran, D.; Schiller, S. Interdisciplinarity in Data Science Pedagogy: A Foundational Design. Journal of Computer Inf. Syst. 2020, 60, 370–377. [Google Scholar] [CrossRef]
Wong, N.; Kawash, J. An Introductory Multidisciplinary Data Science Course Incorporating Experiential Learning. In Data Management and Analysis; Alhajj, R., Moshirpour, M., Far, B., Eds.; Studies in Big Data; Springer International Publishing: Cham, Switzerland, 2020; Volume 65, pp. 33–49. ISBN 978-3-030-32586-2. [Google Scholar]
Parkes, J.; Fix, T.K.; Harris, M.B. What Syllabi Communicate about Assessment in College Classrooms. J. Excell. Coll. Teach. 2003, 14, 61–83. [Google Scholar]
Saville, B.K.; Zinn, T.E.; Brown, A.R.; Marchuk, K.A. Syllabus Detail and Students’ Perceptions of Teacher Effectiveness. Teach. Psychol. 2010, 37, 186–189. [Google Scholar] [CrossRef]
Eng, M.; Nicholls, J.; Mailloux, L. Tone and Style in Pharmacy Course Syllabi. Curr. Pharm. Teach. Learn. 2017, 9, 208–216. [Google Scholar] [CrossRef] [PubMed]
Tungare, M.; Yu, X.; Cameron, W.; Teng, G.; Pérez-Quiñones, M.A.; Cassel, L.; Fan, W.; Fox, E.A. Towards a Syllabus Repository for Computer Science Courses. SIGCSE Bull. 2007, 39, 55–59. [Google Scholar] [CrossRef]
Albers, C. Using the Syllabus to Document the Scholarship of Teaching. Teach. Sociol. 2003, 31, 60. [Google Scholar] [CrossRef]
Chong, F. The Pedagogy of Usability: An Analysis of Technical Communication Textbooks, Anthologies, and Course Syllabi and Descriptions. Tech. Commun. Q. 2016, 25, 12–28. [Google Scholar] [CrossRef]
Karanja, E.; Grant, D.M. Evaluating Learner-Centeredness Course Pedagogy in Project Management Syllabi Using a Content Analysis Approach. J. Inf. Syst. Educ. 2020, 31, 131–146. [Google Scholar]
Donnelly, J.; Winkelmann, K. Analysis of the Learning-Centeredness of Physical Chemistry Syllabi. J. Chem. Educ. 2021, 98, 1888–1897. [Google Scholar] [CrossRef]
Bejerano, A.R.; Bartosh, T.M. Learning Masculinity: Unmasking the Hidden Curriculum in Science, Technology, Engineering, and Mathematics Courses. J. Women Minor. Sci. Eng. 2015, 21, 107–124. [Google Scholar] [CrossRef]
Sulik, G.; Keys, J. “Many Students Really Do Not yet Know How to Behave!”: The Syllabus as a Tool for Socialization. Teach. Sociol. 2014, 42, 151–160. [Google Scholar] [CrossRef]
Carey, A.C.; Najarian Souza, C. Constructing the Sociology of Disability: An Analysis of Syllabi. Teach. Sociol. 2021, 49, 17–31. [Google Scholar] [CrossRef]
Glaser, B.G.; Strauss, A.L. The Discovery of Grounded Theory: Strategies for Qualitative Research; Routledge: London, UK, 2017. [Google Scholar]
Overton, M.; Kleinschmit, S. Transforming Research Methods Education through Data Science Literacy. Teach. Public Admin. 2022, 41, 014473942210844. [Google Scholar] [CrossRef]
Karanja, E.; Malone, L.C. Improving Project Management Curriculum by Aligning Course Learning Outcomes with Bloom’s Taxonomy Framework. J. Int. Educ. Bus. 2020, 14, 197–218. [Google Scholar] [CrossRef]
Bloom, B.S. Taxonomy of Educational Objectives, Handbook 1: Cognitive Domain, 2nd ed.; Addison-Wesley Longman Ltd.: London, UK, 1956; ISBN 978-0-582-28010-6. [Google Scholar]
Anderson, L.; Krathwohl, D.; Airasian, P.; Cruikshank, K.; Mayer, R.; Pintrich, P.; Raths, J.; Wittrock, M. Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives, 1st ed.; Pearson: New York, NY, USA, 2000; ISBN 978-0-8013-1903-7. [Google Scholar]
Friedman, A. Measuring the Promise of Big Data Syllabi. Technol. Pedagogy Educ. 2018, 27, 135–148. [Google Scholar] [CrossRef]
Kuhn, T.S. The Structure of Scientific Revolutions; University of Chicago Press: Chicago, IL, USA, 1970; Volume 111. [Google Scholar]
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. Syst. Rev. 2021, 10, 89. [Google Scholar] [CrossRef] [PubMed]
Weber, R. Basic Content Analysis; SAGE Publications, Inc.: Thousand Oaks, CA, USA, 1990; ISBN 978-0-8039-3863-2. [Google Scholar]
Bumpus, E.C.; Vinco, M.H.; Lee, K.B.; Accurso, J.F.; Graves, S.L. The Consistency of Expectations: An Analysis of Learning Objectives within Cognitive Assessment Course Syllabi. Teach. Psychol. 2022, 49, 30–36. [Google Scholar] [CrossRef]
Newton, P.M.; Da Silva, A.; Peters, L.G. A Pragmatic Master List of Action Verbs for Bloom’s Taxonomy. Front. Educ. 2020, 5, 107. [Google Scholar] [CrossRef]
Krathwohl, D.R. A Revision of Bloom’s Taxonomy: An Overview. Theory Into Pract. 2002, 41, 212–218. [Google Scholar] [CrossRef]
Zupic, I.; Čater, T. Bibliometric Methods in Management and Organization. Organ. Res. Methods 2015, 18, 429–472. [Google Scholar] [CrossRef]
van Eck, N.J.; Waltman, L. Software Survey: VOSviewer, a Computer Program for Bibliometric Mapping. Scientometrics 2010, 84, 523–538. [Google Scholar] [CrossRef] [PubMed]
Davis, B.G. Tools for Teaching; John Wiley & Sons: Hoboken, NJ, USA, 2009; ISBN 978-0-470-56945-0. [Google Scholar]
Friedman, A. Data Science Syllabi Measuring Its Content. Educ. Inf. Technol. 2019, 24, 3467–3481. [Google Scholar] [CrossRef]
Textor, C. Number of University Students Majoring in Education in Taiwan 2022, by Degree Program. Available online: https://www.statista.com/statistics/931223/taiwan-number-university-college-students-majoring-in-education-by-degree-program/ (accessed on 22 April 2024).
Moores, E.; Birdi, G.K.; Higson, H.E. Determinants of University Students’ Attendance. Educ. Res. 2019, 61, 371–387.
Saenphon, T.; Silpasuphakornwong, P. Problem-Based Learning in Data Science Course: Analysis of Online Learning during the COVID-19 Pandemic. In Proceedings of the 2022 13th International Conference on E-Education, E-Business, E-Management, and E-Learning (IC4E), Tokyo, Japan, 14–17 January 2022; pp. 132–137.
Turner, C. Learn2mine: Data Science Practice and Education through Gameful Experiences. IJEEEE 2014, 4.
Lewis, A.; Stoyanovich, J. Teaching Responsible Data Science: Charting New Pedagogical Territory. Int. J. Artif. Intell. Educ. 2022, 32, 783–807.
Aragon, C.; Guha, S.; Kogan, M.; Muller, M.; Neff, G. Human-Centered Data Science: An Introduction; MIT Press: Cambridge, MA, USA, 2022; ISBN 978-0-262-36759-2.
Shah, C.; Anderson, T.; Hagen, L.; Zhang, Y. An iSchool Approach to Data Science: Human-Centered, Socially Responsible, and Context-Driven. J. Assoc. Inf. Sci. Technol. 2021, 72, 793–796.
Aparicio, M.; Costa, C.J. Data Visualization. Commun. Des. Q. Rev. 2015, 3, 7–11.
Nolan, D.; Perrett, J. Teaching and Learning Data Visualization: Ideas and Assignments. Am. Stat. 2016, 70, 260–269.
Fairfield, J.; Shtein, H. Big Data, Big Problems: Emerging Issues in the Ethics of Data Science and Journalism. J. Mass Media Ethics 2014, 29, 38–51.
Saltz, J.S.; Dewar, N. Data Science Ethical Considerations: A Systematic Literature Review and Proposed Project Framework. Ethics Inf. Technol. 2019, 21, 197–208.
Baumer, B.S.; Garcia, R.L.; Kim, A.Y.; Kinnaird, K.M.; Ott, M.Q. Integrating Data Science Ethics into an Undergraduate Major: A Case Study. J. Stat. Data Sci. Educ. 2022, 30, 15–28.
Palmer, M.S.; Bach, D.J.; Streifer, A.C. Measuring the Promise: A Learning-Focused Syllabus Rubric. Improv. Acad. 2014, 33, 14–36.

Figure 1. PRISMA flowchart for search and syllabi screening process.

Figure 2. Knowledge map obtained by analyzing the topics covered in each week.

Table 2. Summary of revised Bloom’s taxonomy levels and actions.

Level	Construct	Example Actions
1	Remembering	Recognize, recall
2	Understanding	Summarize, compare, explain
3	Applying	Execute, implement
4	Analyzing	Differentiate, organize, attribute
5	Evaluating	Check, critique
6	Creating	Generate, plan, produce
Source. As described by [53].

Table 3. Frequency of instructional materials and their presence in syllabi.

Type	Number of Materials	Number of Syllabi
Python	27	17
Data science	5	4
Internet of things	1	1
Visualization	1	1
Power BI	7	4
Statistics	1	1
Excel	6	1
R	10	4
SPSS	4	2
STATA	3	3
SAS	4	2
AI DL	2	2
DM ML	9	4
Others	14	10
Articles and technical reports	26	3
Popular books	41	12
Lecture notes	12	12

Table 4. Percentage of courses incorporating each type of assessment technique.

Assessment Techniques	Percentage of Courses
Participation	67%
Assignment	65%
Project	65%
Exam	55%
Discussion and Presentation	35%
Quizzes	15%

Table 5. Frequency of top 20 verbs used to describe learning objectives in data science syllabi.

Ranking	Learning Objectives Verbs	Count
1	Analyze	90
2	Learn	43
3	Apply	58
4	Research	21
5	Understand	24
6, 7	Possess, Use	14
8, 9	Conduct, Develop	13
10	Decide	12
11, 12	Master, Create	11
13, 14, 15, 16	Complete, Implement, Process, Solve	10
17, 18	Design, Introduce	9
19, 20	Operate, Provide	8

Table 6. Verbs used, their frequency of occurrence, and corresponding Bloom’s taxonomy level in learning objectives of data science syllabi.

Level of Revised Bloom’s Taxonomy	Verbs	Occurrence	Total/Weighted (%)
Remembering	Choose	3	7 (2.7%)
	Explain	2
	Describe	2
Understanding	Classify	3	7 (2.7%)
	Interpret	2
	Infer	2
Applying	Apply	58	116 (44.8%)
	Use	14
	Solve	10
	Develop	11
	Operate	8
	Compute	5
	Construct	5
	Calculate	3
	Demonstrate	1
	Link	1
Analyzing	Analyze	90	93 (35.9%)
	Illustrate	2
	Compare	1
Evaluating	Review	2	5 (1.9%)
	Evaluate	1
	Associate	1
	Predict	1
Creating	Process	10	31 (12.0%)
	Design	9
	Collect	5
	Organize	5
	Arrange	1
	Modify	1
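
The weighted percentages in Table 6 can be reproduced from the verb occurrences alone. The following is a minimal sketch, assuming each percentage is the level total divided by the sum of all listed occurrences (259); the names `verb_counts` and `level_shares` are illustrative and do not come from the study.

```python
# Verb occurrences per revised Bloom's taxonomy level, transcribed from Table 6.
verb_counts = {
    "Remembering": {"Choose": 3, "Explain": 2, "Describe": 2},
    "Understanding": {"Classify": 3, "Interpret": 2, "Infer": 2},
    "Applying": {
        "Apply": 58, "Use": 14, "Solve": 10, "Develop": 11, "Operate": 8,
        "Compute": 5, "Construct": 5, "Calculate": 3, "Demonstrate": 1, "Link": 1,
    },
    "Analyzing": {"Analyze": 90, "Illustrate": 2, "Compare": 1},
    "Evaluating": {"Review": 2, "Evaluate": 1, "Associate": 1, "Predict": 1},
    "Creating": {
        "Process": 10, "Design": 9, "Collect": 5, "Organize": 5,
        "Arrange": 1, "Modify": 1,
    },
}


def level_shares(counts):
    """Return {level: (total occurrences, share of all occurrences in %)}."""
    totals = {level: sum(verbs.values()) for level, verbs in counts.items()}
    grand_total = sum(totals.values())
    return {
        level: (total, round(100 * total / grand_total, 1))
        for level, total in totals.items()
    }


shares = level_shares(verb_counts)
for level, (total, pct) in shares.items():
    print(f"{level}: {total} ({pct}%)")
```

Under this assumption the computed totals and shares agree with the last column of Table 6, with Applying and Analyzing together accounting for roughly four fifths of all verb occurrences.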

Table 7. Distribution and count of the top 15 most emphasized topic terms.

Domains	Percentage of Topics (n = 52)	Topic Terms	Co-Occurrence	Total Link Strength
Big data competency	32.7% (17/52)	Big data	21	118
		Analysis	19	112
		Application	16	84
		Presentation	8	66
		Review	11	65
		Life	8	57
Analytical techniques	30.8% (16/52)	Data analysis	16	107
		Machine learning	10	95
		Example	9	79
		Case	6	64
		Applications	9	63
		Python	8	62
Programming competency	21.2% (11/52)	Data	29	191
Teaching strategies	15.4% (8/52)	Exercise	11	82
		Data science	10	59
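
The domain percentages in Table 7 follow directly from the topic counts given in parentheses. As a brief check, and only as an illustrative sketch, the shares can be recomputed and ranked as below; the domain labels follow the abstract's wording, and the variable names are hypothetical.

```python
# Topic counts per domain, transcribed from the fractions in Table 7.
domain_topics = {
    "Big data competency": 17,
    "Analytical techniques": 16,
    "Programming competency": 11,
    "Teaching strategies": 8,
}

# The counts should add up to the n = 52 topics stated in the table header.
n_topics = sum(domain_topics.values())

# Share of topics per domain, in percent, rounded to one decimal place.
domain_shares = {
    domain: round(100 * count / n_topics, 1)
    for domain, count in domain_topics.items()
}

# Domains in descending order of share.
ranked = sorted(domain_shares.items(), key=lambda item: item[1], reverse=True)
for domain, pct in ranked:
    print(f"{domain}: {pct}%")
```

The resulting order matches the descending order of domains reported in the abstract.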

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
