Article

Clean Customer Master Data for Customer Analytics: A Neglected Element of Data Monetization

1
Fraunhofer Center for International Management and Knowledge Economy, Lipanum, Martin-Luther-Ring 13, 04109 Leipzig, Germany
2
Business Administration (FEK), Linköping University, 581 83 Linköping, Sweden
3
Institute of Technology Management, University of St. Gallen, Dufourstrasse 40a, 9000 St. Gallen, Switzerland
*
Author to whom correspondence should be addressed.
Digital 2024, 4(4), 1020-1039; https://doi.org/10.3390/digital4040051
Submission received: 24 July 2024 / Revised: 12 November 2024 / Accepted: 26 November 2024 / Published: 13 December 2024

Abstract

Despite the demonstrable benefits of data monetization initiatives for achieving competitive advantages, many of these efforts struggle to realize their potential. Companies often find it challenging to sustain even initially successful data monetization initiatives due to formidable data quality issues. This reflects a disconnect between advancements in data monetization research—which range from digitization to digitalization and digital transformation—and their practical implementation within companies. Consequently, misguided approaches to data monetization are relatively common. A critical prerequisite for successful data monetization is the establishment and maintenance of clean, high-quality data. This study underscores the importance of data quality by conducting an in-depth analysis of Medical Inc., a company that prepares pristine customer master data for advanced customer analytics. The investigation aims to elucidate Medical Inc.’s approach for addressing data cleanliness challenges and developing a general framework for the process of cleansing customer master data. This framework illuminates a relatively unexplored aspect of data monetization, thereby supplementing existing research on digitization, digitalization, and digital transformation.

1. Introduction

Scholars and practitioners generally concur that companies must increase their involvement and investment in digitization, digitalization, and digital transformation [1]. Accordingly, data monetization has emerged as a principal justification for investments in these endeavors [2,3]. However, despite the proven benefits of data monetization, company success in leveraging data to achieve competitive advantages remains inconsistent. While some pioneering companies have successfully used data, substantial evidence indicates that many organizations continue to struggle with effective data utilization and data monetization [4,5,6]. These challenges are not confined to the monetization of “big” data and advanced analytics, as exemplified in artificial intelligence, but also extend to the monetization of basic data and data analytics.
The consensus among various studies is that only a minority of companies have achieved success with their data initiatives. Most companies experience only modest benefits, and some even incur losses from their advanced data analytics initiatives. Nevertheless, the few companies that successfully implement data monetization strategies are reported to outperform their less data-centric counterparts in terms of revenue growth, profit margins, and returns on equity [4,5,6].
This is particularly surprising because data monetization has rapidly emerged as a prominent research topic and dominates the agendas of many companies. Previous research has highlighted various approaches to data monetization, such as direct and indirect data monetization; elucidated the necessary conditions for successful data monetization initiatives; and identified use cases across various industries [7]. However, prior studies have overlooked two critical aspects. First, it remains unclear how companies’ legacy information and communication systems influence the outcomes of their data monetization initiatives. Legacy systems naturally shape the starting points of these initiatives by either inhibiting or facilitating company goals, roadmaps, strategic plans, and implementations. Legacy systems can impose burdens related to data quality, harmonization, and consistency, thereby delaying the achievement of expected outcomes in data initiatives [2,8].
Second, existing research frequently examines data monetization initiatives at the company level while emphasizing the importance of implementing lighthouse projects and developing concrete use cases to create and sustain momentum for data initiatives. The literature often highlights the successes of individual use cases without delving deeply into selected examples or exploring the reasons behind the failures of these use cases. This oversight is notable, given that many companies report difficulties during the implementation and eventual discontinuation of promising use cases [7].
We contend that the existing evidence and theoretical considerations indicate a significant gap in the understanding of how companies can effectively benefit from data. To address these fundamental issues, we investigated the following research question through a single case study of customer analytics in collaboration with a medical technology company, Medical Inc.: How should companies prepare their customer master data to monetize their customer analytics efforts?
In answering this question, we make three important contributions. First, we identify key activities that are necessary for the successful deployment of data initiatives in customer analytics. Rather than aiming for an exhaustive list, we focus on specific activities deemed most crucial for advancing data utilization and monetization through customer analytics. Second, we integrate these key activities into an overarching framework, illustrating how companies can advance their data and advanced analytics initiatives throughout their digitization and digital transformation efforts. Third, we reveal a gap in the 3D concept—digitization, digitalization, and digital transformation—and show that digitization alone is insufficient for achieving the next step of digital advancement [1,9].
This paper proceeds by first explicating the study’s theoretical background and then outlining the research methodology, which was applied in collaboration with our case company. We then present our findings, introducing the phase “datatization” to describe the company’s journey through digitization, digitalization, and digital transformation. This phase is translated into a nine-step framework that establishes digitalization as the best foundation for digital transformation. This paper concludes with a concise summary of the study’s primary findings and implications.

2. Theoretical Background

2.1. Digitization, Digitalization, and Digital Transformation

The present research is embedded in the general research on digitization, digitalization, and digital transformation [9,10]. Researchers define digitization as the conversion of analog signals into digital signals—essentially, separating the data from their medium and converting them into binary code, thus making them digitally available [9,11,12]. Digitalization is seen as the process of integrating digital technologies into organizational processes and consequently developing new value opportunities while simultaneously changing the way companies use the data [9,13,14]. Digital transformation is defined as the immersion of the whole enterprise into digital methods, which are not limited to processes and data but also touch on operations, the company’s business model, and the competencies of the firm [14]. These definitions implicate one another; digital transformation cannot occur without digitalization, and digitalization cannot happen without digitization [9].

2.2. Data Monetization

Data monetization is a critical component of this broader discourse on digitization, digitalization, and digital transformation. The role of data monetization in digitization, for example, involves creating raw digital data that form the basis for all subsequent data-driven strategies. For instance, digitized records, sensor data, and digital communication logs are essential for generating valuable insights. Data monetization drives digitalization by enabling companies to streamline operations, improve customer experiences, and enhance decision making through data analysis. Data are increasingly treated as a strategic asset in digital transformations. Organizations leverage data to drive innovation, gain competitive advantages, and create entirely new business models [2,4,5,6,7,8].
Data monetization is thus a continuous thread that runs through these stages, progressively enhancing the ways organizations derive value from their data. It ranges from merely converting and storing data (digitization) to using data to improve and innovate business processes (digitalization) and finally, to fundamentally transforming the business landscape to create new value propositions and revenue streams (digital transformation). Data monetization refers to the process through which tangible economic advantages are derived from the use of available data sources [8,15]. This process encompasses both direct and indirect methods, each capitalizing on data assets in distinctive ways to generate value. Direct data monetization primarily involves the sale or licensing of data to external parties. This can manifest in various forms, including data sales, in which companies collect, package, and vend data to third-party entities; data licensing, wherein companies grant other organizations the right to utilize their data under specified conditions; and participation in data marketplaces, whereby companies engage in the buying and selling of datasets. Conversely, indirect data monetization revolves around the use of data to optimize internal operations, refine products, or enhance services, thus yielding economic benefits [8,16]. Key methodologies in this realm include leveraging data analytics to facilitate more informed decision making, utilizing customer data to tailor services and foster heightened satisfaction and loyalty, employing data analysis to streamline operations and minimize costs (as exemplified by predictive maintenance in manufacturing), and leveraging data insights to drive product innovation [2,7].
Existing research articulates a structured approach to data monetization, which is delineated in four main phases: data collection; data cleansing and preparation; data analysis; and implementation, evaluation, and monitoring. The data collection phase entails gathering relevant data from diverse sources, such as customer interactions, transaction records, or Internet of Things devices. Subsequently, the data are refined and prepared for analysis, ensuring their quality and usability. In the analysis phase, various analytical tools and techniques are employed to derive actionable insights from the data. Finally, the implementation, evaluation, and monitoring phase focuses on deploying data-driven initiatives and continually assessing their outcomes to ensure that they align with the anticipated economic benefits [16,17].
A note on terminology: the terms “data cleaning” and “data cleansing” are often used interchangeably in data management and analytics. However, subtle distinctions between these terms can sometimes occur depending on the context or specific industry practices. “Data cleaning” generally refers to the process of identifying and rectifying errors or inconsistencies in data to improve their quality. This includes tasks such as correcting typographical errors, handling missing values, and removing duplicate records. Data cleaning typically involves surface-level tasks that address immediate, apparent issues in the data. “Data cleansing” is a more comprehensive process that involves ensuring that the data are not only free from errors but also accurate, consistent, and usable for their intended purpose. This might include verifying data against external sources, ensuring data integrity, and standardizing formats across datasets. Data cleansing encompasses a broader range of activities, often with a deeper focus on the overall quality and reliability of the data. Thus, we use the term “data cleansing” in this paper.
Navigating through these phases presents companies with several challenges pertaining to data quality, privacy, and security; data integration across disparate systems; scalability of data infrastructure; and cultivation of the requisite expertise in data science and analytics. Addressing these challenges is crucial for organizations seeking to realize the full potential of data monetization initiatives [18].

2.3. Data, Data Structuring, and Data Analytics

Recent discussions of data monetization often neglect data cleansing, preparation, and harmonization. One possible reason for the limited emphasis on data cleansing, preparation, and harmonization is the recent hype around “big data”. The term “big data” refers to large and complex data assets characterized by four features, known as the 4Vs: (i) volume, which describes how the large scale of data requires innovative tools for their collection, storage, and analysis; (ii) velocity, which highlights the rate at which data are generated and updated in real time; (iii) variety, which indicates the variation in data types; and (iv) veracity, which refers to the complexity and uncertainty of data [19,20].
The big data analytics process comprises four key phases. Phase 1 involves turning data into insights by collecting, cleaning, analyzing, and processing large, diverse, and usually unstructured data from internal and external sources. These analytics generate insights for decision makers. In phase 2, these insights are transformed into decisions as managers contextualize and attach meaning to the insights. Phase 3 involves translating decisions into specific operational actions. In phase 4, additional data points are generated from these actions, which are then cycled back into the process for future decision making.
Implementing this process is not linear; rather, companies should embrace an evolutionary approach for developing their data analytics capabilities over time by progressing through four stages of maturity. Stage 1, data structuring, involves digitizing and organizing data to ensure cleanliness, structure, and usability for further analysis. This often includes “scrubbing” data to remove errors and ensure quality, addressing the common concern about “garbage in, garbage out”. Stage 2 focuses on making data available to relevant users, which ensures that the right data are accessible when and where they are needed. In stage 3, basic analytics are applied to the data, demonstrating that even simple analytic approaches can yield significant gains and serve as a foundation for more advanced analytics in stage 4. Stage 4 involves applying advanced analytics, which can generate radically new business insights but requires deep analytical expertise. As companies develop their capabilities, they move along this maturity map. Both the big data analytics cycle and the implementation maturity map highlight the importance of data cleansing [21,22,23]. However, detailed guidance on data cleansing remains sparse. Additionally, it is often argued that the necessary costs of data structuring (e.g., cleansing, harmonization, consistency) may outweigh the benefits of data monetization.

2.4. Customer Analytics

Data cleansing is contingent upon the specific use case of data monetization. Customer analytics is a significant use case wherein data structuring is a crucial precondition for effective data monetization. Customer analytics generally empowers businesses to make data-driven decisions, enhance customer engagement, improve marketing effectiveness, and ultimately increase customer loyalty and profitability [24]. Customer analytics refers to the systematic examination and interpretation of customer data to understand and predict customer behaviors, preferences, and trends. This involves harnessing data collection, data processing, and analytical techniques to gain insights into customer interactions, experiences, and engagement [25,26].
Customer analytics is particularly important for companies with a sales force, as it aids sales representatives and managers in increasing revenue and enhancing customer satisfaction and loyalty [24,27,28]. Customer analytics facilitates the understanding of customer behavior and preferences through data analysis, driving tailored service or product promotions that contribute to business growth and data monetization [28].
The primary objective of customer analytics is to provide key executives with real-time data insights via accessible business intelligence (BI) platforms equipped with user-friendly dashboards [29,30]. Ideally, these dashboards present executives with the most relevant data points without requiring a comprehensive analysis of the entire dashboard [31].
The opportunities that BI platforms offer for customer analytics are vast and varied. To harness these opportunities, the company must develop a data model that integrates all relevant data from various IT systems. The success of a BI platform in exploring these opportunities hinges on the accurate recording and alignment of all relevant customer data across databases, which wards off issues such as the same customer being assigned different IDs from various datasets [32].
Achieving data consistency is reported to be challenging. Companies starting from scratch may find it relatively easy, whereas those with a long legacy of IT systems for product and service sales often encounter numerous systems with significant data inconsistencies, necessitating extensive data alignment and structuring efforts [33]. Despite the critical importance of data alignment, companies frequently underestimate the effort required to integrate systems with disparate data models and datasets.
A data model explicitly defines the structure of data, while the structured data are organized according to this model. A dataset is an ordered collection of data defined by its content, grouping, purpose, and relatedness [34]. Companies must recognize that without data alignment, valuable insights into increasing revenue or customer satisfaction cannot be generated [24,33].
Recent advancements in customer analytics include algorithms designed to identify data inconsistencies and quality issues, such as duplicates, outliers, or missing values [35]. However, these algorithms have limited applicability to the alignment of transactional and master data [36]. Transactional data, which are generated by business transactions and change rapidly over time, contrast with master data, which change infrequently and describe fundamental company objects, such as customer or product master data [36,37,38].
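To make the three issue types named above concrete, the following minimal sketch profiles a small set of transactional records for missing values, exact duplicates, and outliers. It is an illustration only, not an algorithm from the cited studies: the function name `profile_records`, the field names, the sample orders, and the median-based outlier rule with its threshold are all assumptions.

```python
from statistics import median

def profile_records(records, numeric_field, threshold=5.0):
    """Return row indices with missing values, exact duplicates, and outliers."""
    # Missing values: any field that is None or an empty string.
    missing = [i for i, rec in enumerate(records)
               if any(v is None or v == "" for v in rec.values())]

    # Exact duplicates: a row whose full content has already been seen.
    seen, duplicates = set(), []
    for i, rec in enumerate(records):
        key = tuple(sorted(rec.items()))
        if key in seen:
            duplicates.append(i)
        seen.add(key)

    # Outliers: distance from the median, in units of the median
    # absolute deviation (MAD), exceeds the (assumed) threshold.
    values = [rec[numeric_field] for rec in records
              if rec[numeric_field] is not None]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    outliers = [i for i, rec in enumerate(records)
                if rec[numeric_field] is not None and mad > 0
                and abs(rec[numeric_field] - med) / mad > threshold]
    return {"missing": missing, "duplicates": duplicates, "outliers": outliers}

# Hypothetical order lines, invented for illustration.
orders = [
    {"order_id": "A1", "customer": "C001", "amount": 120.0},
    {"order_id": "A2", "customer": "C002", "amount": 95.0},
    {"order_id": "A2", "customer": "C002", "amount": 95.0},   # exact duplicate
    {"order_id": "A3", "customer": "", "amount": 110.0},      # missing customer
    {"order_id": "A4", "customer": "C003", "amount": 105.0},
    {"order_id": "A5", "customer": "C001", "amount": 9800.0}, # implausible amount
]
print(profile_records(orders, "amount"))
```

Checks of this kind only identify suspicious rows. Deciding which value is correct, especially for master data, is a separate and harder step.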

2.5. Algorithms for Data Integration and Cleansing

Algorithms for data integration and cleansing are essential for transforming data into a useful and consistent format. The primary data integration algorithms include entity resolution, schema matching, data fusion, and ontology-based integration. For data cleansing, the key algorithms are missing value imputation, outlier detection and removal, normalization and standardization, text cleansing, and data transformation. Advanced techniques for data cleansing include deep learning algorithms, such as autoencoders and recurrent neural networks; reinforcement learning through active learning; and graph-based data cleansing based on the use of graph algorithms for detecting and resolving inconsistencies and duplicates. These algorithms are fundamental to effective data integration and cleansing, and they ensure that data are high quality, consistent, and reliable enough for analytics and decision-making processes [38].
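Three of the basic cleansing techniques listed above can be sketched in a few lines each. The example below is a simplified illustration under stated assumptions: the helper names are hypothetical, median imputation and min-max scaling are only one possible choice for each technique, and the text cleansing step is limited to whitespace and letter case.

```python
import re
from statistics import median

def impute_median(values):
    """Missing value imputation: replace None with the median of known values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values identical, nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def clean_text(text):
    """Text cleansing: trim, collapse repeated whitespace, lower-case."""
    return re.sub(r"\s+", " ", text.strip()).lower()

# Hypothetical values, invented for illustration.
revenue = impute_median([10, None, 30])
print(revenue)
print(min_max_normalize(revenue))
print(clean_text("  Klinikum   NORD  "))
```

Each function treats one issue in isolation, which mirrors the sequential character of many cleansing routines.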
Data migration and integration require master data records to be linked, which can be facilitated by algorithms available on modern master data management (MDM) platforms [36]. However, these algorithms have limitations. While numerous statistical methods are available, none can fully merge and create a clean customer master dataset when the initial datasets are not clean. Various algorithms based on comparative or classificatory statistical frameworks can be applied to data in general, primarily to address transactional data issues [38,39]. Master data issues differ significantly from those that impact transactional data. Common transactional data issues include duplicates, missing values, outliers, and contradictory values [40]. In contrast, master data issues often involve duplicates, incorrect attributes, incorrect or missing groupings, and missing values.
Comparative algorithms require a reference dataset for comparison and logic to determine the correct value [35]. Classification methods rely on rules or patterns to classify objects [35,36]. These requirements are not as prevalent in customer master data as they are in transactional data [36]. The problem of duplicates differs markedly in the cases of master and transactional data. With transactional data, a duplicate is a true tuple duplicate with identical or similar attributes. With master data, duplicates are not as obvious; for example, a customer may have the same name but different addresses, where one address is correct for the customer and the other is incorrect. Determining the true value in such cases is challenging because there is no consistent rule to follow when faulty attributes vary between tuples.
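The master data duplicate described above (same name, different addresses) can be detected mechanically even though it cannot be resolved mechanically. The following sketch, with a hypothetical function name and invented sample records, groups customer records by normalized name and flags every name that appears with more than one address. It deliberately stops at flagging, because no general rule tells which address is the correct one.

```python
from collections import defaultdict

def flag_address_conflicts(customers):
    """Map each normalized customer name to its addresses if they disagree."""
    addresses_by_name = defaultdict(set)
    for cust in customers:
        name = cust["name"].strip().lower()
        addresses_by_name[name].add(cust["address"].strip().lower())
    # Keep only names with conflicting addresses; these need manual review.
    return {name: sorted(addrs)
            for name, addrs in addresses_by_name.items()
            if len(addrs) > 1}

# Hypothetical customer master records, invented for illustration.
customers = [
    {"name": "City Hospital", "address": "1 Main St"},
    {"name": "City Hospital", "address": "12 Park Ave"},
    {"name": "Lab One", "address": "5 River Rd"},
]
print(flag_address_conflicts(customers))
```

The flagged groups are candidates for the manual review that, as noted in the text, master data cleansing still largely depends on.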
Furthermore, existing frameworks cannot address all issues simultaneously but must follow a sequence [40]. This sequential approach poses a problem when multiple issues coexist in a single record. While existing algorithms are effective at identifying these problems, they are less effective at solving them and may even create new data quality issues in the process [41]. Consequently, cleansing master data remains a largely manual process [42]. Table 1 summarizes the key themes of our theoretical background.

3. Research Methodology

3.1. Empirical Context

To extract insights into the use of customer analytics by companies, we conducted an in-depth case study of a leading global medical technology provider, which is referred to here as Medical Inc. for confidentiality purposes. Medical Inc. does not directly approach patients but rather sells its products to health care providers. Its customer data therefore contain no private data on patients, only data on professional health care providers. Consequently, the discussion of data privacy is not applicable to Medical Inc.
Medical Inc.’s diverse product portfolio includes syringes, needles, lab automation systems, and cell sorters. Headquartered in the United States, the company maintains regional head offices and geographical hubs across major markets worldwide. It operates as a centralized organization, granting minimal autonomy to its geographical markets in altering the operating model or redefining data management processes.
The strategic objective of Medical Inc. is to enhance customer centricity by prioritizing and proactively serving its customers (professional health care providers). Consequently, the company has been striving to integrate customer analytics into its sales department. This case study adopts an intraorganizational perspective, examining the company’s internal departments and their interactions. Specifically, the sales department is treated as an internal customer of the analytics department.

3.2. Data Collection

Our methodology combined traditional case study methods [42,43] with a processual view and process theorizing [44]. We examined the evolution of Medical Inc.’s IT landscape and data structures in recent years. Data were collected through a series of interviews with technical subject matter experts who have been with the company for several years and through participation in internal workshops. The primary interview questions focused on Medical Inc.’s past, present, and future customer analytics initiatives, targeting the IT landscape, supported processes, and relevant data.
These primary data were supplemented with secondary data from internal documents, technical specifications, and project reports. The data collection spanned the entire data cleansing project and included input from both the core team and a broader project support team (e.g., analytic project leaders, customer master data managers, solution architects, data scientists, analysts, and operations managers). The project proceeded through internal brainstorming workshops, wherein key issues were classified into five main topic areas. All collected data were synthesized into a comprehensive case study description.

3.3. Data Analysis

A content analysis of the case description was conducted to identify the phases of the customer analytics initiatives and pinpoint turning points within these phases, thus segmenting the journey into phases to elucidate the interconnected events and simplify temporal flows. During the phase development, we identified where issues arose, evaluated how they were addressed, and anticipated future states. The framework, developed during brainstorming sessions, was iterated through several cycles, with after-action reviews being incorporated each time. Initially, the problem-solving team focused on German data, providing an opportunity to apply the framework to Austria and Switzerland, thereby refining the theory [42].
Acknowledging the inherent complexity of process data in data cleansing projects [45], our data analysis began with the construction of a timeline of key dates and milestones. Subsequently, an inductive method was used to create a detailed chronological narrative of the data cleansing initiative by triangulating data from documents, observations, and interviews. In this way, we applied the temporal bracketing strategy [44] to the examination of the chronological progression of the data cleansing initiative.

4. Results—Insights into the Data Cleansing Initiative

Our data analysis revealed that the data cleansing initiative, which is designed to enhance customer analytics, encompasses three broad phases. These phases interact with other strategic milestones and events driven by Medical Inc. Each phase exhibits distinct characteristics in Medical Inc.’s systems, databases, processes, and governance (see Figure 1).

4.1. The Three Main Phases of Using Data for Customer Analytics

Phase 1 commenced with the culmination of two significant acquisitions, precipitating a scenario in which Medical Inc. found itself operating disparate IT systems, each of which was reliant on individual databases. The transition from Phase 1 to Phase 2 ensued as Medical Inc. embarked on the integration of these disjointed systems and databases, concurrently endeavoring to standardize data processes and governance protocols.
Phase 2 was instigated by the implementation of a new enterprise resource planning (ERP) system within the framework of an integration initiative, which went live in 2022, with the master data cleansing project being slated for completion by 2025. These successive phases were characterized by distinct focal points and actions. Phase 1 was marked by the ramifications of the fragmented system landscape for data and operational processes, a predicament initially sparked by the inaugural acquisition but further exacerbated by subsequent acquisitions. In contrast to the larger acquisition, the decision was made during the smaller acquisition in 2015 to refrain from full integration. The smaller acquired company catered to the same customer base while operating on autonomous systems that necessitated disparate data processes across departments. This led to the duplicative and disjointed treatment of customer data. Consequently, difficulties in customer master data management surfaced.
The transition to Phase 2 was underpinned by concerted integration and harmonization endeavors. The objective here was to establish a unified system marked by standardized data processing across all departments, culminating in the creation of a singular data model encompassing all integrated systems. However, the severity of the master data issues became evident during this phase. While processes were harmonized and an integrated system was implemented, data migration into the new source system proved to be fraught with cleanliness issues. The completion of the final stage of Phase 2 was contingent upon the rectification and alignment of the data, thus leading to the subsequent development of a framework.
Phase 3 heralds a forward-looking perspective that envisages the use of refined master data for advanced analytics, which is facilitated by the establishment of robust data governance frameworks and delineated data flows. This phase will encompass enhancements to the BI system and database infrastructure alongside the standardization of processes. It will encompass detailed descriptions of data structures and identify contributors to specific analytics tools and functions.

4.2. Data Cleanliness, Integration, and Harmonization: A Key Challenge

As Medical Inc.’s revenue grew through mergers and acquisitions (Phases 1 and 2), data cleansing, integration, and harmonization became increasingly intricate, transitioning from the mere amalgamation of physical assets to the incorporation of the digital ecosystem into an established closed ecosystem. Given that Medical Inc. as well as the acquired companies rely heavily on digital processes to support their operations, integrating these processes into existing systems entails disruptions and necessitates meticulous planning. Despite adopting a greenfield approach by establishing a new system for the parent (Medical Inc.) environment, the integration process was not truly greenfield for the data aspect. Based on the recognition that data are one of the most significant assets today, it was imperative to migrate the data into new systems to preserve historical transactions.
Medical Inc. made the strategic decision to consolidate and modernize its operations by implementing a new ERP system while also integrating the latest acquisition. The project was structured with an implementation plan that proceeded in waves, gradually rolling out components of the parent company and the majority of the newest acquisition. As depicted in Figure 2, Medical Inc. operated on one customer relationship management (CRM) system and two ERP systems, each of which served distinct purposes. One ERP system primarily handled financial tasks, while the other managed other operational aspects. These ERP systems, which dated from different periods and were based on different software platforms (SAP ECC R/3 and IBM from the 1990s), fed into a BI system. The third ERP system, which was introduced with the latest acquisition, was an Oracle system that fed into its own BI module, which was subsequently integrated into Medical Inc.’s own BI system.
The project aimed to consolidate all functions into a single ERP system. Initially, the acquired company and the financial segment of Medical Inc. were integrated, followed by the consolidation of the remaining operational functions managed by the last ERP system. Given the disparate data models and datasets across these systems, the first crucial step was data migration. For the integration and migration of customer master data, a cloud-based master data management (MDM) platform was employed, enabling the aggregation of customer master data from various systems while standardizing the data by eliminating hierarchies.
As depicted in Figure 2, all customer records were uniformly input into the platform. Subsequently, the MDM platform began to create golden records, consolidating identical customers from different systems into singular entities. In this process, which was automated and facilitated by an underlying algorithm, various customer-related data elements, such as names and addresses, were compared.
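The case description does not disclose the matching algorithm used by the MDM platform. As a rough illustration of how comparing names and addresses can group records into golden-record candidates, the following Python sketch uses the standard library's `difflib.SequenceMatcher`. The field names, the 0.85 threshold, the greedy grouping rule, and the sample records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_same_customer(rec_a, rec_b, threshold=0.85):
    """Hypothetical rule: both name and address must be similar enough."""
    return (
        similarity(rec_a["name"], rec_b["name"]) >= threshold
        and similarity(rec_a["address"], rec_b["address"]) >= threshold
    )


def group_golden_candidates(records, threshold=0.85):
    """Greedy grouping: a record joins the first group whose first member
    it matches; otherwise it starts a new group."""
    groups = []
    for rec in records:
        for group in groups:
            if is_same_customer(group[0], rec, threshold):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


# Invented sample records from three hypothetical source systems.
records = [
    {"id": "CRM-1", "name": "City Hospital Leipzig", "address": "Main Street 1, 04109 Leipzig"},
    {"id": "ERP1-7", "name": "City Hospital  Leipzig", "address": "Main Street 1, 04109 Leipzig"},
    {"id": "ERP2-3", "name": "Alpine Lab Zurich", "address": "Lake Road 5, 8001 Zurich"},
]
groups = group_golden_candidates(records)
```

In such a scheme the threshold is the critical design choice: set too low, it merges unrelated customers into one golden record; set too high, it leaves duplicates of the same customer unmerged.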
Customer hierarchies were then established, using the CRM hierarchies as a blueprint and replicating them within the MDM platform. On a specified date, all pertinent and refined data were migrated into the new ERP system.
When the new ERP system went live as part of the initial phase, challenges pertaining to master data became pervasive. Notably, orders were dispatched to the wrong customers, pricing discrepancies emerged for specific customers, and the consignment process was impeded. A thorough examination of the BI landscape pinpointed the root cause: inadequate cleanliness and organization of customer master data. Unrelated customers had been merged, resulting in convoluted customer hierarchies and discrepancies between billing and delivery addresses across systems. These challenges necessitated a temporary suspension of the project.
The illustrative depiction of the fragmented IT landscape in Figure 2 highlights the ongoing efforts to streamline operations. ERP 3 has been replaced by ERP 4, while existing ERP systems are now fed by the new ERP system. All systems are interconnected through the MDM platform, which acts as a unifying conduit, and efforts are underway to identify and resolve key data-related issues before resuming work on the original project.
All systems (CRM, ERP, BI) are synchronized, with the MDM platform managing data sources and ensuring alignment across systems. The data model for customer master data entails hierarchical structuring, delineating customer trees with up to eight levels. This hierarchical structure, exemplified by a hospital setting, necessitates meticulous alignment between CRM and ERP systems.
Technical disparities between CRM and ERP systems, particularly in level setup and nomenclature, necessitate precise alignment. The ERP system designates the top customer as a hierarchy node, and it is responsible for grouping accounts and imparting terms to associated transactional accounts. Conversely, the CRM system employs the top-level account (TLA) as the highest level, with subsequent subaccounts being categorized hierarchically. Specific accounts must be harmonized in their roles across systems.
The MDM platform harmonizes information from both systems, ensuring consistency in account roles and levels across all systems (see Figure 2). This process is guided by a framework developed in response to identified issues and structured around their severity and hierarchical sequence.

4.3. Challenge Identification—Impeding Progress

The data cleansing project team identified five primary areas of concern: (i) hierarchy, (ii) business partners, (iii) golden records, (iv) data quality monitoring, and (v) customer structure. Furthermore, the customer database had expanded significantly, reaching 235,000 customer records in the Germany, Switzerland, and Austria (GSA) region alone. This surge was attributed to the decision to migrate all customer records, both active and inactive, from all systems, which exacerbated the pre-existing challenges within the customer master database. These issues often occurred concurrently within individual customer records, complicating the identification of accurate data segments and the associated customer identities.
Hierarchy issues stem from discrepancies in the legal customer hierarchies established within the MDM platform. Faulty hierarchies originating from the CRM system led to instances in which customers were either not linked to any hierarchy, creating “orphans”, or linked to incorrect hierarchies, resulting in the amalgamation of unrelated customers within hierarchies. The absence of a clear understanding of the top-ranked customer’s role further compounded these issues, leading to the formation of disparate customer hierarchies with varying structures. Moreover, the hierarchical complexity, which had up to eight levels, was deemed excessive by the project team.
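The two structural symptoms described here, orphans and overly deep trees, can be detected mechanically once the hierarchy is available as a child-to-parent mapping. The sketch below is a minimal Python illustration under that assumption; the account identifiers, the `parent_of` mapping, and the `max_levels` limit are hypothetical and are not taken from the case company's systems.

```python
def find_orphans(parent_of, top_level_accounts):
    """Accounts with no parent that are not designated top-level accounts."""
    return sorted(
        acc
        for acc, parent in parent_of.items()
        if parent is None and acc not in top_level_accounts
    )


def depth(account, parent_of):
    """Number of levels from the account up to the root of its tree (root = 1)."""
    levels = 1
    seen = {account}
    while parent_of.get(account) is not None:
        account = parent_of[account]
        if account in seen:
            raise ValueError(f"cycle detected at {account}")
        seen.add(account)
        levels += 1
    return levels


def too_deep(parent_of, max_levels):
    """Accounts that sit deeper in their tree than the allowed number of levels."""
    return sorted(acc for acc in parent_of if depth(acc, parent_of) > max_levels)


# Invented example hierarchy: child -> parent (None means no parent).
parent_of = {
    "TLA-1": None,
    "A": "TLA-1",
    "B": "A",
    "C": "B",
    "X": None,  # no parent and not a designated TLA, so it counts as an orphan
}
tlas = {"TLA-1"}

orphans = find_orphans(parent_of, tlas)
deep_accounts = too_deep(parent_of, max_levels=3)
```

Checks of this kind only flag candidates. Deciding whether an orphan should be promoted to a top-level account or attached to an existing tree still requires the manual review described in Section 4.4.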
Business partner concerns arose from disparities across systems, as the MDM platform was intended to unify and cleanse data sources for migration to the new ERP system, while the existing systems remained unaltered, perpetuating data discrepancies. Additionally, inconsistencies in the functions attributed to each customer across systems, along with misaligned naming conventions, contributed to the divergence of data across systems.
The concept of golden records within the MDM platform entails consolidating multiple records of the same customer into a single entity. However, discrepancies among customer records, such as slight variations in addresses or postal codes, created challenges in identifying accurate information for the creation of golden records, resulting in erroneous customer setups and duplicate golden records.
Data quality monitoring and customer structure issues were identified as separate yet interconnected challenges. While data quality monitoring was hindered by the absence of a governance framework, customer structure issues stemmed from a lack of guidelines on how customer master data should be structured. Although these issues were not directly linked to data cleanliness, they served as foundational contributors to data discrepancies.

4.4. Proposed Solutions for Overcoming These Challenges

To address these challenges, the project team proposed a multifaceted approach. Initially, a comprehensive cleansing of the existing data was proposed to rectify the hierarchy, business partner, and golden record issues. Subsequently, the establishment of a governance process to prevent future instances of data uncleanliness was recommended. Additionally, the creation of a data handbook outlining guidelines for structuring customer master data was proposed to ensure consistency and accuracy in future data management efforts.
At the start of the rollout, there were 235,000 customer records in the system. Within a few months, 126,000 inactive accounts were removed. Beginning with Germany, the GSA team conducted an in-depth analysis to further reduce unnecessary customer records. They defined “needed” and “unneeded” records based on an activity list, which included CRM object usage, sales from the last three years, open invoices older than three years, technical service records, and asset placements.
The team interviewed stakeholders from each business unit to understand the actions taken with each customer record in the CRM system. They created reports using the relevant systems and mapped this information onto each customer record. Records with no activity were considered unnecessary and deactivated. This reduced the German customer base from 74,000 to 34,000 records.
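The "needed" versus "unneeded" decision can be expressed as a simple rule over the activity list. The following Python sketch assumes that any single activity signal is enough to keep a record, which matches the statement that records with no activity were deactivated. The class, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CustomerActivity:
    """Activity signals per customer record (field names are illustrative)."""
    customer_id: str
    crm_objects_used: bool = False
    sales_last_three_years: float = 0.0
    open_invoices_older_than_three_years: int = 0
    technical_service_records: int = 0
    asset_placements: int = 0


def is_needed(rec):
    """A record is 'needed' if at least one activity signal is present."""
    return (
        rec.crm_objects_used
        or rec.sales_last_three_years > 0
        or rec.open_invoices_older_than_three_years > 0
        or rec.technical_service_records > 0
        or rec.asset_placements > 0
    )


def split_records(records):
    """Return the IDs of needed and unneeded records as two lists."""
    needed, unneeded = [], []
    for rec in records:
        (needed if is_needed(rec) else unneeded).append(rec.customer_id)
    return needed, unneeded


def reduction_share(before, after):
    """Fraction of records removed when going from `before` to `after`."""
    return (before - after) / before


# Invented sample records.
records = [
    CustomerActivity("C-001", sales_last_three_years=12500.0),
    CustomerActivity("C-002"),
    CustomerActivity("C-003", asset_placements=2),
]
needed, unneeded = split_records(records)
```

Applied to the figures reported above, removing 126,000 of the 235,000 records leaves 109,000, and the German reduction from 74,000 to 34,000 records corresponds to roughly 54% of that customer base.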
Next, the team began cleansing activities by addressing the hierarchical customer setup. They started at the top of the customer trees, breaking down the customer base into manageable parts and creating a new basis for each tree. They created an overview of all TLAs, including orphans, and included records from ERP4 (the new ERP system), MDM, CRM, ERP1, and ERP2. ERP4, MDM, and CRM records had a one-to-one relationship, but MDM–ERP2 and ERP1–ERP2 records had many-to-one relationships, introducing complexity. Accordingly, the team focused on cleansing the newest ERP system and the CRM system first.
Each team member reviewed assigned packages of TLA records to decide whether they were true TLAs, were orphans that needed to be upgraded or attached to an existing TLA, or required the creation of a new TLA. These decisions were based on the customer’s name and address information found in the systems and online. Duplicate records were identified and consolidated. This first validation round created a new TLA base to work with.
The next step was validating the “child” accounts linked to the TLAs. Each customer record overview was updated with the new TLAs, including all levels of the customer trees and their respective IDs, addresses, and sales data. The team checked whether the parent account was correct and remapped it if necessary. This step ensured clean reporting at the TLA level, which was critical for BI and other functions that determine prices and agreements.
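To see why correct parenting matters for reporting at the TLA level, consider the following Python sketch, which rolls child-account sales up to the top of each tree before and after one remapping. The account names, sales figures, and the `parent_of` mapping are invented for illustration and do not reflect the case company's data.

```python
from collections import defaultdict


def top_level_account(account, parent_of):
    """Follow parent links until reaching an account without a parent."""
    seen = {account}
    while parent_of.get(account) is not None:
        account = parent_of[account]
        if account in seen:
            raise ValueError(f"cycle detected at {account}")
        seen.add(account)
    return account


def sales_by_tla(sales, parent_of):
    """Aggregate sales of every account under its top-level account."""
    totals = defaultdict(float)
    for account, amount in sales.items():
        totals[top_level_account(account, parent_of)] += amount
    return dict(totals)


# Invented example: Lab-B is initially attached to the wrong parent.
parent_of = {"TLA-1": None, "TLA-2": None, "Ward-A": "TLA-1", "Lab-B": "TLA-2"}
sales = {"Ward-A": 100.0, "Lab-B": 50.0}

before = sales_by_tla(sales, parent_of)
parent_of["Lab-B"] = "TLA-1"  # remap the parent after validation
after = sales_by_tla(sales, parent_of)
```

In this toy example a single wrong parent shifts a third of the revenue to another top-level account, which illustrates how incorrect parenting distorts TLA-level reporting and any pricing or terms derived from it.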
The first part of the data handbook was prepared to guide teams on how customers should be set up in the system. This enabled clean reporting for group purchasing organizations or lab groups, ensuring that pricing and terms set at the TLA level translated correctly into child accounts, solving the pricing and delivery issues caused by incorrect parenting.
The next stage involved correcting the levels within each customer tree, merging duplicate accounts, assigning customer categories, and adding specific data points, such as the number of beds for hospitals and the German unique hospital identifier. This required visualizing and sorting each customer tree correctly according to the data handbook, understanding each entity, and enriching the tree with additional data. Customer categories helped with strategic business planning by addressing the needs of different markets, such as hospitals, laboratories, and outpatient care.
Finally, the team corrected names, addresses, and other customer information and aligned business partners across systems. This step improved the readability and transparency of the customer records, making it easier to identify the correct customer for any function, such as order management, and reducing the creation of new accounts due to incorrect information. Aligning business partners ensured consistency across systems. Table 2 summarizes these challenges and the proposed solutions, with a short description of each.

4.5. A Framework for a Customer Master Data Cleansing Process

The insights from our case study of Medical Inc. can be translated into a more general framework for customer master data cleansing. The emerging framework consists of two core activities with nine key tasks. The core activities are (1) problem collection and, subsequently, (2) master data cleansing. The problem collection activity comprises four tasks: (1a) problem identification, (1b) major problem categorization, (1c) problem prioritization, and (1d) problem preparation. The master data cleansing activity comprises five tasks: (2a) setting up a customer base overview, (2b) deactivating unnecessary customers, (2c) reassigning top customers, (2d) assigning sub-customers to top customers, and (2e) cleansing customer trees. Table 3 summarizes these key activities and key tasks according to their objectives, steps, and outcomes.

5. Conclusions

This study highlights the challenges of, and the critical activities necessary for, successful data monetization through customer analytics [20], particularly within the context of a legacy IT environment [16,18]. Our insights emerged from joint research activities with our single case company, Medical Inc. Naturally, the results of a single case study are limited in terms of external validity. Thus, while our findings are not generalizable, we believe that they are transferable to many other companies [43]. Future research could apply our findings to other companies struggling with legacy IT environments.
Overall, our findings supplement the existing literature on digitization, digitalization, and digital transformation; data monetization, structuring, and analytics; customer analytics; and algorithms for data integration and cleaning in three different ways [3,16,18]. First, we identify key activities that are necessary for the successful deployment of data initiatives in customer analytics. Rather than aiming for an exhaustive list, we focus on the key activities we deem most crucial for advancing data utilization and monetization through customer analytics. Second, we integrate these key activities into an overarching framework, illustrating how companies can advance their data and advanced analytics initiatives throughout their digitization and digital transformation efforts. Third, we address a gap in the 3D concept (digitization, digitalization, and digital transformation), highlighting that digitization alone is insufficient for achieving the next step in digital advancement [1,9].
In more detail, our case study of Medical Inc. reveals that data cleaning, preparation, and harmonization are foundational for deriving value from customer data [22,23]. Insights gleaned from Medical Inc.’s endeavors offer a valuable framework for a generalized customer master data cleansing process, comprising problem identification, categorization, prioritization, and subsequent cleansing activities. This structured framework underscores the iterative nature of data management, emphasizing continuous evaluation and refinement to maintain data integrity and drive informed decision making [35].
We propose a nine-step framework within the phase of pre-digitalization (datatization), emphasizing that digitization alone is insufficient for achieving digital transformations. Companies must integrate key activities into a comprehensive approach, addressing both the theoretical and practical aspects of data utilization. We call this datatization. Datatization refers to the process of converting various forms of information into data that can be quantified, analyzed, and utilized in analytics systems. This involves capturing, storing, and organizing data from diverse sources, thereby transforming it into structured formats suitable for computational analysis and decision-making processes. Datatization enables the extraction of actionable insights, supports data monetization strategies, and facilitates the integration of data into broader digital ecosystems.
This research not only fills gaps in the existing literature but also offers a structured path for companies aiming to leverage customer analytics effectively. Future research should further explore the dynamic interactions between legacy systems and data monetization initiatives to provide deeper insights into overcoming the associated challenges [2,3].
In conclusion, our comprehensive analysis of Medical Inc.’s data cleansing initiative for enhancing customer analytics has shed light on a structured approach encompassing three vital phases: initial assessment and planning, implementation and execution, and evaluation and refinement. These phases have unfolded within the context of Medical Inc.’s strategic milestones and events, elucidating distinct characteristics across its systems, databases, processes, and governance [23].

Author Contributions

Conceptualization, J.S. and H.G.; methodology, J.S.; validation, J.S.; formal analysis, J.S.; investigation, J.S.; resources, J.S.; data curation, J.S.; writing—original draft preparation, J.S.; writing—review and editing, J.S. and H.G.; visualization, J.S.; supervision, H.G.; project administration, H.G.; funding acquisition, H.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We would like to acknowledge the administrative and technical support from Fraunhofer IMW, Linköping University, the University of St. Gallen, and Becton Dickinson.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Danuso, A.; Giones, F.; da Silva, E.R. The digital transformation of industrial players. Bus. Horiz. 2022, 65, 341–349.
  2. Wixom, B.H.; Piccoli, G.; Rodriguez, J. Fast-track data monetization with strategic data assets. MIT Sloan Manag. Rev. 2021, 62, 1–4.
  3. Zhang, X.; Yue, W.T.; Yu, Y.; Zhang, X. How to monetize data: An economic analysis of data monetization strategies under competition. Decis. Support Syst. 2023, 173, 114012.
  4. Top Trends in Data and Analytics 2024. Available online: https://www.gartner.com/smarterwithgartner/gartner-top-10-data-and-analytics-trends-for-2021 (accessed on 24 July 2024).
  5. How to Monetize Your Customer Data. Available online: https://www.gartner.com/smarterwithgartner/how-to-monetize-your-customer-data/ (accessed on 24 July 2024).
  6. How to Create a Business Case for Data Quality Improvement. Available online: https://www.gartner.com/smarterwithgartner/how-to-create-a-business-case-for-data-quality-improvement/ (accessed on 24 July 2024).
  7. Ritala, P.; Keränen, J.; Fishburn, J.; Ruokonen, M. Selling and monetizing data in B2B markets: Four data-driven value propositions. Technovation 2024, 130, 102935.
  8. Najjar, M.S.; Kettinger, W.J. Data monetization: Lessons from a retailer’s journey. MIS Q. Exec. 2013, 12, 213–225.
  9. Saarikko, T.; Westergren, U.H.; Blomquist, T. Digital transformation: Five recommendations for the digitally conscious firm. Bus. Horiz. 2020, 63, 825–839.
  10. Kokkinou, A.; van Kollenburg, T.; Mandemakers, A.; Hopstaken, H.; van Elderen, J. The data analytic capability wheel: An implementation framework for digitalization. In Proceedings of the 36th Bled eConference: Digital Economy and Society: The Balancing Act for Digital Innovation in Times of Instability, Bled, Slovenia, 25–28 June 2023.
  11. Legner, C.; Eymann, T.; Heß, T.; Matt, C.; Böhmann, T.; Drews, P.; Mädche, A.; Urbach, N.; Ahlemann, F. Digitalization: Opportunity and challenge for the business and information systems engineering community. Bus. Inform. Syst. Eng. 2017, 59, 301–308.
  12. Tilson, D.; Lyytinen, K.; Sørensen, C. Research commentary—Digital infrastructures: The missing IS research agenda. Inf. Syst. 2010, 21, 748–759.
  13. Brynjolfsson, E.; McAfee, A. The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies; WW Norton & Company: New York, NY, USA, 2014.
  14. Machado, C.G.; Winroth, M.; Carlsson, D.; Almström, P.; Centerholt, V.; Hallin, M.C. Industry 4.0 readiness in manufacturing companies: Challenges and enablers towards increased digitalization. Procedia CIRP 2019, 81, 1113–1118.
  15. Ofulue, J.; Benyoucef, M. Data monetization: Insights from a technology-enabled literature review and research agenda. Manag. Rev. Q. 2024, 74, 521–565.
  16. Wixom, B.H.; Ross, J.W. How to monetize your data. MIT Sloan Manag. Rev. 2017, 58, 10–13.
  17. Faroukhi, A.Z.; El Alaoui, I.; Gahi, Y.; Amine, A. Big data monetization throughout big data value chain: A comprehensive review. J. Big Data 2020, 7, 3.
  18. Wixom, B.; Yen, B.; Rellich, M. Maximizing value from business analytics. MIT Sloan Manag. Rev. 2013, 12, 111–123.
  19. Kietzmann, J.; Paschen, J.; Treen, E. Artificial intelligence in advertising: How marketers can leverage artificial intelligence along the consumer journey. J. Advert. 2018, 58, 263–267.
  20. Sivarajah, U.; Kamal, M.M.; Irani, Z.; Weerakkody, V. Critical analysis of big data challenges and analytical methods. J. Bus. Res. 2017, 70, 263–286.
  21. Sanders, N.R. How to use big data to drive your supply chain. Calif. Manag. Rev. 2016, 58, 26–48.
  22. Tabesh, P.; Mousavidin, E.; Hasani, S. Implementing big data strategies: A managerial perspective. Bus. Horiz. 2019, 62, 347–358.
  23. Shah, S.I.H.; Peristeras, V.; Magnisalis, I. Government big data ecosystem: Definitions, types of data, actors, and roles and the impact in public administrations. ACM J. Data Inform. Qual. 2021, 13, 1–25.
  24. Meena, P.; Sahu, P. Customer relationship management research from 2000 to 2020: An academic literature review and classification. Vision 2021, 25, 136–158.
  25. Erevelles, S.; Fukawa, N.; Swayne, L. Big data consumer analytics and the transformation of marketing. J. Bus. Res. 2016, 69, 897–904.
  26. Hossain, M.A.; Akter, S.; Yanamandram, V.; Wamba, S.F. Data-driven market effectiveness: The role of a sustained customer analytics capability in business operations. Technol. Forecast. Soc. Chang. 2023, 194, 122745.
  27. Velcu-Laitinen, O.; Yigitbasioglu, O. The use of dashboards in performance management: Evidence from sales managers. Int. J. Digit. Account. Res. 2012, 12, 36–58.
  28. What Is Customer Analytics? Available online: https://www.forbes.com/advisor/business/customer-analytics/ (accessed on 24 July 2024).
  29. Chen, H.; Chiang, R.H.; Storey, V.C. Business intelligence and analytics: From big data to big impact. MIS Q. 2012, 36, 1165–1188.
  30. Dover, C. How dashboards can change your culture. Strat. Fin. 2004, 86, 42.
  31. Pappas, L.M.; Whitman, L. Riding the technology wave: Effective dashboard data visualization. In Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2011; pp. 249–258.
  32. Watson, H.J.; Goodhue, D.L.; Wixom, B.H. The benefits of data warehousing: Why some organizations realize exceptional payoffs. Inform. Manag. 2002, 39, 491–502.
  33. Gudivada, V.N.; Apon, A.; Ding, J. Data quality considerations for big data and machine learning: Going beyond data cleaning and transformations. Intern. J. Adv. Soft. 2017, 10, 1–20.
  34. Renear, A.H.; Sacchi, S.; Wickett, K.M. Definitions of dataset in the scientific and technical literature. Proc. Assoc. Inf. Sci. Technol. 2010, 47, 1–4.
  35. Lee, G.Y.; Alzamil, L.; Doskenov, B.; Termehchy, A. A survey on data cleaning methods for improved machine learning model performance. arXiv 2021, arXiv:2109.07127.
  36. Pansara, R.R. Maturity Model of Master Data Management at Enterprise Level. Sch. J. Eng. Tech. 2024, 2, 31–39.
  37. McGilvray, D. Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information TM; Academic Press: Cambridge, MA, USA, 2008.
  38. Wedekind, H. Bestandsdaten, Bewegungsdaten, Stammdaten. In Lexikon der Wirtschaftsinformatik; Springer: Berlin/Heidelberg, Germany, 1997; p. 61.
  39. Arnold, J.; Hammwöhner, R. Data Integration and Data Cleaning: Solutions for Improving Data Quality; Springer: Berlin/Heidelberg, Germany, 2022.
  40. Mahdavi, M.; Neutatz, F.; Visengeriyeva, L.; Abedjan, Z. Towards automated data cleaning workflows. Mach. Learn. 2019, 15, 16.
  41. Ridzuan, F.; Zainon, W. A review on data cleansing methods for big data. Procedia Comput. Sci. 2019, 161, 731–738.
  42. Eisenhardt, K.M. Building theories from case study research. Acad. Manag. Rev. 1989, 14, 532–550.
  43. Yin, R.K. Qualitative Research from Start to Finish, 2nd ed.; The Guilford Press: New York, NY, USA, 2015.
  44. Langley, A. Strategies for theorizing from process data. Acad. Manag. Rev. 1999, 24, 691–710.
  45. Langley, A.; Smallman, C.; Tsoukas, H.; Van De Ven, A.H. Process studies of change in organization and Management: Unveiling temporality, activity, and flow. Acad. Manag. J. 2013, 56, 1–13.
Figure 1. Temporal brackets describing the three key phases.
Figure 2. Key issues with customer master data cleaning.
Table 1. Overview of the key themes of the study’s theoretical background.

| Key Research Theme | Summary |
|---|---|
| Digitization, Digitalization, and Digital Transformation | Researchers distinguish the concepts of digitization, digitalization, and digital transformation. Digitization involves converting analog signals into digital ones, thereby separating data from their medium. Digitalization involves integrating digital technologies into organizational processes to create new value opportunities. Digital transformation immerses the entire enterprise in digital methods, extending beyond processes and data to impact operations, business models, and competencies. |
| Data Monetization | Data monetization is integral to discussions of digitization, digitalization, and digital transformation. It involves deriving economic benefits from available data sources through direct or indirect methods. Direct methods include data sales, licensing, and participation in data marketplaces. Indirect methods involve utilizing data to optimize internal operations, refine products, or enhance services. |
| Data Structuring and Data Analytics | Data cleaning, preparation, and harmonization are crucial for effective data monetization. Big data analytics is a process that comprises four phases: turning data into insights, transforming insights into decisions, translating decisions into actions, and generating data points for future decision making. Companies should progress through four stages of maturity in developing data analytics capabilities: data structuring, data availability, basic analytics, and advanced analytics. |
| Customer Analytics | Data cleaning is essential for customer analytics as it enables businesses to make data-driven decisions and enhance customer engagement, marketing effectiveness, and profitability. It involves examining and interpreting customer data to understand and predict behaviors, preferences, and trends. Customer analytics empowers sales representatives and managers to increase revenue and improve customer satisfaction and loyalty. |
| Algorithms for Data Integration and Cleaning | Various algorithms facilitate data integration and cleaning, ensuring the quality, consistency, and reliability of data for analytics and decision making. These algorithms include entity resolution, schema matching, data fusion, ontology-based integration, missing value imputation, outlier detection and removal, normalization and standardization, text cleaning, and data transformation. Advanced techniques, such as deep learning, reinforcement learning, and graph-based algorithms, enhance the effectiveness of data cleaning. |

Table 2. Proposed solutions.

| Challenge | Proposed Solution | Description |
|---|---|---|
| Data Cleansing | Comprehensive data cleansing of existing records | Addressed hierarchy issues, business partner discrepancies, and inconsistencies with golden records by removing inactive accounts and unnecessary records. Started with the GSA team conducting analysis to identify and deactivate records with no recent activity. |
| Data Governance | Establishing a data governance process | Recommended a formal governance process to prevent future data issues, ensuring long-term data quality and consistency. |
| Consistency in Customer Master Data | Creation of a data handbook | Developed guidelines for structuring customer master data to ensure consistency and accuracy in future data management efforts, supporting clean reporting and reducing pricing and delivery issues. |
| Inactive and Unnecessary Accounts | Removal of inactive accounts and reduction of customer records | Removed 126,000 inactive accounts and further reduced records in Germany by categorizing records as “needed” or “unneeded”, significantly reducing unnecessary records and enhancing CRM efficiency. |
| Customer Hierarchy and Structure | Cleansing of hierarchical customer setup | Broke down customer trees into manageable parts, identified true TLAs (Top-Level Accounts), resolved orphan records, and created a structured TLA base. Updated each customer record with accurate hierarchy details, ensuring accurate reporting and pricing at all levels of the customer hierarchy. |
| Duplicate Records | Identification and consolidation of duplicates | Conducted a validation round to identify and merge duplicate records, creating a streamlined customer database. |
| Parent–Child Account Relationships | Validation and correction of linked “child” accounts | Verified and corrected parent account assignments to ensure accurate reporting and consistent pricing and terms for child accounts. |
| Data Handbook for Customer Setup | Development of a data handbook for customer setup guidelines | Provided guidelines on setting up customer data in the system, solving pricing and delivery issues caused by incorrect account hierarchies and structures. |
| Accurate Customer Tree Levels | Correction of levels within customer trees | Merged duplicate accounts, assigned customer categories, and added data points (e.g., beds for hospitals), enhancing tree structure for strategic planning and market differentiation. |
| Business Partner Alignment | Alignment of names, addresses, and customer information across systems | Enhanced readability, reduced duplicate account creation, and ensured consistency across systems to support various functions such as order management. |

Table 3. A framework for customer master data cleansing.
Key Activities(1) Problem Collection Activities(2) Master Data Cleaning Activities
Key Task: (1a) Problem Identification Process
Objective: To reach the point at which issues are repeatedly mentioned during brainstorming sessions.
Steps:
Conduct brainstorming sessions by gathering a diverse team for multiple sessions and encouraging open discussion and idea sharing.
Document issues by recording all mentioned issues without filtering and ensuring that all participants’ input is captured.
Identify repetition by reviewing the documented issues from each session and highlighting issues that are mentioned multiple times.
Analyze patterns by looking for common themes and recurring problems, and prioritize issues based on their frequency of mention.
Outcome: A clear identification of key problems, as indicated by their repeated mention during brainstorming sessions.

Key Task: (1b) Categorization into High-Level Topics
Objective: To categorize all identified problems into high-level topics based on their sources.
Steps:
Collect all problems identified in the previous problem identification process.
Determine high-level categories by establishing categories based on the sources of the problems and ensuring that the categories are comprehensive and cover all problem areas.
Categorize problems by sorting each problem into the appropriate high-level category and verifying that each problem is placed accurately according to its source.
Outcome: A structured categorization of all problems into a specified number of high-level topics, ensuring clarity and focus on the sources of each issue.

Key Task: (1c) Setting Problem Priorities
Objective: To prioritize problems based on the urgency of issues related to uncleanliness.
Steps:
Identify criteria for prioritization.
Identify the most urgent problems caused by uncleanliness.
Evaluate the impact of each problem, such as its effect on reporting.
Outcome: A prioritized list of problems, with the most urgent issues related to uncleanliness and their impacts on key areas like reporting given top priority.

Key Task: (1d) Division of Problems into Categories
Objective: To separate problems into technical and non-technical categories for targeted resolution.
Steps:
Identify problem categories by reviewing the list of identified problems.
Classify problems by dividing them into non-technical issues and technical issues.
Assign responsibility by routing technical issues to the appropriate technical teams and non-technical issues to the relevant non-technical teams.
Outcome: A clear division of problems into technical and non-technical categories, ensuring that each type of issue is addressed by the appropriate teams using the right approaches.

Key Task: (2a) Setting Up a Customer Base Overview
Objective: To create a comprehensive overview of the customer base with all relevant data for a functional working file.
Steps:
Collect customer data by gathering all relevant data pertaining to the customer base.
Organize data by structuring the data in a clear and systematic manner.
Create a working file by compiling the organized data into a single, functional file.
Outcome: A complete and structured overview of the customer base, contained in a working file with all relevant data readily accessible.

Key Task: (2b) Deactivating Unnecessary Customers
Objective: To reduce the customer base by deactivating unnecessary customers.
Steps:
Identify unnecessary customers by reviewing the customer base and identifying customers that are no longer needed.
Deactivate the identified customers in the active database.
Update the customer base by ensuring that it reflects only the necessary and active customers.
Outcome: A streamlined and reduced customer base, focusing only on active and necessary customers.

Key Task: (2c) Reassigning Top Customers
Objective: To ensure the accuracy of top customer assignments by reassigning incorrectly designated top customers.
Steps:
Review current top customers by analyzing the existing list of top customers and identifying inaccuracies.
Identify incorrect assignments by determining which customers have been incorrectly labeled as top customers.
Reassign customers correctly, ensuring that the top customer list reflects the actual top customers.
Outcome: A correct and updated list of top customers, accurately representing the most important clients.

Key Task: (2d) Assigning Sub-Customers to Top Customers
Objective: To accurately assign sub-customers to their corresponding top customers, creating initial customer trees.
Steps:
Identify sub-customers by listing all sub-customers needing assignment.
Determine correct top customers by identifying the appropriate top customer for each sub-customer.
Assign sub-customers by linking each sub-customer to its correct top customer.
Create customer trees by establishing initial customer trees, with sub-customers organized under their respective top customers.
Outcome: Accurately assigned sub-customers, forming structured customer trees under the correct top customers.

Key Task: (2e) Cleaning Up Customer Trees
Objective: To prioritize and clean up each customer tree based on customer importance.
Steps:
Prioritize customer trees by ranking them according to the importance of the top customers.
Review each customer tree by examining its structure and details.
Perform cleanup by removing inaccuracies, redundancies, and outdated information within each tree.
Outcome: A set of well-organized and accurate customer trees, prioritized by the importance of each customer.
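As a rough illustration of tasks (1a) to (1d) in the table above, the following Python sketch tallies how often issues recur across brainstorming sessions and then ranks and groups them. It is only a sketch under stated assumptions, not the procedure used at Medical Inc.: the issue names, the source topics, the numeric urgency scale, and the function names `recurring_issues` and `triage` are all invented for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    name: str
    source: str      # high-level topic, task (1b)
    urgency: int     # 1 = most urgent, task (1c); the scale is an assumption
    technical: bool  # technical vs. non-technical, task (1d)


def recurring_issues(sessions, min_mentions=2):
    """(1a) Keep issues raised in at least `min_mentions` sessions."""
    counts = Counter(issue for session in sessions for issue in set(session))
    return {issue: n for issue, n in counts.items() if n >= min_mentions}


def triage(problems, mentions):
    """(1b)-(1d) Rank by urgency, then by mention count, and group the result."""
    ranked = sorted(problems, key=lambda p: (p.urgency, -mentions[p.name]))
    by_source = {}
    for p in ranked:
        by_source.setdefault(p.source, []).append(p.name)
    technical = [p.name for p in ranked if p.technical]
    non_technical = [p.name for p in ranked if not p.technical]
    return by_source, technical, non_technical


# Invented example input: one list of raised issues per brainstorming session.
sessions = [
    ["duplicate customers", "wrong top customer", "slow report export"],
    ["duplicate customers", "missing interface field"],
    ["wrong top customer", "duplicate customers", "missing interface field"],
]
mentions = recurring_issues(sessions)

# Attributes the team would assign to each recurring issue (invented values).
problems = [
    Problem("duplicate customers", "data entry", 1, False),
    Problem("wrong top customer", "customer hierarchy", 1, False),
    Problem("missing interface field", "system integration", 2, True),
]
by_source, technical, non_technical = triage(problems, mentions)
print(mentions)
print(by_source, technical, non_technical)
```

Counting each issue once per session (via `set(session)`) reflects the idea that repetition across sessions, rather than within a single session, signals a key problem. The category, urgency, and technical flags remain human judgments; the code only records and sorts them.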
The process begins with problem identification, which entails recognizing the issues that recur across brainstorming sessions. The problems are then categorized into high-level topics, with each category aligned to the source of the problems it contains. Subsequently, priorities are set so that the most urgent issues are addressed first, particularly those stemming from uncleanliness, such as its impact on reporting. Problems are then classified as technical or non-technical so that different teams can develop tailored solutions. Non-technical issues, which are closely linked to uncleanliness, follow a separate path. This path involves establishing a comprehensive overview of the customer base, deactivating unnecessary customers, and correcting top customer assignments. Next, sub-customers are assigned to their correct top customers to form initial customer trees. Finally, each customer tree is systematically cleaned up, with trees prioritized according to the importance of their customers.
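The non-technical path described above (deactivating unnecessary customers, correcting top customer assignments, forming customer trees, and prioritizing them) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the company's implementation: the `Customer` fields, the use of revenue as the measure of customer importance, the sample records, and the corrected assignments are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    customer_id: str
    name: str
    revenue: float                   # assumed proxy for customer importance
    active: bool = True
    is_top: bool = False
    parent_id: Optional[str] = None  # top customer this record belongs to


def deactivate(customers, unnecessary_ids):
    """Flag unnecessary customers as inactive rather than deleting them."""
    for c in customers:
        if c.customer_id in unnecessary_ids:
            c.active = False


def build_trees(customers, assignments):
    """Apply corrected sub-customer -> top-customer links and group active records."""
    by_id = {c.customer_id: c for c in customers}
    for sub_id, top_id in assignments.items():
        by_id[sub_id].parent_id = top_id
        by_id[sub_id].is_top = False  # a sub-customer cannot also be a top customer
        by_id[top_id].is_top = True
    trees = defaultdict(list)
    for c in customers:
        if c.active and c.parent_id is not None and by_id[c.parent_id].active:
            trees[c.parent_id].append(c.customer_id)
    return dict(trees)


def prioritize(trees, customers):
    """Order trees for cleanup by total revenue of the top customer and its subs."""
    by_id = {c.customer_id: c for c in customers}

    def tree_revenue(top_id):
        return by_id[top_id].revenue + sum(by_id[s].revenue for s in trees[top_id])

    return sorted(trees, key=tree_revenue, reverse=True)


# Hypothetical working file: C3 is wrongly marked as a top customer, C2 hangs
# under the wrong parent, and C5 is no longer needed.
customers = [
    Customer("C1", "Hospital Group A", 500.0, is_top=True),
    Customer("C2", "Clinic A North", 120.0, parent_id="C3"),
    Customer("C3", "Clinic B", 80.0, is_top=True),
    Customer("C4", "Hospital Group B", 300.0, is_top=True),
    Customer("C5", "Closed Practice", 0.0),
]
deactivate(customers, {"C5"})
trees = build_trees(customers, {"C2": "C1", "C3": "C4"})
cleanup_order = prioritize(trees, customers)
print(trees)
print(cleanup_order)
```

Deactivation is modeled as a flag so that the record stays available for audit, which matches the table's wording of deactivating rather than deleting customers. Any other importance measure could replace revenue in `prioritize` without changing the rest of the sketch.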

Share and Cite

MDPI and ACS Style

Singh, J.; Gebauer, H. Clean Customer Master Data for Customer Analytics: A Neglected Element of Data Monetization. Digital 2024, 4, 1020-1039. https://doi.org/10.3390/digital4040051

