Article

Transparent and Privacy-Preserving Mobile Crowd-Sensing System with Truth Discovery

College of Computer and Information Science College of Software, Southwest University, Chongqing 400010, China
*
Author to whom correspondence should be addressed.
Sensors 2025, 25(7), 2294; https://doi.org/10.3390/s25072294
Submission received: 8 March 2025 / Revised: 30 March 2025 / Accepted: 3 April 2025 / Published: 4 April 2025
(This article belongs to the Special Issue Advanced Mobile Edge Computing in 5G Networks)

Abstract

The proliferation of numerous portable mobile devices has made mobile crowd-sensing (MCS) systems a promising new trend. Traditional MCS systems typically outsource sensing tasks to the data aggregator (e.g., cloud server). They collect and analyze the provided sensing data through an appropriate truth discovery (TD) method to identify valuable data sets. However, existing privacy-preserving MCS systems lack transparency, enabling data aggregators to deviate from the specified protocols and allowing malicious users to provide false or invalid sensing data, thereby contaminating the resulting data sets. The lack of transparency and public verifiability in MCS systems undermines widespread adoption by preventing data requesters from confidently verifying data integrity and accuracy. To address this issue, we propose a transparent and privacy-preserving mobile crowd-sensing system with truth discovery (TP-MCS) constructed using zero-knowledge proof (ZKP) and the Merkle commitment tree. This scheme enables data requesters to effectively verify the correctness of the truth discovery service while ensuring data privacy. Furthermore, theoretical analysis and extensive experiments demonstrate that this scheme is secure and efficient.

1. Introduction

With the rapid development of mobile devices and wireless technologies, the scale of data and computational complexity are increasing [1]. In this context, the widespread application of various mobile intelligent terminal devices and sensing technologies in numerous applications has promoted mobile crowd sensing (MCS) to become a popular data sensing and aggregation paradigm, fully leveraging collective intelligence. MCS systems have established a relatively mature framework in various domains, e.g., urban sensing [2], healthcare [3,4], and environmental monitoring [5]. Nevertheless, sensor quality, the surrounding environmental noise, and device mobility inevitably lead to differences and even contradictions between sensing data. As a new sensing paradigm, crowd sensing typically relies on a centralized crowd-sensing platform [6,7], which uses truth discovery to distribute sensing tasks to individuals or groups for collecting and analyzing sensing data from mobile devices.
In practice, when sensing users provide personal sensing data, there is a risk of privacy leakage. If the reliable information users provide is compromised and leaked, the service provider (SP) may abuse it, resulting in negative consequences and potential dangers for participating users [8]. It may even lead to user resistance to participating in sensing tasks. If users lack enthusiasm for participating in sensing tasks, the MCS system may fail to provide satisfactory data and service. Therefore, maintaining the privacy of users’ sensing data is of paramount importance. Existing research addresses privacy issues using methods such as homomorphic Paillier encryption [9], Shamir’s secret sharing technology [10], or the introduction of two non-colluding cloud servers [11]. They ensure secure data transmission by strictly encrypting the sensing data or weights during interaction. However, these existing schemes overlook the risk of malicious users submitting anomalous sensing data.
We observe that malicious users may inject random or invalid data within this task-outsourcing framework for unforeseen reasons. Such behavior not only increases the workload on the server side but also pollutes the aggregation, degrading system performance and computational accuracy and causing the final results to deviate from the truth. Current research has employed the blockchain-driven zero-knowledge range proof (ZKRP) technique [12] and an authentication-based approach [13] to improve the reliability of results derived from mixed sensing data. These methods aim to prevent anomalous data values or fake users. However, the scheme in [13] cannot detect abnormal data from legitimate users. In addition, these schemes are executed by the cloud server, so they cannot defend against tampering with aggregation results by the cloud server itself. The centralized architecture lacks operational transparency during the aggregation process [14,15]: the trust authority (TA) can neither verify the complete algorithm execution process nor refuse incorrect aggregation results, which reduces the credibility of MCS systems and prevents widespread adoption by cautious TAs. Therefore, realizing verification in existing MCS systems is an urgent problem.
To address the above problem, we propose a transparent and privacy-preserving mobile crowd-sensing system with truth discovery (TP-MCS). The scheme is based on zero-knowledge proof, ensuring that TA can efficiently verify the correctness of the truth discovery service while safeguarding data privacy. Meanwhile, we use the Merkle commitment tree data structure to improve efficiency, which makes the size of the transmitted data logarithmic. In addition, we analyze the scheme’s security properties, including privacy and transparency. Finally, we analyze the performance of the scheme and implement it, and the experimental results show that it is efficient.
Contribution. We summarize our main contributions below:
  • We propose an efficient dual-verification protocol based on ZKRP and the Merkle commitment tree. On the one hand, clients can independently verify whether their data are correctly included in the algorithmic execution through Merkle commitment tree paths, thereby ensuring data authenticity and participation. On the other hand, TA achieves batch filtering of anomalous data by verifying the Merkle commitment tree root node, guaranteeing that aggregation results exclusively originate from valid data provided by honest clients. This protocol effectively ensures the integrity and tamper-resistance of the data aggregation process, consequently enhancing the system’s robustness against outliers.
  • We propose a transparent and privacy-preserving mobile crowd-sensing system with truth discovery (TP-MCS). By innovatively restructuring the truth discovery framework and introducing the Merkle commitment tree range proof protocol (MTRP) and the inner-product argument weighted aggregation protocol (IAWAP) based on zero-knowledge range proof and zero-knowledge inner-product argument, our system enables comprehensive supervision and verification throughout the truth discovery process in MCS system. This approach maintains data privacy while achieving computational transparency and verifiability.
  • We conduct targeted security analysis addressing potential threat models in the MCS system, demonstrating that our proposed scheme’s security properties can effectively counter various security challenges. Furthermore, we simulate realistic MCS environments to implement our solution and conduct comprehensive evaluations across multiple dimensions, including accuracy, convergence, security properties, and performance overhead. The results conclusively show that our approach maintains original accuracy and convergence characteristics while providing robust defense capabilities. Through the combination of theoretical analysis and experimental validation, we fully substantiate the solution’s practicality and scalability in real-world scenarios.
Organization. The remaining sections of this paper are organized as follows: In Section 2, we describe the related work of the MCS system. Section 3 provides an overview of the preliminary preparations. In Section 4, we describe the problem formulation, including the system model, threat model, and security requirements. Section 5 details the workflow of the proposed transparent and privacy-preserving mobile crowd-sensing mechanism. Section 6 describes the security analysis and communication overhead of TP-MCS. Experimental evaluations are reported in Section 7, and the discussion and limitations are described in Section 8. Finally, the conclusions are summarized in Section 9.

2. Related Work

In this section, we review the research progress of MCS systems and clarify what this work inherits from existing results and where it innovates.
It is worth noting that not all data collected by users are valid. In fact, due to the diversity of data sources, there are significant differences in the quality of different data sources. If the raw data are directly used for processing, the final aggregated results will likely deviate significantly from the truth. Before the advent of the truth discovery method, MCS systems typically employed traditional approaches such as voting mechanisms or averaging calculations to resolve data conflicts [16,17]. However, these methods exhibit notable limitations, failing to account for the quality disparities among data sources adequately. Instead, they treat all data uniformly without differentiation, which significantly compromises the accuracy of the results. To address this challenge, the MCS system introduces a truth discovery scheme, which aims to improve the result accuracy through a more refined data processing mechanism. From the perspective of system design, the existing schemes mainly focus on the dimensions of privacy preservation and verifiability, which we explain next.
Privacy-preserving MCS scheme. In 2014, Li et al. [18] proposed a general framework called CRH, capable of resolving data conflicts and discovering truths, which could handle both discrete and continuous data. However, this framework did not take into account the protection of user data privacy. To address privacy problems in the truth discovery process, some researchers have developed solutions based on cryptographic techniques or perturbation-based methods to ensure the privacy protection of raw data. In 2015, Miao et al. [9] were the first to highlight the necessity of incorporating privacy protection into truth discovery schemes. They proposed the first privacy-preserving truth discovery protocol for MCS. The core idea of this scheme is the use of the threshold Paillier cryptosystem to securely aggregate users’ encrypted data, thereby safeguarding users’ sensitive information. However, the secure summation protocol requires frequent participation of users and servers in data encryption and decryption, imposing significant computational overhead on both entities. More critically, the scheme risks exposing the privacy of truth values during the iterative truth discovery process. To reduce computational and communication overhead, Miao et al. [11] proposed a lightweight privacy-preserving truth discovery algorithm building upon the previous scheme. This scheme introduces two non-colluding cloud servers, which process the data from data sources in the ciphertext domain using the Paillier homomorphic encryption algorithm and complete the computation of truths through the interaction between the two servers. In 2019, Xu et al. [10] designed a new lightweight truth discovery scheme that combines the Shamir secret sharing technique with multi-user key agreement to protect users’ sensitive information while resisting collusion between the cloud servers and users. In 2022, Gao et al. [19] proposed a location privacy-preserving truth discovery scheme, which similarly employs Paillier homomorphic encryption to safeguard users’ location privacy. In the same year, Sun et al. [20] introduced a contract-based personalized privacy-preserving scheme to perturb the aggregated results, allowing users to allocate privacy budgets for their data independently. Compared to traditional differential privacy, this approach offers a more flexible and equitable privacy protection mechanism. In 2024, Peng et al. [21] further advanced the field by proposing a scheme that utilizes additive secret sharing (ASS) to protect data privacy. This ASS-based scheme demonstrates superior computational and communication efficiency over previous approaches based on Shamir secret sharing.
Verifiability MCS scheme. As mentioned above, most existing research schemes primarily focus on user privacy protection while paying little attention to the critical issue of data validity. Notably, only a minimal number of research schemes have been designed to address both of these vital dimensions simultaneously. In 2019, Zhang et al. [13] proposed a novel scheme based on Paillier homomorphic encryption, one-way hash chains, and super-increasing sequences. This scheme ensures the validity of each user by verifying their identity, thereby preventing fake users from submitting illegal data. However, it fails to detect anomalous data from legitimate users. Duan et al. [12] introduced a blockchain-driven MCS scheme to address the detection of anomalous data from legitimate users. In this scheme, the cloud server employs ZKRP to filter out outliers, thereby ensuring the validity of the data submitted by users. Nonetheless, this approach requires verifying each data value individually, resulting in low verification efficiency and an inability to defend against the server’s malicious tampering of aggregated results.
Existing schemes exhibit the following limitations: Firstly, the verification process in these schemes entirely relies on the server’s validation of client proofs, making them vulnerable to attacks from malicious servers. Secondly, current schemes based on zero-knowledge proofs struggle to prevent malicious clients from deceiving the server with forged proofs. For instance, a client might generate a valid range proof using correct data but submit invalid data to the server, bypassing the verification mechanism. Thirdly, the existing schemes employ a one-by-one verification mechanism, leading to low verification efficiency and making them unsuitable for large-scale system requirements. Lastly, these schemes lack comprehensive transparency verification throughout the entire truth discovery process of MCS systems. Ideally, a trusted third party should be able to verify every stage of the system, not just the detection of anomalous data, thereby ensuring the overall trustworthiness and fairness of the system.
Our proposed TP-MCS scheme effectively addresses the issues above, as demonstrated in the following three aspects: Firstly, the scheme innovatively integrates ZKP with a Merkle tree structure to establish a dual-verification mechanism. This mechanism not only enables each client to independently verify the integrity of their data during the server’s algorithm execution but also leverages the efficient verification properties of the Merkle commitment tree, allowing a trusted third party to quickly confirm data integrity and consistency by verifying the root node. This collaborative verification mechanism between clients and the trusted third party effectively ensures the integrity and tamper-resistance of the server’s aggregated results. Secondly, the scheme innovatively adapts the truth discovery framework by reorganizing and integrating components, enabling the trusted third party to effectively oversee the entire truth discovery process in the MCS system through zero-knowledge range proof and zero-knowledge inner-product argument. Lastly, the scheme employs an encryption mechanism based on Pedersen commitments, whose hiding properties fully meet the requirements for data privacy protection. This system’s design advances research in federated learning regarding privacy protection and defense against malicious clients and servers, providing a more reliable solution for data processing and analysis in practical application scenarios.

3. Preliminaries

In this section, we describe truth discovery, cryptographic technology, Merkle tree, and Merkle commitment tree, which are the foundations of the proposed scheme.

3.1. Truth Discovery

In conventional MCS systems, the task publisher first submits the sensing task $Task$ to a cloud server platform, which then recruits suitable participants based on task requirements to collect high-quality sensing data. However, the contribution of sensing data collected in MCS systems is usually different, as the reliability of sensing data provided by different users may differ due to factors such as device variations. Previous research works have introduced the concept of truth discovery to obtain accurate information. Specifically, truth discovery starts by estimating the reliability $w_k$ of each user. Then, for each object $m$, it combines the weights $w_k$ of the different users with their sensing data $x_m^k$ to estimate the ground truth of that object. Based on a series of studies in this field [18,22,23], we summarize the general procedure as follows:
Weight update. In this step, the weight $w_k$ of each user $k \in [K]$ is computed from the given ground truths $\{x_m^*\}_{m \in [M]}$ of the objects, as follows:
$$w_k = \log\left(\frac{\sum_{k' \in [K]} \sum_{m \in [M]} d\left(x_m^{k'}, x_m^*\right)}{\sum_{m \in [M]} d\left(x_m^k, x_m^*\right)}\right) \tag{1}$$
This function measures the disparity between the data provided by each user and the ground truth, assigning higher weights to users whose data are closer to the ground truth. For continuous data, the distance function is $d(x_m^k, x_m^*) = (x_m^k - x_m^*)^2$. For categorical data, we use the vector $x_m^k = (0, \ldots, 1_{(q\text{-th})}, \ldots, 0)$ to indicate that user $k$ selects the $q$-th option for object $m$, and the distance function is $d(x_m^k, x_m^*) = (x_m^k - x_m^*)^T (x_m^k - x_m^*)$. Therefore, over all objects, the distance of user $k$ is $s_k = \sum_{m \in [M]} d(x_m^k, x_m^*)$.
Truth update. In this step, based on each user’s weight $w_k$ and sensing data $x_m^k$, we derive the ground truth for each object $m$ as follows:
$$x_m^* = \frac{\sum_{k \in [K]} w_k \cdot x_m^k}{\sum_{k \in [K]} w_k} \tag{2}$$
Under this rule, users with a higher weight $w_k$ contribute more to the ground truth $x_m^*$. This mechanism ensures that $x_m^*$ leans toward the data of more reliable users.
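The two update steps can be sketched as a short iteration. The following is a minimal illustration for continuous data only, not the authors' implementation: the function name `truth_discovery`, the initialisation by plain averaging, the fixed iteration count, and the small constant `eps` are all assumptions introduced here.

```python
import math

def truth_discovery(data, iterations=10):
    """data[k][m] is user k's continuous reading for object m (hypothetical layout)."""
    K, M = len(data), len(data[0])
    # Assumption: start from the plain average of the readings.
    truths = [sum(data[k][m] for k in range(K)) / K for m in range(M)]
    weights = [1.0] * K
    eps = 1e-12  # assumption: guards against log(0) and division by zero
    for _ in range(iterations):
        # Weight update (Equation (1)): s_k is user k's total squared distance.
        s = [sum((data[k][m] - truths[m]) ** 2 for m in range(M)) for k in range(K)]
        total = sum(s)
        weights = [math.log((total + eps) / (s_k + eps)) for s_k in s]
        w_sum = sum(weights)
        if w_sum == 0:  # all users agree exactly; nothing left to update
            break
        # Truth update (Equation (2)): weighted average of the readings.
        truths = [sum(weights[k] * data[k][m] for k in range(K)) / w_sum
                  for m in range(M)]
    return truths, weights

# Two consistent users and one outlier: the outlier ends up with the lowest weight.
readings = [[10.0, 20.0], [10.2, 19.8], [30.0, 5.0]]
estimated, user_weights = truth_discovery(readings)
```

In this example the third user's readings are far from the other two, so its weight shrinks across iterations and the estimates settle close to the readings of the first two users.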

3.2. Cryptographic Technology

Commitment scheme. A commitment scheme [24] $\mathsf{Com}$ consists of two algorithms $(\mathsf{Com.Setup}, \mathsf{Com.Commit})$. The setup algorithm $\mathsf{Com.Setup}(1^\lambda)$ takes the security parameter $1^\lambda$ as input and outputs a commitment parameter $p_c$. The commitment algorithm $\mathsf{Com.Commit}(p_c, x, r)$ takes $p_c$, a value $x$, and a randomness $r$ as input and outputs a commitment $c$. Finally, the scheme is (additively) homomorphic: for all $x_0, x_1 \in \mathcal{M}_c$ and $r_0, r_1 \in \mathcal{R}_c$, the following holds:
$$\mathsf{Com.Commit}(p_c, x_0, r_0) + \mathsf{Com.Commit}(p_c, x_1, r_1) = \mathsf{Com.Commit}(p_c, x_0 + x_1, r_0 + r_1) \tag{3}$$
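The homomorphic property can be checked numerically with a toy Pedersen-style commitment. This is a sketch under stated assumptions rather than the construction used in the paper: the tiny group parameters `P`, `Q`, `G`, `H` and the helper names `commit` and `add_commitments` are chosen here for illustration and offer no security. In a multiplicative group, the "+" of Equation (3) is realised as multiplication of commitments.

```python
# Toy parameters (assumption, insecure): P = 2Q + 1 with P and Q prime,
# G and H are squares mod P and therefore generate the subgroup of order Q.
P, Q = 2039, 1019
G, H = 4, 9

def commit(x, r):
    """Pedersen-style commitment g^x * h^r mod p to value x with randomness r."""
    return (pow(G, x % Q, P) * pow(H, r % Q, P)) % P

def add_commitments(c0, c1):
    """Homomorphic 'addition' of two commitments (a group multiplication)."""
    return (c0 * c1) % P

c0 = commit(5, 11)
c1 = commit(7, 13)
# Combining the commitments yields a commitment to the sums of values and randomness.
combined = add_commitments(c0, c1)
same = combined == commit(5 + 7, 11 + 13)  # True
```

A real deployment would use a large group in which the discrete logarithm of `H` with respect to `G` is unknown; in this toy group it can be found by brute force.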
Non-interactive zero-knowledge proof scheme. A non-interactive zero-knowledge proof scheme [25] consists of three algorithms $(\mathsf{NIZK.Setup}, \mathsf{NIZK.Prove}, \mathsf{NIZK.Verify})$. The setup algorithm $\mathsf{NIZK.Setup}(1^\lambda)$ takes the security parameter $1^\lambda$ as input and outputs a system parameter $p_{zk}$. The prove algorithm $\mathsf{NIZK.Prove}(p_{zk}, sm, w)$ takes $p_{zk}$ and a statement-witness pair $(sm, w)$ as input and outputs a proof $\pi$. The verify algorithm $\mathsf{NIZK.Verify}(p_{zk}, sm, \pi)$ takes $p_{zk}$, a statement $sm$, and a proof $\pi$ as input and outputs TRUE or FALSE.
Pseudo-random functions. A pseudo-random function consists of two algorithms $(\mathsf{PRF.KGen}, \mathsf{PRF.Eval})$ [26]. $\mathsf{PRF.KGen}(1^\lambda)$ takes the security parameter $1^\lambda$ as input and outputs a random key $k$. $\mathsf{PRF.Eval}(k, x)$ takes $k$ and a message $x$ as input and outputs a pseudo-random number $r$.
Zero-knowledge range proof. A range proof shows that a commitment $c_r$ opens to a value within a specific range without revealing the value. To achieve this objective, we establish a total ordering on the message space $\mathcal{M}_c$ and define the range proof for the relation $\mathcal{R}_r$ [27]:
$$\mathcal{R}_r : \left(p_{cr}, (c_r, R), (x, r)\right) \in \mathcal{R}_r \iff c_r = \mathsf{Com.Commit}(p_{cr}, x, r) \wedge x \in [0, R] \tag{4}$$
Zero-knowledge inner-product argument. In the inner-product argument, the prover relies on the discrete logarithm assumption and uses group exponentiations to conceal the specific values of the vectors $\mathbf{a}$ and $\mathbf{b}$. Through a series of computations, the prover convinces the verifier that the inner product of $\mathbf{a}$ and $\mathbf{b}$ equals a public scalar $c$. The inner-product argument is an efficient proof system for the relation $\mathcal{R}_i$ [28]:
$$\mathcal{R}_i : \left(p_{ci}, (c_i, c), (\mathbf{a}, \mathbf{b}, r)\right) \in \mathcal{R}_i \iff c_i = \mathsf{Com.Commit}(p_{ci}, \mathbf{a}, \mathbf{b}, r) \wedge c = \langle \mathbf{a}, \mathbf{b} \rangle \tag{5}$$

3.3. Merkle Tree

The Merkle tree [29] is a binary tree that supports fast data verification and retrieval. In a Merkle tree, each leaf node $j$ stores the hash value $h_j$ of the corresponding data block. For an intermediate (non-leaf) node $v$, the value is computed as $h_v = H(h_{left(v)} \,\|\, h_{right(v)})$, where $H$ denotes a cryptographic hash function and $left(v)$ and $right(v)$ represent the left and right child nodes of $v$, respectively. If a child node does not exist, its hash value defaults to 0. Through pairwise hashing and iterative aggregation layer by layer, this structure generates a unique root hash $root_h$, which is the digest of the entire data set. This mechanism guarantees data integrity: any modification to the original data alters the hash values along the corresponding path and therefore changes the root hash. Consequently, the Merkle tree enables efficient tamper detection.
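The construction and the tamper check described above can be sketched with Python's standard `hashlib`. This is a minimal illustration, not the paper's implementation: SHA-256 as the hash function, the all-zero 32-byte `ZERO` value standing in for the "defaults to 0" rule, and the helper names `build_levels`, `inclusion_proof`, and `verify` are assumptions.

```python
import hashlib

ZERO = b"\x00" * 32  # assumption: placeholder hash for a missing child

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Return all tree levels, from the leaf hashes up to the single root hash."""
    level = [H(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        padded = level + [ZERO] if len(level) % 2 else level
        level = [H(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels

def inclusion_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs on the path to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        sib_hash = level[sibling] if sibling < len(level) else ZERO
        path.append((sib_hash, sibling < index))
        index //= 2
    return path

def verify(block, path, root):
    """Recompute the root from one block and its path, then compare."""
    h = H(block)
    for sib_hash, sib_is_left in path:
        h = H(sib_hash + h) if sib_is_left else H(h + sib_hash)
    return h == root

blocks = [b"reading-1", b"reading-2", b"reading-3"]
levels = build_levels(blocks)
root = levels[-1][0]
ok = verify(blocks[2], inclusion_proof(levels, 2), root)          # True
tampered = verify(b"forged", inclusion_proof(levels, 2), root)    # False
```

The proof for one block contains one sibling hash per tree level, which is why its size grows logarithmically with the number of blocks.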

3.4. Merkle Commitment Tree

In TP-MCS, we introduce a Merkle commitment tree [30] structure with additively homomorphic properties to enable the TA to verify zero-knowledge range proofs efficiently. Specifically, we fill the leaf nodes with commitment values, where the commitment stored in the $j$-th leaf node is $c_j = \mathsf{Com.Commit}(p_c, x_j, r_j)$. Based on the additive homomorphism of the commitment scheme, the commitment of each internal node $v$ is the sum of the commitments of its left and right child nodes, i.e., $c_v = c_{left(v)} + c_{right(v)}$ (see Figure 1). In this way, the root node value $root_c$ contains information from all the nodes below it in the Merkle commitment tree. The prover can construct an inclusion proof for the target data item $j$ by providing the root node value $root_c$, the commitment of the $j$-th leaf node, and the commitments of the intermediate nodes needed along the path from that leaf node to the root node. The verifier can confirm whether the target data item $j$ is contained in the tree by computing the commitments layer by layer and comparing the computed root commitment with the root commitment provided by the prover. This process verifies the integrity and consistency of the data.
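A small sketch can make the layer-by-layer aggregation concrete. It reuses a toy Pedersen-style commitment over an insecure demonstration group; the parameters `P`, `Q`, `G`, `H`, the padding value `IDENTITY`, and the helper names are assumptions for illustration, and the additive notation $c_{left(v)} + c_{right(v)}$ becomes a modular multiplication here.

```python
# Toy group (assumption, insecure): P = 2Q + 1, G and H of prime order Q.
P, Q = 2039, 1019
G, H = 4, 9
IDENTITY = 1  # commitment to (0, 0), used to pad an odd-sized level

def commit(x, r):
    return (pow(G, x % Q, P) * pow(H, r % Q, P)) % P

def build_commitment_tree(leaves):
    """Each internal node is the homomorphic combination of its two children."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [IDENTITY]
        levels.append([(cur[i] * cur[i + 1]) % P for i in range(0, len(cur), 2)])
    return levels

def inclusion_path(levels, index):
    """Sibling commitments on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        sib = index ^ 1
        path.append(level[sib] if sib < len(level) else IDENTITY)
        index //= 2
    return path

def verify_inclusion(leaf, path, root):
    """Recompute the root commitment layer by layer and compare."""
    c = leaf
    for sib in path:
        c = (c * sib) % P
    return c == root

values, rands = [3, 5, 7], [10, 20, 30]
leaves = [commit(x, r) for x, r in zip(values, rands)]
levels = build_commitment_tree(leaves)
root_c = levels[-1][0]
# The root is a commitment to the sum of all values under the summed randomness.
root_matches_sum = root_c == commit(sum(values), sum(rands))
user_ok = verify_inclusion(leaves[1], inclusion_path(levels, 1), root_c)
```

Because the root commits to the sum of all leaf values, a single range proof on the root can speak about the aggregate, while each user only needs a logarithmic number of sibling commitments to recompute the root.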

4. Problem Formulation

We first introduce the notations used in the scheme and describe the system model. Next, we present the threat model and design goals.

4.1. Notations

In this work, we use the notation $[K]$ to denote $\{1, 2, \ldots, K\}$ for $K \in \mathbb{N}$. We refer to the observation entities of the sensing tasks as objects, to each user’s reliability as the weight, and to the truth of each object as the ground truth. A summary of the other notations used in this paper can be found in Table 1.

4.2. Design Goals

  • Transparency: For the entire process of the truth discovery, TA can verify the correctness of each step. In other words, SP cannot convince TA to accept incorrect, incomplete, or manipulated data that lead to erroneous results.
  • Privacy: Users’ privacy data will remain undisclosed to other users, ensuring their confidentiality throughout information transmission, storage, and processing.
  • Verifiability: SP cannot tamper with data without detection, as the TA and users can verify the received results for tampering through query requests. This capability ensures the consistency and trustworthiness of information.
  • Efficiency: Computational costs and communication overhead should be optimal while supporting many users.

4.3. System Model

We describe our system model in Figure 2, which consists of four main entities: trust authority, service provider, users, and public bulletin board. The definitions of related entities are provided below.
Trust authority (TA): TA motivates users to participate in sensing tasks by requesting through SP. Once operational, TA can efficiently verify the correctness of truth discovery.
Service provider (SP): SP can collaborate with users to achieve high-quality truth discovery and generate proofs through ZKP technology.
Users: Each user k collects sensing data through mobile devices and participates in partial verification processes.
Public bulletin board (PBB): SP regularly publishes digests to an immutable PBB [30], such as utilizing a public blockchain. Once information is posted on the PBB, it becomes tamper-proof, ensuring the security and traceability of the information.

4.4. Threat Model

We consider the TA completely trustworthy, as it does not collude with any participating parties (users or SP), just like in previous MCS systems [31,32,33]. We consider two types of threats. The first is an adversarial server, which attempts to manipulate data or algorithm execution to produce false aggregation results. The second comprises semi-adversarial and adversarial users. Semi-adversarial users are honest but curious: they faithfully follow protocols but attempt to learn other users’ private information. Adversarial users may submit anomalous data or deviate from protocol execution.
Then, we assume that a limited set of $f$ semi-honest users may collude with SP in a potential attack. In this case, the remaining $K - f$ values will be decrypted only if $K - f < 2$. In other words, the attack is defeated as long as at least two clients are honest, i.e., $K - f \geq 2$.
Next, we assume all adversaries are probabilistic polynomial-time (PPT) entities. Under this assumption, they can only break the security of cryptographic primitives—such as discrete logarithm-based commitment schemes and zero-knowledge proofs—with a negligible probability in polynomial time.
Finally, we assume the existence of a PBB, such as a blockchain, which ensures that the stored messages are immutable.

5. TP-MCS

In this section, we detail the main steps in designing TP-MCS. Our TP-MCS focuses on two main aspects: (1) We modify the algorithm to reduce verification complexity, making it more efficient and feasible. (2) We develop two protocols: the Merkle commitment tree range proof protocol (MTRP) and the inner-product argument weighted aggregation protocol (IAWAP). These protocols ensure that TA and users can cooperate to supervise each computational step, protecting data privacy.

5.1. Truth Discovery Mechanism

The original truth discovery mechanism includes division operations, as shown in Equation (2). In ZKP, division operations typically require calculating inverses, thereby increasing computational overhead. Generally, we strive to prevent the need to verify division operations whenever possible. For ease of verification, we adjust the weight update formula as follows:
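$$u_k = \log\left(\frac{\sum_{k' \in [K]} \sum_{m \in [M]} d\left(x_m^{k'}, x_m^*\right)}{\sum_{m \in [M]} d\left(x_m^k, x_m^*\right)}\right) \tag{6}$$
$$w_k = \frac{u_k}{\sum_{k \in [K]} u_k} \tag{7}$$
Then, the truth update formula can be modified as follows:
$$x_m^* = \sum_{k \in [K]} w_k \cdot x_m^k \tag{8}$$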
We can treat Equations (6) and (7) as a black box function $f(\cdot)$. Users directly use $f(\cdot)$ to compute their private $w_k$, allowing subsequent matching verification with the commitments declared by SP. As a result, the truth update in Equation (8) no longer contains the plaintext division of the original formula.
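Since the weights in Equation (7) sum to one, the division-free update in Equation (8) yields the same ground truth as the original rule in Equation (2). The following sketch checks this numerically for continuous data; the function names and the sample readings are hypothetical and not taken from the paper.

```python
import math

def distances(data, truths):
    """s_k: total squared distance of each user's readings from the truths."""
    return [sum((x - t) ** 2 for x, t in zip(row, truths)) for row in data]

def original_update(data, truths):
    """Equations (1) and (2): unnormalised weights, division in the truth update."""
    s = distances(data, truths)
    total = sum(s)
    w = [math.log(total / s_k) for s_k in s]
    w_sum = sum(w)
    return [sum(w[k] * data[k][m] for k in range(len(data))) / w_sum
            for m in range(len(truths))]

def modified_update(data, truths):
    """Equations (6)-(8): weights normalised first, truth update is a plain sum."""
    s = distances(data, truths)
    total = sum(s)
    u = [math.log(total / s_k) for s_k in s]      # Equation (6)
    u_sum = sum(u)
    w = [u_k / u_sum for u_k in u]                # Equation (7)
    new_truths = [sum(w[k] * data[k][m] for k in range(len(data)))
                  for m in range(len(truths))]    # Equation (8)
    return w, new_truths

readings = [[10.0, 20.0], [10.2, 19.8], [30.0, 5.0]]
current_truths = [16.0, 15.0]
w, truths_modified = modified_update(readings, current_truths)
truths_original = original_update(readings, current_truths)
```

Both functions return the same truths up to floating-point rounding; the difference is only where the division happens, which matters once the truth update has to be proven inside a circuit.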
In addition, when processing TP-MCS parameters, we observe that ZKP arithmetic circuits support only integer operations, because ZKP constructions operate over finite fields. Consequently, floating-point numbers must be mapped to the integer domain. To address this, we adopt a fixed-point quantization approach inspired by prior work [34]: we define a scaling factor $L = 2^{16}$ and convert a floating-point value $x$ to an integer $\bar{x}$ via $\bar{x} = \lfloor x \cdot L \rfloor$. This inevitably introduces truncation errors, so we briefly analyze the error bound. We define the quantization error as $\epsilon = x \cdot L - \lfloor x \cdot L \rfloor \in [0, 1)$. Since the converted integer value is $\bar{x}$, its restored value is $\hat{x} = \bar{x} / L$. Thus, the absolute error satisfies $|x - \hat{x}| = \epsilon / L < 1/L$. Consequently, when $L = 2^{16}$, the error lies in $[0, 2^{-16})$, which meets the precision requirements of most practical application scenarios. Applications that require higher precision can achieve finer quantization by adjusting the value of $L$.
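The quantization step and its error bound can be checked with a few lines of Python. This is a minimal sketch; the helper names `quantize` and `restore` and the sample values are assumptions for illustration.

```python
import math

L = 2 ** 16  # scaling factor from the text

def quantize(x: float) -> int:
    """Map a floating-point value to an integer by scaling and truncating."""
    return math.floor(x * L)

def restore(x_int: int) -> float:
    """Map a quantized integer back to the floating-point domain."""
    return x_int / L

samples = [0.1, 3.14159, 36.6, 0.0]
# Each restoration error lies in [0, 1/L), i.e. below 2**-16.
errors = [x - restore(quantize(x)) for x in samples]
within_bound = all(0 <= e < 1 / L for e in errors)
```

Raising `L` to a larger power of two tightens the bound at the cost of larger integers inside the proof system.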
Next, we will combine Figure 2 to explain the steps involved in the truth discovery of our scheme. For ease of description, we still represent the transformed data as x.
Step 1. TA $\xrightarrow{Task}$ SP: TA posts the sensing task to SP, providing information about the task objectives, specific requirements, and other details.
Step 2. SP $\xrightarrow{Task}$ Users: SP recruits $K$ users to complete the sensing task.
Step 3. Users $\xrightarrow{s_k}$ SP: Each user $k$ computes the distance $s_k = \sum_{m \in [M]} d(x_m^k, x_m^*)$ between the sensing data $x_m^k$ and the ground truth $x_m^*$. Subsequently, users transmit $s_k$ to SP.
Step 4. SP $\xrightarrow{Evi}$ PBB: SP calculates $Sum_{s_k} = \sum_{k \in [K]} s_k$. Sensing data outside the specified range affect the reliability of the ground truth, so we assume the sensing data are limited to a specific range $x_m \in [0, \delta]$. Since $s_k$ is calculated from the difference between each user’s sensing data and the publicly available ground truth $x_m^*$, we assume $s_k \in [0, \delta_{dist}]$ with $\delta_{dist} = (\delta - x_m^*)^2$, and $Sum_{s_k} \in [\delta_{dist}, \gamma_{dist}]$, where $\gamma_{dist} = \delta_{dist} \cdot K - 1$. Here, we establish a universal and efficient verification protocol, MTRP, to exclude abnormal sensing data and verify the correctness of the aggregation process. SP executes MTRP to generate evidence $Evi$, i.e., commitments and range proofs, and posts it on the PBB. This evidence is used for subsequent verification.
Step 5. SP $\xrightarrow{Sum_{s_k},\, Sum_{w_k}}$ Users: SP calculates the weights of all users through the function $f(\cdot)$ for weight updating, and the results, $Sum_{s_k}$ and $Sum_{w_k}$, are sent to the users to update their weights.
Step 6. SP $\xrightarrow{Evi',\, \{x_m^*\}_{m \in [M]}}$ PBB: SP performs weighted aggregation to complete the truth update. The IAWAP protocol verifies the correct execution of Equation (8) based on the inner-product argument, preventing malicious behavior by SP. Next, SP posts the evidence $Evi'$ and $\{x_m^*\}_{m \in [M]}$ to the PBB.
Step 7. SP $\xrightarrow{c_k,\, c_i,\, \pi'_k}$ Users: SP sends the commitments and the inclusion proof to each user.
Step 8. SP $\xrightarrow{\{x_m^*\}_{m \in [M]}}$ TA: SP sends the ground truths to TA.
Step 9. TA: TA obtains the digests from the PBB at any time and verifies the evidence $Evi$ and $Evi'$ to ensure the correctness of the truth discovery process.
Step 10. Users: Each user k verifies the consistency of the commitments by computing the commitments based on their private data and comparing them with the commitments declared by SP. In addition, users verify the validity of the inclusion proof provided by the SP by reconstructing the root commitment of the Merkle commitment tree.
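Stripped of all cryptographic machinery, Steps 3 to 6 amount to one round of a CRH-style truth discovery loop. The following is a minimal sketch under stated assumptions: the distance $d$ is taken to be the squared difference, and the weight function $f(\cdot)$ is taken to be the log-ratio rule commonly used with CRH. Neither choice is confirmed by this section, and the function name `truth_discovery_round` is hypothetical.

```python
import math


def truth_discovery_round(data, truths):
    """One plaintext round; data[k][m] is user k's reading for object m."""
    # Step 3: each user k computes s_k = sum_m d(x_m^k, x_m^*).
    # Assumption: d is the squared difference.
    s = [sum((x - t) ** 2 for x, t in zip(row, truths)) for row in data]

    # Step 4: SP aggregates the distances into Sum_s.
    sum_s = sum(s)

    # Step 5: weight update. Assumption: log-ratio rule w_k = log(Sum_s / s_k).
    eps = 1e-12  # guards against division by zero for a perfect match
    w = [math.log((sum_s + eps) / (s_k + eps)) for s_k in s]
    sum_w = sum(w)
    if sum_w == 0:
        # All users agree with the current truths; nothing to update.
        return s, w, list(truths)

    # Step 6: weighted aggregation yields the updated truth of each object.
    new_truths = [
        sum(w_k * row[m] for w_k, row in zip(w, data)) / sum_w
        for m in range(len(truths))
    ]
    return s, w, new_truths


# Two consistent users and one outlier, starting from the per-object mean.
data = [[10.0, 20.0], [10.2, 19.8], [15.0, 30.0]]
initial = [sum(col) / len(col) for col in zip(*data)]
s, w, updated = truth_discovery_round(data, initial)
print(w)        # the outlier receives the smallest weight
print(updated)  # the estimate moves toward the two consistent users
```

In TP-MCS the same quantities are never revealed in the clear to the verifier; MTRP and IAWAP replace the plain sums above with commitments and proofs.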

5.2. MTRP

We describe the specific construction of MTRP in this section. We use commitment schemes and zero-knowledge range proofs to verify the integrity of data aggregation results without compromising user data. Our verification idea is that SP publishes the commitments and proofs on the PBB. Since the PBB is immutable, SP cannot submit the commitments of the original data to the users while simultaneously using the commitments of abnormal data for aggregation. Users execute queries using their private data to verify the consistency of the commitments of the nodes. At the same time, users verify the inclusion proof to prove that their data are included in the tree. TA excludes abnormal sensing data and verifies the correctness of the aggregation process through $root_c$, proofs and the commitments of the leaf nodes, thereby maintaining the system’s robustness. The MTRP consists of the four algorithms as follows:
  • $(ptp_r, k_r) \leftarrow \mathsf{Initialize}(1^\lambda)$: run by SP. The algorithm initializes the secret key $k_r := \mathsf{PRF.KGen}(1^\lambda)$ and initializes the commitment and non-interactive zero-knowledge parameters as $pc_r \leftarrow \mathsf{COM.Setup}(1^\lambda)$ and $pzk_r \leftarrow \mathsf{NIZK.Setup}(1^\lambda)$. SP publishes the system parameter $ptp_r = (pc_r, pzk_r, \delta_{dist}, \gamma_{dist})$ and keeps $k_r$ private.
  • $(r_k)_{k \in [K]} \leftarrow \mathsf{SecretGen}(ptp_r, k_r, k, t)$: run by SP. The algorithm computes the randomness $r_k := \mathsf{PRF.Eval}(k_r, k \,\|\, t)$ for each user $k$, where $t$ is the iteration number.
  • $Evi \leftarrow \mathsf{EviGen}(ptp_r, (s_k, r_k)_{k \in [K]})$: run by SP. The algorithm computes the commitment of each leaf node $c_k \leftarrow \mathsf{COM.Commit}(pc_r, s_k, r_k)$ and the range proof $\pi_k \leftarrow \mathsf{NIZK.Prove}(pzk_r, sm, w)$ for the statement $sm = (c_k, \delta_{dist})$ and the witness $w = (s_k, r_k)$. Then, SP computes the commitment of $Sum_{s_k}$ as $c^* \leftarrow \mathsf{COM.Commit}(pc_r, Sum_{s_k}, r^*)$ and the proof $\pi^* \leftarrow \mathsf{NIZK.Prove}(pzk_r, sm, w)$ for $sm = (c^*, \gamma_{dist})$ and $w = (Sum_{s_k}, r^*)$, where $r^* = \sum_{k \in [K]} r_k$. In addition, the server generates the inclusion proof $\pi_k'$ for each user. In summary, SP shares the evidence $Evi = (c^*, \pi^*, (c_k, \pi_k)_{k \in [K]}, (\pi_k')_{k \in [K]})$ with TA and all users. Subsequently, SP publishes a hash $h = \mathsf{Hash}(Evi)$ on the public PBB.
  • $\{0, 1\} \leftarrow \mathsf{VerifyEvidence}(ptp_r, (s_k, r_k)_{k \in [K]}, Evi)$: run by TA and users. This verification algorithm returns TRUE if all five of the following subroutines return TRUE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyConsistency}(Evi, h)$: TA computes $\bar{h} = \mathsf{Hash}(Evi)$. If $\bar{h} = h$, then it outputs TRUE; otherwise, it outputs FALSE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyCommitment}(pc_r, s_k, r_k, c_k)$: each user computes $\bar{c}_k := \mathsf{COM.Commit}(pc_r, s_k, r_k)$ from their private value $s_k$ and checks that it matches the leaf node committed by SP. If $\bar{c}_k = c_k$, then it outputs TRUE; otherwise, it outputs FALSE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyInclusionProof}(pc_r, \pi_k')$: each user verifies the validity of the inclusion proof; i.e., the commitment value of each intermediate node $v$ on the proof path is indeed calculated from the commitments of $v$’s children. It outputs TRUE if the proof is valid; otherwise, it outputs FALSE.
    • $\{0, 1\} \leftarrow \mathsf{VerifySum}(pzk_r, \pi^*, c^*, (c_k)_{k \in [K]})$: TA computes the root commitment $\bar{c}$ based on $(c_k)_{k \in [K]}$. TA verifies whether these three conditions are met: $\bar{c} = c^*$, $c^*$ matches $\pi^*$, and $\mathsf{NIZK.Verify}(sm, \pi^*)$ returns true for the statement $sm = (c^*, \gamma_{dist})$. If all these conditions hold, TA outputs TRUE; otherwise, it outputs FALSE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyRangeProof}(pzk_r, (c_k, \pi_k)_{k \in [K]})$: TA checks that $c_k$ matches the $\pi_k$ stored in the leaf nodes and verifies that $\mathsf{NIZK.Verify}(sm, \pi_k)$ is true for the statement $sm = (c_k, \delta_{dist})$. If both conditions hold, it outputs TRUE; otherwise, it outputs FALSE.
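The commitment side of MTRP can be sketched as follows, leaving out the zero-knowledge range proofs entirely. This is an illustration under stated assumptions rather than the authors' implementation: it uses a toy Pedersen commitment in a small prime-order subgroup instead of secp256k1, instantiates the PRF with HMAC-SHA256, assumes each parent node is the homomorphic combination (here, the product) of its two children, and assumes $K$ is a power of two. All function names are hypothetical.

```python
import hashlib
import hmac

# Toy group, for illustration only: P = 2Q + 1 is a safe prime, and G, H are
# squares mod P, so both generate the subgroup of prime order Q. A real
# deployment would use an elliptic-curve group and an H whose discrete log
# with respect to G is unknown to everyone.
P, Q, G, H = 2039, 1019, 4, 9


def prf(key: bytes, k: int, t: int) -> int:
    """Randomness r_k := PRF.Eval(key, k || t), instantiated with HMAC-SHA256."""
    msg = k.to_bytes(4, "big") + t.to_bytes(4, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big") % Q


def commit(value: int, rand: int) -> int:
    """Pedersen commitment G^value * H^rand mod P."""
    return (pow(G, value % Q, P) * pow(H, rand % Q, P)) % P


def build_tree(leaves):
    """Return all levels, leaves first; each parent is the product of its children."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([(prev[i] * prev[i + 1]) % P for i in range(0, len(prev), 2)])
    return levels


def inclusion_proof(levels, index):
    """Sibling commitments on the path from leaf `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root):
    """Recompute the root from a leaf commitment and its sibling path."""
    node = leaf
    for sibling in proof:
        node = (node * sibling) % P
    return node == root


key = b"demo-prf-key"                        # hypothetical SP secret key k_r
s = [3, 7, 2, 5]                             # users' distances s_k as small integers
r = [prf(key, k, 1) for k in range(len(s))]  # randomness for iteration t = 1
leaves = [commit(s_k, r_k) for s_k, r_k in zip(s, r)]
levels = build_tree(leaves)
root = levels[-1][0]

# VerifySum (commitment part): the root commits to Sum_s under r* = sum of r_k.
print(root == commit(sum(s), sum(r)))
# VerifyInclusionProof for user 2.
print(verify_inclusion(leaves[2], inclusion_proof(levels, 2), root))
```

Because the combination is commutative in this toy group, the verifier does not need to know whether a sibling is a left or right child; a hash-based Merkle tree would have to carry that position information.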

5.3. IAWAP

IAWAP follows the same verification idea as MTRP. Users calculate commitments based on their private weight $w_k$ and sensing data $x_m^k$. Then, users verify that their commitments are consistent with the commitments declared by the cloud server SP. Through the inner-product argument, the third-party TA ensures that SP performs the weighted aggregation step correctly, without any sensitive information being leaked.
We use the weight sequence $\mathbf{x} = [w_1, w_2, \ldots, w_K]$ and the sensing data sequence $\mathbf{y} = [x_m^1, x_m^2, \ldots, x_m^K]_{m \in [M]}$ to calculate the inner product of $\mathbf{x}$ and $\mathbf{y}$ to obtain an accurate weighted aggregation result.
$$x_m^* = \mathbf{x} \cdot \mathbf{y} = [w_1, w_2, \ldots, w_K] \cdot [x_m^1, x_m^2, \ldots, x_m^K]$$
For the convenience of subsequent verification, users apply a zero padding strategy to set all data points except for their data values to zero; i.e., the user modifies $x_w = w_k x_m^k$ as follows:
$$x_w = \mathbf{p}_i \cdot \mathbf{q}_i = [0, \ldots, w_k, \ldots, 0] \cdot [0, \ldots, x_m^k, \ldots, 0]$$
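The zero padding strategy can be checked with a few lines of arithmetic. The sketch below uses made-up integer weights and readings for a single object; the helper names `pad` and `inner` are hypothetical. It shows that each user's padded inner product equals that user's own term, and that the terms add up to the full inner product.

```python
def pad(value, k, K):
    """Length-K vector that is zero everywhere except position k."""
    vec = [0] * K
    vec[k] = value
    return vec


def inner(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


weights = [2, 1, 3]      # w_1, w_2, w_3 (illustrative values)
readings = [10, 12, 11]  # x_m^1, x_m^2, x_m^3 for one object m
K = len(weights)

padded = [(pad(weights[k], k, K), pad(readings[k], k, K)) for k in range(K)]
per_user = [inner(p, q) for p, q in padded]

print(per_user)                                   # [20, 12, 33]
print(sum(per_user) == inner(weights, readings))  # True
```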
We define IAWAP as consisting of the four algorithms below.
  • $(ptp_i, k_i) \leftarrow \mathsf{Initialize}^*(1^\lambda)$: run by SP. The algorithm initializes the secret key $k_i := \mathsf{PRF.KGen}(1^\lambda)$ and initializes the parameters as $pc_i \leftarrow \mathsf{COM.Setup}(1^\lambda)$ and $pzk_i \leftarrow \mathsf{NIZK.Setup}(1^\lambda)$. Then, SP publishes the system parameter $ptp_i = (pc_i, pzk_i, x_w, x_m^*, K)$ and keeps $k_i$ private.
  • $(r_i)_{i \in [K]} \leftarrow \mathsf{SecretGen}^*(ptp_i, k_i, i, t)$: run by SP. The algorithm computes $r_i := \mathsf{PRF.Eval}(k_i, i \,\|\, t)$ for $i \in [K]$.
  • $Evi' \leftarrow \mathsf{EviGen}^*(ptp_i, (\mathbf{p}_i, \mathbf{q}_i, r_i)_{i \in [K]}, \mathbf{x}, \mathbf{y})$: run by SP. The algorithm computes the commitment for each user’s weight and sensing data $c_i := \mathsf{COM.Commit}(pc_i, \mathbf{p}_i, \mathbf{q}_i, r_i)$ and for the summed weighted aggregation result $\hat{c}^* := \mathsf{COM.Commit}(pc_i, \mathbf{x}, \mathbf{y}, \hat{r}^*)$, where $\hat{r}^* = \sum_{i \in [K]} r_i$. Then, SP generates the proof $\hat{\pi}^* \leftarrow \mathsf{NIZK.Prove}(pzk_i, sm, w)$ for $sm = (\hat{c}^*, x_m^*)$ and $w = (\mathbf{x}, \mathbf{y}, \hat{r}^*)$. In summary, SP shares the evidence $Evi' = ((c_i)_{i \in [K]}, \hat{c}^*, \hat{\pi}^*)$ with TA and users and sends $h = \mathsf{Hash}(Evi')$ to the public PBB.
  • $\{0, 1\} \leftarrow \mathsf{VerifyEvidence}^*(ptp_i, (\mathbf{p}_i, \mathbf{q}_i, r_i)_{i \in [K]}, Evi')$: run by TA and users. If all three subroutines included in the algorithm pass verification, it returns TRUE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyConsistency}^*(Evi', h)$: TA computes $\bar{h} = \mathsf{Hash}(Evi')$. If $\bar{h} = h$, then it outputs TRUE; otherwise, it outputs FALSE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyCommitment}^*(pc_i, \mathbf{p}_i, \mathbf{q}_i, r_i, c_i)$: users compute $\bar{c}_i := \mathsf{COM.Commit}(pc_i, \mathbf{p}_i, \mathbf{q}_i, r_i)$ from their private values $w_k$ and $x_m^k$. If $\bar{c}_i = c_i$, then it outputs TRUE; otherwise, it outputs FALSE.
    • $\{0, 1\} \leftarrow \mathsf{VerifyAgg}(pzk_i, \hat{\pi}^*, \hat{c}^*, (c_i)_{i \in [K]})$: TA computes $\bar{c}^*$ as the homomorphic combination of the commitments $(c_i)_{i \in [K]}$ and verifies the following conditions: $\bar{c}^* = \hat{c}^*$, $\hat{c}^*$ matches $\hat{\pi}^*$, and $\mathsf{NIZK.Verify}(sm, \hat{\pi}^*)$ returns true for $sm = (\hat{c}^*, x_m^*)$. If the conditions hold, TA outputs TRUE; otherwise, it outputs FALSE.
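The consistency check shared by both protocols (a digest of the evidence posted on the PBB, recomputed by TA) can be sketched as below. The hash function and the serialization are assumptions: this section does not say how the evidence is encoded, so the sketch uses SHA-256 over canonical JSON, and the evidence fields shown are placeholders.

```python
import hashlib
import json


def evidence_digest(evidence: dict) -> str:
    """Digest h = Hash(Evi) over a canonical encoding of the evidence."""
    # Sorted keys and fixed separators make equal evidence hash to equal digests.
    encoded = json.dumps(evidence, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def verify_consistency(evidence: dict, posted_digest: str) -> bool:
    """TA recomputes the digest and compares it with the one on the PBB."""
    return evidence_digest(evidence) == posted_digest


# Placeholder evidence: integers stand in for commitments and proofs.
evidence = {"c_star": 1234, "leaf_commitments": [11, 22, 33]}
posted = evidence_digest(evidence)  # what SP publishes on the PBB

tampered = {"c_star": 1234, "leaf_commitments": [11, 22, 34]}
print(verify_consistency(evidence, posted))  # True
print(verify_consistency(tampered, posted))  # False
```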

6. Security Analysis

In this section, we focus on TP-MCS’s advantages in terms of transparency and privacy by analyzing the security attributes implemented by the scheme and demonstrating its ability to effectively defend against the security threats described in Section 4.3. Specifically, the combination of transparency, which ensures the public verifiability of system operations, and privacy, which guarantees the confidentiality of user data, enables TP-MCS to comprehensively address various security challenges.
Semi-honest users faithfully follow the protocol but try to learn private information about other users. They may try to disguise themselves as honest users to obtain sensitive information, so we propose Theorem 1: Privacy. Next, we will proceed to prove this theorem.
Theorem 1
(Privacy). If the commitment scheme COM is additively homomorphic and hiding, and the NIZK is zero-knowledge, so that no sensitive information is leaked during verification, then TP-MCS guarantees the privacy of the system.
Proof. 
Although semi-honest clients do not actively break the protocol, their behavior may still threaten the system’s privacy. We consider the following three cases:
  • Case 1. Semi-honest user $j$ tries to disguise itself as an honest user and commits $s_k$ to generate false evidence. Due to the hiding property of the commitment, even if the semi-honest user obtains the sum of all the values in the Merkle commitment tree according to the protocol, it is still unable to obtain the information of any specific node. This is because guessing which iteration these values belong to is challenging, and correctly guessing the value of a random seed $r_k$ or $r_i$ is nearly impossible.
  • Case 2. Semi-honest user $j$ tries to masquerade as an honest user and uses $w_k$ or $x_m^k$ to participate in the truth update in order to generate false evidence. Even if the semi-honest user computes the homomorphic aggregation of $x_m^*$ and all commitments based on $w_k$ or $x_m^k$ and thereby obtains the aggregated commitment $\hat{c}^*$, the hiding property is one-way, so the user still cannot derive the original inputs from $\{c_i\}_{i \in [K] \setminus \{j\}}$.
  • Case 3. Assume that $f$ semi-honest users collude with the SP in an attempt to infer the sensitive information of the remaining $K - f$ honest users. Based on the additive homomorphism of commitments and the reversibility of homomorphic operations, SP can remove the commitments of the $f$ semi-honest users (whose private data have already been exposed) from the root commitment. This allows the SP to obtain the combined commitment of the $K - f \geq 2$ honest clients, i.e., the homomorphic combination of $\{c_i\}_{i \in [K-f]}$. Since our assumptions satisfy the security bound $K - f \geq 2$, the data of a single honest user are masked by the randomness of the other honest users, so their private information cannot be isolated. Conversely, if $K - f = 1$, the SP has direct access to the commitment of the only honest user and knows the sum of the blinding factors, which lets it infer the original sensitive information. Therefore, the protocol’s security requires at least two honest users together with the hiding property of the joint commitment, which is sufficient to defend against a collusion attack by semi-honest users and the SP.
Therefore, our solution, TP-MCS, ensures the privacy of the system. □
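The homomorphic removal step in Case 3 can be made concrete with a toy example. The sketch below assumes a Pedersen commitment in a small prime-order subgroup (insecure parameters chosen only so the numbers are easy to follow) and made-up openings; `strip_known` is a hypothetical helper. It shows what remains of the root commitment after the colluders' commitments are divided out.

```python
# Toy Pedersen parameters, for illustration only (not secure):
# P = 2Q + 1 is a safe prime; G and H generate the subgroup of order Q.
P, Q, G, H = 2039, 1019, 4, 9


def commit(value, rand):
    """Pedersen commitment G^value * H^rand mod P."""
    return (pow(G, value % Q, P) * pow(H, rand % Q, P)) % P


def combine(commitments):
    """Homomorphic combination (product) of a sequence of commitments."""
    result = 1
    for c in commitments:
        result = (result * c) % P
    return result


def strip_known(root, known_openings):
    """Divide the commitments of colluding users out of the root commitment."""
    for value, rand in known_openings:
        root = (root * pow(commit(value, rand), -1, P)) % P  # modular inverse
    return root


openings = [(3, 11), (7, 25), (2, 40)]  # (s_k, r_k) of three users, made up
root = combine(commit(v, r) for v, r in openings)

# One colluder (f = 1): the remainder is a joint commitment to 7 + 2.
joint = strip_known(root, openings[:1])
print(joint == commit(7 + 2, 25 + 40))

# Two colluders (f = 2): the remainder is exactly the last user's commitment.
alone = strip_known(root, openings[:2])
print(alone == commit(2, 40))
```

The three-argument `pow` with exponent `-1` computes a modular inverse and requires Python 3.8 or later.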
In the security threat model, both malicious servers and malicious users have the potential to deviate from the protocol. Specifically, a malicious server might tamper with input data or manipulate the algorithm execution process, generating incorrect aggregated computation results. On the other hand, malicious users could submit abnormal or falsified data to corrupt the aggregation results, undermining the system’s integrity and reliability. The transparency attribute of our proposed solution, TP-MCS, can effectively defend against these threats. Consequently, we have proposed Theorem 2: Transparency. Next, we will proceed to prove this theorem.
Theorem 2
(Transparency). If the commitment scheme COM is additively homomorphic and satisfies the binding property, and NIZK is simulation-extractable, which can verify the integrity of data within a polynomial time T , then TP-MCS guarantees the transparency of the system.
Proof. 
Both malicious servers and malicious clients may deviate from the established protocols in an attempt to generate erroneous aggregation results, and their malicious behavior may manifest itself in the following three aspects:
  • Case 1. If the malicious server tampers with the $s_v$ submitted by the honest client $v$, modifying it to $\tilde{s}_v$, the aggregation result becomes $\widetilde{Sum}_{s_k} = \sum_{k \in [K] \setminus \{v\}} s_k + \tilde{s}_v$. In one scenario, to hide the tampering, the malicious server submits the correct leaf node commitments $\{c_k\}_{k \in [K]}$ to the bulletin board so as to pass the client’s check $\bar{c}_v = \mathsf{COM.Commit}(pc_r, s_v, r_v) = c_v$. In this case, TA computes $\tilde{c}^* = \mathsf{COM.Commit}(pc_r, \widetilde{Sum}_{s_k}, r^*)$ and rejects the aggregated result with probability $1 - negl(\kappa)$. In the other scenario, to evade the TA’s validation, the malicious server submits the commitments $\{c_k\}_{k \in [K] \setminus \{v\}}$ and $\tilde{c}_v$ to the bulletin board, which the TA’s computation does not filter out. However, since the bulletin board is tamper-proof and the commitments are binding, the probability that $\tilde{c}_v$ collides with the client’s real commitment $c_v = \mathsf{COM.Commit}(pc_r, s_v, r_v)$ is negligible, so an honest client detects the tampering.
  • Case 2. If a malicious server tampers with the $w_u$ or $x_m^u$ submitted by an honest client $u$, modifying it to $w_u'$ or $x_m^{u\prime}$, the aggregation result becomes $\tilde{x}_m^* = \sum_{i \in [K] \setminus \{u\}} w_i \cdot x_m^i + w_u' \cdot x_m^{u\prime}$. In one scenario, the server submits the correct leaf node commitments $\{c_i\}_{i \in [K]}$ to the bulletin board to hide the tampering from the client’s inspection. In this case, however, the TA computes $\tilde{\hat{c}}^* = \mathsf{COM.Commit}(pc_i, \tilde{x}_m^*, \hat{r}^*)$ and rejects the result with probability $1 - negl(\kappa)$. In the other scenario, the server submits the commitments $\{c_i\}_{i \in [K] \setminus \{u\}}$ and $c_u'$ to the bulletin board and thus evades the TA’s computation. But since an honest user can generate the correct commitment $\tilde{c}_u = \mathsf{COM.Commit}(pc_i, (w_u, x_m^u), r_u)$ from their own private data, and the commitments are binding, the probability that $c_u'$ and $\tilde{c}_u$ collide is negligible.
  • Case 3. If a malicious client submits anomalous data, i.e., $s_v$ such that $s_v > \delta_{dist}$, or anomalous $w_u$ and $x_m^u$, the zero-knowledge range proof and the zero-knowledge inner-product argument that the server generates from these data fail to verify. The TA identifies such anomalous behavior by detecting these verification failures when it checks the evidence uploaded to the bulletin board by the server.
In summary, the TP-MCS is fully transparent because the TA can verify the correctness of each step throughout the process of truth discovery. □

7. Experimentation and Performance Evaluation

To evaluate our proposal, we developed a prototype system on a laptop with an AMD Ryzen 9 5900HX processor with Radeon Graphics (3.30 GHz), running Ubuntu 22.04. We implemented TP-MCS based on the ZKRP library [35], which uses Pedersen commitment with the secp256k1 elliptic curve to construct the commitment scheme. As a baseline, we used CRH [18], a solution that does not consider any security factors, to evaluate the accuracy and convergence of our scheme. We also conducted a performance assessment to demonstrate the feasibility of the scheme.

7.1. Accuracy

Similar to previous research methods [9,11], we employed the root mean squared error (RMSE) to quantify the difference between the estimated ground truths and the actual ground truths. We set the number of objects as 16, the number of iterations as 20, and varied the number of users across $2^2, 2^3, \ldots, 2^{10}$. We used a random initialization method to generate sensing data for each object, where the range of values for each data point is controlled to be between the minimum and maximum values, and the spacing is limited to five units. Subsequently, we initialized the actual ground truth by calculating the arithmetic mean of these randomly generated data. In Figure 3, we present the measurement results, showing that TP-MCS and CRH have nearly the same estimation accuracy.
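For reference, the RMSE metric used in this comparison can be computed as in the short sketch below. The function name and the sample values are illustrative and not taken from the experiments.

```python
import math


def rmse(estimated, actual):
    """Root mean squared error between estimated and actual ground truths."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values for three objects.
print(rmse([10.1, 19.8, 30.3], [10.0, 20.0, 30.0]))
```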
Similarly, we set the number of objects as 16 and the number of iterations as 20, and we used the sensing data generation scheme described above. As shown in Figure 4, to evaluate the system’s robustness, we systematically investigated the change in RMSE when the percentage of malicious users was incremented from 10% to 50% at user sizes of 250, 500, and 1000, respectively. The experimental design assumed that malicious users generated data beyond the constraint range, and all unvalidated user data were excluded from the final aggregation process. The experimental results show that in the scenario of 500 clients, the RMSE difference between different malicious user ratios was only about 0.1. When the user scale was expanded to 1000, the RMSE value was stable below 0.1 and fluctuated within 0.02, which indicates that our scheme is robust against anomalous data injected by malicious users.

7.2. Convergence

We further analyzed the convergence of the proposed scheme. We defined $x_m^{*t} - x_m^{*t-1}$ to evaluate the difference between the estimated truths in consecutive iterations. The initial value $x_m^{*t=0}$ was chosen at random. As shown in Figure 5, TP-MCS exhibits similar convergence capabilities to CRH, rapidly converging within a few iterations.
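A convergence check of this kind can be sketched as follows. The choice of the Euclidean norm over all objects and the tolerance value are assumptions made for illustration; the function names are hypothetical.

```python
import math


def truth_change(previous, current):
    """Euclidean distance between the truth estimates of consecutive iterations."""
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(previous, current)))


def iterations_to_converge(history, tol=1e-3):
    """First iteration t whose change from t - 1 falls below tol, else None."""
    for t in range(1, len(history)):
        if truth_change(history[t - 1], history[t]) < tol:
            return t
    return None


# history[t] holds the estimated truths of all objects after iteration t.
history = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]
print(truth_change(history[0], history[1]))  # 5.0
print(iterations_to_converge(history))       # 2
```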

7.3. Performance Evaluation of TP-MCS

Communication overhead. To optimize performance, we focus on two main aspects. First, we use the Merkle tree to reduce the overhead of the range proof. For $K$ users, SP generates commitments for $2K - 1$ nodes and constructs range proofs based on the leaf and root nodes. When TA verifies the range proof of a node in the tree, SP only needs to send the commitments included in the range proof, which means verifying $\log_2 K$ commitments along the adjacent path and the corresponding $\log_2 K + 1$ additional child node commitments, totaling $2\log_2 K + 1$ commitments, without traversing the entire Merkle tree, which significantly improves efficiency. Second, users are often constrained by device resources. Therefore, we discuss blockchain-based protocol extensions to reduce the communication overhead on the user side. We observe that only VerifyCommitment, VerifyCommitment*, and VerifyInclusionProof need to be run by users. All other verifications can be executed by TA, which reduces the users’ cost. TA executes VerifyConsistency, VerifyConsistency*, VerifySum, VerifyRangeProof, and VerifyAgg on a blockchain smart contract platform to verify the commitments and proofs generated in our protocol. If these verifications pass, the blockchain stores the relevant messages. Then, each user can directly read the stored commitments and proofs and use their private data to compute $\bar{c}_k$ and $\bar{c}_i$ to evaluate the consistency of the commitments without incurring consensus overhead.
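The node and path counts quoted above are easy to tabulate. The sketch below assumes a complete binary tree whose number of leaves $K$ is a power of two; the function names are hypothetical.

```python
def tree_nodes(K: int) -> int:
    """Total number of committed nodes in a complete binary tree with K leaves."""
    return 2 * K - 1


def path_commitments(K: int) -> int:
    """Commitments sent to check one leaf: 2 * log2(K) + 1."""
    if K < 1 or (K & (K - 1)) != 0:
        raise ValueError("K must be a power of two")
    depth = K.bit_length() - 1  # log2(K) for a power of two
    return 2 * depth + 1


for K in (4, 64, 1024):
    print(K, tree_nodes(K), path_commitments(K))
# 4 7 5
# 64 127 13
# 1024 2047 21
```

For $K = 2^{10}$ users, the verifier therefore handles 21 commitments for one path instead of all 2047 nodes.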
We show in Table 2 the performance cost of running the TP-MCS scheme with smart contracts. We use the message size to represent the bandwidth cost in the network. We denote the sizes of commitments and proofs in MTRP as $M_{c_r}$ and $M_{\pi_r}$, and those in IAWAP as $M_{c_i}$ and $M_{\pi_i}$. $M_t$ represents the message stored in the blockchain, waiting for read requests. We find in Table 2 that unloading messages from the blockchain not only reduces communication overhead between each entity, but also allows for the public verification of the authenticity and integrity of the information submitted by SP, due to the immutable nature of the commitments and proofs stored on the blockchain, thereby enhancing the system’s transparency and credibility.
Computation overhead. Unlike existing research that primarily focuses on privacy protection mechanisms, the scheme proposed in this study innovatively introduces the MTRP protocol based on ZKRP and the Merkle tree and the IAWAP protocol based on inner-product arguments. This scheme ensures privacy and, for the first time, achieves the function of transparent and public verification, effectively addressing the shortcomings in the verifiability of existing solutions. Given this scheme’s innovative breakthrough in functionality, it is imperative to evaluate its performance systematically. To this end, this study designs a set of comparative experiments aimed at deeply exploring the impact of changes in user scale (from $2^1$ to $2^{10}$) and object scale (from $2^1$ to $2^9$) on system performance, providing important empirical evidence for its application deployment in real-world scenarios.
In the first set of experiments, as shown in Figure 6 and Figure 7, we examine the effect of varying the number of users on the time overhead for a fixed number of objects. Specifically, Figure 6a,b show the performance when the number of objects is fixed at $2^8$ and the number of users varies from $2^1$ to $2^{10}$, and Figure 7a,b show the corresponding results when the number of objects is fixed at $2^9$.
We observe that in the MTRP protocol, the verification time overhead is low and stable. This is due to the Merkle tree data structure, which allows the TA to complete the integrity verification of all the data by verifying only the root hash. The TA thus avoids the tedious process of verifying all the participants’ data one by one and achieves constant verification time complexity. In addition, the overhead of generating proofs for the MTRP and IAWAP protocols and the overhead of TA verification proofs in the IAWAP protocol grow linearly with the number of users, which is controllable and acceptable in practice. Also, in both protocols, users only need to validate their own data, which keeps their computational cost low.
In the second set of experiments, as shown in Figure 8 and Figure 9, we examine the effect of varying the number of objects on the time overhead for a fixed number of users. Specifically, Figure 8a,b show the performance when the number of users is fixed at $2^9$ and the number of objects varies from $2^1$ to $2^9$, and Figure 9a,b show the corresponding results when the number of users is fixed at $2^{10}$.
In scenarios with a fixed number of users, the proof generation time of the MTRP protocol exhibits notable stability, with its time cost being entirely independent of the number of objects, consistently maintaining a constant level. In contrast, the proof generation time of the IAWAP protocol and the TA verification time remain linearly positively correlated with changes in the number of objects. Meanwhile, the verification time and the MTRP protocol’s user verification time in the IAWAP protocol consistently tend toward zero.
Next, we fix the number of users at $2^{10}$ and the number of objects at 16 to study the variation in execution time under different numbers of cores. As shown in Figure 10, we find that TP-MCS can effectively scale with more CPU cores. For example, when the number of cores $num_{core} = 16$, the runtime reduces by 66.38 s compared to $num_{core} = 2$.

7.4. Comparison to Existing Schemes

We compare our scheme with other schemes, including security properties and performance overheads.
Comparison of properties. Table 3 compares our scheme with other schemes regarding security attributes, including privacy, transparency, verifiability, and efficiency. The baseline scheme [18] can only identify truthful values in crowd-sensing systems but fails to meet other security attributes, such as user privacy. The works [11,13,19] use Paillier encryption to ensure user privacy, though this approach incurs high overhead. Among these, the work [13] introduced an identity verification mechanism capable of effectively identifying and authenticating malicious users, thereby reliably excluding illegal users from the system. Sun et al. [20] optimized traditional differential privacy schemes by allowing users to customize their privacy budgets, significantly enhancing the flexibility and fairness of the approach. In addition, the work [21] utilizes additive secret sharing (ASS) to protect data privacy, which is superior in terms of computational efficiency and communication overhead compared to the traditional Shamir secret sharing scheme. Moreover, the work [12] combines blockchain, Paillier encryption, and zero-knowledge proof technology to obtain the privacy and verifiability of submitted data, but this scheme is inefficient and non-transparent. In contrast, our scheme combines commitment schemes and zero-knowledge proofs, ensuring operational transparency and verifiability while protecting user privacy. We further construct a Merkle commitment tree to improve protocol efficiency, significantly reducing the protocol’s computational overhead.
Comparison of overhead. We assume that there exists a server provider SP, $K$ clients, and $M$ objects, and that the size of the task is $a$, the size of $x_m^*$ is $d$, and the size of $s_k$ is $n$. From Equations (6)–(8), we derive the size of $Sum_{s_k}$ to be $Kn$, and the size of $Sum_{w_k}$ to be $\log K$. We ignore the overhead of TA for this analysis since we do not include the entity in our comparison scheme. TA has performed only one simple verification computation, which has a negligible impact on the asymptotic complexity of the overall scheme. We report our scheme’s computational overhead and communication overhead versus the other schemes on both the server and client sides in Table 4.
Server computation overhead. (1) SP generates for each client an inclusion proof $\pi_k'$ that includes all the commitments on the path from the client’s own leaf node to the root node. Thus, the total computational complexity of generating inclusion proofs for $K$ clients is $O(K(\log K + \log n))$. (2) SP generates a range proof on $s_k$ for each client. The Bulletproof protocol we employ compresses the proof size recursively from linear to logarithmic. Specifically, each step of recursive compression splits a vector of length $n$ into two vectors of length $n/2$, gradually reducing the computational size. As a result, the complexity of the range proofs for $K$ clients is $O(K \log n)$. (3) SP computes the sums $Sum_{s_k}$ and $Sum_{w_k}$ with a computational complexity of $O(K)$. (4) SP generates the inner-product argument and performs truth discovery. The vectors $\mathbf{x}$ and $\mathbf{y}$ have dimension $K$, so the computational complexity of performing the inner-product argument protocol is $O(\log K)$ and that of computing the inner product is $O(K)$, giving $O(K + \log K)$. In summary, the total computational complexity on the server side is $O(K(\log K + \log n))$. Compared to other schemes, the work [18] does not introduce any cryptography and has minimal computational overhead. In the previous works [11,13,19], SP has to decrypt the Paillier-encrypted results for $K$ clients. The work [11] uses dual servers, and separate decryption is required between the servers. Assuming that the size of the noise in the scheme of Sun et al. [20] is $Mn$, the communication overhead of the server increases by $O(KMn)$ compared to the scheme without privacy protection [18].
In the work [21], the computational overhead of the server mainly comes from the random number generation and addition operations in the secret sharing phase and the addition operations in the secret reconstruction phase. Specifically, generating $K$ shared values requires $K - 1$ random number generations and one addition operation, while reconstructing the secret requires $K - 1$ addition operations. Therefore, the computational overhead is $O(K)$. Moreover, the work [12] requires not only verifying the zero-knowledge proof but also decrypting Paillier encryption, and thus has the highest computational complexity.
Server communication overhead. (1) SP sends the task, $Sum_{s_k}$, $Sum_{w_k}$, and $x_m^*$ to $K$ clients, and we assume that $V = a + d + Kn + \log K$, so the communication complexity is $O(V)$. (2) From Table 2, we assume that $M_s = (K + 1)(M_{c_r} + M_{\pi_r} + M_{c_i} + M_{\pi_i}) + K(2\log_2 K + 1)M_{c_r}$, so the communication complexity is $O(M_s)$. Thus, the communication overhead of the server is $O(V + M_s)$. Assuming that the size of the transmitted Paillier homomorphic encrypted data is $M_e$, in general, $M_e \gg M_s$. Thus, the communication complexity of the server is lower than in the works [11,12,13,19], but higher than in the schemes [18,20,21], because the server has to send commitments and proofs.
Client computation overhead. (1) The client computes its own $s_k$ with a complexity of $O(M)$. (2) The client computes the commitment for verification with a complexity of $O(\log n)$. (3) The client verifies the inclusion proof with a complexity of $O(\log K)$. Therefore, the total computational complexity for the client is $O(M + \log n + \log K)$. The client in scheme [18] only needs to compute the value of $s_k$, and the client of [20] only needs to add noise to the local data, which makes its computational complexity comparable to that of [18]. Thus, they have the lowest computational complexity of all the schemes. At the same time, the client in [11,12,13,19] requires additional Paillier encryption operations. The client in [21] needs to perform element-by-element randomized partitioning and shared computation on data $s_k$ of size $n$. Therefore, the TP-MCS client is less efficient than most schemes in terms of computational complexity, but in exchange it allows the client to perform query validation.
Client communication overhead. (1) The client sends $s_k$ to SP with $O(n)$ communication complexity. (2) The client feeds back an error flag to the PBB when it verifies that the commitment is inconsistent or that the inclusion proof is incorrect, with a communication complexity of $O(M_u)$, where $M_u = (2\log_2 K + 1)M_{c_r} + M_{c_i} + 2M_t$. Therefore, the total communication complexity for the client is $O(n + M_u)$. Compared to other schemes, the client of TP-MCS has higher communication complexity, because the client needs to verify the inclusion proof and commitment consistency through Merkle tree path information to prevent tampering by malicious servers. Such a verification mechanism is not available in other schemes.

8. Discussion and Limitations

The TP-MCS system proposed in this paper realizes the public verification of the execution process of the MCS truth discovery algorithm in privacy-preserving scenarios. It effectively filters out malicious clients, thus enhancing the trust of TA in the MCS system. However, the system still has some limitations. First, the use of zero-knowledge proof techniques introduces additional computational overhead. For example, the computational complexity of the IAWAP protocol is positively correlated with the number of users and objects, which can be a performance bottleneck in large-scale data deployment scenarios. In future work, distributed computing schemes such as federated learning can reduce the computational overhead and improve the system’s scalability by training data locally on the user side and aggregating them securely on the server side [36]. Second, the transparent design of the system, especially the validation part, relies on the active participation of users. However, some users may not fully utilize this feature, resulting in a limited practical effect of transparency. To improve user participation, reasonable incentives [37], such as financial rewards or reputation points, can be designed in the future to encourage users to participate actively in the verification process.
In summary, although the TP-MCS system has made significant progress in privacy protection and transparency, it needs to be further optimized regarding computational efficiency and user participation to better adapt to large-scale practical application scenarios.

9. Conclusions

In this paper, we designed a transparent and privacy-preserving truth discovery system. The scheme combines commitment schemes and zero-knowledge proofs to guarantee data privacy while allowing the verification of the correctness of data aggregation results. We also designed Merkle commitment trees to further reduce computational overhead. Meanwhile, we performed a security analysis of the scheme and discussed an optimization based on a blockchain extension. In addition, we conducted comprehensive experiments and compared our scheme with other state-of-the-art schemes, demonstrating that our scheme achieves accuracy comparable to the CRH baseline at a practical computational cost while additionally providing transparency and public verifiability.

Author Contributions

Conceptualization, R.J.; Methodology, R.J., J.M., Z.Y. and M.Z.; Validation, J.M.; Formal analysis, R.J.; Writing—original draft, R.J. and Z.Y.; Writing—review and editing, R.J., J.M. and M.Z.; Supervision, J.M. and M.Z.; Project administration, M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Liu, H.; Zhou, Y.; Fang, B.; Sun, Y.; Hu, N.; Tian, Z. PHCG: PLC Honeypoint Communication Generator for Industrial IoT. IEEE Trans. Mob. Comput. 2025, 24, 198–209.
  2. Baier, P.; Dürr, F.; Rothermel, K. Efficient distribution of sensing queries in public sensing systems. In Proceedings of the 2013 IEEE 10th International Conference on Mobile Ad-Hoc and Sensor Systems, Hangzhou, China, 14–16 October 2013; pp. 272–280.
  3. Zhang, C.; Zhu, L.; Xu, C.; Lu, R. PPDP: An efficient and privacy-preserving disease prediction scheme in cloud-based e-Healthcare system. Future Gener. Comput. Syst. 2018, 79, 16–25.
  4. Mukherjee, S.; Weikum, G.; Danescu-Niculescu-Mizil, C. People on drugs: Credibility of user statements in health communities. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 65–74.
  5. Cheng, Y.; Li, X.; Li, Z.; Jiang, S.; Li, Y.; Jia, J.; Jiang, X. AirCloud: A cloud-based air-quality monitoring system for everyone. In Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems, Memphis, TN, USA, 3–6 November 2014; pp. 251–265.
  6. Guo, B.; Wang, Z.; Yu, Z.; Wang, Y.; Yen, N.Y.; Huang, R.; Zhou, X. Mobile crowd sensing and computing: The review of an emerging human-powered sensing paradigm. ACM Comput. Surv. (CSUR) 2015, 48, 1–31.
  7. Liu, J.; Shen, H.; Narman, H.S.; Chung, W.; Lin, Z. A survey of mobile crowdsensing techniques: A critical component for the internet of things. ACM Trans. Cyber-Phys. Syst. 2018, 2, 1–26.
  8. Singla, A.; Krause, A. Incentives for privacy tradeoff in community sensing. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Palm Springs, CA, USA, 7–9 November 2013; Volume 1, pp. 165–173.
  9. Miao, C.; Jiang, W.; Su, L.; Li, Y.; Guo, S.; Qin, Z.; Xiao, H.; Gao, J.; Ren, K. Cloud-enabled privacy-preserving truth discovery in crowd sensing systems. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, Seoul, Republic of Korea, 1–4 November 2015; pp. 183–196.
  10. Xu, G.; Li, H.; Liu, S.; Wen, M.; Lu, R. Efficient and privacy-preserving truth discovery in mobile crowd sensing systems. IEEE Trans. Veh. Technol. 2019, 68, 3854–3865.
  11. Miao, C.; Su, L.; Jiang, W.; Li, Y.; Tian, M. A lightweight privacy-preserving truth discovery framework for mobile crowd sensing systems. In Proceedings of the IEEE INFOCOM 2017-IEEE Conference on Computer Communications, Atlanta, GA, USA, 1–4 May 2017; pp. 1–9.
  12. Duan, H.; Zheng, Y.; Du, Y.; Zhou, A.; Wang, C.; Au, M.H. Aggregating crowd wisdom via blockchain: A private, correct, and robust realization. In Proceedings of the 2019 IEEE International Conference on Pervasive Computing and Communications PerCom, Kyoto, Japan, 11–15 March 2019; pp. 1–10.
  13. Zhang, C.; Zhu, L.; Xu, C.; Liu, X.; Sharif, K. Reliable and privacy-preserving truth discovery for mobile crowdsensing systems. IEEE Trans. Dependable Secur. Comput. 2019, 18, 1245–1260.
  14. Li, M.; Weng, J.; Yang, A.; Lu, W.; Zhang, Y.; Hou, L.; Liu, J.N.; Xiang, Y.; Deng, R.H. CrowdBC: A blockchain-based decentralized framework for crowdsourcing. IEEE Trans. Parallel Distrib. Syst. 2018, 30, 1251–1266.
  15. Lu, Y.; Tang, Q.; Wang, G. Zebralancer: Private and anonymous crowdsourcing system atop open blockchain. In Proceedings of the 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), Vienna, Austria, 2–5 July 2018; pp. 853–865.
  16. Dong, X.L.; Naumann, F. Data fusion: Resolving data conflicts for integration. Proc. VLDB Endow. 2009, 2, 1654–1655.
  17. Bleiholder, J.; Naumann, F. Data fusion. ACM Comput. Surv. 2009, 41, 1–41.
  18. Li, Q.; Li, Y.; Gao, J.; Zhao, B.; Fan, W.; Han, J. Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation. In Proceedings of the 2014 ACM Sigmod International Conference on Management of Data, Snowbird, UT, USA, 22–27 June 2014; pp. 1187–1198.
  19. Gao, J.; Fu, S.; Luo, Y.; Xie, T. Location Privacy-Preserving Truth Discovery in Mobile Crowd Sensing. In Proceedings of the 2020 29th International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA, 3–6 August 2020; pp. 1–9.
  20. Sun, P.; Wang, Z.; Wu, L.; Feng, Y.; Pang, X.; Qi, H.; Wang, Z. Towards Personalized Privacy-Preserving Incentive for Truth Discovery in Mobile Crowdsensing Systems. IEEE Trans. Mob. Comput. 2022, 21, 352–365.
  21. Peng, T.; Zhong, W.; Wang, G.; Luo, E.; Yu, S.; Liu, Y.; Yang, Y.; Zhang, X. Privacy-Preserving Truth Discovery Based on Secure Multi-Party Computation in Vehicle-Based Mobile Crowdsensing. IEEE Trans. Intell. Transp. Syst. 2024, 25, 7767–7779.
  22. Li, Q.; Li, Y.; Gao, J.; Su, L.; Zhao, B.; Demirbas, M.; Fan, W.; Han, J. A confidence-aware approach for truth discovery on long-tail data. Proc. VLDB Endow. 2014, 8, 425–436.
  23. Meng, C.; Jiang, W.; Li, Y.; Gao, J.; Su, L.; Ding, H.; Cheng, Y. Truth discovery on crowd sensing of correlated entities. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, Seoul, Republic of Korea, 1–4 November 2015; pp. 169–182.
  24. Pedersen, T.P. Non-interactive and information-theoretic secure verifiable secret sharing. In Proceedings of the Annual International Cryptology Conference, Santa Barbara, CA, USA, 11–15 August 1991; pp. 129–140.
  25. Blum, M.; Feldman, P.; Micali, S. Non-interactive zero-knowledge and its applications. In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali; Association for Computing Machinery: New York, NY, USA, 2019; pp. 329–349.
  26. Lehmann, A. Scrambledb: Oblivious (chameleon) pseudonymization-as-a-service. Proc. Priv. Enhancing Technol. 2019, 2019, 289–309.
  27. Camenisch, J.; Chaabouni, R.; Shelat, A. Efficient protocols for set membership and range proofs. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security, Melbourne, Australia, 7–11 December 2008; pp. 234–252.
  28. Bünz, B.; Bootle, J.; Boneh, D.; Poelstra, A.; Wuille, P.; Maxwell, G. Bulletproofs: Short proofs for confidential transactions and more. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 20–24 May 2018; pp. 315–334.
  29. Merkle, R.C. A digital signature based on a conventional encryption function. In Proceedings of the Conference on the Theory and Application of Cryptographic Techniques, Amsterdam, The Netherlands, 13–15 April 1987; pp. 369–378.
  30. Reijsbergen, D.; Yang, Z.; Maw, A.; Dinh, T.T.A.; Zhou, J. Transparent electricity pricing with privacy. In Proceedings of the Computer Security—ESORICS 2021: 26th European Symposium on Research in Computer Security, Darmstadt, Germany, 4–8 October 2021; Proceedings, Part II 26. Springer: Berlin/Heidelberg, Germany, 2021; pp. 439–460.
  31. Ni, J.; Lin, X.; Zhang, K.; Shen, X. Privacy-preserving real-time navigation system using vehicular crowdsourcing. In Proceedings of the 2016 IEEE 84th Vehicular Technology Conference (VTC-Fall), Montreal, QC, Canada, 18–21 September 2016; pp. 1–5.
  32. Ni, J.; Zhang, A.; Lin, X.; Shen, X.S. Security, privacy, and fairness in fog-based vehicular crowdsensing. IEEE Commun. Mag. 2017, 55, 146–152.
  33. Xue, K.; Hong, J.; Ma, Y.; Wei, D.S.; Hong, P.; Yu, N. Fog-aided verifiable privacy preserving access control for latency-sensitive data sharing in vehicular cloud computing. IEEE Netw. 2018, 32, 7–13.
  34. Mohassel, P.; Zhang, Y. Secureml: A system for scalable privacy-preserving machine learning. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 19–38.
  35. Morais, E.; Koens, T.; Van Wijk, C.; Koren, A. A survey on zero knowledge range proofs and applications. SN Appl. Sci. 2019, 1, 1–17.
  36. De, D. FedLens: Federated learning-based privacy-preserving mobile crowdsensing for virtual tourism. Innov. Syst. Softw. Eng. 2024, 20, 137–150.
  37. Wu, E.; Peng, Z. Research Progress on Incentive Mechanisms in Mobile Crowdsensing. IEEE Internet Things J. 2024, 11, 24621–24633.
Figure 1. Merkle commitment tree.
Figure 2. System model of TP-MCS.
Figure 3. Accuracy evaluation.
Figure 4. Accuracy evaluation with involvement of malicious users.
Figure 5. Convergence evaluation.
Figure 6. (a) Proof generation and verification time with varying numbers of users for MTRP. (b) Proof generation and verification time with varying numbers of users for IAWAP.
Figure 7. (a) Proof generation and verification time with varying numbers of users for MTRP. (b) Proof generation and verification time with varying numbers of users for IAWAP.
Figure 8. (a) Proof generation and verification time with varying numbers of users for MTRP. (b) Proof generation and verification time with varying numbers of users for IAWAP.
Figure 9. (a) Proof generation and verification time with varying numbers of users for MTRP. (b) Proof generation and verification time with varying numbers of users for IAWAP.
Figure 10. Impact of core count on performance.
Table 1. Summary of the notations used in this paper.

| Symbol | Value | Description |
|---|---|---|
| $K$ | $\mathbb{N}$ | number of users |
| $M$ | $\mathbb{N}$ | number of objects |
| $C$ | $\mathbb{N}$ | number of time periods per operational cycle |
| $T$ | 1 | trust authority |
| $Task$ | / | the sensing task |
| $x_m^k$ | $\{x_m^1, \ldots, x_m^K\}$ | the sensing data of the object $m$ for each user $k \in K$ |
| $x_m^*$ | $\mathbb{N}$ | the ground truth of each object $m \in M$ |
| $s_k$ | $\{s_1, \ldots, s_K\}$ | the $k$-th user's summed distance |
| $Sum_{s_k}$ | $\mathbb{N}$ | the summed distance of each user $k \in K$ |
| $w_k$ | $\{w_1, \ldots, w_K\}$ | the weight of each user $k \in K$ |
| $\delta$ | $\mathbb{N}$ | the threshold of sensing data |
| $\delta_{dist}$ | $\mathbb{N}$ | the threshold of $s_k$ |
| $\gamma_{dist}$ | 0 , $\delta_{dist}$ K 1 | the threshold of $Sum_{s_k}$ |
| $c_k$ | $C_c$ | the commitment of $x_m^k$ |
| $c^*$ | $C_c$ | the sum of $c_k$ |
| $r_k$ | $R_c$ | random secret of each user $k \in K$ |
| $\pi_k$ | / | zero-knowledge range proof of $x_m^k \in [0, \delta_{dist}]$ |
| $\pi_k$ | / | the inclusion proof of each user $k \in K$ |
| $\pi^*$ | / | zero-knowledge range proof of $Sum_{s_k} \in [\delta_{dist}, \gamma_{dist}]$ |
| $c_i$ | $C_c$ | commitment of the $i$-th user's $w_k$ and $x_m^k$ |
| $\hat{c}^*$ | $C_c$ | commitment of the sequence $w_k$ and $x_m^k$ for $i \in K$ |
| $r_i$ | $R_c$ | random secret of user $i \in K$ |
| $\hat{\pi}^*$ | / | inner-product argument of the $w_k$ and $x_m^k$ for $i \in K$ |
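The quantities $s_k$, $w_k$, and $x_m^*$ in Table 1 can be related by a short plaintext sketch of iterative truth discovery. The update rule used here (squared distance for $s_k$, logarithmic weights in the style of the conflict-resolution approach of Li et al. [18]) is an assumption made for illustration; the function name, the iteration count, the smoothing constant `eps`, and the sample readings are likewise hypothetical. No privacy protection is applied in this sketch.

```python
import math


def truth_discovery(data, iterations=10, eps=1e-9):
    """data[k][m] is user k's reading for object m (plaintext sketch)."""
    K, M = len(data), len(data[0])
    # Start from the unweighted mean of each object's readings.
    truths = [sum(data[k][m] for k in range(K)) / K for m in range(M)]
    weights = [1.0] * K
    for _ in range(iterations):
        # s_k: summed squared distance between user k's data and the truths.
        dists = [
            sum((data[k][m] - truths[m]) ** 2 for m in range(M))
            for k in range(K)
        ]
        total = sum(dists)
        # w_k: users that are closer to the current truths get larger weights.
        weights = [math.log((total + eps) / (d + eps)) for d in dists]
        weight_sum = sum(weights)
        if weight_sum == 0:
            break  # degenerate case (e.g., a single user): keep the truths
        # x_m^*: weighted average, i.e. an inner product of weights and data.
        truths = [
            sum(weights[k] * data[k][m] for k in range(K)) / weight_sum
            for m in range(M)
        ]
    return truths, weights


# Three consistent users and one outlier, two objects.
sample = [[10.0, 20.0], [10.2, 19.8], [9.9, 20.1], [30.0, 5.0]]
estimated, user_weights = truth_discovery(sample)
print(estimated, user_weights)
```

The final aggregation step is a weighted inner product of $w_k$ and $x_m^k$, which is the kind of relation an inner-product argument such as $\hat{\pi}^*$ is meant to attest to.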
Table 2. The communication overhead of TP-MCS in smart contracts.

| Entity | Computation: com.Commit | Computation: nizk.Prove | Computation: nizk.Verify | Bandwidth |
|---|---|---|---|---|
| SP | $3K - 1$ | $2K + 2$ | 0 | $(K + 1)(M_{cr} + M_{\pi r} + M_{ci} + M_{\pi i}) + K(2\log_2 K + 1)M_{cr}$ |
| User | 2 | 0 | 2 | $(2\log_2 K + 1)M_{cr} + M_{ci} + 2M_t$ |
| TA | 0 | 0 | 2 | $M_{cr} + M_{\pi r} + M_{ci} + M_{\pi i} + 2M_t$ |
| PBB | 0 | 0 | 0 | $2(K + 1)M_t$ |
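As a rough aid for reading the bandwidth column of Table 2, the expressions can be evaluated for concrete sizes. This is a sketch under stated assumptions: the function name and argument names are hypothetical, the message sizes passed in are placeholders rather than measured values, and the formulas are transcribed from the table as read here.

```python
import math


def bandwidth(K, m_cr, m_pr, m_ci, m_pi, m_t):
    """Evaluate the Table 2 bandwidth expressions for K users.

    m_cr, m_pr, m_ci, m_pi, m_t stand for the sizes written M_cr, M_pi_r,
    M_ci, M_pi_i and M_t in the table; their units are whatever the caller
    chooses (for example bytes).
    """
    path = 2 * math.log2(K) + 1  # the (2 log2 K + 1) factor in the table
    return {
        "SP": (K + 1) * (m_cr + m_pr + m_ci + m_pi) + K * path * m_cr,
        "User": path * m_cr + m_ci + 2 * m_t,
        "TA": m_cr + m_pr + m_ci + m_pi + 2 * m_t,
        "PBB": 2 * (K + 1) * m_t,
    }


# Placeholder sizes of one unit each, with K = 8 users.
costs = bandwidth(K=8, m_cr=1, m_pr=1, m_ci=1, m_pi=1, m_t=1)
print(costs)
```

With unit sizes, the per-user cost grows only with $\log_2 K$, while the SP cost carries a $K \log_2 K$ term.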
Table 3. Comparison of properties.
Scheme | Privacy | Transparency | Verifiability | Efficiency
Li et al. [18]
Miao et al. [11]
Zhang et al. [13]
Duan et al. [12]
Gao et al. [19]
Sun et al. [20]
Peng et al. [21]
TP-MCS
✓ indicates that the property is satisfied. ✗ indicates that the property is not satisfied.
Table 4. Performance comparison.

| Scheme | Server computation | Server communication | Client computation | Client communication |
|---|---|---|---|---|
| Li et al. [18] | $O(K)$ | $O(V)$ | $O(M)$ | $O(n)$ |
| Miao et al. [11] | $O((K + 1)n^2)$ | $O(V + M_e)$ | $O(M + n^2)$ | $O(n)$ |
| Zhang et al. [13] | $O(Kn^2)$ | $O(V + M_e)$ | $O(M + n^2)$ | $O(n)$ |
| Duan et al. [12] | $O(K(\log n + n^2))$ | $O(V + M_e + KM_{\pi r})$ | $O(M + n^2)$ | $O(n + M_{\pi r})$ |
| Gao et al. [19] | $O(Kn^2)$ | $O(V + M_e)$ | $O(M + n^2)$ | $O(n)$ |
| Sun et al. [20] | $O(K)$ | $O(V + KM_n)$ | $O(M)$ | $O(n + M_n)$ |
| Peng et al. [21] | $O(Kn)$ | $O(V + Kn)$ | $O(M + n)$ | $O((K - 1)n)$ |
| TP-MCS | $O(K(\log K + \log n))$ | $O(V + M_s)$ | $O(M + \log n + \log K)$ | $O(n + M_u)$ |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jia, R.; Ma, J.; You, Z.; Zhang, M. Transparent and Privacy-Preserving Mobile Crowd-Sensing System with Truth Discovery. Sensors 2025, 25, 2294. https://doi.org/10.3390/s25072294