**5. Experimental Results**

The objective of these experiments is to evaluate the proposed algorithm with respect to data utility, security and efficiency by comparing it with *Clutree* [16], an existing anonymization approach for hierarchical data that achieves *l*-diversity. The algorithms were implemented in Python and run on a computer with a four-core 3.4 GHz CPU and 8 GB RAM running Windows 7. We experimented on two synthetic datasets from the authors of [16]. Both were modeled on real information about graduates of Sabanci University in Turkey. The synthetic dataset *A* has two levels (*h* = 2), in the order (*major program*, *year of birth*) → *courses*, and contains 1000 students with nearly 20 courses per student. The synthetic dataset *B* has three levels (*h* = 3), in the order (*major program*, *year of birth*) → *courses* → *teachers*; it contains 1000 students, every student takes nearly 20 courses, and every course has one or two teachers.

## *5.1. Evaluation Metrics*

We evaluate the data utility, security and efficiency of our method using the *LM* cost [16,28], the dissimilarity degree of the equivalence class [22] and the execution time, respectively.

For a hierarchical data record *T*, the cost of *T* is computed as follows:

$$\mathrm{cost}(T) = \sum_{\upsilon \in \Omega} \sum_{q \in \upsilon_{QI}} LM'(q) + \sum_{\omega \in \Psi} |\omega_{QI}| \tag{5}$$

where Ω and Ψ are the sets of vertices that are not suppressed and suppressed, respectively, |*ω<sub>QI</sub>*| is the number of QI attributes in *ω*, and *LM*′(*q*) = (|*u<sub>q</sub>*| − 1)/(|*u*| − 1) is the information loss of generalizing *q* to *u<sub>q</sub>*. The larger the information loss, the lower the utility. The *LM* cost is an important index for evaluating the utility of an anonymization method.
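As an illustration of Equation (5), the following is a minimal sketch and not the implementation used in the experiments. It assumes the common reading of the *LM* measure, in which |*u<sub>q</sub>*| is the number of leaf values covered by the generalized value and |*u*| is the number of leaf values in the attribute's domain; the text does not spell this out. The names `Vertex`, `lm_prime` and `record_cost` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Vertex:
    # One (covered_leaves, domain_leaves) pair per QI attribute.
    # Assumption: covered_leaves plays the role of |u_q|, domain_leaves of |u|.
    qi: List[Tuple[int, int]] = field(default_factory=list)
    suppressed: bool = False


def lm_prime(covered: int, domain: int) -> float:
    """LM'(q) = (|u_q| - 1) / (|u| - 1); 0 for a single-valued domain."""
    if domain <= 1:
        return 0.0
    return (covered - 1) / (domain - 1)


def record_cost(vertices: List[Vertex]) -> float:
    """Cost of one hierarchical record T as in Equation (5)."""
    total = 0.0
    for v in vertices:
        if v.suppressed:
            # A suppressed vertex contributes its number of QI attributes.
            total += len(v.qi)
        else:
            # A retained vertex contributes the LM' loss of each QI value.
            total += sum(lm_prime(c, d) for c, d in v.qi)
    return total


record = [
    Vertex(qi=[(1, 10), (4, 10)]),          # not generalized, partly generalized
    Vertex(qi=[(2, 5)], suppressed=True),   # suppressed vertex with one QI attribute
    Vertex(qi=[(10, 10)]),                  # generalized to the hierarchy root
]
cost = record_cost(record)
```

In this toy record, an ungeneralized value costs 0, a value generalized to the root costs 1, and a suppressed vertex costs 1 per QI attribute, so suppression is charged the same as full generalization of every attribute in that vertex.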

The equivalence class dissimilarity is proposed in [22] for relational data, and we extend it to hierarchical data. Let *Q* be an equivalence class and *C<sub>rep</sub>* its class representative. Let *v* be a vertex in *C<sub>rep</sub>*, *m* the number of sensitive values in *v*, and *z* the number of sensitivity levels. The dissimilarity degree of *v* is defined as:

$$DSimDegree(v) = \frac{\sum_{i=1}^{m-1} \sum_{j=i+1}^{m} m_{ij}}{\sum_{i=1}^{z-1} \sum_{j=i+1}^{z} z_{ij}} \tag{6}$$

where *m<sub>ij</sub>* is the distance between the sensitivity levels of the *i*th and *j*th sensitive values, and *z<sub>ij</sub>* is the distance between the *i*th and *j*th sensitivity levels. The dissimilarity degree of *Q* is

$$DSimDegree(Q) = \frac{\sum_{i=1}^{N} DSimDegree(v_i)}{N} \tag{7}$$

where *N* is the number of vertices of *C<sub>rep</sub>*. The larger *DSimDegree*(*Q*) is, the greater the difference between the sensitive values, the stronger the resistance to attacks and the higher the security.
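Equations (6) and (7) can be sketched as follows. This is a hedged illustration only: it assumes that sensitivity levels are encoded as the integers 1 to *z* and that the distance between two levels is the absolute difference of their indices, neither of which is fixed by the text. The function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def dsim_degree_vertex(value_levels: Sequence[int], z: int) -> float:
    """Equation (6) for one vertex v.

    value_levels holds the sensitivity level (1..z) of each of the m
    sensitive values in v. Assumption: distance = |level_i - level_j|.
    """
    # Numerator: pairwise distances between the levels of the m values.
    numerator = sum(abs(a - b) for a, b in combinations(value_levels, 2))
    # Denominator: pairwise distances between all z sensitivity levels.
    denominator = sum(abs(a - b) for a, b in combinations(range(1, z + 1), 2))
    return numerator / denominator if denominator else 0.0


def dsim_degree_class(vertices_levels: List[Sequence[int]], z: int) -> float:
    """Equation (7): mean dissimilarity over the N vertices of C_rep."""
    if not vertices_levels:
        return 0.0
    degrees = [dsim_degree_vertex(levels, z) for levels in vertices_levels]
    return sum(degrees) / len(degrees)


# Two vertices of a class representative, with z = 3 sensitivity levels.
class_degree = dsim_degree_class([[1, 3], [1, 2, 3]], z=3)
```

With *z* = 3 the denominator is 1 + 2 + 1 = 4, so a vertex whose values sit at levels 1 and 3 scores 2/4 = 0.5, a vertex covering all three levels scores 1.0, and the class average is 0.75. A vertex whose sensitive values all share one level scores 0 under this distance.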
