*4.2. Metrics Abstract*

In the last section, we built a general model based on the ego central network for anomaly detection. Abstracting metrics from this model is equally important. However, because egos may behave differently across social communication methods, we need to abstract the data deeply and extract the metrics that are common to all egos. After investigating various social networks, we designed the following metrics for analysis. For the ego *i* in the social network, we divide its alters into two sets, *S<sup>i</sup><sub>in</sub>* and *S<sup>i</sup><sub>out</sub>*, representing the alter-in set and the alter-out set respectively. *k<sup>i</sup><sub>in</sub>* = |*S<sup>i</sup><sub>in</sub>*| and *k<sup>i</sup><sub>out</sub>* = |*S<sup>i</sup><sub>out</sub>*| are the in-degree and out-degree. They characterize the influence of the ego, that is, the role of the ego in the whole network. For example, an ego with a larger out-degree is more willing to maintain relationships, while one with a larger in-degree is more attractive to alters [47]; together they indicate the size of the ego's network. From the previous subsection, we can infer that *w<sup>t</sup><sub>i,j</sub>* is important for measuring the importance of alter *j* to ego *i*. Therefore,

$$\mathcal{W}\_{\text{i}} = \sum\_{j \in S\_{\text{out}}^{i}} \sum\_{t} w\_{i,j}^{t} \tag{1}$$

quantifies how much attention ego *i* pays to his/her community. For normal egos, the total weight should not fall in an anomalous range. Contact-in and contact-out help us judge whether the ego has a strong attraction and how important the alters are to the ego, respectively. We therefore use the balance of attraction *τ<sub>i</sub>*:

$$
\tau\_i = \frac{k\_{in}^i}{k\_{out}^i} \tag{2}
$$

to measure the relationship between alters and egos. The closer the ratio is to 1, the more balanced the ego is and the more stable the network structure is. Since the ratio divides the in-degree by the out-degree, a value close to 0 means that the ego attracts few alters relative to the number it contacts, while a value much larger than 1 means that the ego's attraction far exceeds its own outgoing contact. Both of these cases indicate an anomaly. From the previous section we know that the direction of contact is of great research value. Bidirectional alters are more likely to show intimacy than unidirectional alters, and if almost all of an ego's alters are unidirectional, then the ego is very likely to be an anomalous user. We introduce the relationship balance *δ<sub>i</sub>*:

$$\delta\_{i} = \frac{\left| S\_{in}^{i} \cap S\_{out}^{i} \right|}{\left| S\_{in}^{i} \cup S\_{out}^{i} \right|} \tag{3}$$

to measure how abnormal the ego's relationships are. *δ<sub>i</sub>* = 1 means that all of the ego's alters are bidirectional, while *δ<sub>i</sub>* = 0 means that all of them are unidirectional. The proportion of bidirectional to unidirectional alters in a normal user's network should stay within a normal range. In other words, values that are too large or too small are both anomalous.
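As a minimal sketch, the degree, total weight, balance of attraction and relationship balance defined above could be computed from a flat list of contact records as follows. The record layout `(source, target, weight)`, with one record per contact at a given time, the function name `ego_metrics`, and the choice to return `None` when a ratio is undefined are assumptions made for illustration; they are not prescribed by the model.

```python
def ego_metrics(contacts, ego):
    """Compute k_in, k_out, W, tau and delta for one ego.

    contacts: iterable of (source, target, weight) tuples (assumed layout),
              one tuple per contact event, so repeated pairs sum over time t.
    """
    s_in, s_out = set(), set()
    total_weight = 0.0
    for src, dst, weight in contacts:
        if src == ego and dst != ego:
            s_out.add(dst)
            # Eq. (1): sum w_{i,j}^t over out-alters j and over time t.
            total_weight += weight
        elif dst == ego and src != ego:
            s_in.add(src)

    k_in, k_out = len(s_in), len(s_out)
    union = s_in | s_out
    # Eq. (2): undefined when the ego has no out-alters (assumed: None).
    tau = k_in / k_out if k_out else None
    # Eq. (3): share of alters that are bidirectional (assumed: None if no alters).
    delta = len(s_in & s_out) / len(union) if union else None
    return {"k_in": k_in, "k_out": k_out, "W": total_weight,
            "tau": tau, "delta": delta}


if __name__ == "__main__":
    records = [("a", "b", 1.0), ("a", "b", 2.0), ("a", "c", 1.0),
               ("b", "a", 1.0), ("d", "a", 1.0)]
    print(ego_metrics(records, "a"))
```

In this toy example ego `a` has out-alters `b` and `c` and in-alters `b` and `d`, so only `b` is bidirectional and the relationship balance is one third.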

**Figure 1.** Data structure for the social model based on the egocentral network. The square box represents the ego, and the color of the box indicates which group the ego belongs to. The color of a circle indicates whether the alters have something in common. The length of an arrow reflects the relationship between the ego and the alter. The color of an arrow represents the direction of contact. A dotted border indicates that the alter is a bidirectional alter.

The temporal features of egos, such as the interval and frequency of posting or calling, can characterize their behavior habits, patterns and properties. Therefore, we use the time sequence vector *T<sub>i</sub>*:

$$T\_i = \{t\_{k}^{i},\ k = 0, 1, \cdots, 2\} \tag{4}$$

to show the behavior of the ego *i*, where *t<sup>i</sup><sub>k</sub>* denotes the features of ego *i* at time *k*. Because a normal ego's energy is limited, its time sequence *T<sub>i</sub>* should be regular and have, or resemble, a hump shape (reflecting meals or breaks), and it should contain an inactive period (sleeping) during which there is no behavior.
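One possible way to build such a time sequence is to count an ego's events per time-of-day bin and then look for bins with no activity. The sketch below assumes that each feature is simply an event count, that one-hour bins over a day are used (`bins=24`), and that events arrive as `datetime` objects; the function names `time_sequence` and `inactive_bins` are hypothetical. None of these choices is fixed by the definition above.

```python
from datetime import datetime


def time_sequence(timestamps, bins=24):
    """Count events per time-of-day bin (assumed feature: event count)."""
    counts = [0] * bins
    for ts in timestamps:
        seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
        # Map the second of the day (0..86399) onto a bin index (0..bins-1).
        counts[seconds * bins // 86400] += 1
    return counts


def inactive_bins(sequence):
    """Return the indices of bins with no behavior at all."""
    return [k for k, count in enumerate(sequence) if count == 0]


if __name__ == "__main__":
    events = [datetime(2020, 1, 1, 8, 15), datetime(2020, 1, 2, 8, 45),
              datetime(2020, 1, 1, 13, 0), datetime(2020, 1, 1, 23, 59, 59)]
    seq = time_sequence(events)
    print(seq)
    print(inactive_bins(seq))
```

Under these assumptions, a sequence with no inactive bins at all would be one sign of the irregular behavior described above.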

In summary, we propose the following metrics:

M1. Ego network's in-degree *k<sup>i</sup><sub>in</sub>* and out-degree *k<sup>i</sup><sub>out</sub>*.

M2. Ego network's weight *W<sub>i</sub>*.

M3. Attractiveness Balance *τ<sub>i</sub>*.

M4. Relationship Balance *δ<sub>i</sub>*.

M5. Time sequence vector *T<sub>i</sub>*.

These features help us make a preliminary classification of egos from a group perspective, so that egos that differ from the others can be identified. To analyze and judge these suspicious egos in more depth, we also need sociological methods to help make decisions. We will introduce them in the next section.
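As a rough illustration of the group-level screening described above, a scalar metric such as the balance of attraction or the relationship balance could be compared across all egos, flagging those far from the group's typical value. The sketch below uses the modified z-score based on the median and the median absolute deviation (MAD), with the commonly used cutoff of 3.5. This is a generic outlier rule chosen for the example, not the classification method of this paper, and the function name `flag_outliers` is hypothetical.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag egos whose metric lies far from the group median.

    values: dict mapping ego id -> scalar metric (assumed input layout).
    Uses the modified z-score 0.6745 * |x - median| / MAD (assumed rule).
    """
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        # At least half of the egos share the median value:
        # treat every ego that differs from it as suspicious.
        return {ego for ego, v in values.items() if v != med}
    return {ego for ego, v in values.items()
            if 0.6745 * abs(v - med) / mad > threshold}


if __name__ == "__main__":
    tau_by_ego = {"a": 1.0, "b": 1.1, "c": 0.9, "d": 1.05, "e": 9.0}
    print(flag_outliers(tau_by_ego))
```

A median-based rule is used here because a few extreme egos would otherwise distort a mean and standard deviation computed over the same group.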
