### *5.1. System Design*

As social communication grows, the data it produces keeps increasing, which makes manual analysis laborious for experts, who therefore want a convenient tool to support their research. Our study finds that multi-level analysis can help them make decisions, especially on unlabeled data. In addition, although many studies aim to detect anomalies in social networks, they focus on specific types of networks, such as blogs, e-mail or telecommunications, and fail to make full use of the patterns these networks share. Based on these requirements, we design a novel and general visualization system, egoDetect, which can be explored at the macroscopic, mesoscopic and microscopic levels. The design requirements for these three levels are as follows:

Using the model from the previous chapter, we can select some suspicious users. While displaying these users, we should also show the users the model considers normal, because the algorithm's results are not guaranteed to be correct and a comparison is needed to see how the two groups differ. Therefore, the macroscopic level view aims to display the whole picture of the network and to help us make a preliminary classification of the data.


Through the macroscopic level view, the experts can select the ego they want to study in depth. They then need a more detailed view to help them make decisions. Prior work finds that the relationship between two persons is important and that a person's relationships with others should be stratified [5–7]. In other words, everyone's energy is limited, so people spend much of their time socializing with a small number of people and only a small part of it communicating with everyone else. The hierarchical structure of abnormal users, by contrast, is generally vague. Advertisers are more likely to show an outward-spreading structure: they contact many people, but the relationships are not strong. Robot accounts, in turn, show a high degree of intimacy with many people. Therefore, we need to reveal the relationship between the egos and alters in a mesoscopic level view.


The mesoscopic view analyzes egos from a holistic perspective, so it does not provide detailed information about them. Anomalous egos differ from others not only in topology but also in many other aspects, such as behavior, patterns and active time. Sometimes an abnormal ego cannot be judged directly from its topology. Therefore, the microscopic level view needs to provide behavior, patterns and other detailed information.

• T5 Exploring Time Sequence. Normal egos should have a specific active time, which may vary with occupation. Because people's energy is limited, their activity over time should show a hump shape and a zero-valued region. Robot accounts and anomalous accounts show long-term or even round-the-clock activity, while others may display local and random behavior.

• T6 Analyzing Alters. Alters are the users an ego contacts (contact-out) or is contacted by (contact-in); together they form the ego's network. The abnormal scores of the alters let us judge an ego from another angle. Besides, when we find an interesting alter in the mesoscopic view and want to dig deeper, or when we want a deeper understanding of the ego's behavior with each alter, this view gives us more information.
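The time-sequence idea in T5 can be illustrated with a small sketch. It is not the system's implementation: the function names `hourly_profile` and `longest_quiet_run`, the 24-bin hourly resolution, and the sample timestamps are all assumptions made for illustration. The sketch builds an hour-of-day activity profile for one ego and measures the longest run of zero-activity hours, which is one simple way to tell a profile with a rest period from round-the-clock activity.

```python
from collections import Counter
from datetime import datetime


def hourly_profile(timestamps):
    """Count contacts per hour of day (0-23) for one ego."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) for h in range(24)]


def longest_quiet_run(profile):
    """Length of the longest circular run of zero-activity hours."""
    n = len(profile)
    if all(c == 0 for c in profile):
        return n
    best = run = 0
    # Scan the profile twice so a quiet period wrapping past midnight is joined.
    for c in profile + profile:
        run = run + 1 if c == 0 else 0
        best = max(best, run)
    return min(best, n)


# Hypothetical sample data: one contact per active hour.
human = [datetime(2020, 1, 1, h, 0) for h in range(8, 23)]  # active 08:00-22:00
bot = [datetime(2020, 1, 1, h, 0) for h in range(24)]       # active all day

human_quiet = longest_quiet_run(hourly_profile(human))
bot_quiet = longest_quiet_run(hourly_profile(bot))
```

Under these assumptions, the ego with a rest period has a long quiet run, while the always-on account has none. Any threshold separating the two would have to be chosen from real data.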

### *5.2. Solar Ego Network Model*

To display the relationship between an ego and its alters with an egocentric network, we surveyed existing work and found that the traditional egocentric network view adopts a node-link layout, where each edge represents the strength of the relationship between the ego and an alter. However, this method is not intuitive enough, and visual clutter becomes severe as the data grows. Therefore, we believe that a new view should be designed to clearly show the network structure and the relationship with each contact in any situation.

Previous research indicates that the alters of an ego should form a hierarchy [6,7]. However, we find that anomalous users do not have this feature. Therefore, we decided to use this hierarchical model to display the egocentric network and to analyze egos by observing their network structure. This raises another question: how should the number of layers be determined? There is already considerable research and several mature algorithms, such as Jenks Natural Breaks Classification [49] and the Head/Tail method [50]. Their main problem is that different people may have different network structures. Anomalous egos in particular have varied structures, so it is hard to define uniform measurement indicators algorithmically. Thus, we adopt a compromise: we quantify each alter of an ego with a formula and then map it to a layer in the graph according to its value. This ensures that different users can be properly displayed. Besides, previous studies have shown that most users' alters can be divided into five categories, so our model uses a five-layer network structure.
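To make the layer-count problem concrete, the following is a minimal sketch of one common variant of head/tail breaks. It is shown only to illustrate the alternative mentioned above, not a method egoDetect uses. The function name `head_tail_breaks`, the 40% head limit, and the sample contact counts are assumptions. The point to notice is that the number of classes falls out of the data, so two egos can end up with different numbers of layers.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return break points (successive means) for a heavy-tailed list of values."""
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = sum(data) / len(data)
        head = [v for v in data if v > mean]
        # Stop when nothing lies above the mean or the head is no longer a minority.
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(mean)
        data = head  # recurse into the head only
    return breaks


# Hypothetical contact counts for two egos.
skewed = [1] * 10 + [20] * 3 + [100]   # a few very strong ties
flat = [1, 2, 3, 4]                    # no clear hierarchy

skewed_breaks = head_tail_breaks(skewed)
flat_breaks = head_tail_breaks(flat)
```

With these inputs the skewed ego yields two breaks (three classes) and the flat ego yields none, which shows why a data-driven class count is awkward when every ego must be drawn with the same number of tracks.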

Based on the above research, and inspired by the solar system, we design a novel view to address this problem. Its pipeline is shown in Figure 2. The view consists of a central node and five layers of tracks; the closer an alter is to the central node, the more intimate it is with the ego. The distance $\theta_{i,j}$ represents the relationship between the ego $i$ and the alter $j$. The equations for $\theta_{i,j}$ are as follows:

$$k_{i,j} = \frac{\max(C_{i,j}, C_{j,i})}{\min(C_{i,j}, C_{j,i})}\tag{10}$$

$$\theta_{i,j} = k_{i,j} \cdot \frac{1}{C_{i,j} + C_{j,i}}\tag{11}$$

$C_{i,j}$ is the number of contact-outs from $i$ to $j$. From the equations, we can see that $\theta_{i,j}$ is determined by the number of contacts in both directions. If the alter is unidirectional, the minimum in Equation (10) is zero and the ratio is undefined, so $k_{i,j} = -1$, which means the relationship between them is weak. With $k_{i,j}$, each alter can be placed on the corresponding layer. With this glyph, experts can analyze more efficiently.
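Equations (10) and (11) can be sketched in a few lines. The function names `ratio_k` and `distance_theta` and the sample contact counts are assumptions made for illustration. The value $-1$ for unidirectional alters follows the text above. The text does not say how that value is then mapped to one of the five tracks, so the sketch returns only the raw values.

```python
def ratio_k(c_ij, c_ji):
    """Eq. (10): max/min of the two contact counts; -1 for a unidirectional alter."""
    if min(c_ij, c_ji) == 0:
        return -1.0  # sentinel from the text: one-way contact, weak relationship
    return max(c_ij, c_ji) / min(c_ij, c_ji)


def distance_theta(c_ij, c_ji):
    """Eq. (11): theta = k / (C_ij + C_ji)."""
    total = c_ij + c_ji
    if total == 0:
        raise ValueError("ego and alter have no contacts")
    return ratio_k(c_ij, c_ji) / total


# Hypothetical contact counts (ego -> alter, alter -> ego).
close_friend = distance_theta(30, 25)   # frequent, balanced contact
acquaintance = distance_theta(4, 1)     # sparse, lopsided contact
one_way = distance_theta(12, 0)         # alter never replies
```

In this example the frequent, balanced pair gets a much smaller distance than the sparse, lopsided pair, which matches the reading that tracks closer to the central node indicate greater intimacy. The one-way alter gets a negative value that has to be handled separately.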

**Figure 2.** Workflow of the solar ego model.
