*5.3. System Overview*

Motivated by the above requirements, we design egoDetect to detect and analyze anomalous users at three scales: a group view that shows the distribution of all egos according to their features and anomaly scores, an ego view that shows the topology and features of an ego network, and a detail view that shows further details of the ego and of the relationships between the ego and its alters.

The data pipeline of the system is shown in Figure 3. The raw data are stored in HDFS (Hadoop Distributed File System). We use Spark to model the graph and compute its metrics, and use the results to build the egos' features. From these features, we obtain an anomaly score for each ego in the network. Then, we use multidimensional scaling (MDS) [45] to reveal the distribution of the egos and use a novel glyph to show the ego network of each. The views are implemented with D3 and ECharts, with Flask as the web framework.

**Figure 3.** The data processing pipeline.
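
To make the feature-building stage of the pipeline concrete, the following is a minimal sketch of how per-ego features could be derived from an edge list. It is an illustration under stated assumptions, not the system's implementation: it runs in memory in plain Python rather than on Spark, the function names `build_adjacency` and `ego_features` are hypothetical, and the three features shown (degree, ego-network edge count, density) are common ego-network measures chosen for illustration and are not necessarily the feature set used by egoDetect.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adj[u].add(v)
        adj[v].add(u)
    return adj


def ego_features(adj, ego):
    """Compute illustrative features of the 1-hop ego network of `ego`.

    The ego network contains the ego, its alters (direct neighbors),
    and every edge among these nodes.
    """
    alters = adj[ego]
    n_alters = len(alters)
    # Ties between alters, counted once per unordered pair.
    alter_ties = sum(1 for a, b in combinations(alters, 2) if b in adj[a])
    n_edges = n_alters + alter_ties          # ego-alter edges + alter-alter edges
    n_nodes = n_alters + 1                   # alters + the ego itself
    max_edges = n_nodes * (n_nodes - 1) / 2  # edges in a complete graph
    density = n_edges / max_edges if max_edges else 0.0
    return {"degree": n_alters, "edges": n_edges, "density": density}


# Toy edge list (hypothetical data, for illustration only).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]
adj = build_adjacency(edges)
features = {ego: ego_features(adj, ego) for ego in list(adj)}
```

In such a sketch, the resulting feature vectors would be the input to the later stages of the pipeline (anomaly scoring and the MDS projection); on large graphs the same per-ego aggregation would be expressed as a distributed job instead of a Python loop.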
