Sketch-Based Anomaly Detection in Streaming Graphs

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? For example, in intrusion detection, existing work seeks to detect either anomalous edges or anomalous subgraphs, but not both. In this paper, we first extend the count-min sketch data structure to a higher-order sketch. This higher-order sketch has the useful property of preserving the dense subgraph structure (dense subgraphs in the input turn into dense submatrices in the data structure). We then propose 4 online algorithms that utilize this enhanced data structure, which (a) detect both edge and graph anomalies; (b) process each edge and graph in constant memory and constant update time per newly arriving edge; and (c) outperform state-of-the-art baselines on 4 real-world datasets. Our method is the first streaming approach that incorporates dense subgraph search to detect graph anomalies in constant memory and time.


INTRODUCTION
Consider an intrusion detection system, in which anomalous behavior can be described as an individual or a group of attackers making a large number of connections to some set of targeted machines to restrict accessibility or look for potential vulnerabilities. We can model this as a dynamic graph, where nodes correspond to machines, and each edge represents a timestamped connection from one machine to another.
In this graph, edge anomalies include individual connections that are significantly more malicious than the rest of the connections in the network. In addition, anomalous behavior often also takes the form of a dense subgraph, which could represent a group of malicious nodes that are communicating with each other in a way that is unusual compared to the rest of the graph, as shown in several real-world datasets in [1][2][3]. Detecting both these edge and subgraph anomalies together provides valuable insights into the structure and behavior of the network and can help identify trends or patterns that might not be evident by considering only one type of anomaly.
Similarly, in a financial system, edge anomalies might include transactions that are significantly larger or more frequent than the rest of the transactions in the system. Subgraph anomalies might include groups of individuals or businesses who are significantly more likely to be involved in fraudulent activity than the rest of the system. Identifying both types of anomalies simultaneously can help detect fraudulent activity and protect against financial loss. Thus, we ask the question: Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to both edges and subgraphs in an online manner, for the purpose of detecting unusual behavior, using constant memory and constant update time per newly arriving edge?
Several approaches [4][5][6][7][8][9][10] aim to detect anomalies in graph settings. However, these approaches focus on static graphs, whereas many real-world graphs are time-evolving in nature. In streaming or online graph scenarios, some methods can detect the presence of anomalous edges [3][11][12][13], while others can detect anomalous subgraphs [1,2,14]. However, all existing methods are limited to either anomalous edge or graph detection and are not able to detect both kinds of anomalies, as summarized in Table 1. As we discuss in Section 7, our approach outperforms existing methods in both accuracy and running time, on both anomalous edge and subgraph detection scenarios. Moreover, our approach is the only streaming method that makes use of dense subgraph search to detect graph anomalies while only requiring constant memory and time.
We first extend the two-dimensional sketch to a higher-order sketch to enable it to embed the relation between the source and destination nodes in a graph. A higher-order sketch has the useful property of preserving the dense subgraph structure; dense subgraphs in the input turn into dense submatrices in this data structure. Thus, the problem of detecting a dense subgraph from a large graph reduces to finding a dense submatrix in a constant size matrix, which can be achieved in constant time. The higher-order sketch allows us to propose several algorithms to detect both anomalous edges and subgraphs in a streaming manner. We introduce two edge anomaly detection methods, AnoEdge-G and AnoEdge-L, and two graph anomaly detection methods, AnoGraph and AnoGraph-K, that use the same data structure to detect the presence of a dense submatrix, and consequently anomalous edges or subgraphs, respectively. All our approaches process edges and graphs in constant time, and are independent of the graph size, i.e., they require constant memory. We also provide theoretical guarantees on the higher-order sketch estimate and the submatrix density measure. In summary, the main contributions of our paper are: (
Edge Stream Methods: HotSpot [50] detects nodes whose egonets suddenly change. RHSS [51] focuses on sparsely-connected graph parts. CAD [52] localizes anomalous changes using commute time distance measurement. More recently, DenseStream [1] maintains and updates a dense subtensor in a tensor stream. SedanSpot [11] identifies edge anomalies based on edge occurrence, preferential attachment, and mutual neighbors. PENminer [12] explores the persistence of activity snippets, i.e., the length and regularity of edge-update sequences' reoccurrences. F-FADE [13] aims to detect anomalous interaction patterns by factorizing their frequency. MIDAS [3,53] identifies microcluster-based anomalies. However, all these methods are unable to detect graph anomalies.
Graph Stream Methods: DTA/STA [54] approximates the adjacency matrix of the current snapshot using matrix factorization. Copycatch [55] spots near-bipartite cores where each node is connected to others in the same core densely within a short time. SPOT/DSPOT [33] use extreme value theory to automatically set thresholds for anomalies. IncGM+ [56] utilizes an incremental method to process graph updates. More recently, DenseAlert identifies subtensors created within a short time and utilizes an incremental method to process graph updates or subgraphs more efficiently. SpotLight [2] discovers anomalous graphs with dense bi-cliques, but uses a randomized approach without any search for dense subgraphs. AnomRank [14], inspired by PageRank [57], iteratively updates two score vectors and computes anomaly scores. However, these methods are slow and do not detect edge anomalies. Moreover, they do not search for dense subgraphs in constant memory/time.

PROBLEM
Let $\mathcal{E} = \{e_1, e_2, \cdots\}$ be a stream of weighted edges from a time-evolving graph $\mathcal{G}$. Each arriving edge is a tuple $e_i = (u_i, v_i, w_i, t_i)$ consisting of a source node $u_i \in \mathcal{V}$, a destination node $v_i \in \mathcal{V}$, a weight $w_i$, and a time of occurrence $t_i$, the time at which the edge is added to the graph. For example, in a network traffic stream, an edge $e_i$ could represent a connection made from a source IP address $u_i$ to a destination IP address $v_i$ at time $t_i$. We do not assume that the set of vertices $\mathcal{V}$ is known a priori: for example, new IP addresses or user IDs may be created over the course of the stream.
We model $\mathcal{G}$ as a directed graph. Undirected graphs can be handled by treating an incoming undirected edge as two simultaneous directed edges, one in each direction. We also allow $\mathcal{G}$ to be a multigraph: edges can be created multiple times between the same pair of nodes. Edges are allowed to arrive simultaneously, i.e., $t_{i+1} \geq t_i$, since in many applications $t_i$ is given as a discrete time tick.
The desired properties of our algorithm are as follows:
• Detecting Anomalous Edges: To detect whether the edge is part of an anomalous subgraph in an online manner. Being able to detect anomalies at the finer granularity of edges allows early detection so that recovery can be started as soon as possible and the effect of malicious activities is minimized.
• Detecting Anomalous Graphs: To detect the presence of an unusual subgraph (consisting of edges received over a period of time) in an online manner, since such subgraphs often correspond to unexpected behavior, such as coordinated attacks.
• Constant Memory and Update Time: To ensure scalability, memory usage and update time should not grow with the number of nodes or the length of the stream. Thus, for a newly arriving edge, our algorithm should run in constant memory and update time.

HIGHER-ORDER SKETCH & NOTATIONS
Count-min sketches (CMS) [58] are popular streaming data structures used by several online algorithms [59]. CMS uses multiple hash functions to map events to frequencies but, unlike a hash table, uses only sub-linear space, at the expense of overcounting some events due to collisions. Frequency is approximated as the minimum over all hash functions. CMS, shown in Figure 1(a), is represented as a two-dimensional matrix where each row corresponds to a hash function and hashes to the same number of buckets (columns). We introduce a Higher-order CMS (H-CMS) data structure where each hash function maps multi-dimensional input to a generic tensor instead of mapping it to a row vector. H-CMS enhances CMS by separately hashing the individual components of an entity, thereby maintaining more information. Figure 1(b) shows a 3-dimensional H-CMS that can be used to hash two-dimensional entities such as graph edges to a matrix. The source node is hashed to the first dimension and the destination node to the other dimension of the sketch matrix, as opposed to the original CMS that will hash the entire edge to a one-dimensional row vector (Figure 1(a)).
We use a 3-dimensional H-CMS (operations described in Algorithm 1) where the number of hash functions is denoted by $n_r$, and the matrix $\mathcal{M}_j$ corresponding to the $j$-th hash function $h_j$ is of dimension $n_b \times n_b$, i.e., a square matrix. For each $j \in [n_r]$, the $j$-th hash function, denoted by $h_j(u, v)$, maps an edge $(u, v)$ to a matrix index $(h'_j(u), h''_j(v))$, i.e., the source node is mapped to a row index and the destination node is mapped to a column index. That is, $h_j(u, v) = (h'_j(u), h''_j(v))$. Therefore, each matrix in a 3-dimensional H-CMS captures the essence of a graph adjacency matrix. Dense subgraph detection can thus be transformed into a dense submatrix detection problem (as shown in Figure 2) where the size of the matrix is a small constant, independent of the number of edges or the graph size.
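The data structure described above can be illustrated with a minimal Python sketch. It is not the paper's Algorithm 1: the class name `HCMS`, the method names, the CRC32-based affine hash family, and the choice of reusing one hash per matrix for both source and destination are assumptions made only for illustration. The `decay` method anticipates the temporal decay used by AnoEdge-G.

```python
import random
import zlib


class HCMS:
    """Illustrative higher-order count-min sketch: n_r matrices of n_b x n_b counts."""

    PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used by the assumed hash family

    def __init__(self, n_r, n_b, seed=0):
        rng = random.Random(seed)
        self.n_r, self.n_b = n_r, n_b
        # One (a, b) pair per hash function h_j. The same hash is reused for the
        # source and the destination node (an assumption of this sketch).
        self.params = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                       for _ in range(n_r)]
        self.mats = [[[0.0] * n_b for _ in range(n_b)] for _ in range(n_r)]

    def bucket(self, j, node):
        """Map a node id to a bucket in [0, n_b) under the j-th hash function."""
        a, b = self.params[j]
        key = zlib.crc32(str(node).encode("utf-8"))
        return ((a * key + b) % self.PRIME) % self.n_b

    def update(self, u, v, w=1.0):
        """Add weight w to cell (h_j(u), h_j(v)) of every matrix."""
        for j in range(self.n_r):
            self.mats[j][self.bucket(j, u)][self.bucket(j, v)] += w

    def estimate(self, u, v):
        """Estimated count of edge (u, v): the minimum over all hash functions."""
        return min(self.mats[j][self.bucket(j, u)][self.bucket(j, v)]
                   for j in range(self.n_r))

    def decay(self, alpha):
        """Multiply every count by alpha (temporal decay)."""
        for mat in self.mats:
            for row in mat:
                for c in range(self.n_b):
                    row[c] *= alpha
```

For instance, after `sketch = HCMS(n_r=2, n_b=32)` and three calls to `sketch.update("10.0.0.1", "10.0.0.2")`, `sketch.estimate("10.0.0.1", "10.0.0.2")` is at least 3; it can only exceed the true count when other edges collide with the same cell in every matrix.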
For any $(u, v)$, let $y(u, v)$ be the true count of $(u, v)$ observed thus far and $\hat{y}(u, v)$ be the estimate of the count via the 3-dimensional H-CMS. Since the H-CMS can overestimate the count through collisions (but cannot underestimate, because we update and keep all the counts for every hash function), we have $y(u, v) \leq \hat{y}(u, v)$. We define $M$ to be the number of all observations so far; i.e., $M = \sum_{u,v} y(u, v)$. The following theorem shows that the 3-dimensional H-CMS has estimate guarantees similar to those of the CMS:
Theorem 1. For all $j \in [n_r]$, let $h_j(u, v) = (h'_j(u), h''_j(v))$, where each of the hash functions $h'_j$ and $h''_j$ is chosen uniformly at random from a pairwise-independent family. Here, we allow both cases of $h' = h''$ and $h' \neq h''$. Fix $\epsilon > 0$ and set $n_r = \ln \frac{1}{\delta}$ and $n_b = \lceil \frac{e}{\epsilon} \rceil$. Then, with probability at least $1 - \delta$, $\hat{y}(u, v) \leq y(u, v) + \epsilon M$.
Theorem 1 shows that we have the estimate guarantee even if we use the same hash function for both the source nodes and the destination nodes (i.e., $h' = h''$). Thus, with abuse of notation, we write $h(u, v) = (h(u), h(v))$ when $h' = h''$ by setting $h = h' = h''$ on the right-hand side. On the other hand, in the case of $h' \neq h''$, it would be possible to improve the estimate guarantee in Theorem 1. For example, if we can make $h$ be chosen uniformly at random from a weakly universal set of hash functions (by defining corresponding families of distributions for $h'$ and $h''$ under some conditions), then we can set $n_b = \lceil \sqrt{e / \epsilon} \rceil$ to have the same estimate guarantee as that of Theorem 1, based on the proof of Theorem 1. The analysis for such a potential improvement is left for future work as an open problem.
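As a worked example of the parameter choice in Theorem 1, the following Python sketch computes the sketch dimensions from a target error and failure probability. The function name `hcms_dimensions` and the argument names `eps` and `delta` are illustrative assumptions; the number of hash functions is rounded up because it must be an integer, and the `weakly_universal` flag applies the square-root setting that the text above describes only as a possible improvement, not as a proven result.

```python
import math


def hcms_dimensions(eps, delta, weakly_universal=False):
    """Return (n_r, n_b) for additive error eps * M with probability >= 1 - delta."""
    if not (0 < eps and 0 < delta < 1):
        raise ValueError("need eps > 0 and 0 < delta < 1")
    # Number of hash functions, rounded up to an integer (assumption of this sketch).
    n_r = math.ceil(math.log(1.0 / delta))
    # Buckets per dimension; the square-root variant is the conjectured improvement.
    ratio = math.e / eps
    n_b = math.ceil(math.sqrt(ratio)) if weakly_universal else math.ceil(ratio)
    return n_r, n_b


print(hcms_dimensions(0.01, 0.01))        # (5, 272)
print(hcms_dimensions(0.01, 0.01, True))  # (5, 17)
```

With these example values, the sketch would hold 5 matrices of 272 x 272 counts regardless of how many edges or nodes the stream contains.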
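Frequently used symbols are discussed in Table 2, and we leverage the subgraph density measure discussed in [60] to define the submatrix $(S_x, T_x)$ density.
Definition 1. Given matrix $\mathcal{M}$, the density of a submatrix of $\mathcal{M}$ represented by $S_x \subseteq S$ and $T_x \subseteq T$ is:
$$\mathcal{D}(\mathcal{M}, S_x, T_x) = \frac{\sum_{s \in S_x} \sum_{t \in T_x} \mathcal{M}[s][t]}{\sqrt{|S_x| |T_x|}}$$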
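A small numeric example may help fix the density measure. The sketch below assumes the directed-graph density of [60], i.e., the sum of the submatrix entries divided by the square root of the product of the number of selected rows and columns; the function name `density` and the toy matrix are illustrative only.

```python
import math


def density(M, rows, cols):
    """Sum of M[s][t] over the selected rows and columns, over sqrt(|rows| * |cols|)."""
    if not rows or not cols:
        return 0.0
    total = sum(M[s][t] for s in rows for t in cols)
    return total / math.sqrt(len(rows) * len(cols))


# Toy sketch matrix: a dense 2 x 2 block in the top-left corner.
M = [[5, 4, 0],
     [3, 6, 0],
     [0, 0, 1]]

print(density(M, {0, 1}, {0, 1}))        # 18 / sqrt(4) = 9.0
print(density(M, {0, 1, 2}, {0, 1, 2}))  # 19 / sqrt(9), about 6.33
```

The dense block scores higher than the full matrix, which is the property the anomaly scores rely on.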

EDGE ANOMALIES
In this section, using the H-CMS data structure, we propose AnoEdge-G and AnoEdge-L to detect edge anomalies by checking whether the received edge, when mapped to a sketch matrix element, is part of a dense submatrix. AnoEdge-G finds a Global dense submatrix and AnoEdge-L maintains and updates a Local dense submatrix around the matrix element.

AnoEdge-G
AnoEdge-G, as described in Algorithm 2, maintains a temporally decaying H-CMS, i.e., whenever 1 unit of time passes, we multiply all the H-CMS counts by a fixed factor $\alpha$ (lines 2,4). This decay simulates the gradual 'forgetting' of older, and hence more outdated, information. When an edge $(u, v)$ arrives, $u$ and $v$ are mapped to matrix indices $h(u)$ and $h(v)$ respectively for each hash function $h$, and the corresponding H-CMS counts are updated (line 5). The Edge-Submatrix-Density procedure (described below) is then called to compute the density of a dense submatrix around $(h(u), h(v))$.
Density is reported as the anomaly score for the edge; a larger density implies that the edge is more likely to be anomalous. The Edge-Submatrix-Density procedure calculates the density of a dense submatrix around a given index $(h(u), h(v))$. A $1 \times 1$ submatrix, represented by $S_{cur}$ and $T_{cur}$, is initialized with row index $h(u)$ and column index $h(v)$ (line 9). The submatrix is iteratively expanded by greedily selecting a row $u_p$ from $S_{rem}$ (or a column $v_p$ from $T_{rem}$) that obtains the maximum row (or column) sum with the current submatrix (lines 11,12). This selected row $u_p$ (or column $v_p$) is removed from $S_{rem}$ (or $T_{rem}$) and added to $S_{cur}$ (or $T_{cur}$) (lines 14,16). The process is repeated until both $S_{rem}$ and $T_{rem}$ are empty (line 10). The density of the current submatrix is computed at each iteration of the submatrix expansion process and the maximum over all greedily formed submatrix densities is returned (lines 17,18).
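The greedy expansion described above can be sketched in Python for a single sketch matrix. This is an illustration under stated assumptions, not the paper's listing: the density is taken to be the entry sum divided by the square root of the product of row and column counts, ties are broken towards the lower index and towards adding a column, and the function name `edge_submatrix_density` is hypothetical.

```python
import math


def edge_submatrix_density(M, r, c):
    """Greedily grow a submatrix from cell (r, c); return the best density seen."""
    n = len(M)
    s_cur, t_cur = {r}, {c}
    s_rem = set(range(n)) - s_cur
    t_rem = set(range(n)) - t_cur
    total = M[r][c]                          # sum of entries in the current submatrix
    row_sum = [M[i][c] for i in range(n)]    # row i restricted to columns in t_cur
    col_sum = [M[r][j] for j in range(n)]    # column j restricted to rows in s_cur
    best = total / math.sqrt(len(s_cur) * len(t_cur))
    while s_rem or t_rem:
        u_p = max(s_rem, key=lambda i: (row_sum[i], -i)) if s_rem else None
        v_p = max(t_rem, key=lambda j: (col_sum[j], -j)) if t_rem else None
        take_row = v_p is None or (u_p is not None and row_sum[u_p] > col_sum[v_p])
        if take_row:
            s_rem.remove(u_p)
            s_cur.add(u_p)
            total += row_sum[u_p]
            for j in range(n):               # the new row contributes to column sums
                col_sum[j] += M[u_p][j]
        else:
            t_rem.remove(v_p)
            t_cur.add(v_p)
            total += col_sum[v_p]
            for i in range(n):               # the new column contributes to row sums
                row_sum[i] += M[i][v_p]
        best = max(best, total / math.sqrt(len(s_cur) * len(t_cur)))
    return best


M = [[5, 4, 0],
     [3, 6, 0],
     [0, 0, 1]]
print(edge_submatrix_density(M, 0, 0))  # 9.0, reached at the top-left 2 x 2 block
```

In this toy matrix, an edge hashed to cell (0, 0) inside the dense block scores 9.0, while an edge hashed to the isolated cell (2, 2) scores about 6.33, the density of the whole matrix.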

AnoEdge-L
Inspired by Definition 1, we define the likelihood measure of a matrix index $(h(u), h(v))$ with respect to a submatrix $(S_x, T_x)$ as the sum of the elements of submatrix $(S_x, T_x)$ that either share a row with index $h(u)$ or a column with index $h(v)$, divided by the total number of such elements.
Definition 2. Given matrix $\mathcal{M}$, the likelihood of an index $h(u, v)$ with respect to a submatrix represented by $S_x \subseteq S$ and $T_x \subseteq T$ is:
$$\mathcal{L}(\mathcal{M}, u, v, S_x, T_x) = \frac{\sum_{(s,t) \in S_x \times \{h(v)\} \cup \{h(u)\} \times T_x} \mathcal{M}[s][t]}{|S_x \times \{h(v)\} \cup \{h(u)\} \times T_x|}$$
AnoEdge-L, as described in Algorithm 3, maintains a temporally decaying H-CMS to store the edge counts. We also initialize a mutable submatrix of size $1 \times 1$ with a random element, and represent it as $(S_{cur}, T_{cur})$. As we process edges, we greedily update $(S_{cur}, T_{cur})$ to maintain it as a dense submatrix. When an edge arrives, H-CMS counts are first updated, and the received edge is then used to check whether to expand the current submatrix (line 7). If the submatrix density increases upon the addition of the row (or column), then the row index $h(u)$ (or column index $h(v)$) is added to the current submatrix $(S_{cur}, T_{cur})$. To remove the row(s) and column(s) decayed over time, the process iteratively selects the row (or column) with the minimum row-sum (or column-sum) until removing it no longer increases the current submatrix density. This ensures that the current submatrix is as condensed as possible (line 9). As defined in Definition 2, AnoEdge-L computes the likelihood score of the edge with respect to $(S_{cur}, T_{cur})$ (line 10). A higher likelihood measure implies that the edge is more likely to be anomalous.
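The likelihood measure can be illustrated with a few lines of Python. The sketch follows the verbal description in this section: it averages the matrix entries that lie in the row of the hashed source index across the submatrix columns, or in the column of the hashed destination index across the submatrix rows. The function name `likelihood` and the toy data are assumptions for illustration.

```python
def likelihood(M, r, c, rows, cols):
    """Average of the entries in row r over `cols` and in column c over `rows`."""
    # Cells sharing column c with the submatrix rows, or row r with its columns.
    cells = {(s, c) for s in rows} | {(r, t) for t in cols}
    if not cells:
        return 0.0
    return sum(M[s][t] for s, t in cells) / len(cells)


M = [[5, 4, 0],
     [3, 6, 0],
     [0, 0, 1]]
dense_rows, dense_cols = {0, 1}, {0, 1}

print(likelihood(M, 0, 0, dense_rows, dense_cols))  # (5 + 3 + 4) / 3 = 4.0
print(likelihood(M, 2, 2, dense_rows, dense_cols))  # 0 / 4 = 0.0
```

An edge that maps inside the maintained dense submatrix receives a high score, while an edge that maps far from it receives a score near zero; only one row and one column are touched per matrix, which is why this score is cheaper than a full greedy search.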

GRAPH ANOMALIES
We now propose AnoGraph and AnoGraph-K to detect graph anomalies by first mapping the graph to a higher-order sketch, and then checking for a dense submatrix. These are the first streaming algorithms that make use of dense subgraph search to detect graph anomalies in constant memory and time. AnoGraph greedily finds a dense submatrix with a 2-approximation guarantee on the density measure. AnoGraph-K leverages Edge-Submatrix-Density from Algorithm 2 to greedily find a dense submatrix around $K$ strategically picked matrix elements, performing equally well in practice.

AnoGraph
AnoGraph, as described in Algorithm 4, maintains an H-CMS to store the edge counts that are reset whenever a new graph arrives.
The edges are first processed to update the H-CMS counts. The AnoGraph-Density procedure (described below) is then called to find the dense submatrix. AnoGraph reports the anomaly score as the density of the detected (dense) submatrix; a larger density implies that the graph is more likely to be anomalous. The AnoGraph-Density procedure computes the density of a dense submatrix of matrix $\mathcal{M}$. The current dense submatrix is initialised as matrix $\mathcal{M}$ and then the row (or column) from the current submatrix with minimum row (or column) sum is greedily removed. This process is repeated until $S_{cur}$ and $T_{cur}$ are empty (line 11). The density of the current submatrix is computed at each iteration of the greedy removal process and the maximum over all densities is returned (lines 18,19).
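The greedy removal just described can be sketched in Python for one sketch matrix. The code below is an illustration, not the paper's Algorithm 4: it assumes the density is the entry sum divided by the square root of the product of row and column counts, removes a column when row and column minima tie, and stops once either index set becomes empty; the function name `anograph_density` is hypothetical.

```python
import math


def anograph_density(M):
    """Greedily peel the lightest row or column; return the best density seen."""
    n = len(M)
    if n == 0:
        return 0.0
    rows, cols = set(range(n)), set(range(n))
    row_sum = [sum(M[i]) for i in range(n)]
    col_sum = [sum(M[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_sum)
    best = total / math.sqrt(len(rows) * len(cols))
    while rows and cols:
        u = min(rows, key=lambda i: (row_sum[i], i))
        v = min(cols, key=lambda j: (col_sum[j], j))
        if row_sum[u] < col_sum[v]:
            rows.remove(u)
            total -= row_sum[u]
            for j in cols:               # remaining columns lose the removed row
                col_sum[j] -= M[u][j]
        else:
            cols.remove(v)
            total -= col_sum[v]
            for i in rows:               # remaining rows lose the removed column
                row_sum[i] -= M[i][v]
        if rows and cols:
            best = max(best, total / math.sqrt(len(rows) * len(cols)))
    return best


M = [[5, 4, 0],
     [3, 6, 0],
     [0, 0, 1]]
print(anograph_density(M))  # 9.0, the density of the top-left 2 x 2 block
```

On this toy matrix the peeling order is column 2, then row 2, which exposes the dense 2 x 2 block before the remaining rows and columns are removed.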
Algorithm 4 is a special case of the problem of finding the densest subgraph in a directed graph [60], where the directed graph is represented as an adjacency matrix and detecting the densest subgraph essentially means detecting a dense submatrix. We now provide a guarantee on the density measure.
Lemma 1. Let $S^*$ and $T^*$ be the optimum densest submatrix solution of $\mathcal{M}$ with density $\mathcal{D}(\mathcal{M}, S^*, T^*) = d_{opt}$. Then $\forall u \in S^*$ and [...] where: [...]
Proof. Leveraging the proof from [60], we greedily remove the row (or column) with minimum row-sum (or column-sum). At some iteration of the greedy process, $\forall u \in S_{cur}; \forall v \in T_{cur}$, $\mathcal{R}(\mathcal{M}, u, T_{cur}) \geq$ [...]

AnoGraph-K
Similar to AnoGraph, AnoGraph-K maintains an H-CMS which is reset whenever a new graph arrives. It uses the AnoGraph-K-Density procedure (described below) to find the dense submatrix. AnoGraph-K is summarised in Algorithm 5.
AnoGraph-K-Density computes the density of a dense submatrix of matrix $\mathcal{M}$. The intuition comes from the heuristic that the matrix elements with a higher value are more likely to be part of a dense submatrix. Hence, the approach considers the $K$ largest elements of the matrix $\mathcal{M}$ and calls Edge-Submatrix-Density from Algorithm 2 to get the dense submatrix around each of those elements (line 13). The maximum density over the considered $K$ dense submatrices is returned.
In this section, we evaluate the performance of our approaches as compared to all baselines discussed in Table 1 and aim to answer the following questions:
Table 3 shows the statistical summary of the four real-world datasets that we use: DARPA [61] and ISCX-IDS2012 [62] are popular datasets for graph anomaly detection used by baselines to evaluate their algorithms; [63] surveys more than 30 datasets and recommends using the newer CIC-IDS2018 and CIC-DDoS2019 datasets [64,65] containing modern attack scenarios. $|E|$ corresponds to the total number of edge records; $|V|$ and $|T|$ are the number of unique nodes and unique timestamps, respectively.

EXPERIMENTS
Similar to baseline papers, we report the Area under the ROC curve (AUC) and the running time. AUC is calculated by plotting the true positive rate (TPR) against the false positive rate (FPR) at various classification thresholds and then calculating the area under the resulting receiver operating characteristic (ROC) curve. The appropriate classification threshold for an anomaly detection system will depend on the specific application and the cost of false positives and false negatives; however, since AUC is independent of the classification threshold, one can evaluate the overall performance of the system without having to choose a specific threshold. Unless explicitly specified, all experiments, including those on the baselines, are repeated 5 times and the mean is reported.
Appendix D describes the experimental setup. Hyperparameters for the baselines are provided in Appendix E. All edge-based (or graph-based) methods output an anomaly score per edge (or graph), with a higher score implying more anomalousness.
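The AUC computation described above can be made concrete with a short, dependency-free Python sketch. It sweeps the classification threshold over the distinct anomaly scores, records the resulting (FPR, TPR) points, and integrates with the trapezoidal rule. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and not the evaluation code used for the reported results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve; labels are 1 (anomalous) or 0 (normal)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Visit items from the highest score down, i.e. lower the threshold step by step.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Items with equal scores cross the threshold together.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0  # trapezoid between points
        prev_tpr, prev_fpr = tpr, fpr
    return area


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because only the ordering of the scores matters, any monotone rescaling of the anomaly scores leaves the AUC unchanged, which is why no threshold has to be chosen.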

Edge Anomalies
Accuracy: Table 4 shows the AUC of edge anomaly detection baselines, AnoEdge-G, and AnoEdge-L. We report a single value for DenseStream and PENminer because these are non-randomized methods. PENminer is unable to finish on the large CIC-DDoS2019 within 24 hours. SedanSpot uses personalised PageRank to detect anomalies and is not always able to detect anomalous edges occurring in dense block patterns, while PENminer is unable to detect structural anomalies. Among the baselines, MIDAS-R is the most accurate; however, it performs worse when there is a large number of timestamps, as in ISCX-IDS2012. Note that AnoEdge-G and AnoEdge-L outperform all baselines on all datasets.
Running Time: Table 4 shows the running time (excluding I/O) and real-time performance of AnoEdge-G and AnoEdge-L. Since AnoEdge-L maintains a local dense submatrix, it is faster than AnoEdge-G. DenseStream maintains dense blocks incrementally for every incoming tuple and updates dense subtensors when it meets an updating condition, limiting the detection speed. SedanSpot requires several subprocesses (hashing, random-walking, reordering, sampling, etc.), while PENminer and F-FADE need to actively extract patterns for every graph update, resulting in a large computation time. When there is a large number of timestamps, as in ISCX-IDS2012, MIDAS-R is slower than AnoEdge-L, which is the fastest.
AUC vs Running Time: Figure 3 plots accuracy (AUC) vs. running time (log scale, in seconds, excluding I/O) on the ISCX-IDS2012 dataset. AnoEdge-G and AnoEdge-L achieve much higher accuracy compared to all baselines, while also running significantly faster.
AnoEdge-G vs AnoEdge-L: AnoEdge-G finds a Global dense submatrix and therefore is more accurate than AnoEdge-L, as shown in the performance on CIC-IDS2018. AnoEdge-L, on the other hand, maintains and updates a Local dense submatrix around the matrix element and therefore has better time complexity and scalability to larger datasets.

Graph Anomalies
Accuracy: Table 5 shows the AUC of graph anomaly detection baselines, AnoGraph, and AnoGraph-K. We report a single value for DenseAlert and AnomRank because these are non-randomized methods. AnomRank is not meant for a streaming scenario, hence the low AUC. DenseAlert can estimate only one subtensor at a time and SpotLight uses a randomized approach without any actual search for dense subgraphs. Note that AnoGraph and AnoGraph-K outperform all baselines on all datasets while using a simple sketch data structure to incorporate dense subgraph search, as opposed to the baselines.
Running Time: Table 5 shows the running time (excluding I/O). DenseAlert has $O(|\mathcal{E}|)$ worst-case time complexity (per incoming edge). AnomRank needs to compute a global PageRank, which does not scale for stream processing. Note that AnoGraph and AnoGraph-K run much faster than all baselines.
AUC vs Running Time: [...] $+ |\mathcal{E}| * n_r)$. Therefore, AnoGraph-K is faster when $K$ is significantly smaller than $n_b$. AnoGraph-K is also more robust because it only considers a small number of matrix elements.

Hyperparameter Study
Table 6 shows the performance of AnoGraph and AnoGraph-K for multiple time windows and edge thresholds. The edge threshold is varied in such a way that a sufficient number of anomalies are present within the time window. AnoGraph and AnoGraph-K achieve comparable results to those in Table 5. Table 7 shows the robustness of AnoEdge-G and AnoEdge-L as we vary the temporal decay factor $\alpha$.

CONCLUSION
In this paper, we extend the CMS data structure to a higher-order sketch to capture complex relations in graph data and to reduce the problem of detecting suspicious dense subgraphs to finding a dense submatrix in constant time. We then propose four sketch-based streaming methods to detect edge and subgraph anomalies in constant update time and memory. Furthermore, our approach is the first streaming work that incorporates dense subgraph search to detect graph anomalies in constant memory and time. We also provide a theoretical guarantee on the submatrix density measure and prove the time and space complexities of all methods. Experimental results on four real-world datasets demonstrate the effectiveness of our methods compared to popular state-of-the-art streaming edge and graph baselines. Future work could consider incorporating rectangular H-CMS matrices, node and edge representations, more general types of data, including tensors, and parallel computing to process large dynamic graphs with a high volume of incoming edges.
Proof. Fix $j \in [n_r]$. Let $a = (u_a, v_a)$ and $b = (u_b, v_b)$ such that $a \neq b$. This implies that at least one of the following holds: $u_a \neq u_b$ or $v_a \neq v_b$. Thus, in both cases, the probability of a collision is $P(h_j(a) = h_j(b)) =$ [...]. Thus, by defining [...] By Markov's inequality on the right-hand side, we have that [...]

Time complexity of Algorithm 2 is $O(|\mathcal{E}| * n_r * n_b^2)$. Memory complexity of Algorithm 2 is $O(n_r * n_b^2)$.

Proof. Procedure Edge-Submatrix-Density removes rows (or columns) iteratively, and the total number of rows and columns that can be removed is $n_b + n_b - 2$. In each iteration, the approach performs the following three operations: (a) pick the row with minimum row-sum; (b) pick the column with minimum column-sum; (c) calculate density. We keep $n_b$-sized arrays for flagging removed rows (or columns), and for maintaining row-sums (or column-sums). Operations (a) and (b) take at most $n_b$ steps to pick and flag the row with minimum row-sum (or column-sum). Updating the column-sums (or row-sums) based on the picked row (or column) again takes at most $n_b$ steps. Time complexity of (a) and (b) is therefore $O(n_b)$. Density is directly calculated by subtracting the removed row-sum (or column-sum) and reducing the row-count (or column-count) from the earlier density value. Row-count and column-count are kept as separate variables. Therefore, the time complexity of the density calculation step is $O(1)$. Total time complexity of procedure Edge-Submatrix-Density is $O((n_b + n_b - 2) * (n_b + n_b + 1)) = O(n_b^2)$.
Time complexity to initialize and decay the H-CMS data structure is $O(n_r * n_b^2)$. The temporal decay operation is applied whenever the timestamp changes, and not for every received edge. The update counts operation updates a matrix element value (an $O(1)$ operation) for $n_r$ matrices, and the time complexity of this step is $O(n_r)$. The anomaly score for each edge is based on the submatrix density computation procedure, which is $O(n_b^2)$; the time complexity over $n_r$ matrices becomes $O(n_r * n_b^2)$. Therefore, the total time complexity of Algorithm 2 is $O(|\mathcal{E}| * (n_r + n_r * n_b^2)) = O(|\mathcal{E}| * n_r * n_b^2)$.
For procedure Edge-Submatrix-Density, we keep $n_b$-sized arrays to flag rows and columns that are part of the current submatrix, and to maintain row-sums and column-sums. Total memory complexity of the Edge-Submatrix-Density procedure is $O(4 * n_b) = O(n_b)$.
Memory complexity of the H-CMS data structure is $O(n_r * n_b^2)$. [...] We keep an $n_b$-sized array to flag the current submatrix rows (or columns), and also to maintain row-sums (or column-sums). The expand submatrix operation depends on the elements from row $h(u)$ and column $h(v)$, and the density is calculated by considering these elements, thus requiring at most $n_b$ steps. Upon addition of the row (or column), the dependent column-sums (or row-sums) are also updated, taking at most $n_b$ steps. Time complexity of the expand operation is therefore $O(n_b)$. (b) The condense submatrix operation removes rows and columns iteratively. A row (or column) elimination is performed by selecting the row (or column) with minimum row-sum (or column-sum) in $O(n_b)$ time. The removed row (or column) affects the dependent column-sums (or row-sums), which are updated in $O(n_b)$ time. Time complexity of a row (or column) removal is therefore $O(n_b)$. Condense submatrix removes rows (or columns) that were once added by the expand submatrix operation, which in the worst case is $O(|\mathcal{E}|)$.
Expand and condense submatrix operations are performed for $n_r$ matrices. Likelihood score calculation depends on elements from row $h(u)$ and column $h(v)$, and takes $O(n_r * n_b)$ time for $n_r$ matrices. Therefore, the total time complexity of Algorithm 3 is $O(n_r *$ [...]

Figure 1: (a) Original CMS with $n_b^2$ buckets for each hash function. (b) Higher-order CMS with $n_b \times n_b$ buckets for each hash function.

Algorithm 2: AnoEdge-G Scoring
Input: Stream $\mathcal{E}$ of edges over time
Output: Anomaly score per edge
1 Procedure AnoEdge-G($\mathcal{E}$)
2   Initialize H-CMS matrix $\mathcal{M}$ for edge count
3   while new edge $e = (u, v, w, t) \in \mathcal{E}$ is received do
4     Temporal decay H-CMS with timestamp change  // decay count
5     Update H-CMS matrix $\mathcal{M}$ for new edge $(u, v)$ with value $w$  // update count
6     output score($e$) ← Edge-Submatrix-Density($\mathcal{M}$, $h(u)$, $h(v)$)
7 Procedure Edge-Submatrix-Density($\mathcal{M}$, $u$, $v$)

Figure 3: AUC vs running time when detecting edge anomalies on ISCX-IDS2012

Figure 4: (a) Linear scalability with number of hash functions. (b) Linear scalability with number of edges.

Figure 5 plots accuracy (AUC) vs. running time (log scale, in seconds, excluding I/O) on the CIC-DDoS2019 dataset. AnoGraph and AnoGraph-K achieve much higher accuracy compared to the baselines, while also running significantly faster.

Figure 5: AUC vs running time when detecting graph anomalies on CIC-DDoS2019

Figure 6: (a) AnoGraph-K scales linearly with factor $K$. (b) Linear scalability with number of hash functions. (c) Linear scalability with number of edges.

Table 1: Comparison of relevant anomaly detection approaches.

Table 2: Table of symbols.

Table 3: Statistics of the datasets.

Table 4: AUC and Running Time when detecting edge anomalies. Averaged over 5 runs.

Table 5: AUC and Running Time when detecting graph anomalies. Averaged over 5 runs.

Table 6: Influence of time window and edge threshold on the ROC-AUC when detecting graph anomalies.

Table 7: Influence of temporal decay factor $\alpha$ on the ROC-AUC in AnoEdge-G and AnoEdge-L on DARPA.