Preemptive Detection of Fake Accounts on Social Networks via Multi-Class Preferential Attachment Classifiers

In this paper, we describe a new algorithm called Preferential Attachment k-class Classifier (PreAttacK) for detecting fake accounts in a social network. Recently, several algorithms have obtained high accuracy on this problem. However, they have done so by relying on information about fake accounts' friendships or the content they share with others--the very things we seek to prevent. PreAttacK represents a significant departure from these approaches. We provide some of the first detailed distributional analyses of how new fake (and real) accounts first attempt to request friends after joining a major network (Facebook). We show that even before a new account has made friends or shared content, these initial friend request behaviors evoke a natural multi-class extension of the canonical Preferential Attachment model of social network growth. We use this model to derive a new algorithm, PreAttacK. We prove that in relevant problem instances, PreAttacK near-optimally approximates the posterior probability that a new account is fake under this multi-class Preferential Attachment model of new accounts' (not-yet-answered) friend requests. These are the first provable guarantees for fake account detection that apply to new users, and that do not require strong homophily assumptions. This principled approach also makes PreAttacK the only algorithm with provable guarantees that obtains state-of-the-art performance on new users on the global Facebook network, where it converges to AUC=0.9 after new users send + receive a total of just 20 not-yet-answered friend requests. For comparison, state-of-the-art benchmarks do not obtain this AUC even after observing additional data on new users' first 100 friend requests. Thus, unlike mainstream algorithms, PreAttacK converges before the median new fake account has made a single friendship (accepted friend request) with a human.


INTRODUCTION
Fake user accounts are the primary source of fake news and other malicious phenomena on social networks such as Facebook and Twitter. Organized campaigns of fake accounts have recently been used to influence public opinion, push propaganda, infiltrate political discourse, manipulate stock markets, steal personal data, and propagate scams [7, 13, 14, 16-18, 26, 34, 35, 37, 38, 40]. Detecting these fake accounts and limiting their ability to interact maliciously with humans are core tasks for modern social networks [11, 23, 47].
The scale of fake accounts has increased commensurately with the rapid growth of online social networks. In the last year alone, Facebook disabled 6.1 billion fake accounts, more than double the number of active users on the Facebook network [16]. This figure reflects immense recent progress in fake account classification: for example, Facebook disabled the vast majority of these fakes during account registration. Nonetheless, the fraction of active social network users who are fake has remained at roughly 4-5% (for Facebook) or 8-15% (for Twitter) for the last several years [16, 40].
The early detection paradox. These active fakes that evade registration-time classifiers and join a social network raise what we call the early detection paradox: mainstream algorithms to detect active fake accounts rely on information about their friends or the content they share with others, yet these friendships and shared content are the very things we seek to prevent. Our goal in this paper is to design algorithms that overcome this paradox by classifying active fake accounts before they make friends or share content.
Recent algorithms. This paradox is captured by the two mainstream approaches to fake account detection:
• Network-structural algorithms. Network-structural algorithms classify long-tenured accounts via the Homophily Assumption, which states that users eventually tend to cluster together (i.e. make the majority of their steady-state friendships) with other users who share their same {fake, real} label [20, 44, 49, 53]. Based on this assumption, network-structural algorithms attempt to propagate a small number of known users' {fake, real} labels across the friendship network to unknown users via either Random Walks [9, 15, 21, 49, 52, 53] or Belief Propagation [19, 20, 43-45].
• Feature-based classifiers. Recently, a variety of research has detected fake accounts in a supervised learning setting. State-of-the-art algorithms such as DEC [47], Jodie [25] and Ties [30] accomplish this via embeddings of tens of thousands of features that capture sophisticated properties of a user's friendship network, such as the average account age of a user's friends-of-friends, or temporal trends in the content a user shares over time [22, 24, 25, 30, 36, 41, 47]. While these algorithms have no theoretic guarantees, they are performant: Facebook now uses them to obtain high quality {fake, real} labels (AUC > 0.98) for virtually all of their long-tenured users [11, 23, 47].
Notwithstanding these impressive results, neither approach is ideally suited to the early detection of new fake accounts that have not yet made many (or any) friendships: Because such accounts have just passed registration-time feature-based classifiers, they cannot be detected by other feature-based classifiers until their features evolve significantly. Also, many informative features are unknown until after a new user has made several friends or shared content with others. Similarly, it is well-known that mainstream network-structural algorithms do not apply, as their theoretic guarantees rely critically on the Homophily Assumption, which only applies to long-tenured users who have had sufficient 'stabilization time' to make the majority of their eventual friendships [1, 9, 33, 44, 49]. For this reason, evaluations of network-structural algorithms have often excluded new users with less than e.g. 1 to 6 months of tenure on the social network [9, 10, 49]. Recent evaluations of these algorithms on the Facebook network suggest they perform poorly (AUC < 0.6) on new users who have not yet made many friends [11].
Overcoming the paradox. To address this early detection paradox, we use data from the Facebook social network to provide some of the first distributional analyses of how fake (and real) accounts target their friend requests after joining a major social network (Figs. 1-4). This focus on friend requests is motivated by the fact that new fake accounts can only meaningfully interact with real users after they have sent friend requests to real users (or received requests from real users) and those requests have been seen and accepted. Fig. 1 shows that among the subset of new fake accounts that eventually obtain a friendship with a real user, the median new fake account sends 16 friend requests before obtaining a single friendship (accepted request) with a real user (note log-scale). If we also include the requests these new fake accounts receive from others (Fig. 2), the count increases to 29 requests (sent + received).
Can we leverage this small number of not-yet-answered friend requests to distinguish new fake accounts from new real users?
k-Class Directed Preferential Attachment model (kCDPA). On the Facebook network, we observe that while fake and real users do differ slightly as a class in terms of the degree to which they send and receive requests from fakes and reals (red vs. blue distributions in Figs. 3 and 4, next page; see also Sec. 5), these class-level differences are small in comparison to individual-level differences (spread of distributions). Specifically, some users are exponentially more likely to request (or be requested by) a real user (mass right of red lines in Figs. 3 and 4, resp.); other users are exponentially more likely to request (or be requested by) a fake account (mass left of red lines).
This observation evokes the canonical Preferential Attachment (PA, i.e. rich-get-richer) generative model of social network growth [2, 5, 6, 8, 32]. In a traditional PA model, each new user joins a social network and sends friend requests to recipients who are selected with probability proportional to the counts of requests that they have already received. This process results in a power-law distribution of users' in-degrees such that a small number of recipients become vastly more popular than others. PA models and their associated dynamic processes continue to motivate a variety of recent results across several machine learning subfields. In our problem setting, fake and real users' 'preferential attachment' to different individuals inspires a natural multi-class extension of the PA model, which we call kCDPA:
• Suppose we observe an arbitrary preexisting directed network of friend requests between existing users. Then, suppose some new fake and real users join this network.
• New fakes and reals each send and receive friend requests to/from existing users who are chosen proportional to how many of the new user's fake/real class already did so.
The kCDPA model provides a principled foundation for a classifier that applies to new accounts. Specifically, recent research has highlighted various similar multi-class PA models as a theoretical mechanism for the emergence of homophily in social networks [3, 4, 27, 29, 54]. As such, our kCDPA model forms a natural antecedent to standard homophily-based fake account detection methods that are used to detect long-tenured fake accounts.
We emphasize that we use kCDPA to model the friend request networks of a small batch of new users; we do not assume that the entire network emerged from this process (which would be a far stronger assumption), nor do we assume that the (distinct) network of accepted friend requests (i.e. friendships) adheres to PA.
Main contribution. Our main result is an algorithm, PreAttacK, that determines the posterior probability that a new user is a fake account based on the kCDPA model of her (not-yet-answered) friend requests. Specifically, PreAttacK updates the probability that a new user is fake (1) to the extent she 'preferentially attaches' to specific recipients in keeping with their probabilities of being requested by fake accounts vs. by real ones, and (2) to the extent existing users 'preferentially attach' to her in keeping with their probabilities of sending requests to fake accounts vs. real ones.
• Theoretic contribution. We derive instance-specific bounds that show PreAttacK near-optimally approximates each new user's posterior probability of being fake in relevant problem instances at lower computational cost than alternatives. These are the first provable guarantees for fake account detection that apply to new users, and that do not require strong homophily assumptions. Indeed, despite the enormous popularity of Preferential Attachment models, to our

MULTI-CLASS DIRECTED PA (kCDPA)
Our core generative model is a simple but powerful extension of the canonical directed Preferential Attachment (rich-get-richer) model to the setting where there are k=2 classes of new users, fakes and reals, who join a preexisting social network. Whereas traditional PA models capture how new users tend to seek out already-popular users, kCDPA captures how new users tend to seek out (request and/or be requested by) users who are already popular with those of the new user's fake/real class:
• New users' outgoing friend requests: We model that new fake users send friend requests to existing users drawn in proportion to the counts of requests existing users already received from fakes only, and new real users send friend requests to existing users drawn in proportion to the counts of requests existing users already received from reals only.
• New users' incoming friend requests: Similarly, each new fake [or real] user receives requests from existing users who are drawn in proportion to the counts of requests existing users already sent to fakes [or reals].
The kCDPA model is formally described by the following generative process. Suppose we have a preexisting directed social network G(V, E_0, y_V) where edges E_0 capture friend requests (not friendships/accepted requests). We consider k=2 classes: users v ∈ V have known fake/real labels y_V ∈ {fake, real}^|V|. We will denote a single user v's label by lowercase ℓ_v. Finally, we have a small set of new users U = {u_1, ..., u_n} who are each fake with probability π. Some new users u are more likely than others to send a friend request and/or receive a friend request. To be as general as possible, suppose we have some distribution D that captures these probabilities (so D's domain includes 2|U| entries, two for the probability that each new user u will [send, receive] a friend request). The kCDPA model is then: Here, ε is a small constant that captures e.g.
the probability that a preexisting user v receives or sends her first-ever request (in Section 5 below, we consider a 'homophily-incorporating' extension where ε depends on the sender's and receiver's real/fake labels). By 1[·] we denote the indicator function that takes value 1 if the argument is true and 0 otherwise, so the sum under the if-statement counts the number of friend requests that existing user v has already received from users who have the same {fake, real} label as new user u. Note that this includes requests from the preexisting network (E_0) as well as requests from new users in previous iterations.¹ While we are interested in k=2 classes, note that kCDPA easily extends to the case where there are k>2 classes of users, y_V ∈ {1, ..., k}^|V|, just by replacing Bernoulli(π) with Multinom(π_1, ..., π_k). This captures (for example) settings where there are multiple types of fake users: sockpuppets, false news bots [42], etc., and each has different preferences in terms of existing users they seek to befriend.
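For concreteness, the generative process above can be sketched in a few lines of Python. This is a toy simulation under simplifying assumptions, not the paper's exact algorithm box: we replace the general distribution D with a uniform choice of which new user acts next, use a single scalar ε, and all function and variable names are our own.

```python
import random

def sample_kcdpa(recv_from, sent_to, n_new, n_events, pi=0.5, eps=0.1, seed=0):
    """Toy simulation of kCDPA over a small preexisting network.

    recv_from[c][v]: requests existing user v already received from
    class-c users; sent_to[c][v]: requests v already sent to class-c
    users, for c in {"fake", "real"}.
    """
    rng = random.Random(seed)
    V = sorted(recv_from["fake"])
    # each new user is fake with probability pi (Bernoulli(pi))
    labels = ["fake" if rng.random() < pi else "real" for _ in range(n_new)]
    events = []
    for _ in range(n_events):
        u = rng.randrange(n_new)          # toy stand-in for distribution D
        c = labels[u]
        if rng.random() < 0.5:            # u sends a request...
            weights = [eps + recv_from[c][v] for v in V]
            v = rng.choices(V, weights=weights)[0]
            recv_from[c][v] += 1          # counts grow as requests accrue
            events.append((u, "sent", v))
        else:                             # ...or u receives one
            weights = [eps + sent_to[c][v] for v in V]
            v = rng.choices(V, weights=weights)[0]
            sent_to[c][v] += 1
            events.append((u, "recv", v))
    return labels, events
```

Note how each draw's weights depend only on the acting user's class-specific counts, plus the smoothing constant ε, which is what gives never-requested users nonzero probability.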
Very recently, similar 2-class (and multi-class) PA models have received much attention due to their ability to explain the generative process by which homophily and related properties emerge in social networks [3, 4, 27, 29, 54]. However, for our purposes, we do not require the model to explain the full evolution of a social network; we merely require it to capture the friend request behavior of new users who join a long-established network (e.g. Facebook).

THE PREATTACK ALGORITHM
In this section, we derive a new algorithm, Preferential Attachment k-class Classifier (PreAttacK), that near-optimally approximates the posterior probability that each new social network user is a fake account under the kCDPA model. Intuitively, PreAttacK updates the probability that a new user is fake (1) to the extent she 'preferentially attached' to specific recipients in keeping with their probabilities of being requested by fake accounts vs. by reals, and (2) to the extent existing users 'preferentially attached' to her in keeping with their probabilities of sending requests to fake accounts vs. reals. Because kCDPA models friend requests rather than friendships (accepted requests), PreAttacK can classify new users even before they make a single friendship. Surprisingly, despite the complex properties of kCDPA (and PA processes in general), we show that PreAttacK is also computationally efficient on mature social networks containing billions of users.
PreAttacK considerations. We are interested in the k=2 case where users are {fake, real}, but we also show in Appendix B that PreAttacK accommodates k>2 to classify multiple types of fakes, such as sockpuppets and false news bots. Importantly, we will assume that the counts of requests that each new user sends and receives are independent of her label. This precludes the undesirable scenario where PreAttacK e.g. penalizes new real users who send many requests by increasing the posterior probability that they are fake. Finally, note that kCDPA generates no requests between new accounts. It is easy to modify kCDPA to generate such requests², but excluding them precludes a scenario where the posterior probability that one new account is fake depends only on other new accounts. This prevents malicious adversaries from manipulating PreAttacK by generating many new accounts at once (see Section 4.3).

PreAttacK part I: A new user's outgoing friend requests.
The conditional probability P^{fake}_{u→v,t,+} that new fake user u who sends a friend request at iteration t of kCDPA draws preexisting user v as the recipient is proportional to the count of requests that v already received from fakes before iteration t:

P^{fake}_{u→v,t,+} = (ε + Σ_{(w→v) ∈ E_{t−1}} 1[ℓ_w = fake]) / Σ_{v′ ∈ V} (ε + Σ_{(w→v′) ∈ E_{t−1}} 1[ℓ_w = fake]).   (2)

Similarly, if new user u is real, this probability becomes:

P^{real}_{u→v,t,+} = (ε + Σ_{(w→v) ∈ E_{t−1}} 1[ℓ_w = real]) / Σ_{v′ ∈ V} (ε + Σ_{(w→v′) ∈ E_{t−1}} 1[ℓ_w = real]).   (3)

Given all new users' {fake, real} labels and the sequence of all other new users' friend requests e_1, e_2, . . ., the joint conditional probability of observing u's sequence of outgoing friend request recipients is just the product of their individual probabilities (eqn. 2 or 3). Denote this sequence of u's recipients by N^+_u. If u is fake:

P(N^+_u | ℓ_u = fake, e_1, . . .) = ∏_{(v,t) ∈ N^+_u} P^{fake}_{u→v,t,+}.   (4)

And similarly, if u is real, this conditional probability is:

P(N^+_u | ℓ_u = real, e_1, . . .) = ∏_{(v,t) ∈ N^+_u} P^{real}_{u→v,t,+}.   (5)

PreAttacK part II: A new user's incoming requests. Noting the symmetry of the kCDPA model with respect to requests that new users send and receive, we can also derive the conditional probability P^{fake}_{v→u,t,−} that a new user u who receives a friend request at iteration t draws preexisting user v as the request's sender. Similar to above, this probability is proportional to the count of requests that v has already sent to users who share the same label as u. If u is fake:

P^{fake}_{v→u,t,−} = (ε + Σ_{(v→w) ∈ E_{t−1}} 1[ℓ_w = fake]) / Σ_{v′ ∈ V} (ε + Σ_{(v′→w) ∈ E_{t−1}} 1[ℓ_w = fake]).   (6)

And if new user u is real, this conditional probability is:

P^{real}_{v→u,t,−} = (ε + Σ_{(v→w) ∈ E_{t−1}} 1[ℓ_w = real]) / Σ_{v′ ∈ V} (ε + Σ_{(v′→w) ∈ E_{t−1}} 1[ℓ_w = real]).   (7)

Similar to above, the joint conditional probability of u's sequence of incoming friend request senders (denoted by N^−_u) if u is fake is:

P(N^−_u | ℓ_u = fake, e_1, . . .) = ∏_{(v,t) ∈ N^−_u} P^{fake}_{v→u,t,−}.   (8)

And similarly, if u is real, this conditional probability is:

P(N^−_u | ℓ_u = real, e_1, . . .) = ∏_{(v,t) ∈ N^−_u} P^{real}_{v→u,t,−}.   (9)

Posterior probability that a new user is fake. We are now able to derive the full posterior probability that new user u is fake as a function of the observed sequence of preexisting users to whom she sent friend requests and from whom she received requests. Leveraging Bayes' rule and the law of total probability we have:

P*_u = π · P(N^+_u | fake) · P(N^−_u | fake) / [π · P(N^+_u | fake) · P(N^−_u | fake) + (1−π) · P(N^+_u | real) · P(N^−_u | real)].   (10)

This posterior captures the idea that u is relatively more likely to be fake to the extent she 'preferentially' sent requests to recipients who are more preferred by fakes, and also to the extent she received requests from senders who are more likely to send to fakes.
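For intuition, the per-draw probability in eqn. 2 (and, by symmetry, eqn. 3) is a simple ratio of smoothed counts. The sketch below, with names of our own choosing, assumes small dictionaries of per-class counts from the preexisting network plus a list of earlier new users' requests with known labels:

```python
def draw_prob(v, c, recv_from, new_edges, eps=0.1):
    """Eqns. 2-3: probability that a class-c new user's outgoing request
    draws existing user v, given per-class counts recv_from[c][x] from the
    preexisting network plus earlier new users' requests
    new_edges = [(sender_class, recipient), ...]."""
    def weight(x):
        # requests x already received from class-c users, incl. new users
        extra = sum(1 for cl, r in new_edges if cl == c and r == x)
        return eps + recv_from[c][x] + extra
    total = sum(weight(x) for x in recv_from[c])  # normalizer over all of V
    return weight(v) / total
```

The joint probabilities in eqns. 4-9 are then just products of such terms over a new user's observed requests.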

Intractability
Unfortunately, this expression for the posterior probability P*_u that a new user u is fake is intractable, as it requires knowledge of the (latent) real/fake label of all new users who sent requests before u. Moreover, computing this posterior in expectation becomes infeasible as we consider more than a handful of new users, as this requires integrating over all possible label combinations.
A standard approach at this point would be to apply either linearized belief propagation or MCMC techniques. However, both are computationally expensive in large networks due to the need to e.g. iterate between inferring new users' posterior labels and updating all existing users' sending and receiving preferential attachment weights (i.e. the sums within P^{fake}_{u→v,t,+}, P^{real}_{u→v,t,+}, P^{fake}_{v→u,t,−}, and P^{real}_{v→u,t,−}) until (possible) convergence. They also typically lack convergence guarantees [20, 51], or obtain guarantees only at the expense of the approximation (e.g. via linearization) [43, 45] or significant complexity [50].

Fast approximation
In contrast to these approaches, we consider a fast approximation for P*_u based on the following idea: PA probabilities in mature social networks are stable over small batches of new entrants. So, rather than account for small and intractable changes to one new user's posterior that accrue due to other new users' edges e_1, e_2, . . ., we ignore them and then bound their worst-case impact. Consider that given a large preexisting network, a small batch of new accounts who send and receive friend requests (probably) do not significantly change existing users' PA probabilities (i.e. sums in the Draw steps of kCDPA). At a high level, there are three reasons why this is so: (1) Collisions are (probably) rare. Given a large preexisting network of 300 million (Twitter) or 2 billion (Facebook) users, a small batch of new users are unlikely to 'draw' the same recipients multiple times. When a new user sends a request to a recipient who was not previously requested by a new user, the numerators in P^{fake}_{u→v,t,+} and P^{real}_{u→v,t,+} are equal to their (known) original values in E_0. The same is true of the numerators in P^{fake}_{v→u,t,−} and P^{real}_{v→u,t,−} when a new user receives a request from a not-previously-drawn sender.
(2) Collisions (probably) have negligible impact. In cases where multiple new accounts do send friend requests to the same preexisting recipient, that recipient was probably already very popular (i.e. already had a large PA probability) due to PA's 'rich-get-richer' dynamics. In that case, this preexisting recipient's PA probability only undergoes a small percentage change after each new request, so it is well-approximated by its original value in E_0. This argument also applies when multiple new accounts receive requests from the same preexisting sender. (3) New users have a small number of friend requests.
A large preexisting social network of billions of users is the product of orders of magnitude more friend requests than any small batch of new users will send or receive, so the marginal changes that new users' edges induce in existing users' PA sums are negligible. We therefore approximate the 'sending' probabilities by substituting the original edge set E_0 for E_{t−1} in eqns. 2 and 3:

P^{fake}_{u→v,t,+} ≈ P^{fake}_{v,+} = (ε + Σ_{(w→v) ∈ E_0} 1[ℓ_w = fake]) / Σ_{v′ ∈ V} (ε + Σ_{(w→v′) ∈ E_0} 1[ℓ_w = fake]).

And similarly: P^{real}_{u→v,t,+} ≈ P^{real}_{v,+}. We obtain approximations for the remaining PA probabilities ('receiving' probabilities) P^{fake}_{v,−} and P^{real}_{v,−} by making the identical substitution of E_0 for E_{t−1} in eqns. 6 and 7 (note that these four approximations are constant for all new edges to/from the same preexisting user v, so we drop u and t subscripts accordingly).
We now obtain an approximation P̂_u of the posterior probability P*_u that new user u is fake by substituting these approximations into the joint probabilities and the posterior above. Below, we show that in our setting PreAttacK obtains near-optimal approximations for the posterior probabilities P*_u at low computational cost. We also show in Section 5 that it can be naturally extended to capture homophily or even monophily: scenarios where ε = f(ℓ_u, ℓ_v). These extensions incur no cost in terms of complexity, and they slightly improve the approximation bounds.
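Putting the pieces together, a minimal sketch of the resulting approximate posterior P̂_u is below. It assumes per-class count dictionaries over preexisting users frozen at E_0 (the approximation above), with function and parameter names of our own choosing:

```python
import math

def preattack_posterior(sent, recv, recv_from, sent_to, pi=0.05, eps=0.1):
    """Approximate posterior P(u is fake) from u's friend requests.

    sent: existing users u sent requests to; recv: existing users u
    received requests from. recv_from[c][v] / sent_to[c][v]: class-c PA
    counts frozen at the preexisting network E_0.
    """
    def log_lik(c):
        Zr = sum(eps + n for n in recv_from[c].values())  # sending normalizer
        Zs = sum(eps + n for n in sent_to[c].values())    # receiving normalizer
        ll = sum(math.log((eps + recv_from[c][v]) / Zr) for v in sent)
        ll += sum(math.log((eps + sent_to[c][v]) / Zs) for v in recv)
        return ll
    lf = math.log(pi) + log_lik("fake")
    lr = math.log(1.0 - pi) + log_lik("real")
    m = max(lf, lr)                                       # log-sum-exp trick
    return math.exp(lf - m) / (math.exp(lf - m) + math.exp(lr - m))
```

Working in log-space keeps the computation stable even after hundreds of requests, and because the four PA probabilities are constants per existing user, each new request costs O(1) once the normalizers are precomputed.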

ANALYSIS OF PREATTACK
Our goal in this section is to show that PreAttacK results in improved computational complexity over alternatives, and that it admits instance-specific approximation bounds that confirm near-optimal posterior inference for our problem instance. We note that these are some of the first theoretic guarantees for this problem that do not rely on homophily assumptions.
Let V^+, V^− ⊆ V respectively refer to the subsets of preexisting users who receive and send requests in the preexisting network E_0. Then, computing PreAttacK's posterior for all new accounts U requires 2|E \ E_0| + 2|U| operations. Importantly, unlike state-of-the-art algorithms [20, 43, 45, 49], PreAttacK can be computed for all new accounts in a single pass through all edges. This yields O(|E|) asymptotic complexity, which is O(|V ∪ U|) in (sparse) social networks [28]. This improves on state-of-the-art algorithms such as SybilBelief, SybilRank, and SybilSCAR, which require O(t|E′|), where t is the number of iterations (at least O(log(|V ∪ U|))) and E′ is the set of all accepted friend requests [20, 45, 49].

Instance-specific approximation guarantee
We formalize the three key intuitions from Section 3.2 to derive instance-specific and new-user-specific approximation guarantees. This is advantageous because it allows researchers to obtain an upper and lower bound on the exact posterior for each new user, and also to determine the batch size (or subset) of new users that can be classified while maintaining a desired worst-case approximation bound for a specific problem instance. We give the key intuition for the proof here and defer the full analysis to Appendix A.
One-sided approximation errors. It is acceptable for PreAttacK to overestimate the posterior probability that a new fake is fake and underestimate the probability that a new real is fake, but not the opposite. Therefore we seek, for each new user u, two bounds: a worst-case approximation factor (underestimate factor) α_u ≤ P̂_u/P*_u, which is useful if u is fake, and a factor (overestimate factor) β_u ≥ P̂_u/P*_u, which is useful in case u is real.
Avoiding the combinatorial problem of new users' labels. Consider α_u. The main difficulty is that we cannot know (without trying all combinations) the worst-case configuration of new users' latent labels that results in the worst underestimate P̂_u/P*_u. This is because each new user before u may have sent multiple requests to recipients v, some of which result in increases to P*_u (e.g. if the other new user is also fake and targets some of the same recipients as u) and some in decreases (e.g. if the other new user is also fake and targets some recipients who are not among u's recipients).
We sidestep this combinatorial problem by imagining that each new edge to/from a new account prior to u's is sent by a unique 'phantom' new account p whose label is the worst-case label for the bound of interest. Thus, for α_u we assume ℓ_p = fake if p's single new request is to/from the same preexisting recipient v as one of u's requests, and ℓ_p = real otherwise. Compute u's 'worst case underestimate if u is fake' posterior P^F_{u,WC} using these 'phantom labels' to obtain α_u = P̂_u/P^F_{u,WC} ≤ P̂_u/P*_u. To then obtain the 'worst case overestimate if u is real' factor β_u, compute P^R_{u,WC} assuming the opposite: ℓ_p = real if p's single new request is to/from the same preexisting user v as one of u's requests, else ℓ_p = fake (see Appendix A).
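The phantom-label construction can be illustrated in code. This is a toy Python sketch of the idea (not the Appendix A procedure verbatim): each earlier new-user request is credited to a phantom whose label is chosen adversarially, the posterior is recomputed under those labels, and the ratio gives the bound. All names are ours, and for brevity we only model phantoms' outgoing requests:

```python
import copy
import math

def _posterior(sent, recv, rf, st, pi, eps):
    # P(u fake | requests) given per-class counts rf (received-from) / st (sent-to)
    def ll(c):
        Zr = sum(eps + n for n in rf[c].values())
        Zs = sum(eps + n for n in st[c].values())
        return (sum(math.log((eps + rf[c][v]) / Zr) for v in sent)
                + sum(math.log((eps + st[c][v]) / Zs) for v in recv))
    lf, lr = math.log(pi) + ll("fake"), math.log(1.0 - pi) + ll("real")
    m = max(lf, lr)
    return math.exp(lf - m) / (math.exp(lf - m) + math.exp(lr - m))

def phantom_bounds(sent, recv, prior_new_sent, rf, st, pi=0.05, eps=0.1):
    """Worst-case factors (alpha_u, beta_u) via adversarial phantom labels.

    prior_new_sent: recipients of the other new users' requests that
    preceded u's; each request is credited to a unique phantom account.
    """
    p_hat = _posterior(sent, recv, rf, st, pi, eps)
    def worst_case(collide_label, other_label):
        rf2 = copy.deepcopy(rf)
        for v in prior_new_sent:
            # colliding phantoms get one label, non-colliding the other
            lab = collide_label if v in sent else other_label
            rf2[lab][v] += 1
        return _posterior(sent, recv, rf2, st, pi, eps)
    alpha = p_hat / worst_case("fake", "real")  # underestimate factor
    beta = p_hat / worst_case("real", "fake")   # overestimate factor
    return alpha, beta
```

Labeling colliding phantoms fake (and the rest real) maximizes the true fake-posterior, so dividing P̂_u by that worst case yields a valid underestimate factor; swapping the labels yields the overestimate factor.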
In Section 6, we show this yields useful approximation bounds for millions of new accounts in real data (α_u ≈ 0.85, β_u ≈ 1.1).

Adversarial robustness in practice
We also highlight an important property that PreAttacK shares with recent advances in practical adversarial robustness for this problem. The most performant recent algorithms for fake account detection at Facebook obtain adversarial robustness in practice by leveraging so-called 'deep network features' [47], which are features that capture aggregate properties of each user's friends-of-friends. Such aggregates have been shown to be practically difficult for even coordinated campaigns of fake accounts to manipulate, particularly when befriending (at least some) real users. PreAttacK similarly works by aggregating over the features (i.e. counts) of e.g. friend-requesters-of-friend-requestees. As such, PreAttacK's preferential attachment probabilities may also be considered 'deep network features'. Manipulating PreAttacK's prediction for a certain user would require an adversary to manipulate the counts of fake and real senders who send requests to the user's recipients, as well as the counts of known fake and real users to whom the user's requesters also send requests.³ See also Appendix C.
Below, we also consider a variant of PreAttacK called PreAttacK++ that also prevents sophisticated adversaries from avoiding detection by targeting only very unpopular (and thus uninformative) real users who have sent and received few friend requests.
Finally, we note that in practice on large-scale social networks, new approaches to this problem that are practically vulnerable to attack (such as modifying a fake account classifier by adding a new and informative feature that can be manipulated by users) tend to prompt an observable response from sophisticated adversaries (see e.g. [47]). We have observed no such response to PreAttacK.

PREATTACK++ AND HOMOPHILY
We also consider a variant of PreAttacK, PreAttacK++, that incorporates homophily and/or monophily⁴ to more rapidly detect fakes. PreAttacK++ captures scenarios where the kCDPA prior 'probabilities'⁵ ε that an existing user v receives a request from (or sends a request to) a new user u depend on u's and v's real/fake labels, and also on whether the new account is the sender or the recipient. This captures e.g. a typical case where a new real account is a priori much less likely to send a request to a preexisting fake account vs. a preexisting real account (even if neither has previously received any requests). It can also capture monophilic networks where e.g. new fakes prefer to target real users rather than other fakes.
Incorporating these label-dependent probabilities is advantageous because they allow the posterior to update even when a new user sends requests to (or receives requests from) preexisting recipients who have not received any requests, but whose label is known. This also prevents sophisticated fake accounts from avoiding detection by targeting only unpopular recipients.

³ Alternatively, a sophisticated adversary might attempt to learn and then target the set of real users who are primarily targeted by real users and not fakes (i.e. who have small P^{fake}_{v,+}/P^{real}_{v,+} < 1). However, even if this were possible, selection bias dictates that these real users may be less receptive to accepting fakes' friend requests, and the adversary would have to severely limit its fake accounts' friend requests to each real user v to avoid increasing P^{fake}_{v,+} (which would result in future detection by PreAttacK).
⁴ Recall that monophily occurs where one type of user prefers to connect to a specific other type of user, e.g. if fake users send requests to reals rather than other fakes.
⁵ We refer to ε's as 'probabilities' for readability, but note that in kCDPA, PA probabilities are proportional to ε, so it is possible to choose parameters ε ∈ [0, ∞).
The probability that a new fake sends a request to existing user v becomes:

P^{fake}_{v,+} = (ε^+_{fake→ℓ_v} + Σ_{(w→v) ∈ E_0} 1[ℓ_w = fake]) / Σ_{v′ ∈ V} (ε^+_{fake→ℓ_{v′}} + Σ_{(w→v′) ∈ E_0} 1[ℓ_w = fake]).   (17)

And the probability a new fake receives a request from v becomes:

P^{fake}_{v,−} = (ε^−_{ℓ_v→fake} + Σ_{(v→w) ∈ E_0} 1[ℓ_w = fake]) / Σ_{v′ ∈ V} (ε^−_{ℓ_{v′}→fake} + Σ_{(v′→w) ∈ E_0} 1[ℓ_w = fake]).   (18)

Note that PreAttacK++'s new expressions for P^{real}_{v,+} and P^{real}_{v,−} can be obtained by substituting real for fake everywhere in eqns. 17 and 18.
Note this change does not incur a penalty in terms of complexity. Also, because more informative ε^+_{ℓ_u→ℓ_v} and ε^−_{ℓ_v→ℓ_u} values reduce the marginal change in posterior that can accrue due to new edges in each existing user's PA weights, PreAttacK++ admits slightly improved instance-specific bounds compared to PreAttacK for identical problem instances (see Appendix A).
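A sketch of how the label-dependent ε terms slot into the class-conditional log-likelihood follows, assuming lookup tables keyed by (sender label, recipient label); the function and parameter names are hypothetical, not the paper's:

```python
import math

def preattackpp_loglik(c, sent, recv, rf, st, labels, eps_send, eps_recv):
    """Class-c log-likelihood with label-dependent priors (PreAttacK++ idea).

    labels[v]: known label of existing user v. eps_send[(c, labels[v])]
    replaces the scalar eps in the outgoing weights, and
    eps_recv[(labels[v], c)] replaces it in the incoming weights.
    """
    Zr = sum(eps_send[(c, labels[v])] + n for v, n in rf[c].items())
    Zs = sum(eps_recv[(labels[v], c)] + n for v, n in st[c].items())
    ll = sum(math.log((eps_send[(c, labels[v])] + rf[c][v]) / Zr) for v in sent)
    ll += sum(math.log((eps_recv[(labels[v], c)] + st[c][v]) / Zs) for v in recv)
    return ll
```

With all counts zero, requesting a known-real user is already more likely under the real class than the fake class whenever ε^+_{real→real} > ε^+_{fake→real}, which is exactly how the posterior can update before any counts exist.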

EVALUATIONS
Our goal in this section is to show that beyond its provable guarantees, PreAttacK performs well in practice on new fake accounts on the global Facebook network. Our goal is not to measure performance on all fake accounts, as the current generation of production classifiers already detects the vast majority of fakes during account registration [47, 49]. Similarly, PreAttacK is not an alternative to other production classifiers that detect longer-tenured fake accounts based on their longer timelines of friendships and shared content [30]. Rather, we seek to overcome the early detection paradox by rapidly obtaining a good classification after an account passes registration, but before it can engage with real users. Thus, rather than measure performance on all new accounts (including those easily detected by existing means), we instead evaluate the degree to which PreAttacK improves upon state-of-the-art defenses already in place [11, 30, 47] by detecting new fake accounts that are not yet detected by those methods. This 'hardest-to-detect' class [11, 23, 47] of new fakes motivates our evaluations.
Our main empirical result is that PreAttacK converges to informative classifications (AUC ≈ 0.9) after new accounts send + receive a total of 20 not-yet-answered friend requests.⁶ For comparison, state-of-the-art network-based algorithms do not obtain this performance even after observing additional data on new users' first 100 friend requests. This means that unlike many state-of-the-art algorithms, PreAttacK converges before the median fake account makes a single friendship (accepted request) with a real user. To show this, we conduct two sets of evaluations. In the first set, we evaluate PreAttacK and its variants on new accounts that joined the global Facebook network, and we show how PreAttacK converges to AUC ≈ 0.9 as each new account sends and receives its first handful of friend requests. In our second set of evaluations, we compare PreAttacK to four state-of-the-art network-based benchmarks. Because these benchmarks are significantly more computationally intensive than PreAttacK, we restrict our data in this second evaluation to a single country of ~1 million users.

Evaluation 1 framework
To evaluate PreAttacK's performance on new fake accounts on the global Facebook network, we adopt the evaluation framework of [11]. Specifically, we consider the set of all (>10^6) new accounts that joined the global Facebook network during a particular week last year, along with the time-ordered set of friend requests that they sent and received during that week. Our goal is to determine whether PreAttacK could have accurately classified these new accounts using just their first 1, 2, ..., 50 friend requests from this first week after they joined the network, based on the counts of requests that preexisting accounts had sent and received from real and fake accounts prior to the start of this week (i.e. preexisting users' PA probabilities). Because several months have passed since this 'historical evaluation week', we can now measure the accuracy of PreAttacK's 'early' classifications against high-confidence labels subsequently obtained from production classifiers [23, 47].
Homophily benchmark. We also consider a simplified variant of PreAttacK: Homophily. Homophily is identical to PreAttacK++ but with existing users' PA probabilities zeroed out except for the class-level terms, such that the probability of each new user's edge to/from any existing user is proportional to the overall within- or cross-class rate $p^+_{\ell_u \to \ell_v}$ or $p^-_{\ell_v \to \ell_u}$ (see Appendix D). By comparing PreAttacK to Homophily, we ascertain the degree to which PreAttacK's performance is homophily-based (i.e., driven by real vs. fake users' different preferences for in-class vs. cross-class friends) versus the degree to which it is driven by differences between real and fake users' preferences for individuals (i.e., our 2-class PA model).
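To make the distinction concrete, here is a minimal sketch of the Homophily variant's posterior, using only the 8 class-level rates. All names and numeric rates below are illustrative assumptions of ours, not Facebook's estimates or the paper's code:

```python
import math

# Hypothetical class-level rates (values illustrative): probability that a
# {real, fake} new user sends to / receives from a {real, fake} existing user.
# The Homophily variant uses ONLY these numbers, ignoring per-user PA terms.
P_SEND = {("fake", "fake"): 0.4, ("fake", "real"): 0.6,
          ("real", "fake"): 0.05, ("real", "real"): 0.95}
P_RECV = {("fake", "fake"): 0.5, ("fake", "real"): 0.5,
          ("real", "fake"): 0.02, ("real", "real"): 0.98}

def homophily_posterior(sent_to, received_from, prior_fake=0.5):
    """Posterior P(u is fake) from within-/cross-class rates alone.
    sent_to: classes of existing users u sent requests to;
    received_from: classes of existing users who sent requests to u."""
    log_odds = math.log(prior_fake / (1 - prior_fake))
    for cls in sent_to:
        log_odds += math.log(P_SEND[("fake", cls)] / P_SEND[("real", cls)])
    for cls in received_from:
        log_odds += math.log(P_RECV[(cls, "fake")] / P_RECV[(cls, "real")])
    return 1 / (1 + math.exp(-log_odds))
```

Because every edge to a given class contributes the same evidence regardless of which individual it targets, this variant can only exploit class-mixing patterns, which is precisely what the comparison isolates.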
PreAttacK-send, PreAttacK++-send, & Homophily-send. For each variant, we also compute a '-send' version that only considers the friend requests that new users sent (and ignores requests they received). By comparing (for example) PreAttacK-send to PreAttacK, we measure how much of PreAttacK's performance is driven by the requests that new users send vs. the requests they receive.
Fast implementation and practical scaling. We implement PreAttacK and its variants in PyTorch [31]. On a 40-core 2 GHz production virtual machine, even without GPUs, PreAttacK classifies more than a million new accounts per second. This efficiency permits us to recompute PreAttacK's posterior after each user's first friend request, second request, and so on, in order to obtain real-time-updated classifications for all new accounts.
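The throughput comes from vectorizing the per-account update. Below is a rough NumPy sketch of that idea (the paper's implementation uses PyTorch; the function and argument names are our own assumptions): each new request contributes a per-edge log-likelihood ratio, and a single scatter-add accumulates all accounts' posteriors at once.

```python
import numpy as np

# Vectorized posterior update, sketched in NumPy. One scatter-add per batch
# of new edges replaces a Python loop over accounts.
def batch_posteriors(edge_account_idx, edge_llr, n_accounts, prior_logit=0.0):
    """edge_account_idx[i]: which new account edge i belongs to;
    edge_llr[i]: log P(edge_i | fake) - log P(edge_i | real).
    Returns P(fake) for every account."""
    log_odds = np.full(n_accounts, prior_logit, dtype=float)
    np.add.at(log_odds, edge_account_idx, edge_llr)  # unbuffered scatter-add
    return 1.0 / (1.0 + np.exp(-log_odds))
```

Because the accumulator is additive, real-time updates reduce to adding each arriving edge's log-likelihood ratio to its account's running log-odds, so no posterior is recomputed from scratch.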

Evaluation 1 Results
Fig. 5 plots the AUC of PreAttacK-send versus the count of friend requests sent by new accounts. Each (x, y) point in the plot represents the AUC of the corresponding variant of PreAttacK run on just the first x friend requests sent by new accounts during the 'evaluation week'. Here, we observe that PreAttacK-send and PreAttacK++-send already obtain an informative posterior (AUC > 0.75) after a new account sends 2 friend requests, well below the 16 requests it takes the median new fake to make a friendship (i.e., accepted request) with a real user. Note that the x-axis of Fig. 5 corresponds to our motivating plot, Fig. 1 in Section 1. PreAttacK-send and PreAttacK++-send then converge to AUC ≈ 0.85 after a new account sends ≈25 friend requests.
Fig. 6 plots the AUC of the full (send + receive) version of PreAttacK versus the total count of friend requests sent + received by new accounts. Here, the additional information regarding the friend requests that new accounts receive permits PreAttacK and PreAttacK++ to obtain AUC ≈ 0.9 after each new account sends + receives a total of 20 requests. Thus, they converge before the median fake account makes a friendship (i.e., accepted request) with a single real user (which requires a total of 29 requests; see Fig. 2).

PreAttacK vs. Homophily. Interestingly, Homophily-send performs only slightly better than random (Fig. 5), and Homophily (Fig. 6) is only moderately informative. The large gap between Homophily and PreAttacK suggests that PreAttacK's performance is driven by differences between real and fake users' preferences for individuals (i.e., CDPA), rather than by real and fake users' different preferences for in-class vs. cross-class friends (i.e., homophily).
PreAttacK vs. PreAttacK++. In both Fig. 5 and Fig. 6, PreAttacK++ (or PreAttacK++-send) offers a small but consistent performance improvement of ∼0.01–0.02 AUC over PreAttacK (or PreAttacK-send), which is considered nontrivial in this competitive domain [30, 47]. Comparing the two, we found that the '++' versions detected additional fakes that were targeting only 'unpopular' existing users, whose PA probabilities for both reals and fakes were small (and thus less informative).

Evaluation 2 Framework
Evaluation 2 compares PreAttacK and its variants to four state-of-the-art network-based fake account detection algorithms: GANG [43], SybilRank [49], SybilBelief [20], and SybilSCAR [44]. These benchmarks are significantly more computationally intensive than PreAttacK, so we follow [11] and restrict the network to a single country of ∼1 million users. This makes it practically feasible to run the benchmarks using their papers' original C++ code and parameters. We provide details in Appendix E.

Evaluation 2 Results
Fig. 7 plots the AUC of PreAttacK-send and benchmarks vs. the count of friend requests that new accounts send, and Fig. 8 plots the AUC of full PreAttacK vs. the total count of requests that new accounts send + receive. Consistent with their performance on the global Facebook network (Figs. 5 & 6), PreAttacK-send and PreAttacK obtain an informative signal of new accounts' authenticity before the median fake obtains a friendship (accepted request) with a single human. In contrast, the benchmarks perform poorly on new users, consistent with [11]. We theorize this is because the current generation of new fakes does not exhibit sufficient homophily; this also suggests that the 'hardest-to-detect' new fake accounts in our evaluation set are savvy enough to avoid 'suspicious' friendships with other fakes. GANG-s is a partial exception: it uses the directed network of friend requests (like PreAttacK) to obtain a useful AUC of 0.75–0.85, albeit with high variance (Fig. 7). However, unlike PreAttacK, GANG-s often misclassifies new users that receive many requests (Fig. 8).

CONCLUSION
In this paper, we have studied a principled algorithmic approach to address what we call the early detection paradox: mainstream algorithms to detect fake accounts rely on the same behaviors they seek to prevent, such as fake accounts' friendships and the content they share with others. To overcome this paradox, we present some of the first distributional analyses of how fake (and real) accounts send and receive friend requests after joining a major social network, before they have made friends or shared content. We show that these friend request behaviors evoke a natural multi-class extension of the preferential attachment model of social network growth. We leverage this model to derive a new algorithm, PreAttacK, and we show that in relevant problem instances, PreAttacK near-optimally approximates the posterior probability that a new user is fake. This approach also provides some of the first theoretical guarantees for fake account detection that do not rely on homophily assumptions. We conduct a variety of evaluations on the global Facebook network, and we consistently find that PreAttacK obtains informative classifications of new accounts before the median fake account succeeds in making a single friendship (i.e., accepted friend request) with a real user. We note that, while impressive, PreAttacK's AUC does not match state-of-the-art feature-based classifiers such as DEC, which eventually obtains AUC > 0.98 on the set of all active accounts by leveraging ∼20,000 user features that describe users' friendships and shared content [47]. Instead, PreAttacK complements such methods by obtaining informative and interpretable early classifications before fake accounts can populate a user-feature vector, share content, or interact with others.

APPENDIX A DEFERRED ANALYSIS FOR INSTANCE-SPECIFIC BOUNDS
Lower bound. We seek a worst-case factor $\le \hat{P}_u / P^*_u$ that bounds PreAttacK's underestimation of the posterior probability that $u$ is fake, which is useful in case $u$ is fake (Section 4). The main difficulty is that a new user who sent/received requests before $u$ may have had both positive and negative effects (i.e., via its different edges) on $u$'s posterior. To sidestep the problem of trying all combinations of new users' latent labels, we bound the worst case by supposing each new edge before $u$'s edges contained a unique new 'phantom' user whose latent label was the worst case for its respective edge. Thus, any new user's edge before $u$'s edges that contained the same preexisting user as the one in $u$'s edge gets a fake phantom user; any other such edge gets a real phantom user. We compute the exact posterior (eqn. 12) using these phantom users' labels (in place of the latent ones) to obtain the desired bound on the probabilities for each of $u$'s observed edges. This expression for the lower bound requires no knowledge of new users' latent labels, and, conveniently, it can be computed during the same single pass through new edges that we use to compute PreAttacK, with no penalty in asymptotic complexity (note also that the expression can be further factored as in eqn. 12).
Upper bound. We also seek a worst-case factor $\ge \hat{P}_u / P^*_u$ that bounds PreAttacK's overestimation of the posterior probability that $u$ is fake, which is useful in case $u$ is real. Similar to before, we bound the worst case by computing the exact posterior (eqn. 12) supposing each new edge before $u$'s edges contained a unique new 'phantom' user whose latent label was the worst case for $u$'s posterior. Note that the lower bound is strictly decreasing, and the upper bound strictly increasing, in the number of new users' edges before $u$'s.

Figure 1: Distributions of counts of Facebook friend requests sent by new fake accounts before a single real user accepts any, among new fake accounts who eventually befriend a real user. Median at solid blue line; mean at dashed red line.

Figure 2: Distributions of counts of friend requests sent + received.

Figure 3: Mass to the right of the x = 1 line represents Facebook users who receive disproportionately more of real users' requests vs. fakes' requests by the factor on the x axis.

In our problem setting, fake and real users' 'preferential attachment' to different individuals inspires a natural multi-class extension of the PA model, which we call CDPA:
• Suppose we observe an arbitrary preexisting directed network of friend requests between existing users. Then, suppose some new fake and real users join this network.
• New fakes and reals each send and receive friend requests to/from existing users, who are chosen proportional to how many users of the new user's real/fake class already did so.

The CDPA model provides a principled foundation for a classifier that applies to new accounts. Specifically, recent research has highlighted various similar multi-class PA models as a theoretical mechanism for the emergence of homophily in social networks [3, 4, 27, 29, 54]. As such, our CDPA model forms a natural antecedent to standard homophily-based fake account detection methods that are used to detect long-tenured fake accounts. We emphasize that we use CDPA to model the friend request networks of a small batch of new users; we do not assume that the entire network emerged from this process (which would be a far stronger assumption), nor do we assume that the (distinct) network of accepted friend requests (i.e., friendships) adheres to PA.

[Figure 3 plot: 'Users' Sending Rates to Fakes vs. to Reals'; x-axis: (fraction of reals' requests sent by user) / (fraction of fakes' requests sent by user); y-axis: density; legend: Fake Sender, Real Sender.]
The new requests sent by a relatively small batch of new fake and real accounts have only a negligible impact on this preexisting count. Therefore, the denominators in the per-request probabilities $P^+_{fu}$, $P^+_{ru}$, $P^-_{fu}$, and $P^-_{ru}$ are well-approximated by their original values in $E_0$. These three key intuitions, which we formalize in Section 4, suggest we can obtain a good approximation for the posterior $P^*_u$ by holding all PA probabilities fixed at their values in the preexisting requests network $G(V, E_0, \ell_V)$. With this change, we can approximate the probability of observing the $j$-th request that a new (fake or real) account sends or receives, $P^+_{fu}$, $P^+_{ru}$, $P^-_{fu}$, and $P^-_{ru}$, without knowing the labels of other new accounts. We use these approximations (eqns. 4, 5, 8, and 9) to approximate the joint probabilities of all of user $u$'s outgoing and incoming edges conditional on her real/fake label, then compute her posterior (eqn. 12). This approach is formalized in the PreAttacK algorithm:

PreAttacK
Input: preexisting $G(V, E_0, \ell_V)$; new users $U$; new requests $E_T \setminus E_0$
  for $u \in U$ who sends a new request ($\exists v : \{u \to v\} \in E_T \setminus E_0$): compute $\hat{P}^+_{fu}$ and $\hat{P}^+_{ru}$
  for $u \in U$ who receives a new request ($\exists v : \{v \to u\} \in E_T \setminus E_0$): compute $\hat{P}^-_{fu}$ and $\hat{P}^-_{ru}$
  compute posterior $\hat{P}_u$
  return $[\hat{P}_1, \ldots, \hat{P}_{|U|}]$

In the most general case, $p$ can take 8 values: 4 probabilities that a new {real, fake} user $u$ sends a request to any preexisting {real, fake} user $v$, which we denote by $p^+_{\ell_u \to \ell_v}$, and 4 probabilities that a new {real, fake} user $u$ receives a request from any preexisting {real, fake} user $v$, denoted by $p^-_{\ell_v \to \ell_u}$. Estimates of these probabilities are known or easily obtainable from historical data.
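As a concrete illustration of the algorithm above, here is a minimal Python sketch of the posterior computation (our own notation and helper names, not the paper's code). The PA probabilities are assumed to have been estimated once from the preexisting network $G(V, E_0)$ and held fixed, exactly as the approximation prescribes:

```python
import math

# Sketch of the PreAttacK posterior. pa_send[v] = (prob that a new fake's
# next request goes to v, prob that a new real's does); pa_recv[v] is the
# analogue for requests v sends to new users. Both are held fixed at their
# preexisting-network values.
def preattack_posterior(sent, received, pa_send, pa_recv, prior_fake=0.5):
    """Posterior P(u is fake) from u's not-yet-answered friend requests.
    sent: existing users u sent requests to; received: existing users
    who sent requests to u."""
    log_odds = math.log(prior_fake / (1 - prior_fake))
    for v in sent:                      # u -> v requests
        p_fake, p_real = pa_send[v]
        log_odds += math.log(p_fake) - math.log(p_real)
    for v in received:                  # v -> u requests
        p_fake, p_real = pa_recv[v]
        log_odds += math.log(p_fake) - math.log(p_real)
    return 1 / (1 + math.exp(-log_odds))
```

Because each edge contributes an independent additive term to the log-odds, the single pass over new edges in the algorithm box suffices, and the posterior can be refreshed after every individual request.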