
Benford's Law can detect malicious social bots by Jennifer Golbeck



Abstract
Social bots are a growing presence and problem on social media. There is a burgeoning body of work on bot detection, often based in machine learning with a variety of sophisticated features. In this paper, we present a simple technique to detect bots: adherence to Benford’s Law. Benford’s Law states that, in naturally occurring systems, the frequency of numbers’ first digits is not evenly distributed. Numbers beginning with a 1 occur roughly 30 percent of the time and are six times more common than numbers beginning with a 9. In earlier work, we established that Benford’s Law holds for social connections across online social networks. In this paper, we show that this principle can be used to detect bots because they violate the expected distribution. In three studies — an analysis of a large Russian botnet we discovered, and studies of purchased retweets on Twitter and purchased likes on Facebook — we show that bots’ social patterns consistently violate Benford’s Law while legitimate users follow it closely. Our results offer a computationally efficient new tool for bot detection. There are also broader implications for understanding fraudulent online behavior. Benford’s Law is present in many aspects of online social interactions, and looking for violations of the distribution holds promise for a range of new applications.

Contents

1. Introduction
2. Related work
3. Methodology
4. Results
5. Discussion and limitations
6. Conclusions

 


 

1. Introduction

Benford’s Law describes the frequency distribution of first digits of numbers in natural systems. While one might expect all first significant digits (FSDs) to be equally common, this is not the case. Numbers beginning with 1 occur about 30 percent of the time, logarithmically decreasing to numbers that begin with 9 occurring less than 5 percent of the time (Benford, 1938). The exact frequency P predicted for a digit d is given by the following formula:

P(d) = log₁₀(1 + 1/d)
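
To make the predicted frequencies concrete, here is a minimal Python sketch (ours, for illustration only; not part of the original study) that computes the expected FSD proportions:

import math

# Expected first-significant-digit (FSD) frequencies under Benford's Law:
# P(d) = log10(1 + 1/d) for digits d = 1..9.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"P({d}) = {p:.3f}")
# Prints P(1) = 0.301 down to P(9) = 0.046; the nine probabilities sum to 1.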

In 2015, we showed that Benford’s Law is present in online social networks (Golbeck, 2015). Specifically, we found that the degree distribution of friend counts for all users on a platform, like Twitter or Facebook, follows Benford’s distribution. We also observed that for a given person, Benford’s Law was present in the degree distribution of their egocentric network; the FSDs of friend counts of a person’s friends matched Benford’s distribution. We looked at blogging sites, microblogs, and large social networking sites and found strong evidence for Benford’s Law on all.

Statistically, it is expected that some accounts will diverge from Benford simply due to random chance. On Twitter, we were able to investigate the nodes that had a poor correlation with the expected distribution. Out of 20,000 accounts we looked at, about 100 strongly violated Benford’s Law and every one of them was a Russian bot — automated accounts, based out of Russia, and socially connected to one another presumably to appear “legitimate”. That result inspired the research question behind this work: can a violation of Benford’s Law in social networks be used to detect malicious or fraudulent behavior?

We have spent the three years since that study tracking down this large Russian bot network using Benford’s Law and looking at other ways Benford’s Law can be applied to detect “fraudulent” behavior in social media. The key insight we rely on is that an account acting “fraudulently” is likely to create abnormal patterns of social connection, whether to other users or to content. In this paper, we first show that the expected pattern of connection follows Benford’s Law. Then, we look at fraudulent behavior and show that it sharply diverges from the expected distribution.

We present three studies: a bot network, purchased Facebook likes, and purchased Twitter retweets. We use Benford’s Law to analyze friend counts and identify fraudulent accounts. In all cases, we found that the FSD distribution of friend counts follows Benford’s Law when the interactions are legitimate, and consistently violates the law when the interactions are fraudulent. While we do not assert that Benford’s Law will be violated in every case of malicious or fraudulent activity online, these three studies show that it can be a valuable tool that can detect unnatural behavior across platforms.

 

++++++++++

2. Related work

Social media bot detection has been addressed in many ways in the literature. Perhaps the most popular system to do so is BotOrNot (Davis, et al., 2016), available online at http://truthy.indiana.edu/botornot/. The tool uses over a thousand features that measure sentiment, content, network features, friends, and temporal characteristics.

Benford’s Law has been observed in many diverse systems, including genome data, the stock market, and flow rates of streams. Applications of the law have largely been related to fraud detection. Researchers have shown that it can detect fraud in scientific results by looking at the FSD distribution of regression coefficients. It has been used to detect election fraud through campaign finance violations and with electoral vote counts in the 2009 Iranian election. It is frequently used to detect financial fraud (Nigrini, 1996), and its results are legally admissible as evidence. See Golbeck (2015) for a deeper review of previous applications of Benford’s Law.

In 2015, we showed that Benford’s Law is present in online social networks (Golbeck, 2015). Specifically, we found that the degree distribution of friend counts for all users on a platform follows Benford’s distribution. Our work empirically tested this across many platforms, but it is also theoretically supported. Since social network connections follow a power law distribution (Barabási and Albert, 1999), adherence to Zipf’s Law and Benford’s Law often naturally follows.

Our approach in this work uses graph-based features to detect bots. There are a number of other bot detection techniques that also analyze the social network. Examining the social connections of bots, particularly to legitimate users, can help with detection (Cao, et al., 2012; Alvisi, et al., 2013). Some algorithms have carried this further into the network, looking at the communities within the graph and patterns of connections to those communities (Viswanath, et al., 2010). However, it turns out that real users will often indiscriminately accept friend requests — including those from bots — or be duped into friending a bot (Boshmaf, et al., 2013), so friendships with legitimate users are not enough to indicate a legitimate account.

Research into specific features that characterize bots has taken place in several major categories: graph-based, crowdsourced, and feature-based. For an overview of this broader space of work, see Ferrara, et al. (2016).

 

++++++++++

3. Methodology

An account on social media has a friend count. Benford’s Law describes a distribution of FSDs, not an expected value for a single number like friend count. To analyze the way Benford’s Law applies in social networks, we look at the friend count for each of the account’s friends. This is described in depth in Golbeck (2015), but the core idea is that an account’s friends’ friend counts should follow Benford’s expected distribution. We know that friend counts in social networks follow this distribution, and it would likely follow that a person’s friends, who are drawn from this distribution, would also follow it. We demonstrated that this is the case in Golbeck (2015). For five major social networks (Facebook, Twitter, Google Plus, Pinterest, and LiveJournal), the vast majority of users’ egocentric networks had friends’ friend counts that adhered to Benford’s Law. On Google Plus, 91.5 percent of users’ egocentric networks’ FSD distributions had a correlation of over 0.9 with Benford’s Law predictions. This was true for 85.1 percent of LiveJournal egocentric networks. In the Twitter data, 89.7 percent of users had a correlation of over 0.9. Of our 21,135 users, only 170 (<1 percent) had a correlation under 0.5.

Roughly 30 percent of friends should have friend counts beginning with a 1, and only around 5 percent should have friend counts beginning with a 9. Using friends’ friend counts is particularly intriguing for bot detection since a human cannot control their friends’ friend counts, while bot networks can when the friends are other bots.
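
As an illustration of the quantity being measured, the following sketch (in Python; the friend_counts list is a hypothetical example, not real data) tallies the FSDs of an account’s friends’ friend counts:

def first_significant_digit(n: int) -> int:
    """Return the leftmost digit of a positive integer."""
    while n >= 10:
        n //= 10
    return n

# Hypothetical input: the friend counts of one account's friends.
friend_counts = [1523, 98, 12, 204, 87, 1790, 330, 45, 11000, 16]

# Observed counts of first significant digits 1 through 9.
fsd_counts = [0] * 9
for c in friend_counts:
    if c > 0:
        fsd_counts[first_significant_digit(c) - 1] += 1

print(fsd_counts)  # [5, 1, 1, 1, 0, 0, 0, 1, 1] for the hypothetical list above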

The hypothesis behind this work is that bots will make social connections in unnatural patterns that violate Benford’s Law. But what counts as a “violation” of Benford’s Law? In large-scale systems like social media, there will certainly be normal statistical variation in connection patterns such that totally legitimate accounts will be statistically significantly different from what Benford would predict. If we were to run statistical tests and correct the α (significance level) value to account for the large number of tests (e.g., with a Bonferroni correction), then the likelihood of finding significant differences becomes extremely small. Most importantly, finding statistically significant variation from Benford’s distribution is not the standard for a Benford “violation”, nor is it the goal of this work. Our research question is simply whether Benford’s Law may be useful in detecting malicious or fraudulent behavior.

Thus, we have chosen to use a statistical test in a slightly different way. For a social media account, we compare the distribution of FSDs in an account’s network to the expected Benford’s Law distribution using the Pearson chi-square test. This is the most common test used to analyze goodness-of-fit with Benford’s Law (e.g., Durtschi, et al., 2004; Nigrini and Miller, 2009; Nigrini and Mittermaier, 1997; Swanson, et al., 2003).

The p-value returned by these tests is treated as a measure of how well the connections adhere to Benford’s Law. High p-values indicate a node’s FSD distribution is closely matched to the Benford distribution. Very low p-values indicate the match is poor. We do not correct the p-values or interpret them as measures of statistical significance; rather, we treat them as approximations of the goodness of fit.
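
A minimal sketch of this goodness-of-fit computation, assuming SciPy is available (the helper name benford_fit_pvalue is ours, used only for illustration):

import math
from scipy.stats import chisquare

def benford_fit_pvalue(fsd_counts):
    """Pearson chi-square p-value comparing observed FSD counts (digits 1..9)
    against the counts Benford's Law would predict for the same sample size."""
    n = sum(fsd_counts)
    expected = [n * math.log10(1 + 1 / d) for d in range(1, 10)]
    statistic, p_value = chisquare(f_obs=fsd_counts, f_exp=expected)
    return p_value

# Higher p-values indicate a closer fit to the Benford distribution;
# very low p-values indicate a poor fit.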

 

++++++++++

4. Results

Study 1: Uncovering a Russian bot net on Twitter

In our previous study on Benford’s Law in social networks, the only Twitter accounts that had a very poor correlation between the FSD distribution of their friends’ friend counts and the expected distribution under Benford’s Law were part of the same Russian bot network (Golbeck, 2015). When we examined these accounts on Twitter, they all looked basically the same. They posted automatically, with Russian text that appeared to be sentence fragments taken from existing documents like novels and technical manuals. To a human analyst, this was clearly automated activity coming out of Russia. In our previous work (Golbeck, 2015), a team of five researchers qualitatively analyzed the posts from these accounts and developed a template to describe the core tweet structure. They then used that structure to verify that all the accounts appeared to be operating based on the same template. Analysis of the JSON data for these tweets showed that all were posted by custom Twitter apps connected with various .ru (Russian) domains. In Study 1 of this current research, we probed this network more deeply to determine how accurately Benford’s Law could detect such a network.

1) Crawling the bot network: We re-checked the 114 Russian bot accounts that we discovered in our 2015 study to see if any were still active. Twelve accounts were still active on Twitter while the rest had been suspended. We knew from our previous work that these accounts followed many other Russian bots with the same behavior. We began by looking at those social connections to explore the extent of that bot network. We collected the friends list for each of the active bot accounts we discovered in 2015. These friends became our Suspect Set. We examined each suspect account by hand and identified those that were posting in the same automated way as our seed set. If an account was, we crawled its friends and added them to the Suspect Set.

As we grew the bot network, we discovered another common trait among the bots. These accounts were all posting from one of a handful of apps based out of Russia. The posting platform for each tweet appeared in the metadata provided by the Twitter API. “Posting platform” on Twitter indicates the app or platform a user used to post a tweet. That could be the Twitter app for iPhone, the Twitter Web interface, a third-party platform like TweetDeck, or an app that a user created to post tweets for them. For example, if someone writes a Perl script that tweets the score of a baseball game, they would have to register that script with Twitter, including a name and domain, and it would appear as the posting platform.

The URLs of these posting platform apps ended in .ru domains, and only a few different platforms were present among the bots. Over the course of the project, we identified 30 different app/URL platforms that were used by this botnet. We hypothesized that these platforms were created by the operators of the botnet and used only to control these bots, not for any legitimate purposes. We tested this hypothesis by randomly selecting 16,000 Twitter accounts from the Twitter garden hose stream and checking the posting platforms of their 200 most recent tweets. The only accounts we found posting from one of these Russian apps were indeed part of the Russian bot network; they were posting automated sentence fragments from novels or technical manuals like the other bots. These results suggested that the apps we identified were not in use by legitimate accounts. Thus, if we found an account in our crawl that posted from one of these platforms, we classified it as a bot.
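
In code, this classification rule amounts to a simple string check against tweet metadata. The sketch below assumes tweets have already been fetched as Twitter API v1.1 JSON objects, where the "source" field holds an HTML anchor naming the posting app; the domain names are hypothetical placeholders rather than the actual 30 platforms we identified:

# Hypothetical placeholders for the bot-controlling .ru posting platforms.
BOT_PLATFORM_DOMAINS = {"bot-app-one.ru", "bot-app-two.ru"}

def posted_from_bot_platform(tweets) -> bool:
    """Classify an account as a bot if any of its tweets was posted from a
    known bot-controlling platform."""
    for tweet in tweets:
        source = tweet.get("source", "")  # e.g., '<a href="http://...">App name</a>'
        if any(domain in source for domain in BOT_PLATFORM_DOMAINS):
            return True
    return False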

We used a breadth-first search approach to exhaustively crawl the Suspect accounts’ friends and follower lists, adding any new accounts we discovered that were posting from the known bot-controlling platforms. In the end, we found a total of 13,948 accounts that were part of this Russian bot net. This does not include accounts that had been suspended by Twitter. Given that only 12 of 114 bots we originally discovered were still active, it suggests the network may have been an order of magnitude larger at some point.

2) Comparing bot distributions to Benford’s Law: As hypothesized, almost all of the bots we detected diverged sharply in their connection patterns from what Benford’s Law would predict. For each bot in our final set, we collected its friend list and then obtained the friend count for each friend. We built the FSD distribution of friend counts for each bot and compared this to what Benford would predict using the chi-square test.

Of the 13,948 accounts we discovered, 111 were suspended between the time we discovered them and the time we began collecting friend data. Another 228 had fewer than 100 friends, and we eliminated those to ensure stronger statistical power in the chi-square test. Of the remaining 13,609, 13,554 (99.6 percent) had a p-value of <0.05 in our chi-square test, indicating an extremely poor fit with the Benford distribution. Compare this to the randomly selected accounts we analyzed in our 2015 study, where only 27.3 percent (5,616 out of 20,572) had a chi-square p-value of <0.05, and we know bots were included among those sampled accounts. Varol, et al. (2017) suggest that as many as 15 percent of Twitter accounts may be bots.
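
The screening procedure described above reduces to a short loop; the following sketch (reusing first_significant_digit() and benford_fit_pvalue() from the methodology sketches, with input assumed to be already collected) shows how accounts were filtered and flagged:

MIN_FRIENDS = 100   # accounts with fewer than 100 friends are excluded
ALPHA = 0.05        # p-values below this indicate an extremely poor Benford fit

def flag_poor_benford_fit(accounts):
    """Given a mapping {account_id: list of friends' friend counts}, return
    the accounts whose FSD distribution fits Benford's Law very poorly."""
    flagged = []
    for account_id, friend_counts in accounts.items():
        if len(friend_counts) < MIN_FRIENDS:
            continue
        fsd_counts = [0] * 9
        for c in friend_counts:
            if c > 0:
                fsd_counts[first_significant_digit(c) - 1] += 1
        if benford_fit_pvalue(fsd_counts) < ALPHA:
            flagged.append(account_id)
    return flagged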

Study 2: Detecting retweet fraud

For only a couple dollars, anyone can buy hundreds of Twitter retweets. Some people think this will help their tweets be seen more widely (though in reality it generally only shows the tweets to the network of paid retweeters). More retweets can also create the perception of increased engagement, and this may make a tweet appear more impactful or popular. For example, a musician who regularly receives over 100 retweets may appear more established and popular than one who receives a few.

The process of purchasing retweets or likes is straightforward. From the Web site of the provider, a buyer selects the number and type of interactions they want to buy (e.g., 100 retweets) and provides a link to the content. Then, they are directed to pay the fee via PayPal. Once the payment is processed, the provider has the bots under their control interact with the linked content.

In this study, we set out to investigate whether Benford’s Law could detect retweet fraud. Our original work looked at unnatural patterns of social connection through following and friend relationships, and retweet fraud is a minor variation on that idea; here, we are looking for unnatural patterns of social connection among accounts that have interacted with a given piece of content. As with the Russian botnet in Study 1, we analyzed the FSD distribution of friend counts among accounts that automatically retweeted a post. We know that on Twitter, the FSD of friend counts among all users follows the predicted Benford’s Law distribution (i.e., about 30 percent of Twitter users have a friend count that begins with a 1, and about 4.6 percent have a friend count that begins with a 9). This is also true for the friends of a given user, as discussed above. Thus, we would expect the FSD of friend counts of a tweet’s legitimate retweeters to also generally follow Benford’s distribution. If that is not true when the retweets are purchased, it means Benford’s Law can be used to identify a tweet whose retweets were likely purchased.

To test this, we purchased 100 retweets for each of 143 tweets that we authored and posted on a new Twitter account. Once the automated retweets came through — generally within a few minutes of our US$2 payment — we collected the usernames of the retweeting accounts. We then obtained the friend count for each of these users. This allowed us to build a friend count FSD distribution for the retweeters of each tweet. Then, we used the same chi-square process as described above to estimate how well the distribution adhered to what Benford’s Law would predict.
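
The per-tweet analysis is the same pipeline used for accounts in the methodology section, with a tweet’s retweeters standing in for a user’s friends. A brief sketch, reusing the helpers from the earlier sketches and assuming retweeter_friend_counts has already been collected from the API:

# Friend counts of the accounts that retweeted one tweet (assumed collected).
fsd_counts = [0] * 9
for c in retweeter_friend_counts:
    if c > 0:
        fsd_counts[first_significant_digit(c) - 1] += 1

p = benford_fit_pvalue(fsd_counts)
likely_purchased = p < 0.05  # a very poor Benford fit suggests purchased retweets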

Out of the 142 tweets, 140 had a p-value of <0.05 (98.6 percent). This is vastly different from what we would expect if the accounts were pulled from the general population. To further validate this, we collected a set of 148 tweets with over 100 retweets that were obtained legitimately. These were often tweets like engagement photos, jokes or popular memes, or celebrations of personal events. We contacted the tweet authors to verify that they had not purchased the retweets. The authors looked at the retweet activity and confirmed it looked generally like they would expect, with retweets from followers or accounts that did not seem too distant. Among these tweets, when we calculated chi-square to compare the friend count distribution to Benford’s distribution, only 20 of 147 (13.6 percent) had a p-value of <0.05.

While these samples were necessarily small, due to the cost of buying retweets and the effort of verifying legitimately popular retweets, they are a good first indication that Benford’s Law can be helpful in uncovering paid retweet activity.

Study 3: Detecting like fraud on Facebook

To illustrate that this method works outside of Twitter, we put it to work on Facebook as well. As on Twitter, it is easy to buy interactions with Facebook posts. For a few dollars, anyone can buy hundreds of likes for a Facebook post or image. In this study, we replicated the methodology from Study 2. We purchased likes for 67 Facebook posts made with the author’s account. The posts were hidden from the author’s friends with privacy settings to ensure none of them accidentally received “real” likes. We also collected 52 posts with over 100 likes that were obtained legitimately, again checking with the posts’ authors to confirm they did not buy likes and that the likers appeared legitimate.

We then collected the friend counts for the real and purchased likers. Note that the Facebook API does not easily allow the collection of a post’s likers or those likers’ friend counts. Thus, we collected this data mostly by hand, with the help of a few browser scripts, so as not to violate the Facebook terms of service. This also explains why we have relatively small sample sizes for this Facebook study.

Among the posts with purchased likes, the FSD distribution of likers’ friend counts was generally quite different than what Benford would predict. After running the chi-square test, 52 of 67 (77.6 percent) had a p-value of <0.05. Among the naturally liked posts, only 20 of 52 (38.5 percent) had p-values under 0.05. While these results are not quite as strong as those we saw on Twitter, they still illustrate that Benford can give a signal of paid likes on Facebook.

 

++++++++++

5. Discussion and limitations

It can be tempting to focus on the statistics alone when trying to understand how well Benford’s Law works here, but the reasons why it is an effective tool are worth discussing. Bots automatically form connections, and they often connect to one another in order to avoid detection by graph-based bot detectors like those discussed earlier. Having many social connections to other bots, which are themselves programmed to do the same thing, means the FSD distribution can be quite different from one that would occur naturally. Even when the bots connect to some legitimate accounts, having many other bots in the network is still often enough to seriously change the FSD distribution.

So while it is not surprising that bots violate the law, these results show that Benford’s Law — a simple and computationally efficient measure to calculate — can be used as a strong predictor of bot-ness. This is far from the only work on bot detection, but it adds a new tool to include in bot detectors. We also acknowledge that publication of this fact will allow bot developers to adjust their relationship formation patterns to match what Benford would predict. However, that significantly increases the complexity that goes into bot development — something not all developers will be willing to do.

Also, Benford’s Law is likely to manifest in many other aspects of a bot’s profile. For example, bots post content, and other bots often interact with that content, liking or sharing it. We did not investigate the FSD of friend counts among accounts that retweeted or liked a bot’s own posts, but Studies 2 and 3 suggest that legitimate interactions with posts should follow Benford’s Law, and automated activity makes that more difficult to achieve. For a bot developer trying to evade detection, accounting for all the places this distribution is likely to show up can become a time-consuming and burdensome task. In many ways, detecting bots is similar to detecting spam in that it is likely to follow an escalating “arms race” between bot developers and bot hunters. If Benford’s Law is used only in these intermediate days of bot hunting and eventually falls by the wayside as the technology advances, we believe that this work still usefully pushes forward the state of the art.

Benford’s Law is closely related to other distributions, including Zipf’s Law and the Pareto distribution. These distributions are also worth considering in future work as possible detectors of fraudulent behavior. Because we know these distributions are often present in complex networks, like social networks, there is promise that they can also provide useful data for detecting bots and other automated activity. With a suite of easy-to-calculate measures, this could substantially increase the difficulty for bots of engineering realistic distributions while adding very little overhead to detection analysis.

Beyond our application to bot detection, we believe this technique has promise for broader applications. Benford’s Law is present in many natural systems, including social systems. We expect future work will show the pattern holds in many aspects of social interaction, and deviations from Benford will be useful in detecting malicious or unnatural behavior beyond bots.

One of the main limitations of this work is sample size. The retweet and like bot samples were quite small due to cost and time required to validate the data. While our Russian botnet was much larger, it still took months of code runtime to pull all the data we needed under Twitter API rate limitations. Future work in this space will likely require partnering with a social media company to gain realistically fast access to enough data to test this on a very large scale. The authors did reach out to several major social media companies about partnering on this work, but did not receive any positive replies. We believe the results here, especially taken together, provide strong evidence that Benford will be a useful tool, but characterizing how useful will require further work with more data.

 

++++++++++

6. Conclusions

In this study, we demonstrated that Benford’s Law can be used to detect fraudulent behavior on social media. Using the first significant digit of friend count distributions, we showed that bot activity diverges from the expected Benford distribution the vast majority of the time, while legitimate accounts diverge infrequently. We showed this in a large Twitter-based Russian bot net, among retweet bots on Twitter, and among like bots on Facebook. While the signal given by Benford’s Law is not strong enough to be used entirely on its own, it appears that it can be an effective tool in the fight against malicious automated activity online.

 

About the author

Jennifer Golbeck is Professor and Director of the Social Intelligence Lab in the College of Information Studies at the University of Maryland.
E-mail: jgolbeck [at] umd [dot] edu

 

References

L. Alvisi, A. Clement, A. Epasto, S. Lattanzi, and A. Panconesi, 2013. “SoK: The evolution of Sybil defense via social networks,” SP ’13: Proceedings of the 2013 IEEE Symposium on Security and Privacy, pp. 382–396.
doi: https://doi.org/10.1109/SP.2013.33, accessed 28 July 2019.

A.-L. Barabási and R. Albert, 1999. “Emergence of scaling in random networks,” Science, volume 286, number 5439 (15 October), pp. 509–512.
doi: https://doi.org/10.1126/science.286.5439.509, accessed 28 July 2019.

F. Benford, 1938. “The law of anomalous numbers,” Proceedings of the American Philosophical Society, volume 78, number 4, pp. 551–572.

Y. Boshmaf, I. Muslukhov, K. Beznosov, and M. Ripeanu, 2013. “Design and analysis of a social botnet,” Computer Networks, volume 57, number 2, pp. 556–578.
doi: https://doi.org/10.1016/j.comnet.2012.06.006, accessed 28 July 2019.

Q. Cao, M. Sirivianos, X. Yang, and T. Pregueiro, 2012. “Aiding the detection of fake accounts in large scale social online services,” NSDI ’12: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, at https://www.usenix.org/node/162978, accessed 28 July 2019.

C.A. Davis, O. Varol, E. Ferrara, A. Flammini, and F. Menczer, 2016. “BotOrNot: A system to evaluate social bots,” WWW ’16: Companion Proceedings of the 25th International Conference Companion on World Wide Web, pp. 273–274.
doi: https://doi.org/10.1145/2872518.2889302, accessed 28 July 2019.

C. Durtschi, W. Hillison, and C. Pacini, 2004. “The effective use of Benford’s law to assist in detecting fraud in accounting data,” Journal of Forensic Accounting, volume 5, number 1, pp. 17–34.

E. Ferrara, O. Varol, C. Davis, F. Menczer, and A. Flammini, 2016. “The rise of social bots,” Communications of the ACM, volume 59, number 7, pp. 96–104.
doi: https://doi.org/10.1145/2818717, accessed 28 July 2019.

J. Golbeck, 2015. “Benford’s law applies to online social networks,” PloS ONE, volume 10, number 8 (26 August), p. e0135169.
doi: https://doi.org/10.1371/journal.pone.0135169, accessed 28 July 2019.

M.J. Nigrini, 1996. “A taxpayer compliance application of Benford’s Law,” Journal of the American Taxation Association, volume 18, number 1, pp. 72–81.

M.J. Nigrini and S.J. Miller, 2009. “Data diagnostics using second-order tests of Benford’s Law,” Auditing, volume 28, number 2, pp. 305–324.
doi: https://doi.org/10.2308/aud.2009.28.2.305, accessed 28 July 2019.

M.J. Nigrini and L.J. Mittermaier, 1997. “The use of Benford’s Law as an aid in analytical procedures,” Auditing, volume 16, number 2, pp. 52–67.

D. Swanson, M.J. Cho, and J. Eltinge, 2003. “Detecting possibly fraudulent or error-prone survey data using Benford’s Law,” Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 937–941.

O. Varol, E. Ferrara, C.A. Davis, F. Menczer, and A. Flammini, 2017. “Online human-bot interactions: Detection, estimation, and characterization,” Proceedings of the Eleventh International AAAI Conference on Web and Social Media, pp. 280–289, and at https://aaai.org/ocs/index.php/ICWSM/ICWSM17/paper/view/15587/14817, accessed 28 July 2019.

B. Viswanath, A. Post, K.P. Gummadi, and A. Mislove, 2010. “An analysis of social network-based Sybil defenses,” ACM SIGCOMM Computer Communication Review, volume 40, number 4, pp. 363–374.
doi: https://doi.org/10.1145/1851275.1851226, accessed 28 July 2019.

 


Editorial history

Received 4 July 2019; revised 27 July 2019; accepted 28 July 2019.


Copyright © 2019, Jennifer Golbeck. All Rights Reserved.

Benford’s Law can detect malicious social bots
by Jennifer Golbeck.
First Monday, Volume 24, Number 8 - 5 August 2019
https://www.firstmonday.org/ojs/index.php/fm/article/view/10163/8063
doi: http://dx.doi.org/10.5210/fm.v24i8.10163




