A persistence paradox
First Monday

It is widely believed that persistence in most endeavors is essential to their success. By its very nature, persistence enhances both the quality of an outcome and its probability of success because people are willing to endure failures before achieving a desired goal. To test this hypothesis on a massive scale, we studied the production histories and success dynamics of 10 million videos uploaded to a popular video Web site. Our results reveal that while the average quality of submissions does increase with the number of uploads, the more frequently an individual uploads content the less likely it is that it will reach a popularity threshold. These paradoxical findings, which hold both at the aggregate and individual levels, throw light on the act of production in the attention economy.



Anyone trying to draw attention to a new idea, scientific result or media creation faces the challenge of having it attended to by a sizable audience. This age–old problem has been exacerbated by the advent of the Web, which allows content to be easily published by millions of people while increasing the competition that it faces for the attention of users [1]. Worse, attention over content is distributed in a highly skewed fashion, implying that while a few items receive a lot of attention, most receive a negligible amount (Huberman, 2001; Klamer and Van Dalen, 2002; Wu and Huberman, 2007). And yet people persistently upload content to social media sites, hoping for the highly unlikely outcome of reaching a wide audience. As in other endeavours, in the attention economy (Falkinger, 2007; Franck, 1999; Goldhaber, 1997; Hirshleifer and Teoh, 2006) persistence is construed by content producers as a key to their success.

In order to study the correlation between persistence and success, we analyzed data from YouTube, the most popular video Web site on the Internet, where users can upload, view, and share video clips for free. In environments such as YouTube, success is measured by the amount of attention each video receives (Huberman, et al., 2009). Although YouTube serves more than 100 million video downloads per day on its site, neither the individual producers nor the Web site extract direct revenue from this business. What they gain instead is attention, which could in principle be transformed into revenue through advertisement or some other efforts. If we view YouTube as a producer of videos that outsources its production to millions of individual producers, then the customer lifetime value of each producer should be estimated by the total amount of attention generated by his or her uploaded videos. And while the motivations of individuals to upload videos are more diverse than those of the service provider, they nevertheless enjoy the attention they receive regardless of the number of downloads. For example, there are those who upload a video in order to share it with a number of family members and friends, and while getting their attention to their video is their intent, popularity in terms of a very high number of downloads is not. On the other hand, there are many others who do cherish high attention income in the form of a large number of downloads for their videos, including those who attempt to make substantial profits through the YouTube partner program (Stelter, 2008).

YouTube’s interface displays a view count right next to each video title, which measures how many times the video has been watched since it was uploaded. As a video is watched by more users, its view count grows accordingly. If a video has eventually been viewed a substantial number of times, it is promoted to YouTube’s front page, where it can reach a larger audience and receive even more attention.

Our data set contains 9,896,784 videos submitted by 579,470 producers as of 30 April 2008. For each video we obtained its date stamp, the uploader’s id, and its view count as of 30 April 2008. While videos typically get viewed several thousand times, the top ones can receive more than one million views. Figure 1 plots the average and 99 percent quantile of view counts for videos submitted in each week. A declining trend is evident, showing that older videos tend to receive more views than newer videos, simply because they have been on the Web site longer. Thus, while the absolute view count offers some quantitative measure of the “attractiveness” of each video, it only makes sense to compare the view counts of two videos submitted not far apart in time.


Figure 1a: Average view count of videos uploaded in different weeks. The horizontal axis represents the number of weeks from 1 January 2006. The vertical axis marks the average view count of videos uploaded in the same week. All videos’ view counts are measured on 30 April 2008. As can be seen, older videos tend to receive more view counts than newer videos, because they have been on YouTube for a longer time.
Figure 1b: 99 percent quantile of the view counts of videos uploaded in the same week, again exhibiting a declining trend.


We used a nonparametric procedure to remove the time dependence of view counts in our dataset. We defined the popularity of each video to be the quantile of its view count among all videos submitted in the same week. For example, a video with popularity 0.7 receives more view counts than 70 percent of the videos submitted in the same week. By mapping each video’s absolute view count into this relative popularity measure, statements such as “Alice’s fifth video is more popular than her fourth video” become meaningful [2]. All of our analysis is based on this relative popularity measure instead of absolute view count.
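The quantile mapping described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used for the study: the record layout `(video_id, week, view_count)`, the function name `weekly_popularity`, and the handling of ties (only strictly smaller view counts are counted) are all assumptions made for the example.

```python
from bisect import bisect_left
from collections import defaultdict


def weekly_popularity(videos):
    """Map each video id to the fraction of same-week videos with fewer views.

    `videos` is a list of (video_id, week, view_count) tuples; this record
    layout is an assumption made for the example.
    """
    counts_by_week = defaultdict(list)
    for _, week, views in videos:
        counts_by_week[week].append(views)
    for counts in counts_by_week.values():
        counts.sort()

    popularity = {}
    for video_id, week, views in videos:
        counts = counts_by_week[week]
        # bisect_left counts the same-week videos with strictly fewer views.
        popularity[video_id] = bisect_left(counts, views) / len(counts)
    return popularity


# Made-up sample: four videos uploaded in week 1, two in week 2.
sample = [
    ("a", 1, 10), ("b", 1, 500), ("c", 1, 90), ("d", 1, 2000),
    ("e", 2, 5), ("f", 2, 7),
]
scores = weekly_popularity(sample)
```

Because each video is ranked only against videos from its own week, a recent video with few absolute views can still score higher than an old video with many.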

It is well known that in many information–rich economies including YouTube, the amount of attention received by each item follows a skewed distribution with a long tail (Wu and Huberman, 2007). In particular, it has been established that on YouTube the top one percent of videos are viewed several hundred times more often than the bottom 50 percent combined, and about fifty times more often than the bottom 80 percent combined (Hancock, 2009). Thus, it is important for YouTube to identify the top contributors whose videos generate vastly more attention than all others.

In what follows, we will say that a video is a hit if its view count exceeds a popularity threshold. For ease of presentation we will concentrate on a particular value of the threshold, in our case one percent, but our results are valid for a wide range of threshold values (from 0.1 percent to 10 percent). We also define a producer to be a hit producer if at least one of her uploaded videos crosses the threshold.
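As a small illustration of these two definitions, the sketch below flags hit producers from a list of per-video popularity scores. It assumes that popularity is expressed as the weekly quantile in [0, 1], so that a one percent threshold corresponds to a popularity of at least 0.99; the function name and the input layout are hypothetical.

```python
def hit_producers(videos, threshold=0.01):
    """Return the set of producers with at least one hit.

    `videos` is a list of (producer_id, popularity) pairs, where popularity
    is assumed to be a weekly quantile in [0, 1]. A video counts as a hit
    when it lies in the top `threshold` fraction of its week.
    """
    cutoff = 1.0 - threshold
    return {producer for producer, popularity in videos if popularity >= cutoff}


# Made-up scores for three producers.
uploads = [
    ("alice", 0.42), ("alice", 0.995),
    ("bob", 0.97), ("bob", 0.50),
    ("carol", 0.999),
]
one_percent = hit_producers(uploads)
ten_percent = hit_producers(uploads, threshold=0.10)
```

Changing `threshold` corresponds to repeating the analysis at the other threshold values mentioned above.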

By definition, one percent of the videos in our dataset were hits. 7.5 percent of the 579,470 producers, or 43,508 producers, uploaded at least one hit video before 30 April 2008 and hence belong to the set of hit producers. We observed that about 21.0 percent of all hit producers earned the hit label with their first video. Given that only 5.9 percent of all videos are first submissions, this number is impressively large. What is even more impressive is that about 34.1 percent of the hit producers actually produced a hit in their first week (i.e., at least one of the videos they uploaded in their first week received top one percent popularity within that week). Figure 2 plots the distribution of the index and week of the hit producers’ first hit. Again, one can see from the figure that a considerable proportion of hit producers produced a hit early on, with only a few submissions within the first few weeks. In particular, more than half of the hit producers achieved a hit within their first five videos and more than half of those achieved a hit within six weeks.


Figure 2a: Fraction of hit producers whose first hit video is their k’th submission.
Figure 2b: Fraction of hit producers whose first hit was submitted in the k’th week since their first submission.


We next studied how the popularity of submitted videos changes over time. One might think that a producer’s later videos would tend to find more favor than earlier ones, especially when one considers the many learning advantages producers obtain from past submissions: improved video creation skills, better knowledge about the audiences’ taste, better marketing techniques, etc. [3] If this were true, YouTube should value its more experienced contributors more than newcomers.

To test this hypothesis we measured the popularity dynamics of each contributor’s videos at the aggregate level. We calculated the average hit ratio for all videos with submission index k (i.e., the k’th uploaded video) and plotted it as a function of k in Figure 3(a). Surprisingly, the figure exhibits a clear decreasing trend, indicating that on average older contributors are less likely to create hits than new contributors. What is more surprising is that when we plot the average rating of videos with different submission indices, an increasing trend is observed. As Figure 3(b) clearly illustrates, producers on average receive higher ratings for their later videos, while getting decreasing hit ratios with increasing number of submissions. If the quality of videos goes up, why should they receive less attention?


Figure 3a: Average hit ratio of videos as a function of the submission index k.
Figure 3b: Average rating of videos as a function of the submission index k.


Before we attempt to resolve this paradox, we rule out some possible explanations. First, the declining trend in Figure 3(a) is not caused by an ever increasing number of contributors and submissions. This is because the threshold we picked to define a “hit” is set at a fixed quantile which is independent of the total number of videos. Second, the trend is not a simple artefact of producer heterogeneity. To see this, consider two types of contributors: “professionals” who produce high–quality videos with high hit ratios, p_H, which are constant over time, and “amateurs” who produce poor quality videos with low hit ratios, p_L, that are also constant over time. Suppose professionals produce at a lower rate than amateurs, so that at any fixed time videos with a very large submission index are most likely to be submitted by amateurs, while videos with a small submission index may come from both amateurs and professionals. Such a model could account for the declining trend in Figure 3(a) without assuming any time dependence in the hit ratios. To show that this is not the case, however, in Figure 4 we plot the empirical hit ratio for the first k videos from those who submitted no less than k videos, for k = 5 and 10. As can be seen, the declining trend still persists.
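The heterogeneity argument can be made concrete with a toy calculation. The sketch below assumes a hypothetical population of two producer types whose hit ratios never change; the group sizes, upload counts and hit ratios are invented for illustration and are not estimates from the YouTube data. It shows that pooling all producers yields a hit ratio that drops with the submission index, while restricting attention to producers with at least five uploads removes the drop, which is the logic behind the check in Figure 4.

```python
def aggregate_hit_ratio(groups, k, min_uploads=1):
    """Expected hit ratio of the k'th video, pooled over producer groups.

    Each group is (num_producers, uploads_per_producer, hit_ratio), with a
    hit ratio that never changes over time. Only producers with at least
    `min_uploads` videos (and at least k videos) contribute.
    """
    hits = 0.0
    videos = 0
    for num_producers, uploads, hit_ratio in groups:
        if uploads >= max(k, min_uploads):
            hits += num_producers * hit_ratio
            videos += num_producers
    return hits / videos if videos else float("nan")


# Hypothetical population: few slow "professionals", many fast "amateurs".
groups = [
    (100, 5, 0.05),    # professionals: 5 uploads each, p_H = 0.05
    (900, 50, 0.005),  # amateurs: 50 uploads each, p_L = 0.005
]

# Pooled over everyone, the ratio falls once the professionals run out.
pooled = [aggregate_hit_ratio(groups, k) for k in (1, 5, 6, 50)]

# Restricted to producers with at least 5 uploads, indices 1..5 are flat.
restricted = [aggregate_hit_ratio(groups, k, min_uploads=5) for k in range(1, 6)]
```

Under constant individual hit ratios the restricted curve is flat, so a decline that survives the restriction cannot be explained by this simple mixture alone.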


Figure 4a: Average hit ratio of videos from those who submitted no less than five videos, as a function of the submission index.
Figure 4b: Average hit ratio of videos from those who submitted no less than 10 videos, as a function of the submission index.


We now turn our focus from the service provider (YouTube) to the actual content producers. In many cases the goal of individual producers seems to align with that of YouTube, which is to attain a high level of attention for their videos.

According to the dictionary, the word “persist” is defined as “to go on resolutely or stubbornly in spite of opposition, importunity, or warning.” In our setting this corresponds to those producers who keep uploading despite the fact that their previous uploads never achieved the top one percent popularity.

Formally, let us define a producer’s persistence level to be the number of failures he/she is willing to endure before the first hit. Thus, a producer with persistence level k (k≥0) would upload no less than k videos before coming to a stop, and would refrain from uploading the (k+1)’th video if all of his/her first k attempts fail. We also say that user A is more/less persistent than user B if A’s persistence level is higher/lower than B’s.

Notice that there are two reasons why a producer’s persistence level is not a directly observable quantity within our dataset. First, our data ends on 30 April 2008. If a producer uploaded k videos before 30 April 2008 and all of them failed, we do not know whether she would still upload a (k+1)’th video after 30 April 2008. Second, if a producer failed k times and then got a first success, all we can infer is that her persistence level must be no less than k. In fact, if a producer was lucky enough to have succeeded in the first video, we cannot make any inferences as to her persistence level.

Although it is not possible to measure the persistence level of each individual producer, one can measure the conditional hit probability or hazard function of the whole population, which is defined as the conditional probability that the k’th attempt is a hit given the first k-1 are not. For our data this amounts to


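$$h(k) = \frac{\#\{\text{producers whose } k\text{'th video is their first hit}\}}{\#\{\text{producers with at least } k \text{ videos whose first } k-1 \text{ videos are not hits}\}} \qquad (1)$$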


Figure 5(a) plots the conditional hit probability for YouTube producers. We see that h(k) declines with the submission index, indicating that later submissions are less likely to be hits. This result is somewhat surprising for one might expect the contributor to learn from past experiences and improve the chance to succeed. As the data shows, this is not the case.
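To make the estimate of the conditional hit probability concrete, here is a minimal sketch that computes h(k) from per-producer upload histories, following the verbal definition given above. The input format (one list of booleans per producer, in upload order) and the function name are assumptions made for this example.

```python
def conditional_hit_probability(histories, max_k):
    """Estimate h(k) for k = 1..max_k from per-producer upload histories.

    Each history is a list of booleans in upload order (True = hit).
    h(k) = (# producers whose first hit is their k'th video)
         / (# producers with at least k uploads and no hit among the first k-1).
    """
    first_hit_at = [0] * (max_k + 1)
    at_risk = [0] * (max_k + 1)
    for history in histories:
        for k, is_hit in enumerate(history[:max_k], start=1):
            at_risk[k] += 1
            if is_hit:
                first_hit_at[k] += 1
                break  # uploads after the first hit no longer count
    return [
        first_hit_at[k] / at_risk[k] if at_risk[k] else float("nan")
        for k in range(1, max_k + 1)
    ]


# Four made-up producers.
histories = [
    [True],
    [False, True, False],
    [False, False],
    [False, False, False, True],
]
h = conditional_hit_probability(histories, max_k=4)
```

Producers who stop uploading without a hit simply drop out of the denominator at later indices, which is how the cutoff of the data on 30 April 2008 would be handled in this sketch.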


Figure 5a: The conditional hit probability of YouTube producers. The blue line is the conditional winning probability of a lottery with 0.01 winning probability.
Figure 5b: The red line plots the success probability of YouTube producers as a function of their persistence level. The blue line plots the success probability of a producer who participates in a lottery with 0.01 winning probability for each draw, as a function of her persistence level.


A possible explanation of this surprising result could stem from the often implicit, but crucial assumption, that improvements along the way are what make persistence the cause of success. Indeed, if the same mistake is repeated over and over again, what is the point of being persistent? In order to test whether success due to persistence with improvement is more than a matter of luck we compare our results with the benchmark case of a lottery with constant winning probability. While someone with no budget constraints can try to win a lottery as often as possible, the number of attempts will neither improve nor worsen the winning probability of the next lottery purchase. In this sense the lottery can be regarded as a game of “pure luck”, in which persistence does not play a role in determining winning probability.

In order to compare our results from YouTube with the benchmark case of a lottery, we plot in Figure 5(a) the conditional winning probability of a lottery with winning probability 0.01, where 0.01 is the fraction of hit videos in our study of YouTube. Naturally the conditional winning probability for such a lottery is always 0.01, independent of past history. As can be seen in the figure, the conditional hit probability for YouTube is lower than that of the lottery from the second video onward.

More insight into this result can be obtained by calculating the empirical hit probability p(k) of a producer with persistence level k, i.e., the probability that the producer hits at least once before he/she quits. It is not hard to show that for the lottery case $p(k) = 1 - 0.99^{k}$, while for YouTube it is given by

$$p(k) = 1 - \prod_{j=1}^{k} \bigl(1 - h(j)\bigr), \qquad (2)$$

where h(k) is estimated by Equation (1). These two functions are plotted in Figure 5(b). As can be seen, the empirical hit probability for a YouTube producer is lower than that of a lottery buyer regardless of the persistence level. In particular, while the success probability of a lottery buyer with persistence level 100 is 63.4 percent, that of a YouTube producer with the same persistence level is only 22.6 percent. Thus, the benefit of persistence on YouTube is much smaller for content producers than it is for lottery players.
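The lottery benchmark is easy to check numerically. The sketch below turns any sequence of conditional hit probabilities h(1), ..., h(k) into the probability of at least one hit within k attempts, using the standard relation that all k attempts fail with probability (1 - h(1)) × ... × (1 - h(k)). The function name is hypothetical, and only the lottery case is evaluated because the empirical h(k) values are not reproduced in the text.

```python
def success_probability(hazards):
    """Return p(k) for k = 1..len(hazards), given h(1), ..., h(k).

    p(k) = 1 - prod_{j=1..k} (1 - h(j)): the chance of at least one hit
    within the first k uploads.
    """
    probabilities = []
    all_failed = 1.0
    for h in hazards:
        all_failed *= 1.0 - h
        probabilities.append(1.0 - all_failed)
    return probabilities


# Lottery benchmark: constant winning probability 0.01 on every draw.
lottery = success_probability([0.01] * 100)
print(round(lottery[-1], 3))  # 0.634, i.e. 1 - 0.99**100
```

With a constant hazard of 0.01 the result at persistence level 100 matches the 63.4 percent quoted for the lottery buyer.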

We point out that while this analysis of empirical conditional probabilities at the aggregate level is standard in the literature, it is not without its flaws. In particular, under certain circumstances the decline in the conditional hit probability may be due to population heterogeneity and not because there is any underlying time dynamics at the individual level (Fader and Hardie, forthcoming). Thus, the analysis should only be treated as a benchmark model which assumes that the population under study is homogeneous at large. This well–known “ruse of heterogeneity” (Vaupel and Yashin, 1985) can be resolved by analyzing our data at a refined individual level, as we do now.

We want to test whether each individual contributor follows the same dynamics as the whole population, i.e., whether her later videos tend to receive less popularity than the earlier ones, even though their quality does improve. We start our analysis with the first two submissions. For each contributor who has submitted no less than two videos, we measured whether the popularity (measured by the quantile of view count) and average rating of her first two videos increase or decrease. Since there are four possible outcomes, we counted the number of contributors falling into each category. The result is listed in Table 1. As one can see, the largest group of contributors falls into the lower–left quadrant, i.e., the popularity of the uploaded videos decreases even though their quality improves.


Table 1: Change in popularity and average rating for all contributors who have submitted at least two videos.
                        Rating increases    Rating decreases
Popularity increases    64,131              67,663
Popularity decreases    82,412              68,540
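As a quick arithmetic check on Table 1, the following snippet converts the four counts into shares of all contributors with at least two videos. The counts are copied from the table; the dictionary layout and labels are chosen for this example only. The lower–left cell comes out at about 29 percent of contributors, and the bottom row (popularity decreases) at about 53 percent.

```python
# Counts copied from Table 1: (popularity change, rating change) -> contributors.
table1 = {
    ("up", "up"): 64_131,
    ("up", "down"): 67_663,
    ("down", "up"): 82_412,
    ("down", "down"): 68_540,
}

total = sum(table1.values())
shares = {cell: count / total for cell, count in table1.items()}
popularity_down = shares[("down", "up")] + shares[("down", "down")]

for cell, share in shares.items():
    print(cell, f"{share:.1%}")
print("popularity decreases:", f"{popularity_down:.1%}")
```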


In order to test the robustness of this result, we repeated the measurement for all 105 pairs {(i, j) : 1 ≤ i < j ≤ 15}, not only the first two submissions. We found that the lower–left quadrant dominates in all 105 tables, and that the sum of the bottom row (popularity decreases) is always larger than the sum of the top row (popularity increases). Also, for those who submitted no less than 2n videos, we measured whether the median popularity of their first n videos (denoted by pm1) is greater or less than the median popularity of their second n videos (denoted by pm2). We found that the majority (i.e., more than half) of contributors have pm1 > pm2, for all n = 1, …, 10. From these facts it is safe to conclude that if we randomly draw a contributor from the population, it is more likely that his or her later videos will receive a lower popularity than the earlier ones.
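The two robustness checks can be sketched as follows, assuming each producer is represented by a list of weekly–quantile popularity scores in upload order. The function name and the example scores are made up for illustration. The count of index pairs confirms that there are 105 pairs (i, j) with 1 ≤ i < j ≤ 15.

```python
from itertools import combinations
from statistics import median


def first_block_more_popular(popularities, n):
    """Compare the median popularity of uploads 1..n with uploads n+1..2n.

    `popularities` lists one producer's popularity scores in upload order.
    Returns True if pm1 > pm2, False if not, and None when the producer
    has fewer than 2n uploads and is therefore excluded.
    """
    if len(popularities) < 2 * n:
        return None
    pm1 = median(popularities[:n])
    pm2 = median(popularities[n:2 * n])
    return pm1 > pm2


# Number of index pairs (i, j) with 1 <= i < j <= 15.
num_pairs = len(list(combinations(range(1, 16), 2)))

example = [0.90, 0.40, 0.70, 0.30, 0.20, 0.60]  # made-up scores
declined = first_block_more_popular(example, n=3)
```

Repeating the comparison over all producers and taking the share of `True` results would give the fraction of contributors with pm1 > pm2 for a given n.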

Next we studied how each producer’s chance of producing a hit evolves with time. In particular, we examined whether it took each contributor more or fewer attempts to achieve a second hit than a first one. Our data set contained 43,508 hit producers. Suppose a producer achieved her first hit after n1 attempts (i.e., her n1’th video is her first hit). Furthermore, assume that she would receive a second hit after n2 further attempts, where n2 may or may not be directly observable in our data set. If we consider her uploads after n1 there are three possibilities: (1) She uploaded no less than n1 videos after the first hit and did not achieve a second hit with those n1 videos, in which case we know that n2 > n1. (2) She achieved a second hit with no more than n1 additional videos, in which case we know that n2 ≤ n1. (3) She uploaded fewer than n1 videos after the first hit and still had not achieved a second hit, in which case the available information is not enough for us to deduce the relationship between n1 and n2. The number of hit producers falling into these three categories is 23,620, 9,146 and 10,742, respectively. We see that category 1 is more than twice as large as category 2 and larger than categories 2 and 3 combined. Thus, even if all producers in category 3 end up achieving a second hit in no more than n1 attempts (which is highly improbable), overall it takes the majority of contributors more attempts to achieve a second hit than the first hit, if a second hit is achieved at all. In other words, on average it is harder for each contributor to achieve a second hit than a first hit.
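The three cases can be expressed as a small classification routine. The sketch below assumes each hit producer is given as a list of booleans in upload order (True for a hit) that contains at least one hit; the function name is hypothetical. The final lines redo the arithmetic on the reported counts: category 1 makes up about 54 percent of the 43,508 hit producers, so it remains a majority even in the worst case where every producer in category 3 is moved to category 2.

```python
def classify_second_hit(history):
    """Classify a hit producer into category 1, 2, or 3 from the text.

    `history` is a list of booleans in upload order (True = hit) with at
    least one hit. n1 is the number of attempts up to the first hit.
    """
    n1 = history.index(True) + 1
    window = history[n1:2 * n1]  # up to n1 uploads after the first hit
    if any(window):
        return 2  # second hit within n1 further attempts: n2 <= n1
    if len(window) == n1:
        return 1  # n1 further failures observed: n2 > n1
    return 3      # too few later uploads to decide


# Counts reported in the text for categories 1, 2 and 3.
counts = {1: 23_620, 2: 9_146, 3: 10_742}
total = sum(counts.values())
category1_share = counts[1] / total
worst_case_majority = counts[1] > counts[2] + counts[3]
```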

In this paper we have shown that in a pure attention economy (Falkinger, 2007) the more frequently an individual uploads content the less likely it is that it will reach a success threshold. In contrast to the extensive literature on attention at the individual level (Kahneman, 1973), this persistence paradox is a collective effect rather than a single individual one, since the success we measured is a competitive outcome. Furthermore, this paradoxical result is compounded by the fact that the average quality of submissions does increase with the number of uploads, with the likelihood of success less than that of playing a lottery.

The paradox itself does not seem to have at present a definitive explanation. A possible one is that as producers submit several videos over time their novelty, and hence their appeal to a wide audience, tends to decrease. As an example, the all time no. 1 YouTube video, “Evolution of Dance” by judsonlaipply achieved more than 122 million views as of July 2009. This happens to be the author’s first video. Encouraged by the huge success the author released two subsequent videos “EOD2 Info” and “Stop Pain Start Dancing” both on the same subject “dancing”, but neither of them received more than 250,000 views.

Finally there is the behavioural issue of why people persist in light of the small odds of making it in the charts. A possible answer is that, as in many other instances, individuals overestimate the true probabilities of winning when they are small, thus increasing their efforts while their chances of becoming successful decrease over time. This is similar to the long shot anomalies observed and discussed in the context of gambling (Thaler and Ziemba, 1988). While in those cases plausible explanations have been suggested, we do not have enough behavioral data from content providers to elucidate their motivations.


About the authors

Fang Wu is a Researcher, Social Computing Lab, Hewlett Packard Laboratories.

Bernardo A. Huberman is Senior Fellow and Director, Social Computing Lab, Hewlett Packard Laboratories.
Direct comments to bernardo [dot] huberman [at] hp [dot] com



1. This is because information–rich regimes are characterized by keen competition for the user’s attention (Falkinger, 2007; Simon, 1971).

2. Strictly speaking, we have made the implicit assumption that videos submitted close in time evolve according to some “universal” pattern, so that their view count quantile is relatively stable. This has been verified empirically (Szabo and Huberman, forthcoming; Wu and Huberman, 2007).

3. For a counterexample, see Audia, et al., 2000.



Pino G. Audia, Edwin A. Locke, and Ken G. Smith, 2000. “The paradox of success: An archival and a laboratory study of strategic persistence following radical environmental change,” Academy of Management Journal, volume 43, number 5, pp. 837–853. http://dx.doi.org/10.2307/1556413

Peter S. Fader and Bruce G.S. Hardie, forthcoming. “Customer–base valuation in a contractual setting: The perils of ignoring heterogeneity,” Marketing Science; version at http://www.brucehardie.com/papers/022/customer_base_valuation_final.pdf, accessed 28 December 2009.

Josef Falkinger, 2007. “Attention economies,” Journal of Economic Theory, volume 133, number 1, pp. 266–294. http://dx.doi.org/10.1016/j.jet.2005.12.001

Georg Franck, 1999. “Science communication: A vanity fair,” Science, volume 286, number 5437, pp. 53–55. http://dx.doi.org/10.1126/science.286.5437.53

Michael H. Goldhaber, 1997. “The attention economy and the Net,” First Monday, volume 2, number 4, at http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/519/440, accessed 28 December 2009.

Denis Hancock, 2009. “Putting the YouTube long tail in perspective,” Wikinomics (4 March), at http://www.wikinomics.com/blog/index.php/2009/03/04/putting-the-youtube-long-tail-in-perspective/, accessed 28 December 2009.

David A. Hirshleifer and Siew Hong Teoh, 2006. “Limited investor attention and stock market misreactions to accounting information,” Working Paper, at http://www.cob.ohio-state.edu/fin/dice/papers/2005/2005-24.pdf, accessed 28 December 2009.

Bernardo A. Huberman, 2001. The laws of the Web: Patterns in the ecology of information. Cambridge, Mass.: MIT Press.

Bernardo A. Huberman, Daniel M. Romero, and Fang Wu, 2009. “Crowdsourcing, attention and productivity,” Journal of Information Science, volume 35, number 6, pp. 758–765. http://dx.doi.org/10.1177/0165551509346786

Daniel Kahneman, 1973. Attention and effort. Englewood Cliffs, N.J.: Prentice–Hall.

Arjo Klamer and Hendrik P. Van Dalen, 2002. “Attention and the art of scientific publishing,” Journal of Economic Methodology, volume 9, number 3, pp. 289–315. http://dx.doi.org/10.1080/1350178022000015104

H.A. Simon, 1971. “Designing organizations for an information rich world,” In: M. Greenberger (editor). Computers, communications and the public interest. Baltimore, Md.: Johns Hopkins Press, pp. 38–52.

Brian Stelter, 2008. “YouTube videos pull in real money,” New York Times (10 December), p. A1, and at http://www.nytimes.com/2008/12/11/business/media/11youtube.html, accessed 28 December 2009.

Gabor Szabo and Bernardo A. Huberman, forthcoming. “Predicting the popularity of online content,” Communications of the ACM; version at http://www.hpl.hp.com/research/idl/papers/predictions/predictions.pdf, accessed 28 December 2009.

Richard H. Thaler and William T. Ziemba, 1988. “Parimutuel betting markets: Racetracks and lotteries,” Journal of Economic Perspectives, volume 2, number 2, pp. 161–174. http://dx.doi.org/10.1257/jep.2.2.161

James W. Vaupel and Anatoli I. Yashin, 1985. “Heterogeneity’s ruses: Some surprising effects of selection on population dynamics,” American Statistician, volume 39, number 3, pp. 176–185.

Fang Wu and Bernardo A. Huberman, 2007. “Novelty and collective attention,” Proceedings of the National Academy of Sciences, volume 104, number 45, pp. 17,599–17,601. http://dx.doi.org/10.1073/pnas.0704916104


Editorial history

Paper received 3 December 2009; accepted 28 December 2009.

Creative Commons License
“A persistence paradox” by Fang Wu and Bernardo A. Huberman is licensed under a Creative Commons Attribution 3.0 United States License.

A persistence paradox
by Fang Wu and Bernardo A. Huberman.
First Monday, Volume 15, Number 1 - 4 January 2010

© First Monday, 1995-2019. ISSN 1396-0466.