Collaboratively written by thousands of people, Wikipedia produces entries which are consistent with criteria agreed by Wikipedians and of high quality. This article focuses on Wikipedia’s Featured Articles and shows that not every contribution can be considered of equal quality. Two groups of articles are analysed by focusing on the distribution of edits and on the main editors’ contributions. The research shows how these aspects of the revision patterns change depending on the category to which the articles belong.
Wikipedia is an online encyclopedia based on software for collaborative writing called a wiki. This software allows easy and fast editing of the encyclopedia’s content. Moreover, Wikipedia grants anyone with Internet access the right to modify its content.
This encyclopedia is a young project characterised by fast growth and pervasive diffusion. Started in 2001, it contained more than 11 million articles in 262 languages at the end of 2008. The English version of the encyclopedia alone accounts for more than 2.5 million articles and 8.3 million registered users [1].
This vast collection of articles is categorised according to two main criteria: the field or subject of the article and its development status. While the first follows the traditional criteria of encyclopedic categorisation, the second is related to the technological medium that Wikipedia uses. Indeed, the articles develop in a gradual way thanks to the contributions made by several people and, usually, they pass through different stages.
Articles start as simple stubs (i.e., very short entries containing a few sentences) and grow into developing articles as more paragraphs are added to form a consistent discourse. In advanced stages they can be submitted for evaluation and awarded the status of good or featured article [2]. The latter “are considered to be the best articles in Wikipedia, as determined by Wikipedia’s editors”. According to the Featured article criteria, a FA is: well written, comprehensive, factually accurate, neutral, stable, and it follows the style guidelines [3].
The open–editing process of Wikipedia has raised some scepticism about the quality and reliability of its content. In some cases the use of Wikipedia as a source has been banned by college departments (Cohen, 2007). Moreover, the fact that anyone can freely edit its content could lead to the misuse of the medium, as already documented (CBC, 2007; Wikinews, 2006). However, these cases seem to be the exception rather than the rule, and the very existence of articles that are assessed through a formal process represents an interesting case to study. Indeed, FA are promoted through a formal review process — the Featured article candidate (FAC) — which introduces a degree of formality into a project that, at first glance, appears as an anarchic space.
The focus of this article is on the revision pattern of the Featured articles, and its aim is to investigate possible correlations between the pattern and the resulting quality of the article. In order to do this, I identify two main groups of FA: the high density (HD) and the low density (LD) ones. I then compare their revision patterns to spot any similarities or differences that suggest relevant correlations. More details about the two groups and the research design are provided below.
The article is structured as follows. The next section provides an overview of the academic understanding of Wikipedia and, in particular, of the Featured articles. The following section outlines the research design and the methods used to gather and analyse the data. Then, the relevant data and tables are presented and discussed, while the final section summarises the findings and suggests further directions for inquiry.
2. Current research and literature
The interest that Wikipedia provokes amongst many researchers has resulted in a growing body of literature which tries to address the most recurring questions about this encyclopedia. Is such an encyclopedia reliable? Why do people contribute to this project? How can uncoordinated and voluntary contributions produce consistent articles? Although the answers to these questions are far from complete, several indicators already exist.
Elvebakk (2008) compared the content of Wikipedia with two more traditional encyclopedias written and edited by academics — the Stanford Encyclopedia of Philosophy and the Internet Encyclopedia of Philosophy. She analysed and compared the articles related to twentieth century philosophers and found that Wikipedia presents the field of philosophy in a similar way to the other two sources. However, Wikipedia also presents the discipline as more dynamic and lively than do the more authoritative sources, for example by including a higher number of philosophers born after 1940.
The use of citations to scientific journals in Wikipedia has been investigated by Nielsen (2007). He examined the outbound citations of Wikipedia, from the database of April 2007, and compared them with the statistics provided by the Journal Citation Reports (JCR). The study suggests an increasing usage of citations to scientific journals, though different subjects make different use of citations and the average is still below the JCR values.
On a more social level, Bryant, et al. (2005) investigated the heterogeneous community of editors working on Wikipedia. They pointed out that novice editors have little perception of the complexity of the community, of its policies and its norms. However, as they become experienced they start recognising this complexity and feeling part of the community. Moreover, the aims and scope of their edits change according to experience: experienced editors focus more on the overall consistency of the encyclopedia, rather than on individual articles.
Forte and Bruckman (2005) offer an explanation of the motivations behind editors’ participation in the community. Their interviews with Wikipedians revealed reasons for contributing very similar to those found in the scientific community. They also showed two important differences: the attribution of authorship in Wikipedia is indirect, while it is direct in the scientific community, and the online encyclopedia is not involved in primary research. Other than this, the two communities are associated with a similar ‘cycle of credit’ system (Latour and Woolgar, 1986).
When we focus on questions specific to FA it is important to mention Huberman and Wilkinson (2007), who demonstrated a close relation between the number of edits, the visibility of the articles and their quality. Their research compared all Featured Articles with an equally sized sample of normal ones. The major finding holds that more edits lead, on average, to an increase in quality. Moreover, the articles that attract more edits are the ones that deal with topics of high relevance and visibility, leaving a large part of Wikipedia articles with lower visibility and fewer revisions.
However, the equation ‘more edits equal higher quality’ that this study suggests is not the whole picture. Indeed, according to Jones (2008), the revision patterns of articles of different quality present interesting characteristics. He analysed both promoted and rejected FAs and compared his findings with other revision studies. The general pattern that emerged is that articles start as short stubs dominated by lists of information and grow over time. Later editors tend to follow the earlier structure of the article and to expand sentences into paragraphs, and major paragraphs into sub–sections, showing a pattern biased towards additions rather than deletions and adjustments. Moreover, the study questioned previous research claiming that microstructure revisions would be more frequent in low–quality articles (i.e., rejected FA): according to Jones, both groups presented more macrostructure than microstructure revisions, with the highest percentage traceable in the rejected FA.
3. Research structure and methods
In Wikipedia all FA are listed on a dedicated page and grouped by topics and subjects [4] (e.g., History, Computing, Music, …). Although the classification is arbitrary and will likely be subject to recategorisation in the future, some subjects contain nearly ten times more featured articles than others [5]. Apparently, articles belonging to certain topics are easier to bring to featured status, while others are not. The aim of this article is to investigate this ‘anomaly’ to find specific loci in the revision patterns that are related to the likelihood of an article becoming Featured.
The first step is the identification of two suitable groups to investigate. Here, I define the High Density (HD) and the Low Density (LD) groups. The former includes articles from subjects that have more than 150 FA, while the latter considers only subjects with fewer than 20 FA. Each group contains four subjects, and each subject includes four articles. The whole sample accounts for 32 Featured Articles [6]. The subjects belonging to LD are Computing, Mathematics, Language & Linguistics and Philosophy & Psychology, while to HD belong History, Media, Music and Warfare. Since the criteria for FA tend to change over time, I selected articles that have been featured after October 2005 or, in case they were featured before this date, that successfully passed the Featured Article Review (FAR) process.
The second step is to select aspects to investigate in the revision pattern. Since previous research highlighted the relevance of the number of edits and different editors’ commitment, this article will focus on the typology of edits and on the editors’ distribution. These aspects will be investigated through the metadata included in the revision history of each article.
This page is also known as the “history page” and contains all the older versions of the article, plus four metadata elements for each revision: the date and time of the edit; the nickname of the user who edited the article, or the related IP address [7] when the user is not registered; the “minor edit” box that the editor can check if only superficial differences exist between the current and previous version (e.g., typo corrections, formatting and presentational changes); and an optional field for commenting on the purpose of the edit.
I gathered data from the revision history of each article and, with the help of a simple computer script, I coded them to highlight information about the contributors and the kind of edits. Some of the data have been coded and entered manually into a database. From there, I started building the relevant tables and averages that will be discussed in the next section.
For each article I recorded the total number of editors and the number of edits made by each of them, as well as the total number of edits and their distribution between major (M) and minor (m) edits. I gathered data for the whole article life span, from its creation until the time of conducting the research in April 2008. In addition, I coded a subset of the same data limited to the Featured Article Candidate (FAC) period. This is the time frame in which an article formally starts the peer–based examination process, which generally lasts about two weeks and which determines whether the article can be promoted to featured status or not. Finally, I collected information from the current version of each article to have an overview of the analysed sample.
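As an illustration, the coding step described above can be sketched as follows. This is a minimal sketch: the revision records, the field names, and the `code_revisions` helper are all hypothetical, not the actual script or database schema used for this study.

```python
from datetime import datetime

# Hypothetical revision records, mirroring the four metadata elements of a
# Wikipedia history page: timestamp, user (nickname or IP), minor-edit flag,
# and the optional comment.
revisions = [
    {"time": datetime(2007, 3, 1, 10, 0), "user": "Alice", "minor": False, "comment": "expand lead"},
    {"time": datetime(2007, 3, 2, 11, 30), "user": "Bob", "minor": True, "comment": "typo"},
    {"time": datetime(2007, 3, 15, 9, 45), "user": "Alice", "minor": False, "comment": "add refs"},
]

def code_revisions(revisions, start=None, end=None):
    """Tally edits per editor and the major/minor split, optionally
    restricted to a time window (e.g., the FAC period)."""
    edits_per_editor = {}
    major = minor = 0
    for rev in revisions:
        if start and rev["time"] < start:
            continue
        if end and rev["time"] > end:
            continue
        edits_per_editor[rev["user"]] = edits_per_editor.get(rev["user"], 0) + 1
        if rev["minor"]:
            minor += 1
        else:
            major += 1
    return {"editors": len(edits_per_editor),
            "edits": major + minor,
            "major": major,
            "minor": minor,
            "per_editor": edits_per_editor}

# Whole article life span: 2 editors, 2 major and 1 minor edit.
totals = code_revisions(revisions)
```

Running the same tally with `start` and `end` bounding the FAC period yields the FAC subset of the data.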
The data have been collected for each article and the averages have been built separately for each subject. However, due to the scope of this article, only the group averages and the aggregate one will be presented.
There are some limitations to this approach. Firstly, the definition of the two main groups follows the Wikipedians’ classification of FA categories. This is an arbitrary categorisation that does not correspond to the canonical one. Perhaps another definition of these categories would allow a more even distribution of the Featured Articles, suggesting that I am looking for correlations where none exist. Secondly, the small sample of articles used for this research does not allow generalisation, since the total number of Featured Articles [8] is nearly two orders of magnitude larger. And finally, as is true for all research about Wikipedia and the Web in general, the rapid changes to which these phenomena are subject will make both the data and the results of this study obsolete relatively soon.
4. Data presentation and discussion
In this section the editing process of the articles is discussed along two main axes to provide information about the types of edits and the editors’ pool. The first axis analyses the distribution of major, minor and bot edits [9] and describes the articles’ revision pattern. The second focuses on the main editors’ contribution. However, before approaching these two aspects, I provide an overview of the typical FA contained in the sample to contextualise the analysis of the editing process. Table 1 presents the average articles for the LD and HD groups. It provides information about the composition, the structure and the revision process of the articles, and it shows that the typical FA is:
Table 1: Average values for key aspects of the composition of the Featured Articles sample.

                              Low density   Aggregate average   High density
Size (bytes)                  50635.00      55592.78            60550.56
Word count                    5262.00       5968.78             6675.56
Word count (lead section)     266.50        290.06              313.63
References used               63.00         90.88               118.75
Unique references             41.19         47.22               53.25
External links                6.63          6.09                5.56
Internal links                2.81          3.66                4.50
‘Milestones’ undergone        2.75          2.72                2.69
Revisions                     1378.50       1323.44             1268.38
Editors                       516.25        498.56              480.88
long — According to the Manual of Style of Wikipedia, articles between 40 and 60 kilobytes in length should be considered for splitting [10]; therefore, with an average size higher than 50 kB, these articles represent borderline cases;
referenced — 47.22 reliable sources, each of which is used nearly twice, allow an article of fewer than 6,000 words to be considered well documented;
reviewed — In addition to the FAC process, the featured articles have already been subject to at least one more peer–based process;
collaboratively written — Nearly 500 people helped in the development of the article.
If we focus on the differences between the two groups, the usage of references is the first aspect worth noticing. Although both have a similar number of unique references, they make rather different use of them. In HD, the ratio between references used and unique references is 2.2, while in LD it is only 1.5.
From a size perspective, HD includes longer and bigger articles; however, the additional work required to produce these articles is not matched by a correspondingly higher number of contributors and contributions. On the contrary, HD articles have a slightly lower number of editors and edits. This suggests that there is something more to investigate besides the amount of edits and editors in relation to the quality of the articles.
The indicator “Milestones undergone” refers to the number of processes to which the article was subject. These processes can vary from the general peer–review to the more specific copy editing. The articles from the sample took advantage of these processes by undergoing more than one of them, beside the FAC. This seems to confirm previous research that showed how articles enter these processes in preparation for the FAC evaluation (Viégas, et al., 2007).
Table 2: Ratio of major and minor edits.
Data are provided for the overall article lifetime and for the FAC evaluation process period.

                        Overall article lifetime        FAC time frame
                        M (%)    m (%)    bots (%)      M (%)    m (%)    bots (%)
HD group avg.           72.35    25.61    2.05          67.20    31.74    1.06
LD group avg.           68.04    29.44    2.53          70.91    28.21    0.88
Aggregate averages      70.19    27.52    2.29          69.05    29.97    0.97
Table 2 addresses the composition of edits in the FA sample. It distinguishes between major (M), minor (m) and bots’ edits, with the latter being minor edits as well, as a consequence of the policy on userbot accounts. A minor edit denotes superficial differences between the current and previous version (e.g., typo corrections, formatting and presentational changes …).
From Table 2 it emerges that FA are built upon a large percentage of major edits (70.19 percent), and this does not seem surprising since, by definition, minor edits are unimportant from a content accretion perspective. What does look interesting is that nearly a third (29.81 percent) of the overall work consists of adjustments and that only a tiny part (2.29 percent) of this maintenance work is automated and performed by bots. However, during the evaluation process, the presence of edits made by bots drops to 0.97 percent, and this decrease does not benefit the major edits, but the minor ones. This highlights two aspects: the importance of minor edits and the difficulty of automating these edits in a delicate time frame.
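The arithmetic behind these figures can be made explicit with a small sketch using the aggregate percentages from Table 2; the only assumption is that bot edits are counted among the minor ones, as the policy on bot accounts implies.

```python
# Aggregate percentages from Table 2, overall article lifetime.
major_pct = 70.19   # major edits (M)
minor_pct = 27.52   # minor edits (m) made by humans
bot_pct = 2.29      # bot edits, which are flagged as minor by policy

# Share of the overall work that consists of adjustments (minor + bot edits).
adjustments_pct = minor_pct + bot_pct
print(round(adjustments_pct, 2))  # 29.81

# Major edits per minor edit, counting bot edits among the minor ones.
ratio = major_pct / adjustments_pct
print(round(ratio, 2))  # 2.35
```

Excluding bot edits from the denominator would instead give about 2.55 major edits per minor edit.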
The relevance of minor edits that has emerged here seems to complement the findings in Jones (2008). He uses a different coding scheme, which distinguishes revisions as microstructural and macrostructural changes; although ‘microstructural changes’ do not equal the ‘minor edits’ used in this work, the two categories are similar in that both focus on polishing and fixing the articles.
It is also important to note that this distribution does not deviate significantly in the two groups, and it can be assumed that the ratio of roughly 2.3 major edits per minor edit is a characteristic of the sample used for this study: it remains constant across groups, subjects and different time frames. In the next table I introduce the distribution of edits made by the main editor and by the three main editors. Table 3 shows how this distribution varies across groups and time frames, suggesting a correlation between the consistent work of a small group of editors and the likelihood of an article becoming featured.
Table 3: Percentages of edits by the first editor (per number of edits), and by the first three editors (per number of edits).
Data are provided for the overall article lifetime and for the FAC evaluation process period.

                        Overall article lifetime             FAC time frame
                        main editor (%)  3 main eds. (%)     main editor (%)  3 main eds. (%)
HD group avg.           30.73            39.31               44.07            72.11
LD group avg.           19.86            28.37               52.12            72.45
Aggregate averages      25.29            33.84               48.09            72.28
Overall, a quarter (25.29 percent) of the editing work is handled by a single editor and the first three editors together account for one–third (33.84 percent) of the work. The values rise during the FAC period when the main editor takes care of half of the total number of revisions (48.09 percent), while the three main editors together are responsible for more than two–thirds of the work (72.28 percent).
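Shares of this kind can be derived from a per-editor edit tally. The following is a minimal sketch with invented counts; the helper name and the figures are illustrative only, not data from the sample.

```python
from collections import Counter

def top_editor_shares(edit_counts, k=3):
    """Return the percentage of all edits made by the single most active
    editor, and by the k most active editors together."""
    total = sum(edit_counts.values())
    top = [count for _, count in Counter(edit_counts).most_common(k)]
    main_share = 100.0 * top[0] / total
    top_k_share = 100.0 * sum(top) / total
    return main_share, top_k_share

# Invented tally: one dominant contributor and a tail of smaller ones.
counts = {"Alice": 50, "Bob": 20, "Carol": 10, "Dave": 10, "Eve": 10}
main, top3 = top_editor_shares(counts)
print(main, top3)  # 50.0 80.0
```

Applying the same computation to the lifetime tally and to the FAC-period tally of each article yields the two columns of Table 3.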
The increase of the main editor’s contribution during FAC can be partially explained by the way the evaluation process takes place. The editor who nominates the article is expected to follow the process and address the comments that are raised. However, this does not exclude the other editors from participating as usual. Since this work requires a good knowledge of the article, it often happens that the nominator is also one of the main contributors: indeed, 20 out of 32 articles were nominated by one of their three main editors.
As a clarification, it should be noted that the main editors in the two time frames are not necessarily the same users. Moreover, for methodological reasons and due to the limits of this work, tracking the first three editors seemed a good compromise, but in some articles this core group might effectively consist of more or fewer than three editors.
Looking at the differences between the two groups, it seems that articles where a small group of editors takes care of a greater part of the work tend to be featured more easily. Indeed, in HD the main editor and the first three editors together account for 30.73 percent and 39.31 percent of the work respectively. The main contributor in HD alone performs more work than the three main editors in LD together (28.37 percent). However, during the FAC the situation changes. On the one hand, the main contributor in LD takes care of more work (52.12 percent) than the one in HD (44.07 percent). On the other hand, the three main editors perform the same share of edits in both groups (72 percent). With this picture in mind it is possible to argue two things. First, articles in which a few editors perform most of the revisions present a smaller variety of writing styles and, probably, manage to develop in a more consistent way. Second, a bigger effort is required from the main editor during FAC if this effort did not occur earlier in the article’s lifetime.
Despite the differences between the two groups, the overall trend seems clear. Besides the contributions of hundreds of editors, Featured Articles need a small group of editors regularly and consistently working on them.
This study analysed the revision patterns of a sample of Featured Articles to understand the correlations between the patterns and the quality of the articles. In particular, it identified two distinct groups of subjects: those with a small number of Featured Articles (fewer than 20) and those with many more FA (more than 150). By focusing on the distribution of the different types of edits and on the main editors’ contribution, the study revealed interesting findings.
At first glance, the Featured Articles are relatively long, make consistent use of references, take advantage of hundreds of editors’ contributions, and are built on thousands of revisions. However, a closer look reveals that not every contribution has the same weight, and major edits do not necessarily equate to ‘better edits’ for the article quality. Indeed, minor edits play an important role in the revision pattern of FA: they constitute 30 percent of the whole editing work in both the HD group and the LD one. This is consistent with previous findings (Jones, 2008) holding that polishing the article and focusing on small improvements is not a prerogative of low–quality articles.
This article also analysed the role of the main editors (per number of edits) in relation to the article quality. In this regard, the two groups showed interesting differences. Articles where the presence of the main editors is higher tend to become featured more easily. This suggests the importance of a consistent style or a clear imprint during the evolution of the article.
To conclude, this article suggests that future research on revision patterns and quality correlations should not neglect the heterogeneous aspects that exist within the class of objects under investigation.
About the author
Giacomo Poderi obtained his Master’s degree with distinction in the European Studies of Society, Science and Technology programme in 2007/2008. His main research interests concern virtual communities and online collaborative efforts, with a particular focus on free and open source software (FOSS). He recently started his PhD research on the social structures of FOSS communities at the University of Stuttgart.
Acknowledgements

A special thanks goes to Professor Sally Wyatt, who kindly encouraged this article and provided feedback throughout its writing. Any mistakes or omissions in this article are, however, the author’s sole responsibility.
Notes

1. See http://meta.wikimedia.org/wiki/List_of_Wikipedias.
2. This is not a complete list of categories and status of the articles in Wikipedia.
3. See http://en.wikipedia.org/wiki/Wikipedia:Featured_articles and http://en.wikipedia.org/wiki/Wikipedia:Featured_article_criteria.
4. See http://en.wikipedia.org/wiki/Wikipedia:Featured_articles.
5. Cf. Warfare with Language & Linguistics.
6. See Table 4 in Appendix for the complete list of articles.
7. It is a unique address that certain electronic devices use in order to identify and communicate with each other on a computer network using the Internet Protocol (IP) standard.
8. 2,320 FA at the moment of writing.
9. Bots (short for “robots”) are automated, or semi-automated, accounts for making repetitive edits without the necessity of human decision–making.
10. Although the scope of a topic can sometimes justify their length. See http://en.wikipedia.org/wiki/Wikipedia:SIZE#A_rule_of_thumb.
References

S.L. Bryant, A. Forte, and A. Bruckman, 2005. “Becoming wikipedian: Transformation of participation in a collaborative online encyclopedia,” GROUP ’05: Proceedings of the 2005 international ACM SIGGROUP conference on supporting group work. New York: ACM, pp. 1–10.
CBC, 2007. “A question of authority,” at http://www.cbc.ca/news/background/tech/wikipedia2.html, accessed 10 November 2008.
N. Cohen, 2007. “A history department bans citing Wikipedia as a research source,” New York Times (21 February), at http://www.nytimes.com/2007/02/21/education/21wikipedia.html, accessed 10 November 2008.
B. Elvebakk, 2008. “Philosophy democratized? A comparison between Wikipedia and two other Web–based philosophy resources,” First Monday, volume 13, number 2, at http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2091/1938, accessed 10 November 2008.
A. Forte and A. Bruckman, 2005. “Why do people write for Wikipedia? Incentives to contribute to open–content publishing,” Georgia Institute of Technology, College of Computing, at http://www.cc.gatech.edu/~aforte/ForteBruckmanWhyPeopleWrite.pdf, accessed 10 November 2008.
B.A. Huberman and D.M. Wilkinson, 2007. “Assessing the value of cooperation in Wikipedia,” First Monday, volume 12, number 4, at http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1763/1643, accessed 10 November 2008.
J. Jones, 2008. “Patterns of revision in online writing: A study of Wikipedia’s Featured Articles,” Written Communication, volume 25, number 2, pp. 262–289. http://dx.doi.org/10.1177/0741088307312940
B. Latour and S. Woolgar, 1986. Laboratory life: The construction of scientific facts. Reprint edition. Princeton, N.J.: Princeton University Press.
F.Å. Nielsen, 2007. “Scientific citations in Wikipedia,” First Monday, volume 12, number 8, at http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1997/1872, accessed 10 November 2008.
F.B. Viégas, M. Wattenberg, and M.M. McKeon, 2007. “The hidden order of Wikipedia,” 12th International Conference on Human–Computer Interaction, at http://www.research.ibm.com/visual/papers/hidden_order_wikipedia.pdf, accessed 10 November 2008.
Wikinews, 2006. “Congressional staff actions prompt Wikipedia investigation” (30 January), at http://en.wikinews.org/wiki/Congressional_staff_actions_prompt_Wikipedia_investigation, accessed 10 November 2008.
Appendix: Complete list of Featured Articles
Table 4: Complete sample of Featured Articles used for this research.
Latest current revision taken on 5 April 2008.
Subject              Article                            Size (bytes)   Revisions   Editors

HD group
History              Brown Dog affair                   58,081         742         145
                     History of Puerto Rico             55,039         840         324
                     Slavery in ancient Greece          62,647         500         210
                     Yellowstone fires of 1988          57,115         435         39
Media                Blade Runner                       69,556         3,248       1,171
                     Jabba the Hutt                     40,847         1,629       846
                     Reese Witherspoon                  62,754         2,605       1,274
                     Philadelphia Inquirer              38,167         412         202
Music                Beijing opera                      49,345         570         227
                     Blues                              61,195         2,935       1,462
                     Stereolab                          50,795         707         192
                     Woody Guthrie                      58,393         1,467       550
Warfare              1994 Black Hawk shootd. inc.       69,496         430         42
                     F–4 Phantom II                     72,341         1,890       533
                     Struct. hist. of the Roman mil.    71,120         862         88
                     Warsaw uprising                    91,918         1,022       389

LD group
Computing            Delrina                            28,830         209         99
                     HTTP cookie                        56,163         1,341       721
                     OpenBSD                            40,623         1,791       486
                     Opera (Web browser)                48,421         2,242       877
Lang & Linguist.     Bengali language                   62,802         1,621       541
                     Mayan language                     77,838         1,357       337
                     Thou                               37,657         789         452
                     Turkish language                   71,080         2,543       1,009
Mathematics          0.999                              67,068         2,506       876
                     Georg Cantor                       51,644         1,089       408
                     Infinite monkey theorem            35,331         1,119       553
                     Polar coordinate system            31,203         732         240
Philos. & Psych.     Eric A. Havelock                   27,944         289         140
                     Hilary Putnam                      55,493         845         263
                     Parapsychology                     56,898         2,384       628
                     Søren Kierkegaard                  61,165         1,199       630
Paper received 1 February 2009; accepted 10 April 2009.
“Comparing featured article groups and revision patterns correlations in Wikipedia” by Giacomo Poderi is licensed under a Creative Commons Attribution–Non–Commercial–Share Alike 2.0 UK: England & Wales License.
Comparing featured article groups and revision patterns correlations in Wikipedia
by Giacomo Poderi
First Monday, Volume 14, Number 5 - 4 May 2009