Wikipedia:Size of Wikipedia
This page collects the Wikipedia:Statistics that refer to the size of Wikipedia, mostly page and article counts.
Most of the earlier entries were drawn from Wikipedia Announcements. Later entries are taken from observations of the new software's built-in article count features.
See also: Wikipedia:Size comparisons
How it's measured
The general goal has been to make a conservative count, by excluding pages that are obviously not articles or articles in progress. The sudden jump in article count in October 2002 is due to roughly 30,000 stub articles on U.S. towns and cities, generated from a database and added by an auto-posting robot: it is disputed whether these are "real" encyclopedia articles or, as yet, merely "stub" articles containing raw data. The single outlier point is the result of a software glitch which affected the counter, now corrected.
The growth rate graph below was generated as follows: first, the erroneous article counts due to the software glitch were removed and replaced with a straight line. A daily estimated article count was produced by linear interpolation between nearest neighbour data points. This data was then smoothed using a 1-month rolling average. Forward differences were then taken. The data is shown first with the Rambot edits included, and then scaled so that we can see some of the non-Rambot detail.
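The steps described above can be sketched in a few lines of Python. This is only an illustration, not the script that produced the graph: the function names are hypothetical, the "1-month" window is assumed to be 30 days, and the average is assumed to be a trailing one (the original may have been centred).

```python
from datetime import date, timedelta

def daily_interpolate(points):
    """Linearly interpolate (date, count) samples to one estimate per day."""
    points = sorted(points)
    daily = []
    for (d0, c0), (d1, c1) in zip(points, points[1:]):
        span = (d1 - d0).days
        for k in range(span):
            daily.append((d0 + timedelta(days=k), c0 + (c1 - c0) * k / span))
    daily.append((points[-1][0], float(points[-1][1])))
    return daily

def rolling_mean(values, window=30):
    """Trailing rolling average (assumed 30-day month); shorter at the start."""
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]
        out.append(total / min(i + 1, window))
    return out

def forward_differences(values):
    """Growth per day: difference between each value and the next."""
    return [b - a for a, b in zip(values, values[1:])]

# Three samples taken from the table on this page (glitch rows already removed).
samples = [(date(2003, 4, 11), 114274),
           (date(2003, 4, 12), 114469),
           (date(2003, 4, 15), 114861)]
daily = daily_interpolate(samples)
growth = forward_differences(rolling_mean([c for _, c in daily]))
```

With the full table as input, `growth` would hold one smoothed articles-per-day estimate for each day in the range.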
See Fonzy's Predicted Graph for the end of May: User:Fonzy/Wikipedia Size Prediction
The most recent growth data shows a steady increase in the rate of article creation for the first 14 months of the project, followed by an unexplained dip and an explained sharp rise. It's not clear how much of this can be accounted for by automatically generated articles on geographical subjects. Since Rambot finished, we can see a generation rate of about 160-180 articles per day.
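As a rough check on that rate, the average daily growth between two samples can be computed directly. The sketch below uses two rows from the table on this page (2003 Apr 1 and 2003 Apr 23); the helper name is hypothetical.

```python
from datetime import date

def articles_per_day(start, end):
    """Average articles added per day between two (date, count) samples."""
    (d0, c0), (d1, c1) = start, end
    return (c1 - c0) / (d1 - d0).days

rate = articles_per_day((date(2003, 4, 1), 112564),
                        (date(2003, 4, 23), 116408))
print(round(rate, 1))  # 174.7
```

That is 3,844 articles over 22 days, which falls inside the 160-180 range quoted above.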
Note that the previous slow-down in the growth of the site coincided with the Phase II server performance problems, and that according to recent data the previous quasi-exponential growth pattern appears to have resumed. Bearing in mind how severe these problems were, the continued growth of Wikipedia during that period is a testament to Wikipedia's strength.
Chronology of software versions:
- Phase I UseMod Wiki-based software: January 10, 2001 - January 25, 2002
- Phase II PHP-based software: January 25, 2002 - July 20, 2002
- Phase III PHP-based software: July 20, 2002 - present
The figures in this data set are drawn from multiple data sources and different estimates (see the key below for details), and are presented as a spreadsheet-ready table for graphing. The original data sets are archived: see the links below. Note also that the figures were sampled at random times of day.
Key to the data below:
- approx: this figure is an approximation
- lowerbound: there were at least this many pages
- mpacIII: main page article count from the Phase III software
- mpacII: main page article count from the Phase II software
- spII: stats page article count from the Phase II software
- all: total of all pages of any sort
- commapp: pages which include a comma, a crude way of finding "real" articles
- conscnt: "conservative count" taken by removing the count of various types of non-article from the comma page count
- MF: Malcolm Farmer
- LMS: Larry Sanger
- WA: Wikipedia:Announcements
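The commapp and conscnt measures in the key can be illustrated with a minimal sketch. The key does not say which types of non-article were removed, so the `is_non_article` predicate here is a hypothetical stand-in, as are the function names and the sample pages.

```python
def comma_page_count(pages):
    """commapp: number of pages whose text contains a comma."""
    return sum(1 for text in pages.values() if "," in text)

def conservative_count(pages, is_non_article):
    """conscnt: comma page count minus comma pages judged to be non-articles.

    `is_non_article` is a hypothetical predicate on the page title.
    """
    comma_titles = [title for title, text in pages.items() if "," in text]
    non_articles = sum(1 for title in comma_titles if is_non_article(title))
    return len(comma_titles) - non_articles

# Made-up pages, purely for illustration.
pages = {
    "Physics": "Physics is the study of matter, energy and motion.",
    "Stub": "todo",
    "Talk:Physics": "Good start, but needs sources.",
}
print(comma_page_count(pages))                                      # 2
print(conservative_count(pages, lambda t: t.startswith("Talk:")))   # 1
```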
Now extended and annotated with (somewhat gnomic) source information.
Note: The current mpacIII article count is 7,023,425.
Date, Articles, Source, Notes
2003 Apr 23, 116408, mpacIII
2003 Apr 22, 116259, mpacIII
2003 Apr 21, 116066, mpacIII
2003 Apr 20, 115879, mpacIII
2003 Apr 18, 115361, mpacIII
2003 Apr 15, 114861, mpacIII
2003 Apr 12, 114469, mpacIII
2003 Apr 11, 114274, mpacIII
2003 Apr 1, 112564, mpacIII
2003 Mar 28, 112035, mpacIII
2003 Mar 15, 109933, mpacIII
2003 Mar 7, 108593, mpacIII
2003 Mar 5, 108298, mpacIII
2003 Feb 27, 107243, mpacIII
2003 Feb 26, 107027, mpacIII
2003 Feb 24, 106700, mpacIII
2003 Jan 22, 100321, mpacIII
2003 Jan 12, 98475, mpacIII
2003 Jan 8, 97913, mpacIII
2003 Jan 4, 97137, mpacIII
2003 Jan 2, 96664, mpacIII
2002 Dec 26, 95735, mpacIII
2002 Dec 23, 95452, mpacIII
2002 Dec 16, 94497, mpacIII, (so; is the counter fixed now?)
2002 Dec 13, 93834, mpacIII
2002 Dec 10, 93301, mpacIII, Corrected counter drift
2002 Dec 10, 100046, mpacIII, RAMBOT causing Error in counter.
2002 Dec 4, 94165, mpacIII
2002 Dec 1, 93880, mpacIII
2002 Nov 27, 93263, mpacIII
2002 Nov 25, 92777, mpacIII, ~775 bot generated articles added.
2002 Nov 22, 91580, mpacIII, some recent performance problems
2002 Nov 18, 90905, mpacIII, article counter is back, after being switched off
2002 Nov 9, 90266, mpacIII
2002 Nov 7, 90003, mpacIII
2002 Nov 6, 89375, mpacIII
2002 Nov 1, 88597, mpacIII
2002 Oct 30, 88292, mpacIII
2002 Oct 27, 87285, mpacIII
2002 Oct 26, 87206, mpacIII
2002 Oct 25, 87037, mpacIII, rambot in operation
2002 Oct 24, 80887, mpacIII, rambot in operation
2002 Oct 23, 76958, mpacIII, rambot in operation
2002 Oct 22, 74005, mpacIII, rambot in operation
2002 Oct 21, 66738, mpacIII, rambot in operation
2002 Oct 20, 66372, mpacIII, rambot in operation
2002 Oct 19, 61128, mpacIII, rambot in operation
2002 Oct 17, 54339, mpacIII
2002 Oct 14, 53174, mpacIII
2002 Oct 11, 52571, mpacIII
2002 Oct 10, 52435, mpacIII
2002 Oct 8, 52092, mpacIII
2002 Oct 4, 50953, mpacIII
2002 Oct 3, 50804, mpacIII
2002 Sep 30, 49724, mpacIII
2002 Sep 27, 47448, mpacIII
2002 Sep 26, 47152, mpacIII
2002 Sep 25, 46133, mpacIII
2002 Sep 24, 45707, mpacIII
2002 Sep 23, 45462, mpacIII
2002 Sep 22, 45159, mpacIII
2002 Sep 21, 44920, mpacIII
2002 Sep 17, 43962, mpacIII
2002 Sep 16, 43762, mpacIII
2002 Sep 10, 42268, mpacIII
2002 Sep 9, 42021, mpacIII
2002 Sep 7, 41559, mpacIII
2002 Sep 5, 41141, mpacIII
2002 Sep 4, 40934, mpacIII
2002 Sep 3, 40718, mpacIII
2002 Aug 31, 40093, mpacIII
2002 Aug 22, 38780, mpacIII
2002 Aug 14, 37508, mpacIII
2002 Aug 12, 37259, mpacIII, upgraded to phase III software after major performance problems
2002 May 17, 33333, spII approx (WA)
2002 Apr 16, 32000, spII approx (WA)
2002 Mar 28, 30000, spII approx (WA)
2002 Feb 4, 23000, mpacII approx (WA)
2002 Jan 9, 20000, estimate approx (LMS WA)
2001 Dec 14, 19000, conscnt approx
2001 Dec 7, 18000, conscnt approx
2001 Nov 6, 16000, conscnt lowerbound
2001 Oct 25, 15053, conscnt (LMS WA)
2001 Oct 19, 14000, commacnt lowerbound
2001 Oct 4, 13182, conscnt (MF WA)
2001 Sep 19, 12502, conscnt (LMS)
2001 Sep 9, 11208, conscnt (MF)
2001 Sep 7, 10000, conscnt lowerbound
2001 Aug 22, 9043, conscnt
2001 Aug 7, 8000, conscnt
2001 Jul 27, 7243, conscnt
2001 Jul 26, 6947, conscnt
2001 Jul 8, 6000, conscnt lowerbound
2001 May 20, 4985, commapp
2001 May 10, 3969, commapp
2001 Apr 27, 3281, commapp
2001 Mar 30, 2221, commapp
2001 Mar 24, 1910, commapp
2001 Mar 7, 1323, commapp
2001 Feb 12, 1000, all lowerbound
2001 Feb 8, 900, all lowerbound
2001 Jan 31, 617, all
2001 Jan 25, 270, all
2001 Jan 10, 0, all
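For graphing, each entry above (date, article count, source, optional notes) can be read into typed fields. The following is a minimal sketch, assuming one entry per string and an English locale for the month abbreviations; `parse_row` is a hypothetical helper name.

```python
from datetime import datetime

def parse_row(row):
    """Split 'YYYY Mon D, count, source[, notes]' into typed fields."""
    # Split at most three times so commas inside the notes are preserved.
    parts = [p.strip() for p in row.split(",", 3)]
    when = datetime.strptime(parts[0], "%Y %b %d").date()
    count = int(parts[1])
    source = parts[2] if len(parts) > 2 else ""
    notes = parts[3] if len(parts) > 3 else ""
    return when, count, source, notes

print(parse_row("2002 Nov 18, 90905, mpacIII, article counter is back, after being switched off"))
```

The source field is kept as a plain string, so qualifiers such as "approx" or "(WA)" stay attached to it and can be split further if needed.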
These pages hold the earlier source data in its original ad-hoc tabular format:
- See also: Wikipedia:Statistics, Wikipedia:Non-English Wikipedias