Wikipedia:Size of Wikipedia

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 217.158.106.14 (talk) at 13:01, 30 October 2002 (another post-rambot point). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This page collects the Wikipedia:Statistics that concern the size of Wikipedia, mostly page and article counts.

So far, most of this has been drawn from Wikipedia Announcements.

See also: Wikipedia:Size comparisons

How it's measured

The general goal has been to make a conservative count by excluding pages that are obviously not articles or articles in progress. The spike in article count in October 2002 is due to roughly 30,000 stub articles on U.S. towns and cities, generated from a database and added by an auto-posting robot (rambot). It is disputed whether these are "real" encyclopedia articles or, as yet, only "stub" articles containing raw data.

File:Wikipedia article count graph jan 2001 - oct 2002.png

The growth rate graph below was obtained by using only the first data point in each month, working out the average rate of growth over each resulting interval (typically about a month), and associating that rate with the date in the middle of the interval. This view is coarse enough to smooth out artifacts from the varying bases and estimates in the data, and so to reveal the underlying trends.
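The procedure described above can be sketched in a few lines of Python. This is only an illustration, not the script used to draw the graph: the function names are invented here, the sample points are copied from the table further down this page, and "rate of growth" is assumed to mean articles added per day (the page does not say whether an absolute or a relative rate was plotted).

```python
from datetime import date, timedelta

# A few (date, article count) points copied from the table on this page.
samples = [
    (date(2002, 8, 12), 37259),
    (date(2002, 8, 14), 37508),
    (date(2002, 9, 3), 40718),
    (date(2002, 9, 4), 40934),
    (date(2002, 10, 3), 50804),
    (date(2002, 10, 4), 50953),
]

def first_per_month(points):
    """Keep only the earliest data point in each calendar month."""
    firsts = {}
    for day, count in sorted(points):
        # setdefault keeps the first (earliest) entry seen for each month.
        firsts.setdefault((day.year, day.month), (day, count))
    return [firsts[key] for key in sorted(firsts)]

def growth_rates(points):
    """Return (midpoint date, articles per day) for each interval
    between consecutive first-of-month data points."""
    monthly = first_per_month(points)
    rates = []
    for (d0, c0), (d1, c1) in zip(monthly, monthly[1:]):
        days = (d1 - d0).days
        midpoint = d0 + timedelta(days=days // 2)
        # Assumption: growth is measured in absolute articles per day.
        rates.append((midpoint, (c1 - c0) / days))
    return rates

for midpoint, rate in growth_rates(samples):
    print(midpoint.isoformat(), round(rate, 1))
```

Using only one point per month means that short-lived irregularities, such as a switch from one counting method to another within a month, affect at most the two intervals on either side of it.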


The most recent growth data shows a steady increase in the rate of article creation, some but not all of which can be accounted for by automatically generated articles on geographical subjects.

Note that the previous slow-down in the growth of the site coincided with the Phase II server performance problems, and that according to recent data the earlier quasi-exponential growth pattern appears to have resumed. Bearing in mind how severe these problems were, the continued growth of Wikipedia during that period is a testament to Wikipedia's strength.

Chronology of software versions:

  • Phase I UseMod Wiki-based software: January 10, 2001 - January 25, 2002
  • Phase II PHP-based software: January 25, 2002 - July 20, 2002
  • Phase III PHP-based software: July 20, 2002 - present

The figures in this data set are drawn from multiple data sources and different estimates (see the key below for details); they are presented as a spreadsheet-ready table for graphing. The original data sets are archived: see the links below.


Key to the data below:

  • approx: this figure is an approximation
  • lowerbound: there were at least this many pages
  • mpacIII: main page article count from the Phase III software
  • mpacII: main page article count from the Phase II software
  • spII: stats page article count from the Phase II software
  • all: total of all pages of any sort
  • commapp: pages which include a comma, a crude way of finding "real" articles
  • conscnt: "conservative count" taken by removing the count of various types of non-article from the comma page count
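As a rough illustration of how the "commapp" and "conscnt" figures in the key relate to each other, here is a minimal Python sketch. It is not the script that produced the historical counts: the page store, the sample titles and the list of non-article prefixes are all hypothetical, since this page does not say which types of non-article were subtracted.

```python
# Hypothetical prefixes marking pages that are not articles.
# The real list of excluded page types is not given on this page.
NON_ARTICLE_PREFIXES = ("Wikipedia:", "Talk:", "User:")

def comma_page_count(pages):
    """commapp: number of pages whose text contains at least one comma."""
    return sum(1 for text in pages.values() if "," in text)

def conservative_count(pages):
    """conscnt: comma page count minus the comma pages that look
    like non-articles (judged here by title prefix only)."""
    non_articles = sum(
        1
        for title, text in pages.items()
        if "," in text and title.startswith(NON_ARTICLE_PREFIXES)
    )
    return comma_page_count(pages) - non_articles

# Invented sample data: title -> page text.
pages = {
    "Paris": "Paris is the capital of France, on the Seine.",
    "Stub": "Describe the new page here",
    "Talk:Paris": "I disagree, see above.",
    "Wikipedia:FAQ": "Questions, answers, and more.",
}

print("all:", len(pages))
print("commapp:", comma_page_count(pages))
print("conscnt:", conservative_count(pages))
```

The comma test is crude in both directions: a one-line stub without a comma is missed, while any discussion page with a comma is counted until it is explicitly subtracted.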

Now extended and annotated with (somewhat gnomic) source information.

Date,          Articles,  Source,  Notes
2002 Oct 30,   88292,     mpacIII
2002 Oct 27,   87285,     mpacIII
2002 Oct 26,   87206,     mpacIII
2002 Oct 25,   87037,     mpacIII, rambot in operation
2002 Oct 24,   80887,     mpacIII, rambot in operation
2002 Oct 23,   76958,     mpacIII, rambot in operation
2002 Oct 22,   74005,     mpacIII, rambot in operation
2002 Oct 21,   66738,     mpacIII, rambot in operation
2002 Oct 20,   66372,     mpacIII, rambot in operation
2002 Oct 19,   61128,     mpacIII
2002 Oct 17,   54339,     mpacIII
2002 Oct 14,   53174,     mpacIII
2002 Oct 11,   52571,     mpacIII
2002 Oct 10,   52435,     mpacIII
2002 Oct 8,    52092,     mpacIII
2002 Oct 4,    50953,     mpacIII
2002 Oct 3,    50804,     mpacIII
2002 Sep 30,   49724,     mpacIII
2002 Sep 27,   47448,     mpacIII
2002 Sep 26,   47152,     mpacIII
2002 Sep 25,   46133,     mpacIII
2002 Sep 24,   45707,     mpacIII
2002 Sep 23,   45462,     mpacIII
2002 Sep 22,   45159,     mpacIII
2002 Sep 21,   44920,     mpacIII
2002 Sep 17,   43962,     mpacIII
2002 Sep 16,   43762,     mpacIII
2002 Sep 10,   42268,     mpacIII
2002 Sep 9,    42021,     mpacIII
2002 Sep 7,    41559,     mpacIII
2002 Sep 5,    41141,     mpacIII
2002 Sep 4,    40934,     mpacIII
2002 Sep 3,    40718,     mpacIII
2002 Aug 31,   40093,     mpacIII
2002 Aug 22,   38780,     mpacIII
2002 Aug 14,   37508,     mpacIII
2002 Aug 12,   37259,     mpacIII
2002 May 17,   33333,     spII approx (WA)
2002 Apr 16,   32000,     spII approx (WA)
2002 Mar 28,   30000,     spII approx (WA)
2002 Feb 4,    23000,     mpacII approx (WA)
2002 Jan 9,    20000,     estimate approx (LMS WA)
2001 Dec 14,   19000,     conscnt approx
2001 Dec 7,    18000,     conscnt approx
2001 Nov 6,    16000,     conscnt lowerbound
2001 Oct 25,   15053,     conscnt (LMS WA)
2001 Oct 19,   14000,     commapp lowerbound
2001 Oct 4,    13182,     conscnt (MF WA)
2001 Sep 19,   12502,     conscnt (LMS)
2001 Sep 9,    11208,     conscnt (MF)
2001 Sep 7,    10000,     conscnt lowerbound
2001 Aug 22,    9043,     conscnt
2001 Aug 7,     8000,     conscnt
2001 Jul 27,    7243,     conscnt
2001 Jul 26,    6947,     conscnt
2001 Jul 8,     6000,     conscnt lowerbound
2001 May 20,    4985,     commapp
2001 May 10,    3969,     commapp
2001 Apr 27,    3281,     commapp    
2001 Mar 30,    2221,     commapp
2001 Mar 24,    1910,     commapp
2001 Mar  7,    1323,     commapp
2001 Feb 12,    1000,     all lowerbound
2001 Feb  8,     900,     all lowerbound
2001 Jan 31,     617,     all
2001 Jan 25,     270,     all
2001 Jan 10,       0,     all
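
For anyone who wants to graph the table without a spreadsheet, the rows can be read with a short Python function. This is a sketch under a few assumptions: `parse_row` is a name invented here, each row is taken to follow the "date, count, source, optional notes" layout seen above, and month names are parsed with `%b`, which expects English abbreviations only in an English or C locale.

```python
from datetime import datetime

def parse_row(line):
    """Split one table row into (date, article count, source, notes)."""
    fields = [field.strip() for field in line.split(",")]
    # Collapse repeated spaces so "2001 Mar  7" becomes "2001 Mar 7".
    day_text = " ".join(fields[0].split())
    # Note: %b is locale-dependent; English month names are assumed.
    day = datetime.strptime(day_text, "%Y %b %d").date()
    count = int(fields[1])
    source = fields[2] if len(fields) > 2 else ""
    # Anything after the source column is treated as free-text notes.
    notes = ", ".join(fields[3:])
    return day, count, source, notes

# Three rows copied from the table above.
rows = [
    "2002 Oct 25,   87037,     mpacIII, rambot in operation",
    "2002 May 17,   33333,     spII approx (WA)",
    "2001 Mar  7,    1323,     commapp",
]

for row in rows:
    print(parse_row(row))
```

The source column is kept as a single string, qualifiers such as "approx" or "lowerbound" included, so that a plot can mark estimated points differently from exact ones.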




These pages hold the earlier source data in its original ad-hoc tabular format:


See also: Wikipedia:Statistics, Wikipedia:Non-English Wikipedias