Multimedia:Demographics and content dynamics

From Wikimedia Usability Initiative

Content and community growth

There has been much academic interest in the question "Who writes Wikipedia?" and in whether most of the content is contributed by an elite group of participants or by occasional visitors[1][2]. In particular, Roth et al. studied the factors influencing wiki viability and noted a "dynamical intertwinement of population and content growth"[3]; Roth had earlier suggested that a wiki's success was linked to "a virtuous demographic path with content and contributors co-evolving"[4].

In a media repository like Wikimedia Commons, however, the focus of activity is on contributing new media files rather than improving existing ones. Once a file has been uploaded, improvements are mostly limited to metadata and peripheral information (description of the media file, copyright information, general topics, location, etc.); the files themselves are rarely edited. As a consequence, it is particularly interesting to study the dynamics of population and content in this special case. For this purpose, we studied the temporal evolution of the files-to-active-participants ratio (F:P) on Wikimedia Commons and compared it to the articles-to-active-participants ratio (A:P) on the English-language Wikipedia (see the chart below).

Figure: Temporal evolution of the ratio of media files on Wikimedia Commons per active participant (■) and of the ratio of articles on the English-language Wikipedia per active participant (▲).
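The two ratios compared here are simple to state, and a short sketch may help make the definition concrete. The Python below is only an illustration: the `MonthlySnapshot` record, its field names, and the sample numbers are assumptions made for this example, not data or code from the study, and the threshold that makes a participant "active" is left to whoever prepares the input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlySnapshot:
    """Hypothetical record: content total and active participants for one month."""
    month: str                # label for the month, e.g. "2009-06"
    content_count: int        # files (Commons) or articles (Wikipedia) existing that month
    active_participants: int  # participants counted as active that month


def content_per_participant(snapshots):
    """Return {month: content_count / active_participants}.

    Months with no active participants are skipped to avoid dividing by zero.
    """
    return {
        s.month: s.content_count / s.active_participants
        for s in snapshots
        if s.active_participants > 0
    }


# Made-up numbers, for illustration only; they are not taken from the study.
commons = [
    MonthlySnapshot("m1", 1_000, 100),
    MonthlySnapshot("m2", 3_000, 150),
]
fp = content_per_participant(commons)
```

Running the same function on article counts and active participants for the English-language Wikipedia would give the A:P series, so that the two curves can be plotted against each other month by month.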

While the articles-to-participants ratio has remained stable on Wikipedia after the first few years of existence, the files-to-participants ratio has been steadily increasing since the creation of Wikimedia Commons. F:P has exceeded A:P ever since and is now ten times higher. This result is not surprising when one considers the fundamental difference between a media repository, where files are accumulated, and a collaborative encyclopedia, where content is improved over time. However, it does demonstrate that Wikimedia Commons, despite being successful in terms of content, does not follow the usual model of "viable" or "successful" wikis, and requires new metrics and new models to assess its health.

Isotype-style diagram showing the temporal evolution of the number of media files & participants on Wikimedia Commons.

Content inflow management

Because of the fundamental difference between a text-based encyclopedia and a media repository, a more interesting approach is to compare the capacity of the community of participants to "absorb" the inflow of new content contributed to the platform. Thus, we studied the temporal evolution of the ratio of persistent new media files uploaded each month per very active participant on Wikimedia Commons (see the chart below). We compared it to the ratio of persistent new articles per very active participant on the English-language Wikipedia. "Persistent" means that we only count media files and articles that are not deleted during the patrolling process; the actual number of files uploaded and articles created is higher. We chose to consider only very active participants (more than 100 edits per month) since they are the participants most likely to engage in patrolling activities, such as checking newly uploaded files or newly created pages.

Figure: Temporal evolution of the ratio of persistent new media files on Wikimedia Commons per very active participant (■) and of the ratio of persistent new articles on the English-language Wikipedia per very active participant (▲).
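The inflow metric described above combines two filters: only items that survive patrolling are counted, and only participants with more than 100 edits in the month count as very active. The sketch below shows one way this could be computed. The input shapes, the function name, and the sample data are assumptions made for illustration; the study's actual data pipeline is not described in the text.

```python
from collections import Counter

VERY_ACTIVE_MIN_EDITS = 100  # "more than 100 edits per month", as stated in the text


def inflow_per_very_active(new_items, monthly_edits):
    """Return {month: persistent new items / very active participants}.

    new_items: iterable of (month, was_deleted) pairs, one per file uploaded
        or article created (hypothetical input shape).
    monthly_edits: iterable of (month, user, edit_count) triples, assumed to
        contain at most one triple per user and month.
    Months without any very active participant are skipped.
    """
    # Count only items that were not deleted during patrolling.
    persistent = Counter(month for month, was_deleted in new_items if not was_deleted)
    # Count participants strictly above the edit threshold.
    very_active = Counter(
        month for month, _user, edits in monthly_edits if edits > VERY_ACTIVE_MIN_EDITS
    )
    return {
        month: persistent[month] / very_active[month]
        for month in persistent
        if very_active[month] > 0
    }


# Made-up sample data, for illustration only.
uploads = [("m1", False), ("m1", False), ("m1", True), ("m1", False), ("m2", False)]
edits = [("m1", "alice", 250), ("m1", "bob", 100), ("m1", "carol", 101), ("m2", "alice", 40)]
ratio = inflow_per_very_active(uploads, edits)
```

In this sample, the deleted upload is excluded and the participant with exactly 100 edits does not qualify as very active, since the threshold is strict.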

We found that the ratio of persistent new media files per very active participant has doubled since the creation of Wikimedia Commons and continues to increase. This ratio is now more than ten times higher than the corresponding ratio for articles on the English-language Wikipedia. Our conclusion is that the content of Wikimedia Commons is growing much faster than its community of very active participants. Because of this imbalance between the growth of the content and the growth of the community, Wikimedia Commons faces a peculiar challenge.

The MediaWiki software provides various maintenance and patrolling tools that allow participants to check newly contributed content; one of these tools is the "watchlist", a personal page listing the recent changes made to pages of interest selected by each participant. Watchlists are appropriate for text-based wikis like Wikipedia where participants want to check new edits to existing pages, rather than new pages. However, maintenance activities on Wikimedia Commons mainly consist of checking new files (especially their copyright status) and classifying them appropriately. For this purpose, the usefulness of the watchlist is limited. Some ad-hoc tools have been developed by experienced participants, but the software itself does not (yet) provide dedicated features to help the limited community of participants absorb the inflow of new media files.

Notes and references

  1. Long tail of user participation in Wikipedia. E. H. Chi, N. Kittur, B. Pendleton, and B. Suh. May 2007.
  2. Creating, destroying, and restoring value in Wikipedia. R. Priedhorsky, J. Chen, S. T. K. Lam, K. Panciera, L. Terveen, and J. Riedl. In GROUP '07: Proceedings of the 2007 international ACM conference on Supporting group work, pages 259-268, New York, NY, USA, 2007. ACM.
  3. Measuring wiki viability. An empirical assessment of the social dynamics of a large sample of wikis. C. Roth, D. Taraborelli, and N. Gilbert. In Proceedings of the 4th International Symposium on Wikis - WikiSym 2008, New York, NY, USA, 2008. ACM.
  4. Viable wikis: struggle for life in the wikisphere. C. Roth. In WikiSym '07: Proceedings of the 2007 international symposium on Wikis, pages 119-124, New York, NY, USA, 2007. ACM.