Multimedia:Metrics
There are several reasons why we want to develop metrics regarding multimedia content on/from Wikimedia websites:
- We need data about users to design or improve the software, as part of the User research, or for ethnographic reasons.
- We need data to facilitate communication / outreach / relationships with GLAMs.
- We need data to measure the changes induced by the Multimedia Usability project, and more generally assess the impact of changes we try.
Questions:
- What do we need/want to measure, and why?
- Participation: How many people contribute multimedia content? How much? How often? How many people improve Commons (classification, description, etc.)?
- Impact: How many people use and reuse our multimedia content? How much? How often?
Content
Ideally with evolution over time. We would need to record this on a regular basis, probably on the toolserver.
Inventory & temporal evolution
Question | Implementation | Notes |
---|---|---|
How many files do we host on Commons? | Can be measured with countless tools, e.g. here or here. | |
How many files are uploaded on Commons each month? | Here and here (experimental!). | |
How many files are uploaded to all Wikimedia projects each month? | ||
What are the topics covered by our content? (distribution) | Count files in the subcategories of Category:Topics | Breakdown of media files by super topical category, e.g. art, science etc. We may need to do some clean-up in the Category:Topics first though. |
What are the media types of our content? (distribution) | E.g. medium: maps, technical drawings, animations, photos. This may be derived from . We could count files in the subcategories of Category:Media types but it doesn't seem to be reliable. We also have historical data in Commons:MIME type statistics. We might have to wait until we actually record this somewhere (and extract/migrate existing data). | |
Where does our content come from? (own work, etc.) (distribution) | Count files in the subcategories of Category:Pictures and images by source? This doesn't look reliable. | |
What location does our content come from? (map) | Extract location information from geotagging templates on file pages & plot it | See http://poulpy.blogspot.com/2010/02/elles-sont-ou-les-photos-de-commons.html |
Under which licenses is our content released? (distribution) | Count files in the subcategories of Category:Copyright statuses, Category:Free licenses, Category:Creative Commons licenses | When we count in categories, we should probably automatically count separately subcategories that contain more than XX% files of the parent category. It would allow us to have a more accurate overview without having to manually decide which subcategories to count separately. |
What is the size of our files? (distribution) | DB query | |
What upload medium was used? (distribution) | "Old" upload form, new upload form, API (bot, desktop applications, add-media-wizard, etc.). Needs schema change & minor change to the upload API. | |
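The counting heuristic suggested in the licenses row above could be sketched as follows. This is only an illustration: `split_large_subcategories` and its parameters are hypothetical names, the code assumes each file is counted in exactly one subcategory (which Commons does not guarantee), and the threshold (the "XX%" above) is deliberately left for the caller to choose.

```python
from typing import Dict, Tuple


def split_large_subcategories(
    parent_total: int,
    subcategory_counts: Dict[str, int],
    threshold: float,
) -> Tuple[Dict[str, int], int]:
    """Decide which subcategories to report separately.

    parent_total: number of files in the whole parent category tree.
    subcategory_counts: files per direct subcategory (assumed disjoint).
    threshold: share of the parent (0..1) above which a subcategory
        is counted on its own; no default, since the page leaves it open.

    Returns the subcategories to count separately and the number of
    files that remain reported under the parent.
    """
    if parent_total <= 0:
        raise ValueError("parent_total must be positive")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")

    # Keep only subcategories whose share of the parent exceeds the threshold.
    separate = {
        name: count
        for name, count in subcategory_counts.items()
        if count / parent_total > threshold
    }
    # Everything else stays lumped under the parent category.
    remainder = parent_total - sum(separate.values())
    return separate, remainder
```

Given the file count of a parent category, a mapping of subcategory names to file counts, and a threshold, the function returns the subcategories to report on their own plus the number of files left under the parent. If files belong to several subcategories at once, the remainder would be understated, so real data would need deduplication first.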
Maintenance
Question | Implementation | Notes |
---|---|---|
How many edits are performed on Commons? And in which namespace? | Broken down by namespace | |
How many edits are performed on all Wikimedia projects? Particularly, in the File namespace? | in order to be able to compare the evolution of Commons | |
How fast are files categorized on Commons? | User:Multichill/Categorization stats | |
How many files are deleted on Commons each month? | Similar to the upload/deletion ratio, but counting only deletions of files that were uploaded during that same period. | |
How long have files been online before they were deleted? | ||
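The two deletion questions above could be computed along these lines. This is a minimal sketch under stated assumptions: the input format (pairs of upload date and optional deletion date), the use of calendar months as the period, and the function names are all hypothetical and not tied to any existing tool or schema.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical record: (upload date, deletion date or None if still online).
FileRecord = Tuple[date, Optional[date]]


def same_month_deletion_ratio(
    records: Iterable[FileRecord],
) -> Dict[Tuple[int, int], float]:
    """For each (year, month), the share of files uploaded that month
    that were also deleted within that same month."""
    uploads: Dict[Tuple[int, int], int] = defaultdict(int)
    deletions: Dict[Tuple[int, int], int] = defaultdict(int)
    for uploaded_on, deleted_on in records:
        month = (uploaded_on.year, uploaded_on.month)
        uploads[month] += 1
        if deleted_on is not None and (deleted_on.year, deleted_on.month) == month:
            deletions[month] += 1
    return {month: deletions[month] / uploads[month] for month in uploads}


def days_online_before_deletion(records: Iterable[FileRecord]) -> List[int]:
    """Number of days each deleted file stayed online."""
    return [
        (deleted_on - uploaded_on).days
        for uploaded_on, deleted_on in records
        if deleted_on is not None
    ]
```

The list returned by `days_online_before_deletion` can then be summarized (median, histogram) to answer the second question.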
Relevance (where is it shown)
How often do we serve images in their original huge size? Breakdown of images hosted/served by thumbnail size.
Internal (Wikimedia projects)
Question | Implementation | Notes |
---|---|---|
How are files from a specific category used across Wikimedia projects? | glamorous | GlobalUsage-based |
How are files from cultural partnerships used across Wikimedia projects? | AmalGLAMate | a GLAM-specific aggregation of images-in-category usage statistics |
See also:
- Discussion about other GlobalUsage weekly stats.
External (other sites)
Question | Implementation | Notes |
---|---|---|
Can we track external use of content? | There is no reliable way to record usage from websites that use a local copy of files they found on Wikimedia Commons. As a consequence, we can only track usage from websites that fetch media files directly from Commons. | |
How many websites use Commons as file repository? | ||
How many files from Commons are used on Wikimedia websites? | ||
How many files from Commons are used on MediaWiki websites using InstantCommons? | needs development to integrate InstantCommons with GlobalUsage | |
How many files from Commons are used on websites using other CMSes? | To be discussed when we actually find a way to extend InstantCommons to other CMSes. | |
Users
Typology
We use a typology similar to the one already used on Wikistats & the report card:
- active participants: 5+ edits per month (Report Card)
- very active participants: 100+ edits per month (Report Card)
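As an illustration, the typology above could be applied to monthly edit counts as follows. The thresholds come from the list above; the function name, the "other" label, and the choice to treat the two groups as mutually exclusive are assumptions made for this sketch.

```python
# Thresholds taken from the typology above (edits per month).
ACTIVE_MIN_EDITS = 5
VERY_ACTIVE_MIN_EDITS = 100


def classify_participant(edits_this_month: int) -> str:
    """Return the typology label for a user's edit count in one month.

    Assumption: the labels are treated as exclusive here, so a user
    with 100+ edits is reported as "very active" only.
    """
    if edits_this_month < 0:
        raise ValueError("edit count cannot be negative")
    if edits_this_month >= VERY_ACTIVE_MIN_EDITS:
        return "very active"
    if edits_this_month >= ACTIVE_MIN_EDITS:
        return "active"
    # Hypothetical label for users below the "active" threshold.
    return "other"
```

If "active" is meant to include "very active" participants, as cumulative counts would, the second check should be applied independently of the first.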
Participation
Question | Implementation | Notes |
---|---|---|
How many new accounts are created at Commons each month? | with distinction between accounts created directly on Commons and SUL accounts created automatically | |
How often do uploads succeed? | Ratio upload screen requested / actual transfers | |
Who uploads files? | user: new, active, very active participants on Commons, also depending on whether they're new, active or very active on another Wikimedia project | |
Reach
Question | Implementation | Notes |
---|---|---|
What is the language used by our viewers? (distribution) | example | |
What is the location of our viewers? (map) | ||
How many viewers see a given image, and at what resolution? | image view statistics (image usage coupled to page views); see mw:Hit stats aggregation. Something is apparently in the works with Domas & WMDE. | |
- ideally, we would be able to break down results using all filters, e.g. for a given file, see how many people viewed it, from where, using what language
- ideally, we would also be able to collect similar statistics for a set of files (e.g. inside a category).
We can measure this only for Wikimedia websites.